Abstract
Facial Expression Recognition (FER) has been a challenging task for decades. In this paper, we model the dynamic evolution of facial expressions by extracting both temporal appearance features and temporal geometry features from image sequences. To capture this pairwise feature evolution, our approach comprises two models. The first combines convolutional layers with temporal recursion to extract dynamic appearance features from raw images. The second focuses on geometric variations based on facial landmarks, for which we also propose a novel 2-distance representation and a resampling technique. The two models are fused by a weighted combination to boost recognition performance. We evaluate our approach on three widely used databases: CK+, Oulu-CASIA and MMI. The experimental results show that we achieve state-of-the-art accuracy. Moreover, our models have few setup parameters and accept variable-length frame sequences as input, which makes them flexible in practical applications.
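The two-stream design above can be illustrated with a minimal sketch. The abstract does not give the exact form of the 2-distance representation or the fusion weights, so the code below is an assumption-laden illustration: `pairwise_distances` stands in for a landmark-based geometry feature (pairwise Euclidean distances between landmarks), and `fuse` shows a generic weighted combination of the two models' class probabilities; the weight `w` and the 6-class outputs are hypothetical.

```python
import numpy as np

def pairwise_distances(landmarks):
    """Geometry feature sketch: flatten the upper triangle of the
    pairwise Euclidean distance matrix between facial landmarks.
    `landmarks` is an (n, 2) array of (x, y) coordinates."""
    pts = np.asarray(landmarks, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]          # (n, n, 2) offsets
    dist = np.sqrt((diff ** 2).sum(axis=-1))          # (n, n) distances
    iu = np.triu_indices(len(pts), k=1)               # upper triangle, no diagonal
    return dist[iu]                                   # n*(n-1)/2 features

def fuse(p_app, p_geo, w=0.6):
    """Weighted fusion of the appearance and geometry models'
    class-probability vectors; w is a hypothetical fusion weight."""
    p = w * np.asarray(p_app) + (1.0 - w) * np.asarray(p_geo)
    return p / p.sum(axis=-1, keepdims=True)          # renormalize

# Hypothetical softmax outputs over 6 expression classes.
p_app = np.array([0.10, 0.50, 0.10, 0.10, 0.10, 0.10])
p_geo = np.array([0.05, 0.35, 0.30, 0.10, 0.10, 0.10])
fused = fuse(p_app, p_geo, w=0.6)
print(int(fused.argmax()))  # → 1 (the class both streams favor)
```

For four landmarks, `pairwise_distances` yields a 6-dimensional feature vector; in the paper's setting a full facial landmark set would produce a correspondingly longer vector per frame, and the sequence of such vectors would feed the recurrent geometry model.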
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61672276) and Natural Science Foundation of Jiangsu, China (BK20161406).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Huan, Z., Shang, L. (2018). Model the Dynamic Evolution of Facial Expression from Image Sequences. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_43
Print ISBN: 978-3-319-93036-7
Online ISBN: 978-3-319-93037-4