Abstract
Facial Expression Recognition (FER) has been a challenging task for decades. In this paper, we model the dynamic evolution of facial expressions by extracting both temporal appearance features and temporal geometry features from image sequences. To capture this pairwise feature evolution, our approach comprises two models. The first combines convolutional layers with temporal recursion to extract dynamic appearance features from raw images. The second focuses on geometric variations based on facial landmarks, for which we also propose a novel 2-distance representation and a resampling technique. The two models are fused by a weighted combination to boost recognition performance. We evaluate our approach on three widely used databases: CK+, Oulu-CASIA and MMI. The experimental results show that we achieve state-of-the-art accuracy. Moreover, our models have few setup parameters and accept variable-length frame sequences as input, which makes them flexible in practical applications.
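The two-stream design above can be illustrated with a minimal sketch. The abstract does not give the exact form of the 2-distance representation or the fusion weights, so the code below is an assumption-laden illustration: `pairwise_distances` stands in for a landmark-based geometry feature (pairwise Euclidean distances between landmarks), and `fuse` shows a generic weighted combination of the two models' class probabilities; the weight `w` and the 6-class outputs are hypothetical.

```python
import numpy as np

def pairwise_distances(landmarks):
    """Geometry feature sketch: flatten the upper triangle of the
    pairwise Euclidean distance matrix between facial landmarks.
    `landmarks` is an (n, 2) array of (x, y) coordinates."""
    pts = np.asarray(landmarks, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]          # (n, n, 2) offsets
    dist = np.sqrt((diff ** 2).sum(axis=-1))          # (n, n) distances
    iu = np.triu_indices(len(pts), k=1)               # upper triangle, no diagonal
    return dist[iu]                                   # n*(n-1)/2 features

def fuse(p_app, p_geo, w=0.6):
    """Weighted fusion of the appearance and geometry models'
    class-probability vectors; w is a hypothetical fusion weight."""
    p = w * np.asarray(p_app) + (1.0 - w) * np.asarray(p_geo)
    return p / p.sum(axis=-1, keepdims=True)          # renormalize

# Hypothetical softmax outputs over 6 expression classes.
p_app = np.array([0.10, 0.50, 0.10, 0.10, 0.10, 0.10])
p_geo = np.array([0.05, 0.35, 0.30, 0.10, 0.10, 0.10])
fused = fuse(p_app, p_geo, w=0.6)
print(int(fused.argmax()))  # → 1 (the class both streams favor)
```

For four landmarks, `pairwise_distances` yields a 6-dimensional feature vector; in the paper's setting a full facial landmark set would produce a correspondingly longer vector per frame, and the sequence of such vectors would feed the recurrent geometry model.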
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61672276) and Natural Science Foundation of Jiangsu, China (BK20161406).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Huan, Z., Shang, L. (2018). Model the Dynamic Evolution of Facial Expression from Image Sequences. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_43
Print ISBN: 978-3-319-93036-7
Online ISBN: 978-3-319-93037-4