
Model the Dynamic Evolution of Facial Expression from Image Sequences

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10938)


Abstract

Facial Expression Recognition (FER) has been a challenging task for decades. In this paper, we model the dynamic evolution of facial expression by extracting both temporal appearance features and temporal geometry features from image sequences. To capture this pairwise feature evolution, our approach consists of two models. The first combines convolutional layers with temporal recursion to extract dynamic appearance features from raw images, while the second focuses on geometric variation based on facial landmarks, for which we also propose a novel 2-distance representation and a resampling technique. The outputs of the two models are fused by a weighted combination to boost recognition performance. We evaluate our approach on three widely used databases: CK+, Oulu-CASIA and MMI, and achieve state-of-the-art accuracy. Moreover, our models require few setup parameters and accept variable-length frame sequences as input, which makes them flexible in practical applications.
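The weighted combination of the two streams described above can be sketched as follows. This is a minimal illustration only: the fusion weight `w`, the six-class label set, and the example scores are assumptions for demonstration, not values taken from the paper.

```python
def fuse_predictions(p_appearance, p_geometry, w=0.6):
    """Weighted sum of the two streams' per-class probabilities.

    p_appearance: scores from the appearance model (CNN + temporal recursion)
    p_geometry:   scores from the landmark-based geometry model
    w:            fusion weight for the appearance stream (illustrative value)
    """
    assert len(p_appearance) == len(p_geometry)
    return [w * a + (1.0 - w) * g for a, g in zip(p_appearance, p_geometry)]

# Illustrative example with 6 basic expression classes
# (anger, disgust, fear, happiness, sadness, surprise).
p_app = [0.10, 0.05, 0.60, 0.10, 0.10, 0.05]  # appearance stream output
p_geo = [0.05, 0.10, 0.40, 0.30, 0.10, 0.05]  # geometry stream output
fused = fuse_predictions(p_app, p_geo, w=0.6)

# Final prediction is the class with the highest fused score.
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because both inputs are probability distributions and the weights sum to one, the fused scores also sum to one; the weight `w` would in practice be tuned on validation data.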



Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61672276) and Natural Science Foundation of Jiangsu, China (BK20161406).

Author information


Corresponding author

Correspondence to Lin Shang.



Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Huan, Z., Shang, L. (2018). Model the Dynamic Evolution of Facial Expression from Image Sequences. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10938. Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_43

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-93037-4_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93036-7

  • Online ISBN: 978-3-319-93037-4

  • eBook Packages: Computer Science, Computer Science (R0)
