Abstract
In this paper, we present a framework for implicit human-centered tagging inspired by the psychologically established process of attribution, by which an individual explains affect-related changes experienced during an emotional episode by ascribing the affect-altering properties to a selected perceived stimulus. Our framework reverse-engineers this attribution process: by monitoring the annotator’s focus of attention through gaze tracking, we identify the stimulus attributed as the cause of an observed change in core affect, which is itself analyzed from the user’s facial expressions. Experimental results obtained with a lightweight, cost-efficient application built on the proposed framework show promising accuracy in both topical-relevance assessment and direct annotation scenarios. These results are especially encouraging given that the behavioral analyzers used to capture the user’s affective response and eye gaze lack the sophistication, and the high cost, typically encountered in the related literature.
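The reverse-engineered attribution loop described above can be sketched minimally as follows. This is an illustrative sketch, not the authors’ implementation: the `Fixation` and `AffectSample` records, the Euclidean shift threshold, and the rule of tagging the most recent fixated stimulus are all assumptions standing in for the paper’s gaze tracker and facial-expression analyzer.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    t: float          # timestamp (s) when the fixation began
    stimulus_id: str  # on-screen item under the annotator's gaze

@dataclass
class AffectSample:
    t: float          # timestamp (s) of the facial-expression estimate
    valence: float    # core-affect coordinates (hypothetical analyzer output)
    arousal: float

def attribute_affect(fixations, affect, threshold=0.3):
    """Tag the fixated stimulus whenever core affect shifts noticeably.

    When the valence/arousal point moves by more than `threshold` between
    consecutive samples, the change is attributed to the stimulus that was
    under gaze at that moment (illustrative attribution rule).
    """
    tags = []
    for prev, cur in zip(affect, affect[1:]):
        shift = ((cur.valence - prev.valence) ** 2 +
                 (cur.arousal - prev.arousal) ** 2) ** 0.5
        if shift < threshold:
            continue  # no salient affect change: nothing to attribute
        # the fixation active at the time of the change, if any
        active = [f for f in fixations if f.t <= cur.t]
        if active:
            tags.append((active[-1].stimulus_id, cur.valence, cur.arousal))
    return tags
```

For example, a fixation that moves to a second image just before a jump in valence would yield an implicit tag pairing that image with the new affective state.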
Acknowledgments
The research leading to this work has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. ICT-2011-7-287723 (REVERIE project).
Cite this article
Apostolakis, K.C., Daras, P. A framework for implicit human-centered image tagging inspired by attributed affect. Vis Comput 30, 1093–1106 (2014). https://doi.org/10.1007/s00371-013-0903-4