Abstract
Patients with Alzheimer's disease have difficulty remembering the identity of people and performing daily-life activities. This paper presents a hybrid method that generates an egocentric video summary of important people, objects, and medicines to help Alzheimer's patients recall lost memories. Lifelogging video analysis can be used to aid human memory recall; however, the massive volume of lifelogging data makes it challenging to select the content most relevant for educating the Alzheimer's patient. To address this challenge, a static video summarization approach is applied to select the key-frames that are most relevant to recalling the patient's lost memories. The proposed system consists of three main modules: face, object, and medicine recognition. Histogram of oriented gradients (HOG) features are used to train a multi-class SVM for face recognition. SURF descriptors are extracted from the input video frames and used to find corresponding points between objects in the input video and reference objects stored in a database. Morphological operators are applied, followed by optical character recognition, to recognize and tag the patient's medicines. The performance of the proposed system is evaluated on 18 real-world homemade videos. Experimental results indicate that the proposed system is effective at providing the most relevant content to support the memory of Alzheimer's patients.
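To make the face-recognition feature concrete, the following is a minimal sketch of the orientation histogram that HOG computes for a single image cell. It is an illustration only, not the authors' implementation: the function name `hog_cell_histogram`, the 9-bin unsigned-orientation layout, the centered-difference gradients, and the omission of block normalization are all assumptions, since the abstract does not state the HOG parameters used.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram for one grayscale cell (list of equal-length rows).

    Assumptions (not taken from the paper): centered-difference gradients,
    unsigned orientations in [0, 180), hard binning, no block normalization.
    """
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(rows):
        for x in range(cols):
            # Centered differences; border pixels reuse the nearest neighbour.
            gx = cell[y][min(x + 1, cols - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, rows - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat pixels contribute nothing
            # Fold the gradient direction into the unsigned range [0, 180).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for one bin, weighted by gradient magnitude.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# A vertical edge produces horizontal gradients, so all votes land in bin 0.
vertical_edge = [[0, 0, 10, 10]] * 4
print(hog_cell_histogram(vertical_edge))
```

In a full pipeline, the histograms of all cells in a face crop would typically be normalized over overlapping blocks and concatenated into one feature vector before being passed to the multi-class SVM.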
About this article
Cite this article
Sultan, S., Javed, A., Irtaza, A. et al. A hybrid egocentric video summarization method to improve the healthcare for Alzheimer patients. J Ambient Intell Human Comput 10, 4197–4206 (2019). https://doi.org/10.1007/s12652-019-01444-6
DOI: https://doi.org/10.1007/s12652-019-01444-6