Abstract
This paper presents a software application for generating ground-truth data from video files recorded by traffic surveillance cameras used in Intelligent Transportation Systems (ITS). The computer vision system to be evaluated measures the intensity (the number of vehicles that cross a line per time unit), the average speed, and the occupancy. The main goal of the visual interface presented in this paper is to be easy to use without requiring any special-purpose hardware. It runs on a standard laptop or desktop computer with a jog shuttle wheel. The setup is efficient and comfortable because the annotator keeps one hand on the space key of the keyboard almost all the time and the other hand on the jog shuttle wheel. The mean time required to annotate a video file ranges from 1 to 5 times its duration (per lane), depending on the content. Compared with a general-purpose annotation tool, annotation is about 7 times faster.
Acknowledgments
The authors thank Etra I+D, and Ruth López and Jaime Benlloch from the Local Traffic Authority of Valencia, Spain, for providing the video files. The authors also thank the VIGTA 2012 conference organizers and participants, and the anonymous reviewers of this journal, for their valuable advice.
This work was funded by the Spanish Government project MARTA under the CENIT program and CICYT contract TEC2009-09146.
Cite this article
Mossi, J.M., Albiol, A., Albiol, A. et al. Ground truth annotation of traffic video data. Multimed Tools Appl 70, 461–474 (2014). https://doi.org/10.1007/s11042-013-1396-x
DOI: https://doi.org/10.1007/s11042-013-1396-x