A Joint System for Single-Person 2D-Face and 3D-Head Tracking in CHIL Seminars

Conference paper
Multimodal Technologies for Perception of Humans (CLEAR 2006)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4122)

Abstract

We present the IBM systems submitted and evaluated within the CLEAR’06 evaluation campaign for the tasks of single-person visual 3D tracking (localization) and 2D face tracking on CHIL seminar data. The two systems are sufficiently inter-connected to justify their presentation within a single paper as a joint vision system for single-person 2D-face and 3D-head tracking, suitable for smart room environments with multiple synchronized, calibrated, stationary cameras. Indeed, in the developed system, face detection plays a pivotal role in 3D person tracking, being employed both in system initialization and in detecting possible tracking drift. Similarly, 3D person tracking determines the 2D frame regions where a face detector is subsequently applied. The joint system consists of a number of components that employ detection and tracking algorithms, some of which operate on input from all four corner cameras of the CHIL smart rooms, while others select and utilize two of the four available cameras. The main system highlights are the use of AdaBoost-like multi-pose face detectors, a spatio-temporal dynamic programming algorithm to initialize 3D location hypotheses, and an adaptive subspace learning based tracking scheme with a forgetting mechanism as a means to reduce tracking drift. The system is benchmarked on the CLEAR’06 CHIL seminar database, consisting of 26 lecture segments recorded inside the smart rooms of the UKA and ITC CHIL partners. Its resulting 3D single-person tracking performance is 86% accuracy with a precision of 88 mm, whereas the achieved face tracking score is 54% correct with 37% wrong detections and 19% misses. In terms of speed, an unoptimized system implementation runs at about 2 fps on a P4 2.8 GHz desktop.
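Of the components summarized above, the adaptive subspace appearance model with a forgetting mechanism lends itself to a compact illustration. The Python sketch below is not the authors' implementation: the class name, parameters, and the SVD-based update are hypothetical stand-ins, and the paper's tracker relies on an incremental eigenspace merge rather than re-running a full SVD per update. It only illustrates the general idea of down-weighting older appearance samples so the model adapts while limiting drift.

```python
import numpy as np

class ForgettingSubspaceModel:
    """Toy appearance subspace with a forgetting factor (hypothetical sketch).

    Keeps a mean and a low-dimensional basis of vectorized head patches and
    down-weights older data on every update, so the model can adapt to pose
    and lighting changes while limiting drift.  For simplicity the update
    re-runs an SVD on the decayed old basis plus the new samples; a true
    incremental eigenspace merge avoids this cost.
    """

    def __init__(self, n_components=10, forgetting=0.95):
        self.n_components = n_components
        self.forgetting = forgetting   # weight applied to the old subspace
        self.mean = None               # running mean of appearance patches
        self.basis = None              # (dim, k) orthonormal basis
        self.energy = None             # singular values of retained directions

    def update(self, patches):
        """patches: (n, dim) array of new vectorized appearance samples."""
        patches = np.atleast_2d(np.asarray(patches, dtype=float))
        if self.mean is None:
            self.mean = patches.mean(axis=0)
            data = patches - self.mean
        else:
            f = self.forgetting
            self.mean = f * self.mean + (1.0 - f) * patches.mean(axis=0)
            old = (self.basis * (f * self.energy)).T   # decayed old directions
            data = np.vstack([old, patches - self.mean])
        _, s, vt = np.linalg.svd(data, full_matrices=False)
        k = min(self.n_components, vt.shape[0])
        self.basis, self.energy = vt[:k].T, s[:k]

    def distance(self, patch):
        """Reconstruction error of a candidate patch; lower means better match."""
        centered = np.asarray(patch, dtype=float) - self.mean
        coeffs = self.basis.T @ centered
        return float(np.linalg.norm(centered - self.basis @ coeffs))


# Hypothetical usage: seed with detected face patches, then score candidate windows.
model = ForgettingSubspaceModel(n_components=8, forgetting=0.95)
model.update(np.random.rand(5, 24 * 24))      # e.g. 24x24 grayscale face crops
best = min((np.random.rand(24 * 24) for _ in range(10)), key=model.distance)
```

In the system described above, such an appearance model works alongside the dynamic-programming initialization and the multi-pose face detectors; the forgetting factor governs how quickly older appearance history is discarded.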

Editor information

Rainer Stiefelhagen, John Garofolo

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Potamianos, G., Zhang, Z. (2007). A Joint System for Single-Person 2D-Face and 3D-Head Tracking in CHIL Seminars. In: Stiefelhagen, R., Garofolo, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2006. Lecture Notes in Computer Science, vol 4122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69568-4_7

  • DOI: https://doi.org/10.1007/978-3-540-69568-4_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69567-7

  • Online ISBN: 978-3-540-69568-4
