Non-sequential Multi-view Detection, Localization and Identification of People Using Multi-modal Feature Maps

  • Conference paper
Computer Vision – ACCV 2012 (ACCV 2012)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7726)

Abstract

We present a novel multi-modal fusion framework for non-sequential person detection, localization and identification from multiple views. Our goal is independent processing of randomly accessed sections of video, either individual frames or small batches of frames. In this way, we aim to limit the error propagation that makes existing approaches unsuitable for fully autonomous tracking of multiple people in long video sequences. Our framework uses one or more trained classifiers to fuse multiple weak feature maps. We perform experimental validation on a challenging dataset, demonstrating how the framework can, depending on the provided feature maps, be used either only to improve generic person detection, or to enable simultaneous detection and recognition of individuals. Finally, we show that tracking-by-identification using the output of the proposed framework outperforms the state-of-the-art identification-by-tracking approach in terms of preserved track identities.
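
To make the idea of fusing weak feature maps with a trained classifier more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which classifier or which feature maps are used, so the logistic-regression model, the ground-plane grid layout, the toy values, and all function names below are assumptions chosen for illustration. Each grid cell is classified from the values that every map takes at that cell, and each frame is handled on its own, with no state carried over from other frames.

```python
import math

def cell_features(feature_maps, r, c):
    """Collect the value of every feature map at one grid cell."""
    return [fmap[r][c] for fmap in feature_maps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    """Linear score of feature vector x; the bias is stored as w[-1]."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def train_fusion(samples, labels, epochs=2000, lr=1.0):
    """Fit logistic-regression weights by batch gradient descent
    (an assumed stand-in for the paper's trained classifier)."""
    n_feat = len(samples[0])
    w = [0.0] * (n_feat + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for x, y in zip(samples, labels):
            err = sigmoid(score(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w

def fuse(feature_maps, w):
    """Fuse the maps of ONE frame into a per-cell occupancy probability."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sigmoid(score(w, cell_features(feature_maps, r, c)))
             for c in range(cols)] for r in range(rows)]

# Hypothetical training frame: two weak maps on a 2x3 grid plus labels
# (1 = cell occupied by a person). All values are made up.
train_maps = [
    [[0.9, 0.2, 0.1], [0.3, 0.8, 0.2]],  # assumed "foreground" cue
    [[0.7, 0.6, 0.1], [0.1, 0.9, 0.3]],  # assumed "motion" cue
]
train_occupancy = [[1, 0, 0], [0, 1, 0]]

samples = [cell_features(train_maps, r, c) for r in range(2) for c in range(3)]
labels = [train_occupancy[r][c] for r in range(2) for c in range(3)]
weights = train_fusion(samples, labels)

# A different, randomly accessed frame is processed independently.
new_maps = [[[0.85, 0.15]], [[0.8, 0.2]]]
fused_new = fuse(new_maps, weights)
print(fused_new)
```

Under these assumptions, identification rather than generic detection could be approximated by training one such classifier per person on identity-specific maps, which is one reading of "one or more trained classifiers" in the abstract.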


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mandeljc, R., Kovačič, S., Kristan, M., Perš, J. (2013). Non-sequential Multi-view Detection, Localization and Identification of People Using Multi-modal Feature Maps. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37431-9_53

  • DOI: https://doi.org/10.1007/978-3-642-37431-9_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37430-2

  • Online ISBN: 978-3-642-37431-9

  • eBook Packages: Computer Science, Computer Science (R0)
