Learning a Family of Detectors via Multiplicative Kernels

Topics in Medical Image Processing and Computational Vision

Part of the book series: Lecture Notes in Computational Vision and Biomechanics (LNCVB, volume 8)

Abstract

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. Model training is accomplished via standard SVM learning. When the foreground object masks are provided in training, the detectors can also produce object segmentations. A tracking-by-detection framework to recover foreground state in video sequences is also proposed with our model. The advantages of our method are demonstrated on tasks of object detection, view angle estimation and tracking. Our approach compares favorably to existing methods on hand and vehicle detection tasks. Quantitative tracking results are given on sequences of moving vehicles and human faces.
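
As a rough illustration of the multiplicative form described in the abstract, the sketch below builds a kernel on (feature, pose) pairs as the product of a feature kernel and a pose kernel, then scores a feature vector under several candidate poses. The choice of Gaussian RBF kernels, the `gamma` values, the function names, and the hand-picked support pairs and coefficients are all assumptions made for illustration and are not taken from the chapter. In practice the coefficients would come from a standard SVM solver trained on this kernel.

```python
import math

def rbf(u, v, gamma):
    """Gaussian RBF kernel between two equal-length numeric sequences."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def product_kernel(s, t, gamma_x=0.5, gamma_theta=2.0):
    """Multiplicative kernel on (feature, pose) pairs.

    Returns k_x(x, x') * k_theta(theta, theta'). Both factors are RBF
    kernels here, which is an assumption of this sketch.
    """
    (x, theta), (x2, theta2) = s, t
    return rbf(x, x2, gamma_x) * rbf(theta, theta2, gamma_theta)

def score(x, theta, support, alphas, bias=0.0):
    """Kernel-expansion score of feature x under pose theta.

    alphas are signed coefficients (label times dual weight), in the form
    an SVM solver would return them for the support pairs.
    """
    return bias + sum(
        a * product_kernel((x, theta), sv) for a, sv in zip(alphas, support)
    )

# Toy support pairs and coefficients, chosen by hand (not learned).
# The first two stand in for foreground examples at poses 0 and 1,
# the third for a background example.
support = [
    ((0.0, 0.0), (0.0,)),
    ((1.0, 1.0), (1.0,)),
    ((3.0, 3.0), (0.0,)),
]
alphas = [1.0, 1.0, -1.0]

def best_pose(x, candidate_poses):
    """Pick the candidate pose whose detector gives x the highest score."""
    return max(candidate_poses, key=lambda th: score(x, th, support, alphas))

if __name__ == "__main__":
    poses = [(0.0,), (1.0,)]
    for x in [(0.0, 0.0), (1.0, 1.0)]:
        print(x, best_pose(x, poses))
```

Fixing the pose argument turns the same set of coefficients into a pose-specific detector over features, which is one way to read the "family of detectors" in the title. Thresholding the best score would give a detection decision, and the maximizing pose would give the pose estimate.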

Notes

  1. In this paper, all vector variables are column vectors.

  2. Available at http://cs-people.bu.edu/yq/projects/mk.html.

  3. Available at http://cs-people.bu.edu/yq/projects/mk.html.

Acknowledgments

This paper reports work that was supported in part by the U.S. National Science Foundation under grants IIS-0705749 and IIS-0713168.

Author information

Corresponding author

Correspondence to Quan Yuan.

Copyright information

© 2013 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Yuan, Q., Thangali, A., Ablavsky, V., Sclaroff, S. (2013). Learning a Family of Detectors via Multiplicative Kernels. In: Tavares, J., Natal Jorge, R. (eds) Topics in Medical Image Processing and Computational Vision. Lecture Notes in Computational Vision and Biomechanics, vol 8. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-0726-9_1

  • DOI: https://doi.org/10.1007/978-94-007-0726-9_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-007-0725-2

  • Online ISBN: 978-94-007-0726-9

  • eBook Packages: Engineering, Engineering (R0)
