
Fragmentation handling for visual tracking

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Object detection and tracking based on background subtraction suffer from the fragmentation problem: a single object fragments into several blobs because parts of it are similar in color to the reference image. In this paper, we build a visual tracking framework that uses background subtraction for object detection, and we address the resulting difficulty of associating blobs with objects in two steps. First, we cluster blobs according to their boundary distances, estimated by an approximation method proposed in this paper; blobs clustered into the same blob-set are considered to come from the same object. Second, because blobs of a severely fragmented object may fall into different blob-sets, we treat blob-sets that exhibit coherent motion as candidates for belonging to the same object, and we propose a background-matching method to decide whether two such blob-sets are truly from the same object or from different objects. We test the proposed methods on several real-world video sequences; quantitative and qualitative results show that they handle the problems caused by fragmentation effectively.
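To make the first association step concrete, here is a minimal sketch that groups blobs into blob-sets by thresholding pairwise boundary distances. It uses the exact brute-force boundary distance in place of the paper's faster approximation, and the union-find grouping and the names boundary_distance, cluster_blobs, and phi are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def boundary_distance(Bi, Bk):
    """Exact minimum Euclidean distance between two blob boundaries,
    each given as an (M, 2) array of boundary pixel coordinates.
    (A brute-force stand-in for the paper's approximation method.)"""
    diff = Bi[:, None, :] - Bk[None, :, :]          # all pixel pairs
    return np.sqrt((diff ** 2).sum(axis=2)).min()

def cluster_blobs(boundaries, phi):
    """Group blobs whose boundary distance is below the threshold phi
    into the same blob-set, using union-find."""
    n = len(boundaries)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]           # path compression
            i = parent[i]
        return i

    for i in range(n):
        for k in range(i + 1, n):
            if boundary_distance(boundaries[i], boundaries[k]) < phi:
                parent[find(i)] = find(k)           # merge blob-sets

    blob_sets = {}
    for i in range(n):
        blob_sets.setdefault(find(i), []).append(i)
    return list(blob_sets.values())

# Toy example: the first two blobs are close, the third is far away.
b0 = np.array([[0, 0], [0, 1], [1, 0]])
b1 = np.array([[2, 2], [2, 3]])
b2 = np.array([[50, 50], [51, 50]])
print(cluster_blobs([b0, b1, b2], phi=5.0))         # [[0, 1], [2]]
```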



Notes

  1. Available at http://www.cvg.rdg.ac.uk/PETS2009/data.html.

  2. The values of the parameters can be learned from previous data, which are available in common applications. Given previous data, one can directly estimate the values of the parameters \(\varrho \) and \(\phi \), whereas the optimal settings of \(\rho \) and \(\beta \) must be determined by exploring different values, as described in Sect. 5.

  3. http://www.vision.ee.ethz.ch/~bleibe/data/datasets.html.

  4. i-Lids dataset for AVSS 2007, available at http://www.eecs.qmul.ac.uk/~andrea/avss2007_d.html.

  5. For an object that existed in the scene for \(N_1\) time steps and was correctly labeled by a labeling method for \(N_2\) of those steps, the correct labeling rate of the method for that object is \(N_2/N_1\) (see the sketch after these notes).
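As a worked example of this metric, the sketch below computes the correct labeling rate from per-time-step ground-truth and predicted labels; the function name and inputs are illustrative, not from the paper.

```python
def correct_labeling_rate(gt_labels, pred_labels):
    """Correct labeling rate N2/N1 for one object: N1 is the number of
    time steps the object existed in the scene, N2 the number of those
    steps at which the method assigned the correct label."""
    n1 = len(gt_labels)
    n2 = sum(g == p for g, p in zip(gt_labels, pred_labels))
    return n2 / n1

# An object present for 5 time steps, correctly labeled in 4 of them.
print(correct_labeling_rate([7, 7, 7, 7, 7], [7, 7, 3, 7, 7]))  # 0.8
```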

Abbreviations

\(N_g\): The number of particles used for global sampling

\(b_{j}\): The \(j\)th pixel

\(B_i\): The boundary of the \(i\)th blob

\(M_i\): The number of pixels in \(B_i\)

\(D(i,k)\): The boundary distance between \(B_i\) and \(B_k\)

\(\hat{b}_{ik}\): The boundary pixel of \(B_i\) closest to the \(k\)th blob

\(\alpha\): A constant that determines the sampling density

\(T\): A constant equal to 300

\(I_i\): The \(i\)th image

\(rt_i\): The time cost of computing the boundary distances of the blobs

\(\tilde{D}(j,k)\): The true boundary distance between \(B_j\) and \(B_k\)

\(\upsilon_{j,k}\): The accuracy of our method

\(err_{j,k}\): The estimation error

\(N\): The number of obtained blobs

\(S_b\): The obtained blobs

\(\Xi(i,j)\): The centroid distance between the \(i\)th blob and the \(j\)th blob

\(\rho\): A predefined threshold

\(\beta\): A predefined threshold

\(\varrho\): The diagonal axis length of the largest object that ever appeared in the scene

\(s(B_i, B_j)\): The similarity between two blobs

\(K_{x}(\cdot)\): The flat kernel

\(\phi\): A predefined threshold

\(\Lambda_j\): The \(j\)th blob-set

\(t\): The index of the time step

\(\mathbf{x}_{i,t}\): The state of the \(i\)th target at time \(t\)

\((w,h)\): The width and the height

\(id\): The identity of the target

\((\dot{x},\dot{y})\): The velocities

\(l\): The target label

\(\mathbf{Q}\): The white Gaussian noise

\(N_p\): The number of particles used by each tracker

\(g_{i,j}^{(t)}\): The motion coherence at time \(t\) between targets \(\mathbf{x}_{i}\) and \(\mathbf{x}_{j}\)

\(d_{i,j}^{(t)}\): The distance between two targets

\(\chi\): A predefined threshold

\(\triangle_{0}\): The velocity difference of targets

\(\triangle_{v}\): The true average velocity difference vector of correlated targets

\(\sigma^2_{0}\): A small variance

\(\sigma^2_{1}\): A large variance

\(q_{i,j}^{(t)}\): The correlation at time \(t\) between targets \(\mathbf{x}_{i}\) and \(\mathbf{x}_{j}\)

\(C\): The color histogram of the foreground gap-pixels

\(C_b\): The color histogram of the background gap-pixels

\(\xi(C,C_b)\): The Bhattacharyya similarity coefficient

\(N_h\): A constant equal to 20

\(N_s\): A constant equal to 20

\(\lambda\): A constant equal to 20

\(G\): The correlation graph

\(V\): The vertices of \(G\), corresponding to the targets

\(E\): The edges of \(G\)

\(e_{i,j}\): The edge between \(\mathbf{x}_{i}\) and \(\mathbf{x}_{j}\)

\(w_{i,j}\): The weight of the edge \(e_{i,j}\)

\(S_C\): A correlation-set
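As an illustration of the background-matching notation above, the following sketch computes the Bhattacharyya similarity coefficient \(\xi(C, C_b)\) between the color histograms of the foreground and background gap-pixels. The formula \(\xi(C, C_b) = \sum_i \sqrt{C_i \, C_{b,i}}\) over normalized histograms is the standard Bhattacharyya coefficient; the binning and toy values are assumptions for demonstration only.

```python
import numpy as np

def bhattacharyya(c, c_b):
    """Bhattacharyya similarity coefficient between two histograms:
    1 for identical normalized histograms, 0 for non-overlapping ones."""
    c = np.asarray(c, dtype=float)
    c_b = np.asarray(c_b, dtype=float)
    c /= c.sum()                     # normalize both histograms to sum 1
    c_b /= c_b.sum()
    return np.sqrt(c * c_b).sum()

# Toy color histograms of foreground and background gap-pixels.
C = [10, 30, 60]
C_b = [12, 28, 60]
print(bhattacharyya(C, C_b))         # close to 1: gap matches background
```

In the paper's background-matching step, this coefficient measures how closely the gap region between two coherently moving blob-sets resembles the stored background.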


Acknowledgments

This work was partly supported by grants from the National Natural Science Foundation of China (Nos. 60772063 and 61175096) and by the International S&T Cooperation Program of Beijing Institute of Technology.

Author information

Corresponding author: Weicun Xu.


About this article

Cite this article

Xu, W., Zhao, Q. & Gu, D. Fragmentation handling for visual tracking. SIViP 8, 1639–1649 (2014). https://doi.org/10.1007/s11760-012-0406-1
