
MessyTable: Instance Association in Multiple Camera Views

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12356)

Abstract

We present an interesting and challenging dataset that features a large number of scenes with messy tables captured from multiple camera views. Each scene in this dataset is highly complex, containing multiple object instances that may be identical, stacked, and occluded by other instances. The key challenge is to associate all instances given the RGB images of all views. This seemingly simple task surprisingly defeats many popular methods and heuristics that are commonly assumed to perform well in object association. The dataset challenges existing methods in mining subtle appearance differences, reasoning based on context, and fusing appearance with geometric cues to establish associations. We report interesting findings with several popular baselines, and discuss how this dataset could help inspire new problems and catalyse more robust formulations to tackle real-world instance association problems. (Project page: https://caizhongang.github.io/projects/MessyTable/.)
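The abstract mentions fusing appearance with geometric cues to establish cross-view associations. As a minimal sketch of that general recipe (not the authors' method), one can combine an appearance-distance matrix and a geometric-distance matrix into a single cost matrix and solve the resulting assignment problem with the Hungarian algorithm. The function name `associate_instances`, the fusion weight `alpha`, and the `max_cost` threshold below are all hypothetical choices for illustration only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_instances(app_dist, geo_dist, alpha=0.5, max_cost=1.0):
    """Match instances between two camera views.

    app_dist, geo_dist: (N, M) distance matrices between the N instances
    detected in view A and the M instances detected in view B, e.g.
    appearance-embedding distances and epipolar distances. Fused pairs
    whose cost exceeds max_cost are left unmatched.
    """
    # Fuse the two cues into one cost matrix, then solve the
    # one-to-one assignment with the Hungarian algorithm.
    cost = alpha * app_dist + (1.0 - alpha) * geo_dist
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if cost[r, c] <= max_cost]

# Toy example: 3 instances per view; low costs on the diagonal.
app = np.array([[0.1, 0.9, 0.8],
                [0.9, 0.2, 0.7],
                [0.8, 0.9, 0.3]])
geo = np.array([[0.2, 0.8, 0.9],
                [0.7, 0.1, 0.8],
                [0.9, 0.9, 0.2]])
print(associate_instances(app, geo))  # → [(0, 0), (1, 1), (2, 2)]
```

With more than two views, such pairwise matchings would additionally need to be reconciled into globally consistent identities, which is part of what makes the dataset challenging.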

Z. Cai and J. Zhang contributed equally.



Acknowledgements

This research was supported by SenseTime-NTU Collaboration Project, Singapore MOE AcRF Tier 1 (2018-T1-002-056), NTU SUG, and NTU NAP.

Author information

Corresponding author

Correspondence to Chen Change Loy.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 11656 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Cai, Z. et al. (2020). MessyTable: Instance Association in Multiple Camera Views. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12356. Springer, Cham. https://doi.org/10.1007/978-3-030-58621-8_1


  • DOI: https://doi.org/10.1007/978-3-030-58621-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58620-1

  • Online ISBN: 978-3-030-58621-8

  • eBook Packages: Computer Science, Computer Science (R0)
