Abstract
With growing concern about public security and the development of large-scale data storage technology, person re-identification in security surveillance systems has become an active research topic. Large variations in viewpoint and lighting across camera views can substantially change a person's appearance, which keeps person re-identification a challenging problem. Developing robust feature descriptors and designing discriminative distance metrics to measure the similarity between pedestrian images are therefore two key aspects of person re-identification. In this paper, we propose a method that uses both deep learning and multiple metric ensembles to improve re-identification performance. First, we jointly use several datasets to train a general Convolutional Neural Network (CNN), which is then employed to extract deep features from the training and testing sets. The deep architecture makes it possible to learn more abstract, internal features that are robust to variations in viewpoint and lighting. Then, we use the deep features of the training set to learn a dataset-specific distance metric and combine it with the cosine distance metric; the resulting multiple metric ensemble measures the similarity between images more comprehensively. Finally, extensive experiments demonstrate that our method effectively improves recognition performance compared with state-of-the-art methods.
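The combination of a learned metric with the cosine distance can be illustrated with a minimal sketch. The abstract does not say how the two distances are fused, so everything here is an assumption made for illustration: a Mahalanobis-style distance with a learned matrix `M`, min-max normalization of each distance, and a weighted sum controlled by `weight`. The function names are hypothetical, and the feature vectors stand in for CNN features.

```python
import numpy as np

def cosine_distance(query, gallery):
    """1 - cosine similarity between one query vector and each gallery row."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return 1.0 - g @ q

def learned_distance(query, gallery, M):
    """Squared distance (x - y)^T M (x - y) under a learned PSD matrix M."""
    diff = gallery - query
    return np.einsum("ij,jk,ik->i", diff, M, diff)

def min_max(d):
    """Rescale distances to [0, 1] so the two metrics are comparable."""
    span = d.max() - d.min()
    return (d - d.min()) / span if span > 0 else np.zeros_like(d)

def ensemble_rank(query, gallery, M, weight=0.5):
    """Rank gallery rows by a weighted sum of the two normalized distances.

    The weighted-sum fusion is an assumption, not the paper's stated rule.
    """
    d_learned = min_max(learned_distance(query, gallery, M))
    d_cos = min_max(cosine_distance(query, gallery))
    combined = weight * d_learned + (1.0 - weight) * d_cos
    return np.argsort(combined)  # indices of best matches first

# Toy usage: three gallery features, identity matrix standing in for M.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
query = np.array([1.0, 0.0])
ranking = ensemble_rank(query, gallery, np.eye(2))
```

In practice `M` would come from a metric learning method fitted on the training-set features, and `weight` would be chosen on held-out data.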
Cite this article
Qi, M., Han, J., Jiang, J. et al. Deep feature representation and multiple metric ensembles for person re-identification in security surveillance system. Multimed Tools Appl 78, 27029–27043 (2019). https://doi.org/10.1007/s11042-017-4649-2
DOI: https://doi.org/10.1007/s11042-017-4649-2