Abstract
With the rapid development of mobile terminal devices, landmark recognition on mobile devices has been widely studied in recent years. Because mobile users expect fast responses, mobile applications need a landmark recognition system that is both accurate and efficient. In this paper, we propose a landmark recognition framework that combines a novel discriminative feature selection method with an improved extreme learning machine (ELM) algorithm. A scalable vocabulary tree (SVT) is first used to generate a set of preliminary codewords for landmark images. An efficient codebook learning algorithm, derived from word mutual information and the Visual Rank technique, is then proposed to filter out unimportant codewords. The selected visual words serve as the codebook for image encoding and produce a compact Bag-of-Words (BoW) histogram. The fast ELM algorithm and an ensemble approach built on the ELM classifier are used for landmark recognition. Experiments on the Nanyang Technological University campus landmark database and the Fifteen Scene database illustrate the advantages of the proposed framework.
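The last two stages of the pipeline described above (encoding an image as a compact BoW histogram, then classifying it with an ensemble of ELMs) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the codeword selection step (mutual information and Visual Rank) is not reproduced, and the codebook is assumed to be already selected. The names `bow_histogram`, `ELM`, and `vote`, the sigmoid activation, the pseudoinverse solution, and majority voting as the ensemble rule are all assumptions made for illustration.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Encode local descriptors as an L1-normalized histogram over codewords.

    `codebook` is assumed to hold only the codewords kept after selection.
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d2.argmin(axis=1), minlength=len(codebook))
    return counts / max(counts.sum(), 1)


class ELM:
    """Basic single-hidden-layer ELM (assumed form, not the paper's variant)."""

    def __init__(self, n_hidden, rng):
        self.n_hidden = n_hidden
        self.rng = rng

    def _hidden(self, X):
        # Sigmoid activations of the randomly weighted hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y, n_classes):
        # Hidden weights and biases are drawn at random and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        targets = np.eye(n_classes)[y]  # one-hot class targets
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)


def vote(models, X, n_classes):
    """Majority vote over independently initialized ELMs (assumed rule)."""
    preds = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    tallies = np.eye(n_classes)[preds].sum(axis=0)    # (n_samples, n_classes)
    return tallies.argmax(axis=1)
```

In use, each image would be reduced to one histogram with `bow_histogram`, several `ELM` instances would be fitted on the same histograms with different random seeds, and `vote` would return the class chosen by most of them. Because each ELM only solves a linear least-squares problem, training several of them stays cheap, which is one plausible reason to pair ELM with an ensemble.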
Acknowledgements
This work was supported by the National Natural Science Major Foundation of Research Instrumentation of P. R. China under Grant 61427808, the Key Foundation of P. R. China under Grant 61333009, and in part by the National Key Basic Research Program of P. R. China under Grant 2012CB821204. We thank the reviewers and the Editor for their constructive comments and suggestions for improving our paper.
Cite this article
Cao, J., Chen, T. & Fan, J. Landmark recognition with compact BoW histogram and ensemble ELM. Multimed Tools Appl 75, 2839–2857 (2016). https://doi.org/10.1007/s11042-014-2424-1
DOI: https://doi.org/10.1007/s11042-014-2424-1