Abstract
The work presented in this paper aims to reduce the semantic gap between low-level video features and semantic video objects. The proposed method for finding associations between the characteristics of segmented frame regions relies on the strength of Latent Semantic Analysis (LSA). Our previous experiments [1], using color histograms and Gabor features, quickly showed the potential of this approach but also uncovered some of its limitations. Structural information is necessary, yet rarely employed for such a task. In this paper we address two important issues. The first is to verify that using structural information does indeed improve performance; the second concerns the manner in which this additional information is integrated within the framework. We propose two methods that use structural information. The first adds structural constraints to the LSA indirectly, during the preprocessing of the video, while the second includes the structure directly within the LSA. Moreover, we demonstrate that when the structure is added directly to the LSA, combining visual (low-level) and structural information yields a convincing performance gain.
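As a rough illustration of the LSA step mentioned above, the following minimal sketch builds a small occurrence matrix, reduces it with a truncated SVD, and ranks frames against a query by cosine similarity in the latent space. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy matrix, the idea that rows are quantized region clusters ("visual words"), the choice k = 2, and the helper names `lsa_fit`, `lsa_project` and `cosine`.

```python
import numpy as np

def lsa_fit(counts, k):
    """Truncated SVD of a (n_terms, n_docs) occurrence matrix.

    Returns the first k left singular vectors and singular values.
    """
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k], s[:k]

def lsa_project(U_k, s_k, vec):
    """Fold a term-count vector into the k-dimensional latent space."""
    return (U_k.T @ vec) / s_k

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy data: 6 region clusters (rows) x 4 frames (columns).
# Frames 0-1 share one group of clusters, frames 2-3 share another.
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

U_k, s_k = lsa_fit(counts, k=2)

# Latent representation of every frame.
frames = [lsa_project(U_k, s_k, counts[:, j]) for j in range(counts.shape[1])]

# A query object described only by clusters from the first group.
query = np.array([1, 1, 1, 0, 0, 0], dtype=float)
q = lsa_project(U_k, s_k, query)

scores = [cosine(q, f) for f in frames]
ranking = np.argsort(scores)[::-1]  # frame indices, best match first
```

In this sketch the query shares clusters only with the first two frames, so those frames rank ahead of the other two. How structural information would enter such a pipeline (for instance as preprocessing constraints or as additional rows of the matrix) is exactly the question the paper studies and is not modeled here.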
References
Souvannavong, F., Merialdo, B., Huet, B.: Video content modeling with latent semantic analysis. In: Third International Workshop on Content-Based Multimedia Indexing (2003)
TREC Video Retrieval Workshop (TRECVID), http://www-nlpir.nist.gov/projects/trecvid/
Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. Journal of the American Society for Information Science 41, 391–407 (1990)
Zhao, R., Grosky, W.I.: Video Shot Detection Using Color Anglogram and Latent Semantic Indexing: From Contents to Semantics. CRC Press, Boca Raton (2003)
Monay, F., Gatica-Perez, D.: On image auto-annotation with latent space models. In: ACM Int. Conf. on Multimedia (2003)
Wiemer-Hastings, P.: Adding syntactic information to LSA. In: Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society, pp. 989–993 (2000)
Landauer, T., Laham, D., Rehder, B., Schreiner, M.: How well can passage meaning be derived without using word order. Cognitive Science Society, 412–417 (1997)
Swain, M., Ballard, D.: Indexing via colour histograms. In: ICCV, pp. 390–393 (1990)
Flickner, M., Sawhney, H., et al.: Query by image and video content: the QBIC system. IEEE Computer 28, 23–32 (1995)
Pentland, A., Picard, R., Sclaroff, S.: Photobook: Content-based manipulation of image databases. International Journal of Computer Vision 18, 233–254 (1996)
Gimelfarb, G., Jain, A.: On retrieving textured images from an image database. Pattern Recognition 29, 1461–1483 (1996)
Shearer, K., Venkatesh, S., Bunke, H.: An efficient least common subgraph algorithm for video indexing. In: International Conference on Pattern Recognition, vol. 2, pp. 1241–1243 (1998)
Huet, B., Hancock, E.: Line pattern retrieval using relational histograms. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 1363–1370 (1999)
Sengupta, K., Boyer, K.: Organizing large structural modelbases. IEEE Transactions on Pattern Analysis and Machine Intelligence (1995)
Messmer, B., Bunke, H.: A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (1998)
Felzenszwalb, P., Huttenlocher, D.: Efficiently computing a good segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 98–104 (1998)
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs (1988)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hohl, L., Souvannavong, F., Merialdo, B., Huet, B. (2004). Using Structure for Video Object Retrieval. In: Enser, P., Kompatsiaris, Y., O’Connor, N.E., Smeaton, A.F., Smeulders, A.W.M. (eds) Image and Video Retrieval. CIVR 2004. Lecture Notes in Computer Science, vol 3115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27814-6_66
DOI: https://doi.org/10.1007/978-3-540-27814-6_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22539-3
Online ISBN: 978-3-540-27814-6
eBook Packages: Springer Book Archive