Synonyms
Definition
The Vector-Space Model (VSM) for Information Retrieval represents documents and queries as vectors of weights. Each weight measures the importance of an index term in a document or in a query. The weights are computed from the frequency of the index terms in the document, the query, or the collection. At retrieval time, documents are ranked by the cosine of the angle between each document vector and the query vector. For a given document and query, the cosine is the ratio of the inner product of the document vector and the query vector to the product of their norms. The system then returns the documents in order of decreasing cosine.
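The ranking step described above can be illustrated with a minimal Python sketch. It is not taken from the entry: the function names (`term_frequency_vector`, `cosine`, `rank`) are hypothetical, tokenization is a plain whitespace split, and the weights are assumed to be raw term frequencies, which is only one of the frequency-based weighting choices the definition allows.

```python
import math
from collections import Counter


def term_frequency_vector(text):
    """Assumed weighting: raw count of each lowercased, whitespace-split token."""
    return Counter(text.lower().split())


def cosine(doc_vec, query_vec):
    """Inner product of the two vectors divided by the product of their norms."""
    inner = sum(w * query_vec.get(term, 0) for term, w in doc_vec.items())
    norm_d = math.sqrt(sum(w * w for w in doc_vec.values()))
    norm_q = math.sqrt(sum(w * w for w in query_vec.values()))
    if norm_d == 0 or norm_q == 0:
        # An empty vector has no direction; treat its similarity as zero.
        return 0.0
    return inner / (norm_d * norm_q)


def rank(documents, query):
    """Return (cosine, document index) pairs sorted by decreasing cosine."""
    q = term_frequency_vector(query)
    scored = [(cosine(term_frequency_vector(d), q), i)
              for i, d in enumerate(documents)]
    # Ties are broken by document index so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    docs = ["vector space model",
            "probabilistic retrieval model",
            "cooking recipes"]
    for score, index in rank(docs, "vector model"):
        print(f"{score:.3f}  {docs[index]}")
```

With the toy collection in the usage section, the first document shares both query terms and is ranked first, the second shares one term, and the third shares none and receives a cosine of zero. A collection-level weight such as inverse document frequency could replace the raw counts without changing `cosine` or `rank`.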
Historical Background
The use of vectors for modeling IR systems dates back to the early days of IR, especially as a tool for describing how a system should be designed and implemented. The popularity of...
Recommended Reading
Deerwester S., Dumais S., Furnas G., Landauer T., and Harshman R. Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci., 41(6):391–407, 1990.
Dubin D. The most influential paper Gerard Salton never wrote. Libr. Trends, 52(4):748–764, 2004.
Halmos P. Finite-Dimensional Vector Spaces. Undergraduate Texts in Mathematics, Springer, 1987.
Melucci M. A basis for information retrieval in context. ACM Trans. Inform. Syst., 26(3), 2008.
Salton G. Associative document retrieval techniques using bibliographic information. J. ACM, 10(4):440–457, 1963.
Salton G. Automatic Text Processing. Addison-Wesley, 1989.
Salton G. Mathematics and information retrieval. J. Doc., 35(1):1–29, 1979.
Salton G. and Buckley C. Term weighting approaches in automatic text retrieval. Inform. Process. Manage., 24(5):513–523, 1988.
Salton G., Wong A., and Yang C. A vector space model for automatic indexing. Commun. ACM, 18(11):613–620, 1975.
Salton G., Yang C., and Yu C. A theory of term importance in automatic text analysis. J. Am. Soc. Inform. Sci., 26(1):33–44, 1975.
Singhal A., Buckley C., and Mitra M. Pivoted document length normalization. In Proc. 19th Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 1996, pp. 21–29.
van Rijsbergen C. The Geometry of Information Retrieval. Cambridge University Press, UK, 2004.
Wong S. and Raghavan V. Vector space model of information retrieval – a reevaluation. In Proc. 7th Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 1984, pp. 167–185.
Copyright information
© 2009 Springer Science+Business Media, LLC
About this entry
Cite this entry
Melucci, M. (2009). Vector-Space Model. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_918
DOI: https://doi.org/10.1007/978-0-387-39940-9_918
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering