Augmenting Data Retrieval with Information Retrieval Techniques by Using Word Similarity

  • Conference paper
Natural Language and Information Systems (NLDB 2008)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5039))

Abstract

Data retrieval (DR) and information retrieval (IR) have traditionally occupied two distinct niches in the world of information systems. DR systems effectively store and query structured data, but lack the flexibility of IR, i.e., the ability to retrieve results which only partially match a given query. IR, on the other hand, is quite useful for retrieving partial matches, but lacks the complete query specification over semantically unambiguous data that DR systems offer. To address these drawbacks, we propose an approach that combines the two systems, using pre-defined word similarities to determine the correlation between a keyword query (commonly used in IR) and data records stored in the inner framework of a standard RDBMS. Our integrated approach is flexible, context-free, and can be used on a wide variety of RDBs. Experimental results show that RDBMSs using our word-similarity matching approach achieve high mean average precision in retrieving relevant answers, in addition to exact matches, to a keyword query, which is a significant enhancement of query processing in RDBMSs.
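
As a rough illustration of the general idea in the abstract (ranking structured records against a keyword query through pre-defined word similarities, so that partial matches are returned alongside exact ones), a minimal Python sketch might look like the following. It is not the authors' method: the similarity table `WORD_SIM`, its values, the averaging rule in `score`, and the sample records are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-made similarity values in [0, 1]; not taken from the paper.
WORD_SIM: Dict[Tuple[str, str], float] = {
    ("car", "automobile"): 0.9,
    ("car", "truck"): 0.5,
    ("red", "maroon"): 0.7,
}


def sim(a: str, b: str) -> float:
    """Symmetric lookup: identical words score 1.0, unknown pairs score 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get((a, b), WORD_SIM.get((b, a), 0.0))


def score(query: List[str], record: Dict[str, str]) -> float:
    """Average, over the query keywords, of the best similarity to any word in the record."""
    words = [w.lower() for value in record.values() for w in value.split()]
    if not query or not words:
        return 0.0
    return sum(max(sim(q.lower(), w) for w in words) for q in query) / len(query)


def rank(query: List[str], records: List[Dict[str, str]]) -> List[Tuple[float, Dict[str, str]]]:
    """Return records with a non-zero score, best match first."""
    scored = [(score(query, r), r) for r in records]
    scored = [(s, r) for s, r in scored if s > 0.0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Assumed sample rows, standing in for tuples of a relational table.
records = [
    {"make": "Ford", "type": "truck", "color": "red"},
    {"make": "Honda", "type": "automobile", "color": "maroon"},
    {"make": "Trek", "type": "bicycle", "color": "blue"},
]

ranked = rank(["red", "car"], records)
for s, r in ranked:
    print(round(s, 2), r["make"])
```

With these assumed values, neither the Honda nor the Ford row contains both query words exactly, yet both are returned, with the Honda row ranked first; the bicycle row has no similarity to either keyword and is dropped. An exact-match SQL predicate over the same rows would return nothing for this query.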


Editor information

Epaminondas Kapetanios, Vijayan Sugumaran, Myra Spiliopoulou

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gustafson, N., Ng, YK. (2008). Augmenting Data Retrieval with Information Retrieval Techniques by Using Word Similarity. In: Kapetanios, E., Sugumaran, V., Spiliopoulou, M. (eds) Natural Language and Information Systems. NLDB 2008. Lecture Notes in Computer Science, vol 5039. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69858-6_17

  • DOI: https://doi.org/10.1007/978-3-540-69858-6_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69857-9

  • Online ISBN: 978-3-540-69858-6

  • eBook Packages: Computer Science, Computer Science (R0)
