BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION

  • Conference paper
Enterprise Information Systems VII

Abstract

Most search engines perform text query and retrieval using keyword phrases. However, publishers cannot anticipate all the ways in which users will search for the items in their documents. In fact, there is often no direct keyword match between a search phrase and the descriptions of items that are perfect “hits” for the search. We present a highly automated solution to the problem of bridging the semantic gap between item information and search phrases. Our system learns rule-based definitions that can be ascribed to search phrases with dynamic connotations by extracting structured item information from product catalogs and applying a frequent itemset mining algorithm. We present experimental results for a realistic e-commerce domain. We also compare our rule-mining approach to vector-based relevance feedback retrieval techniques and show that our system yields definitions that are easier to validate and that perform better.
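The idea of mining rule-based definitions from structured item information can be illustrated with a minimal sketch. It assumes that each item relevant to a search phrase has already been reduced to a set of (attribute, value) pairs, and it uses a plain Apriori-style level-wise search in place of whatever frequent itemset algorithm the paper actually employs. The function names, the sample attributes, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {frozenset(itemset): support} for itemsets whose support
    (fraction of transactions containing them) is at least min_support.
    Apriori-style: a candidate is counted only if all of its
    (size - 1)-subsets were found frequent at the previous level."""
    n = len(transactions)
    frequent = {}
    for size in range(1, max_size + 1):
        counts = Counter()
        for items in transactions:
            for combo in combinations(sorted(items), size):
                if size > 1 and any(
                    frozenset(sub) not in frequent
                    for sub in combinations(combo, size - 1)
                ):
                    continue  # pruned: some subset is infrequent
                counts[frozenset(combo)] += 1
        level = {s: c / n for s, c in counts.items() if c / n >= min_support}
        if not level:
            break
        frequent.update(level)
    return frequent


def matches(definition, item):
    """An item satisfies a definition if it has every attribute-value pair."""
    return definition <= item


# Hypothetical structured records for items judged relevant to one
# search phrase (illustrative data, not taken from the paper).
relevant_items = [
    {("brand", "Acme"), ("material", "leather"), ("heel", "high")},
    {("brand", "Acme"), ("material", "leather"), ("heel", "low")},
    {("brand", "Zeta"), ("material", "leather"), ("heel", "high")},
    {("brand", "Acme"), ("material", "suede"), ("heel", "high")},
]

definitions = frequent_itemsets(relevant_items, min_support=0.5)
candidate_rule = frozenset({("brand", "Acme"), ("material", "leather")})
new_item = {("brand", "Acme"), ("material", "leather"), ("heel", "flat")}
is_hit = candidate_rule in definitions and matches(candidate_rule, new_item)
```

Each frequent itemset acts as a candidate definition: a conjunction of attribute values that a person can read and validate directly, and that can retrieve a new item even when its description shares no keyword with the search phrase.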


© 2007 Springer

About this paper

Cite this paper

Davulcu, H., Nguyen, H.V., Ramachandran, V. (2007). BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION. In: Chen, CS., Filipe, J., Seruca, I., Cordeiro, J. (eds) Enterprise Information Systems VII. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-5347-4_24

  • DOI: https://doi.org/10.1007/978-1-4020-5347-4_24

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-5323-8

  • Online ISBN: 978-1-4020-5347-4

  • eBook Packages: Computer Science, Computer Science (R0)
