User Modelling for News Web Sites with Word Sense Based Techniques

User Modeling and User-Adapted Interaction

Abstract

SiteIF is a personal agent for a bilingual news web site that learns a user’s interests from the pages the user requests. In this paper we propose a word sense based document representation as the starting point for building a model of the user’s interests. The documents the user browses are processed, and the relevant senses (disambiguated against WordNet) are extracted and combined to form a semantic network. A filtering procedure then dynamically predicts which new documents are likely to interest the user on the basis of that semantic network.
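The pipeline described above (extract senses, merge them into a network, score new documents against it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the SiteIF implementation: the `TOY_SENSES` table stands in for real disambiguation against WordNet, the sense IDs are invented, and the weighting scheme (occurrence counts on nodes, co-occurrence counts on edges) is only one plausible choice.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, hand-made sense inventory: each word maps to a single sense ID.
# A real system would disambiguate each word in context against WordNet.
TOY_SENSES = {
    "bank": "bank#finance",
    "loan": "loan#finance",
    "interest": "interest#finance",
    "match": "match#sport",
    "goal": "goal#sport",
    "striker": "striker#sport",
}


def extract_senses(text):
    """Return the set of toy sense IDs found in a text."""
    return {TOY_SENSES[w] for w in text.lower().split() if w in TOY_SENSES}


class SenseNetwork:
    """User model: nodes are senses, edges link senses seen in the same document."""

    def __init__(self):
        self.node_weight = Counter()  # how often each sense was seen
        self.edge_weight = Counter()  # how often each pair of senses co-occurred

    def update(self, text):
        """Add the senses of a browsed document to the network."""
        senses = sorted(extract_senses(text))
        self.node_weight.update(senses)
        self.edge_weight.update(combinations(senses, 2))

    def score(self, text):
        """Score a new document by the known nodes and edges it activates."""
        senses = sorted(extract_senses(text))
        node_score = sum(self.node_weight[s] for s in senses)
        edge_score = sum(self.edge_weight[p] for p in combinations(senses, 2))
        return node_score + edge_score


model = SenseNetwork()
model.update("bank loan interest")
model.update("the bank raised the loan rate")

finance = model.score("new loan from the bank")
sport = model.score("striker wins the match with a late goal")
print(finance, sport)
```

In this toy setting the finance article scores higher than the sports article, because the user has only browsed finance-related pages; a filter could rank or threshold candidate documents by this score.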

A sense-based approach has two main advantages: first, the model’s predictions are more accurate because they are based on senses rather than words; second, the model is language independent, which allows navigation in multilingual sites. We report the results of a comparative experiment carried out to estimate these improvements quantitatively.
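The language-independence claim can be made concrete with a small sketch. It assumes, hypothetically, that words in each language are mapped to shared sense IDs through aligned lexicons; the `LEXICON` table, the sense IDs, and the choice of English and Italian are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

# Hypothetical aligned lexicons: words in each language point to shared sense IDs.
LEXICON = {
    "en": {"bank": "s:bank_finance", "loan": "s:loan", "goal": "s:goal_sport"},
    "it": {"banca": "s:bank_finance", "prestito": "s:loan", "gol": "s:goal_sport"},
}


def to_senses(text, lang):
    """Map the known words of a text to language-neutral sense IDs."""
    table = LEXICON[lang]
    return [table[w] for w in text.lower().split() if w in table]


# The profile is built only from an English document.
profile = Counter(to_senses("bank approves loan", "en"))


def overlap(text, lang):
    """Sum the profile weights of the distinct senses found in a document."""
    return sum(profile[s] for s in set(to_senses(text, lang)))


# Italian documents can be scored against the English-built profile.
it_finance = overlap("la banca concede un prestito", "it")
it_sport = overlap("un gol al novantesimo", "it")
print(it_finance, it_sport)
```

A purely word-based profile would find no overlap between the English and Italian texts, whereas the shared sense IDs let the Italian finance article match the interests learned from English pages.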




Cite this article

Magnini, B., Strapparava, C. User Modelling for News Web Sites with Word Sense Based Techniques. User Model User-Adap Inter 14, 239–257 (2004). https://doi.org/10.1023/B:USER.0000028980.13669.44
