Abstract
SiteIF is a personal agent for a bilingual news web site that learns a user's interests from the pages the user requests. In this paper we propose a word sense based document representation as the starting point for building a model of the user's interests. Visited documents are processed, and the relevant senses (disambiguated over WordNet) are extracted and combined to form a semantic network. A filtering procedure then dynamically predicts new documents of interest on the basis of the semantic network.
A sense-based approach has two main advantages: first, the model predictions, being based on senses rather than words, are more accurate; second, the model is language independent, allowing navigation in multilingual sites. We report the results of a comparative experiment carried out to quantify these improvements.
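As a rough illustration of the idea summarized above, the following minimal sketch builds a user model as a weighted co-occurrence network over sense identifiers and scores a candidate document against it. It is not the SiteIF implementation: the class `SenseModel`, the sense labels such as `bank#finance`, and the scoring formula are all hypothetical, and the word sense disambiguation step is assumed to have already been applied to each document.

```python
from collections import Counter
from itertools import combinations


class SenseModel:
    """Toy user model: a weighted co-occurrence network over sense identifiers."""

    def __init__(self):
        self.node_weight = Counter()  # sense -> number of visited documents containing it
        self.edge_weight = Counter()  # frozenset({a, b}) -> number of co-occurrences

    def update(self, senses):
        """Add the senses of one visited document to the network."""
        unique = sorted(set(senses))
        self.node_weight.update(unique)
        for a, b in combinations(unique, 2):
            self.edge_weight[frozenset((a, b))] += 1

    def score(self, senses):
        """Relevance of a candidate document (assumed formula, not from the paper)."""
        unique = sorted(set(senses))
        if not unique:
            return 0.0
        nodes = sum(self.node_weight[s] for s in unique)
        edges = sum(self.edge_weight[frozenset(p)] for p in combinations(unique, 2))
        return (nodes + edges) / len(unique)


# Hypothetical sense labels standing in for WordNet synset identifiers.
model = SenseModel()
model.update(["bank#finance", "loan#finance", "interest#finance"])
model.update(["bank#finance", "loan#finance"])

# An Italian article about "banca" and "prestito" would map to the same senses,
# so it is scored exactly like its English counterpart.
related = model.score(["bank#finance", "loan#finance"])
unrelated = model.score(["bank#river", "water#liquid"])
print(related, unrelated)  # 3.0 0.0
```

Because the network stores senses rather than words, the document about a river bank receives no credit from the user's interest in financial banks, and documents in either language of the site are compared in the same space.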
Cite this article
Magnini, B., Strapparava, C. User Modelling for News Web Sites with Word Sense Based Techniques. User Model User-Adap Inter 14, 239–257 (2004). https://doi.org/10.1023/B:USER.0000028980.13669.44
DOI: https://doi.org/10.1023/B:USER.0000028980.13669.44