Using Lexical and Thematic Knowledge for Name Disambiguation

  • Conference paper
Information Retrieval Technology (AIRS 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7675)

Abstract

In this paper we present a novel approach to name disambiguation based on two types of semantic information: lexical and thematic. We propose using translation-based language models to resolve the synonymy problem in word matching, and a topic-based ranking function to capture the rich thematic context of names. We test three ranking functions that combine lexical relatedness and thematic relatedness. Experiments on a Wikipedia data set and the TAC-KBP 2010 data set show that the proposed method is effective for name disambiguation.
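To make the idea concrete, the sketch below shows one way a translation-smoothed lexical score and a topic-based thematic score might be combined to rank candidate entities for an ambiguous name. It is an illustration only, not the paper's formulation: the function names, the toy translation table, the smoothing weight `lam`, the use of negative KL divergence as the thematic score, and the linear interpolation with weight `alpha` are all assumptions made for this example. The abstract does not specify the three ranking functions the authors test.

```python
import math
from collections import Counter


def lexical_score(context, entity_words, trans_prob, background, lam=0.2):
    """Log-likelihood of the mention context under a translation-smoothed
    language model of the candidate entity's text (illustrative form)."""
    counts = Counter(entity_words)
    total = sum(counts.values())
    score = 0.0
    for w in context:
        p_trans = 0.0
        for t, c in counts.items():
            # Assumption: a word translates to itself with probability 1.0
            # unless the toy table says otherwise (table is not normalized).
            t_prob = trans_prob.get((w, t), 1.0 if w == t else 0.0)
            p_trans += t_prob * c / total
        # Interpolate with a background model so unseen words keep mass.
        p = (1 - lam) * p_trans + lam * background.get(w, 1e-9)
        score += math.log(p)
    return score


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete topic distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def thematic_score(context_topics, entity_topics):
    """Higher is better: negative divergence between topic distributions."""
    return -kl_divergence(context_topics, entity_topics)


def rank_candidates(context, context_topics, candidates, trans_prob,
                    background, alpha=0.5):
    """Rank candidates by a linear mix of lexical and thematic scores.
    The linear mix is an assumption, not one of the paper's functions."""
    scored = []
    for name, (words, topics) in candidates.items():
        lex = lexical_score(context, words, trans_prob, background)
        the = thematic_score(context_topics, topics)
        scored.append((name, alpha * lex + (1 - alpha) * the))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
context = ["basketball", "nba"]
context_topics = [0.9, 0.1]  # hypothetical [sports, geography] topic mix
candidates = {
    "Michael Jordan (basketball)": (["hoops", "nba", "bulls"], [0.8, 0.2]),
    "Jordan (country)": (["amman", "kingdom", "river"], [0.1, 0.9]),
}
trans_prob = {("basketball", "hoops"): 0.5}  # P(context word | entity word)
background = {"basketball": 0.01, "nba": 0.01}

ranking = rank_candidates(context, context_topics, candidates,
                          trans_prob, background)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the context word "basketball" never appears in the first candidate's text, but the translation table lets it match "hoops", which is the kind of synonymy gap a translation-based model is meant to close. The topic distributions would in practice come from a trained topic model rather than being written by hand.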

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, J., Zhao, W.X., Yan, R., Wei, H., Nie, JY., Li, X. (2012). Using Lexical and Thematic Knowledge for Name Disambiguation. In: Hou, Y., Nie, JY., Sun, L., Wang, B., Zhang, P. (eds) Information Retrieval Technology. AIRS 2012. Lecture Notes in Computer Science, vol 7675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35341-3_7

  • DOI: https://doi.org/10.1007/978-3-642-35341-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35340-6

  • Online ISBN: 978-3-642-35341-3

  • eBook Packages: Computer Science, Computer Science (R0)
