Automatic Document Topic Identification Using Social Knowledge Network

  • Reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Automatic document topic identification; Clustering; Ontology; Social knowledge network; Wikipedia

Glossary

ADTI:

Stands for automatic document topic identification

Ontology:

“A model for describing the world, that consists of a set of types (concepts), properties, and relationship types” (Garshol 2004)

SKN:

Stands for social knowledge network

WHO:

Stands for Wikipedia Hierarchical Ontology

TF-IDF:

Stands for term frequency-inverse document frequency. It is a term-weighting scheme commonly used in text mining and information retrieval

hi5:

An online social networking website

RDF:

Stands for Resource Description Framework. It is a method of representing information to facilitate data interchange on the Web

ASR:

Stands for automatic speech recognition

NMI:

Stands for normalized mutual information. It is a well-known document clustering performance measure

NMF:

Stands for nonnegative matrix factorization. Nonnegative matrix factorization is a family of algorithms...
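
As an illustration of two glossary terms, TF-IDF and NMI, the following is a minimal Python sketch and not the method used in this entry. It assumes one common TF-IDF variant (relative term frequency multiplied by the natural log of N/df) and one common NMI normalization (mutual information divided by the geometric mean of the two entropies); other variants exist. The function names `tf_idf`, `entropy`, and `nmi` are hypothetical and chosen for this example only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """Mutual information normalized by sqrt(H(true) * H(cluster)).

    This normalization is an assumption; arithmetic-mean and
    max-entropy normalizations are also in use.
    """
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    n_true = Counter(true_labels)
    n_clus = Counter(cluster_labels)
    mi = sum(
        (c / n) * math.log((c * n) / (n_true[t] * n_clus[k]))
        for (t, k), c in joint.items()
    )
    denom = math.sqrt(entropy(true_labels) * entropy(cluster_labels))
    return mi / denom if denom > 0 else 0.0


if __name__ == "__main__":
    docs = [["wiki", "topic"], ["wiki", "cluster"]]
    print(tf_idf(docs))
    print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
```

Under this variant, a term that occurs in every document receives weight zero, and NMI depends only on how the documents are grouped, not on the names given to the clusters.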

References

  • Auer S, Lehmann J (2007) What have Innsbruck and Leipzig in common? Extracting semantics from Wiki content. In: Franconi E, Kifer M, May W (eds) The semantic web: research and applications. Springer, Berlin/New York, pp 503–517

  • Coursey K, Mihalcea R (2009) Topic identification using Wikipedia graph centrality. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics, companion volume: short papers, Boulder. Association for Computational Linguistics, pp 117–120

  • Coursey K, Mihalcea R, Moen W (2009) Using encyclopedic knowledge for automatic topic identification. In: Proceedings of the thirteenth conference on computational natural language learning, Boulder. Association for Computational Linguistics, pp 210–218

  • European Travel Commission (2013) Social networking and UGC. http://www.newmediatrendwatch.com/world-overview/137-social-networking-and-ugc, June 2013. (Online; accessed 25 Oct 2013)

  • Garshol L (2004) Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all. J Inf Sci 30(4):378

  • Giles J (2005) Internet encyclopaedias go head to head. Nature 438(7070):900–901

  • Hassan M (2013) Automatic document topic identification using hierarchical ontology extracted from human background knowledge. Ph.D. dissertation, University of Waterloo

  • Huynh D, Cao T, Pham P, Hoang T (2009) Using hyperlink texts to improve quality of identifying document topics based on Wikipedia. In: International conference on knowledge and systems engineering, 2009 (KSE'09), Hanoi. IEEE, pp 249–254

  • Janik M, Kochut K (2008a) Training-less ontology-based text categorization. In: Workshop on exploiting semantic annotations in information retrieval (ESAIR 2008) at the 30th European conference on information retrieval, ECIR

  • Janik M, Kochut K (2008b) Wikipedia in action: ontological knowledge in text categorization. In: IEEE international conference on semantic computing, 2008, Santa Clara. IEEE, pp 268–275

  • Korfiatis NT, Poulos M, Bokos G (2006) Evaluating authoritative sources using social networks: an insight from Wikipedia. Online Inf Rev 30(3):252–262

  • Kuhn HW (2005) The Hungarian method for the assignment problem. Nav Res Logist (NRL) 52(1):7–21

  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137

  • Medelyan O (2009) Human-competitive automatic topic indexing. Ph.D. dissertation, The University of Waikato

  • Medelyan O, Witten I, Milne D (2008) Topic indexing with Wikipedia. In: Proceedings of AAAI workshop on Wikipedia and artificial intelligence: an evolving synergy, Chicago. AAAI, pp 19–24

  • Ng A, Jordan M, Weiss Y et al (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856

  • Popescul A, Ungar LH (2000) Automatic labeling of document clusters. http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.33.141&rep=rep1&type=pdf

  • Schönhofen P (2009) Identifying document topics using the Wikipedia category network. Web Intell Agent Syst 7(2):195–207

  • Xu W, Gong Y (2004) Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, Sheffield. ACM, pp 202–209

  • Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, Toronto. ACM, pp 267–273

  • Zhao Y, Karypis G, Fayyad U (2005) Hierarchical clustering algorithms for document datasets. Data Min Knowl Discov 10(2):141–168

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Hassan, M.M., Karray, F., Kamel, M.S. (2014). Automatic Document Topic Identification Using Social Knowledge Network. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6170-8_352
