TKG: A Graph-Based Approach to Extract Keywords from Tweets

Abilhoa, Willyan Daniel; de Castro, Leandro Nunes

doi:10.1007/978-3-319-07593-8_49

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 290)

Abstract

Twitter is a microblog service that generates a huge amount of textual content daily. All this content needs to be explored by means of text mining, natural language processing, information retrieval, and other techniques. In this context, automatic keyword extraction is a task of great usefulness. A fundamental step in text mining techniques consists of building a model for text representation. This paper proposes a keyword extraction method for tweet collections that represents texts as graphs and applies centrality measures for finding the relevant vertices (keywords). The proposal is applied to two tweet collections of Brazilian TV shows and its results are compared to those of TFIDF and KEA.
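The graph-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how edges are built, how they are weighted, or which centrality measures are used, so everything below is an assumption made for illustration only: tokens are vertices, an undirected edge links tokens that appear next to each other in a tweet, and candidates are ranked by degree centrality or by closeness centrality (computed within each vertex's connected component). The function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque


def build_cooccurrence_graph(tweets):
    """Build an undirected graph linking tokens that are adjacent in a tweet.

    Assumption: whitespace tokenization and adjacency-based edges; the
    paper's actual preprocessing and edge scheme may differ.
    """
    graph = defaultdict(set)
    for tweet in tweets:
        tokens = tweet.lower().split()
        for token in tokens:
            graph[token]  # make sure isolated tokens still become vertices
        for left, right in zip(tokens, tokens[1:]):
            if left != right:  # skip self-loops from repeated words
                graph[left].add(right)
                graph[right].add(left)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other vertices that each vertex is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {vertex: 0.0 for vertex in graph}
    return {vertex: len(nbrs) / (n - 1) for vertex, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable vertex count divided by the sum of BFS shortest-path distances."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


def top_keywords(scores, k):
    """Return the k highest-scoring tokens, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [token for token, _ in ranked[:k]]


if __name__ == "__main__":
    sample = ["great show tonight", "show finale tonight", "love this show"]
    g = build_cooccurrence_graph(sample)
    print(top_keywords(degree_centrality(g), 3))
    print(top_keywords(closeness_centrality(g), 3))
```

A real pipeline would likely add stop-word removal and tweet-specific cleaning (URLs, mentions, hashtags) before building the graph; those steps are left out here to keep the sketch short.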

Author information

Authors and Affiliations

Natural Computing Laboratory, Mackenzie Presbyterian University, São Paulo, Brazil
Willyan Daniel Abilhoa & Leandro Nunes de Castro

Corresponding author

Correspondence to Willyan Daniel Abilhoa.

Editor information

Editors and Affiliations

Faculty of Engineering, Osaka Institute of Technology, Osaka, Osaka, Japan
Sigeru Omatu
Université Libre de Bruxelles, Bruxelles, Belgium
Hugues Bersini
Department of Computing Science and Control, Faculty of Science, University of Salamanca, Salamanca, Spain
Juan M. Corchado
Department of Computing Science and Control, Faculty of Science, University of Salamanca, Salamanca, Spain
Sara Rodríguez
Faculty of Engineering Management, Poznan University of Technology, Poznan, Poland
Paweł Pawlewski
Dep. PPEQS, Section of Economics and Quantitative Methods, University of Chieti-Pescara, Pescara, Italy
Edgardo Bucciarelli

Cite this paper

Abilhoa, W.D., de Castro, L.N. (2014). TKG: A Graph-Based Approach to Extract Keywords from Tweets. In: Omatu, S., Bersini, H., Corchado, J., Rodríguez, S., Pawlewski, P., Bucciarelli, E. (eds) Distributed Computing and Artificial Intelligence, 11th International Conference. Advances in Intelligent Systems and Computing, vol 290. Springer, Cham. https://doi.org/10.1007/978-3-319-07593-8_49

DOI: https://doi.org/10.1007/978-3-319-07593-8_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07592-1
Online ISBN: 978-3-319-07593-8
eBook Packages: Engineering, Engineering (R0)
