Tweet Expansion Method for Filtering Task in Twitter

Karisani, Payam; Oroumchian, Farhad; Rahgozar, Maseud

doi:10.1007/978-3-319-24027-5_5

Payam Karisani²¹,
Farhad Oroumchian²² &
Maseud Rahgozar²¹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9283))

Included in the following conference series:

International Conference of the Cross-Language Evaluation Forum for European Languages

1813 Accesses
1 Citations

Abstract

In this article we propose a supervised method for expanding tweet contents to improve the recall of tweet filtering task in online reputation management systems. Our method does not use any external resources. It consists of creating a K-NN classifier in three steps. In these steps the tweets labeled related and unrelated in the training set are expanded by extracting and adding the most discriminative terms, calculating and adding the most frequent terms, and re-weighting the original tweet terms from training set. Our experiments in RepLab 2013 data set show that our method improves the performance of filtering task, in terms of F criterion, up to 13% over state-of-the-art classifiers such as SVM. This data set consists of 61 entities from different domains of automotive, banking, universities, and music.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Amigó, E., Carrillo de Albornoz, J., Chugur, I., Corujo, A., Gonzalo, J., Martín, T., Meij, E., de Rijke, M., et al.: Overview of RepLab 2013: evaluating online reputation monitoring systems. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds.) CLEF 2013. LNCS, vol. 8138, pp. 333–352. Springer, Heidelberg (2013)
Google Scholar
Amigó, E., Carrillo-de-Albornoz, J., Chugur, I., Corujo, A., Gonzalo, J., Meij, E., de Rijke, M., Spina, D.: Overview of RepLab 2014: author profiling and reputation dimensions for online reputation management. In: Kanoulas, E., Lupu, M., Clough, P., Sanderson, M., Hall, M., Hanbury, A., Toms, E. (eds.) CLEF 2014. LNCS, vol. 8685, pp. 307–322. Springer, Heidelberg (2014)
Google Scholar
Spina, D., Gonzalo, J., Amigó, E.: Discovering filter keywords for company name disambiguation in twitter. Expert Systems with Applications 40(12), 4986–5003 (2013)
Article Google Scholar
Hoffman, T.: Online reputation management is hot—but is it ethical. Computerworld, p. 2, February 2008
Google Scholar
Meij, E., Weerkamp, W., de Rijke, M.: Adding semantics to microblog posts. In: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining. ACM (2012)
Google Scholar
Saleiro, P., Rei, L., Pasquali, A., Soares, C., Teixeira, J., Pinto, F., Nozari, M., Félix, C., Strecht, P.: POPSTAR at RepLab 2013: name ambiguity resolution on twitter. In: CLEF 2013 Eval. Labs and Workshop Online Working Notes (2013)
Google Scholar
Lavrenko, V., Bruce Croft, W.: Relevance based language models. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM (2001)
Google Scholar
Baeza-Yates, R., Ribeiro-Neto, B.: Modern information retrieval, vol. 463. ACM Press, New York (1999)
Google Scholar
Kullback, S., Leibler, R.A.: On information and sufficiency. The Annals of Mathematical Statistics, 79–86 (1951)
Google Scholar
Allan, J., Connell, M.E., Bruce Croft, W., Fang-Fang F., Fisher, D., Li, X.: Inquery and trec-9, DTIC Document (2000)
Google Scholar
Amigó, E., Gonzalo, J., Verdejo, F.: A general evaluation measure for document organization tasks. In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM (2013)
Google Scholar
Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K.: Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Computation 13(3), 637–649 (2001)
Article MATH Google Scholar
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
Article Google Scholar
Lavelli, A., Sebastiani, F., Zanoli, R.: Distributional term representations: an experimental comparison. In: Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management. ACM (2004)
Google Scholar

Download references

Author information

Authors and Affiliations

Database Research Group, Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Payam Karisani & Maseud Rahgozar
University of Wollongong in Dubai, Dubai, United Arab Emirates
Farhad Oroumchian

Authors

Payam Karisani
View author publications
You can also search for this author in PubMed Google Scholar
Farhad Oroumchian
View author publications
You can also search for this author in PubMed Google Scholar
Maseud Rahgozar
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Payam Karisani .

Editor information

Editors and Affiliations

Institut de Recherche en Informatique de Toulouse, Toulouse , France
Josanne Mothe
Department of Computer Science, University of Neuchatel, Neuchâtel, Switzerland
Jacques Savoy
Faculteit der Geesteswetenschappen, Universiteit Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Institut de Recherche en Informatique de Toulouse, Toulouse, France
Karen Pinel-Sauvagnat
School of Computing, Dublin City University, Dublin, Ireland
Gareth Jones
LIA - CERI, Université d'Avignon et des Pays de Vaucluse, Avignon, France
Eric San Juan
Department of Information Engineering, University of Padua, Padua, Italy
Linda Capellato
of Information Engineering (DEI), University of Padua, Department, Padova, Italy
Nicola Ferro

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Karisani, P., Oroumchian, F., Rahgozar, M. (2015). Tweet Expansion Method for Filtering Task in Twitter. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science(), vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_5

Download citation

DOI: https://doi.org/10.1007/978-3-319-24027-5_5
Published: 20 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics