Filtering Users Accounts for Enhancing the Results of Social Media Mining Tasks

Shalaby, May; Rafea, Ahmed

doi:10.1007/978-3-030-45691-7_36

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1160)

Included in the following conference series:

World Conference on Information Systems and Technologies

Abstract

Filtering out illegitimate Twitter accounts before running online social media mining tasks reduces noise and thus improves the quality of those tasks' outcomes. Developing a supervised machine learning classifier for this purpose requires a large annotated dataset. While building the annotation guidelines, we found the rules suitable for developing an unsupervised rule-based classification program. However, despite its high accuracy, the rule-based program was not time efficient. We therefore used it to create a massive annotated dataset for building a supervised machine learning classifier, which was fast and matched the performance of the unsupervised classifier with an F-score of 92%. We also investigated the impact of removing the illegitimate accounts on an influential-user identification program developed by the authors. Precision improved slightly, but the improvement was not statistically significant, which indicates that the influential-user program did not erroneously identify spam accounts as influential.
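
The pipeline described in the abstract (rule-based labeling, then training a faster supervised model on those labels, then scoring with an F-score) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `Account` fields, the thresholds in `rule_label`, and the one-feature decision stump standing in for the supervised classifier. The abstract does not state the authors' actual rules, features, or model.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical profile features; the paper's real feature set is not given here.
    followers: int
    following: int
    tweets_per_day: float
    url_ratio: float  # fraction of tweets that contain a URL


def rule_label(acc):
    """Hypothetical annotation rules: 1 = illegitimate, 0 = legitimate."""
    if acc.tweets_per_day > 100:
        return 1
    if acc.url_ratio > 0.8 and acc.following > 10 * max(acc.followers, 1):
        return 1
    return 0


def features(acc):
    return [acc.followers, acc.following, acc.tweets_per_day, acc.url_ratio]


def train_stump(X, y):
    """Fit a one-feature threshold classifier by exhaustive search.

    A stand-in for the supervised model; it minimizes training errors.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - t) > 0 else 0 for row in X]
                errors = sum(p != yi for p, yi in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0


def f_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy accounts, invented for this example only.
accounts = [
    Account(500, 300, 5.0, 0.1),
    Account(20, 1000, 250.0, 0.9),
    Account(1500, 200, 12.0, 0.3),
    Account(5, 2000, 400.0, 0.95),
    Account(300, 310, 2.0, 0.0),
    Account(10, 50, 150.0, 0.2),
]

# Step 1: the rule-based program produces the annotations.
labels = [rule_label(a) for a in accounts]

# Step 2: a supervised model is trained on the rule-generated labels.
X = [features(a) for a in accounts]
model = train_stump(X, labels)

# Step 3: agreement between the model and the rule-based labels.
predictions = [model(row) for row in X]
agreement = f_score(labels, predictions)
```

In practice the agreement would be measured on held-out accounts rather than on the training data, and the stump would be replaced by whichever classifier is being evaluated.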

Acknowledgments

The authors would like to thank ITIDA and AUC for sponsoring the project entitled “Sentiment Analysis Tool for Arabic”.

Author information

Authors and Affiliations

The American University in Cairo, Cairo, Egypt
May Shalaby & Ahmed Rafea

Corresponding authors

Correspondence to May Shalaby or Ahmed Rafea.

Editor information

Editors and Affiliations

Departamento de Engenharia Informática, Universidade de Coimbra, Coimbra, Portugal
Álvaro Rocha
College of Engineering, The Ohio State University, Columbus, OH, USA
Hojjat Adeli
FEUP, Universidade do Porto, Porto, Portugal
Luís Paulo Reis
DIMES, Università della Calabria, Arcavacata di Rende, Italy
Sandra Costanzo
Faculty of Electrical Engineering, University of Montenegro, Podgorica, Montenegro
Irena Orovic
Universidade Portucalense, Porto, Portugal
Fernando Moreira

About this paper

Cite this paper

Shalaby, M., Rafea, A. (2020). Filtering Users Accounts for Enhancing the Results of Social Media Mining Tasks. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S., Orovic, I., Moreira, F. (eds) Trends and Innovations in Information Systems and Technologies. WorldCIST 2020. Advances in Intelligent Systems and Computing, vol 1160. Springer, Cham. https://doi.org/10.1007/978-3-030-45691-7_36

DOI: https://doi.org/10.1007/978-3-030-45691-7_36
Published: 08 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45690-0
Online ISBN: 978-3-030-45691-7
eBook Packages: Intelligent Technologies and Robotics (R0)
