Abstract
The novel coronavirus outbreak (COVID-19) has swept the world since December 2019, posing a global threat to all countries and communities. Information about the outbreak has been spreading across social media platforms at an unprecedented rate. As the disease continues to spread in different countries, people increasingly share information and use these platforms to stay up to date with the latest news. It is crucial to capture the discussions and conversations happening on social media in order to better understand human behavior during pandemics and to adapt strategies for combating them. In this work, we analyze the Arabic content of Twitter to capture the main topics discussed among Arabic users. We utilize Non-negative Matrix Factorization (NMF) to discover the main issues and topics in a dataset of Arabic tweets posted from early January to the end of April 2020, and we identify the most frequent unigrams, bigrams, and trigrams in the tweets. The discovered topics are then presented and discussed. They can be roughly classified into COVID-19 origin topics, prevention measures in different Arabic countries, prayers and supplications, news and reports, and topics related to preventing the spread of the disease, such as curfew and quarantine. To the best of our knowledge, this is the first work to address the detection of COVID-19 related topics from Arabic tweets.
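The two analysis steps named in the abstract, n-gram frequency counting and NMF topic discovery, can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy documents are invented English stand-ins for preprocessed tweets, the document-term matrix uses raw counts rather than any particular weighting, and the helper names (`top_ngrams`, `doc_term_matrix`, `nmf`) are hypothetical. The factorization uses the standard multiplicative update rules for the Frobenius objective, written in plain Python so the example has no external dependencies.

```python
import random
from collections import Counter

# Toy stand-ins for preprocessed tweets (invented for illustration only).
docs = [
    "stay home stay safe",
    "stay home wash hands",
    "curfew starts tonight",
    "curfew and quarantine news",
]

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(documents, n, k):
    """Count n-grams over all documents and return the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc.split(), n))
    return counts.most_common(k)

def doc_term_matrix(documents):
    """Build a documents-by-terms matrix of raw counts (an assumed weighting)."""
    vocab = sorted({tok for doc in documents for tok in doc.split()})
    index = {tok: j for j, tok in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in documents]
    for i, doc in enumerate(documents):
        for tok in doc.split():
            V[i][index[tok]] += 1.0
    return V, vocab

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra)) ** 0.5

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)
    using multiplicative updates for the Frobenius objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

# Most frequent unigrams, bigrams, and trigrams in the toy corpus.
for n in (1, 2, 3):
    print(n, top_ngrams(docs, n, 3))

# Two topics; each row of H weights the vocabulary terms for one topic.
V, vocab = doc_term_matrix(docs)
W, H = nmf(V, k=2)
for t, row in enumerate(H):
    ranked = sorted(zip(row, vocab), reverse=True)[:3]
    print("topic", t, [term for _, term in ranked])
```

On real tweets, the tokenization step would need Arabic-specific preprocessing, and the number of topics `k` would have to be chosen by some criterion; both are left out of this sketch.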
This work was supported by King Abdulaziz City for Science and Technology. Grant Number: 5-20-01-007-0033.
Notes
- 1.
- 2. Available at: https://github.com/BatoolHamawi/COVID19Word2Vec.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hamoui, B., Alashaikh, A., Alanazi, E. (2020). COVID-19: What Are Arabic Tweeters Talking About?. In: Chellappan, S., Choo, KK.R., Phan, N. (eds) Computational Data and Social Networks. CSoNet 2020. Lecture Notes in Computer Science, vol 12575. Springer, Cham. https://doi.org/10.1007/978-3-030-66046-8_35
DOI: https://doi.org/10.1007/978-3-030-66046-8_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66045-1
Online ISBN: 978-3-030-66046-8
eBook Packages: Computer Science, Computer Science (R0)