Automated Identification of Social Media Bots Using Deepfake Text Detection

Saravani, Sina Mahdipour; Ray, Indrajit; Ray, Indrakshi

doi:10.1007/978-3-030-92571-0_7

Sina Mahdipour Saravani,
Indrajit Ray &
Indrakshi Ray

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 13146)

Included in the following conference series:

International Conference on Information Systems Security

Abstract

Social networks are playing an increasingly important role in modern society, and social media bots are on the rise. Bots can propagate misinformation and spam, thereby influencing the economy, politics, and healthcare. Progress in Natural Language Processing (NLP) techniques makes bots more deceptive and harder to detect. The easy availability of readily deployable bots empowers attackers to perform malicious activities, which makes bot detection an important problem in social networks. Most research on bot detection focuses on identifying bot accounts in social media; however, the metadata needed for bot account detection is unavailable in many cases. Moreover, if the account is controlled by a cyborg (a bot-assisted human or human-assisted bot), such detection mechanisms will fail. Consequently, we focus on identifying bots on the basis of the textual content of the posts they make on social media, which we refer to as fake posts. NLP techniques based on Deep Learning appear to be the most promising approach for fake text detection. We employ an end-to-end neural network architecture for deepfake text detection on a real-world Twitter dataset containing deceptive Tweets. Our experiments achieve state-of-the-art performance and improve the classification accuracy by 2% compared to previously tested models. Moreover, our content-level approach can be used for fake post detection in social media in real time. Detecting fake texts before they are propagated will help curb the spread of misinformation.

This work was supported in part by funds from NIST under award number 60NANB18D204, from NSF under award numbers CNS 2027750 and CNS 1822118, from NIST, Statnett, Cyber Risk Research, AMI, and ARL, and from the DoE NEUP Program under contract number DE-NE0008986.

Notes

1. https://www.kaggle.com/mtesconi/twitter-deep-fake-text.
2. Our code for this paper is published in the GitHub repository at https://github.com/sinamps/bot-detection.
3. https://github.com/sinamps/bot-detection.

Author information

Authors and Affiliations

Colorado State University, Fort Collins, CO, 80523, USA
Sina Mahdipour Saravani, Indrajit Ray & Indrakshi Ray

Corresponding author

Correspondence to Sina Mahdipour Saravani.

Editor information

Editors and Affiliations

Indian Institute of Technology Patna, Patna, India
Somanath Tripathy
Indian Institute of Technology Bombay, Mumbai, India
Rudrapatna K. Shyamasundar
Newcastle University, Newcastle upon Tyne, UK
Rajiv Ranjan

About this paper

Cite this paper

Saravani, S.M., Ray, I., Ray, I. (2021). Automated Identification of Social Media Bots Using Deepfake Text Detection. In: Tripathy, S., Shyamasundar, R.K., Ranjan, R. (eds) Information Systems Security. ICISS 2021. Lecture Notes in Computer Science, vol 13146. Springer, Cham. https://doi.org/10.1007/978-3-030-92571-0_7

DOI: https://doi.org/10.1007/978-3-030-92571-0_7
Published: 10 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92570-3
Online ISBN: 978-3-030-92571-0
eBook Packages: Computer Science, Computer Science (R0)
