Integrating Machine Learning Techniques in Semantic Fake News Detection

Braşoveanu, Adrian M. P.; Andonie, Răzvan

doi:10.1007/s11063-020-10365-x

Published: 29 October 2020

Volume 53, pages 3055–3072, (2021)

Neural Processing Letters

Abstract

The nuances of language, as well as the varying degrees of truth observed in news items, make fake news detection a difficult problem. A news item is never launched without a purpose; to understand its motivation, it is best to analyze the relations between the speaker and its subject, as well as different credibility metrics. Inferring details about the various actors involved in a news item requires a hybrid approach that mixes machine learning, semantics, and natural language processing. This article discusses a semantic fake news detection method built around relational features such as sentiment, entities, or facts extracted directly from text. Our experiments focus on short texts with different degrees of truth and show that adding semantic features improves accuracy significantly.
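To make the idea of adding semantic features to a short text more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy sentiment lexicon (`POSITIVE`, `NEGATIVE`) and a crude capitalization heuristic as a stand-in for entity extraction; the function names `semantic_features`, `bag_of_words`, and `build_features` are hypothetical and chosen only for illustration. A real system would likely replace these with a trained sentiment model and an entity linker.

```python
import re
from collections import Counter

# Hypothetical toy lexicons (assumption): stand-ins for a real sentiment model.
POSITIVE = {"good", "great", "true", "success"}
NEGATIVE = {"bad", "fraud", "lie", "disaster"}


def tokenize(text):
    """Split text into alphabetic tokens, keeping the original casing."""
    return re.findall(r"[A-Za-z']+", text)


def semantic_features(text):
    """Return [sentiment score, entity count] for a short statement."""
    tokens = tokenize(text)
    lowered = [t.lower() for t in tokens]
    positive_hits = sum(t in POSITIVE for t in lowered)
    negative_hits = sum(t in NEGATIVE for t in lowered)
    # Crude entity proxy (assumption): capitalized tokens other than the
    # very first token of the text.
    entity_count = sum(1 for i, t in enumerate(tokens) if i > 0 and t[0].isupper())
    return [positive_hits - negative_hits, entity_count]


def bag_of_words(text, vocabulary):
    """Return word counts for each vocabulary entry, in vocabulary order."""
    counts = Counter(t.lower() for t in tokenize(text))
    return [counts[word] for word in vocabulary]


def build_features(text, vocabulary):
    """Concatenate plain text features with the semantic features."""
    return bag_of_words(text, vocabulary) + semantic_features(text)


if __name__ == "__main__":
    statement = "The senator said the Obama plan was a disaster and a lie."
    vocabulary = ["plan", "senator", "tax"]
    print(build_features(statement, vocabulary))
```

The resulting vector could be passed to any classifier; the design point is only that the semantic features are appended to the text representation rather than replacing it.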

Author information

Authors and Affiliations

MODUL Technology GmbH, Vienna, Austria
Adrian M. P. Braşoveanu
Computer Science Department, Central Washington University, Ellensburg, WA, USA
Răzvan Andonie
Electronics and Computers Department, Transilvania University of Braşov, Braşov, Romania
Adrian M. P. Braşoveanu & Răzvan Andonie

Corresponding author

Correspondence to Adrian M. P. Braşoveanu.

About this article

Cite this article

Braşoveanu, A.M.P., Andonie, R. Integrating Machine Learning Techniques in Semantic Fake News Detection. Neural Process Lett 53, 3055–3072 (2021). https://doi.org/10.1007/s11063-020-10365-x

Accepted: 03 October 2020
Published: 29 October 2020
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11063-020-10365-x
