A Heterogeneous Network-Based Positive and Unlabeled Learning Approach to Detect Fake News

  • Conference paper
Intelligent Systems (BRACIS 2021)

Abstract

The dynamism of fake news evolution and dissemination plays a crucial role in influencing and confirming personal beliefs. To minimize the spread of disinformation, the approaches proposed in the literature for automatic fake news detection generally learn models through binary supervised algorithms that consider textual and contextual information. However, labeling significant amounts of real news to build accurate classifiers is difficult and time-consuming because real news covers a broad spectrum of subjects. Positive and unlabeled learning (PUL) can be a good alternative in this scenario. PUL algorithms learn models from a small amount of labeled data of the class of interest and use unlabeled data to increase classification performance. This paper proposes a heterogeneous network variant of PU-LP, a PUL algorithm based on similarity networks. Our network incorporates different linguistic features to characterize fake news, such as representative terms, emotiveness, pausality, and average sentence size. We also considered two representations of the news to compute similarity: term frequency-inverse document frequency, and Doc2Vec, which creates fixed-size document representations regardless of document length. We evaluated our approach on six datasets written in Portuguese or English, comparing its performance with a binary semi-supervised baseline, using two well-established label propagation algorithms: LPHN and GNetMine. The results indicate that PU-LP with heterogeneous networks can be competitive with binary semi-supervised learning. In addition, linguistic features such as representative terms and pausality improved classification performance, especially when only a small amount of labeled news is available.
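As an illustration of two of the linguistic features named above, the following minimal sketch computes pausality and average sentence size for a single news text. The abstract does not define these features, so the definitions used here are assumptions: pausality is taken as the number of punctuation marks per sentence, and average sentence size as the number of words per sentence. The function names, the punctuation set, and the naive sentence splitter are hypothetical and are not taken from the authors' repository. Emotiveness is omitted because it would likely require a part-of-speech tagger.

```python
import re

# Assumption: a sentence ends at one or more of . ! ? followed by
# whitespace or the end of the text (a deliberately naive rule).
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")
# Assumption: these characters count as punctuation marks for pausality.
_PUNCTUATION = re.compile(r"[.,;:!?]")
# Assumption: a word is any run of alphanumeric/underscore characters.
_WORD = re.compile(r"\w+")


def split_sentences(text):
    """Split text into non-empty sentences using the naive rule above."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def pausality(text):
    """Punctuation marks per sentence (assumed definition)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return len(_PUNCTUATION.findall(text)) / len(sentences)


def average_sentence_size(text):
    """Words per sentence (assumed definition)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return len(_WORD.findall(text)) / len(sentences)


if __name__ == "__main__":
    sample = "Breaking news! The mayor, according to sources, resigned. Really?"
    print(pausality(sample), average_sentence_size(sample))
```

In a heterogeneous network of the kind the abstract describes, values like these could be discretized and attached to news nodes as additional feature nodes; how the paper actually does this is not stated in the abstract.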

Supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior [10662147/D], Fundação de Amparo à Pesquisa do Estado de São Paulo [2019/25010-5, 2019/07665-4], and Conselho Nacional de Desenvolvimento Científico e Tecnológico [426663/2018-7, 433082/2018-6, and 438017/2018-8].

Notes

  1. All datasets and source codes used in this paper are available in our public repository: https://github.com/marianacaravanti/A-Heterogeneous-Network-based-Positive-and-Unlabeled-Learning-Approach-to-Detecting-Fake-News.

  2. https://github.com/several27/FakeNewsCorpus.

  3. We also evaluate the average of \(F_1\) for both classes (macro-averaging \(F_{1}\)). Due to space limitations, the complete results are available in our public repository: https://github.com/marianacaravanti/A-Heterogeneous-Network-based-Positive-and-Unlabeled-Learning-Approach-to-Detecting-Fake-News/tree/main/Results.

Author information

Corresponding author

Correspondence to Mariana C. de Souza.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

de Souza, M.C., Nogueira, B.M., Rossi, R.G., Marcacini, R.M., Rezende, S.O. (2021). A Heterogeneous Network-Based Positive and Unlabeled Learning Approach to Detect Fake News. In: Britto, A., Valdivia Delgado, K. (eds) Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science, vol 13074. Springer, Cham. https://doi.org/10.1007/978-3-030-91699-2_1

  • DOI: https://doi.org/10.1007/978-3-030-91699-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91698-5

  • Online ISBN: 978-3-030-91699-2

  • eBook Packages: Computer Science, Computer Science (R0)
