Efficient Multivariate Data Fusion for Misinformation Detection During High Impact Events

  • Conference paper
Discovery Science (DS 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13601)

Abstract

With the evolution of social media, cyberspace has become the de facto medium for users to communicate during high-impact events such as natural disasters, terrorist attacks, and periods of political unrest. During such events, however, misinformation can spread rapidly on social media, affecting decision-making and creating social unrest. Identifying the spread of misinformation during high-impact events is a significant data challenge, given the variety of data associated with social media posts. Recent machine learning advances have shown promise for detecting misinformation; however, key limitations remain. These include the effective and efficient modeling of the underlying non-linear associations in multi-modal data and the explainability of a system geared toward detecting misinformation. This paper presents a novel multivariate data fusion framework, based on pre-trained deep learning features and a well-structured, parameter-free joint blind source separation method named independent vector analysis, that can reliably address these limitations. We present the mathematical formulation of the new data fusion algorithm, demonstrate its effectiveness, and present multiple explainability case studies using a popular multi-modal dataset that consists of tweets posted during several high-impact events.
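
For orientation, independent vector analysis (IVA) is usually formulated in the literature [4, 5] along the following lines. This is a sketch of the standard model only; the paper's own notation and its choice of source density may differ. The symbols K (number of datasets, e.g. text and image features), N (number of sources), A, W, and the constant C are assumptions introduced here for illustration.

```latex
% Generative model: each of the K datasets is a linear mixture of N latent sources
\mathbf{x}^{[k]} = \mathbf{A}^{[k]} \mathbf{s}^{[k]}, \qquad k = 1, \dots, K

% Source estimates: one demixing matrix per dataset
\mathbf{y}^{[k]} = \mathbf{W}^{[k]} \mathbf{x}^{[k]}

% n-th source component vector (SCV): the n-th estimate taken from every dataset
\mathbf{y}_n = \left[ y_n^{[1]}, \dots, y_n^{[K]} \right]^{\top}

% IVA cost: mutual information among the N SCVs, up to a constant C
J_{\mathrm{IVA}} = \sum_{n=1}^{N} H(\mathbf{y}_n)
                 - \sum_{k=1}^{K} \log \left| \det \mathbf{W}^{[k]} \right| - C

% H denotes differential entropy (see note 6)
H(\mathbf{y}_n) = - \int p(\mathbf{y}_n) \log p(\mathbf{y}_n) \, d\mathbf{y}_n
```

Minimizing this cost makes the estimated SCVs mutually independent while retaining the dependence within each SCV, which is what links corresponding components across datasets.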

Notes

  1. https://github.com/MKLab-ITI/image-verification-corpus

  2. We also evaluated features created using Bidirectional Encoder Representations from Transformers (BERT) [18].

  3. https://code.google.com/archive/p/word2vec/

  4. Additionally, we evaluated Word2Vec trained on our own data.

  5. We also analyzed features from the 'avgpool' layer of a pre-trained ResNet-18 model.

  6. We consider continuous-valued random variables and, in the sequel, refer to differential entropy simply as entropy.

  7. https://www.independent.co.uk/news/world/asia/lahore-attack-photo-showing-eiffel-tower-lit-up-in-colours-of-pakistan-flag-is-from-2007-rugby-world-cup-a6959231.html

References

  1. The Washington Post (2018). https://rebrand.ly/ieeovv

  2. Newsweek (2019). https://rebrand.ly/z6t52a

  3. Hateful memes challenge and data set for research on harmful multimodal content. https://ai.facebook.com/blog/hateful-memes-challenge-and-data-set/

  4. Adalı, T., Anderson, M., Fu, G.S.: Diversity in independent component and vector analyses: identifiability, algorithms, and applications in medical imaging. IEEE Sig. Process. Mag. 31(3), 18–33 (2014)

  5. Anderson, M., Adalı, T., Li, X.L.: Joint blind source separation with multivariate Gaussian model: algorithms and performance analysis. IEEE Trans. Sig. Process. 60(4), 1672–1683 (2012). https://doi.org/10.1109/TSP.2011.2181836

  6. Baltrušaitis, T., Ahuja, C., Morency, L.P.: Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41(2), 423–443 (2018)

  7. BBC: Social media firms fail to act on covid-19 fake news. www.bbc.com/news/technology-52903680, June 2020

  8. Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O., Kompatsiaris, I.: Detection and visualization of misleading content on Twitter. Int. J. Multimedia Inf. Retrieval 7 (2018). https://doi.org/10.1007/s13735-017-0143-x

  9. Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O., Kompatsiaris, Y.: Detection and visualization of misleading content on Twitter. Int. J. Multimedia Inf. Retrieval 7(1), 71–86 (2018). https://doi.org/10.1007/s13735-017-0143-x

  10. Boukouvalas, Z., Fu, G.S., Adalı, T.: An efficient multivariate generalized Gaussian distribution estimator: application to IVA. In: 2015 49th Annual Conference on Information Sciences and Systems (CISS), pp. 1–4. IEEE (2015)

  11. Boukouvalas, Z., Levin-Schwartz, Y., Mowakeaa, R., Fu, G.S., Adalı, T.: Independent component analysis using semi-parametric density estimation via entropy maximization. In: 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 403–407. IEEE (2018)

  12. Boukouvalas, Z., Puerto, M., Elton, D.C., Chung, P.W., Fuge, M.D.: Independent vector analysis for molecular data fusion: Application to property prediction and knowledge discovery of energetic materials. In: 2020 28th European Signal Processing Conference (EUSIPCO), pp. 1030–1034. IEEE (2021)

  13. Cao, J., Qi, P., Sheng, Q., Yang, T., Guo, J., Li, J.: Exploring the role of visual content in fake news detection. In: Shu, K., Wang, S., Lee, D., Liu, H. (eds.) Disinformation, Misinformation, and Fake News in Social Media. LNSN, pp. 141–161. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-42699-6_8

  14. Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J.L., Blei, D.M.: Reading tea leaves: how humans interpret topic models. In: Advances in Neural Information Processing Systems, pp. 288–296 (2009)

  15. Comon, P., Jutten, C.: Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, Cambridge (2010)

  16. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995). https://doi.org/10.1023/A:1022627411411

  17. Damasceno, L.P., Cavalcante, C.C., Adalı, T., Boukouvalas, Z.: Independent vector analysis using semi-parametric density estimation via multivariate entropy maximization. In: ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3715–3719. IEEE (2021)

  18. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018). arxiv.org/abs/1810.04805

  19. Dick, J., Kuo, F.Y., Sloan, I.H.: High-dimensional integration: the quasi-Monte Carlo way. Acta Numerica 22, 133–288 (2013). https://doi.org/10.1017/S0962492913000044

  20. Fu, G., Boukouvalas, Z., Adali, T.: Density estimation by entropy maximization with kernels. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1896–1900, April 2015. https://doi.org/10.1109/ICASSP.2015.7178300

  21. Hansen, L.K., Rieger, L.: Interpretability in intelligent systems – a new concept? In: Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. LNCS (LNAI), vol. 11700, pp. 41–49. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28954-6_3

  22. Hardoon, D.R., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: an overview with application to learning methods. Neural Comput. 16(12), 2639–2664 (2004)

  23. Hiten Patel, M.: Fake news about covid-19 is spreading faster than virus. https://wexnermedical.osu.edu/blog/fake-news-about-covid-19, April 2020

  24. Hyvärinen, A., Karhunen, J., Oja, E.: Independent Component Analysis, vol. 46. Wiley, Hoboken (2004)

  25. Kim, T., Eltoft, T., Lee, T.-W.: Independent vector analysis: an extension of ICA to multivariate components. In: Rosca, J., Erdogmus, D., Príncipe, J.C., Haykin, S. (eds.) ICA 2006. LNCS, vol. 3889, pp. 165–172. Springer, Heidelberg (2006). https://doi.org/10.1007/11679363_21

  26. Linardatos, P., Papastefanopoulos, V., Kotsiantis, S.: Explainable AI: a review of machine learning interpretability methods. Entropy 23(1), 18 (2020)

  27. Mikolov, T., Chen, K., Corrado, G.S., Dean, J.: Efficient estimation of word representations in vector space. CoRR abs/1301.3781 (2013). arxiv.org/abs/1301.3781

  28. Moroney, C., et al.: The case for latent variable vs deep learning methods in misinformation detection: an application to covid-19. In: Soares, C., Torgo, L. (eds.) DS 2021. LNCS (LNAI), vol. 12986, pp. 422–432. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88942-5_33

  29. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. Society for Industrial and Applied Mathematics, USA (1992)

  30. Ramachandram, D., Taylor, G.W.: Deep multimodal learning: a survey on recent advances and trends. IEEE Sig. Process. Mag. 34(6), 96–108 (2017)

  31. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?”: explaining the predictions of any classifier. CoRR abs/1602.04938 (2016). arxiv.org/abs/1602.04938

  32. Sharma, K., Qian, F., Jiang, H., Ruchansky, N., Zhang, M., Liu, Y.: Combating fake news: a survey on identification and mitigation techniques. ACM Trans. Intell. Syst. Technol. (TIST) 10(3), 1–42 (2019)

  33. Suciu, P.: Covid-19 conspiracy theories continue to spread and thrive on social media. www.forbes.com/sites/petersuciu/2020/04/24/covid-19-conspiracy-theories-continue-to-spread-and-thrive-on-social-media/#e1a9e8b10076, April 2020

Author information

Corresponding author

Correspondence to Lucas P. Damasceno.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Damasceno, L.P., Shafer, A., Japkowicz, N., Cavalcante, C.C., Boukouvalas, Z. (2022). Efficient Multivariate Data Fusion for Misinformation Detection During High Impact Events. In: Pascal, P., Ienco, D. (eds) Discovery Science. DS 2022. Lecture Notes in Computer Science, vol 13601. Springer, Cham. https://doi.org/10.1007/978-3-031-18840-4_19

  • DOI: https://doi.org/10.1007/978-3-031-18840-4_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18839-8

  • Online ISBN: 978-3-031-18840-4

  • eBook Packages: Computer Science, Computer Science (R0)
