Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1272))

Abstract

This article reviews methodologies for conducting political analysis using information sources available on the Internet. In some societies, the use of social networks has a significant impact on politics, and a range of methodologies has been used to analyze political issues and the strategies to be followed. The purpose of this paper is to understand these methodologies in order to provide potential voters with the information they need to make informed decisions. First, the necessary web scraping terminology is reviewed; then, some examples of political analysis projects that have used web scraping are presented. Finally, conclusions are drawn.

Corresponding author

Correspondence to Noel Varela.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Varela, N., Lezama, O.B.P., Charris, M. (2021). Web Scraping and Naïve Bayes Classification for Political Analysis. In: Pandian, A.P., Palanisamy, R., Ntalianis, K. (eds) Proceedings of International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1272. Springer, Singapore. https://doi.org/10.1007/978-981-15-8443-5_1
