Extraction, Sentiment Analysis and Visualization of Massive Public Messages

Farina, Jacopo; Mazuran, Mirjana; Quintarelli, Elisa

doi:10.1007/978-3-319-01863-8_18

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 241)

Abstract

This paper describes the design and implementation of tools to extract, analyze, and explore an arbitrarily large number of public messages from diverse sources. The aim of our work is to support sentiment analysis flexibly by adapting quickly to different use cases, languages, and message sources. First, a highly parallel scraper has been implemented; its behavior can be customized through scripting, which allows it to handle dynamically loaded content. Then, a novel framework has been developed to support agile programming, building, and validation of a sentiment classifier. Finally, a web application allows the analysis results to be selected and projected along different dimensions in real time, in an OLAP fashion.
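
The abstract does not show how the web application computes its views. As a rough illustration only of what projecting sentiment results along different dimensions "in an OLAP fashion" can mean, the following minimal Python sketch rolls up already classified messages by a chosen subset of dimensions. The record layout, the dimension names (source, language, day), the sample scores, and the function name roll_up are all assumptions made for this example and are not taken from the chapter.

```python
from collections import defaultdict

# Hypothetical classified messages: (source, language, day, sentiment score in [-1, 1]).
# These records are invented for illustration; they are not data from the chapter.
MESSAGES = [
    ("forum", "en", "2013-05-01", 0.5),
    ("forum", "en", "2013-05-02", -0.5),
    ("blog", "it", "2013-05-01", 1.0),
    ("blog", "en", "2013-05-01", 0.0),
]

# Assumed dimension names, in the same order as the first three tuple fields.
DIMENSIONS = ("source", "language", "day")


def roll_up(messages, keep):
    """Average sentiment grouped by the dimensions in `keep`; all others are aggregated out."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(lambda: [0.0, 0])  # group key -> [sum of scores, message count]
    for row in messages:
        key = tuple(row[i] for i in idx)
        totals[key][0] += row[3]
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


if __name__ == "__main__":
    print(roll_up(MESSAGES, ("source",)))             # one cell per source
    print(roll_up(MESSAGES, ("source", "language")))  # drill down by language
    print(roll_up(MESSAGES, ()))                      # grand average over all messages
```

Keeping fewer dimensions corresponds to a roll-up and keeping more to a drill-down. A real deployment would more likely delegate this grouping to a database or an OLAP engine than to in-memory Python.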

Author information

Authors and Affiliations

Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
Jacopo Farina, Mirjana Mazuran & Elisa Quintarelli

Corresponding author

Correspondence to Jacopo Farina.

Editor information

Editors and Affiliations

Dipartimento di Informatica, Bioingegneria, Robotica e, Università di Genova, Genova, Italy
Barbara Catania
Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
Tania Cerquitelli
Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
Silvia Chiusano
Dipartimento di Informatica, Bioingegneria, Robotica e, Università di Genova, Genova, Italy
Giovanna Guerrini
Cloudera, Inc., California, USA
Mirko Kämpf
Faculty of Informatics, Technische Universität München, Garching, Germany
Alfons Kemper
Dept. of Analytical Information Systems, Saint Petersburg University, Saint Petersburg, Russia
Boris Novikov
Dipartimento di Ingegneria e Scienza dell’Informazione, Università di Trento, Povo, TN, Italy
Themis Palpanas
Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Praha, Czech Republic
Jaroslav Pokorný
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Athena Vakali

About this paper

Cite this paper

Farina, J., Mazuran, M., Quintarelli, E. (2014). Extraction, Sentiment Analysis and Visualization of Massive Public Messages. In: Catania, B., et al. New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 241. Springer, Cham. https://doi.org/10.1007/978-3-319-01863-8_18

DOI: https://doi.org/10.1007/978-3-319-01863-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01862-1
Online ISBN: 978-3-319-01863-8
eBook Packages: Engineering, Engineering (R0)
