Extraction, Sentiment Analysis and Visualization of Massive Public Messages

  • Conference paper
New Trends in Databases and Information Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 241)

Abstract

This paper describes the design and implementation of tools to extract, analyze, and explore an arbitrarily large number of public messages from diverse sources. The aim of our work is to support sentiment analysis flexibly by adapting quickly to different use cases, languages, and message sources. First, a highly parallel scraper has been implemented; it lets the user customize its behavior through scripting and can therefore handle dynamically loaded content. Then, a novel framework has been developed to support agile programming, building, and validating a classifier for sentiment analysis. Finally, a web application allows the real-time selection and projection of the analysis results along different dimensions in an OLAP fashion.
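The OLAP-style projection mentioned in the abstract can be illustrated with a small roll-up over sentiment-labelled messages. This is only a minimal sketch under stated assumptions, not the authors' implementation: the record fields (`source`, `language`, `day`, `sentiment`), the sample data, and the `roll_up` helper are all hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are assumptions,
# not data from the paper. Sentiment is +1 (positive) or -1 (negative).
MESSAGES = [
    {"source": "forum", "language": "en", "day": "2013-05-01", "sentiment": 1},
    {"source": "forum", "language": "en", "day": "2013-05-01", "sentiment": -1},
    {"source": "blog", "language": "it", "day": "2013-05-01", "sentiment": 1},
    {"source": "blog", "language": "it", "day": "2013-05-02", "sentiment": 1},
]


def roll_up(messages, dimensions):
    """Group messages by the chosen dimensions and average their sentiment."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of scores, message count]
    for msg in messages:
        key = tuple(msg[d] for d in dimensions)
        totals[key][0] += msg["sentiment"]
        totals[key][1] += 1
    return {key: score_sum / count for key, (score_sum, count) in totals.items()}


if __name__ == "__main__":
    # Project the same results along different dimension sets.
    print(roll_up(MESSAGES, ["source"]))
    print(roll_up(MESSAGES, ["language", "day"]))
```

Changing the `dimensions` argument corresponds to selecting a different projection of the same analysis results, which is the kind of interaction the abstract attributes to the web application.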

Author information

Corresponding author

Correspondence to Jacopo Farina.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Farina, J., Mazuran, M., Quintarelli, E. (2014). Extraction, Sentiment Analysis and Visualization of Massive Public Messages. In: Catania, B., et al. New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 241. Springer, Cham. https://doi.org/10.1007/978-3-319-01863-8_18

  • DOI: https://doi.org/10.1007/978-3-319-01863-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01862-1

  • Online ISBN: 978-3-319-01863-8

  • eBook Packages: Engineering, Engineering (R0)
