Text Analytics

Essentials of Business Analytics

Part of the book series: International Series in Operations Research & Management Science ((ISOR,volume 264))

Abstract

The main focus of this textbook thus far has been the analysis of numerical data. Text analytics, introduced in this chapter, is concerned with understanding and examining data in textual form, which tends to be less structured and therefore more complex to analyze. It uses tools such as those available in R to extract meaning from large volumes of text. The chapter describes two methods, bag-of-words and natural language processing (NLP), and focuses on the former. The bag-of-words approach attaches no meaning to the order in which words appear; its applications include clustering or segmentation of documents and sentiment analysis. NLP, by contrast, uses the order and “type” of words to infer meaning, and therefore deals more with issues such as parts of speech.
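
To make the bag-of-words idea concrete, here is a minimal sketch that builds a document-term matrix and a simple lexicon-based sentiment score. The chapter itself works in R; Python is used here purely for illustration, and the function names (`tokenize`, `document_term_matrix`, `sentiment_score`), the sample documents, and the tiny word lists are all assumptions made for this example, not material from the chapter.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def sentiment_score(text, positive, negative):
    """Number of positive-lexicon tokens minus number of negative-lexicon tokens."""
    tokens = tokenize(text)
    return sum(t in positive for t in tokens) - sum(t in negative for t in tokens)


# Hypothetical sample documents; the first two differ only in word order.
docs = [
    "The dog bites the man",
    "The man bites the dog",
    "Great ice cream, great price",
]
vocab, dtm = document_term_matrix(docs)

# Hypothetical toy lexicons, far smaller than any real sentiment dictionary.
positive = {"great", "good"}
negative = {"bad", "awful"}
score = sentiment_score(docs[2], positive, negative)
```

Because only counts are kept, the first two documents map to identical rows of `dtm` even though they describe opposite events. That is the information a bag-of-words model discards and that order-aware NLP methods try to retain.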

Notes

  1. Tutorial link: http://www.mjdenny.com/Text_Processing_In_R.html (accessed on Dec 27, 2017).

  2. The dataset “icecream.csv” can be downloaded from the book’s website.

  3. https://raw.githubusercontent.com/sudhir-voleti/profile-script/master/sudhir%20shiny%20app%20run%20lists.txt (accessed on Dec 27, 2017).

  4. https://www.youtube.com/watch?v=tN6FYIOe0bs (accessed on Dec 27, 2017). Sudhir Voleti is the creator of the video.

  5. http://wordnet.princeton.edu/ (accessed on Feb 7, 2018).

  6. The NLTK package and documentation are available at http://www.nltk.org/ (accessed on Feb 10, 2018).

  7. The Apache OpenNLP package and documentation are available at https://opennlp.apache.org/ (accessed on Feb 10, 2018).

Author information

Corresponding author

Correspondence to Sudhir Voleti.

Electronic Supplementary Material

  • Supplementary Data 1: Generate_Document_Word_Matrix (CPP, 1 kb)
  • Supplementary Data 2: Github shiny code (R, 3 kb)
  • Supplementary Data 3: Icecream (R, 4 kb)
  • Supplementary Data 4: Ice-cream (TXT, 129 kb)

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Voleti, S. (2019). Text Analytics. In: Pochiraju, B., Seshadri, S. (eds) Essentials of Business Analytics. International Series in Operations Research & Management Science, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-68837-4_9
