Sentiment analysis of financial news articles using performance indicators

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Mining financial text documents and understanding the sentiments of individual investors, institutions and markets is an important and challenging problem. Current approaches to mining sentiment from financial texts rely largely on domain-specific dictionaries. However, dictionary-based methods often fail to predict the polarity of financial texts accurately. This paper aims to improve on the state of the art and introduces a novel sentiment analysis approach that employs the concept of financial and non-financial performance indicators. It presents a hierarchical sentiment classifier, based on association rule mining, that predicts the polarity of financial texts as positive, neutral or negative. The proposed model is evaluated on a benchmark financial dataset and compared against state-of-the-art dictionary-based and machine learning-based approaches, with promising results. The novel use of performance indicators for financial sentiment analysis offers interesting and useful insights.

Author information

Corresponding author

Correspondence to Srikumar Krishnamoorthy.

Appendix

A. Parsing sentences using the NLTK toolkit

[Figure c: regular expression parser grammar, not reproduced here]

Steps:

  1. Parse the sentence with a regular expression parser using the grammar specified above.

  2. If a match for the 'NPJJ' tree pattern is found:

    (a) Traverse the subtree to get the combination of NP (potential performance indicator) and JJ/RB/VB (potential directionality word). Look up the dictionary for the matching indicator and directionality word.

    (b) If the combination of NP and JJ/RB/VB is not found, check for the presence of the individual words (either the performance indicator or the directionality word) in the dictionary.

    (c) Tag the sentence based on the identified matches.
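The lookup in steps 2(a) to 2(c) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the noun phrase and the directionality word have already been extracted from the matched subtree, and the dictionary entries, the function name `tag_chunk` and the fallback tag formats are hypothetical. Only the `LagInd::DOWN` style of combined tag is taken from the text.

```python
from typing import Optional

# Hypothetical dictionary entries; the paper's actual lexicon is not reproduced here.
INDICATORS = {"operating profit": "LagInd", "net sales": "LagInd"}
DIRECTION_WORDS = {"rose": "UP", "increased": "UP", "fell": "DOWN", "decreased": "DOWN"}


def tag_chunk(noun_phrase: str, direction_word: str) -> Optional[str]:
    """Tag one NP + JJ/RB/VB pair taken from a matched subtree."""
    indicator = INDICATORS.get(noun_phrase.lower())
    direction = DIRECTION_WORDS.get(direction_word.lower())
    # Step (a): both the indicator and the directionality word are known.
    if indicator and direction:
        return f"{indicator}::{direction}"
    # Step (b): fall back to whichever individual word is in the dictionary.
    # Step (c): the returned string is the tag; None means no match at all.
    return indicator or direction
```

For example, `tag_chunk("Operating profit", "fell")` yields a combined tag, while a pair in which only one word is known yields that word's tag alone.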

B. Parsing numeric values to determine directionality

Preconditions:

  1. The sentence has not been tagged with a combination of performance indicator and directionality using the parse rules specified in Section A above.

  2. The sentence contains terms such as "compared to", "versus", "down from" or "up from". These are common in sentences where one is likely to observe multiple numeric values without the use of directional words.

Example sentence: Operating profit margin was 8.3%, compared to 11.8% a year earlier.

Expected tag output: LagInd::DOWN

[Figure d: regular expression parser grammar for numeric values, not reproduced here]

Steps:

  1. Parse the sentence with a regular expression parser using the grammar specified above.

  2. If a match for the 'NPJJ' tree pattern is found:

    (a) Traverse the subtree to get the combination of NP, CD, CD to extract the performance indicator and numeric values. The numeric values are analyzed to determine the directionality (UP/DOWN).

    (b) Tag the sentence based on the identified matches.
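The numeric comparison can be illustrated with a short sketch. It is not the paper's parser: it replaces the chunk grammar with a plain regular expression over the raw sentence, assumes the first number is the current value and the second is the reference value, and uses hypothetical names (`numeric_direction`, `tag_sentence`) and a hypothetical indicator dictionary. Numbers with thousands separators or sentences containing years would need extra handling.

```python
import re
from typing import Optional

# Comparison terms listed in the preconditions above.
COMPARISON_TERMS = ("compared to", "versus", "down from", "up from")
# Hypothetical indicator dictionary; the paper's lexicon is not reproduced here.
INDICATORS = {"operating profit margin": "LagInd", "net sales": "LagInd"}
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def numeric_direction(sentence: str) -> Optional[str]:
    """Return UP or DOWN by comparing the first two numbers in the sentence."""
    lowered = sentence.lower()
    if not any(term in lowered for term in COMPARISON_TERMS):
        return None
    values = [float(match) for match in NUMBER.findall(sentence)]
    if len(values) < 2:
        return None
    current, reference = values[0], values[1]  # assumed ordering
    if current > reference:
        return "UP"
    if current < reference:
        return "DOWN"
    return None


def tag_sentence(sentence: str) -> Optional[str]:
    """Combine a matched indicator with the numeric direction."""
    direction = numeric_direction(sentence)
    if direction is None:
        return None
    lowered = sentence.lower()
    for phrase, label in INDICATORS.items():
        if phrase in lowered:
            return f"{label}::{direction}"
    return None


print(tag_sentence("Operating profit margin was 8.3%, compared to 11.8% a year earlier."))
# prints LagInd::DOWN
```

Under these assumptions the example sentence from this section produces the expected tag, because 8.3 is smaller than 11.8.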

About this article

Cite this article

Krishnamoorthy, S. Sentiment analysis of financial news articles using performance indicators. Knowl Inf Syst 56, 373–394 (2018). https://doi.org/10.1007/s10115-017-1134-1

  • DOI: https://doi.org/10.1007/s10115-017-1134-1
