\(DMDP^2\): A Dynamic Multi-source Based Default Probability Prediction Framework

  • Conference paper
Web and Big Data (APWeb-WAIM 2018)

Abstract

In this paper, we propose a dynamic forecasting framework, named \(DMDP^2\) (Dynamic Multi-source based Default Probability Prediction), to predict the default probability of a company. Default probability is an important factor in assessing the credit risk of companies listed on a stock market. To aid financial institutions in decision making, \(DMDP^2\) not only analyses financial data to capture the historical performance of a company, but also uses a Long Short-Term Memory (LSTM) model to dynamically incorporate daily news from social media, thereby taking the perceptions of market participants and public opinion into account. This paper makes two key contributions. First, we use unstructured news crawled from social media to alleviate the impact of financial fraud on default probability prediction. Second, we propose a neural network method that integrates structured financial factors and unstructured social media data, with appropriate time alignment, for default probability prediction. Extensive experimental results demonstrate the effectiveness of \(DMDP^2\) in predicting default probability for listed companies in mainland China, compared with various baselines.
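
The abstract mentions time alignment between structured financial factors and daily news, but does not state the alignment rule. As a minimal sketch, assuming each financial report is paired with the daily news vectors from a fixed number of days up to and including the report date, the alignment step might look like the following. The function name `align_news_to_reports`, the `window_days` parameter, and the data layout are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from datetime import date, timedelta


def align_news_to_reports(report_dates, news_by_day, window_days=30):
    """Pair each report date with the daily news vectors that precede it.

    report_dates: iterable of datetime.date (assumed report dates).
    news_by_day:  dict mapping datetime.date -> list of floats, one
                  aggregated news vector per day (assumed layout).
    Returns a dict: report date -> chronological list of (day, vector)
    for days in the half-open window (report - window_days, report].
    """
    days = sorted(news_by_day)
    aligned = {}
    for report in report_dates:
        start = report - timedelta(days=window_days)
        lo = bisect_right(days, start)   # first news day strictly after start
        hi = bisect_right(days, report)  # one past the last news day <= report
        aligned[report] = [(d, news_by_day[d]) for d in days[lo:hi]]
    return aligned


if __name__ == "__main__":
    # Toy data: four news days around one quarterly report date.
    news = {
        date(2018, 3, 28): [0.1],
        date(2018, 3, 29): [0.2],
        date(2018, 3, 31): [0.3],
        date(2018, 4, 1): [0.4],
    }
    sequences = align_news_to_reports([date(2018, 3, 31)], news, window_days=3)
    for report, seq in sequences.items():
        print(report, [d.isoformat() for d, _ in seq])
```

Each resulting sequence could then be fed to a sequence model such as an LSTM alongside the financial factors of the matching report; restricting the window to days on or before the report date keeps later news out of the input.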

Notes

  1. http://tushare.org

  2. http://finance.sina.com.cn

  3. https://github.com/fxsjy/jieba

Acknowledgments

This work is partially supported by the Hong Kong RGC GRF Project 16214716 and by NSFC grants No. 61602297 and 61729201.

Author information

Corresponding author

Correspondence to Yanyan Shen.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Zhao, Y., Huang, Y., Shen, Y. (2018). \(DMDP^2\): A Dynamic Multi-source Based Default Probability Prediction Framework. In: Cai, Y., Ishikawa, Y., Xu, J. (eds) Web and Big Data. APWeb-WAIM 2018. Lecture Notes in Computer Science, vol 10987. Springer, Cham. https://doi.org/10.1007/978-3-319-96890-2_26

  • DOI: https://doi.org/10.1007/978-3-319-96890-2_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-96889-6

  • Online ISBN: 978-3-319-96890-2

  • eBook Packages: Computer Science, Computer Science (R0)
