Abstract
In this paper, we propose a dynamic forecasting framework, named \(DMDP^2\) (Dynamic Multi-source based Default Probability Prediction), to predict the default probability of a company. Default probability is a key factor in assessing the credit risk of companies listed on a stock market. To aid financial institutions in decision making, \(DMDP^2\) not only analyses financial data to capture the historical performance of a company, but also uses a Long Short-Term Memory (LSTM) model to dynamically incorporate daily news from social media, taking the perceptions of market participants and public opinion into consideration. This paper makes two key contributions. First, we use unstructured news crawled from social media to alleviate the impact of financial fraud on default probability prediction. Second, we propose a neural network method that integrates structured financial factors and unstructured social media data with appropriate time alignment for default probability prediction. Extensive experimental results demonstrate the effectiveness of \(DMDP^2\) in predicting default probability for listed companies in mainland China, compared with various baselines.
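The abstract mentions aligning daily news with structured financial factors in time but does not spell out the rule. The sketch below illustrates one plausible alignment under a stated assumption: each news item is paired with the most recent financial report dated on or before it, so that no news is ever matched with financials from the future. The function name `align_news_to_reports`, the sample dates, and the sample headlines are hypothetical and not taken from the paper.

```python
from bisect import bisect_right
from datetime import date


def align_news_to_reports(report_dates, news_items):
    """Assign each (news_date, text) item to the latest report dated on or before it.

    Assumption (not from the paper): report_dates is sorted ascending, and
    news dated before the first report is dropped because there is no
    financial snapshot to pair it with.
    Returns {report_date: [texts in chronological order]}.
    """
    aligned = {d: [] for d in report_dates}
    for news_date, text in sorted(news_items, key=lambda item: item[0]):
        # Index of the last report whose date is <= news_date, or -1 if none.
        idx = bisect_right(report_dates, news_date) - 1
        if idx >= 0:
            aligned[report_dates[idx]].append(text)
    return aligned


# Hypothetical sample data for illustration only.
reports = [date(2017, 3, 31), date(2017, 6, 30)]
news = [
    (date(2017, 7, 2), "supplier dispute reported"),
    (date(2017, 4, 10), "new plant announced"),
    (date(2017, 1, 5), "before first report"),
    (date(2017, 6, 30), "quarterly results published"),
]
aligned = align_news_to_reports(reports, news)
```

Each resulting list of news texts could then be embedded and fed, in order, to a sequence model such as an LSTM alongside the financial factors of the matching report; that downstream step is omitted here.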
Acknowledgments
The work is partially supported by the Hong Kong RGC GRF Project 16214716 and by NSFC grants No. 61602297 and No. 61729201.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhao, Y., Huang, Y., Shen, Y. (2018). \(DMDP^2\): A Dynamic Multi-source Based Default Probability Prediction Framework. In: Cai, Y., Ishikawa, Y., Xu, J. (eds) Web and Big Data. APWeb-WAIM 2018. Lecture Notes in Computer Science, vol 10987. Springer, Cham. https://doi.org/10.1007/978-3-319-96890-2_26
DOI: https://doi.org/10.1007/978-3-319-96890-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96889-6
Online ISBN: 978-3-319-96890-2
eBook Packages: Computer Science, Computer Science (R0)