Abstract
We design and implement a classifier hub that explores detailed information about an imbalanced dataset and classifies its records into two classes. To counter data imbalance, the hub can adjust the proportion of majority and minority class records by setting an imbalance ratio. The hub implements Decision Tree, KNN, and Random Forest classifiers in Python and Java. In the experiments, we use 30,000 loan records from an online P2P system to demonstrate the functions of the classifier hub, and we compare how different imbalance ratios affect classification performance for the Decision Tree, KNN, and Random Forest algorithms.
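The idea of setting an imbalance ratio can be illustrated with a minimal sketch. The abstract does not say how the hub realises a given ratio, so this example assumes random under-sampling of the majority class; the helper name resample_to_ratio, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def resample_to_ratio(rows, labels, ratio, minority_label=1, seed=0):
    """Randomly under-sample the majority class so that
    majority_count / minority_count is at most `ratio`.

    Hypothetical helper: random under-sampling is an assumption,
    not the method described in the paper.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Keep every minority record; cap the majority at ratio * minority size.
    keep_majority = min(len(majority), int(round(ratio * len(minority))))
    chosen = minority + rng.sample(majority, keep_majority)
    rng.shuffle(chosen)
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


# Toy data standing in for loan records: 900 of class 0, 100 of class 1.
rows = [[i] for i in range(1000)]
labels = [1 if i < 100 else 0 for i in range(1000)]

for ratio in (1, 3, 9):
    _, y = resample_to_ratio(rows, labels, ratio)
    print(ratio, Counter(y))
```

Each resampled set could then be passed to a classifier of choice, so that performance can be compared across ratios on otherwise identical data.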
Acknowledgment
This research has been funded by the Guangzhou Science and Technology Plan Project “Collaborative Innovation Project Oriented Big Data Security Industry Chain” (No. 201508010067).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Abeysinghe, C., Li, J., He, J. (2016). A Classifier Hub for Imbalanced Financial Data. In: Cheema, M., Zhang, W., Chang, L. (eds) Databases Theory and Applications. ADC 2016. Lecture Notes in Computer Science, vol 9877. Springer, Cham. https://doi.org/10.1007/978-3-319-46922-5_43
DOI: https://doi.org/10.1007/978-3-319-46922-5_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46921-8
Online ISBN: 978-3-319-46922-5
eBook Packages: Computer Science, Computer Science (R0)