A Classifier Hub for Imbalanced Financial Data

  • Conference paper
Databases Theory and Applications (ADC 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9877)

Abstract

We design and implement a classifier hub that explores the detailed information of an imbalanced dataset and classifies its records into two classes. To counter data imbalance, the hub lets the user set an imbalance ratio that adjusts the proportion of majority-class to minority-class samples. The hub also implements Decision Tree, KNN, and Random Forest machine learning classifiers, based on Python and Java. In the experiments, we use 30,000 loan records from an online P2P system as the dataset to demonstrate the functions of the classifier hub, and we compare the influence of different imbalance ratios on classification performance across the Decision Tree, KNN, and Random Forest algorithms.
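
The abstract does not say how the hub changes the class proportions, so the following is only a minimal sketch of one common approach, random undersampling of the majority class, under stated assumptions. The function name `undersample_to_ratio`, the reading of "imbalance ratio" as majority samples per minority sample, and the toy labels (0 for the majority class, 1 for the minority class) are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def undersample_to_ratio(rows, labels, ratio, seed=0):
    """Randomly drop majority-class rows until majority/minority is about `ratio`.

    Assumption: `ratio` means "majority samples per minority sample".
    The paper may define its imbalance ratio differently.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    (majority, n_major), (minority, n_minor) = counts.most_common(2)

    # Keep at most as many majority rows as are available.
    keep = min(n_major, int(round(ratio * n_minor)))

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    major_idx = [i for i, y in enumerate(labels) if y == majority]
    minor_idx = [i for i, y in enumerate(labels) if y == minority]

    # Sample majority rows without replacement, keep every minority row,
    # and restore the original row order.
    chosen = sorted(rng.sample(major_idx, keep) + minor_idx)
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


# Toy data (hypothetical): 90 majority-class records and 10 minority-class records.
rows = [[i] for i in range(100)]
labels = [0] * 90 + [1] * 10

new_rows, new_labels = undersample_to_ratio(rows, labels, ratio=3)
print(Counter(new_labels))
```

The resampled rows and labels could then be passed to any of the three classifiers named in the abstract, repeating the call with different `ratio` values to compare how each one responds to the class balance.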

Acknowledgment

This research has been funded by the Guangzhou Science and Technology Plan Project “Collaborative Innovation Project Oriented Big Data Security Industry Chain” (No. 201508010067).

Corresponding author

Correspondence to Jianguo Li.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Abeysinghe, C., Li, J., He, J. (2016). A Classifier Hub for Imbalanced Financial Data. In: Cheema, M., Zhang, W., Chang, L. (eds) Databases Theory and Applications. ADC 2016. Lecture Notes in Computer Science, vol 9877. Springer, Cham. https://doi.org/10.1007/978-3-319-46922-5_43

  • DOI: https://doi.org/10.1007/978-3-319-46922-5_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46921-8

  • Online ISBN: 978-3-319-46922-5

  • eBook Packages: Computer Science, Computer Science (R0)
