A Novel Approach to Solve Class Imbalance Problem Using Noise Filter Method

  • Conference paper
Intelligent Systems Design and Applications (ISDA 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 940)

Abstract

Over-sampling is currently one of the most popular pre-processing techniques for handling class imbalance problems. It balances the dataset to achieve a high classification rate and to avoid bias towards majority class samples. Over-sampling techniques take all minority samples in the training data into consideration when performing classification. However, the presence of noise (in both the minority and the majority samples) may degrade classification performance. Hence, this work introduces a noise filter over-sampling approach combined with the Adaptive Boosting algorithm (AdaBoost) for effective classification. The approach is compared with state-of-the-art ensemble learning methods such as AdaBoost, RUSBoost, and SMOTEBoost on 14 imbalanced binary-class datasets with various Imbalance Ratios (IR). The experimental results, measured with F-Measure and AUC, show that the approach is promising and effective for dealing with imbalanced datasets.
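The abstract does not say which noise filter or over-sampler is used, so the following is only an illustrative sketch of the general idea of filtering noise before over-sampling. It assumes a k-nearest-neighbour filter (in the spirit of edited nearest neighbours) and plain random duplication of minority samples; the function names, the toy data, and the choice of k are all hypothetical, and the AdaBoost training step is omitted.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """IR = size of the largest class divided by size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def knn_noise_filter(X, y, k=3):
    """Keep only samples whose label matches the majority vote of their
    k nearest neighbours (assumed filter, not necessarily the paper's)."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from sample i to every other sample, nearest first.
        neighbours = sorted(
            (math.dist(xi, xj), yj)
            for j, (xj, yj) in enumerate(zip(X, y))
            if j != i
        )[:k]
        votes = Counter(label for _, label in neighbours)
        if votes.most_common(1)[0][0] == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def f_measure(y_true, y_pred, positive=1):
    """F-Measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: class 0 (majority) near the origin, class 1 (minority) near (5, 5),
# plus one minority-labelled point sitting inside the majority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
     (5, 5), (5, 6), (6, 5), (0.4, 0.4)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

X_f, y_f = knn_noise_filter(X, y, k=3)   # filter noise first
X_b, y_b = random_oversample(X_f, y_f)   # then balance the classes
print(imbalance_ratio(y), imbalance_ratio(y_f), imbalance_ratio(y_b))
```

On this toy data the filter discards the minority-labelled point inside the majority cluster before balancing, so the over-sampler never replicates it. In a full pipeline the balanced set would then be passed to a boosting classifier and scored with F-Measure and AUC.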

Supported by KL University.

Corresponding author

Correspondence to Gillala Rekha.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Rekha, G., Tyagi, A.K., Krishna Reddy, V. (2020). A Novel Approach to Solve Class Imbalance Problem Using Noise Filter Method. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-16657-1_45
