Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 188)

Abstract

Patient detection is a well-known domain in which datasets are very likely to be imbalanced. In such systems there are many clients, of whom only a few are patients while all the others are healthy, so a system that must detect patients among many clients will commonly face an imbalanced dataset. Breast cancer detection is a special case of such systems, in which the goal is to discriminate patient clients from healthy clients. The imbalance of a dataset can be either relative or non-relative. It is relative when the number of samples in the minority class is high but still much smaller than the number of samples in the majority class. It is non-relative when the number of samples in the minority class is low. This paper presents an algorithm that is well suited to non-relative imbalanced datasets. It is efficient in terms of both learning speed and learning efficacy. The experimental results show that the proposed algorithm outperforms some of the best methods in the literature.
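
The distinction between relative and non-relative imbalance can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it only labels a dataset according to the definitions in the abstract. The function name `describe_imbalance` and both thresholds (`ratio_threshold`, `small_class_size`) are hypothetical, since the abstract gives no numeric cut-offs.

```python
from collections import Counter


def describe_imbalance(labels, ratio_threshold=0.2, small_class_size=100):
    """Label a dataset as 'balanced', 'relative', or 'non-relative'.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    minority = min(counts.values())
    majority = max(counts.values())
    # Treat the dataset as balanced if the minority class is not much smaller.
    if minority / majority >= ratio_threshold:
        return "balanced"
    # Imbalanced: the absolute size of the minority class decides the kind.
    if minority >= small_class_size:
        return "relative"      # many minority samples, but far fewer than majority
    return "non-relative"      # few minority samples in absolute terms


# Usage with made-up class counts (not data from the paper).
print(describe_imbalance(["healthy"] * 10000 + ["patient"] * 500))  # relative
print(describe_imbalance(["healthy"] * 1000 + ["patient"] * 12))    # non-relative
```

Under these assumed thresholds, a screening dataset with only a handful of patient records falls into the non-relative case, which is the setting the paper targets.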

Author information

Corresponding author

Correspondence to Hamid Parvin.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Parvin, H., Ansari, S., Parvin, S. (2013). Proposing a New Method for Non-relative Imbalanced Dataset. In: Snášel, V., Abraham, A., Corchado, E. (eds) Soft Computing Models in Industrial and Environmental Applications. Advances in Intelligent Systems and Computing, vol 188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32922-7_31

  • DOI: https://doi.org/10.1007/978-3-642-32922-7_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32921-0

  • Online ISBN: 978-3-642-32922-7

  • eBook Packages: Engineering, Engineering (R0)