A Data-Mining Model for Predicting Low Birth Weight with a High AUC

Computer and Information Science (ICIS 2017)

Part of the book series: Studies in Computational Intelligence (SCI, volume 719)

Abstract

Birth weight is a significant determinant of a newborn’s probability of survival. Data-mining models are receiving considerable attention for identifying risk factors for low birth weight. However, predicting the actual birth weight from the identified risk factors, which could play a significant role in identifying mothers at risk of delivering low birth weight infants, remains an unsolved problem. This paper presents a study of data-mining models that predict the actual birth weight, with particular emphasis on achieving a high area under the receiver operating characteristic curve (AUC). The prediction is based on 2006 birth data from the North Carolina State Center for Health Statistics. The steps followed to extract meaningful patterns from the data were data selection, handling of missing values, handling of imbalanced data, model building, feature selection, and model evaluation. Decision trees were used to classify birth weight and were tested both on the original imbalanced dataset and on a dataset balanced with the synthetic minority oversampling technique (SMOTE). The results show that models built on datasets balanced with SMOTE produce a higher AUC than models built on the imbalanced dataset. The J48 model built on balanced data outperformed REPTree and Random tree with an AUC of 90.3% and was therefore selected as the best model. In conclusion, the feasibility of using J48 for birth weight prediction offers the possibility of reducing obstetric-related complications and thus improving overall obstetric health care.
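The two ideas the abstract leans on most, balancing the minority class with SMOTE and scoring a classifier by AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract names J48, REPTree and Random tree but gives no implementation details, so the pure-Python functions `smote` and `auc` below, their parameters, and the feature rows are all hypothetical and chosen only for illustration. The sketch assumes numeric features on comparable scales, since neighbours are found by Euclidean distance.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority rows by squared Euclidean distance to base.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up rows: [gestation in weeks, maternal age in years]; not the study's data.
low_weight = [[34.0, 19.0], [35.0, 22.0], [33.0, 41.0], [36.0, 17.0]]
n_normal = 12  # hypothetical size of the majority (normal birth weight) class
new_rows = smote(low_weight, n_new=n_normal - len(low_weight), k=3)
balanced_minority = low_weight + new_rows  # 12 rows, matching the majority class

# Four cases, two of them positive; three of the four pairs are ranked correctly.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

Because each synthetic row lies on a line segment between two real minority rows, oversampling adds plausible variation rather than exact duplicates. In practice the oversampling would normally be applied to the training split only, so that the AUC is measured on untouched data.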

Author information

Corresponding author

Correspondence to Uzapi Hange.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Hange, U., Selvaraj, R., Galani, M., Letsholo, K. (2018). A Data-Mining Model for Predicting Low Birth Weight with a High AUC. In: Lee, R. (eds) Computer and Information Science. ICIS 2017. Studies in Computational Intelligence, vol 719. Springer, Cham. https://doi.org/10.1007/978-3-319-60170-0_8

  • DOI: https://doi.org/10.1007/978-3-319-60170-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60169-4

  • Online ISBN: 978-3-319-60170-0

  • eBook Packages: Engineering, Engineering (R0)
