Abstract
Organizations aim to harness predictive insights by applying data mining techniques to the vast real-time data stores they have accumulated over the years. The health sector holds an extremely large source of digital data, the patient health data store, which can be used effectively for predictive analytics. This data may contain missing, incorrect, and sometimes incomplete value sets, which can have a detrimental effect on the decisions that result from data analytics. Using the PIMA Indians Diabetes dataset, we propose an efficient imputation method that uses a hybrid combination of CART and a Genetic Algorithm as a preprocessing step. A classical neural network model is then used for prediction on the preprocessed dataset. The accuracy achieved by the proposed model far exceeds that of existing models, mainly because of the soft computing preprocessing adopted. The approach is simple, practical, and easy to understand and implement.
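To make the idea of tree-based imputation concrete, the following is a minimal sketch and not the hybrid method proposed in the paper: it fits a single CART-style split (a depth-1 regression tree) on one predictor and uses the leaf means to fill in missing values. The Genetic Algorithm stage and the neural network classifier are omitted. The function names `fit_stump` and `impute`, the column names `bmi` and `skin`, the toy rows, and the use of `None` to mark a missing value are all assumptions made for illustration.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree (one CART-style split) on one feature.

    Returns (threshold, left_mean, right_mean), choosing the threshold that
    minimizes the summed squared error of the two leaves.
    Assumes xs and ys are non-empty and of equal length.
    """
    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    points = sorted(set(xs))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t, mean(left), mean(right))
    if best is None:
        # Only one distinct predictor value: fall back to the overall mean.
        m = mean(ys)
        return (points[0], m, m)
    return best[1:]


def impute(rows, target, predictor):
    """Return copies of rows with missing (None) target values filled in.

    The stump is trained only on rows where both columns are present
    (assumes at least one such row exists).
    """
    known = [r for r in rows
             if r[target] is not None and r[predictor] is not None]
    threshold, left_mean, right_mean = fit_stump(
        [r[predictor] for r in known], [r[target] for r in known])
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input rows are left unchanged
        if r[target] is None and r[predictor] is not None:
            r[target] = left_mean if r[predictor] <= threshold else right_mean
        filled.append(r)
    return filled


# Hypothetical toy records; these are not values from the PIMA dataset.
rows = [
    {"bmi": 22.0, "skin": 15.0},
    {"bmi": 24.0, "skin": 17.0},
    {"bmi": 35.0, "skin": 33.0},
    {"bmi": 37.0, "skin": 35.0},
    {"bmi": 23.0, "skin": None},
    {"bmi": 36.0, "skin": None},
]
for row in impute(rows, "skin", "bmi"):
    print(row)
```

A full CART model would split recursively on several predictors. A single split is used here only to keep the sketch short and dependent on nothing beyond the standard library.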
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bhat, V.H., Rao, P.G., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M. (2009). An Efficient Prediction Model for Diabetic Database Using Soft Computing Techniques. In: Sakai, H., Chakraborty, M.K., Hassanien, A.E., Ślęzak, D., Zhu, W. (eds) Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. RSFDGrC 2009. Lecture Notes in Computer Science, vol 5908. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10646-0_40
DOI: https://doi.org/10.1007/978-3-642-10646-0_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10645-3
Online ISBN: 978-3-642-10646-0
eBook Packages: Computer Science, Computer Science (R0)