Abstract
Case-based reasoning (CBR) is the first choice for experience-based problems such as diagnosis. However, building a case base for CBR is challenging. Electronic health record (EHR) data can provide a starting point for building a case base, but it requires a set of preprocessing steps. In this paper, we propose a case-base preparation framework for CBR systems. The framework consists of three main phases: data preparation, fuzzification, and coding. This paper focuses only on the data-preprocessing phase, which prepares the EHR database as a knowledge source for CBR cases. It uses several machine-learning algorithms for feature selection and weighting, normalization, and other tasks. As a case study, we apply these algorithms to a diabetes diagnosis data set. To check the effect of the data preparation steps, a CBR prototype is designed for diabetes diagnosis and the prediction of its complications, such as kidney failure. The results show an enhancement to the case retrieval process of the implemented CBR system.
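To make the role of normalization and feature weighting in case retrieval concrete, the following is a minimal sketch and not the authors' implementation. It assumes min-max scaling and a weighted absolute-difference similarity. The feature names, weights, and case values are hypothetical and chosen only for illustration.

```python
def min_max_bounds(cases, features):
    """Collect the (min, max) of each numeric feature across the case base."""
    return {f: (min(c[f] for c in cases), max(c[f] for c in cases))
            for f in features}


def normalize(case, bounds):
    """Rescale the bounded features of one case to [0, 1], clipping outliers."""
    scaled = dict(case)
    for f, (lo, hi) in bounds.items():
        value = 0.0 if hi == lo else (case[f] - lo) / (hi - lo)
        scaled[f] = min(1.0, max(0.0, value))
    return scaled


def weighted_similarity(a, b, weights):
    """Similarity in [0, 1]: one minus the weighted mean absolute difference."""
    total = sum(weights.values())
    distance = sum(w * abs(a[f] - b[f]) for f, w in weights.items())
    return 1.0 - distance / total


def retrieve(query, cases, weights, k=1):
    """Return the k stored cases most similar to the query."""
    bounds = min_max_bounds(cases, list(weights))
    scaled_query = normalize(query, bounds)
    ranked = sorted(
        cases,
        key=lambda c: weighted_similarity(scaled_query, normalize(c, bounds), weights),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical case base and feature weights (illustrative values only).
CASE_BASE = [
    {"id": 1, "glucose": 90, "bmi": 22, "age": 30, "diagnosis": "non-diabetic"},
    {"id": 2, "glucose": 160, "bmi": 31, "age": 55, "diagnosis": "diabetic"},
    {"id": 3, "glucose": 120, "bmi": 27, "age": 45, "diagnosis": "prediabetic"},
]
WEIGHTS = {"glucose": 0.6, "bmi": 0.3, "age": 0.1}

best = retrieve({"glucose": 150, "bmi": 30, "age": 50}, CASE_BASE, WEIGHTS, k=1)[0]
print(best["id"], best["diagnosis"])
```

Without the scaling step, the feature with the widest raw range (here glucose) would dominate the distance regardless of its assigned weight, which is one reason normalization precedes retrieval in a pipeline of this kind.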
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
El-Sappagh, S., Elmogy, M., Riad, A.M., Zaghlol, H., Badria, F.A. (2014). EHR Data Preparation for Case Based Reasoning Construction. In: Hassanien, A.E., Tolba, M.F., Taher Azar, A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2014. Communications in Computer and Information Science, vol 488. Springer, Cham. https://doi.org/10.1007/978-3-319-13461-1_45
DOI: https://doi.org/10.1007/978-3-319-13461-1_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13460-4
Online ISBN: 978-3-319-13461-1
eBook Packages: Computer Science, Computer Science (R0)