Abstract
Analyzing large-scale diagnosis histories of patients can help to discover comorbidity and disease progression patterns. Open data initiatives have recently made it possible to access statewide patient data at the individual level, such as the New York State SPARCS data. The goal of this study is to explore frequent disease co-occurrence and sequence patterns of cancer patients in New York State using SPARCS data. Our collection includes 18,208,830 discharge records from 1,565,237 patients with cancer-related diagnoses during 2011–2015. We use the Apriori algorithm to discover the top disease co-occurrences for common cancer categories, ranked by support. We use the cSPADE algorithm to generate the most frequent diagnosis sequences that contain at least one cancer-related diagnosis from patients’ diagnosis histories. Our data-driven approach provides essential knowledge to support the investigation of disease co-occurrence and progression patterns, with the aim of improving the management of multiple diseases.
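To make the support-based co-occurrence idea concrete, the following is a minimal sketch of level-wise (Apriori-style) frequent itemset mining. It is not the authors' implementation: the function name `apriori`, the toy `records`, and the 0.4 support threshold are all assumptions chosen for illustration, and the diagnosis labels are invented rather than taken from SPARCS.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical discharge records: one set of diagnosis categories per record.
records = [
    {"breast cancer", "hypertension", "diabetes"},
    {"breast cancer", "hypertension"},
    {"lung cancer", "COPD", "hypertension"},
    {"lung cancer", "COPD"},
    {"breast cancer", "diabetes", "hypertension"},
]

frequent = apriori(records, min_support=0.4)

# List co-occurring pairs, highest support first.
pairs = sorted(((s, sorted(i)) for i, s in frequent.items() if len(i) == 2),
               reverse=True)
for s, names in pairs:
    print(f"{names[0]} + {names[1]}: support {s:.2f}")
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is only counted if all of its smaller subsets were already frequent. On data at the scale described in the abstract, a dedicated library would be a more realistic choice than this pure-Python loop.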
Acknowledgments
This work is supported in part by NSF ACI 1443054, by NSF IIS 1350885, and by NSF IIP 1069147.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wang, Y., Wang, F. (2017). Association Rule Learning and Frequent Sequence Mining of Cancer Diagnoses in New York State. In: Begoli, E., Wang, F., Luo, G. (eds) Data Management and Analytics for Medicine and Healthcare. DMAH 2017. Lecture Notes in Computer Science, vol 10494. Springer, Cham. https://doi.org/10.1007/978-3-319-67186-4_10
DOI: https://doi.org/10.1007/978-3-319-67186-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67185-7
Online ISBN: 978-3-319-67186-4
eBook Packages: Computer Science, Computer Science (R0)