MCA-Based Rule Mining Enables Interpretable Inference in Clinical Psychiatry

Precision Health and Medicine (W3PHAI 2019)

Abstract

The development of interpretable machine learning models for clinical healthcare applications has the potential to change the way we understand, treat, and ultimately cure diseases and disorders in many areas of medicine. These models can serve not only as sources of predictions and estimates, but also as discovery tools that let clinicians and researchers reveal new knowledge from the data. In clinical psychiatry, the high dimensionality of patient information (e.g., phenotype, genotype, and medical history), the lack of objective measurements, and the heterogeneity of patient populations often create significant challenges in developing such models in practice. In this paper we take a step towards the development of interpretable models in two ways: first, by developing a novel categorical rule mining method based on Multivariate Correspondence Analysis (MCA) that can handle datasets with large numbers of features; and second, by applying this method to build transdiagnostic Bayesian Rule List models that screen for psychiatric disorders using the Consortium for Neuropsychiatric Phenomics dataset. We show that our method is at least 100 times faster than state-of-the-art rule mining techniques for datasets with 50 features, while also providing interpretability and comparable prediction accuracy across several benchmark datasets.

Qingzhu Gao and Humberto Gonzalez contributed equally to this work.

Author information

Corresponding author

Correspondence to Humberto Gonzalez.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Gao, Q., Gonzalez, H., Ahammad, P. (2020). MCA-Based Rule Mining Enables Interpretable Inference in Clinical Psychiatry. In: Shaban-Nejad, A., Michalowski, M. (eds) Precision Health and Medicine. W3PHAI 2019. Studies in Computational Intelligence, vol 843. Springer, Cham. https://doi.org/10.1007/978-3-030-24409-5_3
