Abstract
Advanced AI systems are rapidly making their way into medical research and practice, and, arguably, it is only a matter of time before they surpass human practitioners in terms of accuracy, reliability, and knowledge. If so, practitioners will have a prima facie epistemic and professional obligation to align their medical verdicts with those of advanced AI systems. However, in light of their complexity, these AI systems will often function as black boxes: the details of their contents, calculations, and procedures cannot be meaningfully understood by human practitioners. When AI systems reach this level of complexity, we can also speak of black-box medicine. In this paper, we argue that black-box medicine conflicts with core ideals of patient-centered medicine. In particular, we claim, black-box medicine is not conducive to supporting informed decision-making based on shared information, shared deliberation, and shared mind between practitioner and patient.
Notes
Later, in section 2, we will be more precise about the types of AI systems that we have in mind, but for now we can keep the term “AI systems” vague and apply it at a level of generality that mirrors the use of “autonomous/intelligent systems” employed by IEEE.
What may such overriding reasons look like? To give an example, suppose you—along the lines of Montgomery (2006), for instance—believe that there is a practical and irreducible know-how component to medical judgment and decision-making, which, importantly, cannot meaningfully be encoded in AI systems. If so, we can imagine how an appeal to such medical know-how may help us explain how a practitioner can be excused from acting in accordance with the recommendations of an epistemically superior AI system. Alternatively, as recently argued by Ploug and Holm (2019), it might be that patients have a medical right to withdraw from AI-assisted diagnostics and treatment. In order to respect such rights, it may be argued, practitioners can be excused from acting in accordance with the recommendations of AI systems. While there is much to say about these issues, doing so is well beyond the scope of a single paper. For present purposes, we shall proceed on the assumption that practitioners are under an obligation to follow (future) black-box AI systems in decision-making.
In fact, as we shall point out in section 6, black-box medicine also presents a challenge to central elements of evidence-based medicine.
We are grateful to an anonymous referee for stressing this worry.
A central reason why automated image recognition has seen such great progress in recent years is the high quality of imaging data. For recent information about the progress being made to improve the quality of imaging data sets even further, see Harvey and Glocker (2019) and van Ooijen (2019).
Note that the reported result does not show that deep learning systems generally outperform clinicians and experts. As reported by De Fauw et al. (2018), differences in performance between deep learning systems and clinicians were considerably reduced when clinicians had access to all the information—such as patient history and clinical notes—that they ordinarily make use of in addition to an OCT scan; for conclusions that point in a similar direction, see Faes et al. (2019). As a reviewer for this journal interestingly noted, this observation might point toward a correlation between the amount of information that a clinician has and the degree to which he or she is epistemically obliged to rely on AI systems in decision-making. In particular, since junior clinicians might have access to less information than an experienced clinician, the former might be under a greater epistemic obligation to take the output of an AI system at face value than the latter. While these discussions are obviously interesting for current practices in AI-assisted medical decision-making, we assume, as mentioned, that AI systems (will) have access to more information than even expert human practitioners. As such, we need not worry too much about results showing that there is not a great difference in performance between humans and AI systems when they have access to roughly the same amount of medical information.
In Holzinger et al. (2019), for instance, it is stressed how AI performance can be optimized by integrating multiple independent data sets (e.g., imaging, omics profiles, and clinical data).
We will offer a somewhat detailed explanation of the sense in which deep learning networks count as black-box AI systems, in part because it is useful to have a clear understanding of what makes an AI system a black-box system, and in part because we will appeal to deep neural networks in our discussions in sections 4 and 5.
Schubbach references Smolensky (1988) for this observation.
Note, though, that London (2019) is critical of this view of experts in medicine. For further discussion of these issues, see section 5.
So, to be clear, we are not thinking of black-box medicine as a practice in which practitioners are removed from medical decision-making nor as one in which practitioners are bound to follow the recommendations of black-box systems. Rather, we think of black-box AI systems as decision aids that can serve practitioners in both diagnostic and treatment contexts.
Note: the description of patient-centered medicine below captures the ideals of patient-centered medicine and not necessarily the way it is practiced in real healthcare.
Throughout, for the sake of argument, we will assume that patients place central value on the ideal of shared decision-making. But we should acknowledge that real healthcare situations are rarely so straightforward. As pointed out to us by an anonymous referee, it has been documented by Vogel et al. (2008) that patients vary in the extent to which they genuinely want to be involved in the decision-making processes surrounding cancer.
For an overview of these findings, see Delaney (2018).
Note: even if we assume that a post hoc interpretation of such a network was possible—for instance, through clustering “significant” units in the network to yield information that those specific 1237 patient variables with those specific interconnected range values result in that specific risk estimate—it is far from clear how such an interpretation could be of practical use to the scientific community. For similar critical pointers, see London (2019).
To be sure, the practitioner may be able to give an abstract explanation of how the deep learning network operates. Plausibly, the sort of explanation that the practitioner can offer Mary will be of the following form: “Somehow, based on a complex analysis of vast amounts of data about people who share a high number of characteristics with you, the system determines with high reliability and accuracy that you are in significant danger of developing breast cancer; yet, I do not understand how and why it does so.” But, as indicated above, it is doubtful whether such an explanation is even minimally informative for ordinary people.
Thanks to an anonymous reviewer for raising this issue. For sentiments similar to those aired by London (2019) in the quotes, see also Schönberger (2019) and Zerilli et al. (2018). For arguments that indicate that typical medical explanations might not be as opaque as the quotes above suggest, see Lipton (2017).
As reported by Walker et al. (2019), such basic information is present even in medical studies that rely purely on correlational evidence.
The lack of transparency in black-box systems will not only present a challenge for eliciting trust in patients but also for avoiding automation biases in practitioners. Roughly put, the less practitioners have access to the internal operations of black-box systems, the less they are in a position to determine whether their trust in a given AI output is medically justified or whether it stems from an automation bias: a propensity to over-rely on outputs from automated decision-making systems. For more on the issues surrounding AI and automation biases, see Challen et al. (2019) and Goddard et al. (2011).
In a loose sense, it might be instructive to think of black-box medicine as reintroducing a kind of epistemic paternalism into medical practice. Of course, the epistemic paternalism in question is special. It applies both to patients and practitioners and involves no deliberate withholding of information on the part of the practitioner. As argued, since black-box medicine does not suggest an approach to medicine where patient values and autonomy are ignored, we should not understand black-box medicine as promoting a return to an all-out paternalism in medicine. Yet, insofar as epistemic paternalism can be characterized, paraphrasing Goldman (1991), in terms of the withholding of information that it is in the subject’s best interest to have—for instance, to enable informed decision-making—then black-box medicine does count as an interesting new type of epistemic paternalism.
Clearly, a lot hinges on how we spell out what it means to trust someone or something. If the relevant criteria for trust are criteria such as reliability and accuracy, then it may well turn out that people ought to put as much trust in black-box systems as in human experts. But if the relevant trust criteria include more fuzzy ones such as benevolence and honesty, then it may well turn out that people ought to trust human experts more than black-box systems; for instance, as pointed out by Ploug and Holm (2019), it may be that patients simply fear AI technology and, as a result, are disposed to distrust black-box systems more than human experts.
Thanks to an anonymous referee for encouraging us to acknowledge these complicating factors.
References
Bernat, J. L., & Peterson, L. M. (2006). Patient-centered informed consent in surgical practice. Archives of Surgery, 141(1), 86–92.
Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., Shadbolt, N. (2018). ‘It’s reducing a human being to a percentage’: perceptions of justice in algorithmic decisions. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (p. 377). ACM.
Burrell, J. (2016). How the machine “thinks”: understanding opacity in machine learning algorithms. Big Data and Society, 3(1), 1–12.
Captain, S. (2017). Can IBM’s Watson do it all? Fast Company. Retrieved from https://www.fastcompany.com/3065339/can-ibms-watson-do-it-all (accessed online 29/10/2019).
Challen, R., Denny, J., Pitt, M., Gompels, L., Edwards, T., & Tsaneva-Atanasova, K. (2019). Artificial intelligence, bias and clinical safety. BMJ Qual Saf, 28(3), 231–237.
Calo, R. (2015). Robotics and the lessons of cyberlaw. California Law Review, 103(3), 513–563.
Danaher, J. (2016). Robots, law and the retribution-gap. Ethics and Information Technology, 18(4), 299–309.
De Fauw, J., Ledsam, J. R., Romera-Paredes, B., Nikolov, S., Tomasev, N., Blackwell, S., et al. (2018). Clinically applicable deep learning for diagnosis and referral in retinal disease. Nature medicine, 24(9), 1342–1350.
Delaney, L. J. (2018). Patient-centred care as an approach to improving health care in Australia. Collegian, 25(1), 119–123.
De Maeseneer, J., van Weel, C., Daeren, L., Leyns, C., Decat, P., Boeckxstaens, P., Avonts, D., & Willems, S. (2012). From “patient” to “person” to “people”: the need for integrated, people-centered healthcare. The International Journal of Person Centered Medicine, 2(3), 601–614.
Di Nucci, E. (2019). Should we be afraid of medical AI? Journal of Medical Ethics, 45(8), 556–558.
Doran, D., Schulz, S., & Besold, T. R. (2017). What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint arXiv:1710.00794.
Epstein, R. M., Fiscella, K., Lesser, C. S., & Stange, K. C. (2010). Why the nation needs a policy push on patient-centered health care. Health affairs, 29(8), 1489–1495.
Esteva, A., Robicquet, A., Ramsundar, B., Kuleshov, V., DePristo, M., Chou, K., Cui, C., Corrado, G., Thrun, S., & Dean, J. (2019). A guide to deep learning in healthcare. Nature medicine, 25(1), 24–29.
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118.
Faes, L., Liu, X., Kale, A., Bruynseels, A., Shamdas, M., Moraes, G., Fu, D. J., Wagner, S. K., Kern, C., Ledsam, J. R., & Schmid, M. K. (2019). Deep learning under scrutiny: performance against health care professionals in detecting diseases from medical imaging—systematic review and meta-analysis (preprint).
Ferroni, P., Zanzotto, F., Riondino, S., Scarpato, N., Guadagni, F., & Roselli, M. (2019). Breast cancer prognosis using a machine learning approach. Cancers, 11(3), 328.
Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., et al. (2018). AI4People—An ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds and Machines, 28(4), 689–707.
Floridi, L. (2011). The informational nature of personal identity. Minds & Machines, 21, 549–566.
Forssbæck, J., & Oxelheim, L. (2014). The multifaceted concept of transparency. In J. Forssbæck & L. Oxelheim (Eds.), The Oxford handbook of economic and institutional transparency (pp. 3–31). New York: Oxford University Press.
Goddard, K., Roudsari, A., & Wyatt, J. C. (2011). Automation bias: a systematic review of frequency, effect mediators, and mitigators. Journal of the American Medical Informatics Association, 19(1), 121–127.
Goldman, A. (1991). Epistemic paternalism: communication control in law and society. Journal of Philosophy, 88(3), 113–131.
Hall, D. E., Prochazka, A. V., & Fink, A. S. (2012). Informed consent for clinical treatment. CMAJ, 184(5), 533–540.
Harvey, H., & Glocker, B. (2019). A standardized approach for preparing imaging data for machine learning tasks in radiology. Artificial Intelligence in Medical Imaging (pp. 61–72). Springer, Cham.
He, J., Baxter, S. L., Xu, J., Xu, J., Zhou, X., & Zhang, K. (2019). The practical implementation of artificial intelligence technologies in medicine. Nature Medicine, 25(1), 30–36.
Heald, D. (2006). Transparency as an instrumental value. In C. Hood & D. Heald (Eds.), Transparency: the key to better governance? (pp. 59–73). Oxford: Oxford University Press.
Holzinger, A., Biemann, C., Pattichis, C. S., & Kell, D. B. (2017). What do we need to build explainable AI systems for the medical domain?. arXiv preprint arXiv:1712.09923.
Holzinger, A., Haibe-Kains, B., & Jurisica, I. (2019). Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data. European Journal of Nuclear Medicine and Molecular Imaging. https://doi.org/10.1007/s00259-019-04382-9.
Japkowicz, N., & Shah, M. (2011). Evaluating learning algorithms: a classification perspective. Cambridge: Cambridge University Press.
Jiang, F., Jiang, Y., Zhi, H., Dong, Y., Li, H., Ma, S., Wang, Y., Dong, Q., Shen, H., & Wang, Y. (2017). Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol, 2(4), 230–243.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: trends, perspectives, and prospects. Science, 349(6245), 255–260.
Kallis B., Collier M., Fu R. (2018). 10 promising AI applications in health care. Harvard Business Review, https://hbr.org/2018/05/10-promising-ai-applications-in-health-care (accessed online 11/12/2018).
Lee, J. G., Jun, S., Cho, Y. W., Lee, H., Kim, G. B., Seo, J. B., & Kim, N. (2017). Deep learning in medical imaging: general overview. Korean Journal of Radiology, 18(4), 570–584.
Lipton, P. (2003). Inference to the best explanation. Abingdon: Routledge.
Lipton, Z. C. (2017). The doctor just won’t accept that!. arXiv preprint arXiv:1711.08037.
Liu, X., Faes, L., Kale, A. U., Wagner, S. K., Fu, D. J., Bruynseels, A., et al. (2019). A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The Lancet Digital Health, 1(6), e271–e297.
Loh, E. (2018). Medicine and the rise of the robots: a qualitative review of recent advances of artificial intelligence in health. BMJ Leader, 2, 59–63.
London, A. J. (2019). Artificial intelligence and black-box medical decisions: accuracy versus explainability. Hastings Center Report, 49(1), 15–21.
Marcum, J. A. (2008). An introductory philosophy of medicine: Humanizing modern medicine (Vol. 99). Springer Science & Business Media.
McDougall, R. J. (2019). Computer knows best? The need for value-flexibility in medical AI. Journal of Medical Ethics, 45(8), 156–160.
McGinnis, J. M., & Foege, W. H. (1993). Actual causes of death in the United States. JAMA, 270(18), 2207–2212.
Miller, T. (2018). Explanation in artificial intelligence: insights from the social sciences. Artificial Intelligence, https://arxiv.org/pdf/1706.07269.pdf (accessed online 11/12/2018).
Mittelstadt B. D., Allo P., Taddeo M., Wachter S., Floridi L. (2016). The ethics of algorithms: mapping the debate. Big Data & Society, pp. 1–21.
Montgomery, K. (2006). How doctors think: Clinical judgment and the practice of medicine. Oxford: Oxford University Press.
Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 427–436).
Nyholm, S. (2018). Attributing agency to automated systems: reflections on human–robot collaborations and responsibility-loci. Science and engineering ethics, 24(4), 1201–1219.
Obermeyer, Z., & Emanuel, E. J. (2016). Predicting the future—big data, machine learning, and clinical medicine. The New England journal of medicine, 375(13), 1216–1219.
Olorisade, B. K., Brereton, P., & Andras, P. (2017). Reproducibility in machine learning-based studies: an example of text mining.
Ploug, T., & Holm, S. (2019). The right to refuse diagnostics and treatment planning by artificial intelligence. Medicine, Health Care, and Philosophy. https://doi.org/10.1007/s11019-019-09912-8.
Prat, A. (2006). The more closely we are watched, the better we behave? In C. Hood & D. Heald (Eds.), Transparency: the key to better governance? (pp. 91–103). Oxford: Oxford University Press.
Price II, W. N. (2017). Artificial intelligence in healthcare: applications and legal implications. The SciTech Lawyer, 14(1), 10–13.
Price II, W. N. (2018). Medical malpractice and black-box medicine. In I. Cohen, H. Lynch, E. Vayena, & U. Gasser (Eds.), Big Data, Health Law, and Bioethics (pp. 295–306). Cambridge: Cambridge University Press.
Purdy, M., & Daugherty, P. (2016). Why artificial intelligence is the future of growth. Remarks at AI Now: The Social and Economic Implications of Artificial Intelligence Technologies in the Near Term, 1–72.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135–1144). ACM.
Schönberger, D. (2019). Artificial intelligence in healthcare: a critical analysis of the legal and ethical implications. International Journal of Law and Information Technology, 27(2), 171–203.
Schubbach, A. (2019). Judging machines: philosophical aspects of deep learning. Synthese, pp. 1–21.
Seshia, S. S., & Young, G. B. (2013). The evidence-based medicine paradigm: where are we 20 years later? Part 1. Canadian Journal of Neurological Sciences, 40(4), 465–474.
Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1–74.
Straus, S., Glasziou, P., Richardson, W. S., & Haynes, R. B. (2019). Evidence-based medicine: How to practice and teach EBM (5th ed.). Edinburgh; New York: Elsevier.
Su, J., Vargas, D. V., & Sakurai, K. (2019). One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation.
Tiwari, P., Prasanna, P., Wolansky, L., Pinho, M., Cohen, M., Nayate, A. P., et al. (2016). Computer-extracted texture features to distinguish cerebral radionecrosis from recurrent brain tumors on multiparametric MRI: a feasibility study. American Journal of Neuroradiology, 37(12), 2231–2236.
Topol, E. J. (2019). High-performance medicine: the convergence of human and artificial intelligence. Nature medicine, 25(1), 44.
US Food and Drug Administration. (2018). FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems. News Release, April (accessed online August 7, 2018).
van Ooijen, P. M. (2019). Quality and curation of medical images and data. In Artificial Intelligence in Medical Imaging (pp. 247–255). Cham: Springer.
Vogel, B. A., Helmes, A. W., & Hasenburg, A. (2008). Concordance between patients’ desired and actual decision-making roles in breast cancer care. Psycho-Oncology, 17(2), 182–189.
Wachter, S., Mittelstadt, B., & Floridi, L. (2017). Transparent, explainable, and accountable AI for robotics. Science Robotics, 2(6).
Walker, M. J., Bourke, J., & Hutchison, K. (2019). Evidence for personalised medicine: mechanisms, correlation, and new kinds of black box. Theoretical Medicine and Bioethics, 40(2), 103–121.
Watson, D. S., Krutzinna, J., Bruce, I. N., Griffiths, C. E., McInnes, I. B., Barnes, M. R., & Floridi, L. (2019). Clinical applications of machine learning algorithms: beyond the black box. BMJ, 364, l886.
Xiao, Y., Wu, J., Lin, Z., & Zhao, X. (2018). A deep learning-based multi-model ensemble method for cancer prediction. Computer Methods and Programs in Biomedicine, 153, 1–9.
Zerilli, J., Knott, A., Maclaurin, J., & Gavaghan, C. (2018). Transparency in algorithmic and human decision-making: is there a double standard? Philosophy & Technology, pp. 1–23.
Bjerring, J.C., Busch, J. Artificial Intelligence and Patient-Centered Decision-Making. Philos. Technol. 34, 349–371 (2021). https://doi.org/10.1007/s13347-019-00391-6