Explainable artificial intelligence: easier said than done


Along with the growing use of artificial intelligence in medicine comes a growing concern among many policy makers, patients and physicians about the use of black box algorithms. In a nutshell, the concern is this: we don’t know what these algorithms do or how they do it, and since we cannot understand them, we cannot and should not trust them.

A new area of research, dubbed Explainable Artificial Intelligence (XAI), aims to address these concerns. As we argue in the journal Science, along with our colleagues I. Glenn Cohen and Theodoros Evgeniou, this approach may not help and in some cases may be harmful.

Artificial intelligence (AI) systems, especially machine learning (ML) algorithms, are increasingly ubiquitous in healthcare. They are used to assess cardiovascular images, identify eye disease, and detect bone fractures, among other things. Many of these systems, and most of those cleared or approved for use by the Food and Drug Administration, rely on what are called black box algorithms. While the notion of what constitutes a black box algorithm is somewhat fluid, we take it to mean an algorithm that is extremely difficult, if not impossible, for ordinary humans to understand.


Examples of black box AI models include any of the class of algorithms commonly referred to as “deep learning,” such as neural networks with many layers, convolutions, backpropagation, and so on.

There are two main ways to understand how an AI system works. The first is simple and intuitive: the creator of the system can stop using the black box to make predictions and use a transparent system (a white box model) instead. While white box models are also a fluid concept, examples include simple decision trees or ordinary regression with a few variables, where it is easy to tell how the variables combine to form the system’s predictions. For example, many doctors use a point-based scoring system to calculate patients’ risk of heart disease or stroke based on their blood pressure, cholesterol, age, and other characteristics. Let’s call these white box systems interpretable AI (IAI).
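To make the idea of a white box concrete, here is a minimal sketch of a point-based risk score in Python. The function name, thresholds, and point values are invented for illustration and do not correspond to any clinical scoring system; the only point is that every step from input to output can be read directly off the code.

```python
def risk_points(age, systolic_bp, total_cholesterol, smoker):
    """Toy point-based score. All thresholds and points are hypothetical."""
    points = 0
    # Age contributes more points in older brackets.
    if age >= 65:
        points += 3
    elif age >= 50:
        points += 2
    # Elevated systolic blood pressure (mmHg).
    if systolic_bp >= 140:
        points += 2
    # Elevated total cholesterol (mg/dL).
    if total_cholesterol >= 240:
        points += 1
    # Current smoker.
    if smoker:
        points += 2
    return points


# A 58-year-old smoker with high blood pressure and normal cholesterol:
# 2 (age) + 2 (blood pressure) + 0 (cholesterol) + 2 (smoking) = 6 points.
print(risk_points(age=58, systolic_bp=150, total_cholesterol=210, smoker=True))
```

Anyone reading the score can say exactly why a patient received it, which is what makes a model of this kind interpretable.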


Interpretable AI is ideal for increasing transparency and helping to understand how a model works. It’s simple, intuitive and easy to learn. And to the extent that such a simple white box can substitute for a complex black box, we are all in favor. But that’s where the problem lies: For many medical applications, developers have to use a more complicated model.

An example is an application that relies on image recognition, in which the number of predictor variables is extremely large and the functionality is often very sophisticated. Another example is an application that relies on genetic data. In such cases, developers generally will not want to replace an advanced deep learning system with, for example, a simple decision tree. IAI is therefore not a suitable alternative, as it may not reach the level of accuracy that more complex black box models can achieve.

To appease those worried about trust and transparency, developers who insist on using black box systems are turning to the second alternative, namely XAI. Here’s how it works: Given a black box model used to make predictions or diagnoses, a second explanatory algorithm is developed that approximates the black box’s outputs. This second algorithm (itself a white box model) is trained by fitting it to the black box’s predictions and not to the original data. It is generally used to develop post-hoc explanations for the black box’s outputs and not to make actual predictions.

In other words, the approach is based on a dual process: a black box for predictions and a white box for post-hoc explanations. Using stroke risk as an example, the explanatory white box algorithm can tell a patient that their elevated stroke risk, as predicted by the black box model, is consistent with a linear model based on age, blood pressure, and smoking behavior.
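The two-model setup described above can be sketched in a few lines of Python with NumPy. Everything here is an assumption made for illustration: the patient data are synthetic, `black_box` is a made-up nonlinear function standing in for an opaque model, and the explanatory model is an ordinary least-squares fit. Note that the surrogate is fitted to the black box’s predictions, not to any real outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic patients: age (years), systolic blood pressure (mmHg), smoker (0/1).
n = 2000
age = rng.uniform(30, 90, n)
sbp = rng.uniform(100, 180, n)
smoker = rng.integers(0, 2, n).astype(float)
X = np.column_stack([age, sbp, smoker])


def black_box(X):
    """Hypothetical stand-in for an opaque model we pretend we cannot inspect."""
    a, b, s = X[:, 0], X[:, 1], X[:, 2]
    z = 0.08 * (a - 60) + 0.03 * (b - 130) + 1.2 * s
    z = z + 0.002 * (a - 60) * (b - 130) * s  # interaction a linear model cannot capture
    return 1.0 / (1.0 + np.exp(-z))


# Step 1: the black box produces the predictions that are actually used.
y_bb = black_box(X)

# Step 2: a linear surrogate is fitted to those predictions (not to outcomes).
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y_bb, rcond=None)
y_sur = A @ coef

# Fidelity: the share of variance in the black box's output the surrogate reproduces.
fidelity = 1 - np.sum((y_bb - y_sur) ** 2) / np.sum((y_bb - y_bb.mean()) ** 2)

print("surrogate coefficients (intercept, age, sbp, smoker):", np.round(coef, 4))
print("fidelity (R^2 against black box output):", round(float(fidelity), 3))
```

A fidelity below 1 is a reminder that the surrogate only approximates the black box’s outputs; its coefficients describe the approximation, not the mechanism inside the box.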

But note that the post-hoc explanation is not the actual mechanism by which the black box prediction was generated. Indeed, it is easy to imagine many other explanations that are also consistent with the black box prediction. For example, the patient’s risk of stroke might also be consistent with a decision tree that relies on their gender and diabetic status rather than blood pressure and smoking status. Similar patients can get very different post-hoc explanations. Because these explanations are unstable and constructed after the fact, we call the understanding that XAI generates an ersatz understanding.
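The point that many different explanations can fit the same black box can also be illustrated with a small sketch. It assumes synthetic data in which two hypothetical features, blood pressure and a fasting glucose marker, are strongly correlated through a shared latent health factor; `black_box` is again an invented stand-in. Two single-feature surrogates are fitted to the same black box output, each telling a different story.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# A latent health factor drives both features, so they are strongly correlated.
health = rng.normal(size=n)
sbp = 130 + 15 * health + 3 * rng.normal(size=n)       # systolic blood pressure
glucose = 100 + 20 * health + 4 * rng.normal(size=n)   # fasting glucose marker


def black_box(sbp, glucose):
    """Hypothetical opaque model that uses both features."""
    z = 0.05 * (sbp - 130) + 0.04 * (glucose - 100)
    return 1.0 / (1.0 + np.exp(-z))


y_bb = black_box(sbp, glucose)


def fit_surrogate(feature, target):
    """Least-squares line through one feature; returns coefficients and R^2."""
    A = np.column_stack([np.ones_like(feature), feature])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    pred = A @ coef
    r2 = 1 - np.sum((target - pred) ** 2) / np.sum((target - target.mean()) ** 2)
    return coef, r2


# Explanation A: "your risk is high because of your blood pressure."
coef_bp, fidelity_bp = fit_surrogate(sbp, y_bb)
# Explanation B: "your risk is high because of your glucose level."
coef_glu, fidelity_glu = fit_surrogate(glucose, y_bb)

print("fidelity of blood-pressure explanation:", round(float(fidelity_bp), 3))
print("fidelity of glucose explanation:", round(float(fidelity_glu), 3))
```

Under these assumptions both surrogates track the black box closely, yet they attribute the prediction to different causes, and neither is the mechanism the black box actually uses.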

When a user receives such an explanation, they are no closer to understanding what is going on inside the black box; on the contrary, they are left with the false impression that they understand it better. This type of XAI is “fool’s gold” in this regard. The understanding it provides is akin to being told that the reason streetlights come on at night might be that the sun is setting, after observing these two events happening together a number of times. Such explanations can lead to other epistemic risks, such as the narrative fallacy (believing in a story that is just plain wrong) or overconfidence if, for example, the (bad) explanation provided reinforces the user’s prior beliefs.

Because this form of XAI is fool’s gold, it is unlikely to provide the benefits that are often touted. For example, since it does not add to the understanding of a black box system, it is unlikely to increase trust in it. Likewise, since it doesn’t allow others to open the black box, so to speak, it is unlikely to help make AI/ML systems more accountable.

Requiring explainability for artificial intelligence and machine learning in healthcare can also limit innovation: restricting developers to algorithms that can be explained sufficiently well can hamper accuracy.

Instead of focusing on explainability, the FDA and other regulators should take a close look at the aspects of AI/ML that affect patients, such as safety and efficacy, and consider subjecting more health-related products based on artificial intelligence and machine learning to clinical trials. Human factors play an important role in the safe use of technology, and regulators, as well as product developers and researchers, must carefully consider them when designing reliable AI/ML systems.

Boris Babic is Assistant Professor of Philosophy and Statistics at the University of Toronto. Sara Gerke is Assistant Professor of Law at Penn State Dickinson Law. This essay has been adapted from a longer article in Science magazine by Boris Babic, Sara Gerke, Theodoros Evgeniou and I. Glenn Cohen.

