What is Explainable AI?

Written by modzy | Published 2021/06/16
Tech Story Tags: modzy | blockchain | explainable-ai | explainability | transparency | artificial-intelligence | interpretable-ai | good-company

TLDR Modzy is a software platform for organizations and developers to responsibly deploy, monitor, and get value from AI - at scale. Explainable AI and interpretable AI have emerged as new ways of building trust in AI-enabled decisions. Model interpretability is the degree to which a human can understand the cause of a decision made by the model. Modzy uses explainable AI techniques such as LIME so that its users can easily understand how models make decisions. To explain this approach, we build an interpretable model on top of a complex model.

Explainable AI and interpretable AI have emerged as new ways of building trust in AI-enabled decisions. Model interpretability is the degree to which a human can understand the cause of a decision made by the model [1].
When a model makes a decision, there must be a reason, but is that reason “reasonable” or explainable to humans? For a simple model, it is easy to parse how it reached a decision; for a complex model, it is difficult.

What You Need to Know about Explainable AI

Building a complex model can be challenging. We would not call a model good without high prediction accuracy, but even if we had a well-trained model with 100% prediction accuracy, would we trust it?
Imagine that we have a model that predicts whether a patient is at risk of getting cancer, based on symptoms and lab results.
Early signs of cancer are difficult to detect, and a doctor may want to use a model to supplement his or her own experience in identifying patients at risk of getting cancer.
If a patient is told that he or she will get cancer but eventually does not (a false positive prediction), this would harm both the patient’s mental wellness and the doctor’s reputation. A doctor would likely want to understand how the model makes its predictions, and then use his or her own expertise to determine whether those predictions are reasonable.
Let’s examine the inner workings of an explainable AI technique, LIME [2], that is used at Modzy. Fig. 1(a) shows an input image that is fed into a model that identifies whether there is a car in the picture. The explainable model then separates the original picture into features, as shown in Fig. 2, and assigns a score to each feature. The highest score marks the feature that contributed most to the model’s determination that there is a car in the picture.
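For readers who want to see what this looks like in code, here is a minimal sketch of asking LIME to explain a single image prediction. It assumes the open-source lime package and a pretrained torchvision ResNet-50 standing in for the black box classifier; the image path is a placeholder, and the exact pipeline used at Modzy is not shown here.

```python
import numpy as np
import torch
import torchvision.transforms as T
from PIL import Image
from torchvision.models import resnet50
from lime import lime_image

# Placeholder path: any RGB photo will do (e.g. the car scene from Fig. 1(a)).
image = np.array(Image.open("car.jpg").convert("RGB"))

# A stand-in black box: a pretrained ResNet-50 from torchvision.
model = resnet50(weights="IMAGENET1K_V1").eval()
preprocess = T.Compose([
    T.ToTensor(),                 # HxWx3 uint8 array -> 3xHxW float tensor in [0, 1]
    T.Resize((224, 224)),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def predict_fn(images):
    """Map a batch of HxWx3 RGB arrays to class probabilities, as LIME expects."""
    batch = torch.stack([preprocess(img) for img in images])
    with torch.no_grad():
        logits = model(batch)
    return torch.softmax(logits, dim=1).numpy()

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn,
                                         top_labels=1, num_samples=1000)

# Recover the superpixels with the highest scores for the predicted class.
highlighted, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5)
```

The returned mask marks the regions whose scores most supported the prediction, which is exactly the kind of per-feature ranking described above.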
To explain how this is done, we need to understand how an interpretable AI model works. An interpretable model is a model in which one could easily distinguish feature importance, such as a linear regression model or a decision tree model.
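As a quick illustration of why such models count as interpretable, the snippet below uses a toy scikit-learn dataset to show that feature importance can be read directly off a fitted linear regression or decision tree.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

linear = LinearRegression().fit(X, y)
print(linear.coef_)               # one weight per feature: sign and size are readable directly

tree = DecisionTreeRegressor(max_depth=3).fit(X, y)
print(tree.feature_importances_)  # share of impurity reduction attributed to each feature
```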
However, there are situations where we might need a more complex model, such as a deep neural network. In many instances, deep neural networks are considered too complex and opaque for their inner workings to be parsed.
In the earlier example of the car classification model, we used a ResNet model, which is considered a black box model. To overcome this, we implemented LIME so that the user has an explanation for each decision the model makes. The “L” in LIME stands for local, which means that we assume the complex model behaves linearly at a local scale. Under this assumption, one examines samples that are similar to the target sample (those that lie in the same local area) and uses a linear model to differentiate them, as shown in Fig. 2.
To explain this approach, we build an interpretable model on top of the complex model. In other words, we use a simple linear model, focused on a local area, to learn how the complex model makes decisions in its prediction or other task.
From a given input (a picture of a car stopping before a stop sign, Fig. 1(a)), LIME generates variants of the original input, Fig. 1(b). For information on how these pictures are generated, refer to the segmentation module in the scikit-image library [3].
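As a rough sketch of this step, the snippet below (reusing `image` from the earlier snippet) segments the picture into superpixels with scikit-image’s quickshift and then randomly switches superpixels on or off to produce the variants. The parameter values are illustrative; they match the defaults the lime package uses for its quickshift segmenter.

```python
import numpy as np
from skimage.segmentation import quickshift

# Segment the original image into superpixels.
segments = quickshift(image, kernel_size=4, max_dist=200, ratio=0.2)
seg_ids = np.unique(segments)

# Each row of `masks` is one variant: 1 keeps a superpixel, 0 blanks it out.
rng = np.random.default_rng(0)
num_samples = 1000
masks = rng.integers(0, 2, size=(num_samples, seg_ids.shape[0]))

def apply_mask(image, segments, seg_ids, mask, fill_value=0):
    """Return a copy of `image` with the switched-off superpixels blanked out."""
    variant = image.copy()
    for keep, seg_id in zip(mask, seg_ids):
        if keep == 0:
            variant[segments == seg_id] = fill_value
    return variant

variants = [apply_mask(image, segments, seg_ids, m) for m in masks]
```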
Next, LIME feeds each variant of the original sample into the black box model, ResNet in our case, and fetches the model’s output. Finally, it measures the distance between the original image and each variant. The distances are then used as weights, and the outputs fetched from the black box model are used as labels to train a linear model.
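Continuing the sketch above, the surrogate step might look like the following: the black box’s probability for the predicted class becomes the label for each variant, the cosine distance from each binary mask to the all-ones mask becomes a sample weight through an exponential kernel (the kernel width here is illustrative), and a weighted ridge regression serves as the interpretable model.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import pairwise_distances

# The class the black box predicts for the unperturbed image (here, "car").
car_class = predict_fn(image[np.newaxis])[0].argmax()

# Labels: the black box's probability for that class on every variant.
# (In practice you would run this in smaller batches.)
probs = predict_fn(np.stack(variants))[:, car_class]

# Weights: variants whose masks are close to the all-ones mask (the original
# image) count more, via cosine distance pushed through an exponential kernel.
original_mask = np.ones((1, masks.shape[1]))
distances = pairwise_distances(masks, original_mask, metric="cosine").ravel()
kernel_width = 0.25                                  # illustrative value
weights = np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

# The interpretable surrogate: a weighted linear model over the binary masks.
surrogate = Ridge(alpha=1.0).fit(masks, probs, sample_weight=weights)

# Superpixels with the largest coefficients are the ones that pushed the
# black box towards "car" -- the per-feature scores described for Fig. 2.
top_superpixels = seg_ids[np.argsort(surrogate.coef_)[::-1][:5]]
```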
As we mentioned earlier, linear models are interpretable, so this approach allows us to explain our black box model. Another important point is that this approach can also generate explanations for models that use other types of input data, such as text. For example, for a language identification model, we can explain how the model identifies the language of a given string of text by highlighting the words or n-grams that were most important to that identification.
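A sketch of the same idea for text is shown below, using LIME’s text explainer. The tiny language-identification classifier is a toy stand-in trained on a handful of made-up sentences, purely so the example runs end to end; it is not a real language-identification model.

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

class_names = ["english", "french", "german", "spanish"]

# Toy stand-in classifier: character n-gram TF-IDF + logistic regression,
# trained on a few example sentences only for illustration.
train_texts = [
    "the cat sat on the mat", "where is the train station",
    "le chat est sur le tapis", "où est la gare",
    "die katze sitzt auf der matte", "wo ist der bahnhof",
    "el gato está en la alfombra", "dónde está la estación",
]
train_labels = [0, 0, 1, 1, 2, 2, 3, 3]   # indices into class_names
language_id_model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
).fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=class_names)
text = "Le chat est assis sur le tapis."
explanation = explainer.explain_instance(
    text, language_id_model.predict_proba, labels=[1], num_features=6)

# The highest-weighted words are the ones that drove the "french" decision;
# they can be highlighted directly in the input string.
print(explanation.as_list(label=1))
```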
Modzy Differentiation
At Modzy, our data science team integrates state-of-the-art explainable AI techniques, such as LIME, so that our users can easily understand how models make decisions.
