What is explainable AI?

Explainable AI, or XAI, is a set of methods and techniques that allow us to understand how a machine learning model works and why it makes the decisions it does. Without XAI, a machine learning model can be a “black box”, where even its developers cannot understand how it arrived at a certain decision.

Examples of how explainable AI can work

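Explainable AI techniques vary with the complexity of the model. For simple machine learning models such as linear regression with a single input (formula y = mx + c), it’s easy to understand why a model has made a certain decision because there are only two parameters: the gradient m, which says how much the prediction changes for each unit change in x, and the intercept c, which is the prediction when x is zero.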
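As a minimal sketch of why such a model explains itself, the snippet below fits y = mx + c by ordinary least squares on a handful of made-up points. The function name fit_line and the data are illustrative assumptions, not part of any particular library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c with a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Gradient: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: chosen so the line passes through the mean point.
    c = mean_y - m * mean_x
    return m, c


# Made-up data that lies exactly on y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, c = fit_line(xs, ys)


def predict(x):
    # Every prediction is fully explained by the two fitted numbers.
    return m * x + c
```

Here the whole explanation of any prediction is contained in m and c: the prediction rises by m for every unit of x, starting from c.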

However, more complex machine learning models, such as deep learning models and convolutional neural networks, can have many millions of parameters, and it becomes increasingly hard to understand the decisions they make.

Explainable AI for very complex models

For extremely complex models, explainable AI techniques normally consist of introducing small variations, or perturbations, into the model’s input and observing how the output changes. For example, if a computer vision model is 87% confident that an image is a cat, and changing one pixel reduces the confidence to 85%, this suggests that the pixel contained an element of ‘cattiness’ from the point of view of the model. By repeating this across the whole image, we can build a map of which parts of the image are most cat-like to the model.
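The perturbation idea described above can be sketched in a few lines. This is an illustration under stated assumptions: toy_cat_score is a hypothetical stand-in for a real classifier, the 4x4 image is made up, and sensitivity_map is just one simple way to perturb an input (one pixel at a time, replaced by a baseline value).

```python
def toy_cat_score(image):
    """Hypothetical stand-in for a real classifier.

    Returns a 'cat' confidence in [0, 1] that depends only on the
    brightness of the 2x2 centre of a 4x4 image.
    """
    centre = [image[r][c] for r in (1, 2) for c in (1, 2)]
    return sum(centre) / len(centre)


def sensitivity_map(model, image, baseline=0.0):
    """Replace one pixel at a time with `baseline` and record the score drop."""
    original = model(image)
    heatmap = []
    for r, row in enumerate(image):
        heat_row = []
        for c, _ in enumerate(row):
            # Copy the image so each perturbation starts from the original.
            perturbed = [list(pixels) for pixels in image]
            perturbed[r][c] = baseline
            # A large drop means the model relied on this pixel.
            heat_row.append(original - model(perturbed))
        heatmap.append(heat_row)
    return heatmap


# Made-up all-white 4x4 image (an assumption for illustration).
image = [[1.0] * 4 for _ in range(4)]
heatmap = sensitivity_map(toy_cat_score, image)
```

Note that sensitivity_map only ever calls the model; it never inspects how the score is computed. In this toy case the heatmap is non-zero only for the four centre pixels, which are the only ones the stand-in model looks at.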

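The beauty of this perturbation-based approach is that we don’t need any understanding of the model architecture to perform the analysis. We only need to be able to run the model on modified inputs and read its output.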

There are several well-known frameworks for XAI, one of the most widely used in Python being LIME (Local Interpretable Model-agnostic Explanations).
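To give a feel for the idea behind LIME, the sketch below fits a simple linear model to a black-box model’s outputs in a small neighbourhood of one input, weighting nearby perturbations more heavily. It does not use the LIME library or its API: black_box, solve_linear_system, local_linear_explanation, the step size and the kernel width are all assumptions made for illustration, and a fixed grid of perturbations stands in for the random sampling the library performs.

```python
import math


def black_box(x1, x2):
    """Hypothetical model we can only query; the explainer never reads this formula."""
    return 3.0 * x1 - 2.0 * x2 + 0.5 * x1 * x2


def solve_linear_system(a, b):
    """Solve a small square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def local_linear_explanation(model, point, step=0.1, kernel_width=0.25):
    """Fit a proximity-weighted linear surrogate around `point`.

    Returns [intercept, weight_x1, weight_x2], expressed in offsets from `point`.
    """
    offsets = [-2 * step, -step, 0.0, step, 2 * step]
    xtwx = [[0.0] * 3 for _ in range(3)]  # weighted normal-equation matrix
    xtwy = [0.0] * 3                      # weighted right-hand side
    for du in offsets:
        for dv in offsets:
            features = [1.0, du, dv]
            y = model(point[0] + du, point[1] + dv)
            # Perturbations close to the point count more than distant ones.
            w = math.exp(-(du * du + dv * dv) / kernel_width ** 2)
            for i in range(3):
                xtwy[i] += w * features[i] * y
                for j in range(3):
                    xtwx[i][j] += w * features[i] * features[j]
    return solve_linear_system(xtwx, xtwy)


intercept, w1, w2 = local_linear_explanation(black_box, (1.0, 1.0))
```

The two weights are the local explanation: around the input (1, 1), increasing the first feature pushes this toy model’s output up and increasing the second pushes it down, even though the model is not linear overall.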
