

Many call artificial intelligence (AI) a "black box", and it kinda is. One of the biggest problems with AI is that it's incredibly difficult to understand how the data is being interpreted.
Before we get our hands dirty and dive deeper, let's play a little game.
I'm going to show you a series of abstract images that are either in category A or B.
Hint: There's no C.
We'll get back to this later.
If you chose B, don't be embarrassed, you're not alone. When I ask a room full of engineers and developers, the split is always 50/50. So... why is the answer A...?
Because I said so.
The answer is A, there's no debating it, but if you don't agree with me, then it was my fault as the trainer.
As the trainer, I know that A is a red circle. So anything with a red circle in it is A. I also know that B is an orange circle. The rest of the image is irrelevant. It's all about trying to find a pattern between the set of images.
But it's hard.
In an AI system, I can't explain with words what makes the image A. All I can do is show you more pictures and hope it starts to click.
And you, the AI, can't tell me why you think it's B. It's up to me to blindly feed you data until you get the answer right.
Here's the same set of images, but less abstract. If I were to ask you the same question, everyone would know right away that A is an apple and B is an orange. This is so easy that many people think it's a trick question. We all know that the hand and the background are irrelevant, because we're humans and grew up learning these things, but for AI it's not a given. It sees images more abstractly and doesn't know what you want it to focus on.
Let's take a look at another toy scenario that shows how we might accidentally communicate the wrong signals to the AI system.
We have a few samples of oak trees. (It's a bit cloudy where I live.)
Here are some palm trees. (It was really sunny on the beach)
This next example is a palm tree, but the lighting is much closer to the oak trees. Which pattern should we focus on? The lighting? Or the shape of the tree? It might be difficult for the model to tell.
Confidence:
- Palm 0.75
- Oak 0.60
With this example, it might be pretty obvious that we left behind an unintended pattern for the AI to pick up. However, in reality, it's normally something much more inconspicuous.
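To make the oak/palm mix-up a bit more concrete, here's a tiny sketch. Everything in it is an assumption for illustration: the two hand-made features (`brightness` and `frond_score`), the numbers, and the one-threshold "model" are all made up and have nothing to do with how a real image classifier works internally. The point is only that both lighting and shape separate the training photos perfectly, so the data alone gives no reason to prefer one over the other.

```python
# Hypothetical features per photo: (brightness, frond_score), both in [0, 1].
# All values are invented for illustration.
train = [
    ((0.30, 0.10), "oak"),   # cloudy day, round leafy shape
    ((0.35, 0.20), "oak"),
    ((0.25, 0.15), "oak"),
    ((0.90, 0.85), "palm"),  # sunny beach, long fronds
    ((0.85, 0.90), "palm"),
    ((0.95, 0.80), "palm"),
]


def threshold_rule(samples, feature_index):
    """Return a midpoint cut if this single feature puts every oak below
    every palm, otherwise None."""
    oak = [f[feature_index] for f, label in samples if label == "oak"]
    palm = [f[feature_index] for f, label in samples if label == "palm"]
    if max(oak) < min(palm):
        return (max(oak) + min(palm)) / 2
    return None


def predict(cut, value):
    """Classify with a single threshold: above the cut means palm."""
    return "palm" if value > cut else "oak"


brightness_cut = threshold_rule(train, 0)  # lighting alone separates the set
shape_cut = threshold_rule(train, 1)       # shape alone separates it too

# A palm tree photographed on a cloudy day.
cloudy_palm = (0.30, 0.85)
by_light = predict(brightness_cut, cloudy_palm[0])
by_shape = predict(shape_cut, cloudy_palm[1])

print("lighting rule says:", by_light)
print("shape rule says:", by_shape)
```

Both rules score perfectly on the training photos, yet they disagree on the cloudy palm. Nothing in the training set tells the model which rule we meant.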
So how can we get more insight into what the AI is focusing on?
What if we passed a rectangle over the image and recorded the changes in confidence? If the confidence drops, then that's probably an important part of the image.
Which picture makes it easier to tell that this cable is a USB?
The first image completely obscures the connector, making it nearly impossible to guess, so we can denote the region the rectangle covers as important. However, the rectangle in the second image doesn't hinder our ability to determine the cable type. We can safely mark the location as insignificant.
We can continue to pass the rectangle over the image to establish a heat map of importance.
We can see that the model's focus is on the tip of the connector, which is great. It's looking where we want it to.
Let's look at a model that wasn't trained well.
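The rectangle pass described above can be sketched in a few lines of plain Python. This is a minimal sketch, not the code from the app: the function names, the patch size, the stride, and the fill value are my assumptions, and `toy_usb_confidence` is a made-up stand-in for a real model that only looks at one corner of a tiny 6x6 "image".

```python
def occlusion_heatmap(image, classify, patch=2, stride=2, fill=0.0):
    """Slide a patch x patch rectangle over the image and record how much
    the confidence drops when each region is covered."""
    height, width = len(image), len(image[0])
    baseline = classify(image)
    heatmap = []
    for top in range(0, height - patch + 1, stride):
        row = []
        for left in range(0, width - patch + 1, stride):
            occluded = [list(r) for r in image]  # copy so the input is untouched
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    occluded[y][x] = fill
            # A large drop means the covered region mattered to the model.
            row.append(baseline - classify(occluded))
        heatmap.append(row)
    return heatmap


def toy_usb_confidence(image):
    """Hypothetical stand-in for a model: its confidence is the mean pixel
    value of the top-right 2x2 corner, where the toy connector sits."""
    connector = [image[y][x] for y in range(0, 2) for x in range(4, 6)]
    return sum(connector) / len(connector)


# A 6x6 grayscale toy image: dim background, bright connector in the corner.
image = [[0.2] * 6 for _ in range(6)]
for y in range(0, 2):
    for x in range(4, 6):
        image[y][x] = 1.0

heatmap = occlusion_heatmap(image, toy_usb_confidence)
for row in heatmap:
    print(row)
```

With a real model you would swap `toy_usb_confidence` for a call that returns the confidence for the class you care about. A smaller stride gives a finer heat map at the cost of more classifier calls, since every rectangle position needs its own prediction.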
Confidence:
- USB 0.76
The model correctly predicted that the cable was a USB with a confidence of 0.76. We might say that's acceptable, especially since the photo was taken from far away and isn't great quality.
However, upon closer inspection, the model seems to be focusing on the wrong area, not the ends of the cable like we would expect.
What does this tell us?
The model appears to rely too heavily on the wire and fingers. To improve accuracy and clear up the confusion, we can include more examples of wire and hands in a negative training set.
We don't need to train on piles and piles of generic data until our model starts performing better. We can use this information tactically as an aid in retraining the model, saving us time and money.
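One way to turn a heat map into a retraining hint is to measure how much of the importance lands where we expect it. The sketch below is an assumption on my part, not something taken from the app: the heat map values are invented, the "expected" cell standing in for the connector is hypothetical, and the 0.15 cutoff is arbitrary.

```python
def focus_fraction(heatmap, expected_cells):
    """Share of the total positive confidence drop that falls inside the
    cells where we expect the model to look."""
    total = 0.0
    inside = 0.0
    for r, row in enumerate(heatmap):
        for c, drop in enumerate(row):
            if drop > 0:
                total += drop
                if (r, c) in expected_cells:
                    inside += drop
    return inside / total if total > 0 else 0.0


def off_target_cells(heatmap, expected_cells, min_drop):
    """Cells outside the expected region with a large drop, hottest first.
    These are candidates for what to add to a negative training set."""
    hot = [
        (drop, (r, c))
        for r, row in enumerate(heatmap)
        for c, drop in enumerate(row)
        if (r, c) not in expected_cells and drop >= min_drop
    ]
    hot.sort(key=lambda item: item[0], reverse=True)
    return [cell for _, cell in hot]


# Invented 3x3 heat map for the badly trained model: most of the
# importance sits on the wire and fingers, not on the connector.
heatmap = [
    [0.02, 0.05, 0.10],
    [0.30, 0.25, 0.03],
    [0.20, 0.05, 0.00],
]
connector_cells = {(0, 2)}  # hypothetical location of the connector tip

fraction = focus_fraction(heatmap, connector_cells)
hot = off_target_cells(heatmap, connector_cells, min_drop=0.15)

print("importance on connector:", round(fraction, 2))
print("off-target hot cells:", hot)
```

A low fraction plus a list of hot cells outside the connector tells you which parts of the photo to look at when choosing negative examples, instead of guessing.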
Wow! This is great, but I don't want to put in the effort to actually implement this
Good news! You can find the fully functional iOS app on my GitHub.
Creating your own model is easy, but that doesn't mean the work stops there. The hardest part of machine learning is always producing good data.
We can use the basic guidelines of having similar pose, lighting and a consistent mix of stock and natural photos across our training images to gain a foothold in our quest toward a good model. After that, we are left using tools and our intuition to try and gain insight into the thought process of AI.
Thanks for reading! If you have any questions, feel free to reach out at bourdakos1@gmail.com, connect with me on LinkedIn, or follow me on Medium and Twitter.
If you found this article helpful, it would mean a lot if you gave it some applause and shared it to help others find it! And feel free to leave a comment below.