Cassie Kozyrkov


Explainable AI won’t deliver. Here’s why.

Let’s talk about interpretability, transparency, explainability, and the trust headache

Explainable AI (XAI) is getting a lot of attention these days, and if you’re like most people, you’re drawn to it because of the conversation around AI and trust. If so, bad news: it can’t deliver the protection you’re hoping for. Instead, it provides a good source of incomplete inspiration.

Before we get caught up in the trust hype, let’s examine the wisdom of a sentence I’ve been hearing a lot lately: “in order to trust AI, we need it to be able to explain how it made its decisions.”

Complexity is the reason for all of it

Some tasks are so complicated you cannot automate them by giving explicit instructions.

AI is all about automating the ineffable, but don’t expect the ineffable to be easy to wrap your head around.

The point of AI is that by explaining with examples instead, you can dodge the headache of figuring out the instructions yourself. That becomes the AI algorithm’s job.

How could we possibly trust what we don’t understand?

Now that we’re automating things where we couldn’t have handcrafted that model (recipe/instructions) — because it’s way too complicated — are we seriously expecting to read it and fully grasp how it works? A recipe with a million boring items is something that a computer can remember easily, but it will overwhelm your limited human memory capacity.

So if we can’t just read that complicated tangle and figure out how the recipe is doing its decision-making, why should we trust it?

Imagine choosing between two spaceships. Spaceship 1 comes with exact equations explaining how it works, but has never been flown. How Spaceship 2 flies is a mystery, but it has undergone extensive testing, with years of successful flights like the one you’re going on.

Which spaceship would you choose?

This is a philosophical question, so I can’t answer it for you. I know I have a personal preference — maybe that’s the statistician in me — but I would choose careful testing as a better basis for trust.

Carefully testing your system, making sure that it works as it is supposed to — that’s what keeps you safe.

Testing as a better basis for trust

If we want a student to learn calculus, we want them to generalize beyond the textbook examples rather than overfit to them (“overfitting” translates approximately to “memorizing” in plain English). How do we check they’re competent?

Please don’t take a scalpel and go poking around in your student’s cranial wet stuff to figure out how they’re doing the calculus. That’s the equivalent of interpreting the model and I’m glad you don’t do that. You have no idea how the human brain implements calculus (since neither you nor neuroscience can describe the electrochemical signaling in there), but that’s okay — it’s not the best basis for trust anyway.

Craft exams to catch memorization and make sure they reflect the conditions where your student must perform.

What you should do instead is design an exam for your student carefully and if the student — human or machine — passes your exam, then you know they’re qualified. That’s pretty much what testing means in AI too.

The exam should be crafted to catch overfitting (using brand new data is the best way to foil those pesky memorizers) and relevant to the environment where the student must perform. Experts in applied AI take rigorous statistical testing seriously and you should too.
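
To make the exam idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the toy task (the label is 1 when x + y > 1), the `Memorizer` and `ThresholdRule` students, and the sample sizes. It is not a recipe for rigorous statistical testing, only a picture of why brand new data catches memorizers.

```python
import random

def make_data(n, seed):
    """Hypothetical toy task: the correct label is 1 when x + y > 1, else 0."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [((x, y), int(x + y > 1)) for x, y in points]

def accuracy(predict, data):
    """Share of examples the student answers correctly."""
    return sum(predict(point) == label for point, label in data) / len(data)

class Memorizer:
    """Overfit student: stores every training answer, answers 0 on anything unseen."""
    def fit(self, data):
        self.answers = dict(data)
        return self

    def predict(self, point):
        return self.answers.get(point, 0)

class ThresholdRule:
    """Generalizing student: learns one cutoff on x + y from the training data."""
    def fit(self, data):
        cutoffs = [i / 20 for i in range(41)]  # candidate cutoffs 0.0 .. 2.0
        self.cutoff = max(
            cutoffs,
            key=lambda c: accuracy(lambda p: int(p[0] + p[1] > c), data),
        )
        return self

    def predict(self, point):
        return int(point[0] + point[1] > self.cutoff)

textbook = make_data(200, seed=0)  # examples the students studied
exam = make_data(200, seed=1)      # brand new data, never seen in training

memorizer = Memorizer().fit(textbook)
rule = ThresholdRule().fit(textbook)

scores = {
    "memorizer_textbook": accuracy(memorizer.predict, textbook),
    "memorizer_exam": accuracy(memorizer.predict, exam),
    "rule_textbook": accuracy(rule.predict, textbook),
    "rule_exam": accuracy(rule.predict, exam),
}
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

On the textbook examples the memorizer looks flawless. On the fresh exam it should drop to roughly coin-flip accuracy, while the student that actually generalized holds up. Only the fresh-data score tells you who is qualified.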

Let’s talk about explainable AI

Am I saying that interpretability, transparency, and explainability aren’t important? That’s not it at all. They have their place… in analytics.

In many interpretability debates, you’ll notice the participants are talking past one another and misunderstanding the different areas of applied data science. They’re fundamentally interested in different classes of application.

If the application involves generating inspiration — in other words, it’s advanced analytics with AI — then of course you need interpretability. How are you going to use a black box for inspiration? With great difficulty, that’s how.

If you’re looking for inspiration with advanced analytics, that’s a different goal from building a safe and reliable automated system for decisions at scale, where performance matters most. If your project truly demands both, you can blend two objectives for a price: your result will likely be worse at each of the objectives than if you’d stuck with only one. Don’t pay for things you don’t need.

It all boils down to how you apply the algorithms. There’s nothing to argue about if you frame the discussion in terms of the project goals.

Often, the people arguing are researchers. Their job is building general-purpose tools that don’t have a business project (or project goals) yet, so like salespeople peddling potato peelers, they are motivated to extol the virtues of their wares to anyone who will listen, whether or not their captive audience is in danger of ever preparing food. “You need this potato peeler” simply isn’t true for everyone; it really depends on your project. The same goes for interpretability and XAI.

Fascination with mechanism

If you’re fascinated with how something works (mechanism) for its own sake, that’s the research instinct that got trained into you by your STEM classes. It helps you build a new student, a new brain, a new spaceship, a new microwave.

Much of the confusion comes from not knowing which AI business you’re in. Arguments that are appropriate for researchers (build better spaceships) make little sense for those who apply AI (solve problems using existing spaceships).

In applied data science, a love of mechanism is a great instinct for analysts; seeing how something works can reveal potential threats and opportunities. If you’re mining data for inspiration, chuck all black boxes out with yesterday’s trash!

Unfortunately, that same instinct will lead you astray if your goal is performance.

The mechanism inside the Canard Digérateur, or Digesting Duck, of 1739.

Popular nonsense concerning humans

Many people demand mechanism as a prerequisite for trust. They have a knee-jerk reaction to AI: “If I don’t know how it’s doing it, I can’t trust it to make decisions.”

If you refuse to trust decision-making to something whose process you don’t understand, then you should fire all your human workers, because no one knows how the brain (with its hundred billion neurons!) makes decisions.

Holding AI to superhuman standards

If you require an interpretation of how a person came to a decision at the model level, you should only be satisfied with an answer in terms of electrical signals and neurotransmitters moving from brain cell to brain cell. Do any of your friends describe the reason they ordered coffee instead of tea in terms of chemicals and synapses? Of course not.

Instead people do something else: examine the information and their choices, and then sing you a pretty song that tries to make sense of everything in hindsight. That’s XAI essentially, though human explanations aren’t always right. Behavioral economists get a kick out of planting decisions in their unsuspecting victims — er, experimental participants — and then enjoy listening to incorrect stories all about “why” the people made the decisions (which the experimenter actually made for them).

Whenever humans make up a convenient, oversimplified story that fits the inputs and outputs in hindsight, good news! You always have access to the same level of explainability for any model; it can even be model-agnostic. Just look at your input and output data, then tell a fun story. That’s what analytics is all about.
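
Here is a hedged sketch of what "look at your inputs and outputs, then tell a story" can look like in code. The `black_box` function, its three inputs, and the `sensitivity` helper are all hypothetical, invented for this example. The probe shuffles one input column at a time and counts how often the output changes, which is one simple model-agnostic technique among many. It never reads the model's internals.

```python
import random

def black_box(income, age, zip_digit):
    """Stand-in for an opaque model; pretend we cannot read this body."""
    return int(income > 50 and (age > 30 or zip_digit % 2 == 0))

def sensitivity(model, rows, column, seed=0):
    """Fraction of rows whose output changes when one input column is shuffled."""
    rng = random.Random(seed)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    changed = 0
    for row, value in zip(rows, shuffled):
        probe = list(row)
        probe[column] = value  # swap in another row's value for this column
        changed += model(*row) != model(*probe)
    return changed / len(rows)

# Hypothetical input data: (income, age, zip_digit) for 500 made-up cases.
rng = random.Random(1)
rows = [
    (rng.randint(0, 100), rng.randint(18, 80), rng.randint(0, 9))
    for _ in range(500)
]

names = ["income", "age", "zip_digit"]
story = {name: sensitivity(black_box, rows, i) for i, name in enumerate(names)}
for name, score in story.items():
    print(f"{name}: output changed in {score:.0%} of probes")
```

The resulting story ("the output reacts mostly to income") is useful inspiration, but it says nothing about the and/or structure inside the function. It is simpler than the truth, which is exactly the limitation described above.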

When it is simpler than the truth, your explanation is technically a lie. It might inspire you, but it’s closer to a safety blanket than a safety net.

Adding a layer of analytics is a good idea wherever you can afford it (remember, don’t take what you see too seriously). The wiser efforts in XAI focus on analytics with inputs and outputs. Sure, they make it sound like something brand new is being invented when it’s just good old-fashioned “look at your data and check it makes sense,” but I can’t fault the sensibility. The only issue I take with it is that it’s sold to you as a basis for trust. XAI is many good things, but the way it’s invoked in trust discussions is an audacious con. The explanation will always be horribly oversimplified, so it won’t be true. Oh dear.

Explainability provides a cartoon sketch of a why, but it doesn’t provide the how of decision-making. It’s not safe to take a cartoon sketch as more than inspiration, and you’d do well to remember that trust based on XAI is like trust based on a few pieces out of a giant puzzle.

Analytics without testing is a one-way ticket to a false sense of security.

But let’s get back to the stricter kind of interpretability, where the focus is on the model itself rather than on what’s happening in the data. In other words, we want to understand the equivalent of the functioning of ~100,000,000,000 little cells in your skull to explain how you really chose that coffee.

Why can’t we have both?

In a perfect world, you’d like perfect performance and perfect interpretability, but usually real life forces you to choose. You know what we call a task where you can have both? Simple, easy, and probably already solved without AI.

So let’s talk about those hairy, complicated, and incomprehensible tasks. The ones your brain has evolved to do without telling you how it does them. Or the ones where the signal is a faint needle spread across a huge haystack of feature combinations. The ones that force you to resort to AI in the first place. For those, you have to choose between:

  • Interpretability: you do understand it but it doesn’t work well.
  • Performance: you don’t understand it but it does work well.

Remember, explainability can be model-agnostic, so it’s almost a separate analytics project you can tack on if you have the time and energy. True model interpretability, however, will cost you performance if the task requires a complex recipe.
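
The trade-off can be sketched on a toy problem. The assumptions here are mine, not part of any real project: a made-up 3x3 checkerboard task stands in for "a task that requires a complex recipe," a one-line threshold rule stands in for the interpretable model, and a nearest-neighbour lookup over a thousand stored examples stands in for the model you cannot summarize in a sentence. Both are scored on data they have not seen.

```python
import random

def label(x, y):
    """Hypothetical complicated task: a 3x3 checkerboard over the unit square."""
    return (int(3 * x) + int(3 * y)) % 2

def sample(n, seed):
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, label(*p)) for p in points]

def fit_threshold(train):
    """Simple recipe: 'predict 1 when x exceeds c', with c tuned on training data."""
    cuts = [i / 10 for i in range(11)]
    best = max(cuts, key=lambda c: sum(int(p[0] > c) == t for p, t in train))
    return lambda p: int(p[0] > best)

def fit_nearest_neighbour(train):
    """Complicated recipe: copy the label of the closest stored example."""
    def predict(p):
        nearest = min(
            train,
            key=lambda ex: (ex[0][0] - p[0]) ** 2 + (ex[0][1] - p[1]) ** 2,
        )
        return nearest[1]
    return predict

def accuracy(predict, data):
    return sum(predict(p) == t for p, t in data) / len(data)

train = sample(1000, seed=0)
test = sample(300, seed=1)  # held-out data for the exam

simple_acc = accuracy(fit_threshold(train), test)
complex_acc = accuracy(fit_nearest_neighbour(train), test)
print(f"one-line rule:        {simple_acc:.2f}")
print(f"1000-example lookup:  {complex_acc:.2f}")
```

You can state the threshold rule in one sentence, but on this task it should do little better than guessing. The lookup model's "recipe" is a thousand stored points, which nobody would call readable, yet it is the one that passes the exam.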

When performance matters most, don’t limit your solutions to what the simple human mind can wrap itself around. You want to automate the ineffable, remember? The whole point is that some tasks really are so complicated that a good solution will be a rat’s nest of complexity, so you won’t understand the model. At best you’ll be able to oversimplify with analytics.

Don’t limit your solutions to what the simple human mind can wrap itself around.

To really succeed at these tasks with high performance, you have to let go. If you wanted your senses to be controlled by a small number of brain cells working understandably, they wouldn’t be very effective senses. A fruit fly’s simple brain is easier to understand than yours, but I bet you wouldn’t willingly trade places. You like your performance better.

Requiring all models to be interpretable is like demanding that no helper’s brain be more complicated than a fruit fly’s. There’s only so much a fly can help you with.

Instead, opt for the trust that comes from making sure that you’re able to verify that your system does, in fact, work.

It’s nice to have both, of course, and some tasks are simple enough that they can be solved in a way that delivers all your wishes, but if you can’t have both, isn’t it best to go straight for what’s most relevant? That’s performance on well-crafted tests.

Hubris and the danger of preferring mechanism

Some people still prefer mechanism as their ideal basis for trust. They’d choose the untested spaceship whose physics they’ve read.

Lest hubris get the better of us, it’s worth appreciating that mechanism is a step away from performance. People who prefer information about how something works may be trusting too much in their own human ability to leap from complex mechanism to expected performance. Let’s hope they’re clever enough to avoid a nasty splat.

Not everything in life is simple

In a nutshell, simple solutions didn’t work for tasks that need complicated solutions, so AI comes to the rescue with… complicated solutions. Wishing complicated things were simple does not make them so. If a law says your inherently complicated thing has to be simple, it’s a polite way of saying you can’t have the thing at all. (Which is sometimes for the best.)

Why can AI algorithms make solutions that are more complicated than the code you handcraft? A computer can remember (save to disk) a billion examples perfectly in a way that you can’t, so it doesn’t fuzz out the nuances in the same way you do. It doesn’t get bored by writing out a million lines of instructions. Computer memory is nothing new, but with modern computing power we can exploit it at scale. You may have had a few millennia of simplistic recipes that fit in human memory, but now it’s time to turn over a new leaf.

Today, humanity’s playbook has expanded to automating tasks with complicated recipes. Some of them are so complicated you can’t explain them in three sentences over a beer. We’d best get used to it.
