“You ought to put on a coat!”
“Why?”
“Because it is snowing outside”
“Why does the fact it is snowing mean that I should put on a coat?”
“Well, the fact that it is snowing means that it is cold”
“And, why does the fact it is cold mean that I should put on a coat?”
“If it is cold, and you go outside without a coat, you will be cold”
“Should I not be cold?”
“If you get too cold, you will freeze to death”
“So you’re saying I should not freeze to death?” [1]
To prove a point, you need to agree on some common truths. To convince someone to “put on a coat”, an elementary premise would be that “freezing to death by not putting on a coat” is rather silly.
An Artificial Intelligence does not necessarily believe in immovable truths.
However, I believe that an Artificial Intelligence, designed by humans, must believe in some kind of statement like “You ought to put on a coat”.
First of all, let’s define some useful vocabulary, and in particular two kinds of statements:
“Is-statements: facts, how the world is, will be, was in the past or how it would be in hypothetical situations
Ought-statements: normative statements about how the world should be, goals, moral and values” [1]
In what follows, an agent means anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors.[2]
A utility function is a mathematical function that assigns a value to each possible outcome, ranking the agent’s preferences.
An agent is said to be rational if it possesses a utility function and behaves accordingly, maximizing the expected utility of its actions.
In the next paragraphs, I will assume that all agents are rational and act according to their utility function.
By definition, a rational agent ought to maximize its utility function.
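To make this concrete, here is a minimal sketch of what “maximizing a utility function” means (the names and probabilities are mine, purely illustrative): the utility function plays the role of the ought-statement, while the outcome probabilities are is-statements.

```python
# Toy sketch (my own illustration): a rational agent picks the action
# with the highest expected utility.

def expected_utility(utility, outcomes):
    """Average utility over the (outcome, probability) pairs of one action."""
    return sum(p * utility(o) for o, p in outcomes)

def rational_choice(utility, actions):
    """The 'ought' of a rational agent: choose the action maximizing expected utility."""
    return max(actions, key=lambda a: expected_utility(utility, actions[a]))

# The utility function is the ought-statement: "you ought to stay warm".
def stay_warm(outcome):
    return {"warm": 1.0, "cold": -1.0, "frozen": -100.0}[outcome]

# Is-statements: what each action leads to, and with what probability.
actions = {
    "put on a coat":       [("warm", 0.95), ("cold", 0.05)],
    "go out without coat": [("cold", 0.70), ("frozen", 0.30)],
}

print(rational_choice(stay_warm, actions))  # -> "put on a coat"
```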
Hence, an AI, assumed to be rational, cannot act in the world by reasoning with is-statements alone. Ought-statements are necessary to decide whether an action should be taken.
In the silly introductory dialogue, an agent tries to convince another agent to put on a coat by only stating facts, or is-statements.
Without knowing someone’s fundamental beliefs (ought-statements), it is impossible to convince him to do anything.
Essentially, one cannot derive an ought-statement from an is-statement.
They are separated by what is called Hume’s Guillotine.
In other words, they are orthogonal.
Here is how Nick Bostrom [3] defines the orthogonality thesis:
“Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal.”
To better understand this thesis, let’s precisely define what we mean by intelligence and final goals:
Final goal: the fundamental truth encoded in an agent. For instance: maximize the utility function U.
Intelligence: the agent’s ability to achieve any goal.
In other words, the orthogonality thesis claims that the complexity of an agent’s final goal is not correlated with the agent’s ability to achieve any goal.
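One crude way to picture the thesis (a toy sketch of my own, not taken from Bostrom’s paper): treat the planning machinery and the utility function as two independent modules, so that any level of “intelligence” can be plugged into any “final goal”.

```python
from itertools import product

# Toy sketch (my own illustration) of the orthogonality thesis: the planner
# ("intelligence") and the utility function ("final goal") are independent
# components that can be combined freely.

def greedy_planner(utility, actions):
    """Low intelligence: just pick the single best-looking action."""
    return max(actions, key=utility)

def lookahead_planner(utility, actions, depth=3):
    """More intelligence: search over sequences of actions, then take the first step."""
    best_plan = max(product(actions, repeat=depth),
                    key=lambda plan: sum(utility(a) for a in plan))
    return best_plan[0]

goals = {
    "maximize paperclips": lambda a: {"make_clip": 1.0, "write_poem": 0.0}[a],
    "maximize poetry":     lambda a: {"make_clip": 0.0, "write_poem": 1.0}[a],
}
actions = ["make_clip", "write_poem"]

# Any level of intelligence can be combined with any final goal.
for planner in (greedy_planner, lookahead_planner):
    for goal_name, utility in goals.items():
        print(f"{planner.__name__} + {goal_name} -> {planner(utility, actions)}")
```

Swapping the goal changes what the agent pursues, not how well it pursues it; swapping the planner changes how well, not what.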
On Monday, I organized the first AI Safety Meetup in Paris.
Basically, it consisted of seven rational agents in their twenties debating about the Orthogonality Thesis.
Left to Right: me, girl from the Future Society, random guy doing a PhD on consciousness, stranger studying technical AI Safety
“But can a mouse really have an infinitely complex goal?” objected someone.
“Well, no,” said the guy from Amsterdam studying AI Safety, “but neither can any other mammal.”
“But you could encode any goal in a computer without giving it any power to achieve it,” he continued.
Any mammal has a “fixed hardware”: we are programmed, in our DNA, to have a will to live.
However, let’s say you have an English-formulated goal, for instance the United States Constitution.
This text can be stored in any hardware with enough data storage, even in a pocket calculator.
But the pocket calculator does not possess any intelligence, in the sense we defined before!
“Well, you could, in principle, encode any goal into your pocket calculator. But would it understand it?”, I argued. “In order to inculcate a complex goal into an Artificial Agent, you need to give it the ability to comprehend it”.
“Fine,” the stranger answered, “it needs to have the basic ability to understand what the goal means, but it can have a very limited representation of the world, and no means to achieve it.”
In The Hitchhiker’s Guide to the Galaxy (HG2G):
“[Douglas Adams tells the story of] a race of hyper-intelligent pan-dimensional beings who built a computer named Deep Thought to calculate the Answer to the Ultimate Question of Life, the Universe, and Everything. When the answer was revealed to be 42, Deep Thought explained that the answer was incomprehensible because the beings didn’t know what they were asking. It went on to predict that another computer, more powerful than itself would be made and designed by it to calculate the question for the answer.” [4]
Here, the answer to the ultimate question of life, the universe and everything, is the ought-statement.
The ought-statement, for a rational Artificial Agent, is the meaning of its life. It defines what utility function it ought to maximize.
But this answer, like the answer “42” in HG2G, is meaningless without an understanding of the question.
To understand the ought-statement, the Artificial Agent must first understand every word of the statement and how it relates to the world.
A thorough understanding of the world, and a complex representation of reality, is therefore necessary to comprehend complex ought-statements.
There is no way of encoding complex goals into an Artificial Agent without giving it an even more complex ability to reason about the world.
“But isn’t it utterly silly to even be speaking about ought-statements when everything in the world is an is-statement?” disagreed the stranger.
“Any ought-statement for an agent must exist somewhere in its hardware, which is part of the physical world. So everything is an is-statement, and ought-statements are just a convenient way to simplify the situation.”
I don’t think so.
Here is why ought-statements are relevant.
What constrains us humans is not only our inability to represent the world correctly, but also our inability to express this representation clearly.
Thus, even if we had a very complex and well-defined ought-statement, we could not encode it inside an Artificial Agent, because we could not give it a good enough ability to reason about the world to understand it.
Furthermore, our standard way of representing the behavior of agents is through utility functions, and when we do that, we implicitly declare the ought-statement:
“You ought to maximize this particular utility function.”
We need to be able to encode human values in clear and simple utility functions.
That’s why ought-statements are necessary for us, humans, to build Artificial Agents.
This ought-statement might just be “you ought to put on a coat”.
References
[1] Robert Miles, The Orthogonality Thesis, Intelligence, and Stupidity, 2018.
[2] Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 1994.
[3] Nick Bostrom, The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents, 2012.
[4] The Hitchhiker’s Guide to the Galaxy, Wikipedia.
If this article was helpful to you, hold the 👏 button (up to 50 times) and become part of the 👏 gang.
You can follow me on Medium and Twitter, or you could even subscribe to my personal newsletter if you’re crazy enough.
If you enjoyed the discussion, join us at the Paris AI Safety Meetup.