Artificial Intelligence is all over the news. There are almost daily announcements of AI-powered products, while mainstream media is full of stories of robots taking over factory floors and replacing truck drivers, office workers and even stock traders. AI is going through a golden period, and the result will be a change in how we do things across the board.
The question we are going to try and answer here is how we can reason about AI as we think about building our own AI-powered applications. What are the conceptual tools we need to be effective? We are going to provide a simple framework, light on buzzwords, to achieve this. Then we are going to use that framework to determine where specific AI techniques or technologies, such as natural language processing (NLP), fit in the picture.
First, though, allow me a small history detour. It is generally accepted that the term Artificial Intelligence came into its own around 1956, although it was used in various forms before that. This was the year a group of scientists decided to get together to specifically study AI.
The actual proposal is fascinating, especially when viewed with the benefit of hindsight. McCarthy, Minsky, Rochester and Shannon wrote:
“We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. (..). We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.” — A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
Back in 1956, ten very smart people planned to focus quite hard and make a “significant advance” over a summer. It was undoubtedly optimistic, and it is easy to smile at it now. However, they showed incredible foresight in realising what the potential was. Over that summer a whole new field of study was created: one with deep roots in computer science that spans a range of fields from philosophy to genetics and psychology.
As that initial proposal suggests, the conjecture is that every aspect of learning or any other feature of intelligence can be described in a way that a machine can simulate it. Which leads to the next question: what is a “feature of intelligence”? Let’s quickly start dispelling some myths by saying that it is not about human intelligence. We are also not going to complicate things by leaning on vague phrases such as “sufficiently hard problems” or “complex behaviour”. As evolutionary AI teaches us, very simple behaviour can lead to extraordinary results. Also, AI does not just mean machine learning (ML), although ML is certainly the current driver of renewed enthusiasm in AI as a whole. Nevertheless, while entertaining, it is not particularly useful to agonize over the intricacies of neural networks and deep learning architectures for practical applications, in the same way it is not useful to worry about the details of low-level networking protocols when building websites.
To understand AI in a practical way, one that helps us grasp the properties of the systems we are using, designing and building, we will use a different paradigm. We will draw from a field of computer science called Agent-Based Software Engineering or Agent-Based Computing.
A simple way to describe agent-based computing is that it is a combination of software engineering and artificial intelligence. It concerns itself with how intelligent programs (agents) can be structured and provides ways to model and reason about your system and the interactions between agents much in the same way object-oriented modelling does.
We will understand AI by understanding what an agent is.
One of the classic textbooks for AI (Artificial Intelligence: A Modern Approach by Russell & Norvig) describes agents as “anything that can be viewed as perceiving its environment through sensors and acting on the environment through effectors”. While this is not necessarily a complete understanding, it is good enough for our purposes.
An agent is something that perceives its environment (for example a chatbot waiting for a message from a user) and based on the input it receives (“Hello bot, how are you”) it will respond (“I am fine and how are you?”). An agent can also be a thermostat that perceives the temperature of a room and switches off the heating when the desired temperature is reached.
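To make the sensor/effector picture concrete, here is a minimal sketch of the thermostat as an agent in Python. The class name, method names and temperature values are illustrative choices for this post, not an established API.

```python
# A minimal sketch of the perceive/act loop, using a made-up ThermostatAgent.
class ThermostatAgent:
    def __init__(self, target_temp: float):
        self.target_temp = target_temp

    def perceive(self, room_temp: float) -> float:
        # The "sensor": receive the current room temperature.
        return room_temp

    def act(self, percept: float) -> str:
        # The "effector": decide whether to switch the heating on or off.
        return "heating_off" if percept >= self.target_temp else "heating_on"


agent = ThermostatAgent(target_temp=21.0)
print(agent.act(agent.perceive(19.5)))  # heating_on
print(agent.act(agent.perceive(22.0)))  # heating_off
```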
“Hold on there”, I hear you say. “A thermostat is an AI — I thought AI is about hard problems!?”. That is a fair statement but we are trying to build a framework that will help us deal with all sorts of situations without any vague assumptions of “complexity”. Do not get hung up on “hard problems” or “complicated situations”. Everything lies on a continuum and we will get there in time.
The examples we gave here are of what would be called reactive agents. Input A comes in and Output B is offered. Every time the same thing will happen. If no inputs are received our agent will not do anything. Not very exciting but it is a start.
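A purely reactive agent can be sketched as little more than a lookup table: the same input always yields the same output, and input it does not recognise (or no input at all) yields no action. The phrases below are illustrative.

```python
from typing import Optional

# Fixed input -> output mapping; the phrases are illustrative.
RESPONSES = {
    "hello bot, how are you": "I am fine and how are you?",
    "bye": "Goodbye!",
}

def reactive_bot(message: str) -> Optional[str]:
    # Same input, same output, every time.
    return RESPONSES.get(message.lower().strip())

print(reactive_bot("Hello bot, how are you"))  # always the same reply
print(reactive_bot("what's the weather?"))     # unknown input -> None (do nothing)
```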
Now, imagine your bot is a restless one. It is not happy to just sit there and wait for someone to speak to it. Instead it has its own agenda, its own goals. This restless bot wants you to interact with it. So it is going to act proactively to achieve this. Say it is a news bot — its objective is to get you to read news. Rather than waiting for you to input something it is going to write to you with “You won’t believe what happened today! Click here for the breaking news”.
Proactive agents can be described as having goals: desirable states of the environment they want to bring about. They perceive the world (Ron hasn’t read the news yet), can act to change it (“Hey Ron! Read some news!”), and there is some state they want the world to be in (Success! Ron has read the news.). Proactive agents are definitely a bit more fun.
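As a rough sketch, the proactive news bot could be modelled as holding a desired world state and acting on its own initiative until that state holds. The world dictionary and the message below are illustrative assumptions, not part of any real system.

```python
def news_bot_step(world: dict):
    # The goal: a world in which the user has read the news.
    if world.get("user_has_read_news", False):
        return None  # goal satisfied, nothing to do
    # Otherwise act on its own initiative to push the world towards the goal.
    return "You won't believe what happened today! Click here for the breaking news"

world = {"user_has_read_news": False}
print(news_bot_step(world))   # the agent acts without waiting for any input
world["user_has_read_news"] = True
print(news_bot_step(world))   # goal reached, so no action (None)
```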
Now, let’s assume our agent is really very, very keen on getting us to go and read the news. In order to achieve its goal it knows it has to communicate with us and convince us to perform an action. A simple proactive agent may have a fixed list of actions it can perform to achieve its goals. A learning proactive agent will also attempt to determine the efficacy of those actions (perhaps by applying some utility function to the results) and adapt its behaviour accordingly.
For example, if “You won’t believe what happened today!” is not an effective hook then it might try “Click here for the latest news” or “The world is changing! Find out how!”.
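One simple way to sketch this learning behaviour is a bandit-style loop: track how well each hook performs (here, click-through rate serves as the utility), usually pick the best one, and occasionally try another. The hooks and the exploration rate below are illustrative choices, not a recommendation.

```python
import random

# Running stats per hook; click-through rate acts as a simple utility function.
hooks = {
    "You won't believe what happened today!": {"shown": 0, "clicks": 0},
    "Click here for the latest news": {"shown": 0, "clicks": 0},
    "The world is changing! Find out how!": {"shown": 0, "clicks": 0},
}

def choose_hook(epsilon: float = 0.1) -> str:
    if random.random() < epsilon:
        return random.choice(list(hooks))  # occasionally explore a different hook
    def rate(hook: str) -> float:
        stats = hooks[hook]
        return stats["clicks"] / stats["shown"] if stats["shown"] else 0.0
    return max(hooks, key=rate)            # otherwise exploit the best-scoring hook

def record_result(hook: str, clicked: bool) -> None:
    hooks[hook]["shown"] += 1
    hooks[hook]["clicks"] += int(clicked)

hook = choose_hook()
record_result(hook, clicked=True)  # feedback updates the utility estimate
```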
There can be any number of layers of complexity hidden behind this learning proactive agent. The way it assigns a score to different reactions can get extremely sophisticated, and it may be able to draw not just on the reaction of a single person but on those of the thousands or millions of users it interacts with. If the data grows and there are many hooks to test, it will need special tools to make sense of it all. Big Data analysts will be hired to identify the patterns. Hundreds of machines will be crunching numbers.
Nevertheless, the goal “Get user to read news” remains a simple one, irrespective of the amount of work that goes on in the background to figure out how to achieve it. It is also a single goal. The agent may perform different actions (i.e. utter different phrases), but they are all meant to lead to the same world state: one where we have read the news. Thinking of our learning proactive agent in this way allows us to box off the underlying complexity and understand the immediate issue at hand. Namely, we want to proactively engage with the user and get them to achieve a specific goal.
Now, let’s make it a bit more interesting. We will assume our agent has a rather more sophisticated objective. Instead of being driven by a goal, i.e. by a simple need to change the world from State A to State B, it has motivations. These higher-level objectives, such as “Get user to engage with politics more”, could be satisfied by a number of goals. One might be “get user to visit site and read policies” while others are “get user to like Facebook page”, “get user to reply to 5 survey questions”, and so on. Each goal will have a number of plans or actions associated with it, and our agent will have to decide which to pick based on some mechanism. At this point we have an autonomous agent. We are not necessarily completely sure what it is going to do next and how the world is going to change because of that action (within limits, of course). It can decide how to change the world in order to satisfy its motivations.
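A motivation-driven agent might be sketched as a weighted choice over goals, each with its own plans. The motivation, goals and weights below are illustrative; a real agent would update them as it learns which goals best serve the motivation.

```python
import random

motivation = "Get user to engage with politics more"

# Each goal has a weight (how well it serves the motivation) and candidate plans.
goals = [
    ("visit site and read policies", 0.5, ["send policy summary link"]),
    ("like the Facebook page",       0.3, ["send page invite"]),
    ("answer 5 survey questions",    0.2, ["send survey link", "send a reminder"]),
]

def choose_next_action():
    names, weights, plans = zip(*goals)
    index = random.choices(range(len(goals)), weights=weights)[0]
    return names[index], random.choice(plans[index])

goal, action = choose_next_action()
print(f"Motivation: {motivation!r} -> goal: {goal!r} -> action: {action!r}")
```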
You can probably imagine how complex things can become if an agent has multiple motivations, each leading to multiple goals with the motivations having different weightings. In short, things can get really interesting.
For the sake of completeness, let us also briefly consider multiple interacting agents. Assume that an agent community has a common motivation of “get user to engage with political action”. Each individual agent, however, has differing capabilities and motivations. One agent is the Facebook Messenger agent with a particular focus on the Facebook environment and social media engagement, while a website-based agent is more interested in getting a user to donate money to the campaign. We might decide that they need to negotiate amongst themselves about who gets to influence the user in what way. A lot of work in agent-based computing is actually around negotiation mechanisms, with agents even holding auctions as they bid for the chance to influence the world. The value an agent is willing to assign to an action is a useful mechanism for maximizing utility across the system.
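As a toy illustration of that negotiation, each agent could bid the value it assigns to contacting the user next, with a simple first-price auction picking the winner. The agent names and bid values below are made up for the example.

```python
# Each agent bids the value it assigns to acting on the user next;
# a simple first-price auction picks the winner.
def run_auction(bids: dict) -> str:
    return max(bids, key=bids.get)

bids = {
    "facebook_messenger_agent": 0.6,  # values social media engagement right now
    "website_donation_agent": 0.8,    # values a donation ask more highly
}
print(run_auction(bids), "wins the right to contact the user")
```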
Once more, you can quickly see how the problem grows as we are essentially attempting to model a whole community of interacting agents with different motivations, different ways of perceiving the world, etc.
But you didn’t mention machine learning! What about Natural Language Processing? Vision? Autonomous driving? Isn’t that the “real” AI?
Yes, we haven’t mentioned machine learning in quite the way it is usually talked about. Our objective is to create a simple framework that we can use to understand some of the AI that we hear about in the news and, more importantly, how to think of our own applications.
Cloud-based APIs powered by machine learning can make our agents more intelligent than ever. That is why AI is suddenly relevant once more. While we could always reason in terms of goals, motivations and actions, that was not terribly exciting if our agent could not understand what a user was telling it, or could not accurately predict what a user would like. These services mean that our agents can behave in more sophisticated ways. We, however, need to be able to reason efficiently about the behaviour of the agents we are building. That starts with basic characteristics such as reactivity, proactivity and autonomy. Everything else can build on top of that.
Now, let us look at a couple of the typical technologies or application domains that are mentioned as prime examples of AI.
If our chatbot is plugged into an NLP API, the API could be modelled as just another sensor our agent uses, or as a simple reactive agent that our agent communicates with. In either case we would expect that, given the same phrase, we would always get the same result. If, however, the NLP agent has a goal of always giving a better response (based on whatever learning mechanisms run in the background), we are better off modelling it as a proactive learning agent, and the rest of our application needs to be comfortable with the response potentially changing over time.
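Here is a sketch of the first option: treating the NLP service as just another sensor. The call_nlp_api function stands in for whichever cloud NLP API you happen to use; its name and response shape are assumptions made purely for illustration.

```python
def call_nlp_api(text: str) -> dict:
    # Placeholder for a real HTTP call to an NLP service; the response
    # shape (intent + confidence) is an assumption for this sketch.
    return {"intent": "greeting", "confidence": 0.92}

class ChatAgent:
    def perceive(self, user_message: str) -> dict:
        # The NLP service acts as the agent's sensor: raw text in,
        # structured percept out.
        return call_nlp_api(user_message)

    def act(self, percept: dict) -> str:
        if percept["intent"] == "greeting" and percept["confidence"] > 0.5:
            return "I am fine and how are you?"
        return "Sorry, I didn't catch that."

agent = ChatAgent()
print(agent.act(agent.perceive("Hello bot, how are you")))
```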
The agent (or agents) that actually learn how to do NLP before we get to use it through an API are not our concern. This is precisely the key. It is easy to get caught up in all the layers of complexity; ultimately, however, we need to divide things into neat little boxes and focus on what matters for our AI-powered application.
Autonomous driving is a great example of how we can use agents to reason about a problem. Agent-based computing would suggest we consider a car not as a single agent but as a community of communicating agents. Each one is making thousands of decisions about the placement of other cars, the position of pedestrians or the management of the car itself, and each plugs into complex algorithms to figure this out. Decisions are shared, and ultimately competing goals or motivations are satisfied, such as “Take me back home”, “Don’t kill anyone while trying to get me there”, “Save fuel”, etc.
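A toy arbiter gives a flavour of how such a community could resolve competing goals: each agent proposes an action in service of its own goal, and a fixed priority ordering (safety first) decides which proposal wins. The agents, actions and priorities are illustrative, not how any real autonomous vehicle works.

```python
# Each agent proposes an action in service of its own goal;
# a fixed priority ordering (lower number wins, safety first) arbitrates.
proposals = [
    ("pedestrian_monitor", "Don't hit anyone",  "brake",           0),
    ("route_planner",      "Take me back home", "turn left ahead", 1),
    ("fuel_manager",       "Save fuel",         "coast downhill",  2),
]

def arbitrate(proposals):
    return min(proposals, key=lambda p: p[3])

agent, goal, action, _ = arbitrate(proposals)
print(f"{agent} wins: {action!r} (serving the goal {goal!r})")
```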
When building applications that need to interact with people or other applications in interesting ways, agent-based concepts can greatly help us categorise and reason about what we are trying to do and what AI is. They give us a single vocabulary through which to describe what our system is doing.
In this post we have merely scratched the surface in order to set the scene. Agent-based computing has much more to offer, as it describes the internal architecture of individual agents as well as architectures of communities of agents. As machine learning solutions become more widely available, and as sophisticated programs that proactively and autonomously attempt to solve problems become a necessity, agent-based computing offers a framework for both thinking about them and actually building them. In follow-up posts we are going to dive into the specifics of modelling a chatbot using agent concepts and how it interacts with the world.
Part 2 of the guide is now published: “Understanding what the user says”.
Interested in why chatbots are actually here to stay? Read our post on the drivers behind conversational interface technology or find out more about our chatbot build services.
This post was originally published on deeson.co.uk