“The main lesson of thirty-five years of AI research is that the hard problems are easy and the easy problems are hard.” Pinker (1994), The Language Instinct I thought programming Software agents to collect Treasures on a Graph would be a piece of cake. I was utterly wrong. Coding agents so they do not act foolishly turned out to be intrinsically difficult. A Well-Defined Multi-Agent Setting “An is anything that can be viewed as perceiving its environment through and acting upon that environment through .” , Stuart Russell and Peter Norvig agent sensors effectors Artificial Intelligence: A Modern Approach Simulation of a multi-agent system collecting treasures using GraphStream Library Let agents move in a fully-connected graph in order to collect treasures. The agents are limited in their movements, perception, and communication. They can only observe and move to nodes directly connected to them and communicate with close-enough agents. Here is a simple multi-agent problem. n : , and . Explorers are doomed to explore the map because they are not allowed to pick treasures. The ones who are the Collectors, but they cannot carry too many and must hand out their collected treasures to Infinite-backpack agents. There are three kinds of agents Explorers Collectors, Infinite-backpack agents can Every agent has its own representation of the world, its own graph, which is a sub-graph of the real graph. Their sub-graph is the memory of all the nodes they have visited, and the edges they have ever seen or taken. They must communicate this graph to the others, so they can all share a reconstruction from all the subgraphs. The agents have limited perception but can remember past observations. (tutorials can be found , or ). In this Framework, a is a set of instructions an agent will execute. At each turn, every agent executes each of its behaviors sequentially. JADE (Java Agent DEvelopement Framework) will be used to implement what is called “behaviors” here here Multi-Agent System behavior : implement the agents’ behaviors to let them collect as many treasures as they can in a certain amount of time. Your goal Seems easy, right? (Note: this project is part of an (a course from ANDROIDE, a Master’s Degree in AI I am currently doing at UPMC). It was inspired by the survival horror game , and in the full version of the project, agents need to deal with a terrifying Wumpus wandering around). introduction to Multi-Agent Systems Hunt The Wumpus Non-trivial Behaviors There can be only one agent per node of the Graph, so they must coordinate their actions in order to not obstruct the way. A specific protocol must be implemented to take this scenario into account. Imagine two agents moving in opposite directions in a long hallway. Conflict of agents in a simulation: MyExplorerAgent2 is blocking the two others Coordination Thus, cooperation is . When a conflict happens, a protocol to the situation must happen. They must share their subgraphs, see who is closer to a highly-connected node and agree on who will move. Agents have limited perception and different abilities. sine qua non unblock , to optimize their movements and prevent conflicts. Explorer agents must also agree on who will explore which part of the unknown graph Information Exchanges The process of exchanging information in a Multi-Agent setting to give each agent access to the global knowledge is called the gossip problem . Source For instance, let’s suppose every agent from the set {1,2, … ,n} knows exactly one piece of information, called . Then, one very simple protocol is to have agent 1 call 2,3, … , n and get to know their secrets. Then, when 1 knows all the secrets, he calls 2, … , n to tell them those secrets, and then everyone knows all the secrets. A total of n-1+n-1 = 2n-2 calls are made. Actually, needs 2n-4 calls, which is close to our simple algorithm. a secret the optimal solution However, in our problem, , which makes the algorithm slightly more complicated, because the total knowledge is dynamic (the more the agents explore the graph, the greater their total knowledge). the total information is not fully known until all the nodes have been explored What is the optimal number of messages that must be exchanged between agents so they all know all secrets? More messages imply a better global knowledge and better coordination. Yet, with thousands of agents and millions of nodes, That’s where optimization compromises appear. n the cost of sending thousands of messages every millisecond is far from negligible and becomes a computational burden. Asynchronous Communication Communication between agents is . Since the execution of the agents is , then there is to synchronize the agents’ actions. Moreover, when exchanging information, every agent has a mailbox with messages from other agents, so the During this delay, one agent might move far away and . asynchronous distributed no global clock communication may be delayed. never answer the original message Coalition Formation Example of coalition formation : source Certain goals are not achievable alone (i.e. lifting a heavy object). Therefore, agents might agree to form group of agents, called , to achieve common goals. coalitions With three agents with three different skills (exploring, collecting and accumulating), . Thus, a protocol for creating and updating coalitions must be implemented. A possibility is to use the (surplus created by a coalition of agents) to determine which coalitions are the most valuable. necessary coalitions of three agents must be formed at least Shapley value Even with a simple problem setting, . This is a recurrent phenomenon when trying to build AI algorithms which would behave in a human-like manner. several obstacles arise very quickly and the algorithmic complexity seems insurmountable Building AI exhibiting simple behaviors is Hard “It is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.” Moravec (1988) Mind Children If we replaced the agents with humans, I am confident that they would very quickly understand how to win at this game, communicating what they saw in the graph and forming coalitions to collect the most treasures. However, implementing rigid behavior rules for Intelligent agents proves to be surprisingly difficult. This is : Moravec’s paradox What is easy for humans is incredibly difficult for machines When it comes to playing chess, AI achieves super-human performance. But for basic human behaviors, such as walking or coordinating actions to explore a map, Artificial Intelligence algorithms are surprisingly harder. Garry Kasparov, the Chess Grandmaster, makes the following statement in : it is not difficult for any sufficiently advanced algorithm to beat 20 top chess players in a simultaneous game. . Deep Thinking But no AI (in a robot) could walk around in a crowded bar and move the chess pieces by itself Source: DARPA robots falling down Machine Learning works under very specific circumstances you say. Well, . But why don’t we use the latest Machine Learning (ML) algorithm to solve our problem?… ML-only algorithms can only be used for certain tasks Yes, Reinforcement Learning (RL) algorithms are all the rage and can solve astoundingly hard problems such as achieving or at . But , which is not the case for our treasure-hunting problem, where the map is not fully visible at the beginning. superhuman level at Atari Games the game of Go those games are full-visibility games with small data inputs Source: (Feb. 2018) Deep Reinforcement Learning Doesn’t Work Yet But isn’t OpenAI working on a multi-agent system to beat humans at Dota 2 in a 5 vs 5 setting using Machine Learning algorithms?... you say. Yes, OpenAI has shown impressive results when the world champions at Dota 2 in a 1 vs 1 setting. But it was and . beating mostly due to their substantial computational power was not a breakthrough in Artificial Intelligence Their goal is to win in a 5 vs 5 setting, using . So, they seem to be working on a Multi-Agent problem with a full Machine Learning approach (learning from human games), and appear to be . a dataset of 5.8 Million of games missing a top-down approach to Multi-Agent Systems . Pure ML can work for Individual agents or fully observable systems, but for Multi-Agent Systems in a not-fully-known world, a more generalizable approach must be taken. The agents do not infer, do not generalize We have no clue how to implement scalable behaviors With only two agents walking in opposite directions in a hallway, we encountered an issue. Implementing a protocol to deal with this specific problem was possible. But what about 100 agents on a 400 nodes map? There is a gap between features for a small number of agents and having a and implementation of a Multi-Agent system. hardcoding scalable generalizable What needs to be done , through research, to solve those kinds of problems. won’t teach the agents how to communicate because the search space is too big. A pure data-driven approach won’t lead anywhere. Specific Multi-Agent Protocols must be developed Learning without prior knowledge Conclusion Implementing an algorithm capable of solving the treasure-hunting problem proved to be much more difficult than it seemed. Conceiving Multi-Agent Systems capable of solving simple problems is far from trivial. Machine Learning algorithms achieved great results in the last decade, but alone they cannot solve all Artificial Intelligence problems. 👏 👏 If this article was helpful to you, hold the button (up to 50 times) and become part of the gang. You can follow me on Medium and Twitter , or you could even subscribe to my personal newsletter if you’re crazy enough. EDIT: To get your hands dirty and start coding, here is to quickly try out this project. the library developed by Cédric Herpson