A machine learning algorithm analyzes its weights and biases in the film Ex Machina
Sometimes, when I jump in a self-driving Uber or notice that Facebook somehow tagged my profile picture with an ID of “stereotypical Colorado hiker,” I can’t help but wonder how close we really are to the dystopic robot uprising of Ex Machina or Westworld. Computers can play chess, but could they scheme, lie, or cheat? Can they be creative? Will they ever be able to critically analyze their place in the world and decide they want a better one?
Recent machine learning breakthroughs rely on a strategy called supervised learning to train networks of artificial “neurons” to perform specific tasks. These incredible algorithms have allowed for advances in every flashy tech pursuit from in-browser automatic translation to control systems for self-driving cars, but they all rely on huge amounts of labeled data to evaluate and adjust their performance in the task at hand (classification, for instance).
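To make "labeled data" concrete, here is a minimal sketch of what a supervised training loop looks like: a toy two-layer network learning to classify made-up points, with every answer handed to it up front. The dataset, architecture, and numbers are all invented for illustration, not taken from any particular system.

```python
import numpy as np

# Toy labeled dataset: 200 points in 2-D, label = 1 if the point falls inside
# a circle of radius 1, else 0. In real systems the labels come from humans,
# and there are millions of them.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (np.linalg.norm(X, axis=1) < 1.0).astype(float).reshape(-1, 1)

# A tiny two-layer network: 2 inputs -> 8 hidden units -> 1 output.
W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    # Forward pass: the network makes its guesses.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # "Supervision": compare the guesses against the labels, then push
    # corrections backwards from the top of the network down.
    grad_p = (p - y) / len(X)                # gradient of cross-entropy loss
    grad_W2 = h.T @ grad_p
    grad_b2 = grad_p.sum(axis=0)
    grad_h = grad_p @ W2.T * (1 - h ** 2)    # back through the tanh layer
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)

    # Gradient descent: nudge every weight a little toward fewer mistakes.
    lr = 0.5
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print(f"training accuracy: {((p > 0.5) == y).mean():.2f}")
```

The entire process only works because somebody already decided what the task is and labeled every point accordingly; take the labels away and this loop has nothing to optimize.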
Using supervised training to optimize a neural network for a specific task is great — if you only have one type of task to solve. And, even though some algorithms have been sufficiently generalized to apply to a few overlapping tasks, you can’t really count on a network designed for language processing to learn how to recognize images particularly well. This is a huge obstacle for the nightmare robot uprising: something about organic intelligence is just more fluid, more resilient to change, and more capable of forming subtle or unexpected associations.
If we want to encourage the success of the nightmare robot uprising — which, for some reason, is the position of this article — we need to re-evaluate how we’re trying to build our artificial brain. We typically work from the top down: we start with the result we want for a particular task, measure how far the network’s output falls short of it, and push corrections back down through the network until it performs that task as well as it can. But a real brain forms in the opposite direction: connections grow up through the network from below, and a few simple rules of attraction encourage an efficient arrangement of neuron clusters and the pathways between them.
This sort of emergent complexity is called swarm intelligence, based on how huge colonies of insects (bees, for example) can work together to find optimal paths to food. The process relies on huge numbers of individual actors, each programmed with a few basic rules. A bee dances to communicate its findings with the hive, and the duration of a dance corresponds with the quality of the food source (in terms of quantity and proximity). Then follows a second generation of pilgrim bees, who are programmed to follow the path described by the very first dancing bee they encounter, and then to return home and dance about it themselves. Then a third generation does the same, and the process continues.
Eventually, the entire hive will find its way to the best food source, since the odds of encountering a bee dancing about that particular place will be slightly higher than the others initially (based on the length of the dance) and will continue to improve as more bees start twerking in that direction themselves. The entire hive, by virtue not of intelligence but of sheer numbers, can solve a problem together that no individual part of the group would be capable of solving on its own.
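If you want to see how little machinery that actually takes, here is a toy simulation of the dance-and-follow rule described above. The food sources, their qualities, the hive size, and the number of generations are all made up; the point is only that the hive converges on the best source without any individual bee ever comparing the options.

```python
import random

random.seed(42)

# Three hypothetical food sources and their quality (higher is better).
# Dance duration is proportional to quality, so better sources get
# "advertised" for longer.
sources = {"meadow": 0.9, "garden": 0.6, "parking lot": 0.2}

# The first generation of scouts comes back and starts dancing.
dancers = list(sources)

for generation in range(12):
    # Each bee in the new generation follows the first dancer it bumps into.
    # Longer dances mean a proportionally higher chance of being the dance
    # a wandering bee happens to encounter.
    weights = [sources[s] for s in dancers]
    followers = random.choices(dancers, weights=weights, k=100)

    # Every follower visits the source, returns home, and dances about it
    # herself, so the next generation samples from this new crowd.
    dancers = followers
    counts = {s: dancers.count(s) for s in sources}
    print(f"gen {generation:2d}: {counts}")
```

Run it and within a handful of generations nearly every bee is dancing about the meadow, even though no single bee ever weighed the three options against each other.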
Several bees teaming up to solve a difficult problem
This is a lot like how a developing brain forms connections between new neurons: each neural “bee” (a radial glial cell) has just a few rules it follows as it moves through clusters of neurons. Once enough of them move through the network, the basic rules governing their development allow for an optimal pattern of connections between the neurons to emerge.
This approach to building a brain could make a big difference. In fact, when comparing the human brain with that of a chimpanzee, it appears that the single most important distinction may simply be the quantity of neurons and their connections — I mean, the number of bees in the hive. The same kinds of emergent patterns that let a chimp fish for termites with a stick, multiplied across billions more neurons and connections, make us capable of understanding language. If only their brains could continue to develop as many new connections as ours can, they could presumably learn to do any of the amazing things that make us human (e.g. operate a Twitter account, online banking, etc.).
All this brings us to a critical tool in developing a robust and flexible network for our terrifying army of artificially intelligent super-robots: unsupervised learning. Unsupervised models aren’t as well developed yet as their supervised counterparts, but the idea is simple: they train on unlabeled data. They could conceivably allow elegant patterns to emerge organically, provided we can figure out the best rules for our dancing bees to follow in the first place.
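For a sense of what "patterns emerging from unlabeled data" can look like, here is one of the simplest unsupervised algorithms, k-means clustering, written out by hand: two rules, repeated until nothing moves, and structure appears without anybody labeling a single point. The data is synthetic, and the choice of k-means here is mine for illustration, not a claim about how the super-robots should be built.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unlabeled data: three blobs of points, but nobody tells the algorithm that.
X = np.vstack([
    rng.normal(loc=center, scale=0.4, size=(100, 2))
    for center in [(-2, 0), (2, 0), (0, 3)]
])

# k-means: two simple rules, applied over and over.
k = 3
centers = X[rng.choice(len(X), size=k, replace=False)]
for _ in range(50):
    # Rule 1: each point joins the nearest center.
    distances = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = np.argmin(distances, axis=1)
    # Rule 2: each center moves to the middle of its group.
    new_centers = np.array([
        X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
        for j in range(k)
    ])
    if np.allclose(new_centers, centers):
        break
    centers = new_centers

print("discovered cluster centers:\n", centers.round(2))
```

The printed centers land near the three blobs the data was drawn from, which the algorithm was never told about. It is a long way from a brain, but it is the same flavor of trick: simple local rules, repeated, producing structure nobody specified.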
We could also start to wildly speculate about building a hybrid solution by wrapping a series of neural networks, themselves pre-trained in a supervised setting, in an unsupervised model. The unsupervised container could send input through each pre-trained network and discover which network should solve what problems, which networks should be multiplied or mitigated, and where neurons might form long-range connections with other networks to provide more humanesque capabilities for creative association. This could allow for an artificially intelligent program that intuitively knows to analyze an image when it sees it, and may, by doing so, be reminded of a relevant song, and could, after thinking of that song, tell you about its whole train of thought in conversational English.
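Nothing like this exists, so take the following as pure speculation in code form: a couple of placeholder "pre-trained experts" (stand-in functions, not real networks) wrapped by an unsupervised router that clusters unlabeled inputs and hands each cluster to whichever expert seems most confident about it. Every name, rule, and number below is invented.

```python
import numpy as np

rng = np.random.default_rng(7)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Placeholder "pre-trained experts". In the speculation above these would be
# full networks (an image model, a language model, ...); here each is just a
# function that happens to be confident about a different kind of input.
def expert_a(x):
    # Sure of itself when the input's average value is far from zero.
    return sigmoid(x.mean(axis=1) * 4)

def expert_b(x):
    # Sure of itself when the input is very spread out.
    return sigmoid((x.var(axis=1) - 1) * 4)

experts = {"expert_a": expert_a, "expert_b": expert_b}

# Unlabeled inputs: two informal "kinds" of data, mixed together with no tags.
X = np.vstack([
    rng.normal(loc=2.0, scale=0.3, size=(50, 8)),   # high-average inputs
    rng.normal(loc=0.0, scale=2.0, size=(50, 8)),   # high-spread inputs
])

# Step 1 (unsupervised): the container groups the inputs into clusters.
k = 2
centers = X[rng.choice(len(X), size=k, replace=False)]
for _ in range(25):
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(axis=-1), axis=1)
    centers = np.array([
        X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
        for j in range(k)
    ])

# Step 2 (routing): hand each cluster to whichever expert is most confident
# about it, measured here as how far its outputs sit from a shrugging 0.5.
routing = {}
for j in range(k):
    cluster = X[labels == j]
    confidence = {name: np.abs(f(cluster) - 0.5).mean() for name, f in experts.items()}
    routing[j] = max(confidence, key=confidence.get)

print("cluster -> expert:", routing)
```

Everything interesting about the real version (long-range connections between networks, deciding which networks to multiply or mitigate, the conversational train of thought) is exactly the part this toy leaves out; the sketch only shows the shape of the idea, an unsupervised layer deciding which specialist gets which problem.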
Making these systems efficient enough for a pseudo-conscious intelligence to emerge is still an immense task, even if the entirety of the network doesn’t have to be explicitly programmed by hand. But it’s important to indulge the _Westworld_-ian fantasies of feeling-bots in real terms every once in a while, if only so the study of machine learning and artificial intelligence doesn’t get swallowed up by the ever-present buzzkill of commercial viability.