In brief, our company’s AI bested top US lawyers for the first time in accurately spotting risks in everyday business contracts.
The results went about as viral as a legal services solution gets. The story was covered by hundreds of news sites, including Mashable, Inc, Futurism, The Daily Mail, The Daily Dot, and Popular Mechanics, and made the front page of Reddit. It showed that no matter how many times this showdown takes place, there remains something dramatic about AI taking on humans. I have since been asked by others in our narrow field (there are more than 67 AI players in legal) and in other sectors to provide some thoughts on the “whys” and “how’s” of doing something similar.
The AI vs Human test has become something of a 101 in the lifecycle of an AI company. The “why” of doing such a test relates fundamentally to the simple definition of Artificial Intelligence. AI is a “system that learns to perform intelligent tasks we usually think only humans can do.”
Almost every industry has, over the past decade, reached a point where technology has overtaken (skilled) humans. But demonstrating this crossing of the robotic Rubicon remains a crucial step.
Creating a dynamic competition gives your team of researchers and engineers something to rally around. It also provides a shorthand way to categorically and independently affirm a shift in your field. The more players repeat it, the better for the adoption of AI and technology.
Helping demystify “magic” from your profession
The second “why” for doing an AI vs Human matchup is that in many professions, there is an inherent human “magic” associated with tasks. Historic showdowns necessarily challenge this assertion. In our case, it opened a centuries-old profession’s eyes to its inefficient manual processes. Powerfully, it also helped to unhype AI for a skeptical audience. Shena Shenoi, a Harvard-educated business lawyer and Non-Disclosure Agreement expert, was one of 20 lawyers who took on our machine in the showdown over that most lawyerly of tasks, contract review. She said: “The test gave me a practical glimpse into how technology can automate a staple of the legal profession: reviewing Non-Disclosure Agreements (NDAs). The type of issue-spotting carried out is credible and quite similar to how we (manually) do this type of work and have for decades.”
Bringing back humanity (through robots)
Being beaten by an algorithm does not actually hurt humans. The wonderful AlphaGo documentary shows how DeepMind’s AlphaGo defeated Lee Sedol, the international Go champion. The overriding message was not the rise of scary machines, but the beauty of the pursuit itself and the nobility of the champions for humanity, both Go players and the creators of the technology.
Demis Hassabis, CEO and co-founder of DeepMind, makes this exact point when recalling lessons from his historic achievement. He wrote on his blog: “We’ve learned two important things from this experience. First, this test bodes well for AI’s potential in solving other problems. AlphaGo has the ability to look “globally” across a board and find solutions that humans either have been trained not to play or would not consider.
“This has huge potential for using AlphaGo-like technology to find solutions that humans don’t necessarily see in other areas. Second, while the match has been widely billed as “man vs. machine,” AlphaGo is really a human achievement. Lee Sedol and the AlphaGo team both pushed each other toward new ideas, opportunities and solutions — and in the long run that’s something we all stand to benefit from.”
The How’s: Good humans + a clear structure
But how should you start to think about creating such a challenge? Firstly, get the best participants. Noam Brown, who created Libratus, which decisively defeated four of the world’s best human poker professionals, says his number one piece of advice is: “Challenge the very best humans, and structure the rules of the competition so that a victory by the AI would leave no doubt that the AI is truly superior.”
In our lawyer contract-off, the hardest challenge was getting the best participants. There is no undisputed best lawyer in the world (and the most famous lawyers would probably not be so familiar with the drudgework of the mundane contracts we were testing), so we ensured every lawyer hired had specific skills and decades of experience in the exact area we were testing. This was crucial. It left no doubt as to the superiority of AI in ploughing through legal issue-spotting: the AI took 26 seconds, while each lawyer took an average of 92 minutes. Our particular test sourced 20 top corporate lawyers whose decades of contract experience spanned companies including Goldman Sachs and Cisco, and global law firms including Alston & Bird and K&L Gates.
We also sourced the best human “referees”. We enlisted leading legal academics at top US universities and veteran US corporate lawyers to check our methodology, and a further lawyer to oversee the competition. If you can, as further reinforcement, ask the academics involved (if you are not the academic involved) to publish the findings in a top publication (as the Libratus team did in Science).
The How’s 2: Build drama
Law is rarely like it is in the movies. We realized that a live matchup of reviewing Non-Disclosure Agreements might not have the ready-made TV viewing power of watching Lee Sedol’s sweat, angst and tears against AlphaGo.
However, in place of TV cameras, we used every aspect of the data we had gathered. Putting the competition under the microscope helped show the daily toil of lawyers, even in such a real, everyday challenge as reviewing 5 NDAs (each lawyer was required to review 30 legal issues, 11 A4 pages, 153 paragraphs, and 3213 clauses). The entire methodology was used to create a 40-page study, which we published. To maintain human drama, alongside a lengthy scientific explanation provided by our CTO and scientific advisors, we used the data to create attractive, simple graphics. These set out the basic match rules and final results.
In this way even “boring” industries can pit their tech against humans. For instance, insurance AI player Lemonade re-told the story of their man vs machine breakthrough, which set the world record for the fastest ever insurance claim. They did it by showcasing the experience of a man named Brandon, whose coat (a Canada Goose Langford Parka) was stolen on a cold night, December 23rd at 5:43 pm, and whose AI-enabled Lemonade payment was processed in 3 seconds. This was faster and more accurate than any human claim handler.
Gimmick or end goal?
The success of a narrow test helps companies move to other areas and further grow and highlight the power of their AI technology. IBM began developing Watson in the mid-2000s and unleashed it on Jeopardy contestants in 2011. Its victory was a precursor for IBM to train Watson in areas such as medicine. The initial Jeopardy test of the technology met with resistance. Paul Horn, then director of IBM Research, told Tech Republic: “They initially said no, it’s a silly project to work on, it’s too gimmicky, it’s not a real computer science test, and we probably can’t do it anyway.” But ultimately the entire IBM team got behind it to showcase their technology. Horn says: “The questions are complicated and nuanced, and it takes a unique type of computer to have a chance of beating a human by answering those type of questions.”
The AI-Human challenge still captures our imagination as it continues to evolve. In May, Google showed off a jaw-dropping new capability of Google Assistant as it placed a call and made an appointment with a human worker at a hair salon. It is the nuance of a computer that can handle everyday human tasks that still astonishes us. Just as every lawyer has done a contract review, all of us have booked a hair appointment, which puts this sophisticated tech into a relatable setting.
If people look under the hood, they will see that behind the headline-grabbing results is a team of people. It takes years of hard work, with setbacks along the way, to produce a simple showcase such as this (it took about 20 researchers three years to reach a level where IBM could win, and four years to get our AI to match wits on even the most common legal contracts). However, rather than focusing on the graft behind the technology, these matchups provide a clear use case for it in a simple but powerful way.
It also helps to reignite interest in, and fresh thinking about, an old field, whether it is the ancient game of Go or the centuries-old practice of law. Following the AlphaGo publicity, searches for Go rules and Go boards spiked in the U.S. In China, tens of millions watched live streams of the matches, and the “Man vs. Machine Go Showdown” hashtag saw 200 million pageviews on Sina Weibo. Some 15 million viewers tuned in to the rerun of Jeopardy. Our report on how our AI defeated 20 lawyers gets downloaded hundreds of times each day.
The challenge successfully bridges a gap between technology and humans in each field. If we had one takeaway from our challenge it was that this powerful technology is not meant to be (nor indeed is it currently capable of being) used as a standalone tool. But professions need to know about the rapid advances. Tech is changing how lawyers, medics, dermatologists and others do their daily work, allowing humans to carry out the more strategic work.
It was put nicely by another of our 20 lawyer participants. Justin Brown, Partner at Brown Brothers Law, says: “As a chess player and attorney I will take from Grandmaster Vishy Anand and say the future of law is human and computer versus (another) human and computer. Either working alone is inferior to the combination of both.”
Jonathan Marciano is Director of Communications at LawGeex.