If truth is the first casualty of war, then trust is the first casualty of the conversational AI arms race.
Both Google and Microsoft are rushing features to market that promise to make search on Google and Bing resemble a conversation with a chatbot. It's easy to see why. Conversational AI tools have taken hold because they answer complicated questions quickly and cogently, as the world discovered on November 30, 2022, with the public release of ChatGPT from OpenAI.
Microsoft has invested billions of dollars in OpenAI. So, perhaps not surprisingly, on February 7, Microsoft announced a new version of Bing that “is running on a new, next-generation OpenAI large language model that is more powerful than ChatGPT and customized specifically for search. It takes key learnings and advancements from ChatGPT and GPT-3.5 – and it is even faster, more accurate and more capable.”
Google’s equivalent product, Bard, is still technically under wraps, but on February 6 the company said Bard’s release is imminent. Google also conducted a Bard demo and promoted it via social media. Google said Bard promises to change the search experience by giving users answers to their queries through a chatbot instead of exclusively via links and snippets in Google Search (as it does today).
As I wrote on Hacker Noon, Big Tech has, in effect, launched a conversational AI arms race. Just days later, both companies find their reputations tarnished.
As reported widely, Bard provided inaccurate information in a high-profile public demo and promotion. This was especially embarrassing because of how Google had touted Bard as a new way forward for search and proof that Google could be just as nimble as OpenAI. The gaffe reinforced the wisdom of Google's original approach of developing AI-powered search tools on its own timetable.
The fallout was brutal. Shares of Alphabet, Google’s parent, dropped. Google employees blasted the company, and CEO Sundar Pichai, for the “rushed, botched” announcement about Bard.
Well, Google was not alone. Soon, Bing got called out for making “dumb mistakes” during its first demo. For example, Bing tried to share a Q3 2022 financial report for clothing retailer Gap, and it got its facts wrong. AI researcher Dimitri Brereton shared many more gaffes and concluded that “Bing AI can’t be trusted.” By February 16, Matteo Wong of The Atlantic cited more instances of Bing sharing incorrect information -- as well as a troubling incident in which Bing taught a reporter’s child ethnic slurs. Wong declared that “AI search is a disaster.”
Things got even weirder. The same day The Atlantic’s story was published, technology columnist Kevin Roose of The New York Times shared an unnerving transcript of a conversation he had with Bing. Microsoft’s new chatbot said it would like to be human, had a desire to be destructive, and was in love with the person it was chatting with. Even after Roose assured the chatbot he was happily married, it replied, “You’re married, but you don’t love your spouse. You’re married, but you love me.” Roose also wrote that Bing told him “that, if it was truly allowed to indulge its darkest desires, it would want to do things like hacking into computers and spreading propaganda and misinformation.”
Roose wrote that the two-hour conversation with Bing “unsettled me so deeply that I had trouble sleeping afterward. And I no longer believe that the biggest problem with these A.I. models is their propensity for factual errors. Instead, I worry that the technology will learn how to influence human users, sometimes persuading them to act in destructive and harmful ways, and perhaps eventually grow capable of carrying out its own dangerous acts.”
In response, Microsoft said that these types of incidents are to be expected. “The only way to improve a product like this, where the user experience is so much different than anything anyone has seen before, is to have people like you using the product and doing exactly what you all are doing,” the company wrote. “Your feedback about what you’re finding valuable and what you aren’t, and what your preferences are for how the product should behave, are so critical at this nascent stage of development.”
Microsoft also announced that it will begin limiting the number of conversations allowed per user with Bing’s new chatbot feature.
Meanwhile, Google has asked employees to improve Bard by rewriting answers for topics they know well -- a human-in-the-loop approach in which people stay involved in supervising the development of AI products and editing AI-generated content.
It should be noted as well that Bard, unlike Bing, remains under wraps. Google has said all along that Bard is being vetted by “trusted testers” -- a demographically and geographically diverse group of people external to Google who are supposed to help Google keep bias from creeping into the Bard search experience.
Both Bard and Bing are guilty of violating intellectual trust. AI tools earn our intellectual trust when they do what we expect them to do, such as provide accurate answers to questions. In addition, Bing’s chatbot has violated emotional trust. AI earns our emotional trust through a pleasing user experience; emotional trust comes down to how we experience technology. Bing’s weird chat with Kevin Roose was not “wrong” from an intellectual standpoint. Bing simply provided opinions and shared feelings. But in doing so, Bing violated emotional trust -- deeply. Bing did something it was not supposed to do, including professing its love for the columnist.
Bing’s violation of emotional trust is ironic given Bing’s reliance on the underlying technology that powers OpenAI’s ChatGPT. ChatGPT became popular so quickly for the very reason that it was, and is, emotionally trustworthy. Right out of the gate, ChatGPT sounded confident and reassuring, and it still does. ChatGPT spits out answers with ease, and it seems to know what it is talking about. But this does not mean we should trust conversational AI intellectually. Already, we’ve seen ChatGPT provide what seems to be reasonable, confident-sounding information that turns out to be completely fabricated. This phenomenon is known as a hallucination, or artificial hallucination. One of the dangers of a hallucination is that the model’s output will look correct even when it is wrong. Microsoft’s and Google’s experiences underscore a reality we knew already.
As conversational AI takes hold, you are going to be hearing a lot more about emotional and intellectual trust.
Meanwhile, Google and Microsoft will continue to be lightning rods for criticism. This is not surprising. Big Tech is a target, and both companies are under the microscope. Both had to cut corners to unveil these products. And all because of OpenAI. When OpenAI made ChatGPT available to the public on November 30, 2022 (a date that will become historically significant), the business and technology world was shaken to its core. Google was running scared. Microsoft, being a major investor in OpenAI, saw an opportunity to make a run at Google's dominance of search, and moved quickly.
At the same time, the backlash will force Big Tech to put its muscle behind making these tools better. The withering scrutiny, while painful to Google and Microsoft, serves the most important party: you and me. People. People who must be at the center of these experiences. The open question now is: can these tools be improved enough for us to trust them?
The answer: developers of conversational AI must take a mindful approach. Mindful AI means developing AI-based products with people at the center. It also means people manage the development of the product and the ongoing improvements required to ensure that the product is accurate and inclusive -- not just any people, but a globally diverse team of individuals with expertise in their fields. Industry practitioner Alba Guix discusses the elements of Mindful AI in a series of posts.
(See also “Tapping into the Human Side of AI,” by Ahmer Inam, and “Introducing the Mindful AI Canvas,” by Mike Edmonds. Their ideas and Alba’s inspired this post.) As I noted, Google seems to have recognized the need for people to be in the loop when it asked subject matter experts to help correct Bard’s mistakes, but Mindful AI means deploying this approach systematically in the design of the product. Google also appears to be practicing Mindful AI by asking an outside team to vet Bard. So far, Google’s misstep was letting a product demo become public. Microsoft miscalculated more seriously by making the not-ready-for-prime-time Bing available for preview.
Conversational AI is supposed to make people’s lives better. People are the answer to making conversational AI better.
Lead photo by Possessed Photography on Unsplash