Voice-activated services reduce friction and are faster; you know that. What’s not mentioned is how much of that friction is self-inflicted, or the clarity that comes from voice user interfaces and their potential to save us from distraction.
To demonstrate this, let’s consider the as-is user experience for a few common use cases and compare them with the voice equivalents.
OK, I admit it’s unfair to start with this one — like pitting Russell Crowe’s Legionnaires against the Barbarian Horde, you know who’s going to get crushed. Then again, that was a great opening sequence, so let’s do it.
As-is experience installing a mobile app to get daily business news updates:
Notice how a third of those steps involve ignoring items vying for our attention. Be they other apps, alert notifications, or promoted/featured content, they’re all distractions from the immediate task.
Compare that 3–4 minute process with this:
Message in, action taken, confirmation out. Nothing else.
While we don’t install new apps every day, this is a proxy use case for most workflow processes performed with a computer, from scheduling an appointment with somebody to setting an alarm. It’s why you find half-drafted emails and half-scheduled appointments from the morning as you prepare to leave the office at night.
(Oh, I realise that Siri can fast-track some of this on a phone too – “Siri, install CNBC from the App Store” allows you to skip 5 of the above steps – but it still requires physical touch. Besides, far-field voice activation is just more dramatic.)
OK, enough of that one. Let’s consider the next use case…
Now you want to get the latest news. Here’s the current as-is experience:
45% of US adults get their news from Facebook, 11% from Twitter, 7% from Instagram, 5% from LinkedIn
Wait, what’s that crap? Yep, F.I.L.T. (Facebook, Instagram, LinkedIn, Twitter)
What happened to the app we just installed? You never use it. Why? Because it’s on the fifth panel of your phone’s home screen, and even if you do remember to check it, you get distracted by something else before you even see the CNBC icon.
And so we come to one of the most telling graphics of 2017:
The result … is sitting in the White House.
By contrast, flash news is sheer quality and one of the best use cases for the Amazon Echo.
Moving swiftly on, what about something useful, like ordering something?
Voice is not well-suited for comparing many items, or purchasing items where appearance is a differentiating factor. But for some transactions, it’s perfect:
You already get the importance of this. So should any company in e-commerce, or any that offers a product or service that can easily be repeat-ordered.
So let’s close with that grand-daddy of web use-cases … search.
Consider this scenario.
Son: “Hey dad, what’s the capital of Canada?”
Dad: “Montreal”
Son: “Thanks dad”
Dad (thinks to self): “Wait, maybe it’s Toronto. Where’s my phone?”
Now, how long should the resulting Google search take? 10 seconds? 30 max.
And how often does some permutation of the below occur instead?
How 30 seconds becomes 30 minutes on a smartphone.
Or perhaps better illustrated in a trendy scenario map:
Alternatively, voice saves us from even opening the Pandora’s box we call a smartphone:
And that is the great potential of Alexa and voice-first interfaces: to return us to a world of clarity, and to escape the clutter of promoted ads and irrelevance all vying for our attention.
Let that be your purpose in voice UX. You can change the world for the better.
Putting aside the idealism, let’s face it: even I, a voice proponent, know we’re going to screw it up. It won’t be long before people abuse the coming alert/notifications API in the name of retention and ‘listens’… until we get to the point where we’ll see the yellow ring on the Echo and won’t even want to speak to Alexa.
So in the meantime, enjoy voice interfaces for all that they could be.