Artificial intelligence (AI) is all the rage right now. Chances are your news feeds and social media timelines are filled with articles predicting how AI will change the way we interact with the world around us. Everything from the way we consume content, conduct business, interact with our peers, transport ourselves, and earn a living will be affected by AI-related innovations. The revolution has already begun.
While the technology is still imperfect, significant milestones have been reached in the past 18 months that show, beyond a shadow of a doubt, that AI can improve the lives of people with disabilities. This article will give you a sense of where we're headed with the technology, and what this means for the accessibility and inclusion of people with disabilities in the digital space.
Artificial intelligence often seems like it's happening in a black box, but its foundations can be explained with relative ease. Exposure to massive amounts of data is at the core of all the magic. In a nutshell, AI cannot happen without lots and lots of data, and then a lot of computational power to process the wealth of information it is exposed to. This is how artificial intelligence develops new understandings. This is how the magic (let's call it machine learning) happens.
Machine learning can be summarized as the practice of using algorithms to parse data, learn from that data, and then make determinations or predictions through complex neural networks. The connections AI systems make as they are exposed to data result in patterns the technology can recognize. These patterns lead to new possibilities, such as accomplishing tasks that were previously impossible for the machine: recognizing a familiar face in a crowd, identifying objects around us, interpreting information in real time, and so on.
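To make that loop of parsing, learning, and predicting less abstract, here is a toy sketch in Python: a single perceptron (the simplest possible neural unit) adjusting its weights until it recognizes a basic pattern. This is purely illustrative; real systems use deep networks with millions of parameters trained on massive datasets, not four hand-written examples.

```python
# Toy perceptron: learns to recognize a simple pattern (logical AND)
# from labeled examples. A minimal sketch only -- real AI systems use
# deep neural networks trained on massive amounts of data.

def train_perceptron(samples, epochs=20, lr=0.1):
    """Adjust weights until predictions match the labeled examples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            total = x1 * weights[0] + x2 * weights[1] + bias
            prediction = 1 if total > 0 else 0
            error = label - prediction  # learn from the mistake
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias

def predict(weights, bias, x1, x2):
    return 1 if x1 * weights[0] + x2 * weights[1] + bias > 0 else 0

# "Exposure to data": labeled examples of the pattern to learn.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(samples)
```

Each pass over the data nudges the weights a little closer to a set that reproduces the pattern, which is, at a tiny scale, what learning from exposure to data means.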
Neural networks are at the core of a machine’s ability to learn. Think of it as the human brain: information comes in through our senses, and it gets processed. Associations are made, based on preexisting knowledge. New knowledge emerges as a result. A similar process leads to new understandings for machines. The associations computers can make through AI are the key to developing the future of digital inclusion.
As neural networks build themselves, and as machines learn from the resulting assembled data points, it becomes possible to build blocks of AI that serve very specific, and somewhat "simple," purposes or tasks. Fueled by users' needs, and with a little bit of creativity, these building blocks can be assembled to create more complex services that improve our lives, perform tasks on our behalf, and, generally speaking, simplify some of the things humans need to do on a daily basis.
Let’s focus on five such building blocks, and see how they already contribute to making the experience of people online more accessible. Some of these blocks relate to overcoming a disability, while others address broader human challenges.
And to think we’ve only scratched the surface at this point. Damn.
Every day, people upload over 2 billion pictures to Facebook, Instagram, Messenger, and WhatsApp. Imagine what going through your own timeline without any images would feel like. That was the reality for millions of people with visual disabilities until Facebook decided to do something about it.
In early 2016, the social media giant released its groundbreaking automatic alternative text feature. It dynamically describes images to blind and visually impaired people. The feature makes it possible for Facebook’s platform to recognize the various components making up an image. Powered by machine learning and neural networks, it can describe each one with jaw-dropping accuracy.
Before, the alt text for images posted on your timeline only mentioned the name of whoever posted the picture. Today, images posted on your timeline are described based on each element AI can recognize in them. A picture of three friends enjoying a canoe ride on a sunny day might be described as "3 people, smiling, a body of water, blue sky, outdoors." Granted, this is not as rich and compelling as human-written alt text could be. But it's already an amazing improvement for anyone who can't see the images. And to think Facebook has only been doing this for about 18 months!
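The composition step is easy to imagine. Here is a rough sketch of what gluing recognized labels into a description might look like; the labels, confidence scores, threshold, and wording are all made up for illustration and are not Facebook's actual pipeline:

```python
# Hypothetical sketch: turning image-recognition labels into alt text.
# In a real system, the (label, confidence) pairs would come from a
# vision model; here they are hard-coded for illustration.

def compose_alt_text(detections, threshold=0.8):
    """Keep confident labels and join them into a short description."""
    confident = [label for label, score in detections if score >= threshold]
    if not confident:
        return "Image may contain: no description available"
    return "Image may contain: " + ", ".join(confident)

detections = [
    ("3 people", 0.95),
    ("smiling", 0.91),
    ("body of water", 0.88),
    ("blue sky", 0.86),
    ("outdoors", 0.90),
    ("bicycle", 0.42),  # low confidence: dropped from the description
]
```

The interesting part is the threshold: describing too little is frustrating, but describing a "bicycle" that isn't there is worse, so low-confidence guesses are better left out.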
Give it another 5 to 7 years, and image recognition AI will become so accurate that the mere thought of manually writing alt text for images will seem pointless. As pointless as using layout tables instead of CSS feels to some of us today.
While Apple has implemented facial recognition as the new way to unlock the latest generation of iPhones, Microsoft has been hard at work implementing Windows Hello.
Both technologies allow you to log in to your device using only facial recognition. The end goal? Eradicating the need for passwords, which we know most humans are pretty terrible at managing. And data from Apple shows that it works pretty well so far. While the odds of a random person unlocking your phone with Touch ID were about 1 in 50,000, Apple claims facial recognition brings that ratio down to 1 in a million. Talk about an improvement!
Yes, facial recognition raises significant security and privacy concerns. But it also addresses many of the challenges related to authenticating online. Through exposure to data — in this case, multiple photos of one’s face, from multiple angles — building blocks of AI learn to make assumptions about who’s in front of the camera. As a result, they end up being able to recognize and authenticate a person in various contexts.
The replacement of CAPTCHA images is one area in which people with disabilities might benefit the most from facial recognition. Once the system recognizes a person interacting with it as a human through the camera lens, the need to weed out bots should be a thing of the past. AI-powered facial recognition might be the CAPTCHA killer we’ve all been waiting for.
Did you know that AI is already beating the world’s top lip-reading experts by a ratio of 4 to 1? Again, through massive exposure to data, building blocks of AI have learned to recognize patterns and mouth shapes over time. These systems can now interpret what people are saying.
The Google DeepMind project ran research on over 100,000 natural sentences taken from BBC videos. These videos covered a wide range of languages, speech rates, accents, lighting conditions, and head positions. Researchers had some of the world's top experts try to interpret what people on screen were saying, then ran the same collection of videos through Google DeepMind's neural networks. The results were astonishing: while the best experts correctly interpreted about 12.4% of the content, the AI successfully interpreted 46.8%. Enough to put any expert to shame!
Automated lip-reading also raises significant privacy concerns. What if any camera can pick up close to 50% of what someone is saying in a public space? Still, the technology yields amazing potential to help people with hearing disabilities as they’re trying to consume online video content. Give Google DeepMind and other similar building blocks of AI a few years to get better at lip-reading. As the quality and relevancy of automated captions improve, we’ll start seeing dramatic improvements in the accuracy of these online services.
AI is useful for bringing down barriers for people who have visual or auditory disabilities. But people with cognitive impairments can benefit, too! Salesforce, among others, has been working on an abstractive summarization algorithm that uses machine learning to produce shorter text abstracts. While still in its infancy, it is both coherent and accurate. Human language is one of the most complex aspects of human intelligence for machines to break down. This building block holds great promise for people who have learning disabilities such as dyslexia, and people with attention deficit disorders, memory issues, or low literacy skill levels.
In just a few years, Salesforce has made impressive progress with automated summarization. They are now leveraging AI to move from an extractive model to an abstractive one. Extractive models draw from pre-existing words in the text to create a summary, which makes them quite rigid. With an abstractive model, computers have more options: they can introduce new related words and synonyms, as long as the system understands the context well enough to pick the right words to summarize the text. This is another area where massive exposure to data allows AI to make better educated guesses, and those guesses lead to relevance and accuracy.
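The difference between the two models is easier to see with a toy example of the extractive side: score each sentence by how frequent its words are across the full text, then keep the top one. This is an illustration only, not Salesforce's algorithm:

```python
# Toy extractive summarizer: keeps the sentence whose words appear most
# frequently across the whole text. Purely illustrative -- NOT the
# Salesforce model. An abstractive model would instead generate a
# brand-new sentence, possibly using words absent from the original.

from collections import Counter

def tokenize(text):
    return [word.strip(".,!?").lower() for word in text.split()]

def extractive_summary(text, n=1):
    """Return the n highest-scoring sentences from the text."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(tokenize(text))  # word frequencies over the full text

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens)

    ranked = sorted(sentences, key=score, reverse=True)
    return ". ".join(ranked[:n]) + "."

text = ("Machine learning systems learn patterns from data. "
        "The weather was nice. Data helps systems learn.")
summary = extractive_summary(text)
```

Notice the rigidity the article describes: this summarizer can only ever return one of the three original sentences verbatim, while an abstractive model could recombine the ideas into a sentence nobody wrote.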
In today's world, with so much exposure to information, keeping up with the data is a huge challenge. Processing relevant information while weeding out the rest has become one of the biggest challenges of the 21st century. We all have to read more and more to keep up to date with our jobs, the news, and social media. This is an even bigger challenge for people with cognitive disabilities, people who have low literacy skills, or people coming from a different culture. Let's not hold our breath on abstractive summarization just yet, but this may be our best hope of finding a way out of the cognitive overload mess we're in.
Diversity of languages and cultures might be one of mankind's richest aspects. It is also one that causes insurmountable problems when it comes to communicating with people from all over the world. For as long as anyone can remember, people have dreamed of building machines that would allow us to communicate without language barriers. That dream is finally within reach.
We've all grown familiar with services such as Google Translate, and many of us have made fun of how inaccurate the resulting translations often were, especially in less common languages that are not as well represented in the data. In November 2016, Google launched its Google Neural Machine Translation (GNMT) system, which lowered error rates by up to 85%. Gone are the days when the service would translate on a word-by-word basis. Thanks to GNMT, translations now operate at a global level: sentence by sentence, idea by idea. The more the AI is exposed to a language, the more it learns about it, and the more accurate translations become.
Earlier this year, Google released Pixel Buds. These earbuds work with the latest release of their phone and can translate what you hear, in real time, in up to 40 different languages. This is just the beginning. From the perspective of accessibility and bringing down barriers, this is incredible. We're so close to the Babel fish (that small, yellow, leech-like alien from The Hitchhiker's Guide to the Galaxy), we can almost touch it.
These building blocks are only a few of the innovations that have emerged thanks to artificial intelligence. They are the tip of the AI iceberg, and the next few years promise a lot more to come. Such innovations are already finding their way into assistive technologies, and they already help bridge the gaps experienced by people with disabilities. As creative people connect these building blocks, we see products, applications, and services that are changing people's lives for the better. These are exciting times.
Self-driving cars, environment recognition applications, brain-implanted computer interfaces: such ideas were dismissed as science fiction only a few years ago. There's a perfect AI storm coming, and it will better the lives of everyone, but especially the lives of people with disabilities.
As someone who lives and breathes digital inclusion, I can't wait to see what the future holds. I plan on keeping track of these things, and if you do too, we should follow one another on Twitter so we can watch it unfold together. Give me a shout at @dboudreau.
This post was originally published on 24a11y.com and the Deque blog.