
AI for Noobs: How Amazon Alexa Works

by Edem Gold, January 20th, 2022



The key to artificial intelligence has always been representation.—Jeff Hawkins


Alexa is a voice-based, AI-powered digital assistant that can answer simple questions and carry out various tasks and commands. In this article, we are going to look at how Alexa works.

How Does Alexa Work?

Alexa works by processing requests or commands like "Alexa, put on the lights" or "Alexa, play me a song" through a branch of AI called Natural Language Processing, NLP for short.


In simple words, Natural Language Processing (NLP) is the manipulation of natural language in forms like text, audio, and speech using algorithms.

How Does Alexa Use Natural Language Processing? (Overview)

I am going to break down Alexa's Natural Language Processing into 4 steps:


  1. Alexa first records your speech. Because interpreting sounds takes a lot of computational power, the recording is sent to Amazon's servers to be interpreted.


  2. Next, Alexa breaks the interpreted audio input down into individual sounds. It then consults an audio database containing various word pronunciations to find the closest matches to those sounds.


  3. After this, Alexa identifies the keywords in the now-recognized input and carries out the function needed to satisfy the command or request. For example, you might say "Alexa, what is the weather like today?" After recording, interpreting, breaking down, and matching your audio input, Alexa knows you are asking about the weather. The request is then handled on Amazon's servers, where an API request is made to a weather API service, which supplies the answer.


  4. After the request is made and the information is received, Amazon's servers return it to the device so Alexa can relay it to you in speech form. To do this, Alexa runs the process explained above in reverse: she goes through her database, finds the best words to describe the answer to your question, and outputs that response as speech.
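
To make the four steps a little more concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the tiny pronunciation table, the keyword table, the `fetch_weather` function, and the canned weather answer all stand in for Amazon's real systems, which are far larger and not public.

```python
# Toy walk-through of the four steps. All names and data are hypothetical.

# Step 2: a tiny "audio database" mapping groups of sounds to words.
PRONUNCIATIONS = {
    ("w", "ah", "t"): "what",
    ("ih", "z"): "is",
    ("dh", "ah"): "the",
    ("w", "eh", "dh", "er"): "weather",
    ("t", "ah", "d", "ey"): "today",
}


def fetch_weather():
    """Placeholder for a request to a weather API service."""
    return "sunny, 24 degrees"


# Step 3: keywords mapped to the function that satisfies the request.
KEYWORD_HANDLERS = {"weather": fetch_weather}


def recognize(sound_groups):
    """Step 2: match each group of sounds to a known word."""
    words = []
    for group in sound_groups:
        word = PRONUNCIATIONS.get(tuple(group))
        if word is not None:
            words.append(word)
    return words


def handle(words):
    """Step 3: find a keyword and run the matching function."""
    for word in words:
        if word in KEYWORD_HANDLERS:
            return KEYWORD_HANDLERS[word]()
    return None


def respond(result):
    """Step 4: turn the result into a sentence to be spoken."""
    if result is None:
        return "Sorry, I don't know that one."
    return f"Today's weather is {result}."


# Step 1 is assumed done: the recording is already split into sounds.
recording = [
    ["w", "ah", "t"], ["ih", "z"], ["dh", "ah"],
    ["w", "eh", "dh", "er"], ["t", "ah", "d", "ey"],
]
words = recognize(recording)
reply = respond(handle(words))
print(words)
print(reply)
```

The point is only the shape of the flow: sounds become words, a keyword picks a function, and the function's result is wrapped in a sentence.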

Deeper Look Into Alexa's Natural Language Processing

It all begins with signal processing. Put simply, signal processing is the technique that helps speech-based systems get rid of noise, such as background sounds, that would otherwise prevent accurate processing of the data. Its aim is to let Alexa identify and mute ambient noise like TVs, radios, and human chatter so it can focus on the target signal, i.e. the voice command or request.
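
As a rough illustration of muting ambient noise, here is a very simple noise gate in Python. This is not how Alexa does it; real devices use far more sophisticated processing. The sketch assumes the first couple of frames of the recording contain only background noise, and the function names, frame size, and margin are all made up for the example.

```python
def frame_energy(frame):
    """Average squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def noise_gate(samples, frame_size=4, noise_frames=2, margin=4.0):
    """Silence every frame whose energy is not clearly above the noise floor.

    Assumes the first `noise_frames` frames contain only background noise.
    """
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples), frame_size)]
    noise_floor = sum(frame_energy(f) for f in frames[:noise_frames]) / noise_frames
    threshold = noise_floor * margin
    cleaned = []
    for frame in frames:
        if frame_energy(frame) > threshold:
            cleaned.extend(frame)                 # keep the target signal
        else:
            cleaned.extend([0.0] * len(frame))    # mute ambient noise
    return cleaned


quiet = [0.01, -0.02, 0.01, -0.01] * 2   # background hum
loud = [0.5, -0.6, 0.4, -0.5]            # the spoken command
signal = quiet + loud + quiet
cleaned = noise_gate(signal)
print(cleaned)
```

Only the loud frame survives; the quiet frames before and after it are replaced with silence.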


After successfully processing incoming signals, Alexa performs wake word detection. During this process, Alexa searches the target signal for an activation word, usually "Alexa" or "Hey Alexa".
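
Here is a small sketch of the idea behind wake word detection. A real detector listens to audio, not text, so treat this as an analogy: it assumes the speech has already been turned into a list of words, and the `WAKE_WORDS` table and `find_command` function are hypothetical names for this example.

```python
# Longest wake phrase first, so "hey alexa" is matched before "alexa".
WAKE_WORDS = (("hey", "alexa"), ("alexa",))


def find_command(tokens):
    """Return the words that follow the first wake word, or None if absent."""
    lowered = [t.lower().strip(",.!?") for t in tokens]
    for start in range(len(lowered)):
        for wake in WAKE_WORDS:
            end = start + len(wake)
            if tuple(lowered[start:end]) == wake:
                return lowered[end:]
    return None


print(find_command("so anyway hey Alexa, play me a song".split()))
print(find_command("turn on the lights".split()))
```

Anything said without the wake word is ignored (`None`), which mirrors why Alexa does not react to ordinary conversation.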

Once a wake word is detected, Alexa sends the original request to the cloud-based speech recognition software, which takes the audio signal and converts it into text using a decoder.


The decoder breaks the request down into individual pronunciations and matches them to similar pronunciations in a sound database. The matched words are then parsed and acted on. This whole process is performed by the Alexa Voice Service, which runs on Amazon Web Services (AWS).
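
The "find the most similar pronunciation" step can be sketched with edit distance, which counts how many sounds you would have to add, drop, or swap to turn one pronunciation into another. This is a stand-in technique chosen for simplicity, not Alexa's actual decoder, and the `SOUND_DATABASE` entries are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sounds."""
    previous = list(range(len(b) + 1))
    for i, sound_a in enumerate(a, start=1):
        current = [i]
        for j, sound_b in enumerate(b, start=1):
            cost = 0 if sound_a == sound_b else 1
            current.append(min(
                previous[j] + 1,         # drop a sound
                current[j - 1] + 1,      # add a sound
                previous[j - 1] + cost,  # swap a sound (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical sound database: word -> its pronunciation.
SOUND_DATABASE = {
    "lights": ["l", "ay", "t", "s"],
    "likes": ["l", "ay", "k", "s"],
    "weather": ["w", "eh", "dh", "er"],
    "song": ["s", "ao", "ng"],
}


def closest_word(heard):
    """Pick the database word whose pronunciation is nearest to what was heard."""
    return min(SOUND_DATABASE,
               key=lambda word: edit_distance(heard, SOUND_DATABASE[word]))


# A slightly mispronounced "weather" still lands on the right word.
print(closest_word(["w", "eh", "d", "er"]))
```

Even when one sound is off or missing, the nearest entry in the database is usually still the word the speaker meant.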


Alexa Voice Service is the brain behind all Alexa-powered devices. It performs the complex operations, such as Automatic Speech Recognition (ASR) and making API requests. In other words, it is Alexa's processor.

How Does Amazon's Alexa AI Improve?

AI improves its performance by learning from data gathered from its past experiences. Amazon uses requests and commands made to Alexa to train its neural networks.


Alexa is trained with request and command data from many users with different backgrounds in order to capture variation in speech patterns. This improves its accuracy in recognizing speech from users with different accents, dialects, and vocabularies.


Alexa's neural networks are trained using supervised machine learning algorithms. For this to work, humans review a sample of requests to help Alexa learn the correct interpretation of each one, so it performs better on similar requests in the future.
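
To show what "learning from human-reviewed requests" looks like in miniature, here is a toy supervised learner. It swaps the neural network for simple word counting, so it is only an analogy, and the labeled requests and intent names are made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical human-reviewed examples: (request text, correct intent).
LABELED_REQUESTS = [
    ("what is the weather like today", "weather"),
    ("will it rain tomorrow", "weather"),
    ("play me a song", "music"),
    ("play some jazz music", "music"),
    ("put on the lights", "lights"),
    ("turn off the kitchen lights", "lights"),
]


def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    for text, intent in examples:
        word_counts[intent].update(text.split())
    return word_counts


def predict(model, text):
    """Pick the intent whose training words overlap most with the request."""
    words = text.split()
    scores = {intent: sum(counts[w] for w in words)
              for intent, counts in model.items()}
    return max(scores, key=scores.get)


model = train(LABELED_REQUESTS)
before = predict(model, "it is dark in here")   # misread as a weather request

# A human reviewer supplies the correct interpretation, and we retrain.
reviewed = LABELED_REQUESTS + [("it is dark in here", "lights")]
model = train(reviewed)
after = predict(model, "it is dark in here")

print(before, "->", after)
```

The first guess is wrong because the model has only seen "it" and "is" in weather requests. Once a reviewer labels the request correctly and the model is retrained, the same request is interpreted as a lights command.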

EndNote

The holy grail of the AI community is Artificial General Intelligence (AGI): AI that is able to completely replicate and surpass human intelligence. Unfortunately, AI just isn't there yet.

Alexa and systems like her are the closest we have come to achieving this singularity.