Introductory Guide to Voice Technology Implementation

If you’ve been wondering whether voice technology is worth paying attention to or is just a passing trend, you’ll find this article useful. Even the most ardent doubters have to admit that voice-based solutions are gaining popularity faster than many other innovations.

To dispel any doubts, let’s go over the statistics. According to projections, the number of voice assistants in use could hit 8.4 billion by 2024, surpassing the global population. In addition, Statista estimates that the worldwide voice recognition market will grow from 10.7 billion US$ in 2020 to 27.16 billion US$ in 2026. The market is clearly expanding fast, and that could push the majority of apps to keep up and add voice features of their own.

Of course, cold figures alone are not enough to make a decision, so we are going to highlight a few solid reasons to consider voice technology, reasons that affect both your users and your profit. So, let’s get to the point.

Major reasons to consider Voice Technology

Competitive advantage

We mentioned that the number of voice assistants is set to reach enormous heights, but that doesn’t mean adding your app to that number is a bad idea. If you narrow down the industry your new app will target and research the competitors, you will likely find that only a few of them, or even none, have integrated a voice assistant. Implementing new technologies, especially those that rely on AI and Machine Learning, can create a competitive advantage over the other apps. And even if you think there is no way to integrate voice recognition into your product, think harder.

Voice ordering and other voice features can simplify almost any task for the user, whether in a Transportation Management System (where you can let users administer parcel delivery by voice) or in an Online Tutoring Platform (where users could schedule classes and manage audio material with their voices).

24/7 Availability for Users

Users’ orders and queries can be answered at any moment by sophisticated voice assistants. Furthermore, because AI can mimic human intelligence so closely, many repetitive jobs can be automated successfully. As a result, voice technology may be the most cost-effective approach for a company to increase customer satisfaction while also growing its customer base.

As another benefit, if your app is available 24/7, users can interact with it for longer, which means higher user engagement and, in turn, more profit.

Production efficiency

For users, the efficiency gain starts as soon as they turn on voice ordering or voice recognition. With that function, they can do several tasks at a time. This improves their satisfaction with the app and attracts more customers.

On the other side, businesses are constantly seeking ways to improve their efficiency. As a result, many executives have begun to use voice technology for company management as well, by introducing it into internal operations. With AI-trained voice assistants, the workflow moves faster, because multiple tasks can be carried out by the machine on a simple spoken command from an employee.

Customization

Voice technology offers data that can help you understand your target user. For example, you can look at how many times a specific type of voice search query led a person to your webpage. This provides you with information on the user’s browsing and purchasing habits. With such information, you can create better marketing strategies and upgrade the app to meet users’ demands, which is important for the app’s continued presence on the market.

Now that you know what voice technology could mean for your app, it’s time to learn how to apply this innovation to your idea and how to deploy it. For your convenience, we have put together a simple guide below. Just keep reading!

Part #1: Installation

First of all, you need to create an Amazon developer account. After that, you can go to the Alexa Developer Console and start creating your first Alexa skill. To do so, you need to decide on the skill name. The skill name can be anything; however, it is worth making it distinctive and unique so that it does not duplicate a skill already on the market. For this example, we chose ‘Incora assistant’.

The next step is to choose a model, either one of the pre-built ones or a custom one. It’s your choice, but we recommend picking the Custom option and building the model yourself. You also need to decide how to host your skill. You can use the Alexa-hosted options for personal experiments. For production, however, we recommend hosting the Lambda function on AWS; in that case, select the ‘Provision your own’ option.

To get started, the Alexa console offers a choice of templates for the backend code and interaction model. Once again, if you have decided to build a unique solution, click on the ‘Start from Scratch’ option.

Finally, when the initial preparation is over, you can continue building your voice skill. Now you need to choose the Skill Invocation Name. Make sure to pick a word or phrase that is easy for the user to remember and pronounce. It can be similar to the skill name.

Don’t forget to save and build the model after each change you make.

Now you can create your own intents. An intent is an action that takes place in response to a user’s spoken request. In the sidebar, go to ‘Interaction Model’ and then click on ‘Intents’. There you can create custom intents. Afterward, add sample utterances: a collection of plausible spoken phrases that are mapped to an intent.
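
To make the relationship between the invocation name, intents, and sample utterances more concrete, here is a minimal sketch of an interaction model written as a JavaScript object, in the shape the console’s JSON Editor works with. The invocation name and the sample utterances are illustrative assumptions; the intent name HelloWorldIntent is the one the custom handler later in this guide checks for. The helper intentsMissingSamples is hypothetical and only there to show a quick sanity check.

```javascript
// Sketch of an interaction model; invocation name and utterances are assumptions.
const interactionModel = {
  interactionModel: {
    languageModel: {
      invocationName: 'incora assistant',
      intents: [
        // Built-in intents work without sample utterances of their own.
        { name: 'AMAZON.HelpIntent', samples: [] },
        { name: 'AMAZON.CancelIntent', samples: [] },
        { name: 'AMAZON.StopIntent', samples: [] },
        {
          name: 'HelloWorldIntent',
          slots: [],
          samples: [
            'what is your technology stack',
            'which technologies do you use',
            'tell me about your tech stack'
          ]
        }
      ],
      types: []
    }
  }
}

// Hypothetical helper: names of custom intents that have no sample utterances.
function intentsMissingSamples (model) {
  return model.interactionModel.languageModel.intents
    .filter(intent => !intent.name.startsWith('AMAZON.'))
    .filter(intent => !intent.samples || intent.samples.length === 0)
    .map(intent => intent.name)
}

console.log(JSON.stringify(interactionModel, null, 2))
console.log('Custom intents without samples:', intentsMissingSamples(interactionModel))
```

Every custom intent should end up with at least a few utterances, since those are what Alexa uses to map speech to the intent.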

Then create a Lambda function. For this, we use Node.js, the Serverless Framework, and the Alexa ask-sdk.

Let’s start writing some code. We need to create handlers for the standard Alexa requests and intents, each in its own file. You can find them below, each preceded by its file path.

src/handlers/LaunchRequestHandler.js

const LaunchRequestHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'LaunchRequest'
 },
 handle (handlerInput) {
   return handlerInput.responseBuilder.
     speak('Welcome to Incora assistant. You can ask about technology stack, projects, and a lot more').
     reprompt('What\'s your request? ').
     getResponse()
 }
}

module.exports = LaunchRequestHandler

src/handlers/HelpIntentHandler.js

const HelpIntentHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'IntentRequest'
     && handlerInput.requestEnvelope.request.intent.name === 'AMAZON.HelpIntent'
 },
 handle (handlerInput) {
   return handlerInput.responseBuilder.
     speak('You can say: \'alexa, hello\'').
     reprompt('What\'s your request? ').
     getResponse()
 }
}

module.exports = HelpIntentHandler

src/handlers/FallbackHandler.js

const FallbackHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'IntentRequest'
 },
 handle (handlerInput) {
   return handlerInput.responseBuilder.
     speak('Can you repeat it, please? ').
     reprompt('What\'s your request? ').
     getResponse()
 }
}

module.exports = FallbackHandler

src/handlers/ErrorHandler.js

const ErrorHandler = {
 canHandle () {
   return true
 },
 handle (handlerInput, error) {
   console.log('ERROR HANDLED', error)
  
   return handlerInput.responseBuilder.
     speak('Sorry, I had trouble doing what you asked. Please try again. ').
     reprompt( 'What\'s your request? ').
     getResponse()
 }
}

module.exports = ErrorHandler

src/handlers/CancelAndStopIntentHandler.js

const CancelAndStopIntentHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'IntentRequest'
     && (handlerInput.requestEnvelope.request.intent.name === 'AMAZON.CancelIntent'
       || handlerInput.requestEnvelope.request.intent.name === 'AMAZON.StopIntent')
 },
 handle (handlerInput) {
   return handlerInput.responseBuilder.
     speak('Goodbye ').
     getResponse()
 }
}

module.exports = CancelAndStopIntentHandler

src/handlers/SessionEndedRequestHandler.js

const SessionEndedRequestHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'SessionEndedRequest'
 },
 handle (handlerInput) {
   // Any cleanup logic goes here.
   return handlerInput.responseBuilder.getResponse()
 }
}

module.exports = SessionEndedRequestHandler

Next, add a custom handler for your own intent, such as ‘TechnologyStackIntentHandler’.

src/handlers/TechnologyStackIntentHandler.js

const TechnologyStackIntentHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'IntentRequest'
     && handlerInput.requestEnvelope.request.intent.name === 'HelloWorldIntent'
 },
 handle (handlerInput) {
   // add your own logic
   // get data from some API, database, etc.
   return handlerInput.responseBuilder.
     speak('Our technology stack comprises JavaScript (Node, Angular, React, Ember, Vue), Python (Django), Mobile apps (React Native, Ionic).').
     reprompt( 'What\'s your request? ').
     getResponse()
 }
}

module.exports = TechnologyStackIntentHandler
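
The order in which these handlers are registered matters. As far as we know, the SDK asks each handler’s canHandle in registration order and uses the first one that returns true, so the catch-all FallbackHandler has to come after every specific intent handler. The plain-JavaScript sketch below imitates that behaviour with simplified stand-in handlers; pickHandler is a hypothetical helper and not part of ask-sdk.

```javascript
// Stand-in handlers: only canHandle matters for this illustration.
const isIntent = (input, name) =>
  input.requestEnvelope.request.type === 'IntentRequest' &&
  input.requestEnvelope.request.intent.name === name

const handlers = {
  help: { name: 'HelpIntentHandler', canHandle: input => isIntent(input, 'AMAZON.HelpIntent') },
  stack: { name: 'TechnologyStackIntentHandler', canHandle: input => isIntent(input, 'HelloWorldIntent') },
  // Same condition as the FallbackHandler above: matches every IntentRequest.
  fallback: { name: 'FallbackHandler', canHandle: input => input.requestEnvelope.request.type === 'IntentRequest' }
}

// Hypothetical helper: returns the first handler whose canHandle is true.
function pickHandler (orderedHandlers, input) {
  return orderedHandlers.find(handler => handler.canHandle(input))
}

// Hand-made input carrying only the fields canHandle looks at.
const input = {
  requestEnvelope: { request: { type: 'IntentRequest', intent: { name: 'HelloWorldIntent' } } }
}

const goodOrder = [handlers.help, handlers.stack, handlers.fallback]
const badOrder = [handlers.fallback, handlers.help, handlers.stack]

console.log(pickHandler(goodOrder, input).name) // TechnologyStackIntentHandler
console.log(pickHandler(badOrder, input).name) // FallbackHandler
```

With ask-sdk-core, the equivalent wiring in src/index.js would typically pass the handlers to Alexa.SkillBuilders.custom().addRequestHandlers(...) in that same order, register ErrorHandler through addErrorHandlers(...), and export the result of .lambda() as handler.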

For the next step, you need an AWS account. Set up AWS credentials on your local machine, configure the serverless.yml file, and run yarn deploy or npm run deploy.

serverless.yml

service: alexa-lambda-example

plugins:
 - serverless-pseudo-parameters
 - serverless-iam-roles-per-function

provider:
 name: aws
 runtime: nodejs14.x
 region: us-east-1
 stage: prod

functions:
 info:
   handler: src/index.handler
   events:
     - alexaSkill: amzn1.ask.skill.XXXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX

After that, open the deployed function in the AWS Lambda console and copy its ARN.

Go to the Alexa developer console and set your AWS Lambda ARN as the skill’s endpoint.

Finally, your skill is built, and you can test it using the Alexa developer console or any Alexa device. Don’t forget to switch skill testing to ‘Development’ on the ‘Test’ tab in the navigation bar.
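
Before deploying, a handler can also be exercised locally with a hand-made request and a stub response builder. The sketch below is a simplified stand-in under stated assumptions: createStubResponseBuilder is a hypothetical helper that only records what speak and reprompt were given, and it does not reproduce the real response format the SDK produces.

```javascript
// Hypothetical stub: records calls instead of building a real Alexa response.
function createStubResponseBuilder () {
  const response = {}
  const builder = {
    speak (text) { response.speech = text; return builder },
    reprompt (text) { response.reprompt = text; return builder },
    getResponse () { return response }
  }
  return builder
}

// Same logic as LaunchRequestHandler from this guide.
const LaunchRequestHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'LaunchRequest'
  },
  handle (handlerInput) {
    return handlerInput.responseBuilder
      .speak('Welcome to Incora assistant. You can ask about technology stack, projects, and a lot more')
      .reprompt('What\'s your request? ')
      .getResponse()
  }
}

// Minimal hand-made input; a real request envelope carries many more fields.
const handlerInput = {
  requestEnvelope: { request: { type: 'LaunchRequest' } },
  responseBuilder: createStubResponseBuilder()
}

const canHandle = LaunchRequestHandler.canHandle(handlerInput)
const result = LaunchRequestHandler.handle(handlerInput)
console.log(canHandle, result.speech)
```

A check like this catches typos in intent names and response texts early, but it is no substitute for testing the deployed skill in the console.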

To test and verify our code, we deployed it and tried it out in the developer console.

After the testing stage, at long last, you have your own voice assistant. The next step would be to integrate it into a hardware system.

Part #2: AWS Cognito and Alexa Skill Account Linking

In the first part, we covered the initial setup of an Alexa skill and the development of a custom voice assistant. Here we focus on the next stage: account linking. It is required for user recognition, so that your voice assistant can identify the person who is using the system. Account linking helps build up a user’s background for more personalized interactions and customized solutions. Simply put, by means of account linking, your voice assistant can draw on the data you already hold about a user and apply that information automatically, instead of asking bothersome identification questions. Let’s look at the concept in depth.

What is Account Linking regarding Alexa Skill?

Certain personalized skills need to connect the identity of the Alexa user with a user in another system. The purpose is to establish a relationship between the Alexa user and your system’s user account. Account linking allows you to securely authenticate users with services using your skill. Once the skill and the external user account have been connected, the skill can perform tasks on behalf of the user with that account.

Account linking is built on OAuth 2.0. OAuth 2.0 is an open protocol that enables web, mobile, and desktop applications to request user authorization from remote services in a standard way. You may create your own OAuth server and identity management solution from scratch. To achieve the same result with less effort, however, you can use AWS Cognito, which helps you set up custom identity management. AWS Cognito uses User Pools, a scalable user directory that can handle millions of users. User Pools is a fully managed service that is simple to set up without worrying about server infrastructure, and it uses standards such as OAuth 2.0 to integrate with your backend. Below, we describe how to set up account linking with the help of AWS Cognito. But first, let’s look at the advantages.

Advantages of Account Linking

Account linking is the preferable approach for those who want to develop a user-friendly solution that is convenient for the target audience. But what exactly are the benefits that could convince you to set up account linking for your Alexa skill?

Users can skip creating a new profile for each platform separately, which decreases configuration time and increases user engagement.
Account linking allows registered users to keep using their existing account while adding a new social or passwordless login.
It is convenient for users who signed up without a password to join their profile to one with more details.
Your application can integrate user recognition, which gathers data from other sources and fills in the gaps in your system’s profile.
It allows your software applications to access profile details that persist across several sessions.

With the advantages listed above in mind, you might now ask how to set up account linking for your voice application and adjust it to your needs. For the practical part of the process, follow the instructions below.

Getting started

First of all, go to the Alexa Developer Console, select your skill, click Tools, and then Account Linking in the left sidebar.

The first phase is to configure what users can access and do in the account linking settings. We recommend allowing users to create an account or link to an existing account with you, and allowing the skill to be enabled without account linking. Under the Security Provider Information block, select Auth Code Grant as the authorization grant type. With this grant type, an authorization code is issued first and is then exchanged for access tokens on the user’s behalf.
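
For orientation, this is roughly what the authorization code grant looks like on the wire. Alexa performs the exchange itself once account linking is configured, so the sketch is only meant to show what the client ID, client secret, and authorization code are used for. buildTokenRequest is a hypothetical helper, every value passed to it is a placeholder, and the request is only assembled here, not sent.

```javascript
// Assembles (does not send) an OAuth 2.0 authorization-code token request.
function buildTokenRequest ({ tokenUri, clientId, clientSecret, code, redirectUri }) {
  // Client credentials travel in an HTTP Basic Authorization header.
  const credentials = Buffer.from(`${clientId}:${clientSecret}`).toString('base64')
  // The authorization code and redirect URI go into a form-encoded body.
  const body = new URLSearchParams({
    grant_type: 'authorization_code',
    code,
    redirect_uri: redirectUri
  })
  return {
    url: tokenUri,
    method: 'POST',
    headers: {
      Authorization: `Basic ${credentials}`,
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body: body.toString()
  }
}

// Placeholder values for illustration only.
const tokenRequest = buildTokenRequest({
  tokenUri: 'https://auth.example.com/oauth2/token',
  clientId: 'example-client-id',
  clientSecret: 'example-client-secret',
  code: 'example-authorization-code',
  redirectUri: 'https://example.com/callback'
})

console.log(tokenRequest.method, tokenRequest.url)
console.log(tokenRequest.body)
```

The response to such a request contains the access token that later arrives in the skill’s request envelope.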

User Pool’s Setup

For the next step, you need an existing AWS Cognito User Pool, or you can create a new one and set it up. To do so, move on to the AWS Cognito page. There we will set up a User Pool in order to get the client ID and client secret, which will be needed for further configuration in the Alexa Developer Console.

Step 1: Sign in options

At the first step of the User Pool configuration, you choose how users will be able to sign in. Of the options Username, Email, and Phone Number, we choose Email and go on to Step 2.

Step 2: Security

Here you decide how to make the account linking experience more secure. For that purpose, you choose the password requirements and the authentication method. For illustrative purposes, there is no need to write a custom password policy, so we select the default parameters. The same goes for multi-factor authentication: we do not need MFA here, so we chose the No MFA option. When developing your own solution, however, you can decide which measures to apply and define customized requirements.

At this stage, you also need to configure what users can do when they forget their passwords. We definitely recommend enabling self-service account recovery, since forgotten passwords are a common issue on every platform. As the delivery method for account recovery, we choose Email.

Step 3-4: Sign-up options and Message delivery

These steps are optional and won’t affect account linking, so we omit the instructions for them. You can skip them as we did, or set them up according to your needs.

Step 5: App integration

Let’s set the User Pool name. As this is an example, we chose ‘Test User Pool’. Then we enable the AWS Cognito Hosted UI, since it is the part that matters for the process highlighted in this article, namely linking AWS Cognito with the Alexa skill account.

Then we configure a domain for the endpoints. Below we provided the screenshots to demonstrate this phase of the setup.

Now you need to copy the callback URLs from the Alexa Developer Console and paste them into the provided fields.

After completing these steps, set the sign-out URL. Since the screenshot below does not show it in full, here is the template:

https://{YOUR_DOMAIN}.auth.us-east-1.amazoncognito.com/logout?response_type=code

Step 6: Configuration approval

Now you just need to submit all the changes you made and move on to the next stage of the Account linking process.

Auth Code Grant Layout

All the steps above result in a client ID and a client secret. Copy them from the tab ‘Test User Pool → App client: Alexa client’.

Then, you should return to Alexa Developer Console and finish the initial configurations.

Here are the templates of the full URIs (replace the region with the one your User Pool is in):

Web Authorization URI: https://{YOUR_DOMAIN}.auth.us-east-2.amazoncognito.com/oauth2/authorize?response_type=code&redirect_uri=https://pitangui.amazon.com/api/skill/link/{CODE}
Access Token URI: https://{YOUR_DOMAIN}.auth.us-east-2.amazoncognito.com/oauth2/token
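
Because these URIs differ only in domain prefix, region, and path, they can be assembled with a small helper, which makes it harder to mix up regions between fields. The function name cognitoEndpoints and the example values are assumptions; the URL patterns are the ones from the templates in this guide, without the query parameters.

```javascript
// Builds Cognito Hosted UI endpoint URLs from a domain prefix and a region.
function cognitoEndpoints (domainPrefix, region) {
  const base = `https://${domainPrefix}.auth.${region}.amazoncognito.com`
  return {
    authorize: `${base}/oauth2/authorize`,
    token: `${base}/oauth2/token`,
    logout: `${base}/logout`
  }
}

// 'my-skill-domain' is a placeholder; use your own domain prefix and pool region.
const endpoints = cognitoEndpoints('my-skill-domain', 'us-east-2')

console.log(endpoints.authorize)
console.log(endpoints.token)
console.log(endpoints.logout)
```

Whatever region you pass in has to be the region where the User Pool was created, which is not necessarily the region the Lambda function is deployed to.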

For the last step, you need to create AccountLinkingHandler and register it in your Lambda function.

src/handlers/AccountLinkingHandler.js

const AccountLinkingHandler = {
 canHandle(handlerInput) {
   return !(
     handlerInput.requestEnvelope.session.user &&
     handlerInput.requestEnvelope.session.user.accessToken
   );
 },
 handle(handlerInput) {
   return handlerInput.responseBuilder
     .speak('Please link your account to the Incora assistant skill using the card that I have sent to the Alexa app. ')
     .withLinkAccountCard()
     .getResponse();
 },
};

module.exports = AccountLinkingHandler;

Finally, you can use the access token to fetch any information from your system that you need for personalization. An example of how to obtain it can be found below.


const TechnologyStackIntentHandler = {
 canHandle (handlerInput) {
   return handlerInput.requestEnvelope.request.type === 'IntentRequest'
     && handlerInput.requestEnvelope.request.intent.name === 'HelloWorldIntent'
 },
 handle (handlerInput) {
   // add your own logic
   // const userAccessToken = handlerInput.requestEnvelope.session.user.accessToken
   // now you can use userAccessToken for getting any information from your system for personalization
   // get data from some API, database, etc.
   return handlerInput.responseBuilder.
     speak('Our technology stack comprises JavaScript (Node, Angular, React, Ember, Vue), Python (Django), Mobile apps (React Native, Ionic).').
     reprompt( 'What\'s your request? ').
     getResponse()
 }
}

module.exports = TechnologyStackIntentHandler
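
As a sketch of what using the token can look like, the helpers below read the access token defensively and prepare a request to the user pool’s userInfo endpoint, which returns attributes of the linked user (assuming, as far as we know, that the token was issued with the openid scope). getAccessToken and buildUserInfoRequest are hypothetical helpers, the domain is a placeholder, and the request is only assembled here rather than sent.

```javascript
// Reads the linked-account token from the request, or returns undefined.
function getAccessToken (requestEnvelope) {
  const fromContext = requestEnvelope.context?.System?.user?.accessToken
  const fromSession = requestEnvelope.session?.user?.accessToken
  return fromContext || fromSession
}

// Assembles (does not send) a userInfo request for the given token.
function buildUserInfoRequest (domainBase, accessToken) {
  return {
    url: `${domainBase}/oauth2/userInfo`,
    method: 'GET',
    headers: { Authorization: `Bearer ${accessToken}` }
  }
}

// Hand-made envelopes for illustration only.
const linked = { session: { user: { userId: 'example-user', accessToken: 'example-token' } } }
const notLinked = { session: { user: { userId: 'example-user' } } }

const token = getAccessToken(linked)
const userInfoRequest = buildUserInfoRequest(
  'https://my-skill-domain.auth.us-east-2.amazoncognito.com', // placeholder domain
  token
)

console.log(userInfoRequest.url)
console.log('Token when not linked:', getAccessToken(notLinked))
```

Reading the token through optional chaining avoids a TypeError for requests that arrive without a session object.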

That’s it: account linking is now set up and ready to be used! Your voice assistant is one step closer to satisfying its users.

Conclusion

Voice technology has a future in nearly every industry. The potential of voice recognition will only grow as AI and Machine Learning advance, bringing more usefulness to the market. Voice technology introduces a completely new way of communicating with clients and enhances their engagement.

GitHub code sample of the Installation Process.

GitHub code sample of Account Linking.

First published here: Part 1 / Part 2