In a world full of Siri, Cortana and Alexa, have you ever wondered whether you could create a voice assistant of your own? It may not be as intelligent, but it is well worth trying to build something new.

Today's web apps rely on various UI elements to interact with users. With the Web Speech API, we can develop rich web applications with natural user interactions and a minimal visual interface, using voice commands. This enables countless use cases. Moreover, the API can make web apps more accessible, helping people with physical or cognitive disabilities or injuries. The future web will be more conversational and accessible!

Here, we will use the API to create an artificial intelligence (AI) voice chat interface in the browser. The app will listen to the user's voice and reply with a synthetic voice. Because the Web Speech API is still experimental, the app works only in supported browsers. The features used in this article, speech recognition and speech synthesis, are both currently available only in Chromium-based browsers, including Chrome 25+ and Opera 27+; Firefox, Edge and Safari support only speech synthesis at the moment.

Let's get started

To build the web app, we're going to take three major steps:

1. Use the Web Speech API's SpeechRecognition interface to listen to the user's voice.
2. Send the user's message as a text string to a commercial natural-language-processing API (API.AI in this tutorial).
3. Once API.AI returns the response text, use the SpeechSynthesis interface to give it a synthetic voice.

Requirements and Installations

- A supported browser
- Node.js
- The following node modules:
  - APIAI: npm i apiai
  - Socket.io: npm install socket.io
  - dotenv: npm i dotenv-extended
  - Express: npm install express --save

Setting Up Your Application

Set up a web app framework with Node.js. Create your app directory, and set up your app's structure like this:

    .
    ├── index.js
    ├── files
    │   ├── css
    │   │   └── style.css
    │   └── js
    │       └── script.js
    └── views
        └── index.html

Then, run this command to initialize your Node.js app:

    $ npm init

This will generate a package.json file that contains the basic info for your app. Now, install all of the dependencies needed to build this app:

    $ npm i apiai
    $ npm install socket.io
    $ npm i dotenv-extended
    $ npm install express --save

We are going to use Express, a Node.js web application server framework, to run the server locally. To enable real-time bidirectional communication between the server and the browser, we'll use Socket.IO. Also, we'll install the natural language processing service tool, APIAI, in order to build an AI chatbot that can have an artificial conversation.

Socket.IO is a library that enables us to use WebSocket easily with Node.js. By establishing a socket connection between the client and server, our chat messages will be passed back and forth between the browser and our server, as soon as text data is returned by the Web Speech API (the voice message) or by the API.AI API (the "AI" message).
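The back-and-forth described above can be sketched without any network at all. The snippet below is only an illustration under stated assumptions: it uses Node's built-in EventEmitter as a stand-in for the socket (a real Socket.IO socket sends events over the connection instead of dispatching them locally), and `channel` and `getBotReply` are hypothetical names, with `getBotReply` standing in for the call to the language-processing service. The event names 'chat message' and 'bot reply' are the ones used in this tutorial.

```javascript
'use strict';
const { EventEmitter } = require('events');

// Stand-in for the socket connection (assumption: in the real app the
// browser and the server each hold their own end of a Socket.IO socket).
const channel = new EventEmitter();

// Hypothetical placeholder for the natural-language-processing call.
function getBotReply(text) {
  return 'You said: ' + text;
}

// "Server" side: receive the user's text and send a reply back.
channel.on('chat message', (text) => {
  channel.emit('bot reply', getBotReply(text));
});

// "Browser" side: collect every reply that arrives.
const replies = [];
channel.on('bot reply', (replyText) => {
  replies.push(replyText);
});

// "Browser" side: send what the user said.
channel.emit('chat message', 'hello');
console.log(replies);
```

The point of the sketch is the shape of the exchange: one named event carries the user's text in one direction, and a second named event carries the reply in the other.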
Now, let's create an index.js file, instantiate Express and start listening on the server:

    'use strict';

    var apiai = require('apiai');
    var APIAI_TOKEN = apiai(" "); // use an API token from the official site
    const APIAI_SESSION_ID = " "; // use a session id

    const express = require('express');
    const app = express();

    app.use(express.static(__dirname + '/views'));
    app.use(express.static(__dirname + '/files'));

    const server = app.listen(process.env.PORT || 3000, () => {
      console.log('Server listening on port %d ', server.address().port);
    });

    const io = require('socket.io')(server);
    io.on('connection', function(socket) {
      console.log('a user connected');
    });

    // Web UI (res.sendFile needs an absolute path)
    app.get('/', (req, res) => {
      res.sendFile(__dirname + '/views/index.html');
    });

    io.on('connection', function(socket) {
      socket.on('chat message', (text) => {
        console.log('Message: ' + text);

        // Get a reply from API.ai
        let apiaiReq = APIAI_TOKEN.textRequest(text, {
          sessionId: APIAI_SESSION_ID
        });

        apiaiReq.on('response', (response) => {
          let aiText = response.result.fulfillment.speech;
          console.log('Bot reply: ' + aiText);
          socket.emit('bot reply', aiText);
        });

        apiaiReq.on('error', (error) => {
          console.log(error);
        });

        apiaiReq.end();
      });
    });

Now, we will integrate the front-end code with the Web Speech API.

Creating The User Interface

The UI of this app is simple: just a button to trigger voice recognition. Let's set up our index.html file and include our front-end JavaScript file (script.js) and Socket.IO, which we will use later to enable the real-time communication:

    <!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <meta name="viewport" content="width=device-width">
        <title>Recognito</title>
        <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
        <link rel="stylesheet" type="text/css" href="css/style.css">
      </head>
      <body>
        <section>
          <h1>Recognito</h1>
          <button id="btn"><i class="fa fa-microphone"></i></button>
          <div>
            <p>You said: <em class="output-you">...</em></p>
            <p>Recognito replied: <em class="output-bot">...</em></p>
          </div>
        </section>
        <script src="socket.io/socket.io.js"></script>
        <script src="js/script.js"></script>
      </body>
    </html>

To style the button, refer to the style.css file in the source code.

Capturing Voice With JavaScript

In script.js, invoke an instance of SpeechRecognition, the controller interface of the Web Speech API for voice recognition:

    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    const recognition = new SpeechRecognition();

We're including both prefixed and non-prefixed objects, because Chrome currently supports the API with prefixed properties. Also, we are using some ECMAScript 6 syntax in this tutorial, because that syntax, including const and arrow functions, is available in browsers that support both Speech API interfaces, SpeechRecognition and SpeechSynthesis.

Optionally, you can set a variety of properties to customize speech recognition:

    recognition.lang = 'en-US';
    recognition.interimResults = false;

Then, capture the DOM reference for the button UI, and listen for the click event to initiate speech recognition.

    document.querySelector('button').addEventListener('click', () => {
      recognition.start();
    });

Once speech recognition has started, use the result event to retrieve what was said as text. This will return a SpeechRecognitionResultList object containing the result, and you can retrieve the text in the array.
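To make the shape of that result list concrete, here is a small sketch that runs outside the browser. It rests on assumptions: `pickLatestResult` is a hypothetical helper name, and the `fakeResults` array only imitates the nested structure of a SpeechRecognitionResultList (a list of results, each holding one or more alternatives with `transcript` and `confidence`); it is made-up data, not output from the real API.

```javascript
'use strict';

// Hypothetical helper: returns the first alternative of the most recent result.
// `results` is expected to look like a SpeechRecognitionResultList:
// results[i] is one result, results[i][0] is its first alternative.
function pickLatestResult(results) {
  const last = results.length - 1;
  const best = results[last][0];
  return { transcript: best.transcript, confidence: best.confidence };
}

// Made-up data imitating what the result event would carry in e.results.
const fakeResults = [
  [{ transcript: 'hello there', confidence: 0.92 }]
];

const latest = pickLatestResult(fakeResults);
console.log(latest.transcript + ' (confidence: ' + latest.confidence + ')');
```

In the browser, the same helper could be called as pickLatestResult(e.results) inside the result event listener.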
As the following code sample shows, the result also includes a confidence score for the transcription:

    recognition.addEventListener('result', (e) => {
      let last = e.results.length - 1;
      let text = e.results[last][0].transcript;

      console.log('Confidence: ' + e.results[0][0].confidence);

      // We will use the Socket.IO here later…
    });

Real-Time Communication With Socket.IO

Socket.IO is a library for real-time web applications. It enables real-time bidirectional communication between web clients and servers. We are going to use it to pass the result from the browser to the Node.js code, and then pass the response back to the browser.

You may be wondering why we are not using simple HTTP or AJAX instead. You could send data to the server via POST. However, we are using WebSocket via Socket.IO because sockets are well suited to bidirectional communication, especially when pushing an event from the server to the browser. With a continuous socket connection, we won't need to reload the browser or keep sending AJAX requests at a frequent interval.

Instantiate Socket.IO somewhere in script.js:

    const socket = io();

Then, insert this code where you are listening to the result event from SpeechRecognition:

    socket.emit('chat message', text);

Now, let's go back to the Node.js code to receive this text and use AI to reply to the user. To build a quick conversational interface, we will use API.AI because it provides a free developer account and allows us to set up a small-talk system quickly using its web interface and Node.js library.

Setting Up APIAI

Use this for reference:

    var APIAI_TOKEN = apiai("5afc4bdf601046b39972ff3866cca392");
    const APIAI_SESSION_ID = "chatbot-clvxfh";

or get your own by visiting the official site (Getting Started) and signing up.

Now we will use the server-side Socket.IO to receive the result from the browser.
    io.on('connection', function(socket) {
      socket.on('chat message', (text) => {

        // Get a reply from API.AI
        // (APIAI_TOKEN holds the client created with apiai(...) in index.js)
        let apiaiReq = APIAI_TOKEN.textRequest(text, {
          sessionId: APIAI_SESSION_ID
        });

        apiaiReq.on('response', (response) => {
          let aiText = response.result.fulfillment.speech;
          socket.emit('bot reply', aiText); // Send the result back to the browser!
        });

        apiaiReq.on('error', (error) => {
          console.log(error);
        });

        apiaiReq.end();
      });
    });

Once the connection is established and the message is received, use the API.AI APIs to retrieve a reply to the user's message. When API.AI returns the result, use Socket.IO's socket.emit() to send it back to the browser.

Giving Voice to the Bot With the SpeechSynthesis Interface

Create a function to generate a synthetic voice. This time, we are using the SpeechSynthesis controller interface of the Web Speech API. The function takes a string as an argument and enables the browser to speak the text:

    function synthVoice(text) {
      const synth = window.speechSynthesis;
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = text;
      synth.speak(utterance);
    }

In the function, first create a reference to the API entry point, window.speechSynthesis. You might notice that there is no prefixed property this time: this API is more widely supported than SpeechRecognition, and all browsers that support it have already dropped the prefix for SpeechSynthesis.

Then, create a new SpeechSynthesisUtterance() instance using its constructor, and set the text that will be synthesised when the utterance is spoken. You can set other properties, such as voice, to choose among the voices that the browser and operating system support.

Finally, use SpeechSynthesis.speak() to let it speak!

Now, get the response from the server using Socket.IO again. Once the message is received, call the function.
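As a rough sketch of that last step, the reply handler can be written so that it also runs outside the browser. This is an illustration under stated assumptions: `createReplyHandler` is a hypothetical helper, and the speaking function and output element are passed in as parameters purely so the logic can be exercised with stand-ins. In the app they would be synthVoice and the .output-bot element.

```javascript
'use strict';

// Hypothetical helper: builds the listener for the 'bot reply' event.
// speak: function that voices a string (synthVoice in the app).
// outputEl: object with a textContent property (the .output-bot element in the app).
function createReplyHandler(speak, outputEl) {
  return function onBotReply(replyText) {
    speak(replyText);
    // Show a placeholder when the service returned an empty reply.
    outputEl.textContent = replyText === '' ? '(No answer...)' : replyText;
  };
}

// Stand-ins so the sketch runs in Node.js without a browser.
const spoken = [];
const fakeOutput = { textContent: '' };
const onBotReply = createReplyHandler((text) => spoken.push(text), fakeOutput);

onBotReply('Hi! How can I help?');
console.log(fakeOutput.textContent);
```

In the browser, this would be wired up as socket.on('bot reply', createReplyHandler(synthVoice, outputBot)).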
The complete script.js, including the listener for the bot reply event, looks like this:

    'use strict';

    const socket = io();

    const outputYou = document.querySelector('.output-you');
    const outputBot = document.querySelector('.output-bot');

    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    const recognition = new SpeechRecognition();

    recognition.lang = 'en-US';
    recognition.interimResults = false;
    recognition.maxAlternatives = 1;

    document.querySelector('button').addEventListener('click', () => {
      recognition.start();
    });

    recognition.addEventListener('speechstart', () => {
      console.log('Speech has been detected.');
    });

    recognition.addEventListener('result', (e) => {
      console.log('Result has been detected.');

      let last = e.results.length - 1;
      let text = e.results[last][0].transcript;

      outputYou.textContent = text;
      console.log('Confidence: ' + e.results[0][0].confidence);

      socket.emit('chat message', text);
    });

    recognition.addEventListener('speechend', () => {
      recognition.stop();
    });

    recognition.addEventListener('error', (e) => {
      outputBot.textContent = 'Error: ' + e.error;
    });

    function synthVoice(text) {
      const synth = window.speechSynthesis;
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = text;
      synth.speak(utterance);
    }

    socket.on('bot reply', function(replyText) {
      synthVoice(replyText);

      if (replyText == '') replyText = '(No answer...)';
      outputBot.textContent = replyText;
    });

It's done. Run the following command in your terminal:

    $ node index.js

Then open localhost:3000 in any supported browser.

You can refer to my repository for further help.

References

- https://caniuse.com/#search=speech
- https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition
- https://www.npmjs.com/package/apiai
- https://socket.io/
- https://www.npmjs.com/package/dotenv-extended
- https://expressjs.com/