Do you know what generative voices are capable of? What vast possibilities do they open up, and how many areas can benefit from them? Maybe you need an app for voicing a video blog or book. Perhaps you want to bring a game character to life with a professional voiceover. How about creating an application for learning foreign languages? Today, lablab.ai has prepared a tutorial for you that will help you get to know AI voice technologies better! Let's dive in!

Introduction

It is one of the most exciting times for software development, what with the emergence of various "generative AI" tools in the market. Just name it: cover letter generation? Check! E-mail generation? Check! Automatic code comment generation? Check! Even outside coding and software development, the use case possibilities are enormous.

Now, we can generate images from text prompts with various image generation models, which makes it possible for us to incorporate generated assets in our products. The next question is: how about voices? User experience trends of the past few years have named "voice command" as one of the emerging trends. It is only natural that the software we build will incorporate voices as one of its features. Which is why, in this tutorial, we will showcase the "Speech Synthesis" feature offered by ElevenLabs in a simple app, which generates random words and has the speech model pronounce them. To build the UI for this Python-based app, we will use Streamlit, a new UI library to share data science projects.

Introduction to ElevenLabs

ElevenLabs is a voice technology research company which offers a speech synthesis solution. With its easy-to-use API, it allows developers to generate high-quality speech using AI. This is made possible by an AI model which has been trained on a vast amount of audiobooks and podcasts. The training allows the AI to deliver predictable and high-quality results in speech generation.

ElevenLabs has two main features to offer. The first one is VoiceLab, where users can clone voices from recorded audio and/or existing pre-made voices, and also "design" voices based on gender, age, ethnicity and race. Once users have the voices to work with, they can move on to the next feature, Speech Synthesis, where they can generate speech using their designed voices or just the pre-made ones.

Introduction to Claude Model

Claude is the latest AI model developed by Anthropic, an AI research organization which focuses on improving the interpretability, robustness and safety of artificial intelligence systems.

The model is designed to generate human-like responses, making it a powerful tool for a wide range of applications, from content creation and legal work to customer service. Just like other AI models in the market, Claude is also trained on a diverse range of internet text. However, unlike most AI models, Claude has a focus on "safety", which allows it to refuse to produce outputs that it considers "harmful" or "untruthful" for the users.

Introduction to Streamlit

Streamlit is an open-source Python library that makes it easy for developers and data scientists to create and share visually appealing, customized web apps. Developers can use Streamlit to build and deploy fully featured data science apps in minutes. This is made possible by a simple and intuitive API that turns data scripts into UI components.

Prerequisites

Basic knowledge of Python and UI development using Streamlit
Access to Anthropic API
Access to ElevenLabs API

Outline

Initializing our Streamlit Project
Adding Word Generation Feature using Claude Model
Adding Speech Generation Feature using ElevenLabs API
Testing the Word Generator App

Discussion

There are at least four steps that we will get through in this tutorial. First, we need to initialize the Streamlit-based project, to get a general feel for developing user interfaces using Streamlit. Next, we start adding more features, beginning with engineering a prompt to get the Claude model to give us a random word that is commonly misspelled. After that, we'll add text-to-voice generation provided by ElevenLabs to demonstrate how the multilingual model pronounces the words. Finally, we're going to test the simple app.

Initializing our Streamlit Project

Let's get into the coding action! First, let's make a directory for our project and enter it!

mkdir randomwords
cd randomwords

Next, we're going to use this directory as the basis of our project. Because a Streamlit project is essentially a Python project, we need to take a few steps to initialize it, such as creating and activating a Python virtual environment.

# Creating the virtual environment
python -m venv env

# Activate the virtual environment
# On Linux/Mac
source env/bin/activate

# On Windows:
.\env\Scripts\activate

Once activated, the terminal prompt should show the name of the virtual environment, (env).

Next, it's time to install the libraries we need for this project! Let's use the pip command to install the streamlit, anthropic, and elevenlabs libraries. Note that we also install a version-locked pydantic library to prevent a Pydantic-related error in one of the elevenlabs functions.

pip install streamlit anthropic elevenlabs "pydantic==1.*"

With all the project's requirements out of the way, let's dive into the coding part! Create a new file inside our project directory; let's call it randomwords_app.py.

touch randomwords_app.py

After the file is created, let's open it in our favorite code editor or integrated development environment (IDE). For starters, let's build our app from the simplest parts possible: a title and a text for the caption!

import streamlit as st

st.title("Random Words Generator")

st.text("Hello, this is a random words generator app")

To wrap up our project initialization part, let's try a test run of the app. Make sure that our current working directory is still inside our project and our virtual environment is already activated. When everything is ready, use the streamlit run <app-name> command to run the app.

streamlit run randomwords_app.py

The app should open automatically in our default browser! It should show the title and text for now. Next, we're going to add the random word generation feature using Anthropic's Claude model.

One last thing though: we'll have to provide our app with the API keys for the services that we're going to use, namely Anthropic's Claude model and ElevenLabs' Speech Synthesis feature. As these keys are considered sensitive, we should keep them in a safe and isolated place.

However, this time we don't store them in a .env file. This is because Streamlit deals with environment variables differently. According to the documentation, we need to create a secret configuration file inside a .streamlit directory. We can create the directory inside our project and then create the file.

mkdir .streamlit
touch .streamlit/secrets.toml

Let's edit the TOML file we created. Note that a TOML file uses different formatting from the usual .env file: string values must be wrapped in quotes.

xi_api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
claude_key = "sk-ant-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Adding Word Generation Feature using Claude Model

In this step, we will add a button that generates the random word, a heading element to show the generated word, and a subheading to show the meaning of the word. However, coming from a webdev background, I strongly believe that UI elements should be placed and arranged inside containers. So, we'll do exactly that.

Import the necessary libraries

First of all, let's add some import statements. We're going to import the anthropic library to generate our random words.

import streamlit as st
import anthropic

Then, before we get to the UI part, let's create our word generation function first.

Defining the word generation function

def generate_word():
    prompt = (f"{anthropic.HUMAN_PROMPT} Give me one non-English word that's commonly misspelled and the meaning. Please strictly follow the format! example: Word: Schadenfreude; Meaning: joy at other's expenses."
              f"{anthropic.AI_PROMPT} Word: Karaoke; Meaning: a form of entertainment where people sing popular songs over pre-recorded backing tracks."
              f"{anthropic.HUMAN_PROMPT} Great! just like that. Remember, only respond following the pattern.")

    c = anthropic.Anthropic(api_key=st.secrets["claude_key"])
    resp = c.completions.create(
        prompt=f"{prompt} {anthropic.AI_PROMPT}",
        stop_sequences=[anthropic.HUMAN_PROMPT],
        model="claude-v1.3-100k",
        max_tokens_to_sample=900,
    )

    print(resp.completion)
    return resp.completion

In this function, most of the heavy lifting is done by Anthropic's Claude model (Thanks, Claude! 😉). However, our part in this function is making Claude return the exact format consistently. So we need to both instruct Claude to "strictly follow the format" and give it an example response by adding it after our initial prompt. Finally, we make sure that Claude complies with our agreement by asking it to "only respond following the pattern". The function ends by returning the response from Claude.

Next, let's get back to editing the UI!

Updating the UI

st.title("Random Words Generator")

with st.container():
    st.header("Random Word")
    random_word = st.subheader("-")
    word_meaning = st.text("Meaning: -")

    st.write("Click the `Generate` button to generate new word")
    if st.button("Generate"):
        result = generate_word()
        # Split the string on the semicolon
        split_string = result.split(";")

        # Split the first part on ": " to get the word
        word = split_string[0].split(": ")[1]

        # Split the second part on ": " to get the meaning
        meaning = split_string[1].split(": ")[1]

        print(f"word result: {word}")
        random_word.subheader(word)
        word_meaning.text(f"Meaning: {meaning}")

This time, we added a container with some elements inside it: the header, a subheader for displaying the random word, and a text element to show the meaning of the word. We also have a text that hints at how to use the app, as well as a button beneath it.

In Streamlit, we can declare a click event handler by using a conditional statement: st.button returns True when the button is clicked. In this code, we invoke the generate_word() function, which returns the generated word and its meaning, and split the result into separate variables for the word and the meaning, respectively. Finally, we update the subheader and the text element to display the word and the meaning.

Final form

Let's double check our code once again! It should contain the import statements, the function for generating the random word, and the updated UI, which contains the subheader and text elements as well as the button that generates the word by invoking the generate_word() function.

import streamlit as st
import anthropic

def generate_word():
    prompt = (f"{anthropic.HUMAN_PROMPT} Give me one non-English word that's commonly misspelled and the meaning. Please strictly follow the format! example: Word: Schadenfreude; Meaning: joy at other's expenses."
              f"{anthropic.AI_PROMPT} Word: Karaoke; Meaning: a form of entertainment where people sing popular songs over pre-recorded backing tracks."
              f"{anthropic.HUMAN_PROMPT} Great! just like that. Remember, only respond following the pattern.")

    c = anthropic.Anthropic(api_key=st.secrets["claude_key"])
    resp = c.completions.create(
        prompt=f"{prompt} {anthropic.AI_PROMPT}",
        stop_sequences=[anthropic.HUMAN_PROMPT],
        model="claude-v1.3-100k",
        max_tokens_to_sample=900,
    )

    print(resp.completion)
    return resp.completion


st.title("Random Words Generator")

with st.container():
    st.header("Random Word")
    random_word = st.subheader("-")
    word_meaning = st.text("Meaning: -")

    st.write("Click the `Generate` button to generate new word")
    if st.button("Generate"):
        result = generate_word()
        # Split the string on the semicolon
        split_string = result.split(";")

        # Split the first part on ": " to get the word
        word = split_string[0].split(": ")[1]

        # Split the second part on ": " to get the meaning
        meaning = split_string[1].split(": ")[1]

        print(f"word result: {word}")
        random_word.subheader(word)
        word_meaning.text(f"Meaning: {meaning}")

Testing the Word Generation Function

Let's run the app once again with the same command. If we already have the app running, we can also just rerun it by opening the upper-right menu and clicking "Rerun".

It should show the updated user interface. Now, let's try clicking the Generate button!

One of the sweet things about Streamlit is that it handles loading and provides the loading indicator out of the box. We should see the indicator in the upper-right corner, as well as the option to "stop" the operation. Neat, huh?

After a few seconds, the result should be shown in the UI. Perfect! Notice that the app correctly splits the generated text from the Claude model into the word and the meaning. However, if the result doesn't come out in the expected format, we can always click the Generate button again.

The next step is the main feature of this app: incorporating speech generation into our random word generator. Besides demonstrating how to generate the audio file using the API provided by ElevenLabs, this step also serves to demonstrate the capabilities of ElevenLabs' multilingual model.

Adding Speech Generation Feature using ElevenLabs API

The first step of this section, as you've probably guessed, is to add more import statements! So, let's add some functions from elevenlabs that we'll use for the speech generation feature.

import streamlit as st
import anthropic
from elevenlabs import generate, set_api_key

Next, we're going to define the function to handle the speech generation.

def generate_speech(word):
    set_api_key(st.secrets['xi_api_key'])
    audio = generate(
        text=word,
        voice="Bella",
        model='eleven_multilingual_v1'
    )

    return audio

Thanks to the simplicity and readability of Python, and also ElevenLabs' easy-to-use API, we can generate the speech with this code alone! The function accepts the random word, which we use to generate the speech. We also specifically use the "eleven_multilingual_v1" model, which is a multilingual model, perfect for our use case of demonstrating the spelling and pronunciation of foreign and commonly misspelled words! Finally, we use the "Bella" voice for this tutorial, which is one of the pre-made voices provided by ElevenLabs.

Next, we'll add an audio player to play the generated speech.

    print(f"word result: {word}")
    random_word.subheader(word)
    word_meaning.text(f"Meaning: {meaning}")
    speech = generate_speech(word)
    st.audio(speech, format='audio/mpeg')

Just below our latest code from earlier, we add a variable to store the generated speech, and play the speech using the audio player provided by the st.audio function from Streamlit. At this point, our app should work as expected, only showing the audio player when there is a random word available to "read".

Curious how it works out? Me too! Let's test the app now!

Testing the Word Spelling Feature

Let's run the app again using streamlit run, or just rerun the app if we have it running already. It should look exactly the same as the last time we left it. However, let's try to click the "Generate" button this time!

Amazing! This time, the app also shows an audio player! Let's try playing it. Using the multilingual model, the generated speech should use an accent and intonation close to the word's language of origin. For example, "entrepreneur" should be pronounced with a French accent.

Conclusion

In this short tutorial, hopefully we've explored the capabilities of the speech generation technology offered by ElevenLabs. With the multilingual model, it's easy to generate speech intended for non-English listeners. In our use case, we need the multilingual model to help us find the correct way to pronounce and spell non-English words that are commonly misspelled.

With so many ideas, we invite developers to join us in creating the future! On July 28, lablab.ai is launching a challenge in which you can create your own voice AI app with ElevenLabs models! (Additionally, you can leverage other AI models such as large language models, image and video generative models, etc., as long as they are not in direct competition with the hackathon technology.)

*Your final submission should consist of a ready-to-play working prototype of your idea, a video pitch, and a presentation showcasing your solution.

You can find more tutorials and join other hackathons to build with cutting-edge technologies!
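Looking back at the word generator: the parsing step assumes that Claude always answers in the exact "Word: ...; Meaning: ..." pattern, and the tutorial's fallback is to click the Generate button again when it doesn't. The following is a minimal sketch, not part of the original app, of a more tolerant parser that returns None instead of raising an error on an unexpected answer. The helper name parse_word_response and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of the original tutorial code.
# Matches "Word: <word>; Meaning: <meaning>" anywhere in the response text.
_WORD_PATTERN = re.compile(
    r"Word:\s*(?P<word>[^;]+?)\s*;\s*Meaning:\s*(?P<meaning>.+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_word_response(text):
    """Return (word, meaning), or None if the text does not follow the pattern."""
    match = _WORD_PATTERN.search(text)
    if match is None:
        return None
    return match.group("word").strip(), match.group("meaning").strip()


if __name__ == "__main__":
    # A well-formed answer yields a (word, meaning) tuple.
    print(parse_word_response(" Word: Karaoke; Meaning: a form of entertainment."))
    # An answer that ignores the pattern yields None instead of an IndexError.
    print(parse_word_response("Sorry, I can't help with that."))
```

In the button handler, the result could be checked for None before updating the subheader, so that an off-pattern answer shows a hint to try again rather than an error. Unlike splitting on ";" and ": ", this sketch also keeps meanings that themselves contain a colon or a semicolon.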
And big thanks to Septian Adi Nugraha, the author of this article. 💚

Building a Simple Word Spelling App with ElevenLabs, Streamlit, and Claude
