An Essential Python Text-to-Speech Tutorial Using the pyttsx3 Library

Before we start, let me explain a term that we are going to use throughout this post.

TTS stands for Text To Speech.

The idea is to give our program a piece of text, have it convert that text into speech, and have it read the result aloud to us.

In other words, we are making the computer read to us.

There are various ways to do TTS in Python, but here we will look at one library that I have personally used with good results.

We are going to use pyttsx3.

What is `pyttsx3`

pyttsx3 is a Python library that converts text to speech: we give it our text, and it turns that text into audio.

It’s a wrapper around the text-to-speech engines already available on your system, such as Microsoft’s speech engine (SAPI5) on Windows, and it works offline.

The Fun Stuff

Now let us see how to use this library for TTS.

The first thing we need to do is install the library. We can do that with pip, which comes bundled with most Python installations.

The command follows the same pattern as any other pip install.

pip install pyttsx3

If this command fails with a "command not found" error, try pip3 instead of pip:

pip3 install pyttsx3

After installing, let us check whether the installation succeeded by running this command:

pip3 freeze

It returns a list of all the packages installed in our environment. If pyttsx3 appears in this list, the installation succeeded and we are ready to use it in our project.
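The same check can also be made from inside Python. Here is a minimal sketch that uses the standard library's importlib.metadata (Python 3.8 or newer); the helper name installed_version is just an illustrative choice and is not part of pyttsx3:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Example use:
#   print(installed_version("pyttsx3"))  # a version string, or None if not installed
```

If this returns None for pyttsx3, the package is not visible to the interpreter you are running, which usually means pip installed it into a different environment.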

After the installation is complete, we need to import the library into our project and initialize the text-to-speech engine. The engine is the most important part: it is the object that performs the TTS for us.

Importing pyttsx3 and initializing the text-to-speech engine:

import pyttsx3

engine = pyttsx3.init()

init() is the function that we call to initialize the engine.
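init() also accepts an optional driverName argument. As far as I know, pyttsx3 falls back to sapi5 on Windows, nsss on macOS, and espeak elsewhere when no name is given. The sketch below spells out that choice explicitly; pick_driver_name is a hypothetical helper of my own, and the mapping is an assumption worth checking against the documentation for your pyttsx3 version:

```python
import sys


def pick_driver_name(platform=None):
    """Return the pyttsx3 driver name usually associated with a platform string."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return "sapi5"   # Microsoft Speech API on Windows
    if platform == "darwin":
        return "nsss"    # NSSpeechSynthesizer on macOS
    return "espeak"      # eSpeak on Linux and other platforms


# Example use (needs pyttsx3 and a working speech engine):
#   import pyttsx3
#   engine = pyttsx3.init(driverName=pick_driver_name())
```

In most cases plain pyttsx3.init() is enough; naming the driver is mainly useful when you want the choice to be visible in your code.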

Now that our engine is initialized, we can use it for TTS by calling the say(text) method, followed by runAndWait().

engine.say("Hello from pyttsx3")

engine.runAndWait()
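These two calls are easy to wrap in a small helper. The following is only a sketch: speak is a name I made up, and the optional engine argument is my own addition so that an existing engine can be reused instead of creating a new one on every call.

```python
def speak(text, engine=None):
    """Queue `text` on a pyttsx3 engine and block until it has been spoken."""
    if engine is None:
        # Imported lazily so the helper can also be tried with a stand-in engine.
        import pyttsx3
        engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return engine


# Example use (needs pyttsx3 and a working speech engine):
#   engine = speak("Hello from pyttsx3")
#   speak("This reuses the same engine.", engine)
```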

The speed and volume of the spoken text will be the defaults, and we can change them as follows.

All we need to do is set some properties on our engine. It is like telling the engine what settings to use.

We do this in two steps:

First, we read the current value of the property using getProperty(name)
Then we set a new value for that property using setProperty(name, value)

We will set both the rate and the volume of the engine.

Setting the rate and volume of the speech:

rate = engine.getProperty('rate')

engine.setProperty('rate', rate-100)

By default, the rate is 200, so subtracting 100 lowers it to 100. The rate is the speaking rate, and 200 was too fast for us, so we lowered it.

The rate is simply the pace, in words per minute, at which the speaker speaks the text passed to it.
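If we subtract from the rate more than once, it could drop to zero or below. A minimal sketch of a guard against that follows; slow_down is a hypothetical helper, and the floor of 50 words per minute is an arbitrary value I picked, not a pyttsx3 limit:

```python
def slow_down(engine, amount=100, minimum=50):
    """Lower the engine's speaking rate by `amount`, never going below `minimum`."""
    current = engine.getProperty("rate")
    new_rate = max(minimum, current - amount)
    engine.setProperty("rate", new_rate)
    return new_rate


# Example use (needs pyttsx3 and a working speech engine):
#   import pyttsx3
#   engine = pyttsx3.init()
#   slow_down(engine)        # with the default rate of 200 this sets 100
```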

After setting the rate, we will change the volume by first getting the volume property and then setting it.

volume = engine.getProperty('volume')

engine.setProperty('volume', volume-0.50)

The default volume is 1.0, i.e. 100%. pyttsx3 expects a volume between 0.0 and 1.0, so we cannot go above the default; instead we lower it to 50% by subtracting 0.50 from the value we read from the volume property.
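The pyttsx3 documentation describes volume as a float in the range 0.0 to 1.0, so any adjustment is best clamped to that range before it is set. Here is a small sketch of that idea; change_volume is a name of my own choosing:

```python
def change_volume(engine, delta):
    """Shift the engine's volume by `delta`, clamped to the 0.0 to 1.0 range."""
    current = engine.getProperty("volume")
    new_volume = min(1.0, max(0.0, current + delta))
    engine.setProperty("volume", new_volume)
    return new_volume


# Example use (needs pyttsx3 and a working speech engine):
#   import pyttsx3
#   engine = pyttsx3.init()
#   change_volume(engine, -0.25)   # a little quieter than the default
```

With this helper, asking for more than full volume simply leaves the engine at 1.0 instead of passing it an out-of-range value.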

Now that we are done setting these two properties, we will call say(), and the speech will use our new settings, i.e. the lower rate and the adjusted volume.

engine.say("Hello, This is the test for the pyttsx3")

engine.runAndWait()

runAndWait() is important here. It runs the engine and blocks until the engine has finished speaking all the text we queued.
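Because say() only queues text, we can queue several sentences and process them with a single runAndWait() call. A minimal sketch, where speak_all is a hypothetical helper name:

```python
def speak_all(engine, sentences):
    """Queue every sentence, then process the whole queue in one blocking call."""
    count = 0
    for sentence in sentences:
        engine.say(sentence)   # only queues the text, nothing is spoken yet
        count += 1
    engine.runAndWait()        # speaks everything queued above, then returns
    return count


# Example use (needs pyttsx3 and a working speech engine):
#   import pyttsx3
#   speak_all(pyttsx3.init(), ["First sentence.", "Second sentence."])
```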

Besides changing the rate and volume, we can also change the voice that speaks. The voices available depend on the speech engines installed on your system; on a typical Windows setup there are two, one male and one female.

We will use the same syntax for setting this property as we did earlier.

Keep in mind (the order can differ on other systems):

voices[0] is the male voice.
voices[1] is the female voice.

Let’s change the voice:

voices = engine.getProperty('voices')

engine.setProperty('voice', voices[1].id)

We need to pass the id attribute of the selected voice, and then we are all set.

We can rerun the say() code from above, and this time the voice will be female instead of the default male one.
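Since the order of the voices list can vary between systems, looking a voice up by name is a little safer than relying on an index. The sketch below assumes each voice object has name and id attributes, as pyttsx3 voices do; find_voice_id is a hypothetical helper, and "zira" is only an example of a name you might see on Windows:

```python
def find_voice_id(voices, keyword):
    """Return the id of the first voice whose name contains `keyword`, or None."""
    keyword = keyword.lower()
    for voice in voices:
        name = getattr(voice, "name", "") or ""
        if keyword in name.lower():
            return voice.id
    return None


# Example use (needs pyttsx3 and a working speech engine):
#   import pyttsx3
#   engine = pyttsx3.init()
#   voice_id = find_voice_id(engine.getProperty("voices"), "zira")
#   if voice_id is not None:
#       engine.setProperty("voice", voice_id)
```

Printing voice.name for every entry in the voices list is a quick way to find out which keywords make sense on your machine.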

Once we are done setting the parameters and testing the TTS, we can save the generated speech to an audio file.

Instead of calling say(), this time we call save_to_file() and pass the text along with the name of the output file in which we want the audio saved.

This time, it won’t read the text aloud, but will instead save the speech to the file whose name we passed.

engine.save_to_file("Hello, this is test for pyttsx3.", "test.mp3")

engine.runAndWait()

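After running this, we will have a file named test.mp3 containing the generated speech. Note that the actual audio format is decided by the underlying speech engine, so the file may not be MP3-encoded despite its extension.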
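The text does not have to be typed into the script; it can just as well come from a file. Here is a minimal sketch that reads a text file and hands its contents to save_to_file(). The helper name text_file_to_audio and the UTF-8 encoding are my own assumptions:

```python
from pathlib import Path


def text_file_to_audio(engine, text_path, audio_path):
    """Read text from `text_path` and ask the engine to save it as speech in `audio_path`."""
    text = Path(text_path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{text_path} contains no text to speak")
    engine.save_to_file(text, str(audio_path))
    engine.runAndWait()  # the audio file is written while the queue is processed
    return text


# Example use (needs pyttsx3, a working speech engine, and an existing notes.txt):
#   import pyttsx3
#   text_file_to_audio(pyttsx3.init(), "notes.txt", "notes.mp3")
```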

Also note that pyttsx3 has several other methods and properties that you can use to customize the text-to-speech output, such as picking a voice by language and, on some engines, adjusting the pitch. You can find more information about these in the pyttsx3 documentation.

Final Words

See how easy it is to generate TTS from a piece of text, or even from a file containing the text?

We can use the generated speech for various purposes, and what you do with it is up to you.

I am sure you will now be able to do TTS with ease and make awesome projects with it.

One more thing: follow me on Twitter if you like the content and want to stay connected.

Thank you for reading, and enjoy the content.
