Let's take a look at how one may set up an ad-hoc, local (offline) text-to-speech synthesizer using piper, piper-whistle and named pipes.

To set up piper on a GNU/Linux-based system, I'll describe a general architecture built around named pipes. It is straightforward enough to allow for system-wide text-to-speech, with a little bit of manual setup, the help of piper-whistle and some minor trade-offs (it's simple, yet it won't support parallel speech processing).

To start, let's fetch the latest piper stand-alone build from its repository hosted on github (2023.11.14-2 at the time of writing this). After downloading the compressed archive, we'll create a directory structure for our piper setup. The root directory shall be at /opt/wind, with the following sub-directories:

/opt/wind/piper (will house the piper build)
/opt/wind/channels (will contain the named pipes)

After decompressing, the piper executable should be available at /opt/wind/piper/piper, along with the accompanying libraries and espeak-ng-data.

For managing the voice models used by piper, I'd recommend piper-whistle, a command-line utility written in Python, which makes it more convenient to download and manage voices. You can get the latest wheel file from its gitlab or github release page, or install the most recent release through pip via pip install -U piper-whistle.

After installing piper-whistle, let's fetch a voice to generate some speech. First, we update the voice database by calling piper_whistle -vR. For English speech, I quite like the female voice called alba. Using whistle, we can get a list of all available English (GB) voices via piper_whistle list -l en_GB. The alba voice is at index 2, so to install it, simply call piper_whistle install en_GB 2.

Next, let's create the necessary named pipes. The resulting structure will look like this:

/opt/wind/channels/speak (accepts JSON payload)
/opt/wind/channels/input (read by piper)
/opt/wind/channels/output (written by piper)

To create a named pipe, you may use the following command (a short script consolidating the directory and pipe setup is sketched at the end of this section):

mkfifo -m 755 /opt/wind/channels/input

Finally, we create three processes in separate shells:

tty0: tail -F /opt/wind/channels/speak | tee /opt/wind/channels/input
tty1: /opt/wind/piper/piper -m $(piper_whistle path alba@medium) --debug --json-input --output_raw < /opt/wind/channels/input > /opt/wind/channels/output
tty2: aplay --buffer-size=777 -r 22050 -f S16_LE -t raw < /opt/wind/channels/output

The process on tty0 makes sure the pipe is kept open, even after piper or aplay has finished processing. This way, we can queue TTS requests and subsequently play or save them.

Since piper-whistle offers additional features if you use the structure above, we can now generate speech via piper_whistle speak "This is quite neat". On systems with X11, you may generate a spoken version of the text in your clipboard via piper_whistle speak "$(xsel --clipboard --output)".
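Since the speak pipe accepts a JSON payload and piper runs with --json-input, you can also queue a request without piper-whistle by writing JSON to the pipe yourself. A minimal sketch, assuming piper's JSON input mode reads one object per line with a "text" field:

    # Queue a TTS request by writing one JSON object per line to the speak pipe.
    printf '%s\n' '{"text": "This is quite neat."}' > /opt/wind/channels/speak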
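For convenience, the directory and pipe setup described above can be collected into a small shell script. A sketch, assuming sufficient privileges to write under /opt (the mkdir calls and the loop are my additions; the paths and pipe mode are the ones used above):

    #!/bin/sh
    # Create the directory layout for the piper build and the named pipes.
    mkdir -p /opt/wind/piper /opt/wind/channels

    # Create the three named pipes with the same mode as above.
    for channel in speak input output; do
        mkfifo -m 755 "/opt/wind/channels/${channel}"
    done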
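Likewise, if you'd rather not keep three terminals open, the three processes can be started from a single script, with the first two running in the background. A sketch using the commands from tty0 through tty2; whether this suits you depends on how you want to supervise and stop the processes:

    #!/bin/sh
    # tty0: keep the speak pipe open and mirror requests into piper's input pipe.
    tail -F /opt/wind/channels/speak | tee /opt/wind/channels/input &

    # tty1: run piper with the alba voice, reading JSON requests, emitting raw audio.
    /opt/wind/piper/piper -m "$(piper_whistle path alba@medium)" \
        --debug --json-input --output_raw \
        < /opt/wind/channels/input > /opt/wind/channels/output &

    # tty2: play the raw 16-bit little-endian stream at 22050 Hz as it arrives.
    aplay --buffer-size=777 -r 22050 -f S16_LE -t raw < /opt/wind/channels/output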