Too Long; Didn't Read
This tutorial builds an end-to-end speech recognition model in PyTorch with AssemblyAI. We’ll train on a subset of LibriSpeech, a corpus of read English speech derived from audiobooks. The model outputs a probability matrix over characters, which we then decode to recover the most likely characters spoken in the audio. The architecture is inspired by Deep Speech 2, Baidu’s second revision of their now-famous model.
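To make the decoding step concrete, here is a minimal sketch of one common way to turn a probability matrix into text: greedy decoding. It rests on assumptions not stated above: each row of the matrix is one time step, index 0 is a "blank" symbol that separates repeated characters, and the tiny `ALPHABET` is a hypothetical character map chosen only for illustration. The sketch uses plain Python lists; a PyTorch tensor could be converted first with `.tolist()`.

```python
from itertools import groupby

# Hypothetical character map for illustration; index 0 is an assumed blank symbol.
ALPHABET = ["_", " ", "a", "b", "c", "t"]
BLANK = 0


def greedy_decode(prob_matrix, alphabet=ALPHABET, blank=BLANK):
    """Pick the most likely character at each time step, then clean up.

    prob_matrix: list of rows, one row per time step, one probability
    per entry in `alphabet`.
    """
    # 1. Take the argmax over characters for every time step.
    best = [max(range(len(row)), key=row.__getitem__) for row in prob_matrix]
    # 2. Collapse consecutive repeats (the same character held over frames).
    collapsed = [index for index, _ in groupby(best)]
    # 3. Drop blanks and map the remaining indices to characters.
    return "".join(alphabet[i] for i in collapsed if i != blank)


# Five time steps: "c", "c" again, blank, "a", "t".
probs = [
    [0.10, 0.00, 0.10, 0.10, 0.60, 0.10],
    [0.10, 0.00, 0.10, 0.10, 0.60, 0.10],
    [0.70, 0.00, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.00, 0.70, 0.10, 0.05, 0.05],
    [0.10, 0.00, 0.10, 0.10, 0.10, 0.60],
]

decoded = greedy_decode(probs)
print(decoded)  # cat
```

Greedy decoding is the simplest option because it looks at each time step independently. A beam search, possibly combined with a language model, can give better transcripts at the cost of more computation.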