Building an End-to-End Speech Recognition Model in PyTorch with AssemblyAI

TLDR

This post walks through building an end-to-end speech recognition model in PyTorch with AssemblyAI. We'll train on a subset of LibriSpeech, a corpus of read English speech derived from audiobooks. The model outputs a probability matrix over characters, which we then use to decode the most likely characters spoken in the audio. The model is inspired by Deep Speech 2 (Baidu's second revision of their now-famous model).
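To make the decoding step concrete, here is a minimal sketch of greedy decoding from a probability matrix of characters. It assumes a CTC-style output with a reserved blank symbol, which the summary above does not state explicitly. The `ALPHABET`, the blank index, the function name `greedy_decode`, and the toy probabilities are all hypothetical and chosen only for illustration. It uses plain Python lists rather than PyTorch tensors so it runs without dependencies.

```python
# Hypothetical character set; index 0 is assumed to be the CTC blank symbol.
ALPHABET = ["_", " ", "a", "b", "c", "t"]
BLANK = 0


def greedy_decode(prob_matrix, alphabet=ALPHABET, blank=BLANK):
    """Pick the most likely character per time step, then collapse.

    prob_matrix: list of time steps, each a list of per-character
    probabilities (one entry per symbol in `alphabet`).
    """
    decoded = []
    previous = None
    for frame in prob_matrix:
        # Index of the highest-probability symbol at this time step.
        best = max(range(len(frame)), key=frame.__getitem__)
        # Skip consecutive repeats and the blank symbol.
        if best != previous and best != blank:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


# Toy matrix: 5 time steps over the 6 symbols above.
probs = [
    [0.10, 0.00, 0.10, 0.10, 0.60, 0.10],  # "c"
    [0.10, 0.00, 0.10, 0.10, 0.60, 0.10],  # "c" again (collapsed as a repeat)
    [0.70, 0.00, 0.10, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.00, 0.70, 0.10, 0.05, 0.05],  # "a"
    [0.10, 0.00, 0.10, 0.10, 0.10, 0.60],  # "t"
]
text = greedy_decode(probs)
print(text)
```

Greedy decoding is the simplest option: it takes the single best character at each step and ignores alternatives. A beam search, possibly with a language model, is a common refinement, but it is not shown here.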

Written by comet.ml | Allowing data scientists and teams to track, compare, explain, and reproduce ML experiments.

Published by HackerNoon on 2020/05/19