Too Long; Didn't Read
Unsupervised learning approach is demonstrated by state-of-the-art NLP model (e.g. BERT, GPT-2) to be a good way to learn feature for downstream task. Researchers demonstrated data-driven learned features provide a better audio feature than traditional acoustic feature such as Mel-frequency cepstrum (MFCC).