Too Long; Didn't Read
Transformer-based models are a game-changer when it comes to using unstructured text data. The top-performing models on the General Language Understanding Evaluation (GLUE) benchmark are all BERT variants. Transformer models can learn long-range dependencies in text and can be trained in parallel (unlike recurrent sequence models), which means they can be pre-trained on large amounts of data. We set out to explore how text and tabular data could be used together to provide stronger signals in our projects.
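As a rough sketch of the idea in that last sentence, one simple way to combine the two modalities is to concatenate a text embedding with the tabular features into a single input vector for a downstream model. The `embed_text` helper below is a hypothetical stand-in (a hashed bag-of-words) for a real transformer encoder such as BERT's pooled output:

```python
import numpy as np

def embed_text(text: str, dim: int = 8) -> np.ndarray:
    # Hypothetical stand-in for a transformer encoder (e.g. BERT's
    # pooled [CLS] vector); here we just hash tokens into a
    # fixed-size bag-of-words vector for illustration.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec

def combine_features(text: str, tabular: np.ndarray) -> np.ndarray:
    # Concatenate the text embedding with the tabular features so a
    # downstream model sees both signals in one input vector.
    return np.concatenate([embed_text(text), tabular])

row = combine_features("late payment on invoice", np.array([42.0, 3.0, 0.0]))
print(row.shape)  # (11,) — 8 text dims + 3 tabular features
```

In practice the embedding would come from a pre-trained transformer, and the combined vector (or a learned fusion of the two parts) would feed a classifier or regressor.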