Too Long; Didn't Read
BERT — which stands for Bidirectional Encoder Representations from Transformers— leverages the transformer architecture in a novel way. BERT analyses both sides of the sentence with a randomly masked word to make a prediction. Fine-tuning transformers requires a powerful GPU with parallel processing. In this tutorial, we will use the newly released spaCy 3 library to fine tune our transformer. We will provide the data in IOB format contained in a TSV file then convert to spaCy.