
Text Classification With Zero Shot Learning

by Shyam Ganesh S, October 16th, 2023

Too Long; Didn't Read

Zero-shot learning is a technique that doesn't require a huge amount of labeled data for training; instead, it applies knowledge learned from other classes at inference time. The training resources (data and time) for zero-shot learning are far smaller than for traditional machine learning. In this article, we will implement a few transformers and the TARSClassifier.


In this piece, we will delve into the fascinating realm of zero-shot learning, and explore how it can effectively address the challenge of text classification.

Zero-Shot Learning

Let us begin by establishing a clear distinction between traditional machine learning and zero-shot learning. Traditional machine learning involves training models using extensive datasets and making predictions based on the acquired knowledge during the training phase.


On the contrary, zero-shot learning is an approach that minimizes the need for vast labeled data during training.


Instead, it leverages pre-learned knowledge from other classes and applies that knowledge during inference. Consequently, zero-shot learning requires significantly less training data and time than conventional machine learning.


With this introduction in place, let us delve into the intriguing realm of transfer learning, which zero-shot learning draws upon at inference time.


Understanding transfer learning becomes seamless through the lens of generative models or large language models. These sophisticated models undergo extensive training on vast datasets to excel at various tasks.


The real magic happens when we decide to employ these models for our specific use cases — this is where transfer learning truly shines. The model effortlessly applies its learned knowledge from one task to excel at an entirely different task, even if it was not part of its initial training objectives.


A prime example of this is leveraging ChatGPT, initially designed for natural language processing, to tackle sentiment classification effectively.


In short, zero-shot learning requires no additional datasets or training time: we leverage pre-existing trained models for inference, saving the resources a traditional training process would consume.


After this brief introduction to zero-shot classification, let us delve into the fascinating realm of text classification through the lens of zero-shot learning.

Zero-shot Text Classification

As we delve into the realm of text classification, a familiar concept emerges - predicting the class of a given text document during inference. In traditional machine learning, achieving accurate predictions in this task demands extensive training data encompassing various classes.


However, with the advent of zero-shot text classification, we aim to simplify the dataset intricacies during the training stage. An intriguing aspect of zero-shot text classification is that it operates without the necessity of a single labeled data point.


Is it not impressive? Our journey of exploration is far from over. Stay tuned for more exciting discoveries ahead...


Zero-shot learning finds application through transformers, showcasing their versatility and power.


If you are unfamiliar with transformers and their potential, you can explore them in detail by following this link.


Hugging Face, a renowned platform in the realm of transformers, offers a rich repository of transformative tools. Hosting an extensive collection of over 60 open-source transformers, the platform serves as a hub for developers and enthusiasts alike.


In addition to these transformers, one can also perform zero-shot text classification using the TARSClassifier from the Flair library. Throughout this article, we will explore the practical implementation of various transformers and demonstrate the application of the TARSClassifier.


Selecting the appropriate transformer model or the TARSClassifier is entirely at the discretion of the developer. The decision hinges on various factors, including the specific task at hand, the available resources, and the unique requirements of the business.

Implementation of Zero-shot Text Classification

As previously mentioned, the Hugging Face platform hosts an extensive array of over 60 transformers. Regrettably, it is impractical to delve into every single model in this article; instead, we will focus on the top two models and then explore the TARSClassifier for a comprehensive understanding.


Before getting into the implementation, we will first install the required packages.

!pip3 install transformers
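Note that the transformers pipeline also needs a deep learning backend installed; environments like Google Colab already ship with one. If yours does not, a PyTorch install (assumed here as the backend, rather than TensorFlow) looks like:

!pip3 install torch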


Having installed the packages, we will now start with the transformer implementation of zero-shot text classification.

facebook/bart-large-mnli

The bart-large-mnli model, a creation of Facebook researchers, is the bart-large base model fine-tuned on the MNLI dataset, which gives it stellar performance in zero-shot text classification. Under the hood, classification is framed as natural language inference: each candidate label is converted into a hypothesis such as "This example is about sports," and the model scores how strongly the input text entails it.


Hugging Face, a pioneering platform for natural language processing, offers a convenient, customizable pipeline: users simply provide their input text and the expected labels to their chosen pre-trained model. Let us take a closer look through an illustrative example:


import transformers

# Load the zero-shot classification pipeline backed by bart-large-mnli
classifier = transformers.pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# The text to classify and candidate labels the model was never trained on
text = "I enjoy playing cricket, specializing as a left-arm leg spinner while showcasing my skills as a right-handed one-down batsman."
labels = ['Politics', 'Automobile', 'Sports', 'Business', 'World']

prediction = classifier(text, labels)

print(prediction['sequence'])  # the input text
print(prediction['labels'])    # labels, best match first
print(prediction['scores'])    # corresponding probabilities


The output of the classifier is a dictionary containing the input sequence along with the predicted labels and their probabilities, sorted in descending order of score.
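Since the labels and scores come back aligned and sorted, the top prediction is simply the first element of each list. The snippet below is a minimal sketch; hypothesis_template is an optional pipeline argument that controls how each candidate label is phrased as an NLI hypothesis:

# The best label is the first entry of each (aligned, sorted) list
top_label, top_score = prediction['labels'][0], prediction['scores'][0]
print(f"Predicted class: {top_label} ({top_score:.2%})")

# Optionally control how labels are turned into NLI hypotheses
prediction = classifier(text, labels, hypothesis_template="This text is about {}.")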

cross-encoder/nli-deberta-base

This transformer is also hosted on Hugging Face and was trained on the SNLI and MultiNLI datasets. It can be used both for cross-encoding tasks and for zero-shot classification.


First, we will create a pipeline, and then call the zero-shot text classifier to get a prediction.

# Reuse the transformers import from above; only the model changes
classifier = transformers.pipeline("zero-shot-classification", model="cross-encoder/nli-deberta-base")

text = "I enjoy playing cricket, specializing as a left-arm leg spinner while showcasing my skills as a right-handed one-down batsman."
labels = ['Politics', 'Automobile', 'Sports', 'Business', 'World']

prediction = classifier(text, labels)

print(prediction['sequence'])  # the input text
print(prediction['labels'])    # labels, best match first
print(prediction['scores'])    # corresponding probabilities


As before, the output is a dictionary containing the input sequence with the predicted labels and probabilities in descending order of score.
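By default, the pipeline assumes exactly one label applies, so the scores are normalized to sum to 1. When a text can belong to several classes at once, the pipeline's multi_label argument scores each label independently; a brief sketch:

# Score each label independently instead of forcing scores to sum to 1
prediction = classifier(text, labels, multi_label=True)

print(prediction['labels'])
print(prediction['scores'])  # independent per-label probabilities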

TARSClassifier

In addition to transformers, zero-shot text classification can be effectively implemented using the Flair library, leveraging the power of its TARSClassifier model.


Before getting into the implementation, we will first install the Flair library:

!pip3 install flair


Having installed the flair library, we will import the required packages and proceed with the implementation.

from flair.models import TARSClassifier
from flair.data import Sentence

# Load the pre-trained TARS base model
classifier2 = TARSClassifier.load("tars-base")

# Wrap the input text in a Flair Sentence object
sentence = Sentence("I am so glad to use Flair")

# Candidate classes the model has never been trained on
classes = ["happy", "sad"]

# Predict without any task-specific training
classifier2.predict_zero_shot(sentence, classes)

# The predicted label is attached to the sentence itself
print(sentence)
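Unlike the transformers pipeline, Flair attaches its predictions to the Sentence object itself rather than returning a dictionary. A minimal sketch of reading them back out via Sentence.get_labels():

# Each predicted label carries a class name (value) and a confidence (score)
for label in sentence.get_labels():
    print(label.value, label.score)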


In this article, we delve into the fascinating realm of zero-shot learning, exploring its applications and demonstrating how to effectively implement this technique to tackle text classification challenges using Python and a variety of transformers along with the Flair library.


With a comprehensive understanding of zero-shot learning under our belt, we can now venture into related concepts such as one-shot learning and few-shot learning. These approaches involve training our models with just one or a few labeled examples per class, offering intriguing possibilities and solutions in the world of machine learning.


For further insights and a deeper dive into the realm of few-shot learning, feel free to check out my dedicated article on few-shot learning.


Happy learning and exploring!