My engineering friend, let's call him Mr. Wolf 🐺 (identity hidden), requested a 1:1 call to help him fix his classification model. This happened months ago, and I thought I'd share our conversation as a learning opportunity for others. To be honest, he referred to his model as an "AI-based classifier"; I have merely polished his wording here. 😂

A real-life story of #engineermeetsdatascientist

It was his first experience building a DS model. Mr. Wolf scanned through many YouTube influencer videos and, from some random godly-looking "intelligent" data scientists, learned how to build a data science classification model, or, to be precise, a text classification model. He must have come across someone making claims along those lines. After a few videos, Mr. Wolf felt he knew enough and went ahead to build his so-called "AI-Based Classifier," which detects the topic with the highest probability and responds with a canned response corresponding to that topic.

[Problem Statement]

The classifier is trained on questions that are mostly related to two topics: shipping plans and order cancellation costs. Mr. Wolf, with his partial data science and domain knowledge, converted the text to embeddings and then created a binary classification model. Ignoring the fact that there might be user questions belonging to neither of these two topics, he deployed the model to his production system.

He became a rock star after the deployment, Woohooooo!!! earning his workplace the title of "AI-DRIVEN COMPANY." But a day or two later, during post-deployment analysis, his happiness was gone and he had to save face: the model produces incorrect results for queries that do not belong to the two trained topics. He called me to help fix the sham. 😂

The first thing I said to my friend Mr. Wolf: "Bhai sun meri baat, kisi bhi youtuber ka code copy karke hero mat bano, pehle usska background and experience dekho, GURU banane se pehle."

Translation: My first piece of advice was to not copy some random influencer's code snippets and play hero, and to first check a person's background and experience before appointing them as your GURU.

After understanding the details of the problem and his currently deployed model, which classifies user queries into two categories (order cancellations and shipping plans), I proposed the following two solutions.

Short-term, inelegant solution (not recommended for long-term use)

Implement an automatic response feature that only activates when the class confidence is above a 0.8 threshold. This would help reduce "false positives," in which the model responds to queries that are not related to shipping plans or order cancellation. However, this solution may increase false negatives and result in an influx of common queries being sent to human customer support agents until the model is improved.

Note: The 0.8 threshold is just an assumption. The ideal threshold can be determined by identifying the point on a receiver operating characteristic (ROC) plot where the false positive rate is minimized and the true positive rate is maximized.

Fundamentally, what was done wrong?

While building the classification model, Mr. Wolf trained it on only 40% of the data (the queries belonging to the two topics), without treating the remaining 60% as a class of its own. This resulted in the model treating the problem as a binary classification problem rather than a multi-class classification problem.
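To make the short-term fix concrete, here is a minimal sketch, assuming a scikit-learn-style model that exposes `predict_proba`. The function names, the `ESCALATE_TO_HUMAN` sentinel, and the default threshold are my own illustrative choices, not Mr. Wolf's production code:

```python
import numpy as np
from sklearn.metrics import roc_curve

def respond_or_escalate(model, features, threshold=0.8):
    """Auto-respond only when the top class is confident enough."""
    probs = model.predict_proba([features])[0]
    best = int(np.argmax(probs))
    if probs[best] >= threshold:
        return model.classes_[best]   # send the canned response for this topic
    return "ESCALATE_TO_HUMAN"        # not confident: route to a support agent

def pick_threshold(y_true, y_score):
    """Choose the ROC operating point with high TPR and low FPR."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # Youden's J statistic: maximize TPR - FPR across candidate thresholds
    return thresholds[np.argmax(tpr - fpr)]
```

Maximizing Youden's J statistic (TPR minus FPR) is one common way to pick the ROC point described in the note above; held-out validation labels and scores feed `pick_threshold`.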
Long-Term Solution

Implement a multi-class classifier with three categories:

Class 1: shipping plans
Class 2: order cancellation cost
Class 3: negative samples (`DO_NOT_REPLY`)

As more queries come in, it is important to keep the model up to date to prevent model drift. To enhance the model's understanding of the queries and their classes, we should consider retraining or refreshing the model for the three classes. Online learning with Vowpal Wabbit is also a possible solution to handle the drift issue.

Let me share some NLP classifier methods we can apply. The overall recipe: convert the query text to an embedding, then build a multi-class classifier with the embedding as features and the topic as the target.

Method 1: TFIDFVectorizer with a spaCy tokenizer and a machine learning classifier (a minimal sketch of this, with the three classes above, follows at the end of this post)
Method 2: Using Facebook's FastText library for word embeddings
Method 3: Converting text to vectors using Doc2Vec, a pretrained Gensim model, and a machine learning classifier
Method 4: A combination of Word2Vec with an average pooling strategy or a TF-IDF weighted pooling strategy
Method 5: An advanced approach using Google's BERT model to obtain document embeddings and classify them into the three topics/categories
Method 6: If the training dataset is small, a zero-shot classifier is the best approach. Try Ktrain's zero-shot classifier: only 4–5 lines of code (a sketch of the zero-shot idea is also included below).

If you are looking for the exact text-to-embedding code, check out my newsletter Edition 6, "Building a Job Recommendation Strategy for LinkedIn and XING."

If you have something interesting for me, let's connect and discuss it in detail. MAIL ME

Mr. Wolf listened carefully to my list of ideas and approaches and said, "You are my GURU now."
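As promised, here is a minimal sketch of Method 1 with the three classes from the long-term solution. Everything in it is illustrative: the toy queries, the label names, and the plain `LogisticRegression` (you could pass a spaCy tokenizer via `TfidfVectorizer(tokenizer=...)` as the method suggests) are my assumptions, not Mr. Wolf's code:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data; in practice you would use real labelled queries,
# including genuine out-of-scope questions for the DO_NOT_REPLY class.
train_texts = [
    "which shipping plan is the fastest",
    "do you offer free shipping plans",
    "how much does it cost to cancel my order",
    "what is the cancellation fee for my order",
    "what is the weather today",          # negative sample
    "tell me a joke",                     # negative sample
]
train_labels = [
    "SHIPPING_PLANS", "SHIPPING_PLANS",
    "ORDER_CANCELLATION_COST", "ORDER_CANCELLATION_COST",
    "DO_NOT_REPLY", "DO_NOT_REPLY",
]

# TF-IDF features feeding a linear multi-class classifier
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_texts, train_labels)

print(clf.predict(["how do I cancel and what does it cost?"]))
```

At inference time you would still combine this with the confidence threshold from the short-term fix, so that low-confidence predictions escalate to a human rather than triggering a canned response.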
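And for Method 6, a zero-shot sketch. The post points to Ktrain; the version below uses Hugging Face's `zero-shot-classification` pipeline instead, which illustrates the same idea in a few lines (the choice of the `facebook/bart-large-mnli` NLI model is my assumption):

```python
from transformers import pipeline

# Zero-shot classification needs no task-specific training data:
# the candidate labels are supplied at inference time.
zsl = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = zsl(
    "How much will I be charged if I cancel my order?",
    candidate_labels=["shipping plans", "order cancellation cost", "other"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```

Because the labels are given at prediction time rather than learned, this approach suits the case where the training dataset is small or does not exist yet.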