Machine learning models are usually developed in a training environment (online or offline) and can then be deployed for use with live data.

If you work on data science and machine learning projects, knowing how to deploy a model is one of the most important skills you'll need.

Who is this article for?

This article is for those who have created a machine learning model on a local machine and want to deploy and test it within a short time. It's also for those who are looking for an alternative platform to deploy their machine learning models.

Let's get started! 🚀

What does it mean to deploy a Machine Learning model?

Model deployment is the process of integrating your model into an existing production environment. The model receives input and predicts an output. Machine learning models can be deployed in different environments and can be integrated with web or mobile applications through an API.

"Only when a model is fully integrated with the business systems, we can extract real value from its predictions." - Christopher Samiullah

There are different platforms that can help you deploy your machine learning model. But for most of these platforms, it takes a lot of time and resources to configure the environment and deploy your model.

For example, SageMaker provides popular libraries and ML frameworks, but you depend on the platform to ship new releases of them. This might mean that you won't be able to deploy your model on time. Let's say the SageMaker environment has scikit-learn v0.24 and you want to train and deploy your model with scikit-learn v1.0.1. You will not be able to do it until SageMaker upgrades to the new version of scikit-learn (1.0.1).

In this article, you will learn how to use Aibro to deploy your model quickly and easily.

What is Aibro?

Aibro is a serverless MLOps tool that makes machine learning cloud computing cheap, easy, and fast.
The tool can help data scientists or machine learning engineers train and deploy machine learning models on cloud platforms within a short period of time.

Aibro currently supports the AWS cloud platform, and the team plans to support more cloud platforms such as Google Cloud, Microsoft Azure, Alibaba Cloud, and IBM Cloud. It also supports most of the popular machine learning frameworks on the market, such as TensorFlow, PyTorch, scikit-learn, and XGBoost.

Another advantage of using Aibro is the ability to reduce cloud costs by 85% using an exclusive cost-saving strategy built for machine learning.

Now that you know what Aibro is and what it offers, let's create a simple model and then deploy it.

Create a Simple Model

The first step is to build the model. We are going to use the IMDB Movie dataset to build a model that can classify whether a movie review is positive or negative. Here are the steps you should follow to do that.

Import the Important Packages

We need to import Python packages to load the data, clean the data, create a machine learning model, and save the model for deployment.
    # import important modules
    import numpy as np
    import pandas as pd

    # sklearn modules
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.naive_bayes import MultinomialNB  # classifier
    from sklearn.metrics import (
        accuracy_score,
        classification_report,
        plot_confusion_matrix,  # unused below; removed in scikit-learn 1.2
    )
    from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer

    # text preprocessing modules
    from string import punctuation
    from nltk.tokenize import word_tokenize
    import nltk
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer
    import re  # regular expression

    # Download dependency
    for dependency in (
        "stopwords",  # needed by stopwords.words() in the cleaning step
        "brown",
        "names",
        "wordnet",
        "averaged_perceptron_tagger",
        "universal_tagset",
    ):
        nltk.download(dependency)

    import warnings
    warnings.filterwarnings("ignore")

    # seeding
    np.random.seed(123)

Load the dataset from the data folder:

    # load data
    data = pd.read_csv("../data/labeledTrainData.tsv", sep='\t')

And then show a sample of the dataset:

    data.head()  # show top five rows of data

Our dataset has 3 columns:

    id: the ID of the review
    sentiment: either positive (1) or negative (0)
    review: the comment about the movie

Next, let's check the shape of the dataset:

    data.shape  # check the shape of the data

    (25000, 3)

The dataset has 25,000 reviews.

Now we need to check if the dataset has any missing values:

    # check missing values in data
    data.isnull().sum()

    id           0
    sentiment    0
    review       0
    dtype: int64

The output shows that our dataset does not have any missing values.

How to Evaluate Class Distribution

We can use the value_counts() method from the Pandas package to evaluate the class distribution of our dataset.

    data.sentiment.value_counts()  # evaluate sentiment distribution

    1    12500
    0    12500
    Name: sentiment, dtype: int64

In this dataset, we have an equal number of positive and negative reviews.
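To make the class-balance check concrete, here is a minimal sketch that uses only the Python standard library instead of pandas. The helper names class_proportions and is_balanced, the 5% tolerance, and the toy labels are assumptions made for illustration; they are not part of the original project.

```python
from collections import Counter


def class_proportions(labels):
    """Return the share of each class label as a dict of label -> fraction."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def is_balanced(labels, tolerance=0.05):
    """Return True when every class share is within `tolerance` of an even split."""
    shares = class_proportions(labels)
    even_share = 1 / len(shares)
    return all(abs(share - even_share) <= tolerance for share in shares.values())


# Toy labels standing in for data.sentiment (1 = positive, 0 = negative).
toy_labels = [1, 0, 1, 0, 1, 0]
print(class_proportions(toy_labels))
print(is_balanced(toy_labels))
```

A check like this matters later on: plain accuracy is only a fair summary of performance when the classes are roughly balanced, as they are in this dataset.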
How to Process the Data

After analyzing the dataset, the next step is to preprocess it into the right format before creating our machine learning model.

The reviews in this dataset contain a lot of unnecessary words and characters that we don't need when creating a machine learning model.

We will clean the reviews by removing stopwords, numbers, and punctuation. Then we will convert each word into its base form by using the lemmatization process in the NLTK package.

The text_cleaning() function will handle all the steps needed to clean our dataset.

    stop_words = stopwords.words('english')

    def text_cleaning(text, remove_stop_words=True, lemmatize_words=True):
        # Clean the text, with the option to remove stop_words and to lemmatize words

        # Clean the text
        text = re.sub(r"[^A-Za-z0-9]", " ", text)
        text = re.sub(r"\'s", " ", text)
        text = re.sub(r'http\S+', ' link ', text)
        text = re.sub(r'\b\d+(?:\.\d+)?\s+', '', text)  # remove numbers

        # Remove punctuation from text
        text = ''.join([c for c in text if c not in punctuation])

        # Optionally, remove stop words
        if remove_stop_words:
            text = text.split()
            text = [w for w in text if not w in stop_words]
            text = " ".join(text)

        # Optionally, reduce words to their base form (lemmas)
        if lemmatize_words:
            text = text.split()
            lemmatizer = WordNetLemmatizer()
            lemmatized_words = [lemmatizer.lemmatize(word) for word in text]
            text = " ".join(lemmatized_words)

        # Return the cleaned text as a single string
        return text

Now we can clean our dataset by using the text_cleaning() function:

    # clean the review
    data["cleaned_review"] = data["review"].apply(text_cleaning)

Then split the data into features and target variables like this:

    # split features and target from data
    X = data["cleaned_review"]
    y = data.sentiment.values

Our feature for training is the cleaned_review variable and the target is the sentiment variable.

We then split our dataset into train and test data. The test size is 15% of the entire dataset.
    # split data into train and validate
    X_train, X_valid, y_train, y_valid = train_test_split(
        X,
        y,
        test_size=0.15,
        random_state=42,
        shuffle=True,
        stratify=y,
    )

How to Create a Model

We will train the Multinomial Naive Bayes algorithm to classify whether a review is positive or negative. This is one of the most common algorithms used for text classification.

But before training the model, we need to transform our cleaned reviews into numerical values so that the model can understand the data.

In this case, we will use the TfidfVectorizer method from scikit-learn. TfidfVectorizer will help us convert a collection of text documents to a matrix of TF-IDF features.

To apply this series of steps (pre-processing and training), we will use a Pipeline class from scikit-learn that sequentially applies a list of transforms and a final estimator.

    # Create a classifier in pipeline
    sentiment_classifier = Pipeline(steps=[
        ('pre_processing', TfidfVectorizer(lowercase=False)),
        ('naive_bayes', MultinomialNB()),
    ])

Then we train our classifier like this:

    # train the sentiment classifier
    sentiment_classifier.fit(X_train, y_train)

We then create predictions for the validation set:

    # test model performance on valid data
    y_preds = sentiment_classifier.predict(X_valid)

The model's performance will be evaluated by using the accuracy_score evaluation metric. We use accuracy_score because the two classes in the sentiment variable are balanced.

    accuracy_score(y_valid, y_preds)

    0.8629

The accuracy of our model is around 86.29%, which is good performance.

How to Save the Model Pipeline

We can save the model pipeline in the models directory by using the joblib Python package.

    # save model
    import joblib

    joblib.dump(sentiment_classifier, '../models/sentiment_model_pipeline.pkl')

Now that we've built our model, let's learn how to deploy it with Aibro.
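To make the pre-processing step of the pipeline above more concrete, here is a small pure-Python sketch of TF-IDF weighting. It is not scikit-learn's implementation: it assumes plain whitespace tokenization, and it follows the smoothed IDF formula, ln((1 + n) / (1 + df)) + 1, and the L2 normalization that scikit-learn documents as the TfidfVectorizer defaults. The function name tfidf_vectors and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }

    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        weights = {term: count * idf[term] for term, count in term_counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        # An empty document has no terms, so it maps to an empty vector.
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


toy_reviews = ["great movie great cast", "boring movie"]
for vector in tfidf_vectors(toy_reviews):
    print({term: round(weight, 3) for term, weight in vector.items()})
```

In the second toy review, "boring" ends up with a higher weight than "movie" because "movie" appears in both documents. This is the property that lets the classifier focus on words that distinguish one review from another.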
Deployment Workflow: a Step-by-Step Guide

To deploy your model with Aibro, you need to prepare your model in a properly formatted machine learning model repository. You can take a quick look at this example repository (https://github.com/AIpaca-Inc/Aibro-examples), but we will build the same structure for the model we have created.

In the deployment flow, Aibro creates an inference API from the formatted machine learning model repository, and you receive a unique API URL that you can use to make predictions from your model.

All you need to do is follow these steps.

Step 1: Install the aibro Python library

To install aibro, run the following command in your terminal:

    pip install aibro

Step 2: Prepare the Model Repository

The model repository will be formatted in the following structure.

(a) model folder

This folder will contain the model you have created.

(b) data folder

The data folder will have a JSON file that has an input value. In our case, the input will have a text value (review) as follows.

    {
        "data": "I loved it, the kids loved it. It shows them that anything is possible but more especially when you have that one person fighting for you. That one person who believes in you without fail. I appreciated the various life lessons included in the film about being humble and thankful but commanding respect at the same time despite where or what background you come from. Success doesn’t see age, race or gender but sadly opportunity often does. Will Smith doesn’t let the lack of opportunity beat them as a family and the family is a team. The bigger picture is always knowing that there is a team involved in most successful people."
    }

Note: There is no restriction on how you format your input and output.

(c) predict.py

This Python file should contain two Python functions.

The first function, load_model(), is responsible for loading the machine learning model from the model folder and returning it.
In this tutorial, we will use the joblib package to load the model we have created.

    def load_model():
        # load model
        model = joblib.load("model/sentiment_model_pipeline.pkl")
        return model

The second function, run(), receives a model as its input and then loads the data from the data folder. Finally, it makes a prediction and returns the result.

    def run(model):
        with open("data/data.json", "r") as fp:
            data = json.load(fp)
        review = text_cleaning(data["data"])
        result = {"data": model.predict([review])}
        return result

Therefore, predict.py will look as follows:

    # import important modules
    import json  # load data
    import joblib  # load model
    from clean import text_cleaning  # function to clean the text

    def load_model():
        # load model
        model = joblib.load("model/sentiment_model_pipeline.pkl")
        return model

    def run(model):
        # read the input, clean it, and return the prediction
        with open("data/data.json", "r") as fp:
            data = json.load(fp)
        review = text_cleaning(data["data"])
        result = {"data": model.predict([review])}
        return result

    if __name__ == "__main__":
        run(load_model())

(d) requirements.txt

Aibro will first need to install the packages required to run your model before deploying the model itself. You can either manually write the packages and their version numbers in requirements.txt or run the following command, which will do the same:

    pip list --format=freeze > requirements.txt

The requirements.txt for this project looks like this:

    nltk==3.6.7
    numpy==1.19.1
    pandas==1.0.5
    scikit_learn==0.23.1
    joblib==1.0.0

Note: It is also recommended to use the pipreqs Python package to generate requirements.txt. This is because it includes Python packages based on the imports in your project instead of all the packages in your environment.

    $ pipreqs /home/aibro_project
    Successfully saved requirements file /home/aibro_project/requirements.txt

You can also include other files or folders that will be used by the predict.py Python file. For example, in the model we have created, we will need to clean the input before making a prediction.
(e) Other Artifacts

The clean.py file contains a Python function that will clean the text before making a prediction.

    # import packages
    import nltk

    # Download dependency
    corpora_list = ["stopwords", "names", "brown", "wordnet"]

    for dependency in corpora_list:
        try:
            nltk.data.find('corpora/{}'.format(dependency))
        except LookupError:
            nltk.download(dependency)

    taggers_list = ["averaged_perceptron_tagger", "universal_tagset"]

    for dependency in taggers_list:
        try:
            nltk.data.find('taggers/{}'.format(dependency))
        except LookupError:
            nltk.download(dependency)

    tokenizers_list = ["punkt"]

    for dependency in tokenizers_list:
        try:
            nltk.data.find('tokenizers/{}'.format(dependency))
        except LookupError:
            nltk.download(dependency)

    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer
    from nltk.tokenize import word_tokenize
    import re  # regular expression
    from string import punctuation

    stop_words = stopwords.words('english')

    # function to clean the text
    def text_cleaning(text, remove_stop_words=True, lemmatize_words=True):
        # Clean the text, with the option to remove stop_words and to lemmatize words

        # Clean the text
        text = re.sub(r"[^A-Za-z0-9]", " ", text)
        text = re.sub(r"\'s", " ", text)
        text = re.sub(r'http\S+', ' link ', text)
        text = re.sub(r'\b\d+(?:\.\d+)?\s+', '', text)  # remove numbers

        # Remove punctuation from text
        text = ''.join([c for c in text if c not in punctuation])

        # Optionally, remove stop words
        if remove_stop_words:
            text = text.split()
            text = [w for w in text if not w in stop_words]
            text = " ".join(text)

        # Optionally, reduce words to their base form (lemmas)
        if lemmatize_words:
            text = text.split()
            lemmatizer = WordNetLemmatizer()
            lemmatized_words = [lemmatizer.lemmatize(word) for word in text]
            text = " ".join(lemmatized_words)

        # Return the cleaned text as a single string
        return text

How to Test the Repo with Dryrun

Before we deploy our model, we can test the repo using Dryrun.
Dryrun will locally validate the repo structure and test whether the inference result can be successfully returned.

The following code will test the repo we have created:

    from aibro import Inference

    api_url = Inference.deploy(
        artifacts_path="./sentiment_model_repo",
        dryrun=True,
    )

Note: The formatted model repository is saved at the path "./sentiment_model_repo".

The result shows that the prediction finished without errors. Now we can deploy the model.

How to Create an Inference API with One Line of Code

To deploy the model, you need to configure the following variables in the Inference.deploy() method.

(a) model_name

The model name should be unique with respect to all current active inference jobs under your profile. In this example, the model name will be "my_movie_sentiment_classifier".

(b) machine_id_config

This is the ID of the machine that will run our model. For this example we will use "c5.large.od". You can see the entire list in the marketplace.

(c) artifacts_path

This will be the path to your formatted machine learning model repository. For this example, the path is "./sentiment_model_repo".

(d) description

You can also add a description of your model deployment.

Finally, the one-line code to create an inference API will look as follows.

    from aibro import Inference

    api_url = Inference.deploy(
        model_name="my_movie_sentiment_classifier",
        machine_id_config="c5.large.od",
        artifacts_path="./sentiment_model_repo",
        description="my first inference job",
    )

Once the deployment is finished, an API URL is returned with the following syntax:

    http://api.aipaca.ai/v1/{username}/{client_id}/{model_name}/predict

Note: If your inference job is public, {client_id} is filled out with "public". Otherwise, {client_id} should be filled out with one of your clients' IDs.

In this tutorial, the API URL will be http://api.aipaca.ai/v1/DavisDavid/public/my_movie_sentiment_classifier/predict

How to Test an Aibro API

We successfully deployed the model and got the API URL.
Let's test the model and see the result. We will use a Python package called requests to send a request to the API URL and get the result.

Note: The posted data will replace everything in the data folder. Therefore, your posted data should have the same format as whatever you had in the data folder initially.

    import requests
    import json

    review = {
        "data": "A truly beautiful film that will having you crying with joy and pride. The (few) poor reviews cite a lack of authenticity regarding Richards character and a lack of screen time for the other major family members, including Serena. While I admittedly don’t know exactly the kind of person and father Richard was"
    }

    prediction = requests.post(
        "http://api.aipaca.ai/v1/DavisDavid/public/my_movie_sentiment_classifier/predict",
        data=review,
    )

    result = prediction.text
    print(result)

The prediction result is {'data': array([1])}.

As you can see, we managed to get a prediction by using the API, and the model predicts that the review is positive (1).

Complete the Inference Job

If you're no longer going to use your inference job, you should shut down the API to avoid unnecessary costs. You can shut it down by passing the inference job ID to the Inference.complete() method.

    from aibro.inference import Inference

    job_id = "inf_cd712f4a-4b59-4e44-8787-9c5b5450ff6d"
    Inference.complete(job_id=job_id)

You will receive output confirming that the inference job completed successfully.

Final Thoughts

In this article, you have learned a fast and simple way to deploy a machine learning model to the cloud by using Aibro. You don't need to spend a lot of time and resources configuring the environment; just install aibro and you are good to go.

There is a free community edition you can use for your small ML projects. AIpaca is also giving free credits to new users, so you don't need to worry about cloud costs.

There are a lot of features in Aibro that you can use while deploying your model.
To learn more, you can visit our beautifully designed documentation pages, and you can also join our community to get more help.

You can download the source code used in this article here: https://github.com/Davisy/Aibro-ML-Model-Deployment

If you learned something new or enjoyed reading this article, please share it so that others can see it.