Serve Data Models with MLflow in Production


@ rick-bahague Rick Bahague Free & Open Source Advocate. Data Geek - Big or Small.

For organizations that want to “democratize” data science, data models must be accessible across the enterprise in a simple way. In our context, this is part of “model operationalization.” Serving data models is a common problem for data scientists, and other solutions for it exist.

We had considered building our own REST API that runs a data model whenever new data arrives in an HTTP POST.

Then came MLflow, which allows serving data models as a REST API without a complicated setup.

To serve models using MLflow, we did the following:

1. Save the R model in RDS format using the saveRDS function.

2. Convert the RDS file into an MLflow flavor using mlflow_save_model().

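library(mlflow)
library(carrier)
model <- readRDS("./LR_Model.rds")
# Wrap the prediction call in a crate so that it carries the model with it
predictor <- crate(~ stats::predict.lm(model, as.data.frame(.x)), model = model)
mlflow_save_model(predictor, "< path_to_save >")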

3. Commit to Git.

4. Set up a server to host the model and clone the Git repo onto it.

5. Serve the model using mlflow_rfunc_serve().

mlflow_rfunc_serve("< path_to_save >", run_uuid = NULL, host = "127.0.0.1", port = 8090)

6. In the users’ Jupyter notebooks, the following R lines query the REST API endpoint. The output variable stores the model response.

library(jsonlite)
library(httr)
# Serialize the feature columns as one JSON object per row
request_body_json <- toJSON(feature_dataframe[column_name_features], dataframe = "rows")
request_body_json
result <- POST(url = paste0("http://127.0.0.1:8090/", "predict/"), body = request_body_json, add_headers(.headers = c("Content-Type" = "application/json", "Accept" = "application/json")))
output <- content(result)

7. In Python, a similar function is available:

import requests
import json
def mlflow_predict(dd, model_name="LR12"):
    # model_name is accepted but not used in the request itself
    url = "http://127.0.0.1:8090/predict/"
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    resp = requests.post(url=url, json=dd, headers=headers)
    res = json.loads(resp.content.decode("utf-8"))
    return res["predictions"]
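The Python function takes dd without showing what it should look like. As a minimal sketch, assuming the server expects the same row-oriented JSON that toJSON(..., dataframe = "rows") produces in the R client (one object per observation), the payload can be built from plain Python lists. The helper name rows_payload and the column names x1 and x2 are hypothetical and only there for illustration.

```python
import json


def rows_payload(columns, rows):
    """Build a row-oriented payload: one dict per observation."""
    records = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs one value per column")
        records.append(dict(zip(columns, row)))
    return records


# Hypothetical feature names and values, for illustration only
columns = ["x1", "x2"]
rows = [[1.0, 2.0], [3.0, 4.0]]

dd = rows_payload(columns, rows)
body = json.dumps(dd)
print(body)  # [{"x1": 1.0, "x2": 2.0}, {"x1": 3.0, "x2": 4.0}]
```

The resulting dd can be passed to mlflow_predict(dd) as is, since requests serializes it to JSON itself.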

If you’re deploying on AWS, make sure to set host = "0.0.0.0" so that the server listens on all network interfaces, including the public one, rather than only on localhost.
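Before pointing notebooks at a remote server, it can help to confirm that the port is reachable at all. The following is a sketch using only the Python standard library; the function name is_port_open is hypothetical, and the host and port in the usage line are the ones used in this article.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the server's address when testing remotely
    print(is_port_open("127.0.0.1", 8090))
```

If this returns True on the server itself but False from another machine, the server is probably bound to 127.0.0.1 only, or a firewall rule is blocking the port.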

For Python-based models, MLflow supports deploying to SageMaker.

If you have multiple models to serve, each one is served on its own port.
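With one port per model, client code needs some way to find the right endpoint. One option, sketched here under the assumption that the mapping is maintained by hand, is a small lookup table on the client side. The name predict_url, the pairing of LR12 with port 8090, and the second entry (LR13 on 8091) are all hypothetical.

```python
# Hypothetical mapping from model name to the port it is served on
MODEL_PORTS = {
    "LR12": 8090,
    "LR13": 8091,
}


def predict_url(model_name, host="127.0.0.1"):
    """Return the predict endpoint for a named model."""
    try:
        port = MODEL_PORTS[model_name]
    except KeyError:
        raise ValueError(f"unknown model: {model_name!r}") from None
    return f"http://{host}:{port}/predict/"


print(predict_url("LR12"))  # http://127.0.0.1:8090/predict/
```

A client function that accepts a model_name argument could call predict_url(model_name) instead of hard-coding a single URL.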

Make sure you have working R and Python installations. We used anaconda3 to set up the environment. Also, at least 1 GB of RAM is needed to get R running with MLflow on AWS Lightsail.
