Serve Data Models with MLFlow in Production

by Rick Bahague, August 3rd, 2019

Too Long; Didn't Read

MLFlow allows serving data models as REST APIs without a complicated setup. For organizations looking for a way to ‘democratize’ data science, it is a must that data models are accessible to the enterprise. Serving data models is a very common problem for data scientists, and there are other solutions out there that address it. We used anaconda3 to set up the environment, and at least 1GB of RAM is needed to get R running with MLFlow on AWS Lightsail. For Python-based models, MLFlow supports deploying to SageMaker.

For organizations looking for a way to “democratize” data science, it is a must that data models are accessible to the enterprise in a very simple way. In our context, this is part of “model operationalization.” Serving data models is a very common problem for data scientists, and there are other solutions out there that address it.

We’ve thought of having a REST API that runs data models on an HTTP POST with new data.

Then came MLFlow, which allows serving data models as REST APIs without a complicated setup.

To serve models using MLFlow, we did the following:

1. Save R models in RDS format using the saveRDS() function.
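A minimal sketch of this step; the fitted model and file name below are placeholders:

# fit any R model, then persist it to disk in RDS format
model <- lm(mpg ~ wt + hp, data = mtcars)
saveRDS(model, "./LR_Model.rds")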

2. Convert the RDS model into an MLFlow flavor using mlflow_save_model().

library(mlflow)     # mlflow_save_model()
library(carrier)    # crate()
# load the RDS model and wrap the prediction call in a crated function
model <- readRDS("./LR_Model.rds")
predictor <- crate(~ stats::predict.lm(model, as.data.frame(.x)), model)
mlflow_save_model(predictor, "< path_to_save >")
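Because crate() returns an ordinary R function, the predictor can be sanity-checked before saving; new_rows is a hypothetical data frame with the model's feature columns:

predictor(new_rows)   # should match stats::predict.lm(model, new_rows)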

3. Commit to Git.

4. Set up a server to host the model and clone the Git repo there.

5. Serve the model using mlflow_rfunc_serve().

mlflow_rfunc_serve("< path_to_save >", run_uuid = NULL, host = "127.0.0.1", port = 8090)

6. On users’ Jupyter notebooks, we use the following R lines to query the REST API endpoint, with the output variable storing the model response.

library(jsonlite)   # toJSON()
library(httr)       # POST(), add_headers(), content()
request_body_json <- toJSON(feature_dataframe[column_name_features], dataframe = 'rows')
request_body_json
result <- POST(url = paste0('http://127.0.0.1:8090/', 'predict/'), body = request_body_json,
               add_headers(.headers = c("Content-Type" = "application/json", "Accept" = "application/json")))
output <- content(result)
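For illustration, the inputs above could be built like this (feature names and values are hypothetical):

column_name_features <- c("wt", "hp")
feature_dataframe <- data.frame(wt = c(2.6, 3.2), hp = c(110, 150))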

7. In Python, a similar function is available:

import requests
import json

def mlflow_predict(dd, model_name='LR12'):
    """POST a list of feature rows (dd) to the model server and return its predictions."""
    url = "http://127.0.0.1:8090/predict/"
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    resp = requests.post(url=url, json=dd, headers=headers)
    res = json.loads(resp.content.decode("utf-8"))
    return res['predictions']
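A hypothetical call, passing two rows with the same feature names used in the R example:

predictions = mlflow_predict([{"wt": 2.6, "hp": 110}, {"wt": 3.2, "hp": 150}])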

If you’re deploying on AWS, make sure to set host = "0.0.0.0" so the server listens on the public network interface.
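With that change, the serve call from step 5 looks like this:

mlflow_rfunc_serve("< path_to_save >", run_uuid = NULL, host = "0.0.0.0", port = 8090)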

For Python-based models, MLFlow supports deploying to SageMaker.

If you have multiple models to serve, MLFlow serves each of them on its own port.
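For example, two models could be exposed side by side, one serve call per R session (the paths are placeholders):

mlflow_rfunc_serve("< path_to_model_A >", run_uuid = NULL, host = "127.0.0.1", port = 8090)
mlflow_rfunc_serve("< path_to_model_B >", run_uuid = NULL, host = "127.0.0.1", port = 8091)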

Make sure you have working R and Python installations. We used anaconda3 to set up the environment. Also, at least 1GB of RAM is needed to get R running with MLFlow on AWS Lightsail.
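On the R side, a minimal sketch of the packages used in this post, all from CRAN (the Python side needs the mlflow package, e.g. via pip install mlflow):

install.packages(c("mlflow", "carrier", "httr", "jsonlite"))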