One of the known truths of the machine learning (ML) world is that it takes a lot longer to deploy ML models to production than to develop them.¹ Modern software requires a variety of crucial properties, such as on-demand scaling and high availability, so it can take a lot of time and effort to deploy models to production correctly.

Let’s discuss the options you have when it comes to deploying ML models. They are listed in order from the most general to the most ML-specific.

1. Hardware / VM

The most direct way to deploy anything is to rent a VM, wrap the model in some kind of server and leave it running. While extremely straightforward and customizable, this method has numerous drawbacks, such as difficult integration into CI/CD pipelines and isolation problems.

2. Containers

It is possible to deploy ML models in Docker containers using Kubernetes or similar orchestration tools. This option provides far more quality-of-life improvements. Models can be easily wrapped into specially designed servers such as NVIDIA Triton or TensorFlow Runtime (this works for the VM option as well). It also becomes even easier to chain models together using highly sophisticated frameworks such as Kubeflow.

However, customizability comes at the cost of DevOps complexity and a requirement to maintain the technologies that make your model run.

3. General purpose serverless platforms

An easy way to just drop your model into the cloud is to use serverless PaaS platforms. Here you have to wrap your model in some preprocessing and postprocessing code. Platforms like Heroku or Google App Engine provide more flexibility, since you can even wrap your code in a container, while functions such as AWS Lambda, Google Cloud Functions or Azure Functions make it much easier to deploy and even provide great integrations with their respective cloud services.
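To make the "wrap your model in preprocessing and postprocessing code" step concrete, here is a minimal sketch of a function-style handler. It is an illustration under stated assumptions, not any platform's official template: the `handler(event, context)` signature and the `statusCode`/`body` response shape are modeled on what AWS Lambda uses behind an API Gateway proxy integration, while `WEIGHTS`, `BIAS`, `preprocess`, `predict` and `postprocess` are hypothetical names, and the "model" is a toy linear scorer standing in for a real trained model.

```python
import json

# Hypothetical stand-in for a trained model: a fixed linear scorer.
# A real deployment would load serialized weights here, once per cold start.
WEIGHTS = [0.4, -0.2, 0.1]
BIAS = 0.05


def preprocess(body):
    """Parse the JSON request body and validate the feature vector."""
    payload = json.loads(body)
    features = payload["features"]
    if len(features) != len(WEIGHTS):
        raise ValueError("expected %d features" % len(WEIGHTS))
    return [float(x) for x in features]


def predict(features):
    """Run the toy model: a weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def postprocess(score):
    """Turn the raw score into a JSON-friendly response payload."""
    label = "positive" if score > 0 else "negative"
    return {"score": round(score, 4), "label": label}


def handler(event, context=None):
    """Entry point: preprocess the request, run the model, postprocess."""
    try:
        features = preprocess(event["body"])
    except (KeyError, ValueError, TypeError) as exc:
        # Malformed requests get a 400 instead of crashing the function.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    result = postprocess(predict(features))
    return {"statusCode": 200, "body": json.dumps(result)}
```

The handler can be exercised locally before deploying, for example with `handler({"body": json.dumps({"features": [1, 2, 3]})})`. Keeping the three stages as separate functions makes it easier to swap the toy scorer for a real model without touching the request handling.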
Whether you pick Heroku, Google App Engine, AWS Lambda, Google Cloud Functions or Azure Functions, this approach is best suited to background task processing, because inference time is relatively high: you are limited to processing models on CPU, and the models themselves are commonly stored far away from the processing nodes and may take time to load.

4. ML-focused serverless providers

We are now seeing a rise of ML-focused serverless providers that host your model and expose it through an API or a set of frameworks.

One set of providers includes Amazon SageMaker and Google Cloud AI Platform. These services still require you to rent the underlying compute instances on which your models will run.

Another option is to use DeepMux, Algorithmia or Dataiku. Here you generally get a true serverless experience, since you pay only for the time your models are running.

In general, using ML-focused serverless providers allows you to separate GPU-intensive computations from CPU-intensive ones while providing on-demand scalability for the former. However, you still have to perform pre- and post-processing in the client application or using cloud functions.

That's it

We have reviewed the common options for deploying ML models. Thank you for reading! Stay tuned for more articles, and feel free to write in the comment section or send questions to msitnikov@deepmux.com.

One of the biggest underrated challenges in machine learning development is deploying trained models to production, and doing so in a scalable way. One joke I have read about it is “Most common way Machine Learning gets deployed today is PowerPoint slides :)”.

References

Adarsh Shah. (June 21, 2020). Challenges Deploying Machine Learning Models to Production. https://towardsdatascience.com/challenges-deploying-machine-learning-models-to-production-ded3f9009cb3
Sambit Mahapatra. (March 17, 2019).
Machine Learning Models as Micro Services in Docker. https://towardsdatascience.com/machine-learning-models-as-micro-services-in-docker-a798e1f068a5
James Le. (March 6, 2020). The 5 Components Towards Building Production-Ready Machine Learning Systems. https://medium.com/cracking-the-data-science-interview/the-5-components-towards-building-production-ready-machine-learning-system-a4d5237ec04e

Also published at https://medium.com/@deepmux/deploying-ml-models-into-production-with-ease-8fbe48bc10e8


How to Easily Deploy ML Models to Production
