Too Long; Didn't Read
Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio Open Source: a case study from the Alibaba Cloud Container Service Team (White Paper here). Our goal was to reduce the cost and complexity of data access for Deep Learning (DL) training in a hybrid environment. The result was a reduction of over 40% in training time and cost. Combining these technologies into a single solution is emerging as an industry trend for DL training. We designed and implemented a model training architecture based on container and data orchestration technologies, as shown below: