Proactive container auto-scaling for cloud native machine learning services
10.1109/CLOUD49709.2020.00070
Cite as:
hdl:2117/340053
Document type: Conference report
Publication date: 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, its reproduction,
distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Understanding the resource usage behavior of ever-increasing machine learning workloads is critical to cloud providers offering Machine Learning (ML) services. The ability to auto-scale resources for customer workloads can significantly improve resource utilization and thus greatly reduce cost. Here we leverage the AI4DL framework [1] to characterize workloads and discover resource consumption phases. We extend the existing technology with an incremental phase discovery method that applies to more general types of ML workloads, for both training and inference. We use a time-window MultiLayer Perceptron (MLP) to predict phases in containers running different types of workloads. We then propose a predictive vertical auto-scaling policy that resizes the container dynamically according to the phase predictions. We evaluate our predictive auto-scaling policies on 561 long-running containers with multiple types of ML workloads. The predictive policy can reduce allocated CPU by up to 38% compared to the default resource provisioning policies set by developers. Comparing our predictive policies with commonly used reactive auto-scaling policies, we find that they can accurately predict sudden phase transitions (with an F1-score of 0.92) and significantly reduce the number of out-of-memory errors (350 vs. 20). In addition, we show that the predictive auto-scaling policy keeps the number of resizing operations close to that of the best reactive policies.
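The difference between a reactive and a predictive vertical scaling rule can be illustrated with a small simulation. The sketch below is not the authors' implementation: the per-phase memory figures, the toy trace, the `margin` parameter, and all function names are hypothetical, and a simple successor lookup table stands in for the time-window MLP described in the abstract.

```python
from collections import Counter, defaultdict

# Hypothetical per-phase memory demand (GiB). In practice these values would
# come from the phase discovery step, not from hand-written constants.
PHASE_MEM_GIB = {0: 1.0, 1: 4.0, 2: 8.0}


def fit_successor_table(phases, window):
    """Stand-in for the time-window MLP (an assumption, not the paper's model):
    map each window of past phases to its most frequent successor."""
    counts = defaultdict(Counter)
    for i in range(window, len(phases)):
        counts[tuple(phases[i - window:i])][phases[i]] += 1

    def predict(recent):
        key = tuple(recent[-window:])
        if key in counts:
            return counts[key].most_common(1)[0][0]
        return recent[-1]  # unseen window: assume the current phase persists

    return predict


def simulate(phases, choose_phase, window, margin=0.1):
    """Replay a phase trace and return (under-provisioned steps, resizes).

    `choose_phase` receives the history up to (not including) step t and
    returns the phase the container should be sized for at step t.
    """
    limit = PHASE_MEM_GIB[phases[window - 1]] * (1 + margin)
    shortfalls = resizes = 0
    for t in range(window, len(phases)):
        target = PHASE_MEM_GIB[choose_phase(phases[:t])] * (1 + margin)
        if target != limit:  # resize only when the target actually changes
            limit = target
            resizes += 1
        if PHASE_MEM_GIB[phases[t]] > limit:  # proxy for an out-of-memory event
            shortfalls += 1
    return shortfalls, resizes


WINDOW = 3
trace = [0, 0, 0, 1, 1, 2, 2] * 4  # toy periodic workload, purely illustrative

reactive = simulate(trace, lambda history: history[-1], WINDOW)
predictive = simulate(trace, fit_successor_table(trace, WINDOW), WINDOW)
print("reactive   (shortfalls, resizes):", reactive)
print("predictive (shortfalls, resizes):", predictive)
```

On this toy trace the reactive rule is under-provisioned at every upward phase transition, because it sizes the container for the phase it has just observed, while the lookahead rule resizes one step earlier and performs the same number of resize operations. The predictor here is trained and evaluated on the same trace, so the result only illustrates the mechanism and says nothing about real-world accuracy.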
Citation: Buchaca, D. [et al.]. Proactive container auto-scaling for cloud native machine learning services. In: IEEE International Conference on Cloud Computing. "2020 IEEE 13th International Conference on Cloud Computing: 18–24 October 2020, virtual event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 475-479. ISBN 978-1-7281-8780-8. DOI 10.1109/CLOUD49709.2020.00070.
ISBN: 978-1-7281-8780-8
Publisher version: https://ieeexplore.ieee.org/document/9284206
Files | Description | Size | Format | View |
---|---|---|---|---|
_IEEE_CLOUD_202 ... hine_Learning_Services.pdf | | 362.5Kb | | View/Open |