Proactive container auto-scaling for cloud native machine learning services

Buchaca Prats, David; Berral García, Josep Lluís; Wang, Chen; Youssef, Alaa

doi:10.1109/CLOUD49709.2020.00070

_IEEE_CLOUD_2020_shortpaper_Proactive_Container_Auto_scaling_for_Cloud_Native_Machine_Learning_Services.pdf (362.5 KB)

Document type: Conference paper

Publication date: 2020

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation is prohibited without the authorization of the rights holder.

Abstract

Understanding the resource usage behavior of ever-increasing machine learning workloads is critical to cloud providers offering Machine Learning (ML) services. The ability to auto-scale resources for customer workloads can significantly improve resource utilization and thus greatly reduce cost. Here we leverage the AI4DL framework [1] to characterize workloads and discover resource consumption phases. We extend the existing technology with an incremental phase discovery method that applies to more general types of ML workloads, covering both training and inference. We use a time-window MultiLayer Perceptron (MLP) to predict phases in containers with different types of workloads. We then propose a predictive vertical auto-scaling policy that resizes the container dynamically according to the phase predictions. We evaluate our predictive auto-scaling policies on 561 long-running containers with multiple types of ML workloads. The predictive policy can reduce allocated CPU by up to 38% compared to the default resource provisioning policies set by developers. Comparing our predictive policies with commonly used reactive auto-scaling policies, we find that they can accurately predict sudden phase transitions (with an F1-score of 0.92) and significantly reduce the number of out-of-memory errors (350 vs. 20). In addition, we show that the predictive auto-scaling policy keeps the number of resizing operations close to that of the best reactive policies.
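To illustrate the contrast the abstract draws between reactive and predictive vertical auto-scaling, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the 20% safety margin, the per-phase peak table, and the memory trace are hypothetical, and the phase predictor (the time-window MLP in the paper) is replaced by an assumed perfect sequence of predicted phases. The sketch only shows why sizing a container from the predicted phase can avoid an out-of-memory event at a sudden phase transition, while a policy that sizes from the last observed usage cannot.

```python
from typing import Dict, List, Sequence, Tuple


def simulate(usage: Sequence[float], limits: Sequence[float]) -> Tuple[int, int]:
    """Return (out-of-memory events, resize operations) for a trace.

    An OOM event is counted whenever usage exceeds the limit in force at
    that step; a resize is counted whenever the limit changes between steps.
    """
    ooms = sum(1 for u, lim in zip(usage, limits) if u > lim)
    resizes = sum(1 for prev, cur in zip(limits, limits[1:]) if cur != prev)
    return ooms, resizes


def reactive_limits(usage: Sequence[float], initial: float,
                    margin: float = 0.2) -> List[float]:
    """Hypothetical reactive policy: the limit at step t is derived from
    the usage observed at step t-1, plus a safety margin."""
    limits = [initial]
    for prev_usage in usage[:-1]:
        limits.append(round(prev_usage * (1 + margin), 3))
    return limits


def predictive_limits(predicted_phases: Sequence[int],
                      phase_peak: Dict[int, float],
                      margin: float = 0.2) -> List[float]:
    """Hypothetical predictive policy: the limit at step t is derived from
    the peak usage associated with the phase predicted for step t."""
    return [round(phase_peak[p] * (1 + margin), 3) for p in predicted_phases]


# Synthetic memory trace (GiB) with a sudden jump into a heavier phase.
usage = [1.0, 1.0, 4.0, 4.0, 1.0, 1.0]
# Assumed output of a phase predictor; here it is taken to be perfect.
predicted_phases = [0, 0, 1, 1, 0, 0]
# Assumed peak memory per discovered phase.
phase_peak = {0: 1.0, 1: 4.0}

reactive_result = simulate(usage, reactive_limits(usage, initial=1.2))
predictive_result = simulate(usage, predictive_limits(predicted_phases, phase_peak))

print("reactive   (ooms, resizes):", reactive_result)
print("predictive (ooms, resizes):", predictive_result)
```

On this synthetic trace both policies perform the same number of resize operations, but the reactive policy hits the limit once at the phase transition. These numbers come from the toy data only and say nothing about the results reported in the paper.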

Citation: Buchaca, D. [et al.]. Proactive container auto-scaling for cloud native machine learning services. In: IEEE International Conference on Cloud Computing. "2020 IEEE 13th International Conference on Cloud Computing: 18–24 October 2020, virtual event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 475-479. ISBN 978-1-7281-8780-8. DOI 10.1109/CLOUD49709.2020.00070.

URI: http://hdl.handle.net/2117/340053

DOI: 10.1109/CLOUD49709.2020.00070

ISBN: 978-1-7281-8780-8

Publisher's version: https://ieeexplore.ieee.org/document/9284206
