Data-flow driven optimal tasks distribution for global heterogeneous systems

García Almiñana, Jordi; Aguiló Gost, Francisco de Asis L.; Asensio Garcia, Adrian; Simó Mezquita, Ester; Zaragoza Monroig, M. Luisa; Masip Bruin, Xavier

doi:10.1016/j.future.2021.07.018

Data_Flow_Driven_Optimal_Tasks_Distribution_v3.pdf (1.073 MB)

Document type: Article

Publication date: 2021-07

Publisher: Elsevier

Access conditions: Open access

Attribution-NonCommercial-NoDerivs 3.0 Spain

Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain

Abstract

As a result of advances in technology and highly demanding user expectations, more and more applications require intensive computing resources and, most importantly, consume large volumes of data distributed throughout the environment. For this reason, a growing number of research efforts aim to use geographically distributed resources cooperatively, working in parallel and sharing resources and data. An application can be structured as a set of tasks linked by dependency relationships, some of which can be executed in parallel, notably reducing the execution time. This work proposes a model for offloading task execution in heterogeneous environments, considering nodes with different computing capacities, connected through different network bandwidths and located at different distances. The model focuses on the overhead produced when accessing remote data sources as well as the data transfer cost generated between tasks at run-time. The novelty of this approach is that the proposed task allocation mechanism is data-flow aware: it considers the geographical location of both computing nodes and data sources, and yields an optimal solution to a highly complex problem. Two optimization strategies are proposed, the Optimal Matching Model and the Staged Optimization Model, as two different approaches to solving the task scheduling problem. The optimal model considers all of the application's tasks at once and finds a global optimal solution. The staged model, in contrast, is designed to obtain a locally optimal solution stage by stage. In both cases, a mixed integer linear programming model has been designed to minimize the application execution time.
In the studies carried out to evaluate this proposal, the staged model provides the optimal solution in 76% of the simulated scenarios, while dramatically reducing the solving time with respect to the optimal model. Both models have pros and cons and can be used together to complement each other. The optimal model finds the global optimal solution at a high running-time cost, which makes it impractical in some scenarios. The staged model is fast enough to be used in those scenarios; however, its solution might not be optimal in some cases.
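
The contrast between a global and a stage-by-stage optimization can be illustrated with a toy example. The sketch below is not the paper's formulation: it replaces the mixed integer linear programming models with exhaustive enumeration, ignores contention between tasks sharing a node, and uses a hypothetical two-node, two-task instance (the names `SPEED`, `BANDWIDTH`, `TASKS`, `STAGES`, `solve_global` and `solve_staged` are all assumptions made for illustration). It only shows why choosing the best placement for each stage in turn can miss the placement that minimizes the total execution time.

```python
from itertools import product

# Hypothetical toy instance; none of these numbers come from the paper.
SPEED = {"edge": 1.0, "cloud": 4.0}  # compute units per time unit
BANDWIDTH = 1.0                      # data units per time unit between distinct nodes

# Each task has a workload, an optional external data source (location, size),
# and the amount of data received from each predecessor task.
TASKS = {
    "A": {"work": 2.0, "source": ("edge", 3.0), "preds": {}},
    "B": {"work": 8.0, "source": None, "preds": {"A": 10.0}},
}
STAGES = [["A"], ["B"]]  # tasks grouped by dependency level


def transfer(src, dst, size):
    """Time to move `size` data units; free when both ends share a node."""
    return 0.0 if src == dst else size / BANDWIDTH


def finish_times(placement):
    """Finish time of every placed task, visiting tasks stage by stage."""
    finish = {}
    for stage in STAGES:
        for task in stage:
            if task not in placement:
                continue
            spec, node = TASKS[task], placement[task]
            ready = 0.0
            if spec["source"] is not None:
                loc, size = spec["source"]
                ready = transfer(loc, node, size)
            for pred, size in spec["preds"].items():
                arrival = finish[pred] + transfer(placement[pred], node, size)
                ready = max(ready, arrival)
            finish[task] = ready + spec["work"] / SPEED[node]
    return finish


def makespan(placement):
    """Execution time of the placed part of the application."""
    return max(finish_times(placement).values())


def solve_global():
    """Enumerate every placement of all tasks and keep the fastest one."""
    names = [t for stage in STAGES for t in stage]
    candidates = [dict(zip(names, nodes))
                  for nodes in product(SPEED, repeat=len(names))]
    return min(candidates, key=makespan)


def solve_staged():
    """Fix the best placement of each stage before moving to the next."""
    placement = {}
    for stage in STAGES:
        options = [{**placement, **dict(zip(stage, nodes))}
                   for nodes in product(SPEED, repeat=len(stage))]
        placement = min(options, key=makespan)
    return placement


if __name__ == "__main__":
    for name, solver in (("global", solve_global), ("staged", solve_staged)):
        result = solver()
        print(name, result, makespan(result))
```

In this instance the staged search keeps task A next to its data source because that finishes the first stage soonest, which then forces task B to run on the slow node or pay a large transfer. The global search accepts a slower first stage to run both tasks on the fast node. Enumeration grows exponentially with the number of tasks, which mirrors, in a very simplified way, the solving-time trade-off described in the abstract.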

Citation: Garcia, J. [et al.]. Data-flow driven optimal tasks distribution for global heterogeneous systems. "Future generation computer systems", July 2021, vol. 125, p. 792-805.

URI: http://hdl.handle.net/2117/351355

DOI: 10.1016/j.future.2021.07.018

ISSN: 0167-739X

Files	Description	Size	Format
Data_Flow_Driven_Optimal_Tasks_Distribution_v3.pdf		1.073 MB	PDF

UPCommons. UPC's open knowledge portal
