TensorFlow on state-of-the-art HPC clusters: a machine learning use case

Ramirez-Gargallo, Guillem; Garcia-Gasulla, Marta; Mantovani, Filippo

doi:10.1109/CCGRID.2019.00067

Document type: Conference paper

Publication date: 2019

Publisher: IEEE

Access conditions: Open access

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation without the authorization of the rights holder is prohibited.

Abstract

The recent rapid growth of the data-flow programming paradigm has enabled the development of specialized architectures, e.g., for machine learning. The best-known example is Google's Tensor Processing Unit (TPU). Standard data centers, however, cannot yet plan for large partitions dedicated to machine-learning-specific architectures. Within data centers, High-Performance Computing (HPC) clusters are highly parallel machines targeting a broad class of compute-intensive workflows; as such, they can be used to tackle machine learning challenges. On top of this, HPC architectures are changing rapidly and now include accelerators and instruction sets other than the classical x86 CPUs. In this blurry scenario, identifying the best hardware/software configurations to efficiently support machine learning workloads on HPC clusters is not trivial. In this paper, we consider the TensorFlow workflow for image recognition. We highlight the strong dependency of training-phase performance on the availability of arithmetic libraries optimized for the underlying architecture. Following the example of Intel, which leverages the MKL libraries to improve TensorFlow performance, we plugged the Arm Performance Libraries into TensorFlow and tested it on an HPC cluster based on Marvell ThunderX2 CPUs. We also performed a scalability study on three state-of-the-art HPC clusters based on different CPU architectures: x86 Intel Skylake, Arm-v8 Marvell ThunderX2, and PowerPC IBM Power9.

Citation: Ramirez-Gargallo, G.; Garcia-Gasulla, M.; Mantovani, F. TensorFlow on state-of-the-art HPC clusters: a machine learning use case. In: "2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)". 2019, p. 1-8.

URI: http://hdl.handle.net/2117/131762

DOI: 10.1109/CCGRID.2019.00067

Publisher's version: https://ieeexplore.ieee.org/document/8752892

Collections

Computer Sciences - Research reports [15]

Files	Description	Size	Format	View
TensorFlow on state-of-the-art HPC clusters.pdf		847.8 KB	PDF	View/Open

UPCommons. UPC open knowledge portal
