Spark deployment and performance evaluation on the MareNostrum supercomputer

Tous Liesa, Rubén; Gounaris, Anastasios; Tripiana, Carlos; Torres Viñals, Jordi; Girona Turell, Sergi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Becerra Fontal, Yolanda; Carrera Pérez, David; Valero Cortés, Mateo

doi:10.1109/BigData.2015.7363768

dc.contributor.author	Tous Liesa, Rubén
dc.contributor.author	Gounaris, Anastasios
dc.contributor.author	Tripiana, Carlos
dc.contributor.author	Torres Viñals, Jordi
dc.contributor.author	Girona Turell, Sergi
dc.contributor.author	Ayguadé Parra, Eduard
dc.contributor.author	Labarta Mancho, Jesús José
dc.contributor.author	Becerra Fontal, Yolanda
dc.contributor.author	Carrera Pérez, David
dc.contributor.author	Valero Cortés, Mateo
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2017-01-27T09:14:37Z
dc.date.available	2017-01-27T09:14:37Z
dc.date.issued	2015
dc.identifier.citation	Tous, R., Gounaris, A., Tripiana, C., Torres, J., Girona, S., Ayguadé, E., Labarta, J., Becerra, Y., Carrera, D., Valero, M. Spark deployment and performance evaluation on the MareNostrum supercomputer. In: IEEE International Conference on Big Data. "2015 IEEE International Conference on Big Data: Oct 29-Nov 01, 2015, Santa Clara, CA, USA: proceedings". Santa Clara, CA: Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 299-306.
dc.identifier.isbn	978-1-4799-9925-5
dc.identifier.uri	http://hdl.handle.net/2117/100165
dc.description.abstract	In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a petascale supercomputer designed mainly for compute-intensive applications. As far as we know, this is the first attempt to investigate optimized deployment configurations of Spark on a petascale HPC setup. We detail the design of the framework and present benchmark data that provide insights into the scalability of the system. We examine the impact of different configurations, including parallelism, storage, and networking alternatives, and we discuss several aspects of executing Big Data workloads on a computing system based on the compute-centric paradigm. Further, we derive conclusions that aim to pave the way towards systematic and optimized methodologies for fine-tuning data-intensive applications on large clusters, with an emphasis on parallelism configurations.
dc.format.extent	8 p.
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.subject	Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh	Big data
dc.subject.lcsh	Parallel processing (Electronic computers)
dc.subject.other	Sparks
dc.subject.other	Benchmark testing
dc.subject.other	Supercomputers
dc.subject.other	Scalability
dc.subject.other	Heart beat
dc.title	Spark deployment and performance evaluation on the MareNostrum supercomputer
dc.type	Conference report
dc.subject.lemac	Macrodades
dc.subject.lemac	Processament en paral·lel (Ordinadors)
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi	10.1109/BigData.2015.7363768
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	http://ieeexplore.ieee.org/abstract/document/7363768/
dc.rights.access	Open Access
local.identifier.drac	19377684
dc.description.version	Postprint (author's final draft)
local.citation.author	Tous, R.; Gounaris, A.; Tripiana, C.; Torres, J.; Girona, S.; Ayguadé, E.; Labarta, J.; Becerra, Y.; Carrera, D.; Valero, M.
local.citation.contributor	IEEE International Conference on Big Data
local.citation.pubplace	Santa Clara, CA
local.citation.publicationName	2015 IEEE International Conference on Big Data: Oct 29-Nov 01, 2015, Santa Clara, CA, USA: proceedings
local.citation.startingPage	299
local.citation.endingPage	306

Files in this item

Name: sparkk4mn_bigdata15_withfonts.pdf
Size: 253.9 KB
Format: PDF

This item appears in the following collections

Conference papers/presentations [784]
Conference papers/presentations [1,954]

UPCommons. UPC's open knowledge portal