dc.contributor.author: Baig, Shuja-ur-Rehman
dc.contributor.author: Amaral, Marcelo
dc.contributor.author: Polo Cantero, José
dc.contributor.author: Carrera Pérez, David
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Enginyeria Electrònica
dc.date.accessioned: 2018-10-29T19:15:26Z
dc.date.issued: 2018
dc.identifier.citation: Baig, S., Amaral, M., Polo, J., Carrera, D. Performance characterization of Spark workloads on shared NUMA systems. In: International Conference on Big Data Computing Service and Applications. "2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService 2018): Bamberg, Germany: 26-29 March 2018". Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 41-48.
dc.identifier.isbn: 9781538651209
dc.identifier.uri: http://hdl.handle.net/2117/123195
dc.description.abstract: As the adoption of Big Data technologies becomes the norm in a growing number of scenarios, there is also a growing need to optimize them for modern processors. Spark has gained momentum over the last few years among companies looking for high-performance solutions that can scale out across different cluster sizes. At the same time, modern processors can be connected to large amounts of physical memory, up to a few terabytes. This opens an enormous range of opportunities for runtimes and applications that aim to improve their performance by leveraging the low latency and high bandwidth provided by RAM. As a result, several applications today have started pushing the in-memory computing paradigm to accelerate tasks. To deliver such a large physical memory capacity, hardware vendors have leveraged Non-Uniform Memory Architectures (NUMA). This paper explores how Spark-based workloads are impacted by NUMA-placement decisions, how different Spark configurations change the delivered performance, how the characteristics of the applications can be used to predict workload collocation conflicts, and how to improve performance by collocating workloads in scale-up nodes. We explore several workloads run on top of the IBM Power8 processor, and provide manual strategies that can deliver performance improvements of up to 40% on Spark workloads when using smart processor-pinning and workload collocation strategies.
dc.format.extent: 8 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Telecommunications engineering::Telematics and computer networks::Data traffic
dc.subject.lcsh: Big data
dc.subject.other: Benchmark
dc.subject.other: Characterization
dc.subject.other: Memory
dc.subject.other: Modeling
dc.subject.other: NUMA
dc.subject.other: Performance
dc.subject.other: Spark
dc.subject.other: Big data
dc.subject.other: Data storage equipment
dc.subject.other: Electric sparks
dc.subject.other: Memory architecture
dc.subject.other: Models
dc.subject.other: Random access storage
dc.subject.other: Big data technologies
dc.subject.other: Computing paradigm
dc.subject.other: Improve performance
dc.subject.other: Non-uniform memory architecture
dc.subject.other: Performance characterization
dc.subject.other: Performance improvements
dc.subject.other: Benchmarking
dc.title: Performance characterization of Spark workloads on shared NUMA systems
dc.type: Conference report
dc.subject.lemac: Big data
dc.contributor.group: Universitat Politècnica de Catalunya. GRUP ISI - Grup d'Instrumentació, Sensors i Interfícies
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/BigDataService.2018.00015
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/8405690
dc.rights.access: Open Access
local.identifier.drac: 23409239
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
dc.relation.projectid: info:eu-repo/grantAgreement/AGAUR/PRI2010-2013/2014 SGR 1051
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/639595/EU/Holistic Integration of Emerging Supercomputing Technologies/Hi-EST
local.citation.author: Baig, S.; Amaral, M.; Polo, J.; Carrera, D.
local.citation.contributor: International Conference on Big Data Computing Service and Applications
local.citation.publicationName: 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService 2018): Bamberg, Germany: 26-29 March 2018
local.citation.startingPage: 41
local.citation.endingPage: 48

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication, or transformation of this work is prohibited without the permission of the copyright holder.