Implementation of the K-Means Algorithm on Heterogeneous Devices: A Use Case Based on an Industrial Dataset

Cite as: hdl:2117/114842
Document type: Conference lecture
Defense date: 2018
Publisher: IOS Press
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Project: AXIOM - Agile, eXtensible, fast I/O Module for the cyber-physical era (EC-H2020-645496)
HiPEAC - High Performance and Embedded Architecture and Compilation (EC-H2020-687698)
MONT-BLANC - Mont-Blanc, European scalable and power efficient HPC platform based on low-power embedded technology (EC-FP7-288777)
MONT-BLANC 2 - Mont-Blanc 2, European scalable and power efficient HPC platform based on low-power embedded technology (EC-FP7-610402)
Mont-Blanc 3 - Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
Abstract
This paper presents and analyzes a heterogeneous implementation of an industrial use case based on K-means that targets symmetric multiprocessing (SMP), GPUs and FPGAs. We show how the application can be optimized from an algorithmic point of view and how this optimization performs on two heterogeneous platforms. The implementation relies on the OmpSs programming model, which introduces a simplified pragma-based syntax for communication between the main processor and the accelerators. The programmer can improve performance by explicitly specifying the data memory accesses or copies. As expected, the newer SMP+GPU system studied is more powerful than the older SMP+FPGA system. However, the latter is sufficient to fulfill the requirements of our use case, and we show that it uses less energy when considering only the active power of the execution.
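For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of textbook K-means (Lloyd's iteration) in plain Python. It is not the authors' optimized OmpSs implementation; the function names `assign`, `update` and `kmeans`, the Euclidean distance, and the convergence test are all assumptions made for illustration.

```python
import math


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    labels = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        dim = len(old)
        mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        new_centroids.append(mean)
    return new_centroids


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In this sketch the assignment step computes every point-to-centroid distance independently, which makes it the data-parallel part that is typically the natural candidate for offloading to an accelerator.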
Citation: Xu, Y. H. [et al.]. Implementation of the K-Means Algorithm on Heterogeneous Devices: A Use Case Based on an Industrial Dataset. In: "Parallel Computing is Everywhere (series: Advances in Parallel Computing)". IOS Press, 2018, p. 642-651.
ISBN: 0927-5452
Publisher version: http://ebooks.iospress.nl/volumearticle/48661
Files | Size | Format |
---|---|---|
Implementation of the K-means.pdf | 463.1 KB | PDF |