dc.contributor.author: Jaksic, Zoran
dc.contributor.author: Cadenelli, Nicola
dc.contributor.author: Buchaca Prats, David
dc.contributor.author: Polo Bardés, Jordà
dc.contributor.author: Berral García, Josep Lluís
dc.contributor.author: Carrera Pérez, David
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2020-05-06T08:29:50Z
dc.date.available: 2020-05-06T08:29:50Z
dc.date.issued: 2020-03-01
dc.identifier.citation: Jaksic, Z. [et al.]. A highly parameterizable framework for Conditional Restricted Boltzmann Machine based workloads accelerated with FPGAs and OpenCL. "Future generation computer systems", 1 March 2020, vol. 104, no. March 2020, p. 201-211.
dc.identifier.issn: 0167-739X
dc.identifier.uri: http://hdl.handle.net/2117/186484
dc.description: © 2020 Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.description.abstract: The Conditional Restricted Boltzmann Machine (CRBM) is a promising candidate for multidimensional system modeling that can learn a probability distribution over a set of data. It is a specific type of artificial neural network with one input (visible) layer and one output (hidden) layer. Recently published works demonstrate that the CRBM is a suitable mechanism for modeling multidimensional time series such as human motion, workload characterization, and city traffic analysis. Learning and inference in these systems rely on linear algebra functions like matrix–matrix multiplication, which are very compute-intensive for larger data sets. In this paper, we present a configurable framework for CRBM-based workloads for arbitrarily large models. We show how to accelerate the learning process of the CRBM with FPGAs and OpenCL, and we conduct an extensive scalability study for different model sizes and system configurations. We show a significant improvement in performance/Watt for large models and batch sizes (from 1.51x up to 5.71x, depending on the host configuration) when we use FPGAs and OpenCL for the acceleration, and limited benefits for small models compared to the state-of-the-art CPU solution.
dc.description.sponsorship: This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreements No 639595); the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya, Spain under contract 2014SGR1051; the ICREA, Spain Academia program; the BSC-CNS Severo Ochoa program, Spain (SEV-2015-0493) and Intel Corporation, United States
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Elsevier
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Computer systems
dc.subject.other: CRBM
dc.subject.other: FPGA
dc.subject.other: OpenCL
dc.subject.other: Time-series
dc.subject.other: ANN
dc.subject.other: GEMM
dc.title: A highly parameterizable framework for Conditional Restricted Boltzmann Machine based workloads accelerated with FPGAs and OpenCL
dc.type: Article
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Sistemes informàtics
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1016/j.future.2019.10.025
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://www.sciencedirect.com/science/article/pii/S0167739X19313676
dc.rights.access: Open Access
local.identifier.drac: 26906540
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/639595/EU/Holistic Integration of Emerging Supercomputing Technologies/Hi-EST
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//SEV-2015-0493/ES/BARCELONA SUPERCOMPUTING CENTER - CENTRO. NACIONAL DE SUPERCOMPUTACION/
local.citation.author: Jaksic, Z.; Cadenelli, N.; Buchaca, D.; Polo, J.; Berral, J.; Carrera, D.
local.citation.publicationName: Future generation computer systems
local.citation.volume: 104
local.citation.number: March 2020
local.citation.startingPage: 201
local.citation.endingPage: 211



Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 4.0 International