dc.contributor.author: Silfa Feliz, Franyell Antonio
dc.contributor.author: Arnau Montañés, José María
dc.contributor.author: González Colás, Antonio María
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2021-04-29T12:32:51Z
dc.date.available: 2021-04-29T12:32:51Z
dc.date.issued: 2020
dc.identifier.citation: Silfa, F.A.; Arnau, J.; González, A. Boosting LSTM performance through dynamic precision selection. A: International Symposium on High Performance Computing. "2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics, HiPC 2020: 16-18 December 2020, Pune, India (virtual event): proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 323-333. ISBN 978-0-7381-1035-6. DOI 10.1109/HiPC50609.2020.00046.
dc.identifier.isbn: 978-0-7381-1035-6
dc.identifier.uri: http://hdl.handle.net/2117/344816
dc.description.abstract: The use of low numerical precision is a fundamental optimization included in modern accelerators for Deep Neural Networks (DNNs). The number of bits of the numerical representation is set to the minimum precision that is able to retain accuracy based on an offline profiling, and it is kept constant for DNN inference. In this work, we explore the use of dynamic precision selection during DNN inference. We focus on Long Short Term Memory (LSTM) networks, which represent the state-of-the-art networks for applications such as machine translation and speech recognition. Unlike conventional DNNs, LSTM networks remember information from previous evaluations by storing data in the LSTM cell state. Our key observation is that the cell state determines the amount of precision required: time-steps where the cell state changes significantly require higher precision, whereas time-steps where the cell state is stable can be computed with lower precision without any loss in accuracy. We propose a novel hardware scheme that tracks the evolution of the elements in the LSTM cell state and dynamically selects the appropriate precision on each time-step. For a set of popular LSTM networks, it chooses the lowest precision for 57% of the time, outperforming systems that fix the precision statically. We evaluate our proposal on top of a modern highly-optimized LSTM accelerator, and show that it provides 1.46x speedup and 19.2% energy savings on average without degrading the model accuracy. Our scheme has an overhead of less than 8%.
dc.description.sponsorship: This work has been supported by the CoCoUnit ERC Advanced Grant of the EU's Horizon 2020 program (grant No 833057), the Spanish State Research Agency under grant TIN2016-75344-R (AEI/FEDER, EU), the ICREA Academia program, and the Fundación Carolina and PUCMM by a scholarship.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: High performance computing
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.other: RNNs
dc.subject.other: Long short term memory
dc.subject.other: Accelerators
dc.subject.other: Quantization
dc.title: Boosting LSTM performance through dynamic precision selection
dc.type: Conference report
dc.subject.lemac: Càlcul intensiu (Informàtica)
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.contributor.group: Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
dc.identifier.doi: 10.1109/HiPC50609.2020.00046
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/9406683
dc.rights.access: Open Access
local.identifier.drac: 30433147
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/833057/EU/CoCoUnit: An Energy-Efficient Processing Unit for Cognitive Computing/CoCoUnit
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO/1PN/TIN2016-75344-R
local.citation.author: Silfa, F.A.; Arnau, J.; González, A.
local.citation.contributor: International Symposium on High Performance Computing
local.citation.publicationName: 2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics, HiPC 2020: 16-18 December 2020, Pune, India (virtual event): proceedings
local.citation.startingPage: 323
local.citation.endingPage: 333
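
Illustrative note on the abstract: the key observation recorded above (time-steps where the cell state changes significantly need higher precision, stable time-steps can use lower precision) can be pictured with a small software sketch. This is not the authors' hardware scheme, which the record does not describe in detail. The change metric (mean absolute element-wise difference), the threshold of 0.1, the 4-bit and 8-bit widths, and all function names are assumptions chosen only for illustration.

```python
from typing import List, Sequence


def mean_abs_change(prev_cell: Sequence[float], cell: Sequence[float]) -> float:
    """Average absolute element-wise change between two cell-state vectors."""
    if len(prev_cell) != len(cell):
        raise ValueError("cell-state vectors must have the same length")
    if not cell:
        return 0.0
    return sum(abs(c - p) for p, c in zip(prev_cell, cell)) / len(cell)


def select_precision(
    prev_cell: Sequence[float],
    cell: Sequence[float],
    threshold: float = 0.1,  # hypothetical value, not taken from the paper
    low_bits: int = 4,       # hypothetical low-precision width
    high_bits: int = 8,      # hypothetical high-precision width
) -> int:
    """Return the bit width to use: high if the cell state moved a lot, else low."""
    return high_bits if mean_abs_change(prev_cell, cell) > threshold else low_bits


def precision_schedule(
    cell_states: Sequence[Sequence[float]],
    threshold: float = 0.1,
    low_bits: int = 4,
    high_bits: int = 8,
) -> List[int]:
    """Bit width selected for each pair of consecutive cell states."""
    return [
        select_precision(prev, cur, threshold, low_bits, high_bits)
        for prev, cur in zip(cell_states, cell_states[1:])
    ]


if __name__ == "__main__":
    # Toy cell-state trace: one large jump, then a nearly stable state.
    states = [[0.0, 0.0], [0.5, -0.5], [0.52, -0.49], [0.52, -0.49]]
    print(precision_schedule(states))
```

In this toy trace the first transition is large and selects the wider format, while the later, nearly stable transitions select the narrower one. A real design would choose the metric, threshold, and bit widths from profiling rather than from these placeholder values.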



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.