dc.contributor.author: Silfa Feliz, Franyell Antonio
dc.contributor.author: Dot Artigas, Gem
dc.contributor.author: Arnau Montañés, José María
dc.contributor.author: González Colás, Antonio María
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2020-02-05T16:12:02Z
dc.date.available: 2020-02-05T16:12:02Z
dc.date.issued: 2019
dc.identifier.citation: Silfa, F.A. [et al.]. Neuron-level fuzzy memoization in RNNs. In: Annual IEEE/ACM International Symposium on Microarchitecture. "MICRO-52: the 52nd Annual IEEE/ACM International Symposium on Microarchitecture: proceedings: October 12-16, 2019: Columbus, Ohio, USA". New York: Association for Computing Machinery (ACM), 2019, p. 782-793.
dc.identifier.isbn: 978-1-4503-6938-1
dc.identifier.uri: http://hdl.handle.net/2117/176878
dc.description: The final publication is available at ACM via http://dx.doi.org/10.1145/3352460.3358309
dc.description.abstract: Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition and machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the accuracy of future predictions and, therefore, they are very effective for sequence processing problems. For each application run, each recurrent layer is executed many times to process a potentially large sequence of inputs (words, images, audio frames, etc.). In this paper, we observe that the output of a neuron exhibits small changes in consecutive invocations. We exploit this property to build a neuron-level fuzzy memoization scheme, which dynamically caches the output of each neuron and reuses it whenever the current output is predicted to be similar to a previously computed result, thereby avoiding the output computation. The main challenge in this scheme is determining whether the neuron's output for the current input in the sequence will be similar to a recently computed result. To this end, we extend the recurrent layer with a much simpler Bitwise Neural Network (BNN) and show that the BNN and RNN outputs are highly correlated: if two BNN outputs are very similar, the corresponding outputs in the original RNN layer are likely to exhibit negligible changes. The BNN provides a low-cost and effective mechanism for deciding when fuzzy memoization can be applied with a small impact on accuracy. We evaluate our memoization scheme on top of a state-of-the-art accelerator for RNNs, for a variety of neural networks from multiple application domains. We show that our technique avoids more than 24.2% of computations, resulting in 18.5% energy savings and a 1.35x speedup on average.
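The abstract outlines the mechanism in prose; the Python sketch below illustrates the core idea of neuron-level fuzzy memoization under stated assumptions. It is not the authors' implementation: the class name FuzzyMemoNeuron, the similarity threshold theta, the tanh activation, and the use of a normalized dot product over sign-binarized weights and inputs as the BNN predictor are all hypothetical choices made for illustration.

import numpy as np

def binarize(v):
    # Sign binarization: map each element to +1 or -1 (the "bitwise" representation).
    return np.where(v >= 0.0, 1.0, -1.0)

class FuzzyMemoNeuron:
    # Caches the last full-precision output together with the BNN output that
    # accompanied it. On each invocation the cheap BNN output is computed first;
    # if it is within `theta` of the cached BNN output, the cached full-precision
    # output is reused and the expensive full dot product is skipped.
    def __init__(self, weights, theta=0.05):
        self.w = np.asarray(weights, dtype=np.float64)
        self.bw = binarize(self.w)   # binarized weights for the cheap predictor
        self.theta = theta           # similarity threshold (assumed value)
        self.cached_bnn = None       # BNN output recorded at caching time
        self.cached_out = None       # cached full-precision neuron output

    def forward(self, x):
        x = np.asarray(x, dtype=np.float64)
        # Cheap predictor: normalized dot product of binarized weights and inputs.
        bnn_out = np.dot(self.bw, binarize(x)) / len(x)
        if self.cached_bnn is not None and abs(bnn_out - self.cached_bnn) <= self.theta:
            return self.cached_out   # memoization hit: reuse, skip the computation
        out = np.tanh(np.dot(self.w, x))  # full-precision evaluation on a miss
        self.cached_bnn, self.cached_out = bnn_out, out
        return out

# Example: consecutive inputs in a sequence often change little, so most calls
# after the first one hit the cache.
rng = np.random.default_rng(0)
neuron = FuzzyMemoNeuron(rng.standard_normal(128))
base = rng.standard_normal(128)
outputs = [neuron.forward(base + 0.01 * rng.standard_normal(128)) for _ in range(5)]

In the paper this decision is made per neuron inside an RNN accelerator; the sketch only captures the hit/miss logic for a single neuron.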
dc.format.extent: 12 p.
dc.language.iso: eng
dc.publisher: Association for Computing Machinery (ACM)
dc.subject: Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic
dc.subject: Àrees temàtiques de la UPC::Informàtica
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Automatic speech recognition
dc.subject.other: Recurrent neural networks
dc.subject.other: Long short-term memory
dc.subject.other: Binary networks
dc.subject.other: Memoization
dc.subject.other: Machine learning
dc.title: Neuron-level fuzzy memoization in RNNs
dc.type: Conference report
dc.subject.lemac: Xarxes neuronals (Informàtica)
dc.subject.lemac: Reconeixement automàtic de la parla
dc.contributor.group: Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
dc.identifier.doi: 10.1145/3352460.3358309
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://dl.acm.org/doi/abs/10.1145/3352460.3358309
dc.rights.access: Open Access
local.identifier.drac: 26580762
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO/1PE/TIN2016-75344-R
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/833057/EU/CoCoUnit: An Energy-Efficient Processing Unit for Cognitive Computing/CoCoUnit
local.citation.author: Silfa, F.A.; Dot, G.; Arnau, J.; Gonzalez, A.
local.citation.contributor: Annual IEEE/ACM International Symposium on Microarchitecture
local.citation.pubplace: New York
local.citation.publicationName: MICRO-52: the 52nd Annual IEEE/ACM International Symposium on Microarchitecture: proceedings: October 12-16, 2019: Columbus, Ohio, USA
local.citation.startingPage: 782
local.citation.endingPage: 793


All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.