Show simple item record

dc.contributor.author: Riera Villanueva, Marc
dc.contributor.author: Arnau Montañés, José María
dc.contributor.author: González Colás, Antonio María
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2018-11-28T18:24:59Z
dc.date.issued: 2018
dc.identifier.citation: Riera, M., Arnau, J., Gonzalez, A. Computation reuse in DNNs by exploiting input similarity. In: International Symposium on Computer Architecture. "2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA 2018): Los Angeles, California, USA: 1-6 June 2018". Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 57-68.
dc.identifier.isbn: 9781538659854
dc.identifier.uri: http://hdl.handle.net/2117/125204
dc.description.abstract: In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific area of research, resulting in a plethora of techniques for energy-efficient DNN inference. However, previous proposals focus on a single execution of a DNN. Popular applications, such as speech recognition or video classification, require multiple back-to-back executions of a DNN to process a sequence of inputs (e.g., audio frames, images). In this paper, we show that consecutive inputs exhibit a high degree of similarity, causing the inputs/outputs of the different layers to be extremely similar for successive frames of speech or images of a video. Based on this observation, we propose a technique to reuse some results of the previous execution, instead of computing the entire DNN. Computations related to inputs with negligible changes can be avoided with minor impact on accuracy, saving a large percentage of computations and memory accesses. We propose an implementation of our reuse-based inference scheme on top of a state-of-the-art DNN accelerator. Results show that, on average, more than 60% of the inputs of any neural network layer tested exhibit negligible changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs results in 63% energy savings on average.
dc.format.extent: 12 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Àrees temàtiques de la UPC::Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic
dc.subject.lcsh: Automatic speech recognition
dc.subject.other: Computation reuse
dc.subject.other: DNN
dc.subject.other: Hardware accelerator
dc.subject.other: Input similarity
dc.subject.other: Decision making
dc.subject.other: Energy conservation
dc.subject.other: Energy efficiency
dc.subject.other: Memory architecture
dc.subject.other: Network architecture
dc.subject.other: Network layers
dc.subject.other: Program processors
dc.subject.other: Speech recognition
dc.subject.other: Back-to-back execution
dc.subject.other: Degree of similarity
dc.subject.other: Different layers
dc.subject.other: Energy efficient
dc.subject.other: Hardware accelerators
dc.subject.other: Video classification
dc.subject.other: Deep neural networks
dc.title: Computation reuse in DNNs by exploiting input similarity
dc.type: Conference report
dc.subject.lemac: Reconeixement automàtic de la parla
dc.contributor.group: Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
dc.identifier.doi: 10.1109/ISCA.2018.00016
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/8416818
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 23527231
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO/1PE/TIN2016-75344-R
dc.date.lift: 10000-01-01
local.citation.author: Riera, M.; Arnau, J.; Gonzalez, A.
local.citation.contributor: International Symposium on Computer Architecture
local.citation.publicationName: 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA 2018): Los Angeles, California, USA: 1-6 June 2018
local.citation.startingPage: 57
local.citation.endingPage: 68
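The abstract describes skipping computations for inputs whose value barely changed since the previous execution, and reusing the prior results. As a minimal illustrative sketch (not the paper's accelerator implementation, which is done in hardware; the function name, threshold value, and incremental-update formulation are assumptions for illustration), the idea for a single fully-connected layer y = W·x can be expressed as an incremental update: only the columns of W whose input changed by more than a threshold contribute to correcting the previous output.

```python
def reuse_linear(W, x_prev, y_prev, x_new, threshold=0.05):
    """Hypothetical software sketch of reuse-based inference for one
    fully-connected layer. W is a list of rows (n_out x n_in),
    y_prev must equal W @ x_prev from the previous execution.
    Inputs that changed by <= threshold are treated as unchanged, so
    their multiply-accumulates (and weight fetches) are skipped."""
    n_out, n_in = len(W), len(x_new)
    changed = [abs(x_new[j] - x_prev[j]) > threshold for j in range(n_in)]
    y_new = list(y_prev)  # start from the reused previous output
    for j in range(n_in):
        if changed[j]:
            # Add only the delta contribution of inputs that changed.
            d = x_new[j] - x_prev[j]
            for i in range(n_out):
                y_new[i] += W[i][j] * d
    reuse_fraction = 1 - sum(changed) / n_in  # share of skipped inputs
    return y_new, reuse_fraction
```

With slowly varying inputs (e.g., consecutive audio frames), most entries of `changed` are False, so most weight reads and multiplications are avoided, which is the source of the energy savings the abstract reports.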


Files in this item


This item appears in the following collection(s)
