Computation reuse in DNNs by exploiting input similarity

Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María

doi:10.1109/ISCA.2018.00016

dc.contributor.author	Riera Villanueva, Marc
dc.contributor.author	Arnau Montañés, José María
dc.contributor.author	González Colás, Antonio María
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2018-11-28T18:24:59Z
dc.date.issued	2018
dc.identifier.citation	Riera, M., Arnau, J., Gonzalez, A. Computation reuse in DNNs by exploiting input similarity. A: International Symposium on Computer Architecture. "2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA 2018): Los Angeles, California, USA: 1-6 June 2018". Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 57-68.
dc.identifier.isbn	9781538659854
dc.identifier.uri	http://hdl.handle.net/2117/125204
dc.description.abstract	In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific area of research, resulting in a plethora of techniques for energy-efficient DNN inference. However, previous proposals focus on a single execution of a DNN. Popular applications, such as speech recognition or video classification, require multiple back-to-back executions of a DNN to process a sequence of inputs (e.g., audio frames, images). In this paper, we show that consecutive inputs exhibit a high degree of similarity, causing the inputs/outputs of the different layers to be extremely similar for successive frames of speech or images of a video. Based on this observation, we propose a technique to reuse some results of the previous execution, instead of computing the entire DNN. Computations related to inputs with negligible changes can be avoided with minor impact on accuracy, saving a large percentage of computations and memory accesses. We propose an implementation of our reuse-based inference scheme on top of a state-of-the-art DNN accelerator. Results show that, on average, more than 60% of the inputs of any neural network layer tested exhibit negligible changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs results in 63% energy savings on average.
dc.format.extent	12 p.
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.subject	UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
dc.subject.lcsh	Automatic speech recognition
dc.subject.other	Computation reuse
dc.subject.other	DNN
dc.subject.other	Hardware accelerator
dc.subject.other	Input similarity
dc.subject.other	Decision making
dc.subject.other	Energy conservation
dc.subject.other	Energy efficiency
dc.subject.other	Memory architecture
dc.subject.other	Network architecture
dc.subject.other	Network layers
dc.subject.other	Program processors
dc.subject.other	Speech recognition
dc.subject.other	Back-to-back execution
dc.subject.other	Degree of similarity
dc.subject.other	Different layers
dc.subject.other	Energy efficient
dc.subject.other	Hardware accelerators
dc.subject.other	Video classification
dc.subject.other	Deep neural networks
dc.title	Computation reuse in DNNs by exploiting input similarity
dc.type	Conference report
dc.subject.lemac	Automatic speech recognition
dc.contributor.group	Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
dc.identifier.doi	10.1109/ISCA.2018.00016
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	https://ieeexplore.ieee.org/document/8416818
dc.rights.access	Restricted access - publisher's policy
local.identifier.drac	23527231
dc.description.version	Postprint (published version)
dc.relation.projectid	info:eu-repo/grantAgreement/MINECO/1PE/TIN2016-75344-R
dc.date.lift	10000-01-01
local.citation.author	Riera, M.; Arnau, J.; Gonzalez, A.
local.citation.contributor	International Symposium on Computer Architecture
local.citation.publicationName	2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA 2018): Los Angeles, California, USA: 1-6 June 2018
local.citation.startingPage	57
local.citation.endingPage	68
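
The reuse idea summarized in dc.description.abstract can be illustrated with a minimal sketch for a single fully-connected layer. This is not the accelerator design from the paper: the class name ReuseFCLayer, the threshold parameter, and the delta-update rule are all assumptions made for illustration. The sketch keeps the output sums of the previous execution and only adds a correction for inputs whose change exceeds the threshold.

```python
# Sketch only: threshold-based computation reuse for one fully-connected layer.
# ReuseFCLayer, `threshold`, and the delta-update rule are illustrative
# assumptions, not the implementation described in the paper.

class ReuseFCLayer:
    def __init__(self, weights, threshold):
        # weights[i][j] connects input i to output j
        self.weights = weights
        self.threshold = threshold
        self.prev_inputs = None  # input values the stored sums were built from
        self.prev_sums = None    # output sums kept from the previous execution

    def forward(self, inputs):
        """Return (output sums, number of inputs whose weights were read)."""
        n_out = len(self.weights[0])
        if self.prev_inputs is None:
            # First execution: nothing to reuse, compute every product.
            sums = [0.0] * n_out
            for i, x in enumerate(inputs):
                for j in range(n_out):
                    sums[j] += x * self.weights[i][j]
            self.prev_inputs = list(inputs)
            self.prev_sums = sums
            return list(sums), len(inputs)

        processed = 0
        for i, x in enumerate(inputs):
            delta = x - self.prev_inputs[i]
            if abs(delta) <= self.threshold:
                # Negligible change: skip the weight reads and multiplications.
                continue
            # Correct the stored sums by the contribution of the change.
            for j in range(n_out):
                self.prev_sums[j] += delta * self.weights[i][j]
            self.prev_inputs[i] = x
            processed += 1
        return list(self.prev_sums), processed


layer = ReuseFCLayer([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], threshold=0.05)
first, n_first = layer.forward([1.0, 1.0, 1.0])     # all 3 inputs processed
second, n_second = layer.forward([1.0, 1.01, 2.0])  # only the last input changed enough
print(first, n_first, second, n_second)
```

In this sketch a skipped input keeps its old stored value, so the error introduced per input stays bounded by the threshold instead of accumulating across executions.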

Files in this item

Name: 08416818.pdf
Size: 379.0 KB
Format: PDF

This item appears in the following collections

Conference papers/presentations [187]
Conference papers/presentations [1,954]