The dark side of DNN pruning

Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María; Riera Villanueva, Marc

doi:10.1109/ISCA.2018.00071



Document type: Conference paper

Publication date: 2018

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Access conditions: Restricted access per publisher policy

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.

Abstract

DNN pruning has been recently proposed as an effective technique to improve the energy-efficiency of DNN-based solutions. It is claimed that by removing unimportant or redundant connections, the pruned DNN delivers higher performance and energy-efficiency with negligible impact on accuracy. However, DNN pruning has an important side effect: it may reduce the confidence of DNN predictions. We show that, although top-1 accuracy may be maintained with DNN pruning, the likelihood of the class in the top-1 is significantly reduced when using the pruned models. For applications such as Automatic Speech Recognition (ASR), where the DNN scores are consumed by a successive stage, the workload of this stage can be dramatically increased due to the loss of confidence in the DNN. An ASR system consists of a DNN for computing acoustic scores, followed by a Viterbi beam search to find the most likely sequence of words. We show that, when pruning the DNN model used for acoustic scoring, the Word Error Rate (WER) is maintained but the execution time of the ASR system is increased by 33%. Although pruning improves the efficiency of the DNN, it results in a huge increase of activity in the Viterbi search since the output scores of the pruned model are less reliable. Based on this observation, we propose a novel hardware-based ASR system that effectively integrates a DNN accelerator for pruned models with a Viterbi accelerator. In order to avoid the aforementioned increase in Viterbi search workload, our system loosely selects the N-best hypotheses at every time step, exploring only the N most likely paths. To avoid an expensive sort of the hypotheses based on their likelihoods, our accelerator employs a set-associative hash table to keep track of the best paths mapped to each set. In practice, this solution approaches the selection of N-best, but it requires much simpler hardware. Our approach manages to efficiently combine both DNN pruning and Viterbi search, and achieves 9x energy savings and 4.2x speedup with respect to the state-of-the-art ASR solutions.
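The loose N-best selection described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the accelerator's actual design: the class name LooseNBestTable, the modulo indexing of states into sets, the replace-the-worst-entry policy, and the helper exact_n_best are all hypothetical choices made for illustration. The idea shown is that each set keeps only its own best entries, so no global sort is needed, and the table holds at most num_sets * ways hypotheses.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class LooseNBestTable:
    """Hypothetical model of a set-associative hypothesis table.

    Capacity is num_sets * ways, which plays the role of N.
    """
    num_sets: int
    ways: int
    sets: List[Dict[int, float]] = field(init=False)

    def __post_init__(self) -> None:
        self.sets = [{} for _ in range(self.num_sets)]

    def insert(self, state_id: int, log_likelihood: float) -> bool:
        """Try to store a hypothesis; return True if it was kept."""
        # Assumption: states are mapped to sets by a simple modulo hash.
        bucket = self.sets[state_id % self.num_sets]
        if state_id in bucket:
            # Same state reached by two paths: keep the better score.
            if log_likelihood > bucket[state_id]:
                bucket[state_id] = log_likelihood
                return True
            return False
        if len(bucket) < self.ways:
            bucket[state_id] = log_likelihood
            return True
        # Set is full: compare only against the worst entry of this set.
        worst = min(bucket, key=bucket.get)
        if log_likelihood > bucket[worst]:
            del bucket[worst]
            bucket[state_id] = log_likelihood
            return True
        return False

    def survivors(self) -> Dict[int, float]:
        """Return all hypotheses currently held, across every set."""
        return {s: ll for bucket in self.sets for s, ll in bucket.items()}


def exact_n_best(hyps: Iterable[Tuple[int, float]], n: int) -> Dict[int, float]:
    """Reference: exact N-best over states, using a global selection."""
    best: Dict[int, float] = {}
    for state_id, ll in hyps:
        if ll > best.get(state_id, float("-inf")):
            best[state_id] = ll
    return dict(heapq.nlargest(n, best.items(), key=lambda kv: kv[1]))
```

With 2 sets of 2 ways (N = 4) and the hypotheses (0, -1.0), (2, -5.0), (4, -2.0), (1, -3.0), (0, -0.5), states 0, 2 and 4 all map to set 0, so state 2 is evicted even though an exact 4-best selection would keep it. This is the sense in which the selection is "loose": it approximates N-best while comparing each new hypothesis only against the entries of one set.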

Citation: Yazdani, R., Riera, M., Arnau, J., Gonzalez Colas, A. The dark side of DNN pruning. In: International Symposium on Computer Architecture. "2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA 2018): Los Angeles, California, USA: 1-6 June 2018". Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 790-801.

URI: http://hdl.handle.net/2117/125141

DOI: 10.1109/ISCA.2018.00071

ISBN: 9781538659854

Publisher's version: https://ieeexplore.ieee.org/document/8416873
