Comparing fixed and adaptive computation time for recurrent neural networks
Cite as:
hdl:2117/118497
Document type: Conference paper
Publication date: 2018
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Deep networks commonly perform better than shallow ones, but allocating the proper amount of computation for each particular input sample remains an open problem. This issue is particularly challenging in sequential tasks, where the required complexity may vary across tokens in the input sequence. Adaptive Computation Time (ACT) was proposed as a method for dynamically adapting the computation at each step in Recurrent Neural Networks (RNNs). ACT introduces two main modifications to the regular RNN formulation: (1) more than one RNN step may be executed between the moment an input sample is fed to the layer and the moment the layer generates an output, and (2) this number of steps is dynamically predicted depending on the input token and the hidden state of the network. In our work, we aim at gaining intuition about the contribution of these two factors to the overall performance boost observed when augmenting RNNs with ACT. We design a new baseline, Repeat-RNN, which performs a constant number of RNN state updates larger than one before generating an output. Surprisingly, such a uniform distribution of the computational resources matches the performance of ACT in the studied tasks. We hope that this finding motivates new research efforts towards designing RNN architectures that are able to dynamically allocate computational resources.
TL;DR: Comparing Adaptive Computation Time with fixed computation time for RNNs gives surprising results.
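The Repeat-RNN baseline described in the abstract can be sketched in a few lines. The snippet below is an illustrative reconstruction, not the authors' code: it uses a plain tanh RNN cell and a hypothetical `repeats` hyperparameter for the fixed number of state updates applied to every input token, in contrast to ACT, which learns that number per token.

```python
import numpy as np

def rnn_step(x, h, W_xh, W_hh, b):
    # One vanilla RNN update: h' = tanh(W_xh x + W_hh h + b)
    return np.tanh(W_xh @ x + W_hh @ h + b)

def repeat_rnn(inputs, h0, W_xh, W_hh, b, repeats=3):
    """Repeat-RNN (sketch): a constant number of state updates per token.

    Unlike ACT, which predicts how many updates to take from the input
    token and hidden state, `repeats` here is a fixed hyperparameter > 1.
    """
    h = h0
    outputs = []
    for x in inputs:
        for _ in range(repeats):  # fixed, input-independent computation
            h = rnn_step(x, h, W_xh, W_hh, b)
        outputs.append(h)        # emit one output per input token
    return np.stack(outputs)
```

With `repeats=1` this reduces to a standard RNN; the paper's finding is that simply setting `repeats > 1` already matches ACT on the studied tasks.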
Citation: Fojo, D., Campos, V., Giro, X. Comparing fixed and adaptive computation time for recurrent neural networks. In: International Conference on Learning Representations. "Sixth International Conference on Learning Representations: Monday April 30-Thursday May 03, 2018, Vancouver Convention Center, Vancouver: [proceedings]". 2018, p. 1-8.
Publisher's version: https://iclr.cc/Conferences/2018/Schedule?type=Workshop
File | Description | Size | Format
---|---|---|---
giro2.pdf | Paper | 515.9 KB | PDF