Comparing fixed and adaptive computation time for recurrent neural networks
Cite as:
hdl:2117/118497
Document type: Conference paper
Publication date: 2018
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Deep networks commonly perform better than shallow ones, but allocating the proper amount of computation for each particular input sample remains an open problem. This issue is particularly challenging in sequential tasks, where the required complexity may vary across tokens in the input sequence. Adaptive Computation Time (ACT) was proposed as a method for dynamically adapting the computation at each step in Recurrent Neural Networks (RNNs). ACT introduces two main modifications to the regular RNN formulation: (1) more than one RNN step may be executed between the moment an input sample is fed to the layer and the moment the layer generates an output, and (2) this number of steps is dynamically predicted depending on the input token and the hidden state of the network. In our work, we aim at gaining intuition about the contribution of these two factors to the overall performance boost observed when augmenting RNNs with ACT. We design a new baseline, Repeat-RNN, which performs a constant number of RNN state updates larger than one before generating an output. Surprisingly, such a uniform distribution of the computational resources matches the performance of ACT in the studied tasks. We hope that this finding motivates new research efforts towards designing RNN architectures that are able to dynamically allocate computational resources.
TL;DR: Comparing Adaptive Computation Time with fixed computation time for RNNs gives surprising results.
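The Repeat-RNN baseline described in the abstract can be sketched in a few lines. The snippet below is an illustrative reconstruction, not the authors' code: it uses a plain tanh RNN cell and a hypothetical `repeats` hyperparameter for the fixed number of state updates applied to every input token, in contrast to ACT, which learns that number per token.

```python
import numpy as np

def rnn_step(x, h, W_xh, W_hh, b):
    # One vanilla RNN update: h' = tanh(W_xh x + W_hh h + b)
    return np.tanh(W_xh @ x + W_hh @ h + b)

def repeat_rnn(inputs, h0, W_xh, W_hh, b, repeats=3):
    """Repeat-RNN (sketch): a constant number of state updates per token.

    Unlike ACT, which predicts how many updates to take from the input
    token and hidden state, `repeats` here is a fixed hyperparameter > 1.
    """
    h = h0
    outputs = []
    for x in inputs:
        for _ in range(repeats):  # fixed, input-independent computation
            h = rnn_step(x, h, W_xh, W_hh, b)
        outputs.append(h)        # emit one output per input token
    return np.stack(outputs)
```

With `repeats=1` this reduces to a standard RNN; the paper's finding is that simply setting `repeats > 1` already matches ACT on the studied tasks.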
Citation: Fojo, D., Campos, V., Giro, X. Comparing fixed and adaptive computation time for recurrent neural networks. In: International Conference on Learning Representations. "Sixth International Conference on Learning Representations: Monday April 30-Thursday May 03, 2018, Vancouver Convention Center, Vancouver: [proceedings]". 2018, p. 1-8.
Publisher's version: https://iclr.cc/Conferences/2018/Schedule?type=Workshop
File | Description | Size | Format
---|---|---|---
giro2.pdf | Paper | 515.9 KB | PDF