Skip RNN: learning to skip state updates in recurrent neural networks

Document type: Conference lecture
Defense date: 2018
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Abstract
Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks. However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty in capturing long-term dependencies. In backpropagation through time settings, these issues are tightly coupled with the large, sequential computational graph that results from unfolding the RNN in time. We introduce the Skip RNN model, which extends existing RNN models by learning to skip state updates, thereby shortening the effective size of the computational graph. The model can also be encouraged to perform fewer state updates through a budget constraint. We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of the baseline RNN models. Source code is publicly available at https://imatge-upc.github.io/skiprnn-2017-telecombcn/
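The idea of skipping state updates under a budget can be illustrated with a minimal forward-pass sketch. Everything in it is an assumption made for illustration and is not taken from this record: the scalar tanh cell, the function name `skip_rnn_forward`, the parameter names (`w_x`, `w_h`, `w_p`, `b_p`, `cost_per_update`), the choice to start the update probability at 1, and the gate rule itself (round an update probability to 0 or 1, and let it accumulate while the state is being copied). Training, which would need a gradient through the rounding step, is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def skip_rnn_forward(xs, w_x, w_h, w_p, b_p, cost_per_update=0.0):
    """Run a scalar tanh RNN with a binary update gate over the sequence xs.

    Hypothetical sketch: returns the final state, the list of binary gate
    values (1 = state updated, 0 = state copied), and a budget penalty
    proportional to the number of updates performed.
    """
    s = 0.0        # hidden state
    u_tilde = 1.0  # update probability; starting at 1 is an assumption
    gates = []
    for x in xs:
        # Binarize the update probability into a hard decision.
        u = 1 if u_tilde >= 0.5 else 0
        if u == 1:
            # Regular RNN update; the probability is then reset.
            s = math.tanh(w_x * x + w_h * s)
            u_tilde = sigmoid(w_p * s + b_p)
        else:
            # Skip: the state is copied unchanged, and the probability
            # grows (capped at 1) so that an update eventually happens.
            delta = sigmoid(w_p * s + b_p)
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)
        gates.append(u)
    # Budget term: penalizes each state update that was actually used.
    budget_penalty = cost_per_update * sum(gates)
    return s, gates, budget_penalty


# Example: a gate that emits an increment of about 0.3 per step.
final_state, gates, penalty = skip_rnn_forward(
    [1.0, 0.5, -0.5, 0.2, 0.1, 0.9],
    w_x=0.8, w_h=0.5, w_p=0.0, b_p=math.log(0.3 / 0.7),
    cost_per_update=0.1,
)
print(gates, penalty)
```

In this sketch, a skipped step leaves the state untouched, so the chain of state-to-state dependencies only grows on steps where the gate is 1. Adding `budget_penalty` to a task loss would be one way to trade accuracy against the number of updates.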
Citation: Campos, V., Jou, B., Giro, X., Torres, J., Chang, S. Skip RNN: learning to skip state updates in recurrent neural networks. In: International Conference on Learning Representations. "Sixth International Conference on Learning Representations: Monday April 30-Thursday May 03, 2018, Vancouver Convention Center, Vancouver: [proceedings]". 2018, p. 1-17.
Publisher version: https://iclr.cc/Conferences/2018/Schedule?type=Poster
Collections
- GPI - Grup de Processament d'Imatge i Vídeo - Ponències/Comunicacions de congressos [316]
- CAP - Grup de Computació d'Altes Prestacions - Ponències/Comunicacions de congressos [782]
- Departament d'Arquitectura de Computadors - Ponències/Comunicacions de congressos [1.841]
- Departament de Teoria del Senyal i Comunicacions - Ponències/Comunicacions de congressos [3.228]