Efficient keyword spotting by capturing long-range interactions with temporal lambda networks
10.1109/ASRU51503.2021.9687932
Cite as:
hdl:2117/366331
Document type: Conference paper
Publication date: 2021
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
Models based on attention mechanisms have shown unprecedented speech recognition performance. However, they are computationally expensive and unnecessarily complex for keyword spotting, a task aimed at small-footprint devices. This work explores the application of Lambda networks, an alternative framework for capturing long-range interactions without attention, to the keyword spotting task. We propose a novel ResNet-based model in which the residual blocks are replaced by temporal Lambda layers. Furthermore, the proposed architecture is built upon one-dimensional temporal convolutions that further reduce its complexity. The presented model not only reaches state-of-the-art accuracies on the Google Speech Commands dataset, but is also 85% and 65% lighter than its Transformer-based (KWT) and convolutional (ResNet15) counterparts, respectively, while being up to 100× faster. To the best of our knowledge, this is the first attempt to explore the Lambda framework within the speech domain, and we therefore open the way for further research into new interfaces based on this architecture.
Citation: Tura, B. [et al.]. Efficient keyword spotting by capturing long-range interactions with temporal lambda networks. In: IEEE Automatic Speech Recognition and Understanding Workshop. "2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) Proceedings". 2021, p. 146-153. ISBN 978-1-6654-3739-4. DOI 10.1109/ASRU51503.2021.9687932.
ISBN: 978-1-6654-3739-4
Publisher version: https://ieeexplore.ieee.org/document/9687932
Other identifiers: https://arxiv.org/pdf/2104.08086.pdf
Files | Description | Size | Format | View |
---|---|---|---|---|
LambdaNetworks___ASRU_2021.pdf | | 985.5 KB | PDF | View/Open |