Browse by subject "Automatic speech recognition"

Automatic speech recognition with deep neural networks for impaired speech

España-i-Bonet, Cristina; Rodríguez Fonollosa, José Adrián (Springer, 2016)
Conference report
Open access

Automatic Speech Recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech is a difficult task for two main reasons: data is (i) scarce and (ii) heterogeneous. ...

Automatic speech recognition with Kaldi toolkit

Rosillo Gil, Victor (Universitat Politècnica de Catalunya, 2016-02-08)
Bachelor's thesis
Open access
Carried out at/with: Akademia Górniczo-Hutnicza im. S. Staszica w Krakowie

The aim of this thesis is to build an accurate automatic speech recognition system using Kaldi, an open-source speech recognition toolkit written in C++, and freely available data. First of ...

Awareness, mobilisation and dissemination actions

Trandabăț, Diana; Cristea, Dan; Branco, Antonio; Mendes, Amalia; Pellegrini, Thomas; Ananiadou, Sophia; Thompson, Paul; Irimia, Elena; Tufis, Dan; Gilmenau, Georgiana; Rosner, Mike; Moreno Bilbao, M. Asunción; Bel, Núria (2012-01-31)
Research report
Open access

The central objective of the Metanet4u project is to contribute to the establishment of a pan-European digital platform that makes available language resources and services, encompassing both datasets and software tools, ...

BaNa: a noise resilient fundamental frequency detection algorithm for speech and music

Yang, Na; Ba, He; Cai, Weiyang; Demirkol, Ilker Seyfettin; Heinzelman, Wendi (2014-08-27)
Article
Open access

Fundamental frequency (F0) is one of the essential features in many acoustic related applications. Although numerous F0 detection algorithms have been developed, the detection accuracy in noisy environments still needs ...

Blind channel equalization using weighted subspace methods

Ruiz Feliu, Rafael; Cabrera-Bean, Margarita (Institute of Electrical and Electronics Engineers (IEEE), 1999)
Conference report
Open access

This paper addresses the problems of blind channel estimation and symbol detection with second order statistics methods from the received data. It can be shown that this problem is similar to direction of arrival (DOA) ...

Block-based Speech-to-Speech Translation

Roca, Sandra (Universitat Politècnica de Catalunya, 2018-10)
Bachelor's thesis
Open access

This thesis explores different ways of implementing a block-based Speech Translation system with the aim of generating large amounts of data to build a large parallel speech corpus. The first task consists ...

BUCEADOR hybrid TTS for blizzard challenge 2011

Sainz, Iñaki; Erro Eslava, Daniel; Navas, Eva; Adell Mercado, Jordi; Bonafonte Cávez, Antonio (2011)
Conference report
Open access

This paper describes the Text-to-Speech (TTS) systems presented by the Buceador Consortium in the Blizzard Challenge 2011 evaluation campaign. The main system is a concatenative hybrid one that tries to combine the strong ...

BUCEADOR, a multi-language search engine for digital libraries

Adell Mercado, Jordi; Bonafonte Cávez, Antonio; Cardenal, Antonio; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción; Navas, Eva; Rodríguez Banga, Eduardo (2012)
Conference lecture
Open access

This paper presents a web-based multimedia search engine built within the Buceador (www.buceador.org) research project. A proof-of-concept tool has been implemented which is able to retrieve information from a digital ...

Building synthetic voices in the METANET framework

Garcia Casademont, Emília; Bonafonte Cávez, Antonio; Moreno Bilbao, M. Asunción (2012)
Conference lecture
Open access

METANET4U is a European project aiming at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the ...

Casc per a personal de serveis d'emergència controlat per veu

Boada Torre-Marín, Jordi (Universitat Politècnica de Catalunya, 2017-07)
Bachelor's thesis
Open access

This project set out to create a prototype of personal protective equipment that allows interaction with different actuators through voice control, making the wearer's actions easier and safer in ...

Catalan Accent Classification by Voice using Deep Learning

Felip I Díaz, Bernat (Universitat Politècnica de Catalunya, 2023-05-25)
Master's thesis
Open access

Speech characterization is a vital field in artificial intelligence, yet accent classification is often overlooked, particularly for the Catalan language. This project is centered on the classification of Catalan accents ...

Channel selection measures for multi-microphone speech recognition

Nadeu Camprubí, Climent; Wolf, Martin (2014-02-01)
Article
Access restricted by publisher policy

Automatic speech recognition in a room with distant microphones is strongly affected by noise and reverberation. In scenarios where the speech signal is captured by several arbitrarily located microphones the degree of ...

Channel selection using N-best hypothesis for multi-microphone ASR

Wolf, Martin; Nadeu Camprubí, Climent (2013)
Conference report
Access restricted by publisher policy

If speech is captured by several arbitrarily-located microphones in a room, the degree of distortion by noise and reverberation may vary strongly from one channel to another. Channel selection for automatic speech recognition ...

Characterization of Speech Recognition Systems on GPU Architectures

Segura Salvador, Albert (Universitat Politècnica de Catalunya, 2016-07-04)
Master's thesis
Open access

This master's thesis characterizes the performance and energy bottlenecks of speech recognition systems when running on modern GPUs, with the aim of providing useful information for designing future GPU architectures, as well ...

Collaborative voting of 3D features for robust gesture estimation

van Sabben Alsina, Daniel; Ruiz Hidalgo, Javier; Suau Cuadros, Xavier; Casas Pla, Josep Ramon (Institute of Electrical and Electronics Engineers (IEEE), 2017)
Conference lecture
Open access

Human body analysis raises special interest because it enables a wide range of interactive applications. In this paper we present a gesture estimator that discriminates body poses in depth images. A novel collaborative ...

Combining phrase and neural-based machine translation: what worked and did not

Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2017)
Article
Access restricted by publisher policy

Phrase-based machine translation assumes that all words are at the same distance and translates them using feature functions that approximate the probability at different levels. On the other hand, neural machine translation ...

Computation reuse in DNNs by exploiting input similarity

Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
Conference report
Access restricted by publisher policy

In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific ...

Control remot per veu d'un robot Open Source

Sbert Cañellas, Antoni (Universitat Politècnica de Catalunya, 2019-06)
Bachelor's thesis
Open access

Nowadays, interacting with machines through speech recognition is very much in vogue. This way of interacting with machines has been under development for many years. This work will focus ...

Controlador de dispositivos por reconocimiento de voz (CDRV)

Roca Nonell, Aleix (Universitat Politècnica de Catalunya, 2014-12-11)
Degree final project/thesis
Open access

CDRV (Controlador de dispositivos por reconocimiento de voz) is a device capable of controlling other devices by voice. Specifically, for this project, it has been adapted to control a reclining armchair.

Controlling 3D holographic contents by personal devices

Barroso Laguna, Axel (Universitat Politècnica de Catalunya, 2014-12)
Degree final project/thesis
Open access
Carried out at/with: Politecnico di Torino

First, this project explains the different sensors of a personal device (e.g. a smartphone). After that, this study shows how to interact with holographic scenes. These scenes have been created with Blender. Blender ...