Browse by subject "Automatic speech recognition"

Corpus for cyberbullying prevention

Moreno Bilbao, M. Asunción; Bonafonte Cávez, Antonio; Jauk, Igor; Tarrés, Laia; Pereira, Victor (International Speech Communication Association (ISCA), 2018)
Conference proceedings paper
Open access

Cyberbullying is the use of digital media to harass a person or group of people, through personal attacks, disclosure of confidential or false information, among other means. That is to say, it ...

Corpus selection

Adda, Gilles; Barras, Claude; Kemal Ekenel, Hazim; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier (2013-03-31)
Research report
Open access

Deliverable of the project Collaborative Annotation of multi-MOdal, MultI-Lingual and multi-mEdia documents. This document describes the different corpora that will be used during the Camomile project.

Creating expressive synthetic voices by unsupervised clustering of audiobooks

Jauk, Igor; Bonafonte Cávez, Antonio; López Otero, Paula; Docio Fernández, Laura (International Speech Communication Association (ISCA), 2015)
Conference paper
Restricted access (publisher's policy)

In this work we design an approach for automatic feature selection and voice creation for expressive synthesis. Our approach is guided by two main goals: (1) increasing the flexibility of expressive voice creation and (2) ...

Deep learning backend for single and multisession i-vector speaker recognition

Ghahabi Esfahani, Omid; Hernando Pericás, Francisco Javier (2017-04-01)
Article
Open access

The lack of labeled background data creates a large performance gap between the cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring baseline techniques for i-vectors in speaker recognition. Although there are some ...

Deep Learning for Demographic Classification by Speech

Navarrete Jiménez, Daniel (Universitat Politècnica de Catalunya, 2022-10-26)
Master's thesis
Open access

Speech characterization is a challenging task and one of the most relevant challenges in AI. Moreover, it is a field of study with minimal scope in the Catalan language. In this work, we try to perform a demographic ...

Deep learning for i-vector speaker and language recognition

Ghahabi Esfahani, Omid (Universitat Politècnica de Catalunya, 2018-05-29)
Doctoral thesis
Open access

Over the last few years, i-vectors have been the state-of-the-art technique in speaker and language recognition. Recent advances in Deep Learning (DL) technology have improved the quality of i-vectors but the DL techniques ...

Deep Neural Networks for Channel Compensated i-Vectors in Speaker Recognition

Jiménez Sanfiz, Albert (Universitat Politècnica de Catalunya, 2014-06)
Bachelor's thesis
Open access

This thesis explores the application of channel-compensation techniques in speaker verification and their subsequent combination with deep learning technologies. The idea is to reduce the degradation of the performance due ...

Deep neural networks for i-vector language identification of short utterances in cars

Ghahabi Esfahani, Omid; Bonafonte Cávez, Antonio; Hernando Pericás, Francisco Javier; Moreno Bilbao, M. Asunción (International Speech Communication Association (ISCA), 2016)
Conference proceedings paper
Restricted access (publisher's policy)

This paper is focused on the application of the Language Identification (LID) technology for intelligent vehicles. We cope with short sentences or words spoken in moving cars in four languages: English, Spanish, German, ...

DeepVoice: tecnologías de aprendizaje profundo aplicadas al procesado de voz y audio

Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2017-09-01)
Article
Open access

This project proposes the development of new deep learning methods for speech and audio processing, exploring new applications and continuing the initial work of the research team and the international community. Research ...

DeepVoice: tecnologías de aprendizaje profundo aplicadas al procesado de voz y audio

Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (2017-09-22)
Article
Open access

This project proposes the development of new architectures for speech and audio processing using deep learning methods, also exploring new applications and continuing the initial work of the ...

Demisyllable based Spanish Number Recognition Experiments

Mariño Acebal, José Bernardo; Nadeu Camprubí, Climent; Lleida Solano, Eduardo (1987)
Conference proceedings paper
Open access

The main features of our demisyllable-based continuous speech recognition system (RAMSES) are shown. Special attention is paid to demisyllable definition and the syntactic constraints used with the dynamic programming ...

Design and implementation of SIP VoIP Adapter

Guixà Ibàñez, Adrià (Universitat Politècnica de Catalunya, 2009-12-15)
Degree final project
Open access

The SIP VoIP Adapter is a Java application that is able to establish a SIP communication acting as a User Agent, which uses an external device as a sound device, to play and acquire the audio from the call established ...

Design, development, and evaluation of a real-time facial expression and speech emotion recognition system

Borràs Duarte, Marta (Universitat Politècnica de Catalunya, 2023-10-20)
Bachelor's thesis
Open access

This thesis presents the design, development, and evaluation of a real-time emotion recognition system for medical applications. It enables remote monitoring of the emotional state of patients ...

Despliegue y análisis de un escenario de telefonía IP

Sanz Pages, Francesc (Universitat Politècnica de Catalunya, 2011-09-29)
Bachelor's thesis
Open access

Detecció i classificació de sons superposats

León Gimeno, Marc (Universitat Politècnica de Catalunya, 2021-07-12)
Bachelor's thesis
Open access

The DCASE Challenge is an annual international competition for evaluating audio detection and classification systems. The challenges usually involve various tasks, such as the detection of sounds ...

Detection and handling of overlapping speech for speaker diarization

Zelenak, Martin; Hernando Pericás, Francisco Javier (2012)
Conference proceedings paper
Open access

This thesis concerns the detection of overlapping speech segments and its further application for the improvement of speaker diarization performance. We propose the use of three spatial cross-correlation-based parameters ...

Digui: a flexible dialogue system for guiding the user interaction to access web services

González Bermúdez, Meritxell (Universitat Politècnica de Catalunya, 2010-10-22)
Doctoral thesis
Open access

Current dialogue systems can handle friendly and collaborative communication that supports diverse types of interactions, such as menus in which the user is asked to choose an option, form filling in which the user is ...

Diseño e implementación de un control por voz para un robot de cocina inteligente

Barbero Carbonell, David (Universitat Politècnica de Catalunya, 2020-09-15)
Bachelor's thesis
Restricted access (confidentiality agreement)
Carried out at/with: Ondho Enmul

Diseño e implementación de un sistema de control por voz

Montero Mata, Jordi (Universitat Politècnica de Catalunya, 2008-04-21)
Degree final project
Open access

This final degree project consists of the design and implementation of a speech recognition system with a user-friendly interface that lets the user operate the system easily. This system has as ...

Disseny i aplicació d’un sistema de reconeixement de veu

Tarradas i Juan, Josep (Universitat Politècnica de Catalunya, 2015-09)
Bachelor's thesis
Open access

This project presents the design of a speech recognition application. The application has two functions: speech-to-text conversion and text-to-speech conversion. First, we analyzed the different ...