Browse by subject "Speech processing"

Aplicación Android de movilidad de invidentes

Lidó Monzón, Ingrid (Universitat Politècnica de Catalunya, 2011-05-09)
Bachelor's thesis
Open access

In this project, part of an Android mobility application for blind users has been developed. The destination is entered by voice and, from there, the user is guided using several tools. ...

APVQ encoder applied to wideband speech coding

Salavedra Molí, Josep; Masgrau Gómez, Enrique José (Institute of Electrical and Electronics Engineers (IEEE), 1996)
Conference proceedings paper
Open access

The paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). The authors present a wideband speech encoder called APVQ (adaptive predictive vector quantization). It combines subband coding, vector ...

AR modeling of the speech autocorrelation to improve noisy speech recognition

Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1992)
Conference proceedings paper
Open access

Speech recognition in noisy environments remains an unsolved problem, even in the case of isolated word recognition with small vocabularies. Recently, several techniques have been proposed to alleviate this problem. Specifically, ...

Asignación secuencial de canales para tráfico de voz y datos en entornos móviles celulares

Molina Castillo, Pilar (Universitat Politècnica de Catalunya, 2007-11-22)
Degree final project/thesis
Open access

This document studies different models for assigning time slots (channels) in radio networks, with the aim of obtaining the largest number of consecutive free time slots, taking into account ...

Audio classification experiments in a neonatal intensive care unit

Sólvez Pérez, Sergi (Universitat Politècnica de Catalunya, 2014-06-25)
Degree final project/thesis
Open access

[ENGLISH] Newborns delivered at a gestational age of 24-32 weeks commonly have health problems. The use of a Neonatal Intensive Care Unit (NICU) is, in most cases, crucial for their survival. Nowadays, it is known ...

Augment de dades de veu per a sistemes de processament de la parla

Falceto Piñol, Anna (Universitat Politècnica de Catalunya, 2023-01-31)
Bachelor's thesis
Open access

We live in an era where intelligent systems are becoming more and more part of our lives. These systems require a large amount of data to learn different tasks and, in many cases, not enough content is available to train ...

Auto-encoding nearest neighbor i-vectors for speaker verification

Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
Conference presentation
Open access

In recent years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...

Bandwidth extension of narrowband speech

Expósito Pérez, Miquel; Salavedra Molí, Josep (Universidad Politécnica de Valencia, 2014)
Conference proceedings paper
Open access

Recently, 4G mobile phone systems have been designed to process wideband speech signals whose sampling frequency is 16 kHz. However, most of the mobile and classical phone network, and current 3G mobile phones, still ...

Bit-slice implementation of a linear predictive vocoder

Vázquez Grau, Gregorio; Gasull Llampallas, Antoni (1985)
Conference proceedings paper
Open access

A digital 16-bit high-speed general-purpose signal-processor is shown. The main objective has been the implementation of a linear predictive vocoder for obtaining real-time speech compression. For real-time digital speech ...

Building synthetic voices in the META-NET framework

Garcia Casademont, Emília; Bonafonte Cávez, Antonio; Moreno Bilbao, M. Asunción (2012)
Conference proceedings paper
Restricted access due to publisher policy

METANET 4 U is a European project aiming at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the ...

CDHMM speaker recognition by means of frequency filtering of filter-bank energies

Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1997)
Conference proceedings paper
Open access

Recently, the set of spectral parameters of every speech frame that results from filtering the frequency sequence of mel-scaled filter-bank energies with a simple first-order high-pass FIR filter has proved to be an efficient ...

Channel selection and reverberation-robust automatic speech recognition

Wolf, Martin (Universitat Politècnica de Catalunya, 2013-11-11)
Doctoral thesis
Open access

If speech is acquired by a close-talking microphone in a controlled and noise-free environment, current state-of-the-art recognition systems often show an acceptable error rate. The use of close-talking microphones, however, ...

Codificación APVQ de voz en banda ancha para velocidades entre 16 y 32 KBPS

Salavedra Molí, Josep; Masgrau Gómez, Enrique José (1996)
Conference proceedings paper
Open access

This paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). We present a wideband speech encoder called APVQ (Adaptive Predictive Vector Quantization). It combines Subband Coding, Vector Quantization ...

Codificación APVQ de voz en banda ancha usando asignación dinámica de bits

Salavedra Molí, Josep (Universidad de Valladolid, 1995)
Conference proceedings paper
Open access

This paper describes a coding scheme for broadband speech. It can be seen as a vectorial extension of a conventional ADPCM encoder. In this scheme, the signal vector is formed with one sample of the normalized prediction error ...

Codificación APVQ-extendida de voz de banda ancha

Masgrau Gómez, Enrique José; Salavedra Molí, Josep (1994)
Conference proceedings paper
Open access

This paper describes a coding scheme for broadband speech. It can be seen as a vectorial extension of a conventional ADPCM encoder. In this scheme, the vector signal is formed with one sample of the normalized prediction ...

Comparative analysis of methods for the adaptation of Speech Emotion Recognition (SER) systems

Feijóo Rodríguez, David (Universitat Politècnica de Catalunya, 2023-07-06)
Bachelor's thesis
Open access
Carried out at/with: University of New South Wales

The aim of this work is to analyse how the adaptation to certain speakers of a Speech Emotion Recognition (SER) system improves its performance by contrasting several variations of the adaptation procedure. The initial ...

Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR

Macho, D; Nadeu Camprubí, Climent; Jancovic, P; Rozinaj, G; Hernando Pericás, Francisco Javier (1999)
Conference proceedings paper
Open access

In current speech recognition systems, speech is represented by a 2-D sequence of parameters that model the temporal evolution of the spectral envelope of speech. Linear transformation or filtering along both time and ...

Comportamiento de la transformación bilineal de frecuencias en reconocimiento de habla ruidosa

Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1992)
Conference proceedings paper
Open access

Configuración e instalación de una PBX de VoIP basada en Asterisk

Castro Alonso, Sergio (Universitat Politècnica de Catalunya, 2013-05-06)
Bachelor's thesis
Open access

The project deals with configuring an Asterisk PBX and integrating it with different applications to provide value-added services. There is no doubt that VoIP is the telephony of the future because of the ...

Conversió de veu a text per a reunions virtuals: un estudi de transcripció automatitzada

Candela i Oliver, Elia (Universitat Politècnica de Catalunya, 2023-07-07)
Bachelor's thesis
Open access

In the last few years, the use of Deep Learning has increased in virtual assistance and speech recognition applications, improving its performance with supervised learning techniques. However, it is an area that continues ...