Browse by subject "UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing"

A bilingual Spanish-Catalan database of units for concatenative synthesis

Esquerra Llucià, Ignasi; Bonafonte Cávez, Antonio; Vallverdú Bayés, Sisco; Febrer Godayol, Albert (1998)
Conference proceedings paper
Open access

Different databases of phonetic units are required in multilingual Text-to-Speech systems based on concatenative synthesis. We are currently developing a TTS system able to convert text in either Catalan or Spanish, with ...

A comparative study of techniques for HMM-based noisy speech recognition in noisy car environment

Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent; Mariño Acebal, José Bernardo (Springer, 1993)
Conference proceedings paper
Open access

The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. The aim of this paper is to report the ...

A continuously adaptive vector predictive coder (AVPC) for speech encoding

Masgrau Gómez, Enrique José; Mariño Acebal, José Bernardo; Vallverdú Bayés, Sisco (Institute of Electrical and Electronics Engineers (IEEE), 1986)
Conference proceedings paper
Open access

In this work we present a waveform speech coding system including vector quantization. This system can be seen as a vector version of the scalar ADPCM speech coder. In such a system the speech samples are grouped in vectors ...

A conversation analysis framework using speech recognition and naïve bayes classification for construction process monitoring

Zhang, T.; Lee, Y. C.; Zhu, Y.; Hernando Pericás, Francisco Javier (American Society of Civil Engineers (ASCE), 2018)
Conference proceedings paper
Access restricted by publisher policy

At a dynamic construction site, conversations convey vital information including construction activities, operation status, and task performance. Even though because of information security, recording the entire conversations ...

A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data

Butko, Taras; Nadeu Camprubí, Climent (2010)
Conference proceedings paper
Open access

Acoustic event detection becomes a difficult task, even for a small number of events, in scenarios where events are produced rather spontaneously and often overlap in time. In this work, we aim to improve the detection ...

A graph partitioning approach to entity disambiguation using uncertain information

Sapena Masip, Emilio; Padró, Lluís; Turmo Borras, Jorge (Springer, 2008-08-31)
Conference proceedings paper
Access restricted by publisher policy

This paper presents a method for Entity Disambiguation in Information Extraction from different sources in the web. Once entities and relations between them are extracted, it is needed to determine which ones are referring ...

A graph-based strategy to streamline translation quality assessments

Pighin, Daniele; Formiga Fanals, Lluís; Màrquez Villodre, Lluís (2012)
Conference proceedings paper
Open access

We present a detailed analysis of a graph-based annotation strategy that we employed to annotate a corpus of 11,292 real-world English to Spanish automatic translations with relative (ranking) and absolute ...

A hierarchical architecture with feature selection for audio segmentation in a broadcast news domain

Butko, Taras; Nadeu Camprubí, Climent (2010)
Conference proceedings paper
Open access

This work presents a hierarchical HMM-based audio segmentation system with feature selection designed for the Albayzin 2010 Evaluations. We propose an architecture that combines the outputs of individual binary detectors ...

A law of word meaning in dolphin whistle types

Ferrer Cancho, Ramon; McCowan, Brenda (2009-10-30)
Article
Open access

We show that dolphin whistle types tend to be used in specific behavioral contexts, which is consistent with the hypothesis that dolphin whistles have some sort of “meaning”. Besides, in some cases, it can be shown that the ...

A low-power, high-performance speech recognition accelerator

Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019-12-01)
Article
Open access

Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at high energy cost, not being affordable for the tiny power-budgeted mobile devices. ...

A multilingual corpus for rich audio-visual scene description in a meeting-room environment

Butko, Taras; Nadeu Camprubí, Climent; Moreno Bilbao, M. Asunción (ACM Press. Association for Computing Machinery, 2011)
Conference proceedings paper
Access restricted by publisher policy

In this paper, we present a multilingual database specifically designed to develop technologies for rich audio-visual scene description in meeting-room environments. Part of that database includes the already existing ...

A neural network approach for automatic detection of acoustic alarms

Peiró Lilja, Alexandre; Raboshchuk, Ganna; Nadeu Camprubí, Climent (Scitepress, 2017)
Conference presentation
Access restricted by publisher policy

Acoustic alarms generated by biomedical equipment are relevant sounds in the noisy Neonatal Intensive Care Unit (NICU) environment both because of their high frequency of occurrence and their possible negative effects on ...

A new algorithm for adaptive IIR filtering based on the log-area-ratio parameters

Rodríguez Fonollosa, José Adrián; Masgrau Gómez, Enrique José (Elsevier, 1990)
Conference proceedings paper
Open access

A programmable accelerator for streaming automatic speech recognition on edge devices

Pinto Rivero, Dennis; Arnau Montañés, José María; González Colás, Antonio María (2022)
Conference proceedings paper
Open access

Automatic Speech Recognition (ASR) is quickly becoming a mainstream technology, mainly driven by the outstanding accuracy achieved by modern systems based on machine learning. However, these systems often require billions ...

A spectral estimator of vocal jitter

Mas Soro, Pol (Universitat Politècnica de Catalunya, 2011-09-09)
Final degree project/thesis
Open access
Carried out at/with: Université libre de Bruxelles

The purpose of this thesis is to study and implement a spectral method for short-time jitter estimation. Jitter consists of rapid perturbations of the vocal cycle lengths, which can be observed from one cycle to ...

A speech enhancement system using higher order AR estimation in real environments

Salavedra Molí, Josep; Masgrau Gómez, Enrique José; Moreno Bilbao, M. Asunción (1993)
Conference proceedings paper
Open access

We study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. But in our ...

A stabilized finite element method for the mixed wave equation in an ALE framework with application to diphthong production

Guasch Fortuny, Oriol; Arnela, Marc; Codina, Ramon; Espinoza Román, Héctor Gabriel (2016-01)
Article
Open access

Working with the wave equation in mixed rather than irreducible form allows one to directly account for both the acoustic pressure field and the acoustic particle velocity field. Indeed, this becomes the natural option ...

UPCommons. The UPC open knowledge portal