The research area of the 'VEU' group is speech processing. We investigate technologies that extract the information carried by the voice: recognition of what is said, the language or dialect, characteristics of the speaker (identity, age, sex, emotional state), and the direction of the sound. We also work on the general characterization of audio, to determine when speech is present and when there are other acoustic events such as music or various noises. Speech technologies make it possible to generate speech (speech synthesis) or to modify its ...

http://futur.upc.edu/VEU

Recent submissions

  • Language and noise transfer in speech enhancement generative adversarial network 

    Pascual de la Puente, Santiago; Park, Maruchan; Serra, Joan; Bonafonte Cávez, Antonio; Ahn, Kang-hun (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Restricted access - publisher's policy
    Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems into new, low resource environments ...
  • Tracking economic growth by evolving expectations via genetic programming: a two-step approach 

    Claveria González, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador (2018-10-09)
    Research report
    Open access
    The main objective of this study is to present a two-step approach to generate estimates of economic growth based on agents’ expectations from tendency surveys. First, we design a genetic programming experiment to derive ...
  • A geometric proxy of economic uncertainty based on the disagreement in survey expectations 

    Claveria González, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador (2018)
    Conference lecture
    Open access
    In this study we present a geometric approach to proxy economic uncertainty. We design a positional indicator of disagreement among survey-based agents' expectations about the state of the economy. Previous dispersion-based ...
  • Economic uncertainty: a geometric indicator of discrepancy among experts’ expectations 

    Claveria González, Oscar; Monte Moreno, Enrique; Torra Porras, Salvador (2018-01-01)
    Article
    Restricted access - confidentiality agreement
    In this study we present a geometric approach to proxy economic uncertainty. We design a positional indicator of disagreement among survey-based agents’ expectations about the state of the economy. Previous dispersion-based ...
  • TEC forecasting based on manifold trajectories 

    Monte Moreno, Enrique; García Rigo, Alberto; Hernández Pajares, Manuel; Yang, Heng (Multidisciplinary Digital Publishing Institute (MDPI), 2018-06-22)
    Article
    Open access
    In this paper, we present a method for forecasting the ionospheric Total Electron Content (TEC) distribution from the International GNSS Service’s Global Ionospheric Maps. The forecasting system gives an estimation of the ...
  • Machine and deep learning approaches to localization and range estimation of underwater acoustic sources 

    Houégnigan, Ludwig; Safari, Pooyan; Nadeu Camprubí, Climent; André, Michel; Van der Schaar, Mike Connor Roger Malcolm (Institute of Electrical and Electronics Engineers (IEEE), 2017)
    Conference lecture
    Restricted access - publisher's policy
    This paper introduces ongoing experiments and early results for the underwater localization and range estimation of acoustic sources. Beyond classical results obtained for direction of arrival estimation, results concerning ...
  • Search engine for multilingual audiovisual contents 

    Pérez, José David; Bonafonte Cávez, Antonio; Ruiz Costa-Jussà, Marta; Cardenal, Antonio; Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción; Navas, Eva; Rodríguez Banga, Eduardo (2012)
    Conference lecture
    Open access
    This paper describes the BUCEADOR search engine, a web server that allows retrieving multimedia documents (text, audio, video) in different languages. All the documents are translated into the user language and are ...
  • N-gram-based machine translation 

    Mariño Acebal, José Bernardo; Banchs, Rafael E.; Crego, Josep Maria; de Gispert Ramis, Adrià; Lambert, Patrik; Rodríguez Fonollosa, José Adrián; Ruiz Costa-Jussà, Marta (2006-12-01)
    Article
    Open access
    This article describes in detail an n-gram approach to statistical machine translation. This approach consists of a log-linear combination of a translation model based on n-grams of bilingual units, which are referred ...
  • Multi-output RNN-LSTM for multiple speaker speech synthesis and adaptation 

    Pascual, Santiago; Bonafonte Cávez, Antonio (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference report
    Restricted access - publisher's policy
    Deep Learning has been applied successfully to speech processing. In this paper we propose an architecture for speech synthesis using multiple speakers. Some hidden layers are shared by all the speakers, while there is a ...
  • A bilingual Spanish-Catalan database of units for concatenative synthesis 

    Esquerra Llucià, Ignasi; Bonafonte Cávez, Antonio; Vallverdú Bayés, Sisco; Febrer Godayol, Albert (1998)
    Conference report
    Open access
    Different databases of phonetic units are required in multilingual Text-to-Speech systems based on concatenative synthesis. We are currently developing a TTS system able to convert text either in Catalan and Spanish, with ...
  • Phoneme recognition with statistical modeling of the prediction error of neural networks 

    Freitag, Fèlix; Monte Moreno, Enrique (International Speech Communication Association (ISCA), 1998)
    Conference report
    Open access
    This paper presents a speech recognition system which incorporates predictive neural networks. The neural networks are used to predict observation vectors of speech. The prediction error vectors are modeled on the state ...
  • Feature decorrelation methods in speech recognition. A comparative study 

    Batlle Mont, Eloi; Nadeu Camprubí, Climent; Rodríguez Fonollosa, José Adrián (International Speech Communication Association (ISCA), 1998)
    Conference report
    Open access
    In this paper we study various decorrelation methods for the features used in speech recognition and we compare the performance of each one by running several tests with a speech database. First of all we study the ...
