    • A bilingual Spanish-Catalan database of units for concatenative synthesis 

      Esquerra Llucià, Ignasi; Bonafonte Cávez, Antonio; Vallverdú Bayés, Sisco; Febrer Godayol, Albert (1998)
      Conference report
      Open Access
      Different databases of phonetic units are required in multilingual Text-to-Speech systems based on concatenative synthesis. We are currently developing a TTS system able to convert text in either Catalan or Spanish, with ...
    • A conversation analysis framework using speech recognition and naïve bayes classification for construction process monitoring 

      Zhang, T.; Lee, Y. C.; Zhu, Y.; Hernando Pericás, Francisco Javier (American Society of Civil Engineers (ASCE), 2018)
      Conference report
      Restricted access - publisher's policy
      At a dynamic construction site, conversations convey vital information including construction activities, operation status, and task performance. Even though because of information security, recording the entire conversations ...
    • A low-power, high-performance speech recognition accelerator 

      Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019-12-01)
      Open Access
      Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at a high energy cost, which is unaffordable for mobile devices with tiny power budgets. ...
    • A programmable accelerator for streaming automatic speech recognition on edge devices 

      Pinto Rivero, Dennis; Arnau Montañés, José María; González Colás, Antonio María (2022)
      Conference report
      Open Access
      Automatic Speech Recognition (ASR) is quickly becoming a mainstream technology, mainly driven by the outstanding accuracy achieved by modern systems based on machine learning. However, these systems often require billions ...
    • A Speech-based Dialogue System for Household Robots 

      Pons Rueda, Susana (Universitat Politècnica de Catalunya / Technische Universiteit Delft, 2011)
      Master thesis (pre-Bologna period)
      Restricted access - author's decision
      This thesis studies mechanisms to improve human-robot interaction through spoken dialogue for household robots. To that end, a full dialogue system, in which the semantics of the words play an important role, is ...
    • Action plan for dissemination 

      Cristea, Dan; Trandabăț, Diana; Branco, Antonio; Mendes, Amalia; Pellegrini, Thomas; Thompson, Paul; Irimia, Elena; Tufis, Dan; Gilmenau, Georgiana; Rosner, Mike; Moreno Bilbao, M. Asunción; Bel, Nùria (2012-07-29)
      Research report
      Open Access
      The central objective of the Metanet4u project is to contribute to the establishment of a pan-European digital platform that makes available language resources and services, encompassing both datasets and software tools, ...
    • Action plan for dissemination updated 

      Cristea, Dan; Trandabăț, Diana; Branco, Antonio; Mendes, Amalia; Pellegrini, Thomas; Thompson, Paul; Tufis, Dan; Gilmenau, Georgiana; Rosner, Mike; Moreno Bilbao, M. Asunción; Bel, Nùria (2012-07-04)
      Research report
      Open Access
      Deliverable D5.3 of the METANET4U project (Project CIP #270893)
    • Adaptación del sistema texto a voz "Festival" al catalán 

      Jaén Gómez, Alejandro (Universitat Politècnica de Catalunya, 2007-01-08)
      Master thesis (pre-Bologna period)
      Open Access
    • Age prediction by voice using deep learning 

      Linde Martínez, David (Universitat Politècnica de Catalunya, 2023-01-30)
      Master thesis
      Open Access
      Speech characterization is one of the main topics in artificial intelligence. However, it has received little study where the Catalan language is concerned. In this project, we try to perform an age ...
    • AI-Vocie: intel·ligència artificial aplicada al reconeixement de la veu 

      Gil Aguilar, Aleix (Universitat Politècnica de Catalunya, 2019-10)
      Bachelor thesis
      Open Access
      The aim of this bachelor's thesis is to design and implement a smart speaker, Lima, that is simple yet practical and responds to commands, simulating the way humans think. The purpose is that ...
    • Alzheimer disease diagnosis based on automatic spontaneous speech analysis 

      Lopez de Ipiña Peña, Karmele; Alonso Hernandez, Jesus Bernardino; Sole Casals, Jordi; Barroso Moreno, Nora; Faúndez Zanuy, Marcos; Ecay Torres, Mirian; Travieso Gonzalez, Carlos Manuel; Ezeiza Ramos, Aitzol; Estanga Alustiza, Ainara (SciTePress, 2012)
      Conference report
      Restricted access - publisher's policy
      Alzheimer's disease (AD) is the most prevalent form of progressive degenerative dementia and has a high socio-economic impact in Western countries; it is therefore one of the most active research areas today. Its diagnosis ...
    • An ASR prototype for Spanish dictation 

      Cosano Serra, Marta (Universitat Politècnica de Catalunya, 2020-01)
      Bachelor thesis
      Open Access
      Automatic Speech Recognition (ASR), or speech-to-text conversion, has been studied by many researchers for decades due to its various applications. In this project I propose to implement an ASR based on Hidden Markov Model ...
    • An information-theoretic string matching approach for spoken utterance verification and keyword spotting 

      Quer Romeo, Guillem (Universitat Politècnica de Catalunya, 2016)
      Master thesis (pre-Bologna period)
      Restricted access - author's decision
      The goal of this project is to develop an information-theoretic acoustic-phonetic approach to detect the presence of words or phrases in an utterance. Specifically, the project focuses on two types of detection tasks in ...
    • An ultra low-power hardware accelerator for acoustic scoring in speech recognition 

      Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Conference report
      Restricted access - publisher's policy
      Accurate, real-time Automatic Speech Recognition (ASR) comes at a high energy cost, so accuracy often has to be sacrificed in order to fit the strict power constraints of mobile systems. However, accuracy is extremely ...
    • An ultra low-power hardware accelerator for automatic speech recognition 

      Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (IEEE Press, 2016)
      Conference report
      Open Access
      Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at a high energy cost which is not affordable for the tiny power budget of mobile devices. ...
    • Analisis estadistico de orden superior de la voz 

      Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción (1991)
      Conference report
      Open Access
      Most of the speech analysis methods developed to date have been based on the autocorrelation function or power spectrum, i.e., the second-order statistics of the signal. In this paper it is shown that higher order ...
    • Análisis de los servicios y sistemas de comunicaciones de voz y datos para la implantación de un sistema de telefonía IP, en un entorno editorial 

      Giné Figueras, Francesc (Universitat Politècnica de Catalunya, 2011-04-27)
      Master thesis (pre-Bologna period)
      Restricted access - author's decision
      The main objective of this document is to analyse the current voice communications system and, in particular, the quality of the service offered, and to study the requirements of the voice communications services ...
    • Aplicació de la lectura de llavis automatitzada a l'accessibilitat: escriptura per imatge 

      Guevara Moran, Meritxell (Universitat Politècnica de Catalunya, 2023-06-28)
      Bachelor thesis
      Open Access
      In recent years, significant advances in artificial intelligence have opened new avenues for promoting diversity and integration in society. These advances have provided powerful tools that can ...
    • Automatic Spanish translation of SQuAD dataset for multi-lingual question answering 

      Carrino, Casimiro Pio; Ruiz Costa-Jussà, Marta; Rodríguez Fonollosa, José Adrián (European Language Resources Association (ELRA), 2020)
      Conference lecture
      Open Access
      Recently, multilingual question answering has become a crucial research topic, and it is receiving increased interest in the NLP community. However, the unavailability of large-scale datasets makes it challenging to train ...
    • Automatic speech recognition with deep neural networks for impaired speech 

      España-i-Bonet, Cristina; Rodríguez Fonollosa, José Adrián (Springer, 2016)
      Conference report
      Open Access
      Automatic Speech Recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech is a difficult task for two main reasons: data is (i) scarce and (ii) heterogeneous. ...