    • Deep learning applied to speech synthesis 

      Pascual de la Puente, Santiago (Universitat Politècnica de Catalunya, 2016-06-30)
      Master thesis
      Open Access
      Deep learning has been applied successfully to speech processing problems. In this work we explore its capabilities, focusing specifically on recurrent neural architectures to build a state-of-the-art text-to-speech system ...
    • Editorial 

      Pascual de la Puente, Santiago (Rama de estudiantes del IEEE de Barcelona, 2013-05)
      Article
      Open Access
    • Efficient, end-to-end and self-supervised methods for speech processing and generation 

      Pascual de la Puente, Santiago (Universitat Politècnica de Catalunya, 2020-01-31)
      Doctoral thesis
      Open Access
      Deep learning has affected the speech processing and generation fields in many directions. First, end-to-end architectures allow the direct injection and synthesis of waveform samples. Second, the exploration of efficient ...
    • Exploring efficient neural architectures for linguistic-acoustic mapping in text-to-speech 

      Pascual de la Puente, Santiago; Serra, Joan; Bonafonte Cávez, Antonio (Multidisciplinary Digital Publishing Institute, 2019-08-17)
      Article
      Open Access
      Conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models such as recurrent neural networks. Despite the ...
    • Language and noise transfer in speech enhancement generative adversarial network 

      Pascual de la Puente, Santiago; Park, Maruchan; Serra, Joan; Bonafonte Cávez, Antonio; Ahn, Kang-hun (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Restricted access - publisher's policy
      Deep learning systems for speech enhancement usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems to new, low-resource environments ...
    • Prosodic break prediction with RNNs 

      Pascual de la Puente, Santiago; Bonafonte Cávez, Antonio (Springer, 2016)
      Conference report
      Restricted access - publisher's policy
      Prosodic break prediction from text is a fundamental task for achieving naturalness in text-to-speech applications. In this work we build a data-driven break predictor out of linguistic features like the Part of Speech (POS) ...
    • Spanish statistical parametric speech synthesis using a neural vocoder 

      Bonafonte Cávez, Antonio; Pascual de la Puente, Santiago; Dorca, G. (International Speech Communication Association (ISCA), 2018)
      Conference report
      Open Access
      During the 2000s, unit-selection-based text-to-speech was the dominant commercial technology. Meanwhile, the TTS research community has made a big effort to push statistical-parametric speech synthesis to get similar ...
    • Time-domain speech enhancement using generative adversarial networks 

      Pascual de la Puente, Santiago; Serra, Joan; Bonafonte Cávez, Antonio (2019-11-01)
      Article
      Restricted access - publisher's policy
      Speech enhancement improves recorded voice utterances to eliminate noise that might be impeding their intelligibility or compromising their quality. Typical speech enhancement systems are based on regression approaches ...
    • Wav2Pix: speech-conditioned face generation using generative adversarial networks 

      Cardoso Duarte, Amanda; Roldan, Francisco; Tubau, Miquel; Escur, Janna; Pascual de la Puente, Santiago; Salvador Aguilera, Amaia; Mohedano, Eva; McGuinness, Kevin; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference lecture
      Restricted access - publisher's policy
      Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a ...