    • A bit more on the ability of adaptation of speech signals 

      Ballesteros, Dora Maria; Moreno Aróstegui, Juan Manuel (2013-03)
      Article
      Open Access
      Some traditional digital signal processing techniques encompass enhancement, filtering, coding, compression, detection and recognition. Recently, a new hypothesis of signal processing has been presented, known as the ...
    • A comparative study of techniques for HMM-based noisy speech recognition in noisy car environment 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent; Mariño Acebal, José Bernardo (Springer, 1993)
      Conference report
      Open Access
      The performance of existing speech recognition systems degrades rapidly in the presence of background noise when training and testing cannot be done under the same ambient conditions. The aim of this paper is to report the ...
    • A continuously adaptive vector predictive coder (AVPC) for speech encoding 

      Masgrau Gómez, Enrique José; Mariño Acebal, José Bernardo; Vallverdú Bayés, Sisco (Institute of Electrical and Electronics Engineers (IEEE), 1986)
      Conference report
      Open Access
      In this work we present a waveform speech coding system including vector quantization. This system can be seen as a vector version of the scalar ADPCM speech coder. In such a system the speech samples are grouped in vectors ...
    • A spectral estimator of vocal jitter 

      Mas Soro, Pol (Universitat Politècnica de Catalunya, 2011-09-09)
      Master thesis (pre-Bologna period)
      Open Access
      Covenantee: Université libre de Bruxelles
      English: The purpose of this thesis is to study and implement a spectral method for short-time jitter estimation. Jitter consists in rapid perturbations of the vocal cycle lengths, which can be observed from one cycle to ...
    • A speech enhancement system using higher order AR estimation in real environments

      Salavedra Molí, Josep; Masgrau Gómez, Enrique José; Moreno Bilbao, M. Asunción (1993)
      Conference report
      Open Access
      We study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. But in our ...
    • Adaptive prediction and bit-assignment in subband coding of speech 

      Mariño Acebal, José Bernardo; Martí Ros, Jaume (1985)
      Conference report
      Open Access
      The combination of time-domain harmonic scaling (TDHS) and sub-band coding (SBC) provides an encoding approach which allows 9.6 Kb/s speech encoding with good communication quality. Starting from this structure, this paper ...
    • Adaptive vector predictive speech coding with sample by sample update at 16 Kbps 

      Masgrau Gómez, Enrique José; Mariño Acebal, José Bernardo (1986)
      Conference report
      Open Access
      A vectorial generalization of the ADPCM system is introduced. Once the speech signal is grouped in vectors, they are coded using a vector predictor (VP) and a vector quantizer (VQ). Both subsystems are continuously adaptive; ...
    • Almacenamiento en nodos de redes inalámbricas de sensores 

      Pérez Rodríguez, Iria (Universitat Politècnica de Catalunya, 2013-01-21)
      Master thesis (pre-Bologna period)
      Open Access
      [ENGLISH] The aim of this project is to store data inside a Wireless Sensor Network node, in order to transmit it when the conditions are favourable, using a certain hardware and software stack. This necessity comes from ...
    • An HMM-Based Approach to the INTERSPEECH 2011 Speaker State Challenge 

      Nogueiras Rodríguez, Albino (2011)
      Conference lecture
      Restricted access - publisher's policy
      The current main trend in paralinguistic information recognition is the so-called static classification. In this kind of classification the low level descriptors are pooled together by means of statistical functionals ...
    • Anàlisi de sentiment per a textos curts en català i castellà aprofitant dades no supervisades 

      Navarrete Jimenez, Daniel (Universitat Politècnica de Catalunya, 2021-01-24)
      Bachelor thesis
      Open Access
      There may be a lot of abusive behaviour in conversations between teenagers, which take place through social media. In this project, we develop classifiers to find out which texts present abuse such as violence, sexual ...
    • APVQ encoder applied to wideband speech coding 

      Salavedra Molí, Josep; Masgrau Gómez, Enrique José (Institute of Electrical and Electronics Engineers (IEEE), 1996)
      Conference report
      Open Access
      The paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). The authors present a wideband speech encoder called APVQ (adaptive predictive vector quantization). It combines subband coding, vector ...
    • AR modeling of the speech autocorrelation to improve noisy speech recognition 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1992)
      Conference report
      Open Access
      Speech recognition in noisy environments remains an unsolved problem even in the case of isolated word recognition with small vocabularies. Recently, several techniques have been proposed to alleviate this problem. Concretely, ...
    • Audio classification experiments in a neonatal intensive care unit 

      Sólvez Pérez, Sergi (Universitat Politècnica de Catalunya, 2014-06-25)
      Master thesis (pre-Bologna period)
      Open Access
      [ENGLISH] Newborns delivered at a gestational age of 24-32 weeks commonly have health problems. The use of a Neonatal Intensive Care Unit (NICU) is, in most cases, crucial for their survival. Nowadays, it is known ...
    • Audiovisual head orientation estimation with particle filtering in multisensor scenarios 

      Canton Ferrer, Cristian; Segura Perales, Carlos; Casas Pla, Josep Ramon; Pardàs Feliu, Montse; Hernando Pericás, Francisco Javier (2008-01)
      Article
      Open Access
      This article presents a multimodal approach to head pose estimation of individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing. Determining the individuals ...
    • Auto-encoding nearest neighbor i-vectors for speaker verification 

      Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
      Conference lecture
      Open Access
      In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...
    • Bandwidth extension of narrowband speech 

      Expósito Pérez, Miquel; Salavedra Molí, Josep (Universidad Politécnica de Valencia, 2014)
      Conference report
      Open Access
      Recently, 4G mobile phone systems have been designed to process wideband speech signals whose sampling frequency is 16 kHz. However, most of the mobile and classical phone network, and current 3G mobile phones, still ...
    • Bit-slice implementation of a linear predictive vocoder 

      Vázquez Grau, Gregorio; Gasull Llampallas, Antoni (1985)
      Conference report
      Open Access
      A digital 16-bit high-speed general-purpose signal-processor is shown. The main objective has been the implementation of a linear predictive vocoder for obtaining real-time speech compression. For real-time digital speech ...
    • Building synthetic voices in the META-NET framework 

      Garcia Casademont, Emília; Bonafonte Cávez, Antonio; Moreno Bilbao, M. Asunción (2012)
      Conference report
      Restricted access - publisher's policy
      METANET4U is a European project aiming at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the ...
    • CDHMM speaker recognition by means of frequency filtering of filter-bank energies 

      Hernando Pericás, Francisco Javier; Nadeu Camprubí, Climent (1997)
      Conference report
      Open Access
      Recently, the set of spectral parameters of every speech frame that results from filtering the frequency sequence of mel-scaled filter-bank energies with a simple first-order high-pass FIR filter has proved to be an efficient ...
    • Codificación APVQ de voz en banda ancha para velocidades entre 16 y 32 KBPS 

      Salavedra Molí, Josep; Masgrau Gómez, Enrique José (1996)
      Conference report
      Open Access
      This paper describes a coding scheme for broadband speech (sampling frequency 16 kHz). We present a wideband speech encoder called APVQ (Adaptive Predictive Vector Quantization). It combines Subband Coding, Vector Quantization ...