    • Auto-encoding nearest neighbor i-vectors for speaker verification 

      Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
      Conference lecture
      Open Access
      In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...
    • DNN speaker embeddings using autoencoder pre-training 

      Khan, Umair; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference lecture
      Restricted access - publisher's policy
      Over the last years, i-vectors have been the state-of-the-art approach in speaker recognition. Recent improvements in deep learning have increased the discriminative quality of i-vectors. However, deep learning architectures ...
    • I-vector transformation using k-nearest neighbors for speaker verification 

      Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Restricted access - publisher's policy
      Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids ...
    • Restricted Boltzmann Machine vectors for speaker clustering 

      Khan, Umair; Safari, Pooyan; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2018)
      Conference lecture
      Open Access
      Restricted Boltzmann Machines (RBMs) have been used both in the front-end and backend of speaker verification systems. In this work, we apply RBMs as a front-end in the context of speaker clustering. Speakers' utterances ...
    • Restricted Boltzmann machine vectors for speaker clustering and tracking tasks in TV broadcast shows 

      Khan, Umair; Safari, Pooyan; Hernando Pericás, Francisco Javier (Multidisciplinary Digital Publishing Institute, 2019-07-09)
      Article
      Open Access
      Restricted Boltzmann Machines (RBMs) have shown success in both the front-end and backend of speaker verification systems. In this paper, we propose applying RBMs to the front-end for the tasks of speaker clustering and ...
    • Self-supervised deep learning approaches to speaker recognition 

      Khan, Umair (Universitat Politècnica de Catalunya, 2021-01-11)
      Doctoral thesis
      Open Access
      In speaker recognition, i-vectors have been the state-of-the-art unsupervised technique over the last few years, whereas x-vectors is becoming the state-of-the-art supervised technique, these days. Recent advances in Deep ...
    • Self-supervised deep learning approaches to speaker recognition: A Ph.D. Thesis overview 

      Khan, Umair; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2021)
      Conference lecture
      Open Access
      Recent advances in Deep Learning (DL) for speaker recognition have improved the performance but are constrained to the need of labels for the background data, which is difficult in practice. In i-vector based speaker ...
    • Speaker tracking system using speaker boundary detection 

      Khan, Umair (Universitat Politècnica de Catalunya, 2016-11)
      Master thesis
      Open Access
      This thesis is about a research conducted in the area of Speaker Recognition. The application is concerned to the automatic detection and tracking of target speakers in meetings, conferences, telephone conversations and ...
    • The UPC speaker verification system submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20) 

      Khan, Umair; Hernando Pericás, Francisco Javier (2020-10-27)
      Research report
      Open Access
      This report describes the submission from Technical University of Catalonia (UPC) to the VoxCeleb Speaker Recognition Challenge (VoxSRC-20) at Interspeech 2020. The final submission is a combination of three systems. ...
    • Unsupervised training of siamese networks for speaker verification 

      Khan, Umair; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2020)
      Conference report
      Open Access
      Speaker labeled background data is an essential requirement for most state-of-the-art approaches in speaker recognition, e.g., x-vectors and i-vector/PLDA. However, in reality it is difficult to access large amount of labeled ...