Browsing by Author "Khan, Umair"
-
Auto-encoding nearest neighbor i-vectors for speaker verification
Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2019)
Conference lecture
Open Access
In the last years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap ...
-
DNN speaker embeddings using autoencoder pre-training
Khan, Umair; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference lecture
Restricted access - publisher's policy
Over the last years, i-vectors have been the state-of-the-art approach in speaker recognition. Recent improvements in deep learning have increased the discriminative quality of i-vectors. However, deep learning architectures ...
-
I-vector transformation using k-nearest neighbors for speaker verification
Khan, Umair; India Massana, Miquel Àngel; Hernando Pericás, Francisco Javier (Institute of Electrical and Electronics Engineers (IEEE), 2020)
Conference report
Restricted access - publisher's policy
Probabilistic Linear Discriminant Analysis (PLDA) is the most efficient backend for i-vectors. However, it requires labeled background data which can be difficult to access in practice. Unlike PLDA, cosine scoring avoids ...
-
Restricted Boltzmann Machine vectors for speaker clustering
Khan, Umair; Safari, Pooyan; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2018)
Conference lecture
Open Access
Restricted Boltzmann Machines (RBMs) have been used both in the front-end and backend of speaker verification systems. In this work, we apply RBMs as a front-end in the context of speaker clustering. Speakers' utterances ...
-
Restricted Boltzmann machine vectors for speaker clustering and tracking tasks in TV broadcast shows
Khan, Umair; Safari, Pooyan; Hernando Pericás, Francisco Javier (Multidisciplinary Digital Publishing Institute, 2019-07-09)
Article
Open Access
Restricted Boltzmann Machines (RBMs) have shown success in both the front-end and backend of speaker verification systems. In this paper, we propose applying RBMs to the front-end for the tasks of speaker clustering and ...
-
Self-supervised deep learning approaches to speaker recognition
Khan, Umair (Universitat Politècnica de Catalunya, 2021-01-11)
Doctoral thesis
Open Access
In speaker recognition, i-vectors have been the state-of-the-art unsupervised technique over the last few years, whereas x-vectors is becoming the state-of-the-art supervised technique, these days. Recent advances in Deep ...
-
Self-supervised deep learning approaches to speaker recognition: A Ph.D. Thesis overview
Khan, Umair; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2021)
Conference lecture
Open Access
Recent advances in Deep Learning (DL) for speaker recognition have improved the performance but are constrained to the need of labels for the background data, which is difficult in practice. In i-vector based speaker ...
-
Speaker tracking system using speaker boundary detection
Khan, Umair (Universitat Politècnica de Catalunya, 2016-11)
Master thesis
Open Access
This thesis is about a research conducted in the area of Speaker Recognition. The application is concerned to the automatic detection and tracking of target speakers in meetings, conferences, telephone conversations and ...
-
The UPC speaker verification system submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20)
Khan, Umair; Hernando Pericás, Francisco Javier (2020-10-27)
Research report
Open Access
This report describes the submission from Technical University of Catalonia (UPC) to the VoxCeleb Speaker Recognition Challenge (VoxSRC-20) at Interspeech 2020. The final submission is a combination of three systems. ...
-
Unsupervised training of siamese networks for speaker verification
Khan, Umair; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2020)
Conference report
Open Access
Speaker labeled background data is an essential requirement for most state-of-the-art approaches in speaker recognition, e.g., x-vectors and i-vector/PLDA. However, in reality it is difficult to access large amount of labeled ...