DNN speaker embeddings using autoencoder pre-training
Document type: Conference lecture
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Restricted access - publisher's policy
Over the last years, i-vectors have been the state-of-the-art approach in speaker recognition. Recent advances in deep learning have increased the discriminative quality of i-vectors. However, deep learning architectures require a large amount of labeled background data, which is difficult to obtain in practice. This paper proposes an alternative scheme that reduces the need for labeled data: autoencoder pre-training for a speaker verification task. First, we train an autoencoder in an unsupervised way, using a large amount of unlabeled background data. Then, we train a Deep Neural Network (DNN) initialized with the parameters of the pre-trained autoencoder. The DNN is trained in a supervised way on a relatively small amount of labeled background data. In the testing phase, we extract speaker embeddings as the output of an intermediate layer of the DNN. Training and evaluation were performed on the VoxCeleb2 and VoxCeleb1 databases, respectively. The experimental results show that initializing the DNN with the parameters of the pre-trained autoencoder yields a relative improvement of 21% in Equal Error Rate (EER) over the baseline i-vector/PLDA system.
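The three stages described in the abstract (unsupervised autoencoder pre-training, supervised DNN training initialized from the autoencoder, and embedding extraction from an intermediate layer) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the random toy data, the single sigmoid hidden layer, the layer sizes, the learning rates, and the function names `pretrain_autoencoder`, `train_dnn`, and `extract_embeddings`.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_autoencoder(X, hidden_dim, epochs=200, lr=0.1):
    """Unsupervised stage: fit encoder/decoder by minimizing reconstruction error."""
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, hidden_dim))
    b_enc = np.zeros(hidden_dim)
    W_dec = rng.normal(0.0, 0.1, (hidden_dim, d))
    b_dec = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W_enc + b_enc)           # encoder output
        err = H @ W_dec + b_dec - X              # linear decoder residual
        losses.append(float(np.mean(err ** 2)))
        g_out = err / n                          # gradient w.r.t. reconstruction
        g_H = (g_out @ W_dec.T) * H * (1.0 - H)  # backprop through the sigmoid
        W_dec -= lr * (H.T @ g_out)
        b_dec -= lr * g_out.sum(axis=0)
        W_enc -= lr * (X.T @ g_H)
        b_enc -= lr * g_H.sum(axis=0)
    return W_enc, b_enc, losses

def train_dnn(X, y, n_classes, W_init, b_init, epochs=200, lr=0.5):
    """Supervised stage: the hidden layer starts from the pre-trained encoder."""
    W1, b1 = W_init.copy(), b_init.copy()
    W2 = rng.normal(0.0, 0.1, (W1.shape[1], n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                     # one-hot speaker labels
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)        # softmax posteriors
        g_logits = (P - Y) / len(X)              # cross-entropy gradient
        g_H = (g_logits @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ g_logits)
        b2 -= lr * g_logits.sum(axis=0)
        W1 -= lr * (X.T @ g_H)
        b1 -= lr * g_H.sum(axis=0)
    return W1, b1, W2, b2

def extract_embeddings(X, W1, b1):
    """Testing stage: the intermediate-layer activations serve as embeddings."""
    return sigmoid(X @ W1 + b1)

# Toy stand-ins (assumed): many unlabeled vectors, few labeled ones, 3 speakers.
X_unlabeled = rng.normal(size=(500, 20))
X_labeled = rng.normal(size=(60, 20))
y_labeled = rng.integers(0, 3, size=60)

W_enc, b_enc, ae_losses = pretrain_autoencoder(X_unlabeled, hidden_dim=8)
W1, b1, W2, b2 = train_dnn(X_labeled, y_labeled, 3, W_enc, b_enc)
embeddings = extract_embeddings(X_labeled, W1, b1)
```

In this sketch only the encoder parameters are carried over to the DNN; the decoder is discarded and the classification layer is initialized randomly. Whether the paper transfers weights in exactly this way is not stated in the abstract.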
Citation: Khan, U.; Hernando, J. DNN speaker embeddings using autoencoder pre-training. A: European Signal Processing Conference. "27th EUSIPCO 2019 European Signal Processing Conference: A Coruña, Spain: September 2-6, 2019". Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 1-5.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder