dc.contributor.author: Khan, Umair
dc.contributor.author: Hernando Pericás, Francisco Javier
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Teoria del Senyal i Comunicacions
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2020-11-06T16:51:17Z
dc.date.available: 2020-11-06T16:51:17Z
dc.date.issued: 2020-10-27
dc.identifier.citation: Khan, U.; Hernando, J. The UPC speaker verification system submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20). 2020.
dc.identifier.uri: http://hdl.handle.net/2117/331625
dc.description: This report describes the submission from the Technical University of Catalonia (UPC) to the VoxCeleb Speaker Recognition Challenge (VoxSRC-20) at Interspeech 2020. The final submission is a combination of three systems. System-1 is an autoencoder-based approach that tries to reconstruct similar i-vectors, whereas System-2 and System-3 are Convolutional Neural Network (CNN) based siamese architectures. The siamese networks have two and three branches, respectively, where each branch is a CNN encoder. The double-branch siamese network performs binary classification using cross-entropy loss during training, whereas the triple-branch siamese network is trained to learn speaker embeddings using triplet loss. We provide results of our systems on the VoxCeleb-1 test set and the VoxSRC-20 validation and test sets.
dc.description.sponsorship: This work was supported by the project PID2019-107579RBI00 / AEI / 10.13039/501100011033
dc.format.extent: 3 p.
dc.language.iso: eng
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
dc.subject.lcsh: Automatic speech recognition
dc.title: The UPC speaker verification system submitted to VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20)
dc.type: External research report
dc.subject.lemac: Reconeixement automàtic de la parla (Automatic speech recognition)
dc.contributor.group: Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.relation.publisherversion: https://arxiv.org/abs/2010.10937
dc.rights.access: Open Access
local.identifier.drac: 29753633
dc.description.version: Preprint
local.citation.author: Khan, U.; Hernando, J.

