Clustering initialization based on spatial information for speaker diarization of meetings
Document type: Text in conference proceedings
Publication date: 2008
Access conditions: Restricted access due to publisher policy
This paper proposes an initialization method for an agglomerative clustering system applied to speaker diarization in the meeting environment. The initialization is based on a prior clustering of the temporal sequence produced by estimating the Time Delay of Arrival (TDOA) between pairs of sensors. The purpose of that initial clustering is to obtain initial classes that each contain speech from a single speaker. The aim is to ensure the purity of the initial segments by exploiting the positions of the speakers in the meeting over time. The TDOA initialization was tested on the dataset used in the RT07s evaluation, where it improves the diarization error rate with respect to the classical uniform initialization. Most of the experiments show that purer initial segments lead to better clustering in the subsequent hierarchical strategy based on cepstral features.
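The contrast between uniform and TDOA-based initialization can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which clustering algorithm is applied to the TDOA sequence, so plain k-means is assumed here as a stand-in, and the function names (`uniform_init`, `tdoa_init`, `purity`), the two-speaker toy data, and the noise level are all hypothetical.

```python
import numpy as np

def uniform_init(n_frames, n_clusters):
    """Classical uniform initialization: cut the timeline into equal consecutive chunks."""
    return (np.arange(n_frames) * n_clusters) // n_frames

def tdoa_init(tdoa, n_clusters, n_iter=20):
    """Group frames by their TDOA vector with plain k-means.
    ASSUMPTION: k-means stands in for the paper's unspecified clustering step."""
    tdoa = np.asarray(tdoa, dtype=float)
    # Deterministic farthest-point seeding: start from frame 0, then repeatedly
    # pick the frame farthest from every centroid chosen so far.
    centroids = [tdoa[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.linalg.norm(tdoa - c, axis=1) for c in centroids], axis=0)
        centroids.append(tdoa[np.argmax(dist)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assign each frame to its nearest centroid.
        d = np.linalg.norm(tdoa[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = tdoa[labels == k].mean(axis=0)
    return labels

def purity(labels, speakers):
    """Fraction of frames that belong to the majority speaker of their cluster."""
    labels = np.asarray(labels)
    speakers = np.asarray(speakers)
    correct = 0
    for k in np.unique(labels):
        _, counts = np.unique(speakers[labels == k], return_counts=True)
        correct += counts.max()
    return correct / len(labels)

# Hypothetical toy meeting: two speakers alternating 50-frame turns (300 frames).
rng = np.random.default_rng(0)
speakers = np.tile(np.repeat([0, 1], 50), 3)
# Assumed mean TDOA (in samples) per speaker for two microphone pairs.
positions = np.array([[-5.0, 3.0], [4.0, -2.0]])
tdoa = positions[speakers] + rng.normal(scale=0.3, size=(len(speakers), 2))

print("uniform purity:", purity(uniform_init(len(speakers), 2), speakers))
print("TDOA purity:   ", purity(tdoa_init(tdoa, 2), speakers))
```

On this toy data each uniform chunk spans turns from both speakers, whereas grouping by TDOA collects frames that come from the same position regardless of when they occur. In a diarization system of the kind described above, the resulting labels would only seed the agglomerative stage, which then operates on cepstral features.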
Citation: Luque, J., Segura, C., Hernando, J. Clustering initialization based on spatial information for speaker diarization of meetings. In: Annual Conference of the International Speech Communication Association. "INTERSPEECH 2008". Brisbane: 2008, p. 383-386.
Publisher version: http://www.lsi.upc.edu/~nlp/papers/hernando_clust.pdf