  • LSTM neural network-based speaker segmentation using acoustic and language modelling 

    India Massana, Miquel Àngel; Rodríguez Fonollosa, José Adrián; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2017)
    Conference lecture
    Open Access
    This paper presents a new speaker change detection system based on Long Short-Term Memory (LSTM) neural networks using acoustic data and linguistic content. Language modelling is combined with two different ...
  • Towards large scale multimedia indexing: a case study on person discovery in broadcast news 

    Le, Nam; Bredin, Herve; Sergent, Gabriel; India Massana, Miquel Àngel; López-Otero, Paula; Barras, Claude; Guinaudeau, Camille; Gravier, Guillaume; Barbosa da Fonseca, Gabriel; Lyon Freire, Izabela; Patrocinio Jr., Zenilton; Jamil F. Guimarães, Silvio; Martí Juan, Gerard; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier; Docio-Fernández, Laura; García-Mateo, Carmen; Meignier, Sylvain; Odobez, Jean-Marc (Association for Computing Machinery (ACM), 2017)
    Conference report
    Restricted access - publisher's policy
    The rapid growth of multimedia databases and the human interest in their peers make indices representing the location and identity of people in audio-visual documents essential for searching archives. Person discovery ...
  • UPC multimodal speaker diarization system for the 2018 Albayzin challenge 

    India Massana, Miquel Àngel; Sagastiberri, Itziar; Palau Puigdevall, Ponç; Sayrol Clols, Elisa; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2018)
    Conference report
    Open Access
    This paper presents the UPC system proposed for the Multimodal Speaker Diarization task of the 2018 Albayzin Challenge. This approach processes the speech and image signals separately. In the speech domain, ...
  • UPC System for the 2015 MediaEval Multimodal Person Discovery in Broadcast TV Task 

    India Massana, Miquel Àngel (Universitat Politècnica de Catalunya, 2015-12-03)
    Master thesis (pre-Bologna period)
    Open Access
    This project describes the system that UPC developed to participate in the Multimodal Person Discovery in Broadcast TV task at MediaEval 2015. The main objective of this task is to answer two questions: Who speaks ...
  • UPC system for the 2016 MediaEval multimodal person discovery in broadcast TV task 

    India Massana, Miquel Àngel; Martí Juan, Gerard; Sayrol Clols, Elisa; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier; Cortillas, Carla; Bouritsas, Giorgos (CEUR-WS.org, 2016)
    Conference lecture
    Open Access
    The UPC system works by extracting monomodal signal segments (face tracks, speech segments) that overlap with the person names overlaid in the video signal. These segments are assigned directly with the name of the person ...