Recent submissions

  • Cross-modal embeddings for video and audio retrieval 

    Surís Coll-Vinent, Dídac; Duarte, Amanda; Salvador Aguilera, Amaia; Torres Viñals, Jordi; Giró Nieto, Xavier (Springer, 2019)
    Conference report
    Open access
    In this work, we explore the multi-modal information provided by the YouTube-8M dataset by projecting the audio and visual features into a common feature space, to obtain joint audio-visual embeddings. These links are used ...
  • Action tube extraction based 3D-CNN for RGB-D action recognition 

    Xu, Zhengyu; Vilaplana Besler, Verónica; Morros Rubió, Josep Ramon (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Open access
    In this paper we propose a novel action tube extractor for RGB-D action recognition in trimmed videos. The action tube extractor takes as input a video and outputs an action tube. The method consists of two parts: spatial ...
  • UPC multimodal speaker diarization system for the 2018 Albayzin challenge 

    India Massana, Miquel Àngel; Sagastiberri, Itziar; Palau Puigdevall, Ponç; Sayrol Clols, Elisa; Morros Rubió, Josep Ramon; Hernando Pericás, Francisco Javier (International Speech Communication Association (ISCA), 2018)
    Conference report
    Open access
    This paper presents the UPC system proposed for the Multimodal Speaker Diarization task of the 2018 Albayzin Challenge. This approach works by processing the speech and the image signals individually. In the speech domain, ...
  • Shared latent structures between imaging features and biomarkers in early stages of Alzheimer's disease 

    Casamitjana Díaz, Adrià; Vilaplana Besler, Verónica; Petrone, Paula; Molinuevo, Jose Luis; Gispert, Juan Domingo (Springer International Publishing, 2018)
    Conference lecture
    Restricted access (publisher's policy)
    In this work, we identify meaningful latent patterns in MR images for patients across the Alzheimer’s disease (AD) continuum. For this purpose, we apply the Projection to Latent Structures (PLS) method using cerebrospinal fluid ...
  • Leishmaniasis parasite segmentation and classification using deep learning 

    Górriz, Marc; Aparicio, Albert; Raventós, Berta; Vilaplana Besler, Verónica; Sayrol Clols, Elisa; López Codina, Daniel (Springer, 2018)
    Conference lecture
    Restricted access (publisher's policy)
    Leishmaniasis is considered a neglected disease that causes thousands of deaths annually in some tropical and subtropical countries. There are various techniques to diagnose leishmaniasis of which manual microscopy is ...
  • Monte-Carlo sampling applied to multiple instance learning for histological image classification 

    Combalia, Marc; Vilaplana Besler, Verónica (Springer, 2018)
    Conference lecture
    Restricted access (publisher's policy)
    We propose a patch sampling strategy based on a sequential Monte-Carlo method for high resolution image classification in the context of Multiple Instance Learning. When compared with grid sampling and uniform sampling ...
  • Monte-Carlo sampling applied to multiple instance learning for whole slide image classification 

    Combalia, Marc; Vilaplana Besler, Verónica (2018)
    Conference lecture
    Open access
    In this paper we propose a patch sampling strategy based on sequential Monte-Carlo methods for Whole Slide Image classification in the context of Multiple Instance Learning and show its capability to achieve high generalization ...
  • Brain MRI super-resolution using generative adversarial networks 

    Sánchez, Irina; Vilaplana Besler, Verónica (2018)
    Conference lecture
    Open access
    In this work we propose an adversarial learning approach to generate high resolution MRI scans from low resolution images. The architecture, based on the SRGAN model, adopts 3D convolutions to exploit volumetric information. ...
  • LaViCAD: Laboratorio Virtual de Comunicaciones Analógicas y Digitales 

    Cabrera-Bean, Margarita; Fernández Prades, Carles; Vargas Berzosa, Carlos; Vargas Berzosa, Francisco; Fernández Rubio, Juan Antonio; Gasull Llampallas, Antoni (Universitat de Barcelona. Edicions i Publicacions, 2006)
    Conference report
    Open access
    This work presents both the design and the experimental use of a Virtual Laboratory of Analog and Digital Communications: LaViCAD. With LaViCAD, different applications can be tried out and verified ...
  • Online detection of action start in untrimmed, streaming videos 

    Shou, Zheng; Pan, Junting; Chan, Jonathan; Miyazawa, Kazuyuki; Mansour, Hassan; Vetro, Anthony; Giró Nieto, Xavier; Chang, Shih-Fu (Springer, 2018)
    Conference lecture
    Restricted access (publisher's policy)
    We aim to tackle a novel task in action detection - Online Detection of Action Start (ODAS) in untrimmed, streaming videos. The goal of ODAS is to detect the start of an action instance, with high categorization accuracy ...
  • Demonstration of an open source framework for qualitative evaluation of CBIR systems 

    Gomez Duran, Paula; Mohedano, Eva; McGuinness, Kevin; Giró Nieto, Xavier; O'Connor, Noel (Association for Computing Machinery (ACM), 2018)
    Conference lecture
    Open access
    Evaluating image retrieval systems in a quantitative way, for example by computing measures like mean average precision, allows for objective comparisons with a ground-truth. However, in cases where ground-truth is not ...
  • Hybridnet for depth estimation and semantic segmentation 

    Sánchez Escobedo, Dalila; Lin, Xiao; Casas Rius, Joan Ramon; Pardàs Feliu, Montse (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Restricted access (publisher's policy)
    Semantic segmentation and depth estimation are two important tasks in the area of image processing. Traditionally, these two tasks are addressed in an independent manner. However, for those applications where geometric and ...
