Recent submissions

Audience measurement using a top-view camera and oriented trajectories

López Palma, Manuel; Gago Barrio, Javier; Corbalán Fuertes, Montserrat; Morros Rubió, Josep Ramon (2019)
Conference report
Restricted access (publisher's policy)

A crucial aspect for selecting optimal areas for commercial advertising is the probability with which that publicity will be seen. This paper presents a method based on top-view camera measurement, where the probability ...

VLX-Stories: building an online Event Knowledge Base with Emerging Entity detection

Fernández Cañellas, Dèlia; Espadaler, Joan; Rodríguez, David; Garolera, Blai; Canet Tarrés, Gemma; Colom Serra, Aleix; Rimmek, Joan Marco; Giró Nieto, Xavier; Bou Balust, Elisenda; Riveiro, Juan Carlos (Springer, 2019)
Conference lecture
Restricted access (publisher's policy)

We present an online multilingual system for event detection and comprehension from media feeds. The system retrieves information from news sites, aggregates them into events (event detection), and summarizes them by ...

Budget-aware semi-supervised semantic and instance segmentation

Bellver Bueno, Míriam; Salvador Aguilera, Amaia; Torres Viñals, Jordi; Giró Nieto, Xavier (2019)
Conference lecture
Open access

Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms ...

Residual attention graph convolutional network for geometric 3D scene classification

Mosella Montoro, Albert; Ruiz Hidalgo, Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference report
Restricted access (publisher's policy)

Geometric 3D scene classification is a very challenging task. Current methodologies extract the geometric information using only a depth channel provided by an RGB-D sensor. These kinds of methodologies introduce possible ...

VLX-Stories: a semantically linked event platform for media publishers

Fernández Cañellas, Dèlia; Espadaler, Joan; Garolera, Blai; Rodríguez, David; Canet, Gemma; Colom, Aleix; Rimmek, Joan Marco; Giró Nieto, Xavier; Bou Balust, Elisenda; Riveiro, Juan Carlos (CEUR-WS.org, 2019)
Conference lecture
Open access

In recent years, video sharing on social media from different video recording devices has resulted in an exponential growth of videos on the Internet. Such video data is continuously increasing with daily recordings ...

Hyperparameter-free losses for model-based monocular reconstruction

Ramon Maldonado, Eduard; Ruiz, Guillermo; Batard, Thomas; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference lecture
Open access

This work proposes novel hyperparameter-free losses for single view 3D reconstruction with morphable models (3DMM). We dispense with the hyperparameters used in other works by exploiting geometry, so that the shape of the ...

Picking groups instead of samples: a close look at Static Pool-based Meta-Active Learning

Mas Méndez, Ignasi; Morros Rubió, Josep Ramon; Vilaplana Besler, Verónica (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference lecture
Open access

Active Learning techniques are used to tackle learning problems where obtaining training labels is costly. In this work we use Meta-Active Learning to learn to select a subset of samples from a pool of unsupervised input ...

Simple vs complex temporal recurrences for video saliency prediction

Linardos, Panagiotis; Mohedano, Eva; Nieto, Juan Jose; O'Connor, Noel; Giró Nieto, Xavier; McGuinness, Kevin (2019)
Conference lecture
Restricted access (publisher's policy)

This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain. The first modification is the ...

Video object linguistic grounding

Herrera-Palacio, Alba; Ventura, Carles; Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2019)
Conference lecture
Restricted access (publisher's policy)

The goal of this work is to segment, in a video sequence, the objects mentioned in a linguistic description of the scene. We have adapted an existing deep neural network that achieves state-of-the-art performance ...

Multi-view 3D face reconstruction in the wild using siamese networks

Ramon, Eduard; Escur, Janna; Giró Nieto, Xavier (Computer Vision Foundation, 2019)
Conference report
Open access

In this work, we present a novel learning-based approach to reconstruct 3D faces from a single or multiple images. Our method uses a simple yet powerful architecture based on siamese neural networks that helps to extract ...

Digitally stained confocal microscopy through deep learning

Combalia Escudero, Marc; Pérez Ankar, Javiera; García Herrera, Adriana; Alos, Llúcia; Vilaplana Besler, Verónica; Marqués Acosta, Fernando; Puig, Susana; Malvehy, Josep (Microtome Publishing, 2019)
Conference report
Open access

Specialists have used confocal microscopy in the ex-vivo modality to identify Basal Cell Carcinoma tumors with an overall sensitivity of 96.6% and specificity of 89.2% (Chung et al., 2004). However, this technology hasn’t ...

Wav2Pix: speech-conditioned face generation using generative adversarial networks

Cardoso Duarte, Amanda; Roldan, Francisco; Tubau, Miquel; Escur, Janna; Pascual de la Puente, Santiago; Salvador Aguilera, Amaia; Mohedano, Eva; McGuinness, Kevin; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference lecture
Restricted access (publisher's policy)

Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a ...

UPCommons. UPC's open knowledge portal
