Recent Submissions

  • Channel-wise early stopping without a validation set via NNK polytope interpolation 

    Bonet Solé, David; Ortega, Antonio; Ruiz Hidalgo, Javier; Shekkizhar, Sarath (2021)
    Conference report
    Open Access
    State-of-the-art neural network architectures continue to scale in size and deliver impressive generalization results, although this comes at the expense of limited interpretability. In particular, a key challenge is to ...
  • H3D-Net: Few-shot high-fidelity 3D head reconstruction 

    Ramon Maldonado, Eduard; Triginer Garcés, Gil; Escurt i Gelabert, Janna; Pumarola Peris, Albert; García Giráldez, Jaime; Giró Nieto, Xavier; Moreno-Noguer, Francesc (Computer Vision Foundation, 2021)
    Conference lecture
    Open Access
    Recent learning approaches that implicitly represent surface geometry using coordinate-based neural representations have shown impressive results in the problem of multi-view 3D reconstruction. The effectiveness of these ...
  • Seasonal contrast: Unsupervised pre-training from uncurated remote sensing data 

    Mañas Sánchez, Óscar; Lacoste, Alexandre; Giró Nieto, Xavier; Vázquez Bermúdez, David; Rodriguez López, Pau (Computer Vision Foundation, 2021)
    Conference lecture
    Open Access
    Remote sensing and automatic earth monitoring are key to solving global-scale challenges such as disaster prevention, land use monitoring, or tackling climate change. Although there exist vast amounts of remote sensing data, ...
  • How2Sign: A large-scale multimodal dataset for continuous American sign language 

    Cardoso Duarte, Amanda; Palaskar, Shruti; Ventura Ripol, Lucas; Ghadiyaram, Deepti; DeHaan, Kenneth; Metze, Florian; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2021)
    Conference lecture
    Open Access
    One of the factors that have hindered progress in the areas of sign language recognition, translation, and production is the absence of large annotated datasets. Towards this end, we introduce How2Sign, a multimodal and ...
  • Refinement network for unsupervised on the scene foreground segmentation 

    Pardàs Feliu, Montse; Canet Tarrés, Gemma (European Association for Signal Processing (EURASIP), 2020)
    Conference report
    Open Access
    Unsupervised learning represents one of the most interesting challenges in computer vision today. The task has an immense practical value with many applications in artificial intelligence and emerging technologies, as large ...
  • Explore, discover and learn: unsupervised discovery of state-covering skills 

    Campos Camúñez, Víctor; Trott, Alex; Xiong, Caiming; Socher, Richard; Giró Nieto, Xavier; Torres Viñals, Jordi (2020)
    Conference lecture
    Open Access
    Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between ...
  • Weakly supervised semantic segmentation for remote sensing hyperspectral imaging 

    Moliner, Eloi; Salgueiro Romero, Luis Fernando; Vilaplana Besler, Verónica (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference lecture
    Restricted access - publisher's policy
    This paper studies the problem of training a semantic segmentation neural network with weak annotations, in order to be applied in aerial vegetation images from Teide National Park. It proposes a Deep Seeded Region Growing ...
  • One perceptron to rule them all: language, vision, audio and speech 

    Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2020)
    Conference lecture
    Restricted access - publisher's policy
    Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language, vision and speech. Image captioning, lip reading or video sonorization are ...
  • Automatic reminiscence therapy for dementia 

    Carós, Mariona; Garolera Freixa, Maite; Radeva, Petia; Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2020)
    Conference lecture
    Restricted access - publisher's policy
    With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million ...
  • Audience measurement using a top-view camera and oriented trajectories 

    López Palma, Manuel; Gago Barrio, Javier; Corbalán Fuertes, Montserrat; Morros Rubió, Josep Ramon (2019)
    Conference report
    Restricted access - publisher's policy
    A crucial aspect for selecting optimal areas for commercial advertising is the probability with which that publicity will be seen. This paper presents a method based on top-view camera measurement, where the probability ...
  • VLX-Stories: building an online Event Knowledge Base with Emerging Entity detection 

    Fernández Cañellas, Dèlia; Espadaler, Joan; Rodríguez, David; Garolera, Blai; Canet Tarrés, Gemma; Colom Serra, Aleix; Rimmek, Joan Marco; Giró Nieto, Xavier; Bou Balust, Elisenda; Riveiro, Juan Carlos (Springer, 2019)
    Conference lecture
    Restricted access - publisher's policy
    We present an online multilingual system for event detection and comprehension from media feeds. The system retrieves information from news sites, aggregates them into events (event detection), and summarizes them by ...
  • Budget-aware semi-supervised semantic and instance segmentation 

    Bellver Bueno, Míriam; Salvador Aguilera, Amaia; Torres Viñals, Jordi; Giró Nieto, Xavier (2019)
    Conference lecture
    Open Access
    Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms ...
