The GPI does research on image and video processing for representing, coding, indexing and analysing visual content. The group's expertise in morphology and segmentation has been the basis for contributions to the ISO standards MPEG-4 and MPEG-7. Its research on image analysis has allowed it to participate in European projects since 1992, including the programs RACE (Morpheco, coord.), ACTS (MAVT, MoMuSys, Vidas) and IST (Diceman, Hypermedia, INTERFACE, ADViSOR, MASCOT, FAETHON), the networks of excellence SCHEMA, SIMILAR and MUSCLE, and the integrated FP6 and FP7 projects CHIL and FASCINATE, respectively. The group has built two "smart rooms" at the UPC's Campus Nord and has contributed to visual analysis for interaction, as well as to biomedical imaging and remote sensing applications. It has signed research agreements with companies such as Philips (Paris), France Telecom (Rennes), NXP (Netherlands), Thomson (Princeton, USA) and Alterface (Belgium), and with national organizations such as Telefónica, CCRTV, MediaPro, Fundació CELLEX, Hospital Clínic, AD Telecom and Abertis.

Recent Submissions

  • One perceptron to rule them all: language, vision, audio and speech 

    Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2020)
    Conference lecture
    Restricted access - publisher's policy
    Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language, vision and speech. Image captioning, lip reading or video sonorization are ...
  • Automatic reminiscence therapy for dementia 

    Carós, Mariona; Garolera Freixa, Maite; Radeva, Petia; Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2020)
    Conference lecture
    Restricted access - publisher's policy
    With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million ...
  • Audience measurement using a top-view camera and oriented trajectories 

    López Palma, Manuel; Gago Barrio, Javier; Corbalán Fuertes, Montserrat; Morros Rubió, Josep Ramon (2019)
    Conference report
    Restricted access - publisher's policy
    A crucial aspect for selecting optimal areas for commercial advertising is the probability with which that publicity will be seen. This paper presents a method based on top-view camera measurement, where the probability ...
  • Geometric model and calibration method for a solid-state LiDAR 

    García Gómez, Pablo; Royo Royo, Santiago; Rodrigo Arcay, Noel; Casas Pla, Josep Ramon (Multidisciplinary Digital Publishing Institute (MDPI), 2020-05-20)
    Article
    Open Access
    This paper presents a novel calibration method for solid-state LiDAR devices based on a geometrical description of their scanning system, which has variable angular resolution. Determining this distortion across the entire ...
  • VLX-Stories: building an online Event Knowledge Base with Emerging Entity detection 

    Fernández Cañellas, Dèlia; Espadaler, Joan; Rodríguez, David; Garolera, Blai; Canet Tarrés, Gemma; Colom Serra, Aleix; Rimmek, Joan Marco; Giró Nieto, Xavier; Bou Balust, Elisenda; Riveiro, Juan Carlos (Springer, 2019)
    Conference lecture
    Restricted access - publisher's policy
    We present an online multilingual system for event detection and comprehension from media feeds. The system retrieves information from news sites, aggregates them into events (event detection), and summarizes them by ...
  • Budget-aware semi-supervised semantic and instance segmentation 

    Bellver Bueno, Míriam; Salvador Aguilera, Amaia; Torres Viñals, Jordi; Giró Nieto, Xavier (2019)
    Conference lecture
    Open Access
    Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms ...
  • Residual attention graph convolutional network for geometric 3D scene classification 

    Mosella Montoro, Albert; Ruiz Hidalgo, Javier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference report
    Restricted access - publisher's policy
    Geometric 3D scene classification is a very challenging task. Current methodologies extract the geometric information using only a depth channel provided by an RGB-D sensor. These kinds of methodologies introduce possible ...
  • VLX-Stories: a semantically linked event platform for media publishers 

    Fernández Cañellas, Dèlia; Espadaler, Joan; Garolera, Blai; Rodríguez, David; Canet, Gemma; Colom, Aleix; Rimmek, Joan Marco; Giró Nieto, Xavier; Bou Balust, Elisenda; Riveiro, Juan Carlos (CEUR-WS.org, 2019)
    Conference lecture
    Open Access
    In recent years, video sharing in social media from different video recording devices has resulted in an exponential growth of videos on the Internet. Such video data is continuously increasing with daily recordings ...
  • Hyperparameter-free losses for model-based monocular reconstruction 

    Ramon Maldonado, Eduard; Ruiz, Guillermo; Batard, Thomas; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Open Access
    This work proposes novel hyperparameter-free losses for single view 3D reconstruction with morphable models (3DMM). We dispense with the hyperparameters used in other works by exploiting geometry, so that the shape of the ...
  • Picking groups instead of samples: a close look at Static Pool-based Meta-Active Learning 

    Mas Méndez, Ignasi; Morros Rubió, Josep Ramon; Vilaplana Besler, Verónica (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Open Access
    Active Learning techniques are used to tackle learning problems where obtaining training labels is costly. In this work we use Meta-Active Learning to learn to select a subset of samples from a pool of unsupervised input ...
  • Simple vs complex temporal recurrences for video saliency prediction 

    Linardos, Panagiotis; Mohedano, Eva; Nieto, Juan Jose; O'Connor, Noel; Giró Nieto, Xavier; McGuinness, Kevin (2019)
    Conference lecture
    Restricted access - publisher's policy
    This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain. The first modification is the ...
  • Unsupervised GRN Ensemble 

    Bellot, Pau; Salembier Clairon, Philippe Jean; Pham, Ngoc C.; Meyer, Patrick E. (Springer, 2019)
    Part of book or chapter of book
    Restricted access - publisher's policy
    Inferring gene regulatory networks from expression data is a very challenging problem that has raised the interest of the scientific community. Different algorithms have been proposed to try to solve this issue, but it has ...
