
  • Assessment of crowdsourcing and gamification loss in user-assisted object segmentation 

    Carlier, Axel; Salvador Aguilera, Amaia; Cabezas, Ferran; Giró Nieto, Xavier; Charvillat, Vincent; Marques, Oge (2015-09-12)
    Article
    Open Access
    There has been a growing interest in applying human computation – particularly crowdsourcing techniques – to assist in the solution of multimedia, image processing, and computer vision problems which are still too difficult ...
  • Bags of local convolutional features for scalable instance search 

    Mohedano, Eva; Salvador Aguilera, Amaia; McGuinness, Kevin; Marqués Acosta, Fernando; O'Connor, Noel; Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2016)
    Conference lecture
    Open Access
    This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of CNN using the bag of words aggregation scheme (BoW). Assigning each local array of activations in a convolutional layer ...
  • Click'n'Cut: crowdsourced interactive segmentation with object candidates 

    Carlier, Axel; Charvillat, Vincent; Salvador Aguilera, Amaia; Giró Nieto, Xavier; Marques, Ogé (ACM, 2014)
    Conference lecture
    Open Access
    This paper introduces Click’n’Cut, a novel web tool for interactive object segmentation designed for crowdsourcing tasks. Click’n’Cut combines bounding boxes and clicks generated by workers to obtain accurate object ...
  • Computer vision beyond the visible : image understanding through language 

    Salvador Aguilera, Amaia (Universitat Politècnica de Catalunya, 2019-06-27)
    Doctoral thesis
    Open Access
    In the past decade, deep neural networks have revolutionized computer vision. High performing deep neural architectures trained for visual recognition tasks have pushed the field towards methods relying on learned image ...
  • Cross-modal embeddings for video and audio retrieval 

    Surís Coll-Vinent, Dídac; Duarte, Amanda; Salvador Aguilera, Amaia; Torres Viñals, Jordi; Giró Nieto, Xavier (Springer, 2019)
    Conference report
    Open Access
    In this work, we explore the multi-modal information provided by the Youtube-8M dataset by projecting the audio and visual features into a common feature space, to obtain joint audio-visual embeddings. These links are used ...
  • Crowdsourced object segmentation with a game 

    Salvador Aguilera, Amaia; Carlier, Axel; Giró Nieto, Xavier; Marques, Oge; Charvillat, Vincent (2013)
    Conference report
    Open Access
    We introduce a new algorithm for image segmentation based on crowdsourcing through a game: Ask'nSeek. The game provides information on the objects of an image, in the form of clicks that are either on the object, ...
  • Cultural event recognition with visual ConvNets and temporal models 

    Salvador Aguilera, Amaia; Manchon Vizuete, Daniel; Calafell, Andrea; Giró Nieto, Xavier; Zeppelzauer, Matthias (2015)
    Conference lecture
    Open Access
    This paper presents our contribution to the ChaLearn Challenge 2015 on Cultural Event Classification. The challenge in this task is to automatically classify images from 50 different cultural events. Our solution is based ...
  • Diving deep into sentiment: understanding fine-tuned CNNs for visual sentiment prediction 

    Campos Camúñez, Victor; Salvador Aguilera, Amaia; Jou, Brendan; Giró Nieto, Xavier (Association for Computing Machinery (ACM), 2015)
    Conference lecture
    Open Access
    Visual media are powerful means of expressing emotions and sentiments. The constant generation of new content in social networks highlights the need of automated visual sentiment analysis tools. While Convolutional Neural ...
  • Exploring EEG for object detection and retrieval 

    Mohedano Robles, Eva; Salvador Aguilera, Amaia; Porta, Sergi; Giró Nieto, Xavier; Healy, Graham; McGuinness, Kevin; O'Connor, Noel; Smeaton, Alan F. (Association for Computing Machinery (ACM), 2015)
    Conference lecture
    Open Access
    This paper explores the potential for using Brain Computer Interfaces (BCI) as a relevance feedback mechanism in content-based image retrieval. Several experiments are performed using a rapid serial visual presentation ...
  • Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks 

    McGuinness, Kevin; Mohedano, Eva; Zhang, ZhenXing; Hu, Feiyan; Abatal, Rami; Gurrin, Cathal; O'Connor, Noel; Smeaton, Alan F.; Salvador Aguilera, Amaia; Giró Nieto, Xavier; Ventura, Carles (2014)
    Conference report
    Open Access
    Insight-DCU participated in the instance search (INS) and semantic indexing (SIN) tasks in 2014. Two very different approaches were submitted for instance search, one based on features extracted using pre-trained deep ...
  • Insight DCU at TRECVID 2015 

    McGuinness, Kevin; Mohedano, Eva; Salvador Aguilera, Amaia; Zhan, Zhenxing; Marsden, Mark; Wang, Peng; Jargalsaikhan, Iveel; Antony, Joseph; Giró Nieto, Xavier; Satoh, Shin'ichi; O'Connor, Noel; Smeaton, Alan F. (2015)
    Conference lecture
    Restricted access - publisher's policy
    Insight-DCU participated in the instance search (INS), semantic indexing (SIN), and localization tasks (LOC) this year. In the INS task we used deep convolutional network features trained on external data and the query ...
  • Inverse cooking: recipe generation from food images 

    Salvador Aguilera, Amaia; Drozdzal, Michal; Giró Nieto, Xavier; Romero, Adriana (Computer Vision Foundation, 2019)
    Conference report
    Open Access
    People enjoy food photography because they appreciate food. Behind each meal there is a story described in a complex recipe and, unfortunately, by simply looking at a food image we do not have access to its preparation ...
  • NII-HITACHI-UIT at TRECVID 2015 instance search 

    Nguyen, Vinh-Tiep; Le, Duy-Dinh; Salvador Aguilera, Amaia; Zu, Caizhi; Nguyen, Dinh-Luan; Tran, Minh-Triet; Duc, Thanh-Ngo; Duong, Duc-Anh; Satoh, Shin'ichi; Giró Nieto, Xavier (2015)
    Conference lecture
    Restricted access - publisher's policy
    In this paper, we propose two methods to improve last year's instance search framework. Both of them are based on a post-processing scheme that tries to rerank the top K shots returned from the BOW model. The first system is to propose a ...
  • Object retrieval with deep convolutional features 

    Mohedano, Eva; Salvador Aguilera, Amaia; McGuinness, Kevin; Giró Nieto, Xavier; O'Connor, Noel; Marqués Acosta, Fernando (IOS Press, 2017-11-23)
    Part of book or chapter of book
    Restricted access - publisher's policy
    Deep learning and image processing are two areas of great interest to academics and industry professionals alike. The areas of application of these two disciplines range widely, encompassing fields such as medicine, robotics, ...
  • Recurrent semantic instance semantic segmentation 

    Bellver, Míriam; Salvador Aguilera, Amaia; Campos, Víctor; Marqués Acosta, Fernando; Giró Nieto, Xavier; Torres Viñals, Jordi (Barcelona Supercomputing Center, 2018-04-24)
    Conference report
    Open Access
  • RVOS: end-to-end recurrent network for video object segmentation 

    Ventura, Carles; Bellver, Míriam; Girbau, Andreu; Salvador Aguilera, Amaia; Marqués Acosta, Fernando; Giró Nieto, Xavier (Computer Vision Foundation, 2019)
    Conference lecture
    Open Access
    Multiple object video object segmentation is a challenging task, especially for the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence. ...
  • Temporal activity detection in untrimmed videos with recurrent neural networks 

    Montes, Alberto; Salvador Aguilera, Amaia; Pascual, Santiago; Giró Nieto, Xavier (2016)
    Conference lecture
    Open Access
    This work proposes a simple pipeline to classify and temporally localize activities in untrimmed videos. Our system uses features from a 3D Convolutional Neural Network (C3D) as input to train a recurrent neural network ...
  • Wav2Pix: speech-conditioned face generation using generative adversarial networks 

    Cardoso Duarte, Amanda; Roldan, Francisco; Tubau, Miquel; Escur, Janna; Pascual de la Puente, Santiago; Salvador Aguilera, Amaia; Mohedano, Eva; McGuinness, Kevin; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Restricted access - publisher's policy
    Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a ...