Full-network embedding in a multimodal embedding pipeline

Cite as: hdl:2117/192716
Document type: Conference lecture
Defense date: 2017
Publisher: Association for Computational Linguistics
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution 4.0 International
Abstract
The current state of the art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this paper we evaluate the impact of using the Full-Network embedding in this setting, replacing the original image representation in a competitive multimodal embedding generation scheme. Unlike the one-layer image embeddings used by most approaches, the Full-Network embedding provides a multi-scale representation of images, which results in richer characterizations. To measure the influence of the Full-Network embedding, we evaluate its performance on three different datasets, and compare the results with the original multimodal embedding generation scheme when using a one-layer image embedding, and with the rest of the state of the art. Results for image annotation and image retrieval tasks indicate that the Full-Network embedding is consistently superior to the one-layer embedding. These results motivate the integration of the Full-Network embedding into any multimodal embedding generation scheme, something feasible thanks to the flexibility of the approach.
Citation: Vilalta, A. [et al.]. Full-network embedding in a multimodal embedding pipeline. In: Workshop on Semantic Deep Learning. "Proceedings of the 2nd Workshop on Semantic Deep Learning (SemDeep-2): September 19, 2017, Montpellier, France". Stroudsburg, PA: Association for Computational Linguistics, 2017, p. 24-32.
Publisher version: https://www.aclweb.org/anthology/W17-7304/
Other identifiers: https://arxiv.org/abs/1707.09872
Collections
- Departament de Ciències de la Computació - Conference papers/presentations
- Computer Sciences - Conference papers/presentations
- KEMLG - Grup d'Enginyeria del Coneixement i Aprenentatge Automàtic - Conference papers/presentations
- CAP - Grup de Computació d'Altes Prestacions - Conference papers/presentations
- Doctorat en Intel·ligència Artificial - Conference papers/presentations
- Departament d'Arquitectura de Computadors - Conference papers/presentations
Files | Description | Size | Format |
---|---|---|---|
W17-7304.pdf | | 230.9 KB | PDF |