Full-network embedding in a multimodal embedding pipeline

Carregant...
Miniatura
El pots comprar en digital a:
El pots comprar en paper a:

Projectes de recerca

Unitats organitzatives

Número de la revista

Títol de la revista

ISSN de la revista

Títol del volum

Col·laborador

Editor

Tribunal avaluador

Realitzat a/amb

Tipus de document

Comunicació de congrés

Data publicació

Editor

Association for Computational Linguistics

Condicions d'accés

Accés obert

item.page.rightslicense

Creative Commons
Aquesta obra està protegida pels drets de propietat intel·lectual i industrial corresponents. Llevat que s'hi indiqui el contrari, els seus continguts estan subjectes a la llicència de Creative Commons: Reconeixement 4.0 Internacional

Assignatures relacionades

Assignatures relacionades

Publicacions relacionades

Datasets relacionats

Datasets relacionats

Projecte CCD

Abstract

The current state-of-the-art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this paper we evaluate the impact of using the Full-Network embedding in this setting, replacing the original image representation in a competitive multimodal embedding generation scheme. Unlike the one-layer image embeddings typically used by most approaches, the Full-Network embedding provides a multi-scale representation of images, which results in richer characterizations. To measure the influence of the Full-Network embedding, we evaluate its performance on three different datasets, and compare the results with the original multimodal embedding generation scheme when using a one-layer image embedding, and with the rest of the state-of-the-art. Results for image annotation and image retrieval tasks indicate that the Full-Network embedding is consistently superior to the one-layer embedding. These results motivate the integration of the Full-Network embedding on any multimodal embedding generation scheme, something feasible thanks to the flexibility of the approach

Descripció

Persones/entitats

Document relacionat

Versió de

Citació

Vilalta, A. [et al.]. Full-network embedding in a multimodal embedding pipeline. A: Workshop on Semantic Deep Learning. "Proceedings of the 2nd Workshop on Semantic Deep Learning (SemDeep-2): September 19, 2017, Montpellier, France". Stroudsburg, PA: Association for Computational Linguistics, 2017, p. 24-32.

Ajut

Forma part

DOI

Dipòsit legal

ISBN

ISSN

Referències