Combining geometric, textual and visual features for predicting prepositions in image descriptions
Document type: Conference paper
Publication date: 2016
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license:
Attribution-NonCommercial-NoDerivs 3.0 Spain
Abstract
We investigate the role that geometric, textual and visual features play in the task of predicting a preposition that links two visual entities depicted in an image. The task is an important part of the subsequent process of generating image descriptions. We explore the prediction of prepositions for a pair of entities, both when the labels of those entities are known and when they are unknown. In all settings we found clear evidence that all three feature types contribute to the prediction task.
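As a hypothetical illustration of the task described in the abstract (not the paper's actual model or features), the setup can be sketched as concatenating geometric, textual and visual feature vectors for an entity pair and classifying the result into a preposition class; here a toy nearest-centroid classifier over random data stands in for whatever learner is used:

```python
import numpy as np

# Hypothetical sketch: predict a preposition linking two image entities by
# concatenating three feature groups -- geometric (e.g., relative box
# position/size), textual (e.g., label embeddings), visual (e.g., appearance
# features) -- and classifying with a simple nearest-centroid rule.
# Feature dimensions, class set, and data below are illustrative assumptions.

PREPOSITIONS = ["on", "under", "next to"]
rng = np.random.default_rng(0)

def make_pair_features(geometric, textual, visual):
    """Concatenate the three feature groups into one pair vector."""
    return np.concatenate([geometric, textual, visual])

# Toy training set: 30 pairs with 4 geometric + 8 textual + 8 visual dims.
X = rng.normal(size=(30, 20))
y = rng.integers(0, len(PREPOSITIONS), size=30)

# One centroid per preposition class.
centroids = np.stack([X[y == k].mean(axis=0) for k in range(len(PREPOSITIONS))])

def predict_preposition(geometric, textual, visual):
    x = make_pair_features(geometric, textual, visual)
    dists = np.linalg.norm(centroids - x, axis=1)
    return PREPOSITIONS[int(dists.argmin())]

pred = predict_preposition(rng.normal(size=4), rng.normal(size=8), rng.normal(size=8))
print(pred)
```

The interesting question the paper studies is how much each of the three feature groups contributes, which in a sketch like this would correspond to ablating one group at a time from the concatenation.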
Citation: Ramisa, A., Wang, J., Lu, Y., Dellandrea, E., Moreno-Noguer, F., Gaizauskas, R. Combining geometric, textual and visual features for predicting prepositions in image descriptions. In: Joint SIGDAT Conference on Empirical Methods in Natural Language Processing. "Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing". Lisbon: 2015, p. 214-220.
Publisher's version: http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP022.pdf
Files | Description | Size | Format | View
---|---|---|---|---
1693-Combining- ... -in-Image-Descriptions.pdf | | 653.8 KB | | View/Open