Semantic tuples for evaluation of image sentence generation
dc.contributor.author | Ellebracht, Lily Delores |
dc.contributor.author | Ramisa Ayats, Arnau |
dc.contributor.author | Shantharam Madhyastha, Pranava Swaroop |
dc.contributor.author | Cordero Rama, Jose Alejandro |
dc.contributor.author | Moreno-Noguer, Francesc |
dc.contributor.author | Quattoni, Ariadna Julieta |
dc.contributor.other | Institut de Robòtica i Informàtica Industrial |
dc.date.accessioned | 2016-03-15T15:21:34Z |
dc.date.available | 2016-03-15T15:21:34Z |
dc.date.issued | 2015 |
dc.identifier.citation | Ellebracht, L., Ramisa, A., Shantharam, P., Cordero, J., Moreno-Noguer, F., Quattoni, A. Semantic tuples for evaluation of image sentence generation. In: Workshop on Vision and Language. "Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon". Lisboa: 2015, p. 18-28. |
dc.identifier.uri | http://hdl.handle.net/2117/84419 |
dc.description.abstract | The automatic generation of image captions has received considerable attention. The problem of evaluating caption generation systems, though, has been much less explored. We propose a novel evaluation approach based on comparing the underlying visual semantics of the candidate and ground-truth captions. With this goal in mind, we have defined a semantic representation for visually descriptive language and have augmented a subset of the Flickr-8K dataset with semantic annotations. Our evaluation metric (BAST) can be used not only to compare systems but also to perform error analysis and get a better understanding of the types of mistakes a system makes. To compute BAST we need to predict the semantic representation for the automatically generated captions. We use the Flickr-ST dataset to train classifiers that predict semantic tuples (STs) so that evaluation can be fully automated. |
dc.format.extent | 11 p. |
dc.language.iso | eng |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject | UPC subject areas::Computer science::Automation and control |
dc.subject.other | computer vision |
dc.subject.other | natural language processing |
dc.title | Semantic tuples for evaluation of image sentence generation |
dc.type | Conference report |
dc.contributor.group | Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI |
dc.contributor.group | Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural |
dc.description.peerreviewed | Peer Reviewed |
dc.subject.inspec | INSPEC classification::Pattern recognition::Computer vision |
dc.relation.publisherversion | https://www.cs.cmu.edu/~ark/EMNLP-2015/proceedings/VL/pdf/VL06.pdf |
dc.rights.access | Open Access |
local.identifier.drac | 17548529 |
dc.description.version | Postprint (author's final draft) |
local.citation.author | Ellebracht, L.; Ramisa, A.; Shantharam, P.; Cordero, J.; Moreno-Noguer, F.; Quattoni, A. |
local.citation.contributor | Workshop on Vision and Language |
local.citation.pubplace | Lisboa |
local.citation.publicationName | Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon. |
local.citation.startingPage | 18 |
local.citation.endingPage | 28 |