dc.contributor.author: Ellebracht, Lily Delores
dc.contributor.author: Ramisa Ayats, Arnau
dc.contributor.author: Shantharam Madhyastha, Pranava Swaroop
dc.contributor.author: Cordero Rama, Jose Alejandro
dc.contributor.author: Moreno-Noguer, Francesc
dc.contributor.author: Quattoni, Ariadna Julieta
dc.contributor.other: Institut de Robòtica i Informàtica Industrial
dc.identifier.citation: Ellebracht, L., Ramisa, A., Shantharam, P., Cordero, J., Moreno-Noguer, F., Quattoni, A. Semantic tuples for evaluation of image sentence generation. In: Workshop on Vision and Language. "Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon". Lisbon: 2015, p. 18-28.
dc.description.abstract: The automatic generation of image captions has received considerable attention. The problem of evaluating caption generation systems, though, has been explored far less. We propose a novel evaluation approach based on comparing the underlying visual semantics of the candidate and ground-truth captions. With this goal in mind, we have defined a semantic representation for visually descriptive language and have augmented a subset of the Flickr-8K dataset with semantic annotations. Our evaluation metric (BAST) can be used not only to compare systems but also to perform error analysis and better understand the types of mistakes a system makes. To compute BAST, we need to predict the semantic representation of the automatically generated captions. We use the Flickr-ST dataset to train classifiers that predict semantic tuples (STs) so that evaluation can be fully automated.
dc.format.extent: 11 p.
dc.subject: UPC subject areas::Computer science::Automation and control
dc.subject.other: computer vision
dc.subject.other: natural language processing
dc.title: Semantic tuples for evaluation of image sentence generation
dc.type: Conference report
dc.contributor.group: Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.description.peerreviewed: Peer Reviewed
dc.subject.inspec: INSPEC classification::Pattern recognition::Computer vision
dc.rights.access: Open Access
dc.description.version: Postprint (author's final draft)
local.citation.author: Ellebracht, L.; Ramisa, A.; Shantharam, P.; Cordero, J.; Moreno-Noguer, F.; Quattoni, A.
local.citation.contributor: Workshop on Vision and Language
local.citation.publicationName: Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon.

Attribution-NonCommercial-NoDerivs 3.0 Spain
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain