
dc.contributor.author: Ellebracht, Lily Delores
dc.contributor.author: Ramisa Ayats, Arnau
dc.contributor.author: Shantharam Madhyastha, Pranava Swaroop
dc.contributor.author: Cordero Rama, Jose Alejandro
dc.contributor.author: Moreno-Noguer, Francesc
dc.contributor.author: Quattoni, Ariadna Julieta
dc.contributor.other: Institut de Robòtica i Informàtica Industrial
dc.date.accessioned: 2016-03-15T15:21:34Z
dc.date.available: 2016-03-15T15:21:34Z
dc.date.issued: 2015
dc.identifier.citation: Ellebracht, L., Ramisa, A., Shantharam, P., Cordero, J., Moreno-Noguer, F., Quattoni, A. Semantic tuples for evaluation of image sentence generation. In: Workshop on Vision and Language. "Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon". Lisboa: 2015, p. 18-28.
dc.identifier.uri: http://hdl.handle.net/2117/84419
dc.description.abstract: The automatic generation of image captions has received considerable attention. The problem of evaluating caption generation systems, though, has been explored far less. We propose a novel evaluation approach based on comparing the underlying visual semantics of the candidate and ground-truth captions. With this goal in mind, we have defined a semantic representation for visually descriptive language and have augmented a subset of the Flickr-8K dataset with semantic annotations. Our evaluation metric (BAST) can be used not only to compare systems but also to perform error analysis and gain a better understanding of the types of mistakes a system makes. To compute BAST, we need to predict the semantic representation for the automatically generated captions. We use the Flickr-ST dataset to train classifiers that predict semantic tuples (STs) so that evaluation can be fully automated.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: UPC subject areas::Computer science::Automation and control
dc.subject.other: computer vision
dc.subject.other: natural language processing
dc.title: Semantic tuples for evaluation of image sentence generation
dc.type: Conference report
dc.contributor.group: Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
dc.contributor.group: Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.description.peerreviewed: Peer Reviewed
dc.subject.inspec: INSPEC classification::Pattern recognition::Computer vision
dc.relation.publisherversion: https://www.cs.cmu.edu/~ark/EMNLP-2015/proceedings/VL/pdf/VL06.pdf
dc.rights.access: Open Access
local.identifier.drac: 17548529
dc.description.version: Postprint (author's final draft)
local.citation.author: Ellebracht, L.; Ramisa, A.; Shantharam, P.; Cordero, J.; Moreno-Noguer, F.; Quattoni, A.
local.citation.contributor: Workshop on Vision and Language
local.citation.pubplace: Lisboa
local.citation.publicationName: Proceedings of the 4th Workshop on Vision and Language, 2015, Lisbon.
local.citation.startingPage: 18
local.citation.endingPage: 28

