Full-network embedding in a multimodal embedding pipeline

dc.contributor.author: Vilalta Arias, Armand
dc.contributor.author: Garcia Gasulla, Dario
dc.contributor.author: Parés Pont, Ferran
dc.contributor.author: Moreno Vázquez, Jonatan
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.author: Labarta Mancho, Jesús José
dc.contributor.author: Cortés García, Claudio Ulises
dc.contributor.author: Suzumura, Toyotaro
dc.contributor.group: Universitat Politècnica de Catalunya. KEMLG - Grup d'Enginyeria del Coneixement i Aprenentatge Automàtic
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Intel·ligència Artificial
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2020-07-09T08:59:13Z
dc.date.available: 2020-07-09T08:59:13Z
dc.date.issued: 2017
dc.description.abstract: The current state of the art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this paper we evaluate the impact of using the Full-Network embedding in this setting, replacing the original image representation in a competitive multimodal embedding generation scheme. Unlike the one-layer image embeddings typically used by most approaches, the Full-Network embedding provides a multi-scale representation of images, which results in richer characterizations. To measure the influence of the Full-Network embedding, we evaluate its performance on three different datasets, and compare the results with the original multimodal embedding generation scheme when using a one-layer image embedding, and with the rest of the state of the art. Results for image annotation and image retrieval tasks indicate that the Full-Network embedding is consistently superior to the one-layer embedding. These results motivate the integration of the Full-Network embedding into any multimodal embedding generation scheme, something feasible thanks to the flexibility of the approach.
dc.description.peerreviewed: Peer Reviewed
dc.description.sponsorship: This work is partially supported by the Joint Study Agreement no. W156463 under the IBM/BSC Deep Learning Center agreement, by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through the TIN2015-65316-P project, by the Generalitat de Catalunya (contract 2014-SGR-1051), and by the Core Research for Evolutional Science and Technology (CREST) program of Japan Science and Technology Agency (JST).
dc.description.version: Postprint (published version)
dc.format.extent: 9 p.
dc.identifier.citation: Vilalta, A. [et al.]. Full-network embedding in a multimodal embedding pipeline. In: Workshop on Semantic Deep Learning. "Proceedings of the 2nd Workshop on Semantic Deep Learning (SemDeep-2): September 19, 2017, Montpellier, France". Stroudsburg, PA: Association for Computational Linguistics, 2017, p. 24-32.
dc.identifier.other: https://arxiv.org/abs/1707.09872
dc.identifier.uri: https://hdl.handle.net/2117/192716
dc.language.iso: eng
dc.publisher: Association for Computational Linguistics
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
dc.relation.projectid: info:eu-repo/grantAgreement/AGAUR/V PRI/2014 SGR 1051
dc.relation.publisherversion: https://www.aclweb.org/anthology/W17-7304/
dc.rights.access: Open Access
dc.rights.licensename: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Image data mining
dc.subject.lcsh: Image analysis
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Imatges -- Anàlisi
dc.subject.other: Artificial intelligence
dc.subject.other: Deep learning
dc.subject.other: Transfer learning
dc.subject.other: Multimodal embedding
dc.subject.other: Image retrieval
dc.subject.other: Image annotation
dc.subject.other: Semantic deep learning
dc.title: Full-network embedding in a multimodal embedding pipeline
dc.type: Conference lecture
dspace.entity.type: Publication
local.citation.author: Vilalta, A.; García-Gasulla, D.; Parés, F.; Moreno, J.; Ayguadé, E.; Labarta, J.; Cortés, U.; Suzumura, T.
local.citation.contributor: Workshop on Semantic Deep Learning
local.citation.endingPage: 32
local.citation.publicationName: Proceedings of the 2nd Workshop on Semantic Deep Learning (SemDeep-2): September 19, 2017, Montpellier, France
local.citation.pubplace: Stroudsburg, PA
local.citation.startingPage: 24
local.identifier.drac: 28845399

Files

Original bundle

Name: W17-7304.pdf
Size: 230.91 KB
Format: Adobe Portable Document Format