UPC-CORE : What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?

dc.contributor.authorBarrón-Cedeño, Alberto
dc.contributor.authorMàrquez Villodre, Lluís
dc.contributor.authorFuentes Fort, Maria
dc.contributor.authorRodríguez Hontoria, Horacio
dc.contributor.authorTurmo Borras, Jorge
dc.contributor.groupUniversitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
dc.contributor.otherUniversitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned2013-10-15T09:11:02Z
dc.date.available2013-10-15T09:11:02Z
dc.date.created2013
dc.date.issued2013
dc.description.abstractIn this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation evaluation, originally intended to compare automatic translations against reference translations, and (ii) an instance of explicit semantic analysis, built upon the opening paragraphs of Wikipedia 2010 articles. Our similarity estimator relies on a support vector regressor with an RBF kernel. Our best approach combined 13 machine translation metrics with explicit semantic analysis and ranked 65th in the competition. Our post-competition analysis shows that the features have good expressive power, but overfitting and, mainly, normalization issues caused our correlation values to decrease.
dc.description.peerreviewedPeer Reviewed
dc.description.versionPreprint (authors version)
dc.format.extent5 p.
dc.identifier.citationBarron-Cedeño, A. [et al.]. UPC-CORE : What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?. A: Joint Conference on Lexical and Computational Semantics. "*SEM 2013: The Second Joint Conference on Lexical and Computational Semantics". Atlanta: 2013, p. 1-5.
dc.identifier.urihttps://hdl.handle.net/2117/20375
dc.language.isoeng
dc.rights.accessOpen Access
dc.subjectÀrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural
dc.subject.lcshComputational linguistics -- Research
dc.subject.lcshSemantic textual similarity
dc.subject.lemacSemàntica computacional
dc.titleUPC-CORE : What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?
dc.typeConference lecture
dspace.entity.typePublication
local.citation.authorBarron-Cedeño, A.; Marquez, L.; Fuentes, M.; Rodriguez, H.; Turmo, J.
local.citation.contributorJoint Conference on Lexical and Computational Semantics
local.citation.endingPage5
local.citation.publicationName*SEM 2013: The Second Joint Conference on Lexical and Computational Semantics
local.citation.pubplaceAtlanta
local.citation.startingPage1
local.identifier.drac12442079

Files

Original bundle

Name:
60_Paper.pdf
Size:
57.8 KB
Format:
Adobe Portable Document Format