UPC-CORE : What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?
Cite as:
hdl:2117/20375
Document type: Conference contribution
Publication date: 2013
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation without the authorization of the rights holder is prohibited.
Abstract
In this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation evaluation, originally intended to compare automatic translations against reference translations, and (ii) an instance of explicit semantic analysis, built upon the opening paragraphs of 2010 Wikipedia articles. Our similarity estimator relies on a support vector regressor with an RBF kernel. Our best approach required 13 machine translation metrics plus explicit semantic analysis and ranked 65th in the competition. Our post-competition analysis shows that the features have a good expression level, but overfitting and, mainly, normalization issues caused our correlation values to decrease.
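The estimator the abstract describes can be illustrated with a minimal sketch: a support vector regressor with an RBF kernel over a feature vector of MT evaluation metrics plus an explicit semantic analysis similarity. This is not the authors' code; the feature matrix below is synthetic, and the feature standardization step is an assumption added to illustrate the normalization issue the abstract mentions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Hypothetical features: 13 MT evaluation metrics + 1 ESA similarity
# per sentence pair (values here are random placeholders).
X_train = rng.random((100, 14))
y_train = rng.random(100) * 5.0   # gold similarity scores on the [0, 5] STS scale

# SVR with an RBF kernel, preceded by feature standardization.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
model.fit(X_train, y_train)

X_test = rng.random((20, 14))
pred = model.predict(X_test)
# Clip predictions back onto the task's [0, 5] similarity scale.
pred = np.clip(pred, 0.0, 5.0)
```

System outputs on the STS task are evaluated by Pearson correlation against the gold scores, which is why out-of-range or badly normalized predictions can depress the reported correlation.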
Citation: Barron-Cedeño, A. [et al.]. UPC-CORE: What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity? In: Joint Conference on Lexical and Computational Semantics. "*SEM 2013: The Second Joint Conference on Lexical and Computational Semantics". Atlanta: 2013, pp. 1-5.
Files | Description | Size | Format | View
---|---|---|---|---
60_Paper.pdf | | 57.80 kB | PDF | View/Open