Browsing by Author "Barrón-Cedeño, Alberto"
-
A comparison of approaches for measuring cross-lingual similarity of wikipedia articles
Barrón-Cedeño, Alberto; Lestari Paramita, Monica; Clough, Paul; Rosso, Paolo (Springer, 2014)
Conference report
Open Access
Wikipedia has been used as a source of comparable texts for a range of tasks, such as Statistical Machine Translation and Cross-Language Information Retrieval. Articles written in different languages on the same topic are ...
-
A factory of comparable corpora from Wikipedia
Barrón-Cedeño, Alberto; España Bonet, Cristina; Boldoba Trapote, Josu; Màrquez Villodre, Lluís (Association for Computational Linguistics, 2015)
Conference report
Open Access
Multiple approaches to grab comparable data from the Web have been developed up to date. Nevertheless, coming out with a high-quality comparable corpus of a specific topic is not straightforward. We present a model ...
-
Identifying useful human correction feedback from an on-line machine translation service
Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Henríquez Quintana, Carlos Alberto; Formiga Fanals, Lluís; Romero Merino, Enrique; May, Jonathan (2013)
Conference report
Open Access
Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual users ...
-
Identifying useful human feedback from an on-line translation service
Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Henríquez Quintana, Carlos Alberto; Formiga Fanals, Lluís; Romero Merino, Enrique; May, Jonathan (2013)
Conference lecture
Open Access
Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual ...
-
IPA and STOUT: leveraging linguistic and source-based features for machine translation evaluation
González Bermúdez, Meritxell; Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís (Association for Computational Linguistics, 2014)
Conference lecture
Restricted access - publisher's policy
This paper describes the UPC submissions to the WMT14 Metrics Shared Task: UPC-IPA and UPC-STOUT. These metrics use a collection of evaluation measures integrated in ASIYA, a toolkit for machine translation evaluation. ...
-
Leveraging online user feedback to improve statistical machine translation
Formiga, Lluís; Barrón-Cedeño, Alberto; Marquez, Lluis; Henriquez, Carlos A; Mariño Acebal, José Bernardo (2015-09-01)
Article
Open Access
In this article we present a three-step methodology for dynamically improving a statistical machine translation (SMT) system by incorporating human feedback in the form of free edits on the system translations. We target ...
-
Methods for cross-language plagiarism detection
Barrón-Cedeño, Alberto; Gupta, P.; Rosso, Paolo (2013-09)
Article
Restricted access - publisher's policy
Three reasons make plagiarism across languages to be on the rise: (i) speakers of under-resourced languages often consult documentation in a foreign language, (ii) people immersed in a foreign country can still consult ...
-
PAN@FIRE: overview of the cross-language Indian text re-use detection competition
Barrón-Cedeño, Alberto; Rosso, Paolo; Lalitha Devi, Sobha; Clough, Paul; Stevenson, Mark (2010)
Conference report
Restricted access - publisher's policy
The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets ...
-
Plagiarism meets paraphrasing: insights for the next generation in automatic plagiarism detection
Barrón-Cedeño, Alberto; Vila, Marta; Martí, Maria Antonia; Rosso, Paolo (2013)
Article
Open Access
Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism ...
-
The TALP-UPC approach to system selection: ASIYA features and pairwise classification using random forests
Formiga Fanals, Lluís; González Bermúdez, Meritxell; Barrón-Cedeño, Alberto; Rodríguez Fonollosa, José Adrián; Màrquez Villodre, Lluís (2013)
Conference report
Restricted access - publisher's policy
This paper describes the TALP-UPC participation in the WMT’13 Shared Task on Quality Estimation (QE). Our participation is reduced to task 1.2 on System Selection. We used a broad set of features (86 for German-to-English ...
-
The TALP-UPC phrase-based translation systems for WMT13: system combination with morphology generation, domain adaptation and corpus filtering
Formiga Fanals, Lluís; Ruiz Costa-Jussà, Marta; Mariño Acebal, José Bernardo; Rodríguez Fonollosa, José Adrián; Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís (2013)
Conference report
Restricted access - publisher's policy
This paper describes the TALP participation in the WMT13 evaluation campaign. Our participation is based on the combination of several statistical machine translation systems: based on standard phrase-based Moses systems. ...
-
UPC-CORE: What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?
Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Fuentes Fort, Maria; Rodríguez Hontoria, Horacio; Turmo Borras, Jorge (2013)
Conference lecture
Open Access
In this paper we discuss our participation to the 2013 Semeval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation, originally intended to evaluate ...
-
Wikicardi : hacia la extracción de oraciones paralelas de Wikipedia
Boldoba Trapote, Josu; Barrón-Cedeño, Alberto; España Bonet, Cristina (2014-03-01)
Research report
Open Access
One of the goals of the Tacardi project (TIN2012-38523-C02-00) is to extract parallel sentences from comparable corpora in order to enrich and adapt machine translation systems. In this research we use a subset ...