Browse by author "Barrón-Cedeño, Alberto"

A comparison of approaches for measuring cross-lingual similarity of wikipedia articles

Barrón-Cedeño, Alberto; Lestari Paramita, Monica; Clough, Paul; Rosso, Paolo (Springer, 2014)
Conference report
Open access

Wikipedia has been used as a source of comparable texts for a range of tasks, such as Statistical Machine Translation and Cross-Language Information Retrieval. Articles written in different languages on the same topic are ...

A factory of comparable corpora from Wikipedia

Barrón-Cedeño, Alberto; España Bonet, Cristina; Boldoba Trapote, Josu; Màrquez Villodre, Lluís (Association for Computational Linguistics, 2015)
Conference report
Open access

Multiple approaches to grab comparable data from the Web have been developed to date. Nevertheless, coming up with a high-quality comparable corpus on a specific topic is not straightforward. We present a model ...

Identifying useful human correction feedback from an on-line machine translation service

Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Henríquez Quintana, Carlos Alberto; Formiga Fanals, Lluís; Romero Merino, Enrique; May, Jonathan (2013)
Conference report
Open access

Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual users ...

Identifying useful human feedback from an on-line translation service

Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Henríquez Quintana, Carlos Alberto; Formiga Fanals, Lluís; Romero Merino, Enrique; May, Jonathan (2013)
Conference lecture
Open access

Post-editing feedback provided by users of on-line translation services offers an excellent opportunity for automatic improvement of statistical machine translation (SMT) systems. However, feedback provided by casual ...

IPA and STOUT: leveraging linguistic and source-based features for machine translation evaluation

González Bermúdez, Meritxell; Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís (Association for Computational Linguistics, 2014)
Conference lecture
Restricted access (publisher's policy)

This paper describes the UPC submissions to the WMT14 Metrics Shared Task: UPC-IPA and UPC-STOUT. These metrics use a collection of evaluation measures integrated in ASIYA, a toolkit for machine translation evaluation. ...

Leveraging online user feedback to improve statistical machine translation

Formiga, Lluís; Barrón-Cedeño, Alberto; Marquez, Lluis; Henriquez, Carlos A; Mariño Acebal, José Bernardo (2015-09-01)
Article
Open access

In this article we present a three-step methodology for dynamically improving a statistical machine translation (SMT) system by incorporating human feedback in the form of free edits on the system translations. We target ...

Methods for cross-language plagiarism detection

Barrón-Cedeño, Alberto; Gupta, P.; Rosso, Paolo (2013-09)
Article
Restricted access (publisher's policy)

Three factors are driving a rise in plagiarism across languages: (i) speakers of under-resourced languages often consult documentation in a foreign language, (ii) people immersed in a foreign country can still consult ...

PAN@FIRE: overview of the cross-language Indian text re-use detection competition

Barrón-Cedeño, Alberto; Rosso, Paolo; Lalitha Devi, Sobha; Clough, Paul; Stevenson, Mark (2010)
Conference report
Restricted access (publisher's policy)

The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets ...

Plagiarism meets paraphrasing: insights for the next generation in automatic plagiarism detection

Barrón-Cedeño, Alberto; Vila, Marta; Martí, Maria Antonia; Rosso, Paolo (2013)
Article
Open access

Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism ...

The TALP-UPC approach to system selection: ASIYA features and pairwise classification using random forests

Formiga Fanals, Lluís; González Bermúdez, Meritxell; Barrón-Cedeño, Alberto; Rodríguez Fonollosa, José Adrián; Màrquez Villodre, Lluís (2013)
Conference report
Restricted access (publisher's policy)

This paper describes the TALP-UPC participation in the WMT’13 Shared Task on Quality Estimation (QE). Our participation is reduced to task 1.2 on System Selection. We used a broad set of features (86 for German-to-English ...

The TALP-UPC phrase-based translation systems for WMT13: system combination with morphology generation, domain adaptation and corpus filtering

Formiga Fanals, Lluís; Ruiz Costa-Jussà, Marta; Mariño Acebal, José Bernardo; Rodríguez Fonollosa, José Adrián; Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís (2013)
Conference report
Restricted access (publisher's policy)

This paper describes the TALP participation in the WMT13 evaluation campaign. Our participation is based on the combination of several statistical machine translation systems based on standard phrase-based Moses systems. ...

UPC-CORE : What can machine translation evaluation metrics and Wikipedia do for estimating semantic textual similarity?

Barrón-Cedeño, Alberto; Màrquez Villodre, Lluís; Fuentes Fort, Maria; Rodríguez Hontoria, Horacio; Turmo Borras, Jorge (2013)
Conference lecture
Open access

In this paper we discuss our participation in the 2013 SemEval Semantic Textual Similarity task. Our core features include (i) a set of metrics borrowed from automatic machine translation, originally intended to evaluate ...

Wikicardi : hacia la extracción de oraciones paralelas de Wikipedia

Boldoba Trapote, Josu; Barrón-Cedeño, Alberto; España Bonet, Cristina (2014-03-01)
Research report
Open access

One of the goals of the Tacardi project (TIN2012-38523-C02-00) is to extract parallel sentences from comparable corpora in order to enrich and adapt machine translation systems. In this research we use a subset ...