<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://hdl.handle.net/2117/3780">
    <title>DSpace Collection:</title>
    <link>http://hdl.handle.net/2117/3780</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://hdl.handle.net/2117/18801" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/18417" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/18263" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17980" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17709" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17703" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17700" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17699" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17655" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17654" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17223" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17082" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17066" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17063" />
        <rdf:li rdf:resource="http://hdl.handle.net/2117/17061" />
      </rdf:Seq>
    </items>
    <dc:date>2013-05-24T09:45:58Z</dc:date>
  </channel>
  <item rdf:about="http://hdl.handle.net/2117/18801">
    <title>Context-aware machine translation for software localization</title>
    <link>http://hdl.handle.net/2117/18801</link>
    <description>Title: Context-aware machine translation for software localization
Authors: Muntés Mulero, Víctor; Paladini Adell, Patricia; España Bonet, Cristina; Màrquez Villodre, Lluís
Abstract: Software localization requires translating short text strings appearing in user interfaces (UI) into several languages. These strings are usually unrelated to the other strings in the UI. Due to the lack of semantic context, many ambiguity problems cannot be solved during translation. However, UIs are composed of several visual components to which text strings are associated. Although this association might be very valuable for word disambiguation, it has not been exploited. In this paper, we present the problem of lack of context awareness for UI localization, providing real examples and identifying the main research challenges.</description>
    <dc:date>2013-04-15T14:59:09Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/18417">
    <title>A graph-based strategy to streamline translation quality assessments</title>
    <link>http://hdl.handle.net/2117/18417</link>
    <description>Title: A graph-based strategy to streamline translation quality assessments
Authors: Pighin, Daniele; Formiga Fanals, Lluís; Màrquez Villodre, Lluís
Abstract: We present a detailed analysis of a graph-based annotation strategy that we employed to annotate a corpus of 11,292 real-world English to Spanish automatic translations with relative (ranking) and absolute (adequate/non-adequate) quality assessments. The proposed approach, inspired by previous work in Interactive Evolutionary Computation and Interactive Genetic Algorithms, results in a simpler and faster annotation process. We empirically compare the method against a traditional, explicit ranking approach, and show that the graph-based strategy: 1) is considerably faster, and 2) produces consistently more reliable annotations.</description>
    <dc:date>2013-03-19T16:10:14Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/18263">
    <title>The UPC submission to the WMT 2012 shared task on quality estimation</title>
    <link>http://hdl.handle.net/2117/18263</link>
    <description>Title: The UPC submission to the WMT 2012 shared task on quality estimation
Authors: Pighin, Daniele; González Bermúdez, Meritxell; Màrquez Villodre, Lluís
Abstract: In this paper, we describe the UPC system that participated in the WMT 2012 shared task on Quality Estimation for Machine Translation. Based on the empirical evidence that fluency-related features have a very high correlation with post-editing effort, we present a set of features for the assessment of quality estimation for machine translation designed around different kinds of n-gram language models, plus another set of features that model the quality of dependency parses automatically projected from source sentences to translations. We document the results on the shared task dataset, obtained by combining the features that we designed with the baseline features provided by the task organizers.</description>
    <dc:date>2013-03-13T15:06:25Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17980">
    <title>A graphical interface for MT evaluation and error analysis</title>
    <link>http://hdl.handle.net/2117/17980</link>
    <description>Title: A graphical interface for MT evaluation and error analysis
Authors: González Bermúdez, Meritxell; Giménez, J.; Màrquez Villodre, Lluís
Abstract: Error analysis in machine translation is a necessary step in order to investigate the strengths and weaknesses of the MT systems under development and allow fair comparisons among them. This work presents an application that shows how a set of heterogeneous automatic metrics can be used to evaluate a test bed of automatic translations. To do so, we have set up an online graphical interface for the ASIYA toolkit, a rich repository of evaluation measures working at different linguistic levels. The current implementation of the interface shows constituency and dependency trees as well as shallow syntactic and semantic annotations, and word alignments. The intelligent visualization of the linguistic structures used by the metrics, as well as a set of navigational functionalities, may lead towards advanced methods for automatic error analysis.</description>
    <dc:date>2013-02-26T13:02:51Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17709">
    <title>Cultural configuration of Wikipedia: measuring autoreferentiality in different languages</title>
    <link>http://hdl.handle.net/2117/17709</link>
    <description>Title: Cultural configuration of Wikipedia: measuring autoreferentiality in different languages
Authors: Miquel Ribé, Marc; Rodríguez Hontoria, Horacio
Abstract: The motivations for writing in Wikipedia reported in the current literature often coincide, but none of the studies considers the hypothesis that editors contribute in order to increase the visibility of content related to their own nation or language. Similar to topical coverage studies, we outline a method for collecting the articles with this content and then analysing them along several dimensions. To show its universality, the tests are repeated for up to twenty language editions of Wikipedia. Finally, from the best indicators in each dimension we obtain an index which represents the degree of autoreferentiality of the encyclopedia. We also point out the impact of this fact and the risk of not considering its existence in the design of applications based on user generated content.</description>
    <dc:date>2013-02-13T11:13:01Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17703">
    <title>Georeferencing textual annotations and tagsets with geographical knowledge and language models</title>
    <link>http://hdl.handle.net/2117/17703</link>
    <description>Title: Georeferencing textual annotations and tagsets with geographical knowledge and language models
Authors: Ferrés Domènech, Daniel; Rodríguez Hontoria, Horacio
Abstract: This paper describes generic approaches for georeferencing, with high precision, multilingual textual annotations and sets of tags from metadata associated with textual or multimedia content. We present four approaches based on: 1) Geographical Knowledge, 2) Language Modelling (LM), 3) Language Modelling with Re-Ranking predictions, and 4) Fusion of Geographical Knowledge predictions with the other approaches. The resources employed were the Geonames geographical gazetteer, the TFIDF and BM25 Information Retrieval algorithms, the Hiemstra Language Modelling (HLM) algorithm, stopword lists for several languages, and an electronic English dictionary. The best results in georeferencing accuracy are achieved with the HLM Re-Ranking approach and its fusion with Geographical Knowledge. These strategies outperform the best accuracy results reported by the state-of-the-art systems that participated in the official MediaEval 2010 Placing task. Our best result is an accuracy of 68.53% when georeferencing within a distance of 100 km.</description>
    <dc:date>2013-02-13T10:35:29Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17700">
    <title>TALP at MediaEval 2011 Placing Task: georeferencing Flickr videos with geographical knowledge and information retrieval</title>
    <link>http://hdl.handle.net/2117/17700</link>
    <description>Title: TALP at MediaEval 2011 Placing Task: georeferencing Flickr videos with geographical knowledge and information retrieval
Authors: Ferrés Domènech, Daniel; Rodríguez Hontoria, Horacio
Abstract: This paper describes our georeferencing approaches, experiments, and results at the MediaEval 2011 Placing Task evaluation. The task consists of predicting the most probable geographical coordinates of Flickr videos. Our approaches used only Flickr users' textual annotations and tagsets to make predictions. We used three approaches for this task: 1) a Geographical Knowledge approach, 2) an Information Retrieval based approach with Re-Ranking, and 3) a combination of both (GeoFusion). The GeoFusion approach achieved the best results for error margins from 10 km to 10000 km.</description>
    <dc:date>2013-02-13T10:14:59Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17699">
    <title>TALP at WePS-3 2010</title>
    <link>http://hdl.handle.net/2117/17699</link>
    <description>Title: TALP at WePS-3 2010
Authors: Ferrés Domènech, Daniel; Rodríguez Hontoria, Horacio
Abstract: In this paper we present our system and experiments at the Third Web People Search Workshop (WePS-3) task for clustering web people search documents in English. In our experiments we used a simple approach with three algorithms: Lingo, Hierarchical Agglomerative Clustering (HAC), and a 2-step HAC algorithm. We also present the results and initial conclusions in the context of the WePS-3 Task 1 for clustering. We obtained the best results with the HAC and 2-step HAC algorithms.</description>
    <dc:date>2013-02-13T10:03:32Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17655">
    <title>TALP at MediaEval 2010 placing task: geographical focus detection of Flickr textual annotations</title>
    <link>http://hdl.handle.net/2117/17655</link>
    <description>Title: TALP at MediaEval 2010 placing task: geographical focus detection of Flickr textual annotations
Authors: Ferrés Domènech, Daniel; Rodríguez Hontoria, Horacio
Abstract: This paper describes our geographical text analysis and geotagging experiments in the context of the Multimedia Placing Task at the MediaEval 2010 evaluation. The task consists of predicting the most probable coordinates of Flickr videos. We used a Natural Language Processing approach that tries to match geographical place names in the Flickr users' textual annotations. The resources employed to deal with this task were the Geonames geographical gazetteer, stopword lists for several languages, and an electronic English dictionary. We used two geographical focus disambiguation strategies, one based on population heuristics and another that combines geographical knowledge and population heuristics. The second strategy achieves the best results. Using stopword lists and the English dictionary as a filter for ambiguous place names also improves the results.</description>
    <dc:date>2013-02-12T12:41:22Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17654">
    <title>Semantic annotation of deverbal nominalizations in the Spanish corpus AnCora</title>
    <link>http://hdl.handle.net/2117/17654</link>
    <description>Title: Semantic annotation of deverbal nominalizations in the Spanish corpus AnCora
Authors: Peris, Aina; Taulé, Mariona; Rodríguez Hontoria, Horacio
Abstract: This paper presents the methodology and the linguistic criteria followed to enrich the AnCora-Es corpus with the semantic annotation of deverbal nominalizations. The first step was to run two independent automated processes: one for the annotation of denotation types and another one for the annotation of argument structure. Secondly, we manually checked both types of information and measured inter-annotator agreement. The result is the Spanish AnCora-Es corpus enriched with the semantic annotation of deverbal nominalizations. As far as we know, this is the first Spanish corpus annotated with this type of information.</description>
    <dc:date>2013-02-12T12:26:27Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17223">
    <title>Spoken document retrieval based on approximated sequence alignment</title>
    <link>http://hdl.handle.net/2117/17223</link>
    <description>Title: Spoken document retrieval based on approximated sequence alignment
Authors: Comas Umbert, Pere Ramon; Turmo Borras, Jorge
Abstract: This paper presents a new approach to spoken document information retrieval for spontaneous speech corpora. The classical approach to this problem is the use of an automatic speech recognizer (ASR) combined with standard information retrieval techniques. However, ASRs tend to produce transcripts of spontaneous speech with a significant word error rate, which is a drawback for standard retrieval techniques. To overcome this limitation, our method is based on an approximated sequence alignment algorithm that searches for “sounds like” sequences. Our approach does not depend on extra information from the ASR and, in our experiments, outperforms state-of-the-art techniques by up to 7 points of precision.</description>
    <dc:date>2013-01-09T09:24:20Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17082">
    <title>The exponent of Zipf’s law in language ontogeny.</title>
    <link>http://hdl.handle.net/2117/17082</link>
    <description>Title: The exponent of Zipf’s law in language ontogeny.
Authors: Baixeries i Juvillà, Jaume; Ferrer Cancho, Ramon</description>
    <dc:date>2012-12-07T11:07:01Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17066">
    <title>A hybrid system for patent translation</title>
    <link>http://hdl.handle.net/2117/17066</link>
    <description>Title: A hybrid system for patent translation
Authors: Enache, Ramona; España Bonet, Cristina; Ranta, Aarne; Màrquez Villodre, Lluís
Abstract: This work presents an HMT system for patent translation. The system exploits the high coverage of SMT and the high precision of an RBMT system based on GF to deal with specific issues of the language. The translator is specifically developed to translate patents and it is evaluated on the English-French language pair. Although the grammar does not yet tackle many issues, both manual and automatic evaluations consistently prefer the hybrid system over the two individual translators.</description>
    <dc:date>2012-12-03T17:04:38Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17063">
    <title>Deep evaluation of hybrid architectures: simple metrics correlated with human judgments</title>
    <link>http://hdl.handle.net/2117/17063</link>
    <description>Title: Deep evaluation of hybrid architectures: simple metrics correlated with human judgments
Authors: Labaka, Gorka; Díaz de Ilarraza Sánchez, Arantza; Sarasola Gabiola, Kepa; España Bonet, Cristina; Màrquez Villodre, Lluís
Abstract: The process of developing hybrid MT systems is guided by the evaluation method used to compare different combinations of basic subsystems. This work presents a deep evaluation experiment of a hybrid architecture that tries to get the best of both worlds, rule-based and statistical. In a first evaluation, human assessments were used to compare just the single statistical system and the hybrid one; the rule-based system was not compared by hand because the results of automatic evaluation showed a clear disadvantage. However, a second and wider evaluation experiment surprisingly showed that, according to human evaluation, the best system was the rule-based one, which had achieved the worst results in automatic evaluation. An examination of sentences with controversial results suggested that linguistic well-formedness of the output should be considered in evaluation. After experimenting with 6 possible metrics, we conclude that a simple arithmetic mean of BLEU and BLEU calculated on the parts of speech of words is clearly a more human-conformant metric than lexical metrics alone.</description>
    <dc:date>2012-12-03T16:21:19Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2117/17061">
    <title>Hybrid machine translation guided by a rule-based system</title>
    <link>http://hdl.handle.net/2117/17061</link>
    <description>Title: Hybrid machine translation guided by a rule-based system
Authors: España Bonet, Cristina; Màrquez Villodre, Lluís; Labaka, Gorka; Díaz de Ilarraza Sánchez, Arantza; Sarasola Gabiola, Kepa
Abstract: This paper presents a machine translation architecture which hybridizes Matxin, a rule-based system, with regular phrase-based Statistical Machine Translation. In short, the hybrid translation process is guided by the rule-based engine and, before transference, a set of partial candidate translations provided by SMT subsystems is used to enrich the tree-based representation. The final hybrid translation is created by choosing the most probable combination among the available fragments with a statistical decoder in a monotonic way. We have applied the hybrid model to a pair of distant languages, Spanish and Basque, and according to our evaluation (both automatic and manual) the hybrid approach significantly outperforms the best SMT system on out-of-domain data.</description>
    <dc:date>2012-12-03T16:08:57Z</dc:date>
  </item>
</rdf:RDF>

