A differentiable BLEU loss. Analysis and first results

Cite as:
hdl:2117/117201
Document type: Conference report
Defense date: 2018
Rights access: Open Access
This work is protected by the corresponding intellectual and industrial property rights.
Except where otherwise noted, its contents are licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain
Project: TECNOLOGIAS DE APRENDIZAJE PROFUNDO APLICADAS AL PROCESADO DE VOZ Y AUDIO (MINECO-TEC2015-69266-P)
AUTONOMOUS LIFELONG LEARNING INTELLIGENT SYSTEMS (AEI-PCIN-2017-079)
Abstract
In natural language generation tasks, like neural machine translation and image captioning, there is usually a mismatch between the optimized loss and the de facto evaluation criterion, namely token-level maximum likelihood and corpus-level BLEU score. This article tries to reduce this gap by defining differentiable computations of the BLEU and GLEU scores. We test this approach on simple tasks, obtaining valuable lessons on its potential applications but also its pitfalls, mainly that these loss functions push each token in the hypothesis sequence toward the average of the tokens in the reference, resulting in a poor training signal.
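As a rough illustration of the abstract's idea, a BLEU-style modified n-gram precision can be made differentiable by replacing hard token counts in the hypothesis with expected counts under the model's softmax distributions. The sketch below is a minimal soft unigram precision in NumPy; the function name and setup are illustrative assumptions, not the paper's exact formulation (which also covers higher-order n-grams and GLEU).

```python
import numpy as np

def soft_unigram_precision(hyp_probs, ref_ids, vocab_size):
    """Differentiable (soft) BLEU-1 modified precision.

    hyp_probs: (T_h, V) array, each row a softmax distribution over the vocab.
    ref_ids:   sequence of reference token ids.
    Note: illustrative sketch, not the paper's exact loss.
    """
    # Expected (soft) unigram counts in the hypothesis
    hyp_counts = hyp_probs.sum(axis=0)                                  # (V,)
    # Hard unigram counts in the reference
    ref_counts = np.bincount(ref_ids, minlength=vocab_size).astype(float)
    # Clipped matches, as in BLEU's modified precision
    # (min is subdifferentiable, so gradients can flow through)
    matches = np.minimum(hyp_counts, ref_counts).sum()
    return matches / hyp_probs.shape[0]

# A one-hot hypothesis identical to the reference scores 1.0
probs = np.eye(4)[[0, 1, 2]]
print(soft_unigram_precision(probs, [0, 1, 2], 4))   # 1.0

# The pitfall named in the abstract: a hypothesis whose every position is the
# uniform "average" over tokens still scores well (0.75 here), so the loss
# gives little pressure toward committing to specific tokens.
uniform = np.full((3, 4), 0.25)
print(soft_unigram_precision(uniform, [0, 1, 2], 4))  # 0.75
```

This also shows concretely why the training signal is poor: the soft precision rewards spreading probability mass over the reference's tokens rather than predicting the right token at each position.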
Citation: Casas, N., Fonollosa, José A. R., Ruiz, M. A differentiable BLEU loss. Analysis and first results. In: International Conference on Learning Representations. "ICLR 2018 Workshop Track: 6th International Conference on Learning Representations: Vancouver Convention Center, Vancouver, BC, Canada: April 30-May 3, 2018". 2018.
Publisher version: https://openreview.net/forum?id=HkG7hzyvf
Files | Description | Size | Format | View
---|---|---|---|---
e57a49a5ea21b841d2546ba9aade2d627010d8e0.pdf | | 421.9 kB | PDF | View/Open