This paper presents Solon, a weakly supervised evaluation framework for definition question answering (DefQA). Solon automatically evaluates a set of DefQA systems using existing human definitions as gold standard models, which allows it to overcome known limitations of state-of-the-art evaluation methods. In addition, Solon assumes that each DefQA task may require a different evaluation configuration, and it automatically finds the best one. The results of our experiments show that Solon performs well compared with state-of-the-art evaluation methods, with the advantage of requiring less supervision.
Citation: Kanaan, S., Turmo, J. "An evaluation framework based on gold standard models for definition question answering". 2006.