Language technologies: question answering in speech transcripts

Turmo Borras, Jorge; Surdeanu, Mihai; Galibert, Olivier; Rosset, Sophie

doi:10.1007/978-1-84882-054-8

Document type: Book chapter

Publication date: 2009-05-31

Publisher: Springer-Verlag

Access conditions: Restricted access per publisher policy

All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.

Abstract

The Question Answering (QA) task consists of providing short, relevant answers to natural language questions. Most QA research has focused on extracting information from text sources, providing the shortest relevant text in response to a question. For example, the correct answer to the question "How many groups participate in the CHIL project?" is "16", whereas the response to "Who are the partners in CHIL?" is a list of them. This simple example illustrates the two main advantages of QA over current search engines: first, the input is a natural language question rather than a keyword query; and second, the answer provides the desired information content and not simply a potentially large set of documents or URLs that the user must plow through. One of the aims of the CHIL project was to provide information about what has been said during interactive seminars. Since the information must be located in speech data, the QA systems have to be able to deal with transcripts (manual or automatic) of spontaneous speech. This is a departure from much of the QA research carried out by natural language groups, who have typically developed techniques for written texts, which are assumed to have a correct syntactic and semantic structure. The structure of spoken language is different from that of written language, and some of the anchor points used in processing, such as punctuation, must be inferred and are therefore error-prone. Other spoken language phenomena include disfluencies, repetitions, restarts and corrections. When automatic processing is used to create the speech transcripts, an additional challenge is dealing with the recognition errors. The response can be a short string, as in text-based QA, or an audio segment containing the response. This chapter summarizes the CHIL efforts devoted to QA for spoken language carried out at UPC and at CNRS-LIMSI. Research at UPC adapted a QA system developed for written texts to manually and automatically created speech transcripts, whereas at LIMSI an interactive oral QA system developed for the French language was adapted to the English language. CHIL organized the pilot track on Question Answering in Speech Transcripts (QAst), as part of CLEF 2007, in order to compare and evaluate QA technology on both manually and automatically produced transcripts of spontaneous speech.

Citation: Turmo, J. [et al.]. Language technologies: question answering in speech transcripts. In: "Computers in the human interaction loop". Springer-Verlag, 2009, p. 75-86.

URI: http://hdl.handle.net/2117/13011

DOI: 10.1007/978-1-84882-054-8

ISBN: 9781848820531

Publisher's version: http://cataleg.upc.edu/record=b1373502~S1*cat

Files	Description	Size	Format	View
chapter-QA.pdf	main article	129.0 KB	PDF	Restricted access
