Automatic translation between layman and HPO terms using machine learning algorithms

Manzini, Enrico


tfm-manzini.pdf (3.959 MB)



Tutor / director: Perera Lluna, Alexandre

Document type: Official master's final project

Date: 2019-07-10

Access conditions: Open access

Attribution-NonCommercial-ShareAlike 3.0 Spain

Except where otherwise noted, the contents of this work are licensed under a Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0 Spain

Abstract

Linguistic differences between specialists and laymen still represent an obstacle to successful communication in technical environments. This is especially true in the medical domain, where the linguistic gap between clinicians and patients is a considerable issue: on the one hand, diseases and symptoms must be described with a very specific vocabulary to avoid doubt and ambiguity; on the other hand, patients, who are the main source of information for an accurate diagnosis, cannot be expected to use the same technical jargon as physicians to describe their symptoms. The main objective of this project is to investigate a possible solution to this issue, using a deep learning approach to support the collection and description of all the traits of patients with rare diseases in the Share4Rare network. Machine learning techniques will be used to develop a machine translation model that transforms input layman terms into specific medical concepts. To achieve this objective, the most common deep learning methods used in Natural Language Processing will be explained and analyzed, with a particular focus on word embedding techniques, convolutional neural networks and recurrent neural networks. Then, three models that combine these techniques will be proposed, and the strengths and weaknesses of each one will be outlined. All the models will be created and tested with Python, a high-level, general-purpose programming language. The neural network architectures will be created using Keras, an open-source deep learning library for Python. The proposed models will be trained and tested using the lexicon of the Human Phenotype Ontology, a formal ontology of human phenotypes that aims to become the standard vocabulary for clinical databases. Terms in the Human Phenotype Ontology contain synonyms and descriptions of the phenotypes to which they refer, and these will be used as input for the different models. Results will be evaluated with cross-validation, and domain-specific performance metrics will be adopted to carry out a detailed analysis of the outcomes.
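As a rough illustration of the task framing described in the abstract (mapping a layman phrase to an ontology term and scoring the mapping with cross-validation), the sketch below uses only the Python standard library. It is not the thesis's method: the neural models are replaced here by a simple character-trigram nearest-neighbour baseline, and plain accuracy stands in for the domain-specific metrics. The phrases in `PAIRS` are hand-written toy examples, not taken from an HPO release, and the three identifiers are used only as labels for Seizure, Microcephaly and Headache. All function names are hypothetical.

```python
# Toy, hand-written (phrase, HPO id) pairs for illustration only.
# Assumption: the phrases are invented and NOT copied from the HPO lexicon.
PAIRS = [
    ("seizure", "HP:0001250"),
    ("seizures", "HP:0001250"),
    ("epileptic seizure", "HP:0001250"),
    ("small head", "HP:0000252"),
    ("abnormally small head", "HP:0000252"),
    ("small head size", "HP:0000252"),
    ("headache", "HP:0002315"),
    ("headaches", "HP:0002315"),
    ("head ache", "HP:0002315"),
]


def trigrams(text):
    """Return the set of character trigrams of a padded, lower-cased phrase."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(phrase, training_pairs):
    """Return the label of the training phrase most similar to `phrase`."""
    query = trigrams(phrase)
    best = max(training_pairs, key=lambda pair: jaccard(query, trigrams(pair[0])))
    return best[1]


def k_fold_accuracy(pairs, k=3):
    """Plain k-fold cross-validation: fold i holds out every k-th pair."""
    correct = 0
    for fold in range(k):
        test = [p for i, p in enumerate(pairs) if i % k == fold]
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        correct += sum(predict(phrase, train) == label for phrase, label in test)
    return correct / len(pairs)


if __name__ == "__main__":
    print(predict("small heads", PAIRS))
    print(k_fold_accuracy(PAIRS, k=3))
```

In a setting closer to the one the abstract describes, the similarity lookup in `predict` would be replaced by a trained model (for example one built with Keras), while the fold loop in `k_fold_accuracy` would stay essentially the same.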

Subjects: Machine learning, Algorithms, Machine translation

URI: http://hdl.handle.net/2117/181447

Collections

Official master's degrees - Master's degree in Automatic Control and Robotics


UPCommons. Open knowledge portal of the UPC
