Automatic translation between layman and HPO terms using machine learning algorithms
Cite as:
hdl:2117/181447
Document type: Master's thesis
Date: 2019-07-10
Access conditions: Open access
Unless otherwise indicated, the contents of this work are subject to the Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0 Spain
Abstract
Linguistic differences between specialists and laymen still represent an obstacle to successful communication in technical environments. This is especially true in the medical domain, where the linguistic gap between clinicians and patients is a considerable issue: on the one hand, diseases and symptoms must be described with a very specific vocabulary to avoid doubt and ambiguity; on the other hand, patients, who are the main source of information for an accurate diagnosis, cannot be expected to use the same technical jargon as physicians to describe their symptomatology. The main objective of this project is to investigate a possible solution to this issue using a deep learning approach to support the collection and description of all the traits of patients with rare diseases in the Share4Rare network. Machine learning techniques will be used to develop a machine translation model that transforms input layman terms into specific medical concepts. To achieve this objective, the most common deep learning methods used in Natural Language Processing will be explained and analyzed, with a particular focus on word embedding techniques, convolutional neural networks and recurrent neural networks. Then, three models that combine these techniques will be proposed, outlining the strengths and weaknesses of each one. All the models will be created and tested with Python, a high-level, general-purpose programming language. The neural network architectures will be created using Keras, an open-source deep learning library for Python. The proposed models will be trained and tested using the lexicon from the Human Phenotype Ontology, a formal ontology of human phenotypes that aims to become the standard vocabulary for clinical databases. Terms in the Human Phenotype Ontology contain synonyms and descriptions of the phenotypes to which they refer, which will be used as input for the different models.
Results will be evaluated with cross-validation, and domain-specific performance metrics will be adopted to carry out a specific analysis of the outcomes.
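As a rough illustration of the data flow described in the abstract, the following minimal sketch turns ontology terms into (input text, HPO identifier) training pairs and splits them into cross-validation folds. It is not taken from the thesis: the `TERMS` records, the `build_pairs` and `k_fold_indices` helpers, and the choice of three folds are all assumptions made for this example, and the synonyms and descriptions are invented rather than copied from the ontology. No model is included; in the setting of the abstract, a Keras model would be trained and scored once per split.

```python
import random

# Illustrative HPO-style records. The synonyms and descriptions are made up
# for this sketch and are not copied from the ontology release.
TERMS = [
    {"id": "HP:0000252", "label": "Microcephaly",
     "synonyms": ["small head", "reduced head circumference"],
     "description": "Head circumference well below the mean for age and sex."},
    {"id": "HP:0001250", "label": "Seizure",
     "synonyms": ["fits", "convulsions"],
     "description": "Sudden episode of abnormal electrical activity in the brain."},
    {"id": "HP:0001945", "label": "Fever",
     "synonyms": ["high temperature", "raised body temperature"],
     "description": "Body temperature above the normal range."},
]


def build_pairs(terms):
    """Return one (input_text, hpo_id) pair per label, synonym and description."""
    pairs = []
    for term in terms:
        texts = [term["label"], *term["synonyms"], term["description"]]
        pairs.extend((text.lower(), term["id"]) for text in texts)
    return pairs


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for k folds over shuffled sample indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding assigns every index to exactly one fold.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


pairs = build_pairs(TERMS)
splits = list(k_fold_indices(len(pairs), k=3))
```

Splitting at the level of pairs means that different synonyms of the same term can fall into both the training and the test portion of a fold, which suits a setup where each HPO term is a target class. Whether the thesis splits its data this way is not stated in the abstract.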
Files | Size |
---|---|
tfm-manzini.pdf | 3.959 Mb |