Machine learning for cancer classification
memoria.pdf (4,447Mb) (Restricted access)
Cite as:
hdl:2117/122590
Document type: Master thesis
Date: 2018-10
Rights access: Restricted access - author's decision
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
In this thesis, we used support vector machines (SVMs) to build a tissue-of-origin classifier for 17 cancer types. Our classifier, which uses RNA expression data from over 20,000 genes, achieves high accuracy on primary (97.6%), metastatic (91.9%) and cell-line (71.1%) samples. With the goal of enabling cheaper diagnostics in the clinic, we performed feature selection through recursive feature elimination (RFE) and identified a gene signature of just 120 genes that retains almost all of the predictive power. We explored how our model achieves such high accuracy and found that it recognises characteristics of healthy tissues rather than of cancer. To help disseminate our results among clinicians and basic researchers, we released our trained model and its code as the command-line tool TOPOS (Tissue-of-Origin Predictor of OncoSamples). To our knowledge, this is the first time that a metastasis classifier has been developed based on RNAseq data, and we hope to pave the way for others to do the same in the future. (I would like to explain that I cannot upload the thesis to UPCommons because we are waiting for peer review before publishing a paper with the results. Once the paper has been published, I would be happy to upload the thesis to UPCommons as well, if the paper's guidelines allow us to do so.)
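The feature-selection step named in the abstract, recursive feature elimination with a linear SVM, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis or from TOPOS: the toy data generator, the two-class setup (the thesis classifies 17 cancer types), the subgradient-descent trainer, its hyperparameters, and the function names `train_linear_svm`, `rfe` and `make_toy_data`. The idea is only the general one: fit a linear model, drop the feature with the smallest absolute weight, and repeat until the desired number of features remains.

```python
import random


def train_linear_svm(X, y, epochs=20, lr=0.01, lam=0.01):
    """Fit a bias-free linear SVM (hinge loss, L2 penalty) by subgradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # L2 shrinkage on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step only when the sample violates the margin.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, drop the weakest feature, repeat."""
    active = list(range(len(X[0])))  # indices of features still in play
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active


def make_toy_data(n=200, n_features=6, informative=(0, 3), seed=0):
    """Hypothetical data: Gaussian noise, with a class-dependent shift on a few features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # only these features carry class signal
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = rfe(X, y, n_keep=2)
print(sorted(selected))
```

On real expression data with tens of thousands of genes, one would more plausibly rely on an established library implementation and remove features in batches rather than one at a time; the loop above is kept deliberately small to show the elimination logic.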
Degree: MÀSTER UNIVERSITARI EN MATEMÀTICA AVANÇADA I ENGINYERIA MATEMÀTICA (Pla 2010)