
dc.contributor: Ruiz Costa-Jussà, Marta
dc.contributor: Schaefer, Martin
dc.contributor: Weber, Marc
dc.contributor.author: García Ortegón, Miguel
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.date.accessioned: 2018-10-18T11:51:29Z
dc.date.issued: 2018-10
dc.identifier.uri: http://hdl.handle.net/2117/122590
dc.description.abstract: In this thesis, we used support vector machines (SVMs) to build a tissue-of-origin classifier for 17 cancer types. Our classifier, which uses RNA expression data from over 20,000 genes, reaches an accuracy of 97.6% on primary tumor samples, 91.9% on metastasis samples and 71.1% on cell-line samples. With the goal of enabling cheaper diagnostics in the clinic, we performed feature selection through recursive feature elimination (RFE) and identified a signature of just 120 genes that retains almost all of the predictive power. We explored how our model achieves such high accuracy and found that it recognises characteristics of the healthy tissue of origin rather than of cancer. To help disseminate our results among clinicians and basic researchers, we released our trained model and its code as the command-line tool TOPOS (Tissue-of-Origin Predictor of OncoSamples). To our knowledge, this is the first time that a metastasis classifier has been developed based on RNAseq data, and we hope to pave the way for others to do the same in the future. (Author's note: I cannot upload the thesis to UPCommons at present because we are waiting for peer review before publishing a paper with the results. Once the paper has been published, I would be happy to upload the thesis to UPCommons as well, if the journal's guidelines allow it.)
dc.language.iso: eng
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: Àrees temàtiques de la UPC::Matemàtiques i estadística::Matemàtica aplicada a les ciències
dc.subject.lcsh: Biology
dc.subject.lcsh: Natural history
dc.subject.other: Support vector machine
dc.subject.other: Cancer
dc.subject.other: Primary tumor
dc.subject.other: Metastasis
dc.subject.other: Cell line
dc.subject.other: Mutation
dc.subject.other: RNA expression
dc.subject.other: DNA methylation
dc.title: Machine learning for cancer classification
dc.type: Master thesis
dc.subject.lemac: Biologia
dc.subject.lemac: Ciències naturals
dc.subject.ams: Classificació AMS::92 Biology and other natural sciences::92C Physiological, cellular and medical topics
dc.identifier.slug: FME-1689
dc.rights.access: Restricted access - author's decision
dc.date.lift: 10000-01-01
dc.date.updated: 2018-10-16T05:24:33Z
dc.audience.educationlevel: Màster
dc.audience.mediator: Universitat Politècnica de Catalunya. Facultat de Matemàtiques i Estadística
dc.audience.degree: MÀSTER UNIVERSITARI EN MATEMÀTICA AVANÇADA I ENGINYERIA MATEMÀTICA (Pla 2010)
dc.contributor.covenantee: Centre de Regulació Genòmica. Departament de Teoria del Senyal i Comunicacions

