Enabling interpretation of the outcome of a human obesity prediction machine learning analysis from genomic data

dc.contributor.author: Bilal, Ahsan
dc.contributor.author: Vellido Alcacena, Alfredo
dc.contributor.author: Ribas Ripoll, Vicent
dc.contributor.group: Universitat Politècnica de Catalunya. SOCO - Soft Computing
dc.contributor.other: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
dc.date.accessioned: 2020-03-10T11:09:01Z
dc.date.available: 2020-03-10T11:09:01Z
dc.date.issued: 2018
dc.description.abstract: In this brief paper, we address the medical problem of human obesity prediction from genomic data. Genomic datasets may contain a huge number of features, and they often have to be analyzed with Big Data technologies. As a medical problem, obesity prediction would benefit from interpretable outcomes. Therefore, the analyst would benefit from approaches in which the problem of very high data dimensionality is eased as much as possible. Feature selection can be an essential part of such approaches. In this context, though, traditional machine learning methods may struggle. Here, we propose a pipeline to address this problem using partitioning strategies: both vertical, by dividing the data based on gender, and horizontal, by splitting each of the analyzed chromosomes into 5,000-instance subsets. For each subset, Minimum Redundancy and Maximum Relevance feature selection is used to rank the single nucleotide polymorphisms most relevant for classification in the medical dataset.
dc.description.version: Preprint
dc.format.extent: 6 p.
dc.identifier.citation: Bilal, H.; Vellido, A.; Ribas, V. "Enabling interpretation of the outcome of a human obesity prediction machine learning analysis from genomic data". 2018.
dc.identifier.uri: https://hdl.handle.net/2117/179493
dc.language.iso: eng
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO/1PE/TIN2016-79576-R
dc.rights.access: Open Access
dc.subject: Àrees temàtiques de la UPC::Informàtica::Aplicacions de la informàtica::Bioinformàtica
dc.subject: Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
dc.subject.lcsh: Big data
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Genomics
dc.subject.lcsh: Obesity
dc.subject.lemac: Macrodades
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Genòmica
dc.subject.lemac: Obesitat
dc.subject.other: Feature selection
dc.subject.other: Minimum redundancy and maximum relevance
dc.subject.other: SNP
dc.subject.other: Apache Spark
dc.title: Enabling interpretation of the outcome of a human obesity prediction machine learning analysis from genomic data
dc.type: External research report
dspace.entity.type: Publication
local.citation.author: Bilal, H.; Vellido, A.; Ribas, V.
local.identifier.drac: 27266972
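
The abstract describes partitioning the data and then ranking SNPs with Minimum Redundancy and Maximum Relevance (mRMR) feature selection within each partition. The following is a minimal sketch of that idea, not the authors' implementation: it uses plain NumPy rather than Apache Spark, a greedy mRMR with the difference criterion (relevance minus mean redundancy), and it assumes that SNPs are discrete-coded columns and that the 5,000-element subsets are blocks of consecutive columns. The names `mutual_information`, `mrmr_rank`, `rank_by_partition`, `groups`, and `block_size` are all hypothetical.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Build the joint probability table from co-occurrence counts.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def mrmr_rank(X, y, k):
    """Greedy mRMR ranking: relevance to y minus mean redundancy
    with the features already selected (difference criterion)."""
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]
    redundancy_sum = np.zeros(n_features)
    while len(selected) < min(k, n_features):
        last = selected[-1]
        redundancy_sum += np.array(
            [mutual_information(X[:, j], X[:, last]) for j in range(n_features)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf  # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected


def rank_by_partition(X, y, groups, block_size, k):
    """Run mRMR separately for each group (assumed: gender) and each
    block of consecutive columns (assumed: a chromosome subset)."""
    rankings = {}
    for g in np.unique(groups):
        rows = groups == g
        for start in range(0, X.shape[1], block_size):
            cols = np.arange(start, min(start + block_size, X.shape[1]))
            local = mrmr_rank(X[rows][:, cols], y[rows], k)
            # Map block-local indices back to global column indices.
            rankings[(g.item(), start)] = [int(cols[i]) for i in local]
    return rankings
```

Each entry of the returned dictionary is keyed by `(group, block_start)` and holds the global column indices of the top-ranked features for that partition. Because redundancy is penalized, an exact duplicate of an already selected feature is ranked below a less relevant but non-redundant one.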

Files

Original bundle

Name: Vellido.pdf
Size: 250.25 KB
Format: Adobe Portable Document Format