Unsupervised ensemble learning for genome sequencing

Pagès Zamora, Alba Maria; Ochoa Álvarez, Idoia; Ruiz Cavero, Gonzalo; Villalvilla Ornat, Pol

doi:10.1016/j.patcog.2022.108721


1-s2.0-S0031320322002023-main.pdf (1.375 MB)


Document type: Article

Publication date: 2022-09

Access conditions: Open access

License: Except where otherwise noted, the content of this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.

Project: AVANCES EN CODIFICACION Y PROCESADO DE SEÑAL PARA LA SOCIEDAD DIGITAL (AEI-PID2019-104958RB-C41)

Abstract

Unsupervised ensemble learning refers to methods devised for a particular task that combine data provided by decision learners, taking into account their reliability, which is usually inferred from the data. Here, the variant calling step of next-generation sequencing technologies is formulated as an unsupervised ensemble classification problem. A variant calling algorithm based on the expectation-maximization algorithm is further proposed that estimates the maximum-a-posteriori decision among a number of classes larger than the number of different labels provided by the learners. Experimental results with real human DNA sequencing data show that the proposed algorithm is competitive with state-of-the-art variant callers such as GATK, HTSLIB, and Platypus.
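As a rough illustration of the general idea in the abstract, the following is a minimal sketch of a generic Dawid-Skene-style expectation-maximization scheme for unsupervised ensemble classification. It is not the algorithm proposed in the article: it assumes as many classes as labels, whereas the article addresses more classes than labels, and it uses synthetic votes rather than sequencing reads. The names `simulate`, `em_ensemble`, and `map_decisions`, and all parameter values, are hypothetical choices made for this example.

```python
import math
import random


def simulate(n_items=500, accuracies=(0.9, 0.85, 0.8, 0.6, 0.55), n_classes=3, seed=0):
    """Draw synthetic true classes and noisy learner votes (illustrative data only)."""
    rng = random.Random(seed)
    truth = [rng.randrange(n_classes) for _ in range(n_items)]
    labels = []
    for t in truth:
        row = []
        for acc in accuracies:
            if rng.random() < acc:
                row.append(t)  # the learner votes for the true class
            else:
                # a wrong vote is one of the other classes, chosen uniformly
                row.append((t + rng.randrange(1, n_classes)) % n_classes)
        labels.append(row)
    return truth, labels


def em_ensemble(labels, n_classes, n_iter=50, eps=1e-6):
    """Estimate class posteriors and learner reliabilities without ground truth."""
    n_items, n_learners = len(labels), len(labels[0])
    # Initialise the posteriors with normalised vote counts (soft majority vote).
    post = [[sum(v == k for v in row) / n_learners for k in range(n_classes)]
            for row in labels]
    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per learner,
        # conf[m][k][v] ~ P(learner m outputs v | true class k).
        prior = [sum(p[k] for p in post) / n_items for k in range(n_classes)]
        conf = [[[eps] * n_classes for _ in range(n_classes)]
                for _ in range(n_learners)]
        for row, p in zip(labels, post):
            for m, v in enumerate(row):
                for k in range(n_classes):
                    conf[m][k][v] += p[k]
        for m in range(n_learners):
            for k in range(n_classes):
                total = sum(conf[m][k])
                conf[m][k] = [c / total for c in conf[m][k]]
        # E-step: posterior of each class given all votes, computed in the log domain.
        new_post = []
        for row in labels:
            logp = [math.log(prior[k] + eps)
                    + sum(math.log(conf[m][k][v]) for m, v in enumerate(row))
                    for k in range(n_classes)]
            top = max(logp)
            weights = [math.exp(x - top) for x in logp]
            norm = sum(weights)
            new_post.append([w / norm for w in weights])
        post = new_post
    return post, conf, prior


def map_decisions(post):
    """Maximum-a-posteriori decision: pick the most probable class per item."""
    return [max(range(len(p)), key=p.__getitem__) for p in post]


if __name__ == "__main__":
    truth, labels = simulate()
    post, conf, prior = em_ensemble(labels, n_classes=3)
    decisions = map_decisions(post)
    accuracy = sum(d == t for d, t in zip(decisions, truth)) / len(truth)
    print(f"MAP accuracy on synthetic data: {accuracy:.3f}")
```

The reliability of each learner is never given to the procedure; it is inferred from the agreement pattern among the votes, which is the sense in which the ensemble is unsupervised. Initialising with the majority vote is one common choice that keeps the estimated classes aligned with the label names.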

Citation: Pagès-Zamora, A. [et al.]. Unsupervised ensemble learning for genome sequencing. "Pattern Recognition", September 2022, vol. 129, article 108721, p. 1-9.

URI: http://hdl.handle.net/2117/367991

DOI: 10.1016/j.patcog.2022.108721

ISSN: 0031-3203

Publisher version: https://www.sciencedirect.com/science/article/pii/S0031320322002023


