Enabling genomics pipelines in commodity personal computers with flash storage
Cite as: hdl:2117/346791
Document type: Article
Defense date: 2021-04
Publisher: Frontiers Media SA
Rights access: Open Access
Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution 4.0 International
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
Hi-EST - Holistic Integration of Emerging Supercomputing Technologies (EC-H2020-639595)
Abstract
Analysis of a patient’s genomics data is the first step toward precision medicine. Such analyses are performed on expensive enterprise-class server machines because input data sets are large, and the intermediate data structures are even larger (TB-size) and require random accesses. We present a general method to solve a specific genomics problem, mutation detection, on a cheap commodity personal computer (PC) with a small amount of DRAM. We construct and access large histograms of k-mers efficiently on external storage (SSDs) and apply our technique to a state-of-the-art reference-free genomics algorithm, SMUFIN, to create SMUFIN-F. We show that on two PCs, SMUFIN-F can achieve the same throughput at only one third (36%) the hardware cost and half (45%) the energy compared to SMUFIN on an enterprise-class server. To the best of our knowledge, SMUFIN-F is the first reference-free system that can detect somatic mutations on commodity PCs for whole human genomes. We believe our technique should apply to other k-mer or n-gram-based algorithms.
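For readers unfamiliar with the term, a k-mer histogram maps every length-k substring of the sequencing reads to the number of times it occurs. The following is a minimal in-memory sketch of that idea only. It is not SMUFIN-F's implementation, which builds such histograms on SSDs. The function name `kmer_histogram` and the toy reads are assumptions made for illustration.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Count every length-k substring (k-mer) across the given reads.

    Illustrative only: everything is held in DRAM here, whereas the
    abstract describes histograms kept on external storage (SSDs).
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k over the read; reads shorter than k
        # yield an empty range and contribute nothing.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy input (hypothetical reads, not real sequencing data).
hist = kmer_histogram(["ACGTAC", "CGTACG"], k=3)
```

For whole human genomes the number of distinct k-mers is what pushes such a table to the TB scale mentioned in the abstract, which is why a DRAM-resident dictionary like this one does not fit on a commodity PC.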
Citation: Cadenelli, N. [et al.]. Enabling genomics pipelines in commodity personal computers with flash storage. "Frontiers in genetics", April 2021, vol. 12, article 615958, p. 1-18.
ISSN: 1664-8021
Publisher version: https://www.frontiersin.org/articles/10.3389/fgene.2021.615958/full
Files | Description | Size | Format | View |
---|---|---|---|---|
fgene-12-615958.pdf | | 4,784Mb | PDF | View/Open |