dc.contributor: Svensson, Thomas
dc.contributor: Jiménez González, Daniel
dc.contributor.author: Carrasco Hernandez, Guillermo
dc.date.accessioned: 2013-07-08T17:11:22Z
dc.date.available: 2013-07-08T17:11:22Z
dc.date.issued: 2013-06-26
dc.identifier.uri: http://hdl.handle.net/2099.1/18704
dc.description.abstract: In recent years, the cost of obtaining biological data has been drastically reduced. This has led to exponential growth in the available data. Having so much data to analyze sometimes results in software solutions that are very platform-dependent and difficult to scale. This final project tries to provide a solution to those problems in a real bioinformatics core facility at the Science For Life Laboratory. Science For Life Laboratory is a center for large-scale biosciences with a focus on health and environmental research. It is located in Stockholm, Sweden. The laboratory currently has 15 next-generation sequencing instruments, with a combined DNA sequencing capacity equal to several hundred complete human genomes per year. This implies a massive amount of data to be managed and analyzed. This data is analyzed using bcbio-nextgen, a genomics pipeline maintained in-house and originally developed by Brad Chapman at Harvard School of Public Health [Rom12]. The first goal of this project is to automate the installation, deployment and testing of this pipeline. The second goal is to modify the alignment step of the analysis to use Seal, a Hadoop-based aligner. This will allow us to check that all automations are working properly, as the pipeline will have to be installed and tested on several nodes.
dc.language.iso: eng
dc.publisher: Universitat Politècnica de Catalunya
dc.subject: UPC subject areas::Computer science::Computer science applications::Bioinformatics
dc.subject.lcsh: Bioinformatics
dc.subject.other: Hadoop
dc.subject.other: automation
dc.subject.other: parallelization
dc.subject.other: continuous integration
dc.subject.other: cluster
dc.subject.other: dna
dc.subject.other: sequence alignment
dc.subject.other: bioinformatics
dc.subject.other: fastq
dc.title: Automating installation, testing and development of bcbio-nextgen pipeline
dc.title.alternative: Paral·lelització del pipeline Bcbio-nextgen per al tractament de dades genòmiques
dc.type: Master thesis (pre-Bologna period)
dc.subject.lemac: Bioinformàtica
dc.identifier.slug: 85882
dc.rights.access: Open Access
dc.date.updated: 2013-06-29T22:08:52Z
dc.audience.educationlevel: First/second cycle studies
dc.audience.mediator: Facultat d'Informàtica de Barcelona
dc.audience.degree: ENGINYERIA INFORMÀTICA (Pla 2003)

