Sparse unifrac algorithm for the study of microbial beta diversity
Tutor / director / evaluator: Daves, Robert H
Document type: Bachelor thesis
Rights access: Restricted access - confidentiality agreement
This document defines an algorithm that allows beta diversity calculations on sparse data and presents the results obtained. The sparse UniFrac algorithm reads a phylogenetic tree and calculates a distance vector only for the nodes that are relevant to the UniFrac distance calculation. It then takes dense column slices of the sparse data matrix and calculates the UniFrac distance between them. Data sets with varying numbers of observations and samples and varying density are processed using sparse UniFrac, and the results show reduced memory usage compared with the current UniFrac code. This reduction in memory usage solves some of the problems that currently arise when calculating UniFrac distances on large amounts of data.
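
The steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which is not shown here. It assumes the unweighted variant of UniFrac (unique branch length divided by total observed branch length), a toy tree stored as a child-to-parent mapping, and a sparse matrix stored as one dictionary of non-zero counts per sample. All names (`TREE`, `SPARSE`, `relevant_lengths`, `dense_slice`, `sparse_unifrac`) are hypothetical, and each "slice" here is a single column, whereas the thesis may slice several columns at once.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch length). The root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 1.5),
    "D": ("n2", 0.5),
    "E": ("n3", 3.0),  # never observed in SPARSE, so its branch is skipped
    "n1": ("root", 1.0),
    "n2": ("n3", 1.0),
    "n3": ("root", 2.0),
}

# Hypothetical sparse data: sample -> {tip: count}, non-zero entries only.
SPARSE = {
    "s1": {"A": 3, "B": 1},
    "s2": {"A": 2, "C": 5},
    "s3": {"C": 1, "D": 4},
}


def branches_above(tip, tree):
    """Yield every node on the path from a tip up to (not including) the root."""
    node = tip
    while node in tree:
        yield node
        node = tree[node][0]


def relevant_lengths(tree, sparse):
    """Build the branch-length vector only for nodes reachable from observed tips."""
    nodes = sorted(
        {n for column in sparse.values() for tip in column
         for n in branches_above(tip, tree)}
    )
    index = {n: i for i, n in enumerate(nodes)}
    lengths = [tree[n][1] for n in nodes]
    return index, lengths


def dense_slice(column, tree, index):
    """Expand one sparse column into a dense presence vector over relevant nodes."""
    present = [False] * len(index)
    for tip in column:
        for n in branches_above(tip, tree):
            present[index[n]] = True
    return present


def unweighted_unifrac(a, b, lengths):
    """Unique branch length divided by total branch length seen in either sample."""
    unique = sum(l for x, y, l in zip(a, b, lengths) if x != y)
    total = sum(l for x, y, l in zip(a, b, lengths) if x or y)
    return unique / total if total else 0.0


def sparse_unifrac(tree, sparse):
    """Pairwise distances; only two dense slices exist in memory at a time."""
    index, lengths = relevant_lengths(tree, sparse)
    result = {}
    for s, t in combinations(sorted(sparse), 2):
        a = dense_slice(sparse[s], tree, index)
        b = dense_slice(sparse[t], tree, index)
        result[(s, t)] = unweighted_unifrac(a, b, lengths)
    return result


distances = sparse_unifrac(TREE, SPARSE)
```

In this sketch the memory saving comes from two places: the length vector covers 7 of the 8 branches because tip `E` is never observed, and the full samples-by-nodes presence matrix is never materialized. Whether the thesis obtains its savings in the same way is an assumption.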