Algebraic phylogenetic reconstruction based on proteins and its application to Persea americana
Document type: Bachelor thesis
Rights access: Open Access
Phylogenetic reconstruction methods attempt to infer the evolutionary relationships among a group of species. Besides the classical reconstruction methods, new methods have been developed based on algebraic relationships satisfied by the theoretical distribution of the molecular characters. So far these methods, such as the SVD method, have been applied only to nucleotide sequences. Moreover, the SVD method has been restricted to the reconstruction of quartet trees. This thesis extends the SVD method to any number of character states (implemented in C++) and tests it on several scenarios of simulated amino-acid sequences. Next, SVD is incorporated into a supertree method, specifically the quartet-based method Weight Optimization (WO), so that a tree with any number of species can be reconstructed. SVD+WO is tested on sequences simulated in analogy with a real data set, and the study of this data set has also allowed us to shed light on the controversial phylogeny of avocado. In addition, all the simulated data are reconstructed with maximum likelihood software so that the results of the two reconstruction methods can be compared. Although the results obtained with our method SVD+WO are worse than those obtained by maximum likelihood, our method can deal with much more general evolutionary models and takes less computation time.
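To make the idea of an SVD-based quartet score with an arbitrary number of character states concrete, the following is a minimal sketch and not the thesis's C++ implementation. It assumes the usual flattening-matrix formulation of the SVD method: for a split of four leaves into two pairs, site-pattern frequencies are arranged in a k² × k² matrix, and the score is the Frobenius distance from that matrix to the nearest matrix of rank at most k, where k is the number of states (4 for nucleotides, 20 for amino acids). The function names `flattening`, `symmetric_eigenvalues` and `svd_score` are hypothetical, any normalization used in the thesis is omitted, and the singular values are obtained from a small hand-written Jacobi routine only to keep the example dependency-free.

```python
from collections import Counter
from itertools import product
import math


def flattening(seqs, split, states):
    """Relative-frequency flattening matrix of a 4-sequence alignment.

    seqs: four equal-length sequences; split: ((i, j), (k, l)) leaf indices.
    Rows are indexed by the states at leaves (i, j), columns by (k, l).
    Assumes every character in seqs belongs to `states` (no gaps).
    """
    (a, b), (c, d) = split
    pairs = list(product(states, repeat=2))
    index = {p: n for n, p in enumerate(pairs)}
    counts = Counter(zip(*seqs))          # site pattern -> number of sites
    total = sum(counts.values())
    m = [[0.0] * len(pairs) for _ in pairs]
    for pattern, n in counts.items():
        row = index[pattern[a], pattern[b]]
        col = index[pattern[c], pattern[d]]
        m[row][col] += n / total
    return m


def symmetric_eigenvalues(g, sweeps=50, tol=1e-24):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(g)
    a = [row[:] for row in g]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):        # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):        # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def svd_score(m, k):
    """Frobenius distance from m to the nearest matrix of rank <= k.

    Uses the eigenvalues of m^T m, which are the squared singular values.
    """
    n = len(m)
    gram = [[sum(m[r][i] * m[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]
    eig = sorted(symmetric_eigenvalues(gram), reverse=True)
    return math.sqrt(sum(max(e, 0.0) for e in eig[k:]))


if __name__ == "__main__":
    # Toy two-state alignment in which leaves 0,1 agree and leaves 2,3 agree.
    seqs = ["0011", "0011", "0101", "0101"]
    for split in [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]:
        print(split, svd_score(flattening(seqs, split, "01"), k=2))
```

Under these assumptions, the split with the smallest score is taken as the inferred quartet topology; in the toy alignment the split grouping leaves 0 and 1 scores essentially zero while the other two do not. Passing the 20 amino-acid letters as `states` with `k=20` gives a 400 × 400 flattening, for which a production implementation would rely on an optimized SVD routine rather than this pure-Python loop.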