Identification of spatial communities in the human genome graph to better understand HIV latency and insertion
Tutor / director / evaluator: Pagès Zamora, Alba Maria
Document type: Master thesis
Rights access: Open Access
In this work, the 3D spatial organization of a human Jurkat cell, an immune cell that is one of the main targets of the human immunodeficiency virus (HIV), is analyzed through the clustering of genome interaction networks provided by Hi-C data. Hi-C is a 3D massive sequencing technology capable of quantifying interactions among regions of the genome inside the nucleus of a cell. The data analysis approach consists of a graph-theoretic modelling of these networks, and the clustering analysis is performed with spectral clustering methods, a family of clustering techniques based on the spectral decomposition of the Laplacian matrices of graphs. Once the 3D structure of the Jurkat cell is inferred at the nuclear scale, the distribution of HIV integration sites on the Jurkat genome is analyzed and contrasted with the current knowledge of the integration mechanisms and their relationship with the 3D genomic context. The clustering results are also evaluated through a common set of metrics, which serve to objectively assess the 3D structure of the nucleus of the Jurkat cell. With the proposed data analysis, the main findings are: the 3D spatial structure is not prominent, the global genomic interaction network contains just a few communities, and the insertion pattern of HIV, contrasted with the detected communities, confirms the established knowledge of HIV integration mechanisms.
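The general idea of spectral clustering on an interaction graph can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function name `spectral_bipartition` and the toy contact matrix are hypothetical, the matrix is not real Hi-C data, and the sketch assumes a symmetric weighted graph with no isolated nodes. It builds the symmetric normalized Laplacian and splits the nodes into two communities by the sign of the second eigenvector (the Fiedler vector).

```python
import numpy as np


def spectral_bipartition(adjacency):
    """Split a weighted undirected graph into two communities.

    Hypothetical helper for illustration only. Assumes a symmetric
    adjacency matrix in which every node has a positive degree.
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    inv_sqrt_deg = 1.0 / np.sqrt(degrees)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(A)) - inv_sqrt_deg[:, None] * A * inv_sqrt_deg[None, :]

    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigenvectors = np.linalg.eigh(laplacian)

    # Second-smallest eigenvector, rescaled by D^{-1/2}
    fiedler = eigenvectors[:, 1] * inv_sqrt_deg

    # The sign of each entry gives the community label (0 or 1)
    return (fiedler > 0).astype(int)


# Toy contact matrix (made-up values): two groups of three loci with
# strong contacts inside each group and weak contacts between groups.
strong, weak = 10.0, 1.0
toy_contacts = np.full((6, 6), weak)
toy_contacts[:3, :3] = strong
toy_contacts[3:, 3:] = strong
np.fill_diagonal(toy_contacts, 0.0)

labels = spectral_bipartition(toy_contacts)
print(labels)
```

On this toy matrix the first three loci receive one label and the last three the other; which group is labelled 0 depends on the arbitrary sign of the eigenvector. Detecting more than two communities would typically use several eigenvectors followed by a clustering step such as k-means, which this sketch leaves out.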