Universitat de Barcelona. Facultat de Biologia
In recent years, genetics research has placed significant emphasis on identifying and characterizing the genetic factors that, alongside environmental factors, contribute to complex diseases. Genome-wide association studies (GWAS) have emerged as one of the principal methodologies for this purpose: they analyze extensive genetic and phenotypic data from many individuals to identify genetic variants associated with specific traits. This approach has advanced our understanding of the genetic architecture of complex diseases, enabling the development of prevention strategies and genetic risk estimation. Despite this progress, much remains to be uncovered, as reflected in the heritability discrepancy: the difference between the heritability estimated in population studies and that explained by known genetic variants. Several methodological and statistical limitations slow the identification of the genetic variation associated with the risk of developing complex diseases. Current GWAS rely on single nucleotide polymorphism (SNP) arrays, which cover a limited number of variants. To overcome this, the number of variants analyzed can be increased by imputing pre-existing genetic variants from reference panels. However, reference panels frequently exclude rare variants and structural variants (SVs), so these variants are not considered during imputation and potential associations are missed. Another element neglected in most studies of complex diseases is the X chromosome, one of the two sex chromosomes, whose unique biology results in a different copy number in females and males. The SNP-trait associations reported in the National Human Genome Research Institute's (NHGRI) GWAS Catalog show a clear shortfall in the representation of the X chromosome: only 0.5% of the known associations map to it.
This under-representation is primarily due to the methodological challenges of analyzing the X chromosome. Its unique pattern of inheritance and the effects of allelic inactivation in females can produce allelic imbalances between the sexes and reduce statistical power in genetic association studies. In this thesis, we address these challenges in two ways. First, we create a comprehensive genetic resource: a haplotype map particularly enriched in well-characterized, phased SVs. Second, we address the gap in X-chromosome analysis by designing, implementing, and applying a targeted methodology to study the role of the X chromosome across multiple phenotypes. The haplotype map was generated from 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, using multiple variant identification methods and Logistic Regression Models (LRMs) for their validation. The resulting catalog comprises 35,431,441 variants across all individuals in the cohort: 89,178 SVs (≥50 bp), 30,325,064 SNVs, and 5,017,199 indels. The haplotype panel demonstrates improved imputation capabilities, with 14,360,728 SNVs/indels and 23,179 SVs imputed, a 2.7-fold increase in SVs compared to other available genetic variation panels. The panel's significance is highlighted by the imputation of a rare Alu element located in a new locus associated with mononeuritis of the lower limb, a rare neuromuscular disease. This study represents the first in-depth characterization of genetic variation in the Iberian population and the first haplotype panel that systematically includes SVs in genome-wide genetic studies. The X-chromosome targeted strategy was designed and applied to nearly 800,000 individuals across 600 phenotypes from publicly available cohorts (UK Biobank and dbGaP).
This pipeline comprises data collection, a dedicated quality control procedure that is fundamental to X-chromosome analysis, and the phasing, imputation, and association steps, which were performed separately in females and males before meta-analyzing the results, allowing sex differences to be detected. Our analysis of nearly 500,000 X-linked variants, including SVs, yielded 96 significant associations with 77 traits, with 75 of these being novel. By incorporating sex-specific analyses, we identified 41 loci that behave differently in males and females. These findings give insight into the extent of the missing information and into the potential role of the X chromosome in complex diseases, as well as its contribution to sex-specific risk and manifestation. In conclusion, this work highlights the importance of considering SVs and the X chromosome in genetic studies, particularly when exploring the genetic architecture of complex human diseases. The findings offer a valuable resource for further examination of the genetic components that contribute to complex diseases, a step towards a more complete understanding of the genetic landscape and its effects on human health.
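The sex-stratified step described in the abstract (separate association analyses in females and males, followed by meta-analysis) can be illustrated with a minimal sketch. The abstract does not state which meta-analysis model or sex-difference test was used, so the fixed-effect inverse-variance weighting and the two-sample z-test below are assumptions chosen for illustration, and all function names are hypothetical. The sketch also assumes that the female and male samples are independent.

```python
import math


def ivw_meta(beta_f, se_f, beta_m, se_m):
    """Combine female and male effect estimates (assumed method:
    fixed-effect inverse-variance weighting)."""
    w_f = 1.0 / se_f ** 2  # weight = 1 / variance
    w_m = 1.0 / se_m ** 2
    beta = (w_f * beta_f + w_m * beta_m) / (w_f + w_m)
    se = math.sqrt(1.0 / (w_f + w_m))
    return beta, se


def sex_difference_z(beta_f, se_f, beta_m, se_m):
    """z-statistic for a difference between the sex-specific effects,
    assuming the two estimates come from independent samples."""
    return (beta_f - beta_m) / math.sqrt(se_f ** 2 + se_m ** 2)


def two_sided_p(z):
    """Two-sided p-value of a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-variant summary statistics (not taken from the thesis).
beta_f, se_f = 0.12, 0.03
beta_m, se_m = 0.02, 0.04

beta, se = ivw_meta(beta_f, se_f, beta_m, se_m)
z_diff = sex_difference_z(beta_f, se_f, beta_m, se_m)
print(f"meta-analysis: beta={beta:.4f}, se={se:.4f}")
print(f"sex difference: z={z_diff:.2f}, p={two_sided_p(z_diff):.4f}")
```

A variant with a large sex-difference z-statistic would be a candidate for the kind of sex-specific behavior reported in the abstract, although the criteria actually used in the thesis may differ.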
Genomics; Chromosomes
575 - General genetics. General cytogenetics. Immunogenetics. Evolution. Phylogeny
Experimental Sciences and Mathematics
Doctoral Programme in Biomedicine / Thesis carried out at the Barcelona Supercomputing Center (BSC)