A systematic and comprehensive approach for large-scale genome-wide association studies. Unraveling non-additive inheritance models in age-related diseases

dc.contributor
Universitat de Barcelona. Facultat de Biologia
dc.contributor.author
Guindo Martínez, Marta
dc.date.accessioned
2020-02-14T11:02:32Z
dc.date.available
2020-12-17T01:00:18Z
dc.date.issued
2019-12-18
dc.identifier.uri
http://hdl.handle.net/10803/668641
dc.description
Doctoral Programme in Biomedicine / Thesis carried out at the Barcelona Supercomputing Center (BSC)
en_US
dc.description.abstract
Genome-wide association studies (GWAS) have proven useful for identifying thousands of associations between genetic variants and complex human diseases and traits. However, the identified loci account for only a small proportion of the estimated heritability (i.e., the proportion of variance in a particular phenotype that can be explained by genetic factors). The usually small effect sizes of common variants and the low frequencies of some variants with potentially larger effect sizes limit the statistical power of GWAS. These limitations can be overcome by analyzing larger samples and imputing genotypes using dense reference panels. However, there is still room for improvement beyond increasing the sample size and the number of variants. Because current GWAS focus predominantly on the autosomes and test only the additive model, current strategies still constrain the full potential of GWAS. In this thesis, we hypothesized that a comprehensive analysis that improves on current GWAS strategies by 1) analyzing the X chromosome alongside the autosomes, 2) including genetic variants from a broader allele frequency spectrum and of more types, such as small insertions and deletions (INDELs), through genotype imputation using multiple reference panels, and 3) testing different models of inheritance in the association test, would improve our understanding of the genetic architecture of complex diseases. To test these hypotheses, we developed an integrated framework that implements our methodology, called GUIDANCE.
GUIDANCE integrates state-of-the-art tools for GWAS analysis, including analysis of the X chromosome, two-step imputation with multiple reference panels, association testing under additive, dominant, recessive, heterodominant and genotypic inheritance models, and cross-phenotype association analysis when more than one disease is available in the cohort under study. We used GUIDANCE to analyze the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort, a publicly available cohort that includes 62,281 subjects of European ancestry with an average age of 63 years and data on 22 diseases, representing the largest cohort for age-related diseases to date. After quality control, we analyzed 56,637 subjects of European descent. Following our methodology, we imputed genotypes using the 1000 Genomes Project (1000G) phase 3, the Genome of the Netherlands project (GoNL), the UK10K project, and the Haplotype Reference Consortium (HRC) as reference panels. Using this strategy, we identified 26 new associated loci for 16 phenotypes (p < 5 × 10⁻⁸), 13 of which showed significant dominance deviation (p < 0.05). Importantly, we identified three recessive loci with large effects that could not have been identified by the additive model. These include a region led by an INDEL in CACNB4 associated with cardiovascular disease (rs201654520, minor allele frequency [MAF] = 0.017, odds ratio [OR] = 19.02, p = 4.32 × 10⁻⁸), a locus near PELO associated with type 2 diabetes, with the greatest odds ratio for type 2 diabetes in Europeans reported to date (rs77704739, MAF = 0.036, OR = 4.32, p = 1.75 × 10⁻⁸), and a rare INDEL near THUMPD2 associated with age-related macular degeneration (rs557998486, MAF = 0.009, OR = 10.5, p = 2.75 × 10⁻⁸).
Despite the phenotype discrepancies and the different demographic characteristics of the GERA cohort and UK Biobank, four of the novel loci were replicated with an equivalent phenotype in UK Biobank, and for the remaining novel loci we found additional supporting associations with related traits, treatments or biomarkers in UK Biobank. Of note, the PELO and THUMPD2 recessive loci were replicated using the recessive model in UK Biobank (combined results: PELO, rs77704739, OR = 2.46, p = 4.68 × 10⁻¹¹, and THUMPD2, rs557998486, OR = 26.51, p = 3.29 × 10⁻⁸); these associations could not have been found with the additive model. Overall, these results highlight the importance of comprehensively analyzing the full spectrum of genetic variation and considering non-additive models when performing GWAS, especially given well-powered biobanks and the increasing ability to impute low-frequency variants. For the benefit of the research community, we make available both GUIDANCE, to boost the analysis of existing and ongoing GWAS projects, and the GERA cohort results, which constitute the largest non-additive genetic variation association database to date, through the Type 2 Diabetes Knowledge Portal (http://www.type2diabetesgenetics.org).
en_US
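As an illustration of the inheritance models named in the abstract, the following minimal sketch shows the conventional genotype codings for a biallelic variant. It is not taken from GUIDANCE: the names `MODEL_CODINGS` and `encode` are hypothetical, and the sketch assumes each genotype is given as the count (0, 1 or 2) of the tested allele.

```python
# Hypothetical helper, not part of GUIDANCE. Genotypes are assumed to be
# counts (0, 1 or 2) of the tested allele at a biallelic variant.
MODEL_CODINGS = {
    "additive":       {0: 0, 1: 1, 2: 2},  # effect scales with allele count
    "dominant":       {0: 0, 1: 1, 2: 1},  # one copy is enough
    "recessive":      {0: 0, 1: 0, 2: 1},  # two copies are required
    "heterodominant": {0: 0, 1: 1, 2: 0},  # heterozygotes differ from both homozygotes
}


def encode(genotypes, model):
    """Return the predictor values for a list of genotypes under a model."""
    if model == "genotypic":
        # Two indicator variables (heterozygote, homozygote for the tested
        # allele), so each genotype class gets its own effect.
        return [(int(g == 1), int(g == 2)) for g in genotypes]
    coding = MODEL_CODINGS[model]
    return [coding[g] for g in genotypes]


if __name__ == "__main__":
    sample = [0, 1, 2, 1]
    for model in ("additive", "dominant", "recessive", "heterodominant", "genotypic"):
        print(model, encode(sample, model))
```

Under the recessive coding, heterozygotes are grouped with non-carriers, which is why an effect confined to homozygotes can be diluted when only the additive coding is tested.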
dc.format.extent
267 p.
en_US
dc.format.mimetype
application/pdf
dc.language.iso
eng
en_US
dc.publisher
Universitat de Barcelona
dc.rights.license
Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.uri
http://creativecommons.org/licenses/by-nc-nd/4.0/
*
dc.source
TDX (Tesis Doctorals en Xarxa)
dc.subject
Genètica
en_US
dc.subject
Genética
en_US
dc.subject
Genetics
en_US
dc.subject
Epidemiologia genètica
en_US
dc.subject
Epidemiología genética
en_US
dc.subject
Genetic epidemiology
en_US
dc.subject
Bioinformàtica
en_US
dc.subject
Bioinformática
en_US
dc.subject
Bioinformatics
en_US
dc.subject.other
Experimental Sciences and Mathematics
en_US
dc.title
A systematic and comprehensive approach for large-scale genome-wide association studies. Unraveling non-additive inheritance models in age-related diseases
en_US
dc.type
info:eu-repo/semantics/doctoralThesis
dc.type
info:eu-repo/semantics/publishedVersion
dc.subject.udc
577
en_US
dc.contributor.director
Torrents Arenales, David
dc.contributor.director
Mercader Bigas, Josep Maria
dc.contributor.tutor
Gelpi Buchaca, Josep Lluís
dc.embargo.terms
12 months
en_US
dc.rights.accessLevel
info:eu-repo/semantics/openAccess


Documents

MGM_PhD-THESIS.pdf

44.46Mb PDF
