Genetic discovery in a million people - where do we go from here? - UK Biobank
Genetic discovery in a million people – where do we go from here?
Cristen Willer, PhD
Associate Professor, Frank N. Wilson Professor of Cardiovascular Medicine
Department of Internal Medicine, Department of Human Genetics, Department of Computational Medicine and Bioinformatics
“The team, the team, the team” - Bo Schembechler
Sarah Graham, Jonas Nielsen, Brooke Wolford, Wei Zhou, Ida Surakka, Whitney Hornsby
Why do we study the genetics of human diseases and traits? So people can live healthy, active, long lives, avoiding premature death due to heart disease. Hopefully, genes involved in the trait, identified through naturally occurring variation in humans, can become leads for prevention and treatment. Lastly, perhaps we can predict who would most benefit from preventive lifestyle changes, medical screening, or treatment.
Genetics for improved treatment of disease
[Diagram labels: Biobank; Statistical rigor; Clinical trials; Enrollment; New therapeutics; QC; PNPLA5; APOE; Experimental model systems]
HUNT GWAS of 1,400 traits (N~70k)
• Low-pass genomes of 2,202, combined with HRC reference
• Imputed into ~70k HUNT study
• 1,400 diseases and quantitative traits
• ~3 days in the cloud
• Built a results viewer: http://pheweb.sph.umich.edu:5003/pheno/594.1
Colorectal Cancer (N=4,562 cases): BOLT-LMM and SAIGE
SAIGE: Scalable and Accurate Implementation of GEneralized mixed model
• Logistic mixed model: sample relatedness
• Saddlepoint approximation: unbalanced case-control ratio
• Optimization strategies: large scale data
• SAIGE is implemented as an open-source R package available at https://github.com/weizhouUMICH/SAIGE/
• The GWAS results for 1,403 binary phenotypes (3 days) with the PheCodes in UK Biobank using SAIGE are currently available: https://www.dropbox.com/sh/wuj4y8wsqjz78om/AAACfAJK54KtvnzSTAoaZTLma?dl=0
• Michigan PheWeb: http://pheweb.sph.umich.edu/UKBiobank
Zhou et al., Nature Genetics, 2018
Shawn Lee & Wei Zhou
Atrial fibrillation – basic mechanisms
• Irregularities in heartbeat
• 62% heritability in twin study (Christophersen, Circ Arrhythm Electrophysiol, 2009)
Jonas Nielsen
Meta-analysis for Atrial Fibrillation (Nielsen et al., Nat Genet 2018)
• deCODE (Iceland): 13,471 AF cases, 358,161 controls
• DiscovEHR/MyCODE (European ancestry, USA): 6,679 AF cases, 41,803 controls
• MGI (European ancestry, USA): 1,226 AF cases, 11,049 controls
• HUNT-MI (Norway): 6,493 AF cases, 63,142 controls
• UK Biobank (European): 14,820 AF cases, 380,919 controls
• AFGen (mostly European): 17,931 AF cases, 115,142 controls
• Total: 60,620 AF cases, 970,216 controls
163 independent risk variants at 111 loci
Prioritized 163 functional candidate genes likely to be involved in AF
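As a quick arithmetic check, the per-cohort counts on this slide can be summed to confirm the stated totals. The sketch below only restates the slide's numbers; the variable names are illustrative.

```python
# Per-cohort (AF cases, controls) as listed on the slide.
cohorts = {
    "deCODE": (13_471, 358_161),
    "DiscovEHR/MyCODE": (6_679, 41_803),
    "MGI": (1_226, 11_049),
    "HUNT-MI": (6_493, 63_142),
    "UK Biobank": (14_820, 380_919),
    "AFGen": (17_931, 115_142),
}

# Sum cases and controls across all six contributing studies.
total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())

print(total_cases)     # 60620
print(total_controls)  # 970216
```

Both sums match the totals shown for the combined meta-analysis.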
Atrial fibrillation GWAS (111 loci, 80 novel loci)
PITX2 (P = 7 × 10^-443)
Nielsen et al., Nat Genet 2018
Candidate functional genes by biological function (Nielsen et al., Nat Genet 2018)
• Cardiac and skeletal muscle function: AKAP6, COL25A, CFL2, DPT, MYH6, MYH7, MYO18B, MYO1C, MYOCD, MYOT, MYOZ1, MYPN, PKP2, RBM20, SGCA, SSPN, SYNPO2L, TTN, TTN-AS, WIPF1
• TFs cardiac development: EPHA3, GTF2I, HAND2, NAV2, NKX2-5, PITX2, SLIT3, SOX15, SOX5, TBX5, TGFB3
• Intracellular calcium handling in heart: CALU, CAMK2D, CASQ2, PLN, S100A7A
• Cardiac ion channels: GRIK4, KCNC2, KCND3, KCNH2, KCNJ5, KCNN2, KCNN3, SCN10A, SCN5A, SLC9B1
• Angiogenesis: TNFSF12, TNFSF12-TNFSF13
• Hormone signaling: ESR2, IGF1R, JMJD1C, NR3C1, THRB1
• Congenital heart defects: MYH6, NKX2-5, PITX2, TBC1D32, TBX5
PheWAS demonstrates phenotypes correlated with AF
Turn genetic association into biology: RoadMap Epigenomics and ENCODE have catalogued regions of open chromatin for many tissue types. Jonas Nielsen
Pathways enriched for AF genes include failure of heart looping and abnormal heart development. Results from DEPICT. Nielsen et al., Nat Genet 2018
AF-associated variants show enrichment in regions of open chromatin in fetal heart tissue. Analyses performed using GREGOR and GARFIELD. Nielsen et al., Nat Genet 2018
AF association at the MYH6/MYH7 locus: rs422068, intronic to MYH6
Rabbit hearts with mechanically induced HF demonstrate arrhythmia and increased expression of MYH7. Todd Herron, José Jalife
Lifetime risk of AF based on genetic risk score. Nielsen et al., AJHG, 2018
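A genetic risk score is commonly built as a weighted sum of risk-allele dosages across variants. The sketch below illustrates that general idea only; the variant names, weights, and the function `genetic_risk_score` are hypothetical and are not taken from the talk or from the cited paper.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (each dosage is 0, 1, or 2)."""
    missing = set(weights) - set(dosages)
    if missing:
        raise ValueError(f"missing dosages for: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)

# Hypothetical per-variant weights (e.g. log odds ratios) and one person's dosages.
weights = {"variant_1": 0.30, "variant_2": 0.10, "variant_3": 0.05}
person = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

score = genetic_risk_score(person, weights)
print(round(score, 3))
```

Individuals can then be ranked or binned by score to compare risk across the distribution.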
Biobanks allow us to study many phenotypes (genotypes x phenotypes)
• Estimated Glomerular Filtration Rate (eGFR): 157 loci, 53 new loci; sex-specific effects
• Rare LOF indel: 42% risk of fracture to carriers (32% risk for BMD < 2 SD)
• Thyroid Stimulating Hormone: 66 loci – 28 novel; pleiotropy
• 12 liver-related blood traits: 89 coding variants (11 LoF); 17 have impact > 1 SD
Global Lipids Genetics Consortium
Goal: longer, healthier lives (prevent premature death)
Where do I think the future lies?
1. Genetic discovery
• Dual purposes: pharmaceutical targets and identifying high-risk individuals
• Focus on coding variation has proven useful
2. Functional fine-mapping by encouraging collaboration between genetics discovery and molecular biology (iPSC, animal models, GTEx, single-cell)
3. Prediction/prevention
4. Return results to participants (large effect variants & polygenic risk scores) to improve uptake of lifestyle intervention (?)
How to make UK Biobank even more useful!
• Quantify accuracy of any unusual measurements relative to standard clinical tests (e.g. blood lipid levels, eGFR, heel-estimated bone mineral density)
• Self-reported family history information for more phenotypes
• Specialty clinic ascertainment (cardiac surgery, for example, for bicuspid aortic valve)
• Imputation of more variants from TOPMed
• Full exome or genome sequencing (and reference sequences for imputation into other cohorts)
Acknowledgements HUNT-MI Working Team DiscovEHR/MyCode Functional collaborators Kristian Hveem Jonas Nielsen Regeneron Y. Eugene Chen Lars Fritsche Wei Zhou Tanya Teslovich Bo Yang Oddgeir Holmen Sarah Graham Aris Baras Todd Herron Maiken Gabrielsen Ida Surakka Shane McCarthy Pepe José Jalife Anne Heidi Skogholt Brooke Wolford Hyun Min Kang Geisinger GLGC MGI Shawn Lee David Carey Sek Kathiresan Chad Brummett Mike Boehnke Michael Mathis deCODE Rabbit Model Gina Peloso Sachin Kheterpal Kari Stefansson Todd Herron Pradeep Natarajan Daniel Gudbjartsson Jose Jalife All the GLGC cohorts AFGen Consortium Hilma Holm Patrick Ellinor Rosa Thorolfsdottir
cristen@umich.edu
A “proxy case” is an unaffected first or second degree relative of a case
[Figure: panels A–D showing individuals coded F=1, F=0.5, and F=0]
Liu & Pickrell, Nat Genet, 2017, “GWAS by proxy”
Brooke Wolford
Unaffected relatives of cases have intermediate risk allele frequency
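One simple way to see why the frequency is intermediate: a relative carries a fraction of alleles shared with the case, and the rest are drawn from the general population. The sketch below uses that simplified expectation. The function name, the example frequencies, and the approximation itself (it ignores conditioning on the relative being unaffected) are assumptions for illustration, not values from the talk.

```python
def expected_relative_freq(p_case, p_pop, relatedness):
    """Expected risk-allele frequency in a relative of a case.

    relatedness is the coefficient of relationship:
    0.5 for a first-degree relative, 0.25 for a second-degree relative.
    """
    if not 0.0 <= relatedness <= 1.0:
        raise ValueError("relatedness must be between 0 and 1")
    # Mix the case frequency and the population frequency by shared ancestry.
    return relatedness * p_case + (1.0 - relatedness) * p_pop

# Hypothetical risk-allele frequencies in cases and in the population.
p_case, p_pop = 0.12, 0.10

first_degree = expected_relative_freq(p_case, p_pop, 0.5)
second_degree = expected_relative_freq(p_case, p_pop, 0.25)
print(round(first_degree, 4), round(second_degree, 4))
```

Under this approximation the relatives sit between the population and case frequencies, with first-degree relatives closer to cases than second-degree relatives.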
Power improves after modeling unaffected relatives of cases
[Figure: power at alpha=5e−8 versus odds ratio, 1,000 simulations at MAF 0.1, one curve per scheme A–D]
A – Standard GWAS
B – Exclude unaffected relatives of cases
C – Exclude cases (GWAX)
D – Model unaffected relatives of cases with liability
Venous thromboembolism (VTE)
• 2,325 cases
• 65,294 controls
• Case:control ratio = 0.036
• BOLT-LMM (linear mixed model): λall = 1.047
• GMMAT (logistic mixed model): λall = 1.015
• SPA-GMMAT (logistic mixed model + SPA tests): λall = 1.015
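The λ values on this slide are genomic inflation factors. A common definition is the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below computes that from a list of p-values using only the Python standard library, and also re-derives the case:control ratio from the counts on the slide. It is a generic illustration with made-up p-values, not the pipeline used for these results; `genomic_inflation` is a hypothetical helper name.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (~0.455).
EXPECTED_MEDIAN_CHISQ_1DF = _STD_NORMAL.inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Median observed 1-df chi-square divided by its null median."""
    # A two-sided p-value p corresponds to z = inv_cdf(p / 2), chi-square = z^2.
    chisq = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chisq) / EXPECTED_MEDIAN_CHISQ_1DF

# Made-up, evenly spaced p-values mimic a well-calibrated test.
n = 999
uniform_p = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(uniform_p)

# Case:control ratio from the counts on the slide.
case_control_ratio = 2325 / 65294
print(round(lam, 3), round(case_control_ratio, 3))
```

A value near 1 indicates little inflation of test statistics; values above 1 suggest inflation.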
Simultaneous consideration of lipid-lowering variants that are protective against liver disease
Figure 1. ZNF529 silencing induces LDLR expression and LDL-C uptake.
LoF carriers and bone mineral density
• Frameshift indel in MEPE
• 0.8% frequency in Norway
• Impact: ↓ -0.5 SD on BMD
• ↑ Fracture risk: OR 1.4 – 1.8 (p ~ 10^-5)
• Carrier fracture risk: 42%