PgmNr 351: gnomAD-SV: An open resource of structural variation for medical and population genetics.
Authors:
R.L. Collins 1,2,3; H. Brand 1,2,4; K.J. Karczewski 1,2; X. Zhao 1,2,4; J. Alföldi 1,2; A.V. Khera 1,2; L.C. Francioli 1,2,5; L.D. Gauthier 1,2,6; H. Wang 1,2; N.A. Watts 1,2; M. Solomonson 1,2; A. O’Donnell-Luria 1,2; A. Baumann 6; R. Munshi 6; C. Lowther 1,2,4; M. Walker 1,2,6; C. Whelan 6,7; E. Valkanas 1,2,3; J. Fu 1,2; A. Philippakis 6; E. Lander 1,8,9; S. Gabriel 1; B.M. Neale 1,2,3,7; S. Kathiresan 1,2,5,10; M.J. Daly 1,2,3,7,11; E. Banks 6; D.G. MacArthur 1,2,3,5; M.E. Talkowski 1,2,3,4,7; The Genome Aggregation (gnomAD) Consortium
1) Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; 2) Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; 3) Division of Medical Sciences, Harvard Medical School, Boston, MA; 4) Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA; 5) Department of Medicine, Harvard Medical School, Boston, MA; 6) Data Science Platform, Broad Institute of Harvard and M.I.T., Cambridge, MA; 7) Stanley Center for Psychiatric Research, Broad Institute of Harvard and M.I.T., Cambridge, MA; 8) Department of Systems Biology, Harvard Medical School, Boston, MA; 9) Division of Health Sciences and Technology, M.I.T., Cambridge, MA; 10) Division of Cardiology, Massachusetts General Hospital, Boston, MA; 11) Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
Structural variants (SVs) rearrange the linear and three-dimensional organization of the genome, which can have profound consequences in evolution, diversity, and disease. As national biobanks, disease association studies, and clinical genetic testing increasingly rely on whole-genome sequencing, population variation references have become integral to the evaluation and interpretation of genomic variation. Here, we constructed a reference atlas of SVs from 32X short-read whole-genome sequencing (WGS) of 14,891 individuals across diverse global populations (54% non-European) as a component of gnomAD. We discovered a rich landscape of 498,257 unique SVs, including 5,729 multi-breakpoint complex SVs across 13 mutational subclasses, and examples of localized chromosome shattering, such as chromothripsis. SVs were non-uniformly distributed across the chromosomes and SV classes; likewise, mutation rate estimates varied substantially by SV class. Signatures of selection were strongest against inversions and complex SVs, which appeared to be attributable to both coding and noncoding effects. We discovered strong correlations between constraint against predicted loss-of-function (pLoF) SNVs and rare SVs that both disrupt and duplicate protein-coding genes, suggesting that existing per-gene metrics of pLoF SNV constraint do not simply reflect haploinsufficiency, but appear to capture a gene’s general sensitivity to dosage alterations. Our SV pipelines detected 8,202 SVs per genome, including eight rare, gene-altering SVs, and we predicted that SVs constitute at least 25% of all rare loss-of-function events per genome. We observed large (≥1 Mb), rare SVs in 3.1% of genomes (∼1:32 individuals), and a clinically reportable pathogenic incidental finding from SVs in 0.24% of genomes (∼1:417 individuals).
We also estimated the prevalence of previously reported pathogenic recurrent CNVs associated with genomic disorders, which highlighted differences in frequencies across populations and confirmed that WGS-based analyses can readily recapitulate these clinically important variants. In total, gnomAD-SV includes at least one CNV covering 57% of the genome, while the remaining 43% is significantly enriched for CNVs found in tumors and individuals with developmental disorders. The gnomAD-SV map is browsable online (https://gnomad.broadinstitute.org), which will allow broad, hands-on access to these results as a resource for medical and population genetics.
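The per-genome frequencies quoted above are given both as percentages and as approximate "1 in N" ratios (3.1% as ∼1:32, 0.24% as ∼1:417). As a minimal sketch of that arithmetic only, and not part of the gnomAD-SV pipeline, the conversion can be reproduced as follows; the function name `one_in_n` is a hypothetical helper introduced here for illustration.

```python
def one_in_n(percent: float) -> int:
    """Convert a prevalence given in percent to the nearest '1 in N' ratio.

    Hypothetical helper for illustration; it only restates the arithmetic
    behind the ratios quoted in the abstract.
    """
    if percent <= 0:
        raise ValueError("prevalence must be a positive percentage")
    # A prevalence of p percent corresponds to one carrier per 100 / p genomes.
    return round(100.0 / percent)


# Percentages taken from the abstract.
print(f"large rare SVs: 1:{one_in_n(3.1)}")        # 1:32
print(f"incidental findings: 1:{one_in_n(0.24)}")  # 1:417
```

Rounding to the nearest integer matches the approximate ratios reported in the text; the underlying values are 100 / 3.1 ≈ 32.3 and 100 / 0.24 ≈ 416.7.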