%0 Journal Article %J Nature %D 2020 %T Mapping and characterization of structural variation in 17,795 human genomes. %A Abel, Haley J %A Larson, David E %A Regier, Allison A %A Chiang, Colby %A Das, Indraniel %A Kanchi, Krishna L %A Layer, Ryan M %A Neale, Benjamin M %A Salerno, William J %A Reeves, Catherine %A Buyske, Steven %A Matise, Tara C %A Muzny, Donna M %A Zody, Michael C %A Lander, Eric S %A Dutcher, Susan K %A Stitziel, Nathan O %A Hall, Ira M %K Alleles %K Case-Control Studies %K Continental Population Groups %K Epigenesis, Genetic %K Female %K Gene Dosage %K Genetic Variation %K Genetics, Population %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Male %K Molecular Sequence Annotation %K Quantitative Trait Loci %K Software %K Whole Genome Sequencing %X

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

%B Nature %V 583 %P 83-89 %8 2020 07 %G eng %N 7814 %1 https://www.ncbi.nlm.nih.gov/pubmed/32460305?dopt=Abstract %R 10.1038/s41586-020-2371-0

%0 Journal Article %J Bioinformatics %D 2017 %T SVScore: an impact prediction tool for structural variation. %A Ganel, Liron %A Abel, Haley J %A Hall, Ira M %K Gene Frequency %K Genomic Structural Variation %K Genomics %K Humans %K Polymorphism, Single Nucleotide %K Sequence Deletion %K Software %X

Summary: Here we present SVScore, a tool for in silico structural variation (SV) impact prediction. SVScore aggregates per-base single nucleotide polymorphism (SNP) pathogenicity scores across relevant genomic intervals for each SV in a manner that considers variant type, gene features and positional uncertainty. We show that the allele frequency spectrum of high-scoring SVs is strongly skewed toward lower frequencies, suggesting that they are under purifying selection, and that SVScore identifies deleterious variants more effectively than alternative methods. Notably, our results also suggest that duplications are under surprisingly strong selection relative to deletions, and that there are a similar number of strongly pathogenic SVs and SNPs in the human population.

Availability and Implementation: SVScore is implemented in Perl and available freely at http://www.github.com/lganel/SVScore for use under the MIT license.

Contact: ihall@wustl.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 33 %P 1083-1085 %8 2017 04 01 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28031184?dopt=Abstract %R 10.1093/bioinformatics/btw789