%0 Journal Article %J Am J Hum Genet %D 2021 %T Association of structural variation with cardiometabolic traits in Finns. %A Chen, Lei %A Abel, Haley J %A Das, Indraniel %A Larson, David E %A Ganel, Liron %A Kanchi, Krishna L %A Regier, Allison A %A Young, Erica P %A Kang, Chul Joo %A Scott, Alexandra J %A Chiang, Colby %A Wang, Xinxin %A Lu, Shuangjia %A Christ, Ryan %A Service, Susan K %A Chiang, Charleston W K %A Havulinna, Aki S %A Kuusisto, Johanna %A Boehnke, Michael %A Laakso, Markku %A Palotie, Aarno %A Ripatti, Samuli %A Freimer, Nelson B %A Locke, Adam E %A Stitziel, Nathan O %A Hall, Ira M %K Alleles %K Cardiovascular Diseases %K Cholesterol %K DNA Copy Number Variations %K Female %K Finland %K Genome, Human %K Genomic Structural Variation %K Genotype %K High-Throughput Nucleotide Sequencing %K Humans %K Male %K Mitochondrial Proteins %K Promoter Regions, Genetic %K Pyruvate Dehydrogenase (Lipoamide)-Phosphatase %K Pyruvic Acid %K Serum Albumin, Human %X

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10) and is also associated with increased levels of total cholesterol (p = 1.22 × 10) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10) and alanine (p = 6.14 × 10) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.

%B Am J Hum Genet %V 108 %P 583-596 %8 2021 04 01 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/33798444?dopt=Abstract %R 10.1016/j.ajhg.2021.03.008 %0 Journal Article %J Nat Biotechnol %D 2021 %T Chromosome-scale, haplotype-resolved assembly of human genomes. %A Garg, Shilpa %A Fungtammasan, Arkarachai %A Carroll, Andrew %A Chou, Mike %A Schmitt, Anthony %A Zhou, Xiang %A Mac, Stephen %A Peluso, Paul %A Hatas, Emily %A Ghurye, Jay %A Maguire, Jared %A Mahmoud, Medhat %A Cheng, Haoyu %A Heller, David %A Zook, Justin M %A Moemke, Tobias %A Marschall, Tobias %A Sedlazeck, Fritz J %A Aach, John %A Chin, Chen-Shan %A Church, George M %A Li, Heng %K Algorithms %K Chromosomes, Human %K Genome, Human %K Haplotypes %K Heterozygote %K Humans %K Polymorphism, Single Nucleotide %X

Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.

%B Nat Biotechnol %V 39 %P 309-312 %8 2021 03 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/33288905?dopt=Abstract %R 10.1038/s41587-020-0711-0 %0 Journal Article %J Nat Genet %D 2021 %T Efficient phasing and imputation of low-coverage sequencing data using large reference panels. %A Rubinacci, Simone %A Ribeiro, Diogo M %A Hofmeister, Robin J %A Delaneau, Olivier %K Genome, Human %K Genotype %K Humans %K Likelihood Functions %K Polymorphism, Single Nucleotide %K Reference Standards %K Sequence Analysis, DNA %X

Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1× coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.

%B Nat Genet %V 53 %P 120-126 %8 2021 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33414550?dopt=Abstract %R 10.1038/s41588-020-00756-0 %0 Journal Article %J Mov Disord %D 2021 %T Expanded CAG Repeats in ATXN1, ATXN2, ATXN3, and HTT in the 1000 Genomes Project. %A Akçimen, Fulya %A Ross, Jay P %A Liao, Calwing %A Spiegelman, Dan %A Dion, Patrick A %A Rouleau, Guy A %K Alleles %K Ataxin-1 %K Ataxin-2 %K Ataxin-3 %K Humans %K Huntingtin Protein %K Huntington Disease %K Repressor Proteins %K Spinocerebellar Ataxias %K Trinucleotide Repeat Expansion %K Trinucleotide Repeats %X

BACKGROUND: Spinocerebellar ataxia types 1, 2, 3 and Huntington disease are neurodegenerative disorders caused by expanded CAG repeats.

METHODS: We performed an in-silico analysis of CAG repeats in ATXN1, ATXN2, ATXN3, and HTT using 30× whole-genome sequencing data of 2504 samples from the 1000 Genomes Project.

RESULTS: Seven HTT-positive, 3 ATXN2-positive, 1 ATXN3-positive, and 6 possibly ATXN1-positive samples were identified. No correlation was found between the repeat sizes of the different genes. The distribution of CAG alleles varied by ethnicity.

CONCLUSION: Our results suggest that there may be asymptomatic small expanded repeats in almost 0.5% of these populations. © 2020 International Parkinson and Movement Disorder Society.

%B Mov Disord %V 36 %P 514-518 %8 2021 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/33159825?dopt=Abstract %R 10.1002/mds.28341 %0 Journal Article %J Genome Med %D 2021 %T Genetic and non-genetic factors affecting the expression of COVID-19-relevant genes in the large airway epithelium. %A Kasela, Silva %A Ortega, Victor E %A Martorella, Molly %A Garudadri, Suresh %A Nguyen, Jenna %A Ampleford, Elizabeth %A Pasanen, Anu %A Nerella, Srilaxmi %A Buschur, Kristina L %A Barjaktarevic, Igor Z %A Barr, R Graham %A Bleecker, Eugene R %A Bowler, Russell P %A Comellas, Alejandro P %A Cooper, Christopher B %A Couper, David J %A Criner, Gerard J %A Curtis, Jeffrey L %A Han, MeiLan K %A Hansel, Nadia N %A Hoffman, Eric A %A Kaner, Robert J %A Krishnan, Jerry A %A Martinez, Fernando J %A McDonald, Merry-Lynn N %A Meyers, Deborah A %A Paine, Robert %A Peters, Stephen P %A Castro, Mario %A Denlinger, Loren C %A Erzurum, Serpil C %A Fahy, John V %A Israel, Elliot %A Jarjour, Nizar N %A Levy, Bruce D %A Li, Xingnan %A Moore, Wendy C %A Wenzel, Sally E %A Zein, Joe %A Langelier, Charles %A Woodruff, Prescott G %A Lappalainen, Tuuli %A Christenson, Stephanie A %K Adult %K Aged %K Aged, 80 and over %K Angiotensin-Converting Enzyme 2 %K Asthma %K Bronchi %K Cardiovascular Diseases %K COVID-19 %K Gene Expression %K Genetic Variation %K Humans %K Middle Aged %K Obesity %K Pulmonary Disease, Chronic Obstructive %K Quantitative Trait Loci %K Respiratory Mucosa %K Risk Factors %K SARS-CoV-2 %K Smoking %X

BACKGROUND: The large airway epithelial barrier provides one of the first lines of defense against respiratory viruses, including SARS-CoV-2 that causes COVID-19. Substantial inter-individual variability in individual disease courses is hypothesized to be partially mediated by the differential regulation of the genes that interact with the SARS-CoV-2 virus or are involved in the subsequent host response. Here, we comprehensively investigated non-genetic and genetic factors influencing COVID-19-relevant bronchial epithelial gene expression.

METHODS: We analyzed RNA-sequencing data from bronchial epithelial brushings obtained from uninfected individuals. We related ACE2 gene expression to host and environmental factors in the SPIROMICS cohort of smokers with and without chronic obstructive pulmonary disease (COPD) and replicated these associations in two asthma cohorts, SARP and MAST. To identify airway biology beyond ACE2 binding that may contribute to increased susceptibility, we used gene set enrichment analyses to determine if gene expression changes indicative of a suppressed airway immune response observed early in SARS-CoV-2 infection are also observed in association with host factors. To identify host genetic variants affecting COVID-19 susceptibility in SPIROMICS, we performed expression quantitative trait (eQTL) mapping and investigated the phenotypic associations of the eQTL variants.

RESULTS: We found that ACE2 expression was higher in relation to active smoking, obesity, and hypertension, which are known risk factors of COVID-19 severity, while an association with interferon-related inflammation was driven by the truncated, non-binding ACE2 isoform. We discovered that expression patterns of a suppressed airway immune response to early SARS-CoV-2 infection, compared to other viruses, are similar to patterns associated with obesity, hypertension, and cardiovascular disease, which may thus contribute to a COVID-19-susceptible airway environment. eQTL mapping identified regulatory variants for genes implicated in COVID-19, some of which had pheWAS evidence for their potential role in respiratory infections.

CONCLUSIONS: These data provide evidence that clinically relevant variation in the expression of COVID-19-related genes is associated with host factors, environmental exposures, and likely host genetic variation.

%B Genome Med %V 13 %P 66 %8 2021 04 21 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33883027?dopt=Abstract %R 10.1186/s13073-021-00866-2 %0 Journal Article %J Science %D 2021 %T Haplotype-resolved diverse human genomes and integrated analysis of structural variation. %A Ebert, Peter %A Audano, Peter A %A Zhu, Qihui %A Rodriguez-Martin, Bernardo %A Porubsky, David %A Bonder, Marc Jan %A Sulovari, Arvis %A Ebler, Jana %A Zhou, Weichen %A Serra Mari, Rebecca %A Yilmaz, Feyza %A Zhao, Xuefang %A Hsieh, PingHsun %A Lee, Joyce %A Kumar, Sushant %A Lin, Jiadong %A Rausch, Tobias %A Chen, Yu %A Ren, Jingwen %A Santamarina, Martin %A Höps, Wolfram %A Ashraf, Hufsah %A Chuang, Nelson T %A Yang, Xiaofei %A Munson, Katherine M %A Lewis, Alexandra P %A Fairley, Susan %A Tallon, Luke J %A Clarke, Wayne E %A Basile, Anna O %A Byrska-Bishop, Marta %A Corvelo, André %A Evani, Uday S %A Lu, Tsung-Yu %A Chaisson, Mark J P %A Chen, Junjie %A Li, Chong %A Brand, Harrison %A Wenger, Aaron M %A Ghareghani, Maryam %A Harvey, William T %A Raeder, Benjamin %A Hasenfeld, Patrick %A Regier, Allison A %A Abel, Haley J %A Hall, Ira M %A Flicek, Paul %A Stegle, Oliver %A Gerstein, Mark B %A Tubio, Jose M C %A Mu, Zepeng %A Li, Yang I %A Shi, Xinghua %A Hastie, Alex R %A Ye, Kai %A Chong, Zechen %A Sanders, Ashley D %A Zody, Michael C %A Talkowski, Michael E %A Mills, Ryan E %A Devine, Scott E %A Lee, Charles %A Korbel, Jan O %A Marschall, Tobias %A Eichler, Evan E %K Female %K Genetic Variation %K Genome, Human %K Genotype %K Haplotypes %K High-Throughput Nucleotide Sequencing %K Humans %K INDEL Mutation %K Interspersed Repetitive Sequences %K Male %K Population Groups %K Quantitative Trait Loci %K Retroelements %K Sequence Analysis, DNA %K Sequence Inversion %K Whole Genome Sequencing %X

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.

%B Science %V 372 %8 2021 04 02 %G eng %N 6537 %1 https://www.ncbi.nlm.nih.gov/pubmed/33632895?dopt=Abstract %R 10.1126/science.abf7117 %0 Journal Article %J Cell %D 2021 %T Identification of Required Host Factors for SARS-CoV-2 Infection in Human Cells. %A Daniloski, Zharko %A Jordan, Tristan X %A Wessels, Hans-Hermann %A Hoagland, Daisy A %A Kasela, Silva %A Legut, Mateusz %A Maniatis, Silas %A Mimitou, Eleni P %A Lu, Lu %A Geller, Evan %A Danziger, Oded %A Rosenberg, Brad R %A Phatnani, Hemali %A Smibert, Peter %A Lappalainen, Tuuli %A tenOever, Benjamin R %A Sanjana, Neville E %K A549 Cells %K Alveolar Epithelial Cells %K Angiotensin-Converting Enzyme 2 %K Biosynthetic Pathways %K Cholesterol %K Clustered Regularly Interspaced Short Palindromic Repeats %K COVID-19 %K Endosomes %K Gene Expression Profiling %K Gene Knockdown Techniques %K Gene Knockout Techniques %K Genome-Wide Association Study %K Host-Pathogen Interactions %K Humans %K rab GTP-Binding Proteins %K rab7 GTP-Binding Proteins %K RNA Interference %K SARS-CoV-2 %K Single-Cell Analysis %K Viral Load %X

To better understand host-virus genetic dependencies and find potential therapeutic targets for COVID-19, we performed a genome-scale CRISPR loss-of-function screen to identify host factors required for SARS-CoV-2 viral infection of human alveolar epithelial cells. Top-ranked genes cluster into distinct pathways, including the vacuolar ATPase proton pump, Retromer, and Commander complexes. We validate these gene targets using several orthogonal methods such as CRISPR knockout, RNA interference knockdown, and small-molecule inhibitors. Using single-cell RNA-sequencing, we identify shared transcriptional changes in cholesterol biosynthesis upon loss of top-ranked genes. In addition, given the key role of the ACE2 receptor in the early stages of viral entry, we show that loss of RAB7A reduces viral entry by sequestering the ACE2 receptor inside cells. Overall, this work provides a genome-scale, quantitative resource of the impact of the loss of each host gene on fitness/response to viral infection.

%B Cell %V 184 %P 92-105.e16 %8 2021 01 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33147445?dopt=Abstract %R 10.1016/j.cell.2020.10.030 %0 Journal Article %J Nat Rev Genet %D 2021 %T Towards population-scale long-read sequencing. %A De Coster, Wouter %A Weissensteiner, Matthias H %A Sedlazeck, Fritz J %K Computational Biology %K Genome, Human %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Industrial Development %K Sequence Analysis, DNA %X

Long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design. We provide an overview of current long-read sequencing platforms, variant calling methodologies and approaches for de novo assemblies and reference-based mapping approaches. Furthermore, we summarize strategies for variant validation, genotyping and predicting functional impact and emphasize challenges remaining in achieving long-read sequencing at a population scale.

%B Nat Rev Genet %V 22 %P 572-587 %8 2021 09 %G eng %N 9 %1 https://www.ncbi.nlm.nih.gov/pubmed/34050336?dopt=Abstract %R 10.1038/s41576-021-00367-3 %0 Journal Article %J Am J Hum Genet %D 2021 %T Whole-genome sequencing of African Americans implicates differential genetic architecture in inflammatory bowel disease. %A Somineni, Hari K %A Nagpal, Sini %A Venkateswaran, Suresh %A Cutler, David J %A Okou, David T %A Haritunians, Talin %A Simpson, Claire L %A Begum, Ferdouse %A Datta, Lisa W %A Quiros, Antonio J %A Seminerio, Jenifer %A Mengesha, Emebet %A Alexander, Jonathan S %A Baldassano, Robert N %A Dudley-Brown, Sharon %A Cross, Raymond K %A Dassopoulos, Themistocles %A Denson, Lee A %A Dhere, Tanvi A %A Iskandar, Heba %A Dryden, Gerald W %A Hou, Jason K %A Hussain, Sunny Z %A Hyams, Jeffrey S %A Isaacs, Kim L %A Kader, Howard %A Kappelman, Michael D %A Katz, Jeffry %A Kellermayer, Richard %A Kuemmerle, John F %A Lazarev, Mark %A Li, Ellen %A Mannon, Peter %A Moulton, Dedrick E %A Newberry, Rodney D %A Patel, Ashish S %A Pekow, Joel %A Saeed, Shehzad A %A Valentine, John F %A Wang, Ming-Hsi %A McCauley, Jacob L %A Abreu, Maria T %A Jester, Traci %A Molle-Rios, Zarela %A Palle, Sirish %A Scherl, Ellen J %A Kwon, John %A Rioux, John D %A Duerr, Richard H %A Silverberg, Mark S %A Zwick, Michael E %A Stevens, Christine %A Daly, Mark J %A Cho, Judy H %A Gibson, Greg %A McGovern, Dermot P B %A Brant, Steven R %A Kugathasan, Subra %K African Americans %K Aged %K Aged, 80 and over %K Calbindin 2 %K Colitis, Ulcerative %K Crohn Disease %K European Continental Ancestry Group %K Female %K Gene Frequency %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Inflammatory Bowel Diseases %K Male %K Multifactorial Inheritance %K Polymorphism, Single Nucleotide %K Receptors, Prostaglandin E, EP4 Subtype %K Whole Genome Sequencing %X

Whether or not populations diverge with respect to the genetic contribution to risk of specific complex diseases is relevant to understanding the evolution of susceptibility and origins of health disparities. Here, we describe a large-scale whole-genome sequencing study of inflammatory bowel disease encompassing 1,774 affected individuals and 1,644 healthy control Americans with African ancestry (African Americans). Although no new loci for inflammatory bowel disease are discovered at genome-wide significance levels, we identify numerous instances of differential effect sizes in combination with divergent allele frequencies. For example, the major effect at PTGER4 fine maps to a single credible interval of 22 SNPs corresponding to one of four independent associations at the locus in European ancestry individuals but with an elevated odds ratio for Crohn disease in African Americans. A rare variant aggregate analysis implicates Ca-binding neuro-immunomodulator CALB2 in ulcerative colitis. Highly significant overall overlap of common variant risk for inflammatory bowel disease susceptibility between individuals with African and European ancestries was observed, with 41 of 241 previously known lead variants replicated and overall correlations in effect sizes of 0.68 for combined inflammatory bowel disease. Nevertheless, subtle differences influence the performance of polygenic risk scores, and we show that ancestry-appropriate weights significantly improve polygenic prediction in the highest percentiles of risk. The median amount of variance explained per locus remains the same in African and European cohorts, providing evidence for compensation of effect sizes as allele frequencies diverge, as expected under a highly polygenic model of disease.

%B Am J Hum Genet %V 108 %P 431-445 %8 2021 03 04 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/33600772?dopt=Abstract %R 10.1016/j.ajhg.2021.02.001 %0 Journal Article %J Nat Commun %D 2020 %T Analysis of cardiac magnetic resonance imaging in 36,000 individuals yields genetic insights into dilated cardiomyopathy. %A Pirruccello, James P %A Bick, Alexander %A Wang, Minxian %A Chaffin, Mark %A Friedman, Samuel %A Yao, Jie %A Guo, Xiuqing %A Venkatesh, Bharath Ambale %A Taylor, Kent D %A Post, Wendy S %A Rich, Stephen %A Lima, Joao A C %A Rotter, Jerome I %A Philippakis, Anthony %A Lubitz, Steven A %A Ellinor, Patrick T %A Khera, Amit V %A Kathiresan, Sekar %A Aragam, Krishna G %K Cardiomyopathy, Dilated %K Genome-Wide Association Study %K Heart %K Humans %K Magnetic Resonance Imaging %K Myocardium %K Polymorphism, Single Nucleotide %X

Dilated cardiomyopathy (DCM) is an important cause of heart failure and the leading indication for heart transplantation. Many rare genetic variants have been associated with DCM, but common variant studies of the disease have yielded few associated loci. As structural changes in the heart are a defining feature of DCM, we report a genome-wide association study of cardiac magnetic resonance imaging (MRI)-derived left ventricular measurements in 36,041 UK Biobank participants, with replication in 2184 participants from the Multi-Ethnic Study of Atherosclerosis. We identify 45 previously unreported loci associated with cardiac structure and function, many near well-established genes for Mendelian cardiomyopathies. A polygenic score of MRI-derived left ventricular end systolic volume strongly associates with incident DCM in the general population. Even among carriers of TTN truncating mutations, this polygenic score influences the size and function of the human heart. These results further implicate common genetic polymorphisms in the pathogenesis of DCM.

%B Nat Commun %V 11 %P 2254 %8 2020 05 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32382064?dopt=Abstract %R 10.1038/s41467-020-15823-7 %0 Journal Article %J JAMA Netw Open %D 2020 %T Association of Rare Pathogenic DNA Variants for Familial Hypercholesterolemia, Hereditary Breast and Ovarian Cancer Syndrome, and Lynch Syndrome With Disease Risk in Adults According to Family History. %A Patel, Aniruddh P %A Wang, Minxian %A Fahed, Akl C %A Mason-Suares, Heather %A Brockman, Deanna %A Pelletier, Renee %A Amr, Sami %A Machini, Kalotina %A Hawley, Megan %A Witkowski, Leora %A Koch, Christopher %A Philippakis, Anthony %A Cassa, Christopher A %A Ellinor, Patrick T %A Kathiresan, Sekar %A Ng, Kenney %A Lebo, Matthew %A Khera, Amit V %K Aged %K Cohort Studies %K Colorectal Neoplasms, Hereditary Nonpolyposis %K Female %K Genetic Predisposition to Disease %K Hereditary Breast and Ovarian Cancer Syndrome %K Heterozygote %K Humans %K Hyperlipoproteinemia Type II %K Male %K Middle Aged %K Pedigree %K Proportional Hazards Models %K United Kingdom %K Whole Exome Sequencing %X

Importance: Pathogenic DNA variants associated with familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome are widely recognized as clinically important and actionable when identified, leading some clinicians to recommend population-wide genomic screening.

Objectives: To assess the prevalence and clinical importance of pathogenic or likely pathogenic variants associated with each of 3 genomic conditions (familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome) within the context of contemporary clinical care.

Design, Setting, and Participants: This cohort study used gene-sequencing data from 49 738 participants in the UK Biobank who were recruited from 22 sites across the UK between March 21, 2006, and October 1, 2010. Inpatient hospital data date back to 1977; cancer registry data, to 1957; and death registry data, to 2006. Statistical analysis was performed from July 22, 2019, to November 15, 2019.

Exposures: Pathogenic or likely pathogenic DNA variants classified by a clinical laboratory geneticist.

Main Outcomes and Measures: Composite end point specific to each genomic condition based on atherosclerotic cardiovascular disease events for familial hypercholesterolemia, breast or ovarian cancer for hereditary breast and ovarian cancer syndrome, and colorectal or uterine cancer for Lynch syndrome.

Results: Among 49 738 participants (mean [SD] age, 57 [8] years; 27 144 female [55%]), 441 (0.9%) harbored a pathogenic or likely pathogenic variant associated with any of 3 genomic conditions, including 131 (0.3%) for familial hypercholesterolemia, 235 (0.5%) for hereditary breast and ovarian cancer syndrome, and 76 (0.2%) for Lynch syndrome. Presence of these variants was associated with increased risk of disease: for familial hypercholesterolemia, 28 of 131 carriers (21.4%) vs 4663 of 49 607 noncarriers (9.4%) developed atherosclerotic cardiovascular disease; for hereditary breast and ovarian cancer syndrome, 32 of 116 female carriers (27.6%) vs 2080 of 27 028 female noncarriers (7.7%) developed associated cancers; and for Lynch syndrome, 17 of 76 carriers (22.4%) vs 929 of 49 662 noncarriers (1.9%) developed colorectal or uterine cancer. The predicted probability of disease at age 75 years despite contemporary clinical care was 45.3% for carriers of familial hypercholesterolemia, 41.1% for hereditary breast and ovarian cancer syndrome, and 38.3% for Lynch syndrome. Across the 3 conditions, 39.7% (175 of 441) of the carriers reported a family history of disease vs 23.2% (34 517 of 148 772) of noncarriers.

Conclusions and Relevance: The findings suggest that approximately 1% of the middle-aged adult population in the UK Biobank harbored a pathogenic variant associated with any of 3 genomic conditions. These variants were associated with an increased risk of disease despite contemporary clinical care and were not reliably detected by family history.

%B JAMA Netw Open %V 3 %P e203959 %8 2020 04 01 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/32347951?dopt=Abstract %R 10.1001/jamanetworkopen.2020.3959 %0 Journal Article %J Science %D 2020 %T Cell type-specific genetic regulation of gene expression across human tissues. %A Kim-Hellmuth, Sarah %A Aguet, François %A Oliva, Meritxell %A Muñoz-Aguirre, Manuel %A Kasela, Silva %A Wucher, Valentin %A Castel, Stephane E %A Hamel, Andrew R %A Viñuela, Ana %A Roberts, Amy L %A Mangul, Serghei %A Wen, Xiaoquan %A Wang, Gao %A Barbeira, Alvaro N %A Garrido-Martín, Diego %A Nadel, Brian B %A Zou, Yuxin %A Bonazzola, Rodrigo %A Quan, Jie %A Brown, Andrew %A Martinez-Perez, Angel %A Soria, José Manuel %A Getz, Gad %A Dermitzakis, Emmanouil T %A Small, Kerrin S %A Stephens, Matthew %A Xi, Hualin S %A Im, Hae Kyung %A Guigo, Roderic %A Segrè, Ayellet V %A Stranger, Barbara E %A Ardlie, Kristin G %A Lappalainen, Tuuli %K Cells %K Gene Expression Regulation %K Humans %K Organ Specificity %K Quantitative Trait Loci %K RNA, Long Noncoding %K Transcriptome %X

The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enable the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913075?dopt=Abstract %R 10.1126/science.aaz8528 %0 Journal Article %J Stroke %D 2020 %T Combining Imaging and Genetics to Predict Recurrence of Anticoagulation-Associated Intracerebral Hemorrhage. %A Biffi, Alessandro %A Urday, Sebastian %A Kubiszewski, Patryk %A Gilkerson, Lee %A Sekar, Padmini %A Rodriguez-Torres, Axana %A Bettin, Margaret %A Charidimou, Andreas %A Pasi, Marco %A Kourkoulis, Christina %A Schwab, Kristin %A DiPucchio, Zora %A Behymer, Tyler %A Osborne, Jennifer %A Morgan, Misty %A Moomaw, Charles J %A James, Michael L %A Greenberg, Steven M %A Viswanathan, Anand %A Gurol, M Edip %A Worrall, Bradford B %A Testai, Fernando D %A McCauley, Jacob L %A Falcone, Guido J %A Langefeld, Carl D %A Anderson, Christopher D %A Kamel, Hooman %A Woo, Daniel %A Sheth, Kevin N %A Rosand, Jonathan %K Aged %K Anticoagulants %K Apolipoprotein E4 %K Cerebral Hemorrhage %K Female %K Humans %K Magnetic Resonance Imaging %K Male %K Middle Aged %K Neuroimaging %K Recurrence %X

BACKGROUND AND PURPOSE: For survivors of oral anticoagulation therapy (OAT)-associated intracerebral hemorrhage (OAT-ICH) who are at high risk for thromboembolism, the benefits of OAT resumption must be weighed against increased risk of recurrent hemorrhagic stroke. The ε2/ε4 alleles of the apolipoprotein E (APOE) gene, MRI-defined cortical superficial siderosis, and cerebral microbleeds are the most potent risk factors for recurrent ICH. We sought to determine whether combining MRI markers and genotype could have clinical impact by identifying ICH survivors in whom the risks of OAT resumption are highest.

METHODS: Joint analysis of data from 2 longitudinal cohort studies of OAT-ICH survivors: (1) MGH-ICH study (Massachusetts General Hospital ICH) and (2) longitudinal component of the ERICH study (Ethnic/Racial Variations of Intracerebral Hemorrhage). We evaluated whether MRI markers and genotype predict ICH recurrence. We then developed and validated a combined APOE-MRI classification scheme to predict ICH recurrence, using Classification and Regression Tree analysis.

RESULTS: Cortical superficial siderosis, cerebral microbleed, and ε2/ε4 variants were independently associated with ICH recurrence after OAT-ICH (all P<0.05). Combining genotype and MRI data resulted in improved prediction of ICH recurrence (Harrell C: 0.79 versus 0.55 for clinical data alone, P=0.033). In the MGH (training) data set, cortical superficial siderosis, cerebral microbleed, and ε2/ε4 stratified likelihood of ICH recurrence into high-, medium-, and low-risk categories. In the ERICH (validation) data set, yearly ICH recurrence rates for high-, medium-, and low-risk individuals were 6.6%, 2.5%, and 0.9%, respectively, with overall area under the curve of 0.91 for prediction of recurrent ICH.

CONCLUSIONS: Combining MRI and APOE genotype stratifies likelihood of ICH recurrence into high, medium, and low risk. If confirmed in prospective studies, this combined APOE-MRI classification scheme may prove useful for selecting individuals for OAT resumption after ICH.

%B Stroke %V 51 %P 2153-2160 %8 2020 07 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/32517581?dopt=Abstract %R 10.1161/STROKEAHA.120.028310 %0 Journal Article %J Mol Genet Genomic Med %D 2020 %T Community-based recruitment and exome sequencing indicates high diagnostic yield in adults with intellectual disability. %A Sabo, Aniko %A Murdock, David %A Dugan, Shannon %A Meng, Qingchang %A Gingras, Marie-Claude %A Hu, Jianhong %A Muzny, Donna %A Gibbs, Richard %K Adult %K Female %K Genetic Testing %K Humans %K Independent Living %K Intellectual Disability %K Male %K Mediator Complex %K Membrane Proteins %K Nuclear Proteins %K Patient Selection %K Sensitivity and Specificity %K Tumor Suppressor Proteins %K Whole Exome Sequencing %X

BACKGROUND: Establishing a genetic diagnosis for individuals with intellectual disability (ID) benefits patients and their families as it may inform the prognosis, lead to appropriate therapy, and facilitate access to medical and supportive services. Exome sequencing has been successfully applied in a diagnostic setting, but most clinical exome referrals are pediatric patients, with many adults with ID lacking a comprehensive genetic evaluation.

METHODS: Our unique recruitment strategy involved partnering with service and education providers for individuals with ID. We performed exome sequencing and analysis, and clinical variant interpretation for each recruited family.

RESULTS: All five families enrolled in the study opted in for the return of genetic results. In three out of five families, exome sequencing analysis identified pathogenic or likely pathogenic variants in the KANSL1, TUSC3, and MED13L genes. Families discussed the results and any potential medical follow-up in an appointment with a board-certified clinical geneticist.

CONCLUSION: Our study suggests high yield of exome sequencing as a diagnostic tool in adult patients with ID who have not undergone comprehensive sequencing-based genetic testing. Research studies including an option of return of results through a genetic clinic could help minimize the disparity in exome diagnostic testing between pediatric and adult patients with ID.

%B Mol Genet Genomic Med %V 8 %P e1439 %8 2020 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/32767738?dopt=Abstract %R 10.1002/mgg3.1439 %0 Journal Article %J PLoS Genet %D 2020 %T Copy number variants and fixed duplications among 198 rhesus macaques (Macaca mulatta). %A Brasó-Vives, Marina %A Povolotskaya, Inna S %A Hartasánchez, Diego A %A Farré, Xavier %A Fernandez-Callejo, Marcos %A Raveendran, Muthuswamy %A Harris, R Alan %A Rosene, Douglas L %A Lorente-Galdos, Belen %A Navarro, Arcadi %A Marques-Bonet, Tomas %A Rogers, Jeffrey %A Juan, David %K Animals %K Chromosome Mapping %K DNA Copy Number Variations %K Female %K Gene Duplication %K Genetics, Population %K Genome %K High-Throughput Nucleotide Sequencing %K Humans %K Macaca mulatta %K Male %K Open Reading Frames %K Phylogeny %K Sequence Analysis, DNA %K Species Specificity %X

The rhesus macaque is an abundant species of Old World monkeys and a valuable model organism for biomedical research due to its close phylogenetic relationship to humans. Copy number variation is one of the main sources of genomic diversity within and between species and a widely recognized cause of inter-individual differences in disease risk. However, copy number differences among rhesus macaques and between the human and macaque genomes, as well as the relevance of this diversity to research involving this nonhuman primate, remain understudied. Here we present a high-resolution map of sequence copy number for the rhesus macaque genome constructed from a dataset of 198 individuals. Our results show that about one-eighth of the rhesus macaque reference genome is composed of recently duplicated regions, either copy number variable regions or fixed duplications. Comparison with human genomic copy number maps based on previously published data shows that, despite overall similarities in the genome-wide distribution of these regions, there are specific differences at the chromosome level. Some of these create differences in the copy number profile between human disease genes and their rhesus macaque orthologs. Our results highlight the importance of addressing the number of copies of target genes in the design of experiments and caution against human-centered assumptions in research conducted with model organisms. Overall, we present a genome-wide copy number map from a large sample of rhesus macaque individuals, representing an important novel contribution concerning the evolution of copy number in primate genomes.

%B PLoS Genet %V 16 %P e1008742 %8 2020 05 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32392208?dopt=Abstract %R 10.1371/journal.pgen.1008742 %0 Journal Article %J Science %D 2020 %T Determinants of telomere length across human tissues. %A Demanelis, Kathryn %A Jasmine, Farzana %A Chen, Lin S %A Chernoff, Meytal %A Tong, Lin %A Delgado, Dayana %A Zhang, Chenan %A Shinkle, Justin %A Sabarinathan, Mekala %A Lin, Hannah %A Ramirez, Eduardo %A Oliva, Meritxell %A Kim-Hellmuth, Sarah %A Stranger, Barbara E %A Lai, Tsung-Po %A Aviv, Abraham %A Ardlie, Kristin G %A Aguet, François %A Ahsan, Habibul %A Doherty, Jennifer A %A Kibriya, Muhammad G %A Pierce, Brandon L %K Aging %K Genetic Markers %K Genetic Variation %K Humans %K Organ Specificity %K Telomere %K Telomere Homeostasis %K Telomere Shortening %X

Telomere shortening is a hallmark of aging. Telomere length (TL) in blood cells has been studied extensively as a biomarker of human aging and disease; however, little is known regarding variability in TL in nonblood, disease-relevant tissue types. Here, we characterize variability in TLs from 6391 tissue samples, representing >20 tissue types and 952 individuals from the Genotype-Tissue Expression (GTEx) project. We describe differences across tissue types, positive correlation among tissue types, and associations with age and ancestry. We show that genetic variation affects TL in multiple tissue types and that TL may mediate the effect of age on gene expression. Our results provide the foundational knowledge regarding TL in healthy tissues that is needed to interpret epidemiological studies of TL and human health.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913074?dopt=Abstract %R 10.1126/science.aaz6876 %0 Journal Article %J Brain %D 2020 %T Epilepsy subtype-specific copy number burden observed in a genome-wide study of 17 458 subjects. %A Niestroj, Lisa-Marie %A Perez-Palma, Eduardo %A Howrigan, Daniel P %A Zhou, Yadi %A Cheng, Feixiong %A Saarentaus, Elmo %A Nürnberg, Peter %A Stevelink, Remi %A Daly, Mark J %A Palotie, Aarno %A Lal, Dennis %K DNA Copy Number Variations %K Epilepsy %K Female %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Male %X

Cytogenetic testing is routinely applied in most neurological centres for severe paediatric epilepsies. However, which characteristics of copy number variants (CNVs) confer most epilepsy risk and which epilepsy subtypes carry the most CNV burden have not been explored on a genome-wide scale. Here, we present the largest CNV investigation in epilepsy to date with 10 712 European epilepsy cases and 6746 ancestry-matched controls. Patients with genetic generalized epilepsy, lesional focal epilepsy, non-acquired focal epilepsy, and developmental and epileptic encephalopathy were included. All samples were processed with the same technology and analysis pipeline. All investigated epilepsy types, including lesional focal epilepsy patients, showed an increase in CNV burden in at least one tested category compared to controls. However, we observed striking differences in CNV burden across epilepsy types and investigated CNV categories. Genetic generalized epilepsy patients have the highest CNV burden in all categories tested, followed by developmental and epileptic encephalopathy patients. Both epilepsy types also show association for deletions covering genes intolerant for truncating variants. Genome-wide CNV breakpoint association showed not only significant loci for genetic generalized and developmental and epileptic encephalopathy patients but also for lesional focal epilepsy patients. With a 34-fold risk for developing genetic generalized epilepsy, we show for the first time that the established epilepsy-associated 15q13.3 deletion represents the strongest risk CNV for genetic generalized epilepsy across the whole genome. Using the human interactome, we examined the largest connected component of the genes overlapped by CNVs in the four epilepsy types. We observed that genetic generalized epilepsy and non-acquired focal epilepsy formed disease modules. In summary, we show that in all common epilepsy types, 1.5-3% of patients carry epilepsy-associated CNVs. The characteristics of risk CNVs vary tremendously across and within epilepsy types. Thus, we advocate genome-wide genomic testing to identify all disease-associated types of CNVs.

%B Brain %V 143 %P 2106-2118 %8 2020 07 01 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/32568404?dopt=Abstract %R 10.1093/brain/awaa171 %0 Journal Article %J Genes (Basel) %D 2020 %T Evaluation of the VISAGE Basic Tool for Appearance and Ancestry Prediction Using PowerSeq Chemistry on the MiSeq FGx System. %A Palencia-Madrid, Leire %A Xavier, Catarina %A de la Puente, María %A Hohoff, Carsten %A Phillips, Christopher %A Kayser, Manfred %A Parson, Walther %K DNA Fingerprinting %K Eye Color %K Forensic Genetics %K Genetic Markers %K Genotype %K Hair Color %K High-Throughput Nucleotide Sequencing %K Humans %K Polymorphism, Single Nucleotide %K Sequence Analysis, DNA %K Skin Pigmentation %K Software %X

The study of DNA to predict externally visible characteristics (EVCs) and the biogeographical ancestry (BGA) from unknown samples is gaining relevance in forensic genetics. Technical developments in Massively Parallel Sequencing (MPS) enable the simultaneous analysis of hundreds of DNA markers, which improves successful Forensic DNA Phenotyping (FDP). The EU-funded VISAGE (VISible Attributes through GEnomics) Consortium has developed various targeted MPS-based lab tools to apply FDP in routine forensic analyses. Here, we present an evaluation of the VISAGE Basic tool for appearance and ancestry prediction based on PowerSeq chemistry (Promega) on a MiSeq FGx System (Illumina). The panel consists of 153 single nucleotide polymorphisms (SNPs) that provide information about EVCs (41 SNPs for eye, hair and skin color from HIrisPlex-S) and continental BGA (115 SNPs; three overlap with the EVCs SNP set). The assay was evaluated for sensitivity, repeatability and genotyping concordance, as well as its performance with casework-type samples. This targeted MPS assay provided complete genotypes at all 153 SNPs down to 125 pg of input DNA and 99.67% correct genotypes at 50 pg. It was robust in terms of repeatability and concordance and provided useful results with casework-type samples. The results suggest that this MPS assay is a useful tool for basic appearance and ancestry prediction in forensic genetics for users interested in applying PowerSeq chemistry and MiSeq for this purpose.

%B Genes (Basel) %V 11 %8 2020 06 26 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/32604780?dopt=Abstract %R 10.3390/genes11060708 %0 Journal Article %J Epilepsia %D 2020 %T Genetic diagnoses in epilepsy: The impact of dynamic exome analysis in a pediatric cohort. %A Rochtus, Anne %A Olson, Heather E %A Smith, Lacey %A Keith, Louisa G %A El Achkar, Christelle %A Taylor, Alan %A Mahida, Sonal %A Park, Meredith %A Kelly, McKenna %A Shain, Catherine %A Rockowitz, Shira %A Rosen Sheidley, Beth %A Poduri, Annapurna %K Adolescent %K Adult %K Age of Onset %K Brain Diseases %K Child %K Child, Preschool %K Chromosomes, Human %K Cohort Studies %K Epilepsy %K Epilepsy, Generalized %K Exome %K Female %K Genetic Testing %K Genetic Variation %K Humans %K Infant %K Male %K Microarray Analysis %K Phenotype %K Whole Exome Sequencing %K Young Adult %X

OBJECTIVE: We evaluated the yield of systematic analysis and/or reanalysis of whole exome sequencing (WES) data from a cohort of well-phenotyped pediatric patients with epilepsy and suspected but previously undetermined genetic etiology.

METHODS: We identified and phenotyped 125 participants with pediatric epilepsy. Etiology was unexplained at the time of enrollment despite clinical testing, which included chromosomal microarray (n = 57), epilepsy gene panel (n = 48), both (n = 28), or WES (n = 8). Clinical epilepsy diagnoses included developmental and epileptic encephalopathy (DEE), febrile infection-related epilepsy syndrome, Rasmussen encephalitis, and other focal and generalized epilepsies. We analyzed WES data and compared the yield in participants with and without prior clinical genetic testing.

RESULTS: Overall, we identified pathogenic or likely pathogenic variants in 40% (50/125) of our study participants. Nine patients with DEE had genetic variants in recently published genes that had not been recognized as epilepsy-related at the time of clinical testing (FGF12, GABBR1, GABBR2, ITPA, KAT6A, PTPN23, RHOBTB2, SATB2), and eight patients had genetic variants in candidate epilepsy genes (CAMTA1, FAT3, GABRA6, HUWE1, PTCHD1). Ninety participants had concomitant or subsequent clinical genetic testing, which was ultimately explanatory for 26% (23/90). Of the 67 participants whose molecular diagnoses were "unsolved" through clinical genetic testing, we identified pathogenic or likely pathogenic variants in 17 (25%).

SIGNIFICANCE: Our data argue for early consideration of WES with iterative reanalysis for patients with epilepsy, particularly those with DEE or epilepsy with intellectual disability. Rigorous analysis of WES data of well-phenotyped patients with epilepsy leads to a broader understanding of gene-specific phenotypic spectra as well as candidate disease gene identification. We illustrate the dynamic nature of genetic diagnosis over time, with analysis and in some cases reanalysis of exome data leading to the identification of disease-associated variants among participants with previously nondiagnostic results from a variety of clinical testing strategies.

%B Epilepsia %V 61 %P 249-258 %8 2020 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/31957018?dopt=Abstract %R 10.1111/epi.16427 %0 Journal Article %J Science %D 2020 %T Genetics of schizophrenia in the South African Xhosa. %A Gulsuner, S %A Stein, D J %A Susser, E S %A Sibeko, G %A Pretorius, A %A Walsh, T %A Majara, L %A Mndini, M M %A Mqulwana, S G %A Ntola, O A %A Casadei, S %A Ngqengelele, L L %A Korchina, V %A van der Merwe, C %A Malan, M %A Fader, K M %A Feng, M %A Willoughby, E %A Muzny, D %A Baldinger, A %A Andrews, H F %A Gur, R C %A Gibbs, R A %A Zingela, Z %A Nagdee, M %A Ramesar, R S %A King, M-C %A McClellan, J M %K Age Factors %K Autistic Disorder %K Bipolar Disorder %K Dopamine %K Female %K gamma-Aminobutyric Acid %K Genetic Variation %K Glutamine %K Humans %K Male %K Mutation %K Neural Pathways %K Schizophrenia %K Sex Factors %K South Africa %K Synapses %K Synaptic Transmission %X

Africa, the ancestral home of all modern humans, is the most informative continent for understanding the human genome and its contribution to complex disease. To better understand the genetics of schizophrenia, we studied the illness in the Xhosa population of South Africa, recruiting 909 cases and 917 age-, gender-, and residence-matched controls. Individuals with schizophrenia were significantly more likely than controls to harbor private, severely damaging mutations in genes that are critical to synaptic function, including neural circuitry mediated by the neurotransmitters glutamine, γ-aminobutyric acid, and dopamine. Schizophrenia is genetically highly heterogeneous, involving severe ultrarare mutations in genes that are critical to synaptic plasticity. The depth of genetic variation in Africa revealed this relationship with a moderate sample size and informed our understanding of the genetics of schizophrenia worldwide.

%B Science %V 367 %P 569-573 %8 2020 01 31 %G eng %N 6477 %1 https://www.ncbi.nlm.nih.gov/pubmed/32001654?dopt=Abstract %R 10.1126/science.aay8833 %0 Journal Article %J BMC Med Genomics %D 2020 %T Genome-wide association meta-analysis for early age-related macular degeneration highlights novel loci and insights for advanced disease. %A Winkler, Thomas W %A Grassmann, Felix %A Brandl, Caroline %A Kiel, Christina %A Günther, Felix %A Strunz, Tobias %A Weidner, Lorraine %A Zimmermann, Martina E %A Korb, Christina A %A Poplawski, Alicia %A Schuster, Alexander K %A Müller-Nurasyid, Martina %A Peters, Annette %A Rauscher, Franziska G %A Elze, Tobias %A Horn, Katrin %A Scholz, Markus %A Cañadas-Garre, Marisa %A McKnight, Amy Jayne %A Quinn, Nicola %A Hogg, Ruth E %A Küchenhoff, Helmut %A Heid, Iris M %A Stark, Klaus J %A Weber, Bernhard H F %K Case-Control Studies %K Genetic Loci %K Genetic Markers %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Macular Degeneration %K Polymorphism, Single Nucleotide %X

BACKGROUND: Advanced age-related macular degeneration (AMD) is a leading cause of blindness. While around half of the genetic contribution to advanced AMD has been uncovered, little is known about the genetic architecture of early AMD.

METHODS: To identify genetic factors for early AMD, we conducted a genome-wide association study (GWAS) meta-analysis (14,034 cases, 91,214 controls, 11 sources of data including the International AMD Genomics Consortium, IAMDGC, and UK Biobank, UKBB). We ascertained early AMD via color fundus photographs by manual grading for 10 sources and via an automated machine learning approach for > 170,000 photographs from UKBB. We searched for early AMD loci via GWAS and via a candidate approach based on 14 previously suggested early AMD variants.

RESULTS: Altogether, we identified 10 independent loci with statistical significance for early AMD: (i) 8 from our GWAS with genome-wide significance (P < 5 × 10⁻⁸), (ii) one previously suggested locus with experiment-wise significance (P < 0.05/14) in our non-overlapping data and with genome-wide significance when combining the reported and our non-overlapping data (together 17,539 cases, 105,395 controls), and (iii) one further previously suggested locus with experiment-wise significance in our non-overlapping data. Of these 10 identified loci, 8 were novel and 2 known for early AMD. Most of the 10 loci overlapped with known advanced AMD loci (near ARMS2/HTRA1, CFH, C2, C3, CETP, TNFRSF10A, VEGFA, APOE), except two that have not yet been identified with statistical significance for any AMD. Among the 17 genes within these two loci, in-silico functional annotation suggested CD46 and TYR as the most likely responsible genes. Presence or absence of an early AMD effect distinguished the known pathways of advanced AMD genetics (complement/lipid pathways versus extracellular matrix metabolism).

CONCLUSIONS: Our GWAS on early AMD identified novel loci, highlighted shared and distinct genetics between early and advanced AMD, and provided insights into AMD etiology. Our data provide a resource comparable in size to the existing IAMDGC data on advanced AMD genetics, enabling a joint view. The biological relevance of this joint view is underscored by the ability of early AMD effects to differentiate the major pathways for advanced AMD.

%B BMC Med Genomics %V 13 %P 120 %8 2020 08 26 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32843070?dopt=Abstract %R 10.1186/s12920-020-00760-7 %0 Journal Article %J Arterioscler Thromb Vasc Biol %D 2020 %T Genome-Wide Polygenic Score, Clinical Risk Factors, and Long-Term Trajectories of Coronary Artery Disease. %A Hindy, George %A Aragam, Krishna G %A Ng, Kenney %A Chaffin, Mark %A Lotta, Luca A %A Baras, Aris %A Drake, Isabel %A Orho-Melander, Marju %A Melander, Olle %A Kathiresan, Sekar %A Khera, Amit V %K Adult %K Aged %K Coronary Artery Disease %K Female %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Heart Disease Risk Factors %K Heredity %K Humans %K Incidence %K Male %K Middle Aged %K Multifactorial Inheritance %K Phenotype %K Prognosis %K Risk Assessment %K Sweden %K Time Factors %K United Kingdom %X

OBJECTIVE: To determine the relationship of a genome-wide polygenic score (GPS) for coronary artery disease (CAD) with lifetime trajectories of CAD risk, directly compare its predictive capacity to traditional risk factors, and assess its interplay with the Pooled Cohort Equations (PCE) clinical risk estimator.

APPROACH AND RESULTS: We studied GPS in 28 556 middle-aged participants of the Malmö Diet and Cancer Study, of whom 4122 (14.4%) developed CAD over a median follow-up of 21.3 years. A pronounced gradient in lifetime risk of CAD was observed, from 16% for those in the lowest GPS decile to 48% in the highest. We evaluated the discriminative capacity of the GPS (as assessed by change in the C-statistic from a baseline model including age and sex) among 5685 individuals with PCE risk estimates available. The increment for the GPS (+0.045, P<0.001) was higher than for any of 11 traditional risk factors (range +0.007 to +0.032). Minimal correlation was observed between GPS and 10-year risk defined by the PCE (r=0.03), and addition of GPS improved the C-statistic of the PCE model by 0.026. A significant gradient in lifetime risk was observed for the GPS, even among individuals within a given PCE clinical risk stratum. We replicated key findings, noting strikingly consistent results, in 325 003 participants of the UK Biobank.

CONCLUSIONS: GPS, a risk estimator available from birth, stratifies individuals into varying trajectories of clinical risk for CAD. Implementation of GPS may enable identification of high-risk individuals early in life, decades in advance of manifest risk factors or disease.

%B Arterioscler Thromb Vasc Biol %V 40 %P 2738-2746 %8 2020 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/32957805?dopt=Abstract %R 10.1161/ATVBAHA.120.314856 %0 Journal Article %J Science %D 2020 %T The GTEx Consortium atlas of genetic regulatory effects across human tissues. %K Datasets as Topic %K Disease %K Female %K Gene Expression Regulation %K Genome-Wide Association Study %K Humans %K Male %K Organ Specificity %K Quantitative Trait Loci %K Sequence Analysis, RNA %X

The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue specificity of genetic effects and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.

%B Science %V 369 %P 1318-1330 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913098?dopt=Abstract %R 10.1126/science.aaz1776 %0 Journal Article %J Science %D 2020 %T The impact of sex on gene expression across human tissues. %A Oliva, Meritxell %A Muñoz-Aguirre, Manuel %A Kim-Hellmuth, Sarah %A Wucher, Valentin %A Gewirtz, Ariel D H %A Cotter, Daniel J %A Parsana, Princy %A Kasela, Silva %A Balliu, Brunilda %A Viñuela, Ana %A Castel, Stephane E %A Mohammadi, Pejman %A Aguet, François %A Zou, Yuxin %A Khramtsova, Ekaterina A %A Skol, Andrew D %A Garrido-Martín, Diego %A Reverter, Ferran %A Brown, Andrew %A Evans, Patrick %A Gamazon, Eric R %A Payne, Anthony %A Bonazzola, Rodrigo %A Barbeira, Alvaro N %A Hamel, Andrew R %A Martinez-Perez, Angel %A Soria, José Manuel %A Pierce, Brandon L %A Stephens, Matthew %A Eskin, Eleazar %A Dermitzakis, Emmanouil T %A Segrè, Ayellet V %A Im, Hae Kyung %A Engelhardt, Barbara E %A Ardlie, Kristin G %A Montgomery, Stephen B %A Battle, Alexis J %A Lappalainen, Tuuli %A Guigo, Roderic %A Stranger, Barbara E %K Chromosomes, Human, X %K Disease %K Epigenesis, Genetic %K Female %K Gene Expression %K Gene Expression Regulation %K Genetic Variation %K Genome-Wide Association Study %K Humans %K Male %K Organ Specificity %K Promoter Regions, Genetic %K Quantitative Trait Loci %K Sex Characteristics %K Sex Factors %X

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913072?dopt=Abstract %R 10.1126/science.aba3066 %0 Journal Article %J Nature %D 2020 %T Inherited causes of clonal haematopoiesis in 97,691 whole genomes. %A Bick, Alexander G %A Weinstock, Joshua S %A Nandakumar, Satish K %A Fulco, Charles P %A Bao, Erik L %A Zekavat, Seyedeh M %A Szeto, Mindy D %A Liao, Xiaotian %A Leventhal, Matthew J %A Nasser, Joseph %A Chang, Kyle %A Laurie, Cecelia %A Burugula, Bala Bharathi %A Gibson, Christopher J %A Lin, Amy E %A Taub, Margaret A %A Aguet, François %A Ardlie, Kristin %A Mitchell, Braxton D %A Barnes, Kathleen C %A Moscati, Arden %A Fornage, Myriam %A Redline, Susan %A Psaty, Bruce M %A Silverman, Edwin K %A Weiss, Scott T %A Palmer, Nicholette D %A Vasan, Ramachandran S %A Burchard, Esteban G %A Kardia, Sharon L R %A He, Jiang %A Kaplan, Robert C %A Smith, Nicholas L %A Arnett, Donna K %A Schwartz, David A %A Correa, Adolfo %A de Andrade, Mariza %A Guo, Xiuqing %A Konkle, Barbara A %A Custer, Brian %A Peralta, Juan M %A Gui, Hongsheng %A Meyers, Deborah A %A McGarvey, Stephen T %A Chen, Ida Yii-Der %A Shoemaker, M Benjamin %A Peyser, Patricia A %A Broome, Jai G %A Gogarten, Stephanie M %A Wang, Fei Fei %A Wong, Quenna %A Montasser, May E %A Daya, Michelle %A Kenny, Eimear E %A North, Kari E %A Launer, Lenore J %A Cade, Brian E %A Bis, Joshua C %A Cho, Michael H %A Lasky-Su, Jessica %A Bowden, Donald W %A Cupples, L Adrienne %A Mak, Angel C Y %A Becker, Lewis C %A Smith, Jennifer A %A Kelly, Tanika N %A Aslibekyan, Stella %A Heckbert, Susan R %A Tiwari, Hemant K %A Yang, Ivana V %A Heit, John A %A Lubitz, Steven A %A Johnsen, Jill M %A Curran, Joanne E %A Wenzel, Sally E %A Weeks, Daniel E %A Rao, Dabeeru C %A Darbar, Dawood %A Moon, Jee-Young %A Tracy, Russell P %A Buth, Erin J %A Rafaels, Nicholas %A Loos, Ruth J F %A Durda, Peter %A Liu, Yongmei %A Hou, Lifang %A Lee, Jiwon %A Kachroo, Priyadarshini %A Freedman, Barry I %A Levy, Daniel %A Bielak, Lawrence F %A Hixson, James E %A Floyd, James S %A Whitsel, Eric A %A Ellinor, Patrick T %A Irvin, Marguerite R %A Fingerlin, Tasha E %A Raffield, Laura M %A Armasu, Sebastian M %A Wheeler, Marsha M %A Sabino, Ester C %A Blangero, John %A Williams, L Keoki %A Levy, Bruce D %A Sheu, Wayne Huey-Herng %A Roden, Dan M %A Boerwinkle, Eric %A Manson, JoAnn E %A Mathias, Rasika A %A Desai, Pinkal %A Taylor, Kent D %A Johnson, Andrew D %A Auer, Paul L %A Kooperberg, Charles %A Laurie, Cathy C %A Blackwell, Thomas W %A Smith, Albert V %A Zhao, Hongyu %A Lange, Ethan %A Lange, Leslie %A Rich, Stephen S %A Rotter, Jerome I %A Wilson, James G %A Scheet, Paul %A Kitzman, Jacob O %A Lander, Eric S %A Engreitz, Jesse M %A Ebert, Benjamin L %A Reiner, Alexander P %A Jaiswal, Siddhartha %A Abecasis, Gonçalo %A Sankaran, Vijay G %A Kathiresan, Sekar %A Natarajan, Pradeep %K Adult %K Africa %K African Continental Ancestry Group %K Aged %K Aged, 80 and over %K alpha Karyopherins %K Cell Self Renewal %K Clonal Hematopoiesis %K DNA-Binding Proteins %K Female %K Genetic Predisposition to Disease %K Genome, Human %K Germ-Line Mutation %K Hematopoietic Stem Cells %K Humans %K Intracellular Signaling Peptides and Proteins %K Male %K Middle Aged %K National Heart, Lung, and Blood Institute (U.S.) %K Phenotype %K Precision Medicine %K Proto-Oncogene Proteins %K Tripartite Motif Proteins %K United States %K Whole Genome Sequencing %X

Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer and coronary heart disease; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP). Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.

%B Nature %V 586 %P 763-768 %8 2020 10 %G eng %N 7831 %1 https://www.ncbi.nlm.nih.gov/pubmed/33057201?dopt=Abstract %R 10.1038/s41586-020-2819-2 %0 Journal Article %J Cell %D 2020 %T Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. %A Satterstrom, F Kyle %A Kosmicki, Jack A %A Wang, Jiebiao %A Breen, Michael S %A De Rubeis, Silvia %A An, Joon-Yong %A Peng, Minshi %A Collins, Ryan %A Grove, Jakob %A Klei, Lambertus %A Stevens, Christine %A Reichert, Jennifer %A Mulhern, Maureen S %A Artomov, Mykyta %A Gerges, Sherif %A Sheppard, Brooke %A Xu, Xinyi %A Bhaduri, Aparna %A Norman, Utku %A Brand, Harrison %A Schwartz, Grace %A Nguyen, Rachel %A Guerrero, Elizabeth E %A Dias, Caroline %A Betancur, Catalina %A Cook, Edwin H %A Gallagher, Louise %A Gill, Michael %A Sutcliffe, James S %A Thurm, Audrey %A Zwick, Michael E %A Børglum, Anders D %A State, Matthew W %A Cicek, A Ercument %A Talkowski, Michael E %A Cutler, David J %A Devlin, Bernie %A Sanders, Stephan J %A Roeder, Kathryn %A Daly, Mark J %A Buxbaum, Joseph D %K Autistic Disorder %K Case-Control Studies %K Cell Lineage %K Cerebral Cortex %K Cohort Studies %K Exome %K Female %K Gene Expression Regulation, Developmental %K Gene Frequency %K Genetic Predisposition to Disease %K Humans %K Male %K Mutation, Missense %K Neurobiology %K Neurons %K Phenotype %K Sex Factors %K Single-Cell Analysis %K Whole Exome Sequencing %X

We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n = 35,584 total samples, 11,986 with ASD). Using an enhanced analytical framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate of 0.1 or less. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained to have severe neurodevelopmental delay, whereas 53 show higher frequencies in individuals ascertained to have ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In cells from the human cortex, expression of risk genes is enriched in excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory-inhibitory imbalance underlying ASD.

%B Cell %V 180 %P 568-584.e23 %8 2020 02 06 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31981491?dopt=Abstract %R 10.1016/j.cell.2019.12.036 %0 Journal Article %J Nature %D 2020 %T Mapping and characterization of structural variation in 17,795 human genomes. %A Abel, Haley J %A Larson, David E %A Regier, Allison A %A Chiang, Colby %A Das, Indraniel %A Kanchi, Krishna L %A Layer, Ryan M %A Neale, Benjamin M %A Salerno, William J %A Reeves, Catherine %A Buyske, Steven %A Matise, Tara C %A Muzny, Donna M %A Zody, Michael C %A Lander, Eric S %A Dutcher, Susan K %A Stitziel, Nathan O %A Hall, Ira M %K Alleles %K Case-Control Studies %K Continental Population Groups %K Epigenesis, Genetic %K Female %K Gene Dosage %K Genetic Variation %K Genetics, Population %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Male %K Molecular Sequence Annotation %K Quantitative Trait Loci %K Software %K Whole Genome Sequencing %X

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

%B Nature %V 583 %P 83-89 %8 2020 07 %G eng %N 7814 %1 https://www.ncbi.nlm.nih.gov/pubmed/32460305?dopt=Abstract %R 10.1038/s41586-020-2371-0 %0 Journal Article %J PLoS Genet %D 2020 %T A missense variant in Mitochondrial Amidoxime Reducing Component 1 gene and protection against liver disease. %A Emdin, Connor A %A Haas, Mary E %A Khera, Amit V %A Aragam, Krishna %A Chaffin, Mark %A Klarin, Derek %A Hindy, George %A Jiang, Lan %A Wei, Wei-Qi %A Feng, Qiping %A Karjalainen, Juha %A Havulinna, Aki %A Kiiskinen, Tuomo %A Bick, Alexander %A Ardissino, Diego %A Wilson, James G %A Schunkert, Heribert %A McPherson, Ruth %A Watkins, Hugh %A Elosua, Roberto %A Bown, Matthew J %A Samani, Nilesh J %A Baber, Usman %A Erdmann, Jeanette %A Gupta, Namrata %A Danesh, John %A Saleheen, Danish %A Chang, Kyong-Mi %A Vujkovic, Marijana %A Voight, Ben %A Damrauer, Scott %A Lynch, Julie %A Kaplan, David %A Serper, Marina %A Tsao, Philip %A Mercader, Josep %A Hanis, Craig %A Daly, Mark %A Denny, Joshua %A Gabriel, Stacey %A Kathiresan, Sekar %K Alleles %K Cholesterol, LDL %K Coronary Artery Disease %K Datasets as Topic %K Fatty Liver %K Female %K Genetic Predisposition to Disease %K Homozygote %K Humans %K Liver %K Liver Cirrhosis %K Liver Cirrhosis, Alcoholic %K Loss of Function Mutation %K Male %K Middle Aged %K Mitochondrial Proteins %K Mutation, Missense %K Oxidoreductases %X

Analyzing 12,361 all-cause cirrhosis cases and 790,095 controls from eight cohorts, we identify a common missense variant in the Mitochondrial Amidoxime Reducing Component 1 gene (MARC1 p.A165T) that associates with protection from all-cause cirrhosis (OR 0.91, p = 2.3 × 10^-11). This same variant also associates with lower levels of hepatic fat on computed tomographic imaging and lower odds of physician-diagnosed fatty liver as well as lower blood levels of alanine transaminase (-0.025 SD, 3.7 × 10^-43), alkaline phosphatase (-0.025 SD, 1.2 × 10^-37), total cholesterol (-0.030 SD, p = 1.9 × 10^-36) and LDL cholesterol (-0.027 SD, p = 5.1 × 10^-30) levels. We identified a series of additional MARC1 alleles (low-frequency missense p.M187K and rare protein-truncating p.R200Ter) that also associated with lower cholesterol levels, liver enzyme levels and reduced risk of cirrhosis (0 cirrhosis cases for 238 R200Ter carriers versus 17,046 cases of cirrhosis among 759,027 non-carriers, p = 0.04) suggesting that deficiency of the MARC1 enzyme may lower blood cholesterol levels and protect against cirrhosis.

%B PLoS Genet %V 16 %P e1008629 %8 2020 04 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/32282858?dopt=Abstract %R 10.1371/journal.pgen.1008629 %0 Journal Article %J Genes (Basel) %D 2020 %T Next Generation Sequencing of 134 Children with Autism Spectrum Disorder and Regression. %A Yin, Jiani %A Chun, Chun-An %A Zavadenko, Nikolay N %A Pechatnikova, Natalia L %A Naumova, Oxana Yu %A Doddapaneni, Harsha V %A Hu, Jianhong %A Muzny, Donna M %A Schaaf, Christian P %A Grigorenko, Elena L %K Autism Spectrum Disorder %K Child %K Child, Preschool %K Cohort Studies %K Disease Progression %K Female %K Gene Expression Regulation %K Genetic Markers %K Genetic Predisposition to Disease %K High-Throughput Nucleotide Sequencing %K Humans %K Infant %K Male %K Mutation %X

Approximately 30% of individuals with autism spectrum disorder (ASD) experience developmental regression, the etiology of which remains largely unknown. We performed a complete literature search and identified 47 genes that had been implicated in such cases. We sequenced these genes in a preselected cohort of 134 individuals with regressive autism. In total, 16 variants in 12 genes with evidence supportive of pathogenicity were identified. They were classified as variants of uncertain significance based on ACMG standards and guidelines. Among these were recurring variants in and , variants in genes that were linked to syndromic forms of ASD (, , , , , and ), and variants in the form of oligogenic heterozygosity (, , and ).

%B Genes (Basel) %V 11 %8 2020 07 25 %G eng %N 8 %1 https://www.ncbi.nlm.nih.gov/pubmed/32722525?dopt=Abstract %R 10.3390/genes11080853 %0 Journal Article %J Am J Hum Genet %D 2020 %T Non-parametric Polygenic Risk Prediction via Partitioned GWAS Summary Statistics. %A Chun, Sung %A Imakaev, Maxim %A Hui, Daniel %A Patsopoulos, Nikolaos A %A Neale, Benjamin M %A Kathiresan, Sekar %A Stitziel, Nathan O %A Sunyaev, Shamil R %K Aged %K Cohort Studies %K Diabetes Mellitus, Type 2 %K Female %K Genome-Wide Association Study %K Genotype %K Humans %K Linkage Disequilibrium %K Male %K Middle Aged %K Models, Genetic %K Multifactorial Inheritance %K Phenotype %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %X

In complex trait genetics, the ability to predict phenotype from genotype is the ultimate measure of our understanding of genetic architecture underlying the heritability of a trait. A complete understanding of the genetic basis of a trait should allow for predictive methods with accuracies approaching the trait's heritability. The highly polygenic nature of quantitative traits and most common phenotypes has motivated the development of statistical strategies focused on combining myriad individually non-significant genetic effects. Now that predictive accuracies are improving, there is a growing interest in the practical utility of such methods for predicting risk of common diseases responsive to early therapeutic intervention. However, existing methods require individual-level genotypes or depend on accurately specifying the genetic architecture underlying each disease to be predicted. Here, we propose a polygenic risk prediction method that does not require explicitly modeling any underlying genetic architecture. We start with summary statistics in the form of SNP effect sizes from a large GWAS cohort. We then remove the correlation structure across summary statistics arising due to linkage disequilibrium and apply a piecewise linear interpolation on conditional mean effects. In both simulated and real datasets, this new non-parametric shrinkage (NPS) method can reliably allow for linkage disequilibrium in summary statistics of 5 million dense genome-wide markers and consistently improves prediction accuracy. We show that NPS improves the identification of groups at high risk for breast cancer, type 2 diabetes, inflammatory bowel disease, and coronary heart disease, all of which have available early intervention or prevention treatments.

%B Am J Hum Genet %V 107 %P 46-59 %8 2020 07 02 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32470373?dopt=Abstract %R 10.1016/j.ajhg.2020.05.004 %0 Journal Article %J Mov Disord %D 2020 %T The Parkinson's Disease Genome-Wide Association Study Locus Browser. %A Grenn, Francis P %A Kim, Jonggeol J %A Makarious, Mary B %A Iwaki, Hirotaka %A Illarionova, Anastasia %A Brolin, Kajsa %A Kluss, Jillian H %A Schumacher-Schuh, Artur F %A Leonard, Hampton %A Faghri, Faraz %A Billingsley, Kimberley %A Krohn, Lynne %A Hall, Ashley %A Diez-Fairen, Monica %A Periñán, Maria Teresa %A Foo, Jia Nee %A Sandor, Cynthia %A Webber, Caleb %A Fiske, Brian K %A Gibbs, J Raphael %A Nalls, Mike A %A Singleton, Andrew B %A Bandres-Ciga, Sara %A Reed, Xylena %A Blauwendraat, Cornelis %K Age of Onset %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Neurodegenerative Diseases %K Parkinson Disease %K Risk Factors %X

BACKGROUND: Parkinson's disease (PD) is a neurodegenerative disease with an often complex component identifiable by genome-wide association studies. The most recent large-scale PD genome-wide association studies have identified more than 90 independent risk variants for PD risk and progression across more than 80 genomic regions. One major challenge in current genomics is the identification of the causal gene(s) and variant(s) at each genome-wide association study locus. The objective of the current study was to create a tool that would display data for relevant PD risk loci and provide guidance with the prioritization of causal genes and potential mechanisms at each locus.

METHODS: We included all significant genome-wide signals from multiple recent PD genome-wide association studies including the most recent PD risk genome-wide association study, age-at-onset genome-wide association study, progression genome-wide association study, and Asian population PD risk genome-wide association study. We gathered data for all genes 1 Mb upstream and downstream of each variant to allow users to assess which gene(s) are most associated with the variant of interest based on a set of self-ranked criteria. Multiple databases were queried for each gene to collect additional causal data.

RESULTS: We created a PD genome-wide association study browser tool (https://pdgenetics.shinyapps.io/GWASBrowser/) to assist the PD research community with the prioritization of genes for follow-up functional studies to identify potential therapeutic targets.

CONCLUSIONS: Our PD genome-wide association study browser tool provides users with a useful method of identifying potential causal genes at all known PD risk loci from large-scale PD genome-wide association studies. We plan to update this tool with new relevant data as sample sizes increase and new PD risk loci are discovered. © 2020 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society. This article has been contributed to by US Government employees and their work is in the public domain in the USA.

%B Mov Disord %V 35 %P 2056-2067 %8 2020 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/32864809?dopt=Abstract %R 10.1002/mds.28197 %0 Journal Article %J Nat Commun %D 2020 %T Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. %A Fahed, Akl C %A Wang, Minxian %A Homburger, Julian R %A Patel, Aniruddh P %A Bick, Alexander G %A Neben, Cynthia L %A Lai, Carmen %A Brockman, Deanna %A Philippakis, Anthony %A Ellinor, Patrick T %A Cassa, Christopher A %A Lebo, Matthew %A Ng, Kenney %A Lander, Eric S %A Zhou, Alicia Y %A Kathiresan, Sekar %A Khera, Amit V %K Aged %K Breast Neoplasms %K Case-Control Studies %K Colorectal Neoplasms %K Coronary Artery Disease %K Female %K Genetic Predisposition to Disease %K Genome, Human %K Humans %K Male %K Middle Aged %K Multifactorial Inheritance %K Odds Ratio %K Penetrance %K Risk Factors %X

Genetic variation can predispose to disease both through (i) monogenic risk variants that disrupt a physiologic pathway with large effect on disease and (ii) polygenic risk that involves many variants of small effect in different pathways. Few studies have explored the interplay between monogenic and polygenic risk. Here, we study 80,928 individuals to examine whether polygenic background can modify penetrance of disease in tier 1 genomic conditions - familial hypercholesterolemia, hereditary breast and ovarian cancer, and Lynch syndrome. Among carriers of a monogenic risk variant, we estimate substantial gradients in disease risk based on polygenic background - the probability of disease by age 75 years ranged from 17% to 78% for coronary artery disease, 13% to 76% for breast cancer, and 11% to 80% for colon cancer. We propose that accounting for polygenic background is likely to increase accuracy of risk estimation for individuals who inherit a monogenic risk variant.

%B Nat Commun %V 11 %P 3635 %8 2020 08 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32820175?dopt=Abstract %R 10.1038/s41467-020-17374-3 %0 Journal Article %J Genes (Basel) %D 2020 %T Rapid, Paralog-Sensitive CNV Analysis of 2457 Human Genomes Using QuicK-mer2. %A Shen, Feichen %A Kidd, Jeffrey M %K Algorithms %K Computational Biology %K DNA Copy Number Variations %K Evolution, Molecular %K Gene Duplication %K Genome, Human %K Humans %K Sequence Analysis, DNA %X

Gene duplication is a major mechanism for the evolution of gene novelty, and copy-number variation makes a major contribution to inter-individual genetic diversity. However, most approaches for studying copy-number variation rely upon uniquely mapping reads to a genome reference and are unable to distinguish among duplicated sequences. Specialized approaches to interrogate specific paralogs are comparatively slow and have a high degree of computational complexity, limiting their effective application to emerging population-scale data sets. We present QuicK-mer2, a self-contained, mapping-free approach that enables the rapid construction of paralog-specific copy-number maps from short-read sequence data. This approach is based on the tabulation of unique k-mer sequences from short-read data sets, and is able to analyze a 20X coverage human genome in approximately 20 min. We applied our approach to newly released sequence data from the 1000 Genomes Project, constructed paralog-specific copy-number maps from 2457 unrelated individuals, and uncovered copy-number variation of paralogous genes. We identify nine genes where none of the analyzed samples have a copy number of two, 92 genes where the majority of samples have a copy number other than two, and describe rare copy number variation affecting multiple genes at the APOBEC3 locus.

%B Genes (Basel) %V 11 %8 2020 01 29 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/32013076?dopt=Abstract %R 10.3390/genes11020141 %0 Journal Article %J N Engl J Med %D 2020 %T RNA Identification of PRIME Cells Predicting Rheumatoid Arthritis Flares. %A Orange, Dana E %A Yao, Vicky %A Sawicka, Kirsty %A Fak, John %A Frank, Mayu O %A Parveen, Salina %A Blachère, Nathalie E %A Hale, Caryn %A Zhang, Fan %A Raychaudhuri, Soumya %A Troyanskaya, Olga G %A Darnell, Robert B %K Adult %K Arthritis, Rheumatoid %K B-Lymphocytes %K Female %K Fibroblasts %K Flow Cytometry %K Gene Expression %K Humans %K Male %K Mesenchymal Stem Cells %K Middle Aged %K Patient Acuity %K Sequence Analysis, RNA %K Surveys and Questionnaires %K Symptom Flare Up %K Synovial Fluid %X

BACKGROUND: Rheumatoid arthritis, like many inflammatory diseases, is characterized by episodes of quiescence and exacerbation (flares). The molecular events leading to flares are unknown.

METHODS: We established a clinical and technical protocol for repeated home collection of blood in patients with rheumatoid arthritis to allow for longitudinal RNA sequencing (RNA-seq). Specimens were obtained from 364 time points during eight flares over a period of 4 years in our index patient, as well as from 235 time points during flares in three additional patients. We identified transcripts that were differentially expressed before flares and compared these with data from synovial single-cell RNA-seq. Flow cytometry and sorted-blood-cell RNA-seq in additional patients were used to validate the findings.

RESULTS: Consistent changes were observed in blood transcriptional profiles 1 to 2 weeks before a rheumatoid arthritis flare. B-cell activation was followed by expansion of circulating CD45-CD31-PDPN+ preinflammatory mesenchymal, or PRIME, cells in the blood from patients with rheumatoid arthritis; these cells shared features of inflammatory synovial fibroblasts. Levels of circulating PRIME cells decreased during flares in all 4 patients, and flow cytometry and sorted-cell RNA-seq confirmed the presence of PRIME cells in 19 additional patients with rheumatoid arthritis.

CONCLUSIONS: Longitudinal genomic analysis of rheumatoid arthritis flares revealed PRIME cells in the blood during the period before a flare and suggested a model in which these cells become activated by B cells in the weeks before a flare and subsequently migrate out of the blood into the synovium. (Funded by the National Institutes of Health and others.).

%B N Engl J Med %V 383 %P 218-228 %8 2020 07 16 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/32668112?dopt=Abstract %R 10.1056/NEJMoa2004114 %0 Journal Article %J Am J Clin Nutr %D 2020 %T Serum sphingolipids and incident diabetes in a US population with high diabetes burden: the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). %A Chen, Guo-Chong %A Chai, Jin Choul %A Yu, Bing %A Michelotti, Gregory A %A Grove, Megan L %A Fretts, Amanda M %A Daviglus, Martha L %A Garcia-Bedoya, Olga L %A Thyagarajan, Bharat %A Schneiderman, Neil %A Cai, Jianwen %A Kaplan, Robert C %A Boerwinkle, Eric %A Qi, Qibin %K Adolescent %K Adult %K Aged %K Diabetes Mellitus %K Female %K Hispanic Americans %K Humans %K Male %K Middle Aged %K Prospective Studies %K Risk Factors %K Sphingolipids %K United States %K Young Adult %X

BACKGROUND: Genetic or pharmacological inhibition of de novo sphingolipid synthases prevented diabetes in animal studies.

OBJECTIVES: We sought to evaluate prospective associations of serum sphingolipids with incident diabetes in a population-based cohort.

METHODS: We included 2010 participants of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) aged 18-74 y who were free of diabetes and other major chronic diseases at baseline (2008-2011). Metabolomic profiling of fasting serum was performed using a global, untargeted approach. A total of 43 sphingolipids were quantified and, considering subclasses and chemical structures of individual species, 6 sphingolipid scores were constructed. Diabetes status was assessed using standard procedures including blood tests. Multivariable survey Poisson regressions were applied to estimate RR and 95% CI of incident diabetes associated with individual sphingolipids or sphingolipid scores.

RESULTS: There were 224 incident cases of diabetes identified during, on average, 6 y of follow-up. After adjustment for socioeconomic and lifestyle factors, a ceramide score (RR Q4 versus Q1 = 2.40; 95% CI: 1.24, 4.65; P-trend = 0.003) and a score of sphingomyelins with fully saturated sphingoid-fatty acid pairs (RR Q4 versus Q1 = 3.15; 95% CI: 1.75, 5.67; P-trend <0.001) both were positively associated with risk of diabetes, whereas scores of glycosylceramides, lactosylceramides, or other unsaturated sphingomyelins (even if having an SFA base) were not associated with risk of diabetes. After additional adjustment for numerous traditional risk factors (especially triglycerides), both associations were attenuated and only the saturated-sphingomyelin score remained associated with risk of diabetes (RR Q4 versus Q1 = 1.98; 95% CI: 1.09, 3.59; P-trend = 0.031).

CONCLUSIONS: Our findings suggest that a cluster of saturated sphingomyelins may be associated with elevated risk of diabetes beyond traditional risk factors, which needs to be verified in other population studies. This study was registered at clinicaltrials.gov as NCT02060344.

%B Am J Clin Nutr %V 112 %P 57-65 %8 2020 07 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32469399?dopt=Abstract %R 10.1093/ajcn/nqaa114 %0 Journal Article %J Genet Med %D 2020 %T Spinal muscular atrophy diagnosis and carrier screening from genome sequencing data. %A Chen, Xiao %A Sanchis-Juan, Alba %A French, Courtney E %A Connell, Andrew J %A Delon, Isabelle %A Kingsbury, Zoya %A Chawla, Aditi %A Halpern, Aaron L %A Taft, Ryan J %A Bentley, David R %A Butchbach, Matthew E R %A Raymond, F Lucy %A Eberle, Michael A %K Base Sequence %K Child %K Child, Preschool %K Humans %K Muscular Atrophy, Spinal %K Survival of Motor Neuron 1 Protein %X

PURPOSE: Spinal muscular atrophy (SMA), caused by loss of the SMN1 gene, is a leading cause of early childhood death. Due to the near identical sequences of SMN1 and SMN2, analysis of this region is challenging. Population-wide SMA screening to quantify the SMN1 copy number (CN) is recommended by the American College of Medical Genetics and Genomics.

METHODS: We developed a method that accurately identifies the CN of SMN1 and SMN2 using genome sequencing (GS) data by analyzing read depth and eight informative reference genome differences between SMN1/2.

RESULTS: We characterized SMN1/2 in 12,747 genomes, identified 1568 samples with SMN1 gains or losses and 6615 samples with SMN2 gains or losses, and calculated a pan-ethnic carrier frequency of 2%, consistent with previous studies. Additionally, 99.8% of our SMN1 and 99.7% of SMN2 CN calls agreed with orthogonal methods, with a recall of 100% for SMA and 97.8% for carriers, and a precision of 100% for both SMA and carriers.

CONCLUSION: This SMN copy-number caller can be used to identify both carrier and affected status of SMA, enabling SMA testing to be offered as a comprehensive test in neonatal care and an accurate carrier screening tool in GS projects.

%B Genet Med %V 22 %P 945-953 %8 2020 05 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32066871?dopt=Abstract %R 10.1038/s41436-020-0754-0 %0 Journal Article %J Nature %D 2020 %T A structural variation reference for medical and population genetics. %A Collins, Ryan L %A Brand, Harrison %A Karczewski, Konrad J %A Zhao, Xuefang %A Alföldi, Jessica %A Francioli, Laurent C %A Khera, Amit V %A Lowther, Chelsea %A Gauthier, Laura D %A Wang, Harold %A Watts, Nicholas A %A Solomonson, Matthew %A O'Donnell-Luria, Anne %A Baumann, Alexander %A Munshi, Ruchi %A Walker, Mark %A Whelan, Christopher W %A Huang, Yongqing %A Brookings, Ted %A Sharpe, Ted %A Stone, Matthew R %A Valkanas, Elise %A Fu, Jack %A Tiao, Grace %A Laricchia, Kristen M %A Ruano-Rubio, Valentin %A Stevens, Christine %A Gupta, Namrata %A Cusick, Caroline %A Margolin, Lauren %A Taylor, Kent D %A Lin, Henry J %A Rich, Stephen S %A Post, Wendy S %A Chen, Yii-Der Ida %A Rotter, Jerome I %A Nusbaum, Chad %A Philippakis, Anthony %A Lander, Eric %A Gabriel, Stacey %A Neale, Benjamin M %A Kathiresan, Sekar %A Daly, Mark J %A Banks, Eric %A MacArthur, Daniel G %A Talkowski, Michael E %K Continental Population Groups %K Disease %K Female %K Genetic Testing %K Genetic Variation %K Genetics, Medical %K Genetics, Population %K Genome, Human %K Genotyping Techniques %K Humans %K Male %K Middle Aged %K Mutation %K Polymorphism, Single Nucleotide %K Reference Standards %K Selection, Genetic %K Whole Genome Sequencing %X

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

%B Nature %V 581 %P 444-451 %8 2020 05 %G eng %N 7809 %1 https://www.ncbi.nlm.nih.gov/pubmed/32461652?dopt=Abstract %R 10.1038/s41586-020-2287-8 %0 Journal Article %J J Am Coll Cardiol %D 2020 %T Titin Truncating Variants in Adults Without Known Congestive Heart Failure. %A Pirruccello, James P %A Bick, Alexander %A Chaffin, Mark %A Aragam, Krishna G %A Choi, Seung Hoan %A Lubitz, Steven A %A Ho, Carolyn Y %A Ng, Kenney %A Philippakis, Anthony %A Ellinor, Patrick T %A Kathiresan, Sekar %A Khera, Amit V %K Adult %K Aged %K Asymptomatic Diseases %K Connectin %K Female %K Genetic Variation %K Heart Failure %K Humans %K Male %K Middle Aged %B J Am Coll Cardiol %V 75 %P 1239-1241 %8 2020 03 17 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/32164899?dopt=Abstract %R 10.1016/j.jacc.2020.01.013 %0 Journal Article %J Science %D 2020 %T Transcriptomic signatures across human tissues identify functional rare genetic variation. %A Ferraro, Nicole M %A Strober, Benjamin J %A Einson, Jonah %A Abell, Nathan S %A Aguet, François %A Barbeira, Alvaro N %A Brandt, Margot %A Bucan, Maja %A Castel, Stephane E %A Davis, Joe R %A Greenwald, Emily %A Hess, Gaelen T %A Hilliard, Austin T %A Kember, Rachel L %A Kotis, Bence %A Park, YoSon %A Peloso, Gina %A Ramdas, Shweta %A Scott, Alexandra J %A Smail, Craig %A Tsang, Emily K %A Zekavat, Seyedeh M %A Ziosi, Marcello %A Ardlie, Kristin G %A Assimes, Themistocles L %A Bassik, Michael C %A Brown, Christopher D %A Correa, Adolfo %A Hall, Ira %A Im, Hae Kyung %A Li, Xin %A Natarajan, Pradeep %A Lappalainen, Tuuli %A Mohammadi, Pejman %A Montgomery, Stephen B %A Battle, Alexis %K Genetic Variation %K Genome, Human %K Humans %K Multifactorial Inheritance %K Organ Specificity %K Transcriptome %X

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913073?dopt=Abstract %R 10.1126/science.aaz5900 %0 Journal Article %J Nat Commun %D 2020 %T Type 2 and interferon inflammation regulate SARS-CoV-2 entry factor expression in the airway epithelium. %A Sajuthi, Satria P %A DeFord, Peter %A Li, Yingchun %A Jackson, Nathan D %A Montgomery, Michael T %A Everman, Jamie L %A Rios, Cydney L %A Pruesse, Elmar %A Nolin, James D %A Plender, Elizabeth G %A Wechsler, Michael E %A Mak, Angel C Y %A Eng, Celeste %A Salazar, Sandra %A Medina, Vivian %A Wohlford, Eric M %A Huntsman, Scott %A Nickerson, Deborah A %A Germer, Soren %A Zody, Michael C %A Abecasis, Gonçalo %A Kang, Hyun Min %A Rice, Kenneth M %A Kumar, Rajesh %A Oh, Sam %A Rodriguez-Santana, Jose %A Burchard, Esteban G %A Seibold, Max A %K Angiotensin-Converting Enzyme 2 %K Betacoronavirus %K Child %K Coronavirus Infections %K COVID-19 %K Epithelial Cells %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Variation %K Host-Pathogen Interactions %K Humans %K Inflammation %K Interferons %K Interleukin-13 %K Middle Aged %K Nasal Mucosa %K Pandemics %K Peptidyl-Dipeptidase A %K Pneumonia, Viral %K SARS-CoV-2 %K Serine Endopeptidases %K Virus Internalization %X

Coronavirus disease 2019 (COVID-19) is caused by SARS-CoV-2, an emerging virus that utilizes host proteins ACE2 and TMPRSS2 as entry factors. Understanding the factors affecting the pattern and levels of expression of these genes is important for deeper understanding of SARS-CoV-2 tropism and pathogenesis. Here we explore the role of genetics and co-expression networks in regulating these genes in the airway, through the analysis of nasal airway transcriptome data from 695 children. We identify expression quantitative trait loci for both ACE2 and TMPRSS2 that vary in frequency across world populations. We find TMPRSS2 is part of a mucus secretory network, highly upregulated by type 2 (T2) inflammation through the action of interleukin-13, and that the interferon response to respiratory viruses highly upregulates ACE2 expression. IL-13- and virus infection-mediated effects on ACE2 expression were also observed at the protein level in the airway epithelium. Finally, we define airway responses to common coronavirus infections in children, finding that these infections generate host responses similar to other viral species, including upregulation of IL6 and ACE2. Our results reveal possible mechanisms influencing SARS-CoV-2 infectivity and COVID-19 clinical outcomes.

%B Nat Commun %V 11 %P 5139 %8 2020 10 12 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33046696?dopt=Abstract %R 10.1038/s41467-020-18781-2 %0 Journal Article %J J Am Coll Cardiol %D 2020 %T Validation of a Genome-Wide Polygenic Score for Coronary Artery Disease in South Asians. %A Wang, Minxian %A Menon, Ramesh %A Mishra, Sanghamitra %A Patel, Aniruddh P %A Chaffin, Mark %A Tanneeru, Deepak %A Deshmukh, Manjari %A Mathew, Oshin %A Apte, Sanika %A Devanboo, Christina S %A Sundaram, Sumathi %A Lakshmipathy, Praveena %A Murugan, Sakthivel %A Sharma, Krishna Kumar %A Rajendran, Karthikeyan %A Santhosh, Sam %A Thachathodiyl, Rajesh %A Ahamed, Hisham %A Balegadde, Aniketh Vijay %A Alexander, Thomas %A Swaminathan, Krishnan %A Gupta, Rajeev %A Mullasari, Ajit S %A Sigamani, Alben %A Kanchi, Muralidhar %A Peterson, Andrew S %A Butterworth, Adam S %A Danesh, John %A Di Angelantonio, Emanuele %A Naheed, Aliya %A Inouye, Michael %A Chowdhury, Rajiv %A Vedam, Ramprasad L %A Kathiresan, Sekar %A Gupta, Ravi %A Khera, Amit V %K Adult %K Aged %K Bangladesh %K Case-Control Studies %K Coronary Artery Disease %K Female %K Genome-Wide Association Study %K Humans %K India %K Male %K Middle Aged %K Multifactorial Inheritance %X

BACKGROUND: Genome-wide polygenic scores (GPS) integrate information from many common DNA variants into a single number. Because rates of coronary artery disease (CAD) are substantially higher among South Asians, a GPS to identify high-risk individuals may be particularly useful in this population.

OBJECTIVES: This analysis used summary statistics from a prior genome-wide association study to derive a new GPS for South Asians.

METHODS: This GPS was validated in 7,244 South Asian UK Biobank participants and tested in 491 individuals from a case-control study in Bangladesh. Next, a static ancestry and GPS reference distribution was built using whole-genome sequencing from 1,522 Indian individuals, and a framework was tested for projecting individuals onto this static ancestry and GPS reference distribution using 1,800 CAD cases and 1,163 control subjects newly recruited in India.

RESULTS: The GPS, containing 6,630,150 common DNA variants, had an odds ratio (OR) per SD of 1.58 in South Asian UK Biobank participants and 1.60 in the Bangladeshi study (p < 0.001 for each). Next, individuals of the Indian case-control study were projected onto static reference distributions, yielding an OR/SD of 1.66 (p < 0.001). Compared with the middle quintile, risk for CAD was most pronounced for those in the top 5% of the GPS distribution, with ORs of 4.16, 2.46, and 3.22 in the South Asian UK Biobank, Bangladeshi, and Indian studies, respectively (p < 0.05 for each).

CONCLUSIONS: The new GPS has been developed and tested using 3 distinct South Asian studies, and provides a generalizable framework for ancestry-specific GPS assessment.

%B J Am Coll Cardiol %V 76 %P 703-714 %8 2020 08 11 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/32762905?dopt=Abstract %R 10.1016/j.jacc.2020.06.024 %0 Journal Article %J Elife %D 2020 %T A variant-centric perspective on geographic patterns of human allele frequency variation. %A Biddanda, Arjun %A Rice, Daniel P %A Novembre, John %K Gene Frequency %K Genetic Variation %K Genetics, Population %K Geography %K Humans %X

A key challenge in human genetics is to understand the geographic distribution of human genetic variation. Often genetic variation is described by showing relationships among populations or individuals, drawing inferences over many variants. Here, we introduce an alternative representation of genetic variation that reveals the relative abundance of different allele frequency patterns. This approach allows viewers to easily see several features of human genetic structure: (1) most variants are rare and geographically localized, (2) variants that are common in a single geographic region are more likely to be shared across the globe than to be private to that region, and (3) where two individuals differ, it is most often due to variants that are found globally, regardless of whether the individuals are from the same region or different regions. Our variant-centric visualization clarifies the geographic patterns of human variation and can help address misconceptions about genetic differentiation among populations.

%B Elife %V 9 %8 2020 12 22 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/33350384?dopt=Abstract %R 10.7554/eLife.60107 %0 Journal Article %J Circulation %D 2020 %T What Is Familial Hypercholesterolemia, and Why Does It Matter? %A Khera, Amit V %A Hegele, Robert A %K Atherosclerosis %K Cardiovascular Diseases %K Cholesterol, LDL %K Humans %K Hyperlipoproteinemia Type II %K Prevalence %B Circulation %V 141 %P 1760-1763 %8 2020 06 02 %G eng %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/32479201?dopt=Abstract %R 10.1161/CIRCULATIONAHA.120.046961 %0 Journal Article %J Nat Commun %D 2019 %T A multi-task convolutional deep neural network for variant calling in single molecule sequencing. %A Luo, Ruibang %A Sedlazeck, Fritz J %A Lam, Tak-Wah %A Schatz, Michael C %K Base Sequence %K Computational Biology %K DNA Mutational Analysis %K Genome, Human %K Genome-Wide Association Study %K Genomics %K Genotype %K Genotyping Techniques %K Humans %K INDEL Mutation %K Nanopores %K Neural Networks (Computer) %K Polymorphism, Single Nucleotide %K Sequence Analysis, DNA %K Software %X

The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5-15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source ( https://github.com/aquaskyline/Clairvoyante ), with modules to train, utilize and visualize the model.

%B Nat Commun %V 10 %P 998 %8 2019 03 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30824707?dopt=Abstract %R 10.1038/s41467-019-09025-z %0 Journal Article %J Nat Methods %D 2018 %T Accurate detection of complex structural variations using single-molecule sequencing. %A Sedlazeck, Fritz J %A Rescheneder, Philipp %A Smolka, Moritz %A Fang, Han %A Nattestad, Maria %A von Haeseler, Arndt %A Schatz, Michael C %K DNA Mutational Analysis %K Genome, Human %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis, DNA %X

Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr ) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles ) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.

%B Nat Methods %V 15 %P 461-468 %8 2018 06 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/29713083?dopt=Abstract %R 10.1038/s41592-018-0001-7 %0 Journal Article %J Nat Commun %D 2018 %T Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease. %A Emdin, Connor A %A Khera, Amit V %A Chaffin, Mark %A Klarin, Derek %A Natarajan, Pradeep %A Aragam, Krishna %A Haas, Mary %A Bick, Alexander %A Zekavat, Seyedeh M %A Nomura, Akihiro %A Ardissino, Diego %A Wilson, James G %A Schunkert, Heribert %A McPherson, Ruth %A Watkins, Hugh %A Elosua, Roberto %A Bown, Matthew J %A Samani, Nilesh J %A Baber, Usman %A Erdmann, Jeanette %A Gupta, Namrata %A Danesh, John %A Chasman, Daniel %A Ridker, Paul %A Denny, Joshua %A Bastarache, Lisa %A Lichtman, Judith H %A D'Onofrio, Gail %A Mattera, Jennifer %A Spertus, John A %A Sheu, Wayne H-H %A Taylor, Kent D %A Psaty, Bruce M %A Rich, Stephen S %A Post, Wendy %A Rotter, Jerome I %A Chen, Yii-Der Ida %A Krumholz, Harlan %A Saleheen, Danish %A Gabriel, Stacey %A Kathiresan, Sekar %K Databases, Genetic %K Diabetes Mellitus, Type 2 %K Disease %K Gene Frequency %K Genetic Testing %K Genetic Variation %K Humans %K Obesity %K Phenotype %K Proteins %K Respiratory Hypersensitivity %K United Kingdom %X

Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.

%B Nat Commun %V 9 %P 1613 %8 2018 04 24 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29691411?dopt=Abstract %R 10.1038/s41467-018-03911-8 %0 Journal Article %J BMC Genomics %D 2018 %T Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms. %A Costello, Maura %A Fleharty, Mark %A Abreu, Justin %A Farjoun, Yossi %A Ferriera, Steven %A Holmes, Laurie %A Granger, Brian %A Green, Lisa %A Howd, Tom %A Mason, Tamara %A Vicente, Gina %A Dasilva, Michael %A Brodeur, Wendy %A DeSmet, Timothy %A Dodge, Sheila %A Lennon, Niall J %A Gabriel, Stacey %K DNA %K Gene Library %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis %K Sequence Analysis, DNA %X

BACKGROUND: Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination due to the phenomenon of index swapping that impacts Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeqX, HiSeq4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.

RESULTS: Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross contamination by utilizing non-redundant dual indexing for complete filtering of index swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods we provide a greater insight into the mechanism of index swapping.

CONCLUSIONS: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeqX, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.

%B BMC Genomics %V 19 %P 332 %8 2018 May 08 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29739332?dopt=Abstract %R 10.1186/s12864-018-4703-0 %0 Journal Article %J Genet Med %D 2018 %T Characterizing reduced coverage regions through comparison of exome and genome sequencing data across 10 centers. %A Sanghvi, Rashesh V %A Buhay, Christian J %A Powell, Bradford C %A Tsai, Ellen A %A Dorschner, Michael O %A Hong, Celine S %A Lebo, Matthew S %A Sasson, Ariella %A Hanna, David S %A McGee, Sean %A Bowling, Kevin M %A Cooper, Gregory M %A Gray, David E %A Lonigro, Robert J %A Dunford, Andrew %A Brennan, Christine A %A Cibulskis, Carrie %A Walker, Kimberly %A Carneiro, Mauricio O %A Sailsbery, Joshua %A Hindorff, Lucia A %A Robinson, Dan R %A Santani, Avni %A Sarmady, Mahdi %A Rehm, Heidi L %A Biesecker, Leslie G %A Nickerson, Deborah A %A Hutter, Carolyn M %A Garraway, Levi %A Muzny, Donna M %A Wagle, Nikhil %K Base Sequence %K Chromosome Mapping %K Exome %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis, DNA %K Software %K Whole Exome Sequencing %K Whole Genome Sequencing %X

PURPOSE: As massively parallel sequencing is increasingly being used for clinical decision making, it has become critical to understand parameters that affect sequencing quality and to establish methods for measuring and reporting clinical sequencing standards. In this report, we propose a definition for reduced coverage regions and describe a set of standards for variant calling in clinical sequencing applications.

METHODS: To enable sequencing centers to assess the regions of poor sequencing quality in their own data, we optimized and used a tool (ExCID) to identify reduced coverage loci within genes or regions of particular interest. We used this framework to examine sequencing data from 500 patients generated in 10 projects at sequencing centers in the National Human Genome Research Institute/National Cancer Institute Clinical Sequencing Exploratory Research Consortium.

RESULTS: This approach identified reduced coverage regions in clinically relevant genes, including known clinically relevant loci that were uniquely missed at individual centers, in multiple centers, and in all centers.

CONCLUSION: This report provides a process road map for clinical sequencing centers looking to perform similar analyses on their data.

%B Genet Med %V 20 %P 855-866 %8 2018 08 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/29144510?dopt=Abstract %R 10.1038/gim.2017.192 %0 Journal Article %J Genome Res %D 2018 %T Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line. %A Nattestad, Maria %A Goodwin, Sara %A Ng, Karen %A Baslan, Timour %A Sedlazeck, Fritz J %A Rescheneder, Philipp %A Garvin, Tyler %A Fang, Han %A Gurtowski, James %A Hutton, Elizabeth %A Tseng, Elizabeth %A Chin, Chen-Shan %A Beck, Timothy %A Sundaravadanam, Yogi %A Kramer, Melissa %A Antoniou, Eric %A McPherson, John D %A Hicks, James %A McCombie, W Richard %A Schatz, Michael C %K Breast Neoplasms %K Female %K Gene Amplification %K Gene Rearrangement %K Genome, Human %K Genomic Structural Variation %K High-Throughput Nucleotide Sequencing %K Humans %K MCF-7 Cells %K Oncogenes %K Receptor, ErbB-2 %K Repetitive Sequences, Nucleic Acid %K Transcriptome %X

The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and developed one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discover a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution.

%B Genome Res %V 28 %P 1126-1135 %8 2018 08 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/29954844?dopt=Abstract %R 10.1101/gr.231100.117 %0 Journal Article %J Nat Commun %D 2018 %T Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. %A Regier, Allison A %A Farjoun, Yossi %A Larson, David E %A Krasheninina, Olga %A Kang, Hyun Min %A Howrigan, Daniel P %A Chen, Bo-Juen %A Kher, Manisha %A Banks, Eric %A Ames, Darren C %A English, Adam C %A Li, Heng %A Xing, Jinchuan %A Zhang, Yeting %A Matise, Tara %A Abecasis, Goncalo R %A Salerno, Will %A Zody, Michael C %A Neale, Benjamin M %A Hall, Ira M %K Genome, Human %K Human Genetics %K Humans %K Whole Genome Sequencing %X

Hundreds of thousands of human whole genome sequencing (WGS) datasets will be generated over the next few years. These data are more valuable in aggregate: joint analysis of genomes from many sources increases sample size and statistical power. A central challenge for joint analysis is that different WGS data processing pipelines cause substantial differences in variant calling in combined datasets, necessitating computationally expensive reprocessing. This approach is no longer tenable given the scale of current studies and data volumes. Here, we define WGS data processing standards that allow different groups to produce functionally equivalent (FE) results, yet still innovate on data processing pipelines. We present initial FE pipelines developed at five genome centers and show that they yield similar variant calling results and produce significantly less variability than sequencing replicates. This work alleviates a key technical bottleneck for genome aggregation and helps lay the foundation for community-wide human genetics studies.

%B Nat Commun %V 9 %P 4038 %8 2018 10 02 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/30279509?dopt=Abstract %R 10.1038/s41467-018-06159-4 %0 Journal Article %J Hum Genet %D 2018 %T Genetic variants in microRNA genes and targets associated with cardiovascular disease risk factors in the African-American population. %A Li, Chang %A Grove, Megan L %A Yu, Bing %A Jones, Barbara C %A Morrison, Alanna %A Boerwinkle, Eric %A Liu, Xiaoming %K 3' Untranslated Regions %K Adult %K African Americans %K Cardiovascular Diseases %K Female %K Genetic Predisposition to Disease %K Genotyping Techniques %K Humans %K Male %K MicroRNAs %K Middle Aged %K Polymorphism, Single Nucleotide %K Risk Factors %K Whole Genome Sequencing %X

The purpose of this study is to identify microRNA (miRNA)-related polymorphisms, including single nucleotide variants (SNVs) in mature miRNA-encoding sequences or in miRNA-target sites, and their association with cardiovascular disease (CVD) risk factors in the African-American population. To achieve our objective, we examined 1900 African-Americans from the Atherosclerosis Risk in Communities study using SNVs identified from whole-genome sequencing data. A total of 971 SNVs found in 726 different mature miRNA-encoding sequences and 16,057 SNVs found in the three prime untranslated region (3'UTR) of 3647 protein-coding genes were identified and tested for association with 17 CVD risk factors. Using a single-variant-based approach, we found 5 SNVs in miRNA-encoding sequences to be associated with serum Lipoprotein(a) [Lp(a)], high-density lipoprotein (HDL) or triglycerides, and 2 SNVs in miRNA-target sites to be associated with Lp(a) and HDL, all with false discovery rates of 5%. Using a gene-based approach, we identified 3 gene-trait associations: NSD1 with platelet count, HSPA4L with cardiac troponin T, and AHSA2 with magnesium. We successfully validated the association between a variant specific to the African-American population, NR_039880.1:n.18A>C, in the mature hsa-miR-4727-5p encoding sequence and serum HDL level in an independent sample of 2135 African-Americans. Our study provided candidate miRNAs and their targets for further investigation of their potential contribution to ethnic disparities in CVD risk factors.

%B Hum Genet %V 137 %P 85-94 %8 2018 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29264654?dopt=Abstract %R 10.1007/s00439-017-1858-8 %0 Journal Article %J J Am Coll Cardiol %D 2017 %T ANGPTL3 Deficiency and Protection Against Coronary Artery Disease. %A Stitziel, Nathan O %A Khera, Amit V %A Wang, Xiao %A Bierhals, Andrew J %A Vourakis, A Christina %A Sperry, Alexandra E %A Natarajan, Pradeep %A Klarin, Derek %A Emdin, Connor A %A Zekavat, Seyedeh M %A Nomura, Akihiro %A Erdmann, Jeanette %A Schunkert, Heribert %A Samani, Nilesh J %A Kraus, William E %A Shah, Svati H %A Yu, Bing %A Boerwinkle, Eric %A Rader, Daniel J %A Gupta, Namrata %A Frossard, Philippe M %A Rasheed, Asif %A Danesh, John %A Lander, Eric S %A Gabriel, Stacey %A Saleheen, Danish %A Musunuru, Kiran %A Kathiresan, Sekar %K Adult %K Angiopoietin-Like Protein 3 %K Angiopoietin-like Proteins %K Angiopoietins %K Animals %K Atherosclerosis %K Case-Control Studies %K Coronary Artery Disease %K Female %K Humans %K Lipids %K Male %K Mice, Inbred C57BL %K Mice, Knockout %K Middle Aged %K Mutation, Missense %K Myocardial Infarction %K Risk Factors %X

BACKGROUND: Familial combined hypolipidemia, a Mendelian condition characterized by substantial reductions in all 3 major lipid fractions, is caused by mutations that inactivate the gene angiopoietin-like 3 (ANGPTL3). Whether ANGPTL3 deficiency reduces risk of coronary artery disease (CAD) is unknown.

OBJECTIVES: The study goal was to test whether ANGPTL3 deficiency is associated with lower risk for CAD by leveraging 3 distinct lines of evidence: a family that included individuals with complete (compound heterozygote) ANGPTL3 deficiency, a population-based study of humans with partial (heterozygote) ANGPTL3 deficiency, and biomarker levels in patients with myocardial infarction (MI).

METHODS: We assessed coronary atherosclerotic burden in 3 individuals with complete ANGPTL3 deficiency and 3 wild-type first-degree relatives using computed tomography angiography. In the population, ANGPTL3 loss-of-function (LOF) mutations were ascertained in up to 21,980 people with CAD and 158,200 control subjects. LOF mutations were defined as nonsense, frameshift, and splice-site variants, along with missense variants resulting in <25% of wild-type ANGPTL3 activity in a mouse model. In a biomarker study, circulating ANGPTL3 concentration was measured in 1,493 people who presented with MI and 3,232 control subjects.

RESULTS: The 3 individuals with complete ANGPTL3 deficiency showed no evidence of coronary atherosclerotic plaque. ANGPTL3 gene sequencing demonstrated that approximately 1 in 309 people was a heterozygous carrier for an LOF mutation. Compared with those without mutation, heterozygous carriers of ANGPTL3 LOF mutations demonstrated a 17% reduction in circulating triglycerides and a 12% reduction in low-density lipoprotein cholesterol. Carrier status was associated with a 34% reduction in odds of CAD (odds ratio: 0.66; 95% confidence interval: 0.44 to 0.98; p = 0.04). Individuals in the lowest tertile of circulating ANGPTL3 concentrations, compared with the highest, had reduced odds of MI (adjusted odds ratio: 0.65; 95% confidence interval: 0.55 to 0.77; p < 0.001).

CONCLUSIONS: ANGPTL3 deficiency is associated with protection from CAD.

%B J Am Coll Cardiol %V 69 %P 2054-2063 %8 2017 Apr 25 %G eng %N 16 %1 https://www.ncbi.nlm.nih.gov/pubmed/28385496?dopt=Abstract %R 10.1016/j.jacc.2017.02.030 %0 Journal Article %J Circulation %D 2017 %T Is Coronary Atherosclerosis One Disease or Many? Setting Realistic Expectations for Precision Medicine. %A Khera, Amit V %A Kathiresan, Sekar %K Atherosclerosis %K Coronary Artery Disease %K Humans %K Precision Medicine %B Circulation %V 135 %P 1005-1007 %8 2017 03 14 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/28289003?dopt=Abstract %R 10.1161/CIRCULATIONAHA.116.026479 %0 Journal Article %J Neuron %D 2017 %T cTag-PAPERCLIP Reveals Alternative Polyadenylation Promotes Cell-Type Specific Protein Diversity and Shifts Araf Isoforms with Microglia Activation. %A Hwang, Hun-Way %A Saito, Yuhki %A Park, Christopher Y %A Blachère, Nathalie E %A Tajima, Yoko %A Fak, John J %A Zucker-Scharff, Ilana %A Darnell, Robert B %K Animals %K Antigens, Neoplasm %K Astrocytes %K Brain %K Cells, Cultured %K Female %K Humans %K Male %K Mice %K Microglia %K Nerve Tissue Proteins %K Neuro-Oncological Ventral Antigen %K Neurons %K Organ Specificity %K Polyadenylation %K Polypyrimidine Tract-Binding Protein %K Protein Isoforms %K Protein Serine-Threonine Kinases %K RNA-Binding Proteins %X

Alternative polyadenylation (APA) is increasingly recognized to regulate gene expression across different cell types, but obtaining APA maps from individual cell types typically requires prior purification, a stressful procedure that can itself alter cellular states. Here, we describe a new platform, cTag-PAPERCLIP, that generates APA profiles from single cell populations in intact tissues; cTag-PAPERCLIP requires no tissue dissociation and preserves transcripts in native states. Applying cTag-PAPERCLIP to profile four major cell types in the mouse brain revealed common APA preferences between excitatory and inhibitory neurons distinct from astrocytes and microglia, regulated in part by neuron-specific RNA-binding proteins NOVA2 and PTBP2. We further identified a role of APA in switching Araf protein isoforms during microglia activation, impacting production of downstream inflammatory cytokines. Our results demonstrate the broad applicability of cTag-PAPERCLIP and a previously undiscovered role of APA in contributing to protein diversity between different cell types and cellular states within the brain.

%B Neuron %V 95 %P 1334-1349.e5 %8 2017 Sep 13 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/28910620?dopt=Abstract %R 10.1016/j.neuron.2017.08.024 %0 Journal Article %J Nat Genet %D 2017 %T Disruption of the ATXN1-CIC complex causes a spectrum of neurobehavioral phenotypes in mice and humans. %A Lu, Hsiang-Chih %A Tan, Qiumin %A Rousseaux, Maxime W C %A Wang, Wei %A Kim, Ji-Yoen %A Richman, Ronald %A Wan, Ying-Wooi %A Yeh, Szu-Ying %A Patel, Jay M %A Liu, Xiuyun %A Lin, Tao %A Lee, Yoontae %A Fryer, John D %A Han, Jing %A Chahrour, Maria %A Finnell, Richard H %A Lei, Yunping %A Zurita-Jimenez, Maria E %A Ahimaz, Priyanka %A Anyane-Yeboa, Kwame %A Van Maldergem, Lionel %A Lehalle, Daphne %A Jean-Marcais, Nolwenn %A Mosca-Boidron, Anne-Laure %A Thevenon, Julien %A Cousin, Margot A %A Bro, Della E %A Lanpher, Brendan C %A Klee, Eric W %A Alexander, Nora %A Bainbridge, Matthew N %A Orr, Harry T %A Sillitoe, Roy V %A Ljungberg, M Cecilia %A Liu, Zhandong %A Schaaf, Christian P %A Zoghbi, Huda Y %K Animals %K Ataxin-1 %K Autism Spectrum Disorder %K Cerebellum %K Female %K Humans %K Intellectual Disability %K Interpersonal Relations %K Male %K Mice %K Nerve Tissue Proteins %K Neurodegenerative Diseases %K Nuclear Proteins %K Phenotype %K Repressor Proteins %X

Gain-of-function mutations in some genes underlie neurodegenerative conditions, whereas loss-of-function mutations in the same genes have distinct phenotypes. This appears to be the case with the protein ataxin 1 (ATXN1), which forms a transcriptional repressor complex with capicua (CIC). Gain of function of the complex leads to neurodegeneration, but ATXN1-CIC is also essential for survival. We set out to understand the functions of the ATXN1-CIC complex in the developing forebrain and found that losing this complex results in hyperactivity, impaired learning and memory, and abnormal maturation and maintenance of upper-layer cortical neurons. We also found that CIC activity in the hypothalamus and medial amygdala modulates social interactions. Informed by these neurobehavioral features in mouse mutants, we identified five individuals with de novo heterozygous truncating mutations in CIC who share similar clinical features, including intellectual disability, attention deficit/hyperactivity disorder (ADHD), and autism spectrum disorder. Our study demonstrates that loss of ATXN1-CIC complexes causes a spectrum of neurobehavioral phenotypes.

%B Nat Genet %V 49 %P 527-536 %8 2017 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/28288114?dopt=Abstract %R 10.1038/ng.3808 %0 Journal Article %J Obesity (Silver Spring) %D 2017 %T Exome sequencing reveals novel genetic loci influencing obesity-related traits in Hispanic children. %A Sabo, Aniko %A Mishra, Pamela %A Dugan-Perez, Shannon %A Voruganti, V Saroja %A Kent, Jack W %A Kalra, Divya %A Cole, Shelley A %A Comuzzie, Anthony G %A Muzny, Donna M %A Gibbs, Richard A %A Butte, Nancy F %K Adolescent %K ATPases Associated with Diverse Cellular Activities %K Body Mass Index %K Body Weight %K Child %K Child, Preschool %K Cohort Studies %K Exome %K Genetic Loci %K Genome-Wide Association Study %K Hispanic or Latino %K Humans %K Membrane Proteins %K Pediatric Obesity %K Polymorphism, Single Nucleotide %K Risk Factors %K Sequence Analysis, DNA %K Software %K Waist Circumference %K Young Adult %X

OBJECTIVE: To perform whole exome sequencing in 928 Hispanic children and identify variants and genes associated with childhood obesity.

METHODS: Single-nucleotide variants (SNVs) were identified from Illumina whole exome sequencing data using integrated read mapping, variant calling, and an annotation pipeline (Mercury). Association analyses of 74 obesity-related traits and exonic variants were performed using SeqMeta software. Rare autosomal variants were analyzed using gene-based association analyses, and common autosomal variants were analyzed at the SNV level.

RESULTS: (1) Rare exonic variants in 10 genes and 16 common SNVs in 11 genes that were associated with obesity traits in a cohort of Hispanic children were identified, (2) novel rare variants in peroxisome biogenesis factor 1 (PEX1) associated with several obesity traits (weight, weight z score, BMI, BMI z score, waist circumference, fat mass, trunk fat mass) were discovered, and (3) previously reported SNVs associated with childhood obesity were replicated.

CONCLUSIONS: Convergence of whole exome sequencing, a family-based design, and extensive phenotyping discovered novel rare and common variants associated with childhood obesity. Linking PEX1 to obesity phenotypes poses a novel mechanism of peroxisomal biogenesis and metabolism underlying the development of childhood obesity.

%B Obesity (Silver Spring) %V 25 %P 1270-1276 %8 2017 07 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28508493?dopt=Abstract %R 10.1002/oby.21869 %0 Journal Article %J Nat Genet %D 2017 %T Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. %A Huang, Yi-Fei %A Gulko, Brad %A Siepel, Adam %K Animals %K Base Sequence %K Computational Biology %K Evolution, Molecular %K Genetic Variation %K Genome %K Humans %K Mammals %K Metagenomics %K Phenotype %K Primates %K Vertebrates %X

Many genetic variants that influence phenotypes of interest are located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which, therefore, are likely to be phenotypically important. LINSIGHT combines a generalized linear model for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the 'big data' available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell type, tissue specificity, and constraints at associated promoters.

%B Nat Genet %V 49 %P 618-624 %8 2017 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/28288115?dopt=Abstract %R 10.1038/ng.3810 %0 Journal Article %J Trends Cardiovasc Med %D 2017 %T Genetic association studies in cardiovascular diseases: Do we have enough power? %A Auer, Paul L %A Stitziel, Nathan O %K Cardiovascular Diseases %K Data Accuracy %K Data Interpretation, Statistical %K Genetic Association Studies %K Genetic Markers %K Genetic Predisposition to Disease %K Genetic Variation %K Humans %K Phenotype %K Reproducibility of Results %K Research Design %K Risk Assessment %K Risk Factors %X

Genetic association studies have a long history of delivering insightful results for cardiovascular disease (CVD) research. Beginning with early candidate gene studies, to genome-wide association studies, and now on to newer whole-genome sequencing studies, research in human genetics has enriched our understanding of the pathobiology of CVD. As these studies continue to expand, the issue of statistical power plays an important role in study design as well as the interpretation of results. We provide an overview of the component parts that determine statistical power and preview the future of CVD genetic association studies through this lens.

%B Trends Cardiovasc Med %V 27 %P 397-404 %8 2017 08 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/28456354?dopt=Abstract %R 10.1016/j.tcm.2017.03.005 %0 Journal Article %J Nature %D 2017 %T Genetic effects on gene expression across human tissues. %A Battle, Alexis %A Brown, Christopher D %A Engelhardt, Barbara E %A Montgomery, Stephen B %K Alleles %K Chromosomes, Human %K Disease %K Female %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Variation %K Genome, Human %K Genotype %K Humans %K Male %K Organ Specificity %K Quantitative Trait Loci %X

Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

%B Nature %V 550 %P 204-213 %8 2017 10 11 %G eng %N 7675 %1 https://www.ncbi.nlm.nih.gov/pubmed/29022597?dopt=Abstract %R 10.1038/nature24277 %0 Journal Article %J Nat Commun %D 2017 %T Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. %A Kim-Hellmuth, Sarah %A Bechheim, Matthias %A Pütz, Benno %A Mohammadi, Pejman %A Nédélec, Yohann %A Giangreco, Nicholas %A Becker, Jessica %A Kaiser, Vera %A Fricker, Nadine %A Beier, Esther %A Boor, Peter %A Castel, Stephane E %A Nöthen, Markus M %A Barreiro, Luis B %A Pickrell, Joseph K %A Müller-Myhsok, Bertram %A Lappalainen, Tuuli %A Schumacher, Johannes %A Hornung, Veit %K Acetylmuramyl-Alanyl-Isoglutamine %K Adjuvants, Immunologic %K Adolescent %K Adult %K Autoimmune Diseases %K Gene Expression %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Predisposition to Disease %K Healthy Volunteers %K Humans %K Indicators and Reagents %K Lipids %K Lipopolysaccharides %K Male %K Monocytes %K Quantitative Trait Loci %K Regulatory Sequences, Nucleic Acid %K RNA, Double-Stranded %K RNA, Messenger %K Young Adult %X

The immune system plays a major role in human health and disease, and understanding genetic causes of interindividual variability of immune responses is vital. Here, we isolate monocytes from 134 genotyped individuals, stimulate these cells with three defined microbe-associated molecular patterns (LPS, MDP, and 5'-ppp-dsRNA), and profile the transcriptomes at three time points. Mapping expression quantitative trait loci (eQTL), we identify 417 response eQTLs (reQTLs) with varying effects between conditions. We characterize the dynamics of genetic regulation on early and late immune response and observe an enrichment of reQTLs in distal cis-regulatory elements. In addition, reQTLs are enriched for recent positive selection with an evolutionary trend towards enhanced immune response. Finally, we uncover reQTL effects in multiple GWAS loci and show a stronger enrichment for response than constant eQTLs in GWAS signals of several autoimmune diseases. This demonstrates the importance of infectious stimuli in modifying genetic predisposition to disease.

Insight into the genetic influence on the immune response is important for the understanding of interindividual variability in human pathologies. Here, the authors generate transcriptome data from human blood monocytes stimulated with various immune stimuli and provide a time-resolved response eQTL map.

%B Nat Commun %V 8 %P 266 %8 2017 08 16 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28814792?dopt=Abstract %R 10.1038/s41467-017-00366-1 %0 Journal Article %J Nat Rev Genet %D 2017 %T Genetics of coronary artery disease: discovery, biology and clinical translation. %A Khera, Amit V %A Kathiresan, Sekar %K Animals %K Coronary Artery Disease %K Humans %K Precision Medicine %K Translational Research, Biomedical %X

Coronary artery disease is the leading global cause of mortality. The disease has long been recognized to be heritable, and recent advances have started to unravel its genetic architecture. Common variant association studies have linked approximately 60 genetic loci to coronary risk. Large-scale gene sequencing efforts and functional studies have facilitated a better understanding of causal risk factors, elucidated underlying biology and informed the development of new therapeutics. Moving forwards, genetic testing could enable precision medicine approaches by identifying subgroups of patients at increased risk of coronary artery disease or those with a specific driving pathophysiology in whom a therapeutic or preventive approach would be most useful.

%B Nat Rev Genet %V 18 %P 331-344 %8 2017 06 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/28286336?dopt=Abstract %R 10.1038/nrg.2016.160 %0 Journal Article %J Nat Methods %D 2017 %T Genome-wide profiling of heritable and de novo STR variations. %A Willems, Thomas %A Zielinski, Dina %A Yuan, Jie %A Gordon, Assaf %A Gymrek, Melissa %A Erlich, Yaniv %K Algorithms %K Chromosome Mapping %K DNA Fingerprinting %K Genetic Predisposition to Disease %K Genetic Variation %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Microsatellite Repeats %K Sequence Alignment %K Sequence Analysis, DNA %K Software %X

Short tandem repeats (STRs) are highly variable elements that play a pivotal role in multiple genetic diseases, population genetics applications, and forensic casework. However, it has proven problematic to genotype STRs from high-throughput sequencing data. Here, we describe HipSTR, a novel haplotype-based method for robustly genotyping and phasing STRs from Illumina sequencing data, and we report a genome-wide analysis and validation of de novo STR mutations. HipSTR is freely available at https://hipstr-tool.github.io/HipSTR.

%B Nat Methods %V 14 %P 590-592 %8 2017 Jun %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/28436466?dopt=Abstract %R 10.1038/nmeth.4267 %0 Journal Article %J Cell %D 2017 %T Genomic Patterns of De Novo Mutation in Simplex Autism. %A Turner, Tychele N %A Coe, Bradley P %A Dickel, Diane E %A Hoekzema, Kendra %A Nelson, Bradley J %A Zody, Michael C %A Kronenberg, Zev N %A Hormozdiari, Fereydoun %A Raja, Archana %A Pennacchio, Len A %A Darnell, Robert B %A Eichler, Evan E %K Animals %K Autistic Disorder %K DNA Copy Number Variations %K DNA Mutational Analysis %K Female %K Genome-Wide Association Study %K Humans %K INDEL Mutation %K Male %K Mice %K Polymorphism, Single Nucleotide %X

To further our understanding of the genetic etiology of autism, we generated and analyzed genome sequence data from 516 idiopathic autism families (2,064 individuals). This resource includes >59 million single-nucleotide variants (SNVs) and 9,212 private copy number variants (CNVs), of which 133,992 and 88 are de novo mutations (DNMs), respectively. We estimate a mutation rate of ∼1.5 × 10 SNVs per site per generation with a significantly higher mutation rate in repetitive DNA. Comparing probands and unaffected siblings, we observe several DNM trends. Probands carry more gene-disruptive CNVs and SNVs, resulting in severe missense mutations and mapping to predicted fetal brain promoters and embryonic stem cell enhancers. These differences become more pronounced for autism genes (p = 1.8 × 10, OR = 2.2). Patients are more likely to carry multiple coding and noncoding DNMs in different genes, which are enriched for expression in striatal neurons (p = 3 × 10), suggesting a path forward for genetically characterizing more complex cases of autism.

%B Cell %V 171 %P 710-722.e12 %8 2017 Oct 19 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/28965761?dopt=Abstract %R 10.1016/j.cell.2017.08.047 %0 Journal Article %J Curr Opin Lipidol %D 2017 %T Human genetic insights into lipoproteins and risk of cardiometabolic disease. %A Stitziel, Nathan O %K Cardiovascular Diseases %K Genetic Predisposition to Disease %K Humans %K Lipoproteins %K Risk %X

PURPOSE OF REVIEW: Human genetic studies have been successfully used to identify genes and pathways relevant to human biology. Using genetic instruments composed of loci associated with human lipid traits, recent studies have begun to clarify the causal role of major lipid fractions in risk of cardiometabolic disease.

RECENT FINDINGS: The causal relationship between LDL cholesterol and coronary disease has been firmly established. Of the remaining two major fractions, recent studies have found that HDL cholesterol is not likely to be a causal particle in atherogenesis, and have instead shifted the causal focus to triglyceride-rich lipoproteins. Subsequent results are refining this view to suggest that triglycerides themselves might not be causal, but instead may be a surrogate for the causal cholesterol content within this fraction. Other studies have used a similar approach to address the association between lipid fractions and risk of type 2 diabetes. Beyond genetic variation in the target of statin medications, reduced LDL cholesterol associated with multiple genes encoding current or prospective drug targets was associated with increased diabetic risk. In addition, genetically lower HDL cholesterol and genetically lower triglycerides both appear to increase risk of type 2 diabetes.

SUMMARY: Results of these and future human genetic studies are positioned to provide substantive insights into the causal relationship between lipids and human disease, and should highlight mechanisms with important implications for our understanding of human biology and future lipid-altering therapeutic development.

%B Curr Opin Lipidol %V 28 %P 113-119 %8 2017 Apr %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/28059951?dopt=Abstract %R 10.1097/MOL.0000000000000389 %0 Journal Article %J Nat Genet %D 2017 %T The impact of structural variation on human gene expression. %A Chiang, Colby %A Scott, Alexandra J %A Davis, Joe R %A Tsang, Emily K %A Li, Xin %A Kim, Yungil %A Hadzic, Tarik %A Damani, Farhan N %A Ganel, Liron %A Montgomery, Stephen B %A Battle, Alexis %A Conrad, Donald F %A Hall, Ira M %K Algorithms %K Chromosome Mapping %K Gene Expression Regulation %K Genetic Variation %K Genome, Human %K Genome-Wide Association Study %K Humans %K INDEL Mutation %K Linear Models %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Sequence Analysis, DNA %X

Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs (a substantially higher fraction than prior estimates) and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.

%B Nat Genet %V 49 %P 692-699 %8 2017 May %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/28369037?dopt=Abstract %R 10.1038/ng.3834 %0 Journal Article %J Circulation %D 2017 %T Polygenic Risk Score Identifies Subgroup With Higher Burden of Atherosclerosis and Greater Relative Benefit From Statin Therapy in the Primary Prevention Setting. %A Natarajan, Pradeep %A Young, Robin %A Stitziel, Nathan O %A Padmanabhan, Sandosh %A Baber, Usman %A Mehran, Roxana %A Sartori, Samantha %A Fuster, Valentin %A Reilly, Dermot F %A Butterworth, Adam %A Rader, Daniel J %A Ford, Ian %A Sattar, Naveed %A Kathiresan, Sekar %K Adolescent %K Adult %K Aged %K Aged, 80 and over %K Atherosclerosis %K Cohort Studies %K Cost of Illness %K Female %K Humans %K Hydroxymethylglutaryl-CoA Reductase Inhibitors %K Male %K Middle Aged %K Multifactorial Inheritance %K Primary Prevention %K Risk Factors %K Young Adult %X

BACKGROUND: Relative risk reduction with statin therapy has been consistent across nearly all subgroups studied to date. However, in analyses of 2 randomized controlled primary prevention trials (ASCOT [Anglo-Scandinavian Cardiac Outcomes Trial-Lipid-Lowering Arm] and JUPITER [Justification for the Use of Statins in Prevention: An Intervention Trial Evaluating Rosuvastatin]), statin therapy led to a greater relative risk reduction among a subgroup at high genetic risk. Here, we aimed to confirm this observation in a third primary prevention randomized controlled trial. In addition, we assessed whether those at high genetic risk had a greater burden of subclinical coronary atherosclerosis.

METHODS: We studied participants from a randomized controlled trial of primary prevention with statin therapy (WOSCOPS [West of Scotland Coronary Prevention Study]; n=4910) and 2 observational cohort studies (CARDIA [Coronary Artery Risk Development in Young Adults] and BioImage; n=1154 and 4392, respectively). For each participant, we calculated a polygenic risk score derived from up to 57 common DNA sequence variants previously associated with coronary heart disease. We compared the relative efficacy of statin therapy in those at high genetic risk (top quintile of polygenic risk score) versus all others (WOSCOPS), as well as the association between the polygenic risk score and coronary artery calcification (CARDIA) and carotid artery plaque burden (BioImage).

RESULTS: Among WOSCOPS trial participants at high genetic risk, statin therapy was associated with a relative risk reduction of 44% (95% confidence interval [CI], 22-60; P<0.001), whereas in all others, the relative risk reduction was 24% (95% CI, 8-37; P=0.004) despite similar low-density lipoprotein cholesterol lowering. In a study-level meta-analysis across the WOSCOPS, ASCOT, and JUPITER primary prevention trials, relative risk reduction in those at high genetic risk was 46% versus 26% in all others (P for heterogeneity=0.05). Across all 3 studies, the absolute risk reduction with statin therapy was 3.6% (95% CI, 2.0-5.1) among those in the high genetic risk group and 1.3% (95% CI, 0.6-1.9) in all others. Each 1-SD increase in the polygenic risk score was associated with 1.32-fold (95% CI, 1.04-1.68) greater likelihood of having coronary artery calcification and 9.7% higher (95% CI, 2.2-17.8) burden of carotid plaque.

CONCLUSIONS: Those at high genetic risk have a greater burden of subclinical atherosclerosis and derive greater relative and absolute benefit from statin therapy to prevent a first coronary heart disease event.

CLINICAL TRIAL REGISTRATION: URL: http://www.clinicaltrials.gov. Unique identifiers: NCT00738725 (BioImage) and NCT00005130 (CARDIA). WOSCOPS was carried out and completed before the requirement for clinical trial registration.

%B Circulation %V 135 %P 2091-2101 %8 2017 May 30 %G eng %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/28223407?dopt=Abstract %R 10.1161/CIRCULATIONAHA.116.024436 %0 Journal Article %J Am J Hum Genet %D 2017 %T Practical Approaches for Whole-Genome Sequence Analysis of Heart- and Blood-Related Traits. %A Morrison, Alanna C %A Huang, Zhuoyi %A Yu, Bing %A Metcalf, Ginger %A Liu, Xiaoming %A Ballantyne, Christie %A Coresh, Josef %A Yu, Fuli %A Muzny, Donna %A Feofanova, Elena %A Rustagi, Navin %A Gibbs, Richard %A Boerwinkle, Eric %K Black or African American %K C-Reactive Protein %K Cholesterol, HDL %K Cholesterol, LDL %K Chromosomes, Human, Pair 9 %K Gene Frequency %K Genome, Human %K Genome-Wide Association Study %K Genomics %K Hemoglobins %K Humans %K Introns %K Leukocyte Count %K Lipoprotein(a) %K Magnesium %K Natriuretic Peptide, Brain %K Neutrophils %K Peptide Fragments %K Phosphorus %K Platelet Count %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Troponin T %K White People %X

Whole-genome sequencing (WGS) allows for a comprehensive view of the sequence of the human genome. We present and apply integrated methodologic steps for interrogating WGS data to characterize the genetic architecture of 10 heart- and blood-related traits in a sample of 1,860 African Americans. In order to evaluate the contribution of regulatory and non-protein coding regions of the genome, we conducted aggregate tests of rare variation across the entire genomic landscape using a sliding window, complemented by an annotation-based assessment of the genome using predefined regulatory elements and within the first intron of all genes. These tests were performed treating all variants equally as well as with individual variants weighted by a measure of predicted functional consequence. Significant findings were assessed in 1,705 individuals of European ancestry. After these steps, we identified and replicated components of the genomic landscape significantly associated with heart- and blood-related traits. For two traits, lipoprotein(a) levels and neutrophil count, aggregate tests of low-frequency and rare variation were significantly associated across multiple motifs. For a third trait, cardiac troponin T, investigation of regulatory domains identified a locus on chromosome 9. These practical approaches for WGS analysis led to the identification of informative genomic regions and also showed that defined non-coding regions, such as first introns of genes and regulatory domains, are associated with important risk factor phenotypes. This study illustrates the tractable nature of WGS data and outlines an approach for characterizing the genetic architecture of complex traits.

%B Am J Hum Genet %V 100 %P 205-215 %8 2017 Feb 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/28089252?dopt=Abstract %R 10.1016/j.ajhg.2016.12.009 %0 Journal Article %J Genome Res %D 2017 %T Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change. %A Mohammadi, Pejman %A Castel, Stephane E %A Brown, Andrew A %A Lappalainen, Tuuli %K Alleles %K Databases, Genetic %K Gene Expression %K Gene Expression Profiling %K Gene Regulatory Networks %K Genetic Variation %K Humans %K Models, Theoretical %K Quantitative Trait Loci %X

Mapping cis-acting expression quantitative trait loci (cis-eQTL) has become a popular approach for characterizing proximal genetic regulatory variants. In this paper, we describe and characterize log allelic fold change (aFC), the magnitude of expression change associated with a given genetic variant, as a biologically interpretable unit for quantifying the effect size of cis-eQTLs and a mathematically convenient approach for systematic modeling of cis-regulation. This measure is mathematically independent from expression level and allele frequency, additive, applicable to multiallelic variants, and generalizable to multiple independent variants. We provide efficient tools and guidelines for estimating aFC from both eQTL and allelic expression data sets and apply it to Genotype Tissue Expression (GTEx) data. We show that aFC estimates independently derived from eQTL and allelic expression data are highly consistent, and identify technical and biological correlates of eQTL effect size. We generalize aFC to analyze genes with two eQTLs in GTEx and show that in nearly all cases the two eQTLs act independently in regulating gene expression. In summary, aFC is a solid measure of cis-regulatory effect size that allows quantitative interpretation of cellular regulatory events from population data, and it is a valuable approach for investigating novel aspects of eQTL data sets.

%B Genome Res %V 27 %P 1872-1884 %8 2017 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/29021289?dopt=Abstract %R 10.1101/gr.216747.116 %0 Journal Article %J N Engl J Med %D 2017 %T Resolution of Disease Phenotypes Resulting from Multilocus Genomic Variation. %A Posey, Jennifer E %A Harel, Tamar %A Liu, Pengfei %A Rosenfeld, Jill A %A James, Regis A %A Coban Akdemir, Zeynep H %A Walkiewicz, Magdalena %A Bi, Weimin %A Xiao, Rui %A Ding, Yan %A Xia, Fan %A Beaudet, Arthur L %A Muzny, Donna M %A Gibbs, Richard A %A Boerwinkle, Eric %A Eng, Christine M %A Sutton, V Reid %A Shaw, Chad A %A Plon, Sharon E %A Yang, Yaping %A Lupski, James R %K Exome %K Genetic Diseases, Inborn %K Genetic Variation %K Genotyping Techniques %K High-Throughput Nucleotide Sequencing %K Humans %K Phenotype %K Retrospective Studies %K Sequence Analysis, DNA %X

BACKGROUND: Whole-exome sequencing can provide insight into the relationship between observed clinical phenotypes and underlying genotypes.

METHODS: We conducted a retrospective analysis of data from a series of 7374 consecutive unrelated patients who had been referred to a clinical diagnostic laboratory for whole-exome sequencing; our goal was to determine the frequency and clinical characteristics of patients for whom more than one molecular diagnosis was reported. The phenotypic similarity between molecularly diagnosed pairs of diseases was calculated with the use of terms from the Human Phenotype Ontology.

RESULTS: A molecular diagnosis was rendered for 2076 of 7374 patients (28.2%); among these patients, 101 (4.9%) had diagnoses that involved two or more disease loci. We also analyzed parental samples, when available, and found that de novo variants accounted for 67.8% (61 of 90) of pathogenic variants in autosomal dominant disease genes and 51.7% (15 of 29) of pathogenic variants in X-linked disease genes; both variants were de novo in 44.7% (17 of 38) of patients with two monoallelic variants. Causal copy-number variants were found in 12 patients (11.9%) with multiple diagnoses. Phenotypic similarity scores were significantly lower among patients in whom the phenotype resulted from two distinct mendelian disorders that affected different organ systems (50 patients) than among patients with disorders that had overlapping phenotypic features (30 patients) (median score, 0.21 vs. 0.36; P=1.77×10).

CONCLUSIONS: In our study, we found multiple molecular diagnoses in 4.9% of cases in which whole-exome sequencing was informative. Our results show that structured clinical ontologies can be used to determine the degree of overlap between two mendelian diseases in the same patient; the diseases can be distinct or overlapping. Distinct disease phenotypes affect different organ systems, whereas overlapping disease phenotypes are more likely to be caused by two genes encoding proteins that interact within the same pathway. (Funded by the National Institutes of Health and the Ting Tsung and Wei Fong Chao Foundation.).

%B N Engl J Med %V 376 %P 21-31 %8 2017 Jan 05 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/27959697?dopt=Abstract %R 10.1056/NEJMoa1516767 %0 Journal Article %J Bioinformatics %D 2017 %T SVScore: an impact prediction tool for structural variation. %A Ganel, Liron %A Abel, Haley J %A Hall, Ira M %K Gene Frequency %K Genomic Structural Variation %K Genomics %K Humans %K Polymorphism, Single Nucleotide %K Sequence Deletion %K Software %X

Summary: Here we present SVScore, a tool for in silico structural variation (SV) impact prediction. SVScore aggregates per-base single nucleotide polymorphism (SNP) pathogenicity scores across relevant genomic intervals for each SV in a manner that considers variant type, gene features and positional uncertainty. We show that the allele frequency spectrum of high-scoring SVs is strongly skewed toward lower frequencies, suggesting that they are under purifying selection, and that SVScore identifies deleterious variants more effectively than alternative methods. Notably, our results also suggest that duplications are under surprisingly strong selection relative to deletions, and that there are a similar number of strongly pathogenic SVs and SNPs in the human population.

Availability and Implementation: SVScore is implemented in Perl and available freely at http://www.github.com/lganel/SVScore for use under the MIT license.

Contact: ihall@wustl.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 33 %P 1083-1085 %8 2017 04 01 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28031184?dopt=Abstract %R 10.1093/bioinformatics/btw789 %0 Journal Article %J Cell %D 2017 %T Type 2 Diabetes Variants Disrupt Function of SLC16A11 through Two Distinct Mechanisms. %A Rusu, Victor %A Hoch, Eitan %A Mercader, Josep M %A Tenen, Danielle E %A Gymrek, Melissa %A Hartigan, Christina R %A DeRan, Michael %A von Grotthuss, Marcin %A Fontanillas, Pierre %A Spooner, Alexandra %A Guzman, Gaelen %A Deik, Amy A %A Pierce, Kerry A %A Dennis, Courtney %A Clish, Clary B %A Carr, Steven A %A Wagner, Bridget K %A Schenone, Monica %A Ng, Maggie C Y %A Chen, Brian H %A Centeno-Cruz, Federico %A Zerrweck, Carlos %A Orozco, Lorena %A Altshuler, David M %A Schreiber, Stuart L %A Florez, Jose C %A Jacobs, Suzanne B R %A Lander, Eric S %K Basigin %K Cell Membrane %K Chromosomes, Human, Pair 17 %K Diabetes Mellitus, Type 2 %K Gene Knockdown Techniques %K Haplotypes %K Hepatocytes %K Heterozygote %K Histone Code %K Humans %K Liver %K Models, Molecular %K Monocarboxylic Acid Transporters %X

Type 2 diabetes (T2D) affects Latinos at twice the rate seen in populations of European descent. We recently identified a risk haplotype spanning SLC16A11 that explains ∼20% of the increased T2D prevalence in Mexico. Here, through genetic fine-mapping, we define a set of tightly linked variants likely to contain the causal allele(s). We show that variants on the T2D-associated haplotype have two distinct effects: (1) decreasing SLC16A11 expression in liver and (2) disrupting a key interaction with basigin, thereby reducing cell-surface localization. Both independent mechanisms reduce SLC16A11 function and suggest SLC16A11 is the causal gene at this locus. To gain insight into how SLC16A11 disruption impacts T2D risk, we demonstrate that SLC16A11 is a proton-coupled monocarboxylate transporter and that genetic perturbation of SLC16A11 induces changes in fatty acid and lipid metabolism that are associated with increased T2D risk. Our findings suggest that increasing SLC16A11 function could be therapeutically beneficial for T2D.

%B Cell %V 170 %P 199-212.e20 %8 2017 Jun 29 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28666119?dopt=Abstract %R 10.1016/j.cell.2017.06.011 %0 Journal Article %J Cell %D 2016 %T Concerted Genetic Function in Blood Traits. %A Kim-Hellmuth, Sarah %A Lappalainen, Tuuli %K Genetic Predisposition to Disease %K Genetic Variation %K Genome-Wide Association Study %K Humans %K Phenotype %K Polymorphism, Single Nucleotide %X

The hematopoietic system plays a major role in human health. Two studies by Astle et al. and Chen et al. published in this issue of Cell use genome-wide association and functional genomics approaches to provide deep insights into the role of genetic variants in hematological traits. We discuss these discoveries and future strategies toward completing our understanding of the genetic basis for variation in human traits.

%B Cell %V 167 %P 1167-1169 %8 2016 11 17 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/27863238?dopt=Abstract %R 10.1016/j.cell.2016.10.055 %0 Journal Article %J N Engl J Med %D 2016 %T Genetic Risk, Adherence to a Healthy Lifestyle, and Coronary Disease. %A Khera, Amit V %A Emdin, Connor A %A Drake, Isabel %A Natarajan, Pradeep %A Bick, Alexander G %A Cook, Nancy R %A Chasman, Daniel I %A Baber, Usman %A Mehran, Roxana %A Rader, Daniel J %A Fuster, Valentin %A Boerwinkle, Eric %A Melander, Olle %A Orho-Melander, Marju %A Ridker, Paul M %A Kathiresan, Sekar %K Aged %K Cohort Studies %K Coronary Disease %K Cross-Sectional Studies %K Female %K Genetic Predisposition to Disease %K Healthy Lifestyle %K Humans %K Incidence %K Male %K Middle Aged %K Multifactorial Inheritance %K Patient Compliance %K Polymorphism, Genetic %K Risk %X

BACKGROUND: Both genetic and lifestyle factors contribute to individual-level risk of coronary artery disease. The extent to which increased genetic risk can be offset by a healthy lifestyle is unknown.

METHODS: Using a polygenic score of DNA sequence polymorphisms, we quantified genetic risk for coronary artery disease in three prospective cohorts - 7814 participants in the Atherosclerosis Risk in Communities (ARIC) study, 21,222 in the Women's Genome Health Study (WGHS), and 22,389 in the Malmö Diet and Cancer Study (MDCS) - and in 4260 participants in the cross-sectional BioImage Study for whom genotype and covariate data were available. We also determined adherence to a healthy lifestyle among the participants using a scoring system consisting of four factors: no current smoking, no obesity, regular physical activity, and a healthy diet.

RESULTS: The relative risk of incident coronary events was 91% higher among participants at high genetic risk (top quintile of polygenic scores) than among those at low genetic risk (bottom quintile of polygenic scores) (hazard ratio, 1.91; 95% confidence interval [CI], 1.75 to 2.09). A favorable lifestyle (defined as at least three of the four healthy lifestyle factors) was associated with a substantially lower risk of coronary events than an unfavorable lifestyle (defined as no or only one healthy lifestyle factor), regardless of the genetic risk category. Among participants at high genetic risk, a favorable lifestyle was associated with a 46% lower relative risk of coronary events than an unfavorable lifestyle (hazard ratio, 0.54; 95% CI, 0.47 to 0.63). This finding corresponded to a reduction in the standardized 10-year incidence of coronary events from 10.7% for an unfavorable lifestyle to 5.1% for a favorable lifestyle in ARIC, from 4.6% to 2.0% in WGHS, and from 8.2% to 5.3% in MDCS. In the BioImage Study, a favorable lifestyle was associated with significantly less coronary-artery calcification within each genetic risk category.

CONCLUSIONS: Across four studies involving 55,685 participants, genetic and lifestyle factors were independently associated with susceptibility to coronary artery disease. Among participants at high genetic risk, a favorable lifestyle was associated with a nearly 50% lower relative risk of coronary artery disease than was an unfavorable lifestyle. (Funded by the National Institutes of Health and others.)

%B N Engl J Med %V 375 %P 2349-2358 %8 2016 Dec 15 %G eng %N 24 %1 https://www.ncbi.nlm.nih.gov/pubmed/27959714?dopt=Abstract %R 10.1056/NEJMoa1605086 %0 Journal Article %J Am J Hum Genet %D 2016 %T Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA. %A Turner, Tychele N %A Hormozdiari, Fereydoun %A Duyzend, Michael H %A McClymont, Sarah A %A Hook, Paul W %A Iossifov, Ivan %A Raja, Archana %A Baker, Carl %A Hoekzema, Kendra %A Stessman, Holly A %A Zody, Michael C %A Nelson, Bradley J %A Huddleston, John %A Sandstrom, Richard %A Smith, Joshua D %A Hanna, David %A Swanson, James M %A Faustman, Elaine M %A Bamshad, Michael J %A Stamatoyannopoulos, John %A Nickerson, Deborah A %A McCallion, Andrew S %A Darnell, Robert %A Eichler, Evan E %K Autistic Disorder %K DNA %K Exome %K Female %K Genome, Human %K Humans %K Male %K Pedigree %K Polymorphism, Single Nucleotide %X

We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.

%B Am J Hum Genet %V 98 %P 58-74 %8 2016 Jan 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26749308?dopt=Abstract %R 10.1016/j.ajhg.2015.11.023