%0 Journal Article %J Am J Hum Genet %D 2021 %T Association of structural variation with cardiometabolic traits in Finns. %A Chen, Lei %A Abel, Haley J %A Das, Indraniel %A Larson, David E %A Ganel, Liron %A Kanchi, Krishna L %A Regier, Allison A %A Young, Erica P %A Kang, Chul Joo %A Scott, Alexandra J %A Chiang, Colby %A Wang, Xinxin %A Lu, Shuangjia %A Christ, Ryan %A Service, Susan K %A Chiang, Charleston W K %A Havulinna, Aki S %A Kuusisto, Johanna %A Boehnke, Michael %A Laakso, Markku %A Palotie, Aarno %A Ripatti, Samuli %A Freimer, Nelson B %A Locke, Adam E %A Stitziel, Nathan O %A Hall, Ira M %K Alleles %K Cardiovascular Diseases %K Cholesterol %K DNA Copy Number Variations %K Female %K Finland %K Genome, Human %K Genomic Structural Variation %K Genotype %K High-Throughput Nucleotide Sequencing %K Humans %K Male %K Mitochondrial Proteins %K Promoter Regions, Genetic %K Pyruvate Dehydrogenase (Lipoamide)-Phosphatase %K Pyruvic Acid %K Serum Albumin, Human %X

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10) and is also associated with increased levels of total cholesterol (p = 1.22 × 10) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10) and alanine (p = 6.14 × 10) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.

%B Am J Hum Genet %V 108 %P 583-596 %8 2021 04 01 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/33798444?dopt=Abstract %R 10.1016/j.ajhg.2021.03.008 %0 Journal Article %J Genome Med %D 2021 %T Genetic and non-genetic factors affecting the expression of COVID-19-relevant genes in the large airway epithelium. %A Kasela, Silva %A Ortega, Victor E %A Martorella, Molly %A Garudadri, Suresh %A Nguyen, Jenna %A Ampleford, Elizabeth %A Pasanen, Anu %A Nerella, Srilaxmi %A Buschur, Kristina L %A Barjaktarevic, Igor Z %A Barr, R Graham %A Bleecker, Eugene R %A Bowler, Russell P %A Comellas, Alejandro P %A Cooper, Christopher B %A Couper, David J %A Criner, Gerard J %A Curtis, Jeffrey L %A Han, MeiLan K %A Hansel, Nadia N %A Hoffman, Eric A %A Kaner, Robert J %A Krishnan, Jerry A %A Martinez, Fernando J %A McDonald, Merry-Lynn N %A Meyers, Deborah A %A Paine, Robert %A Peters, Stephen P %A Castro, Mario %A Denlinger, Loren C %A Erzurum, Serpil C %A Fahy, John V %A Israel, Elliot %A Jarjour, Nizar N %A Levy, Bruce D %A Li, Xingnan %A Moore, Wendy C %A Wenzel, Sally E %A Zein, Joe %A Langelier, Charles %A Woodruff, Prescott G %A Lappalainen, Tuuli %A Christenson, Stephanie A %K Adult %K Aged %K Aged, 80 and over %K Angiotensin-Converting Enzyme 2 %K Asthma %K Bronchi %K Cardiovascular Diseases %K COVID-19 %K Gene Expression %K Genetic Variation %K Humans %K Middle Aged %K Obesity %K Pulmonary Disease, Chronic Obstructive %K Quantitative Trait Loci %K Respiratory Mucosa %K Risk Factors %K SARS-CoV-2 %K Smoking %X

BACKGROUND: The large airway epithelial barrier provides one of the first lines of defense against respiratory viruses, including SARS-CoV-2, which causes COVID-19. Substantial inter-individual variability in disease course is hypothesized to be partially mediated by the differential regulation of the genes that interact with the SARS-CoV-2 virus or are involved in the subsequent host response. Here, we comprehensively investigated non-genetic and genetic factors influencing COVID-19-relevant bronchial epithelial gene expression.

METHODS: We analyzed RNA-sequencing data from bronchial epithelial brushings obtained from uninfected individuals. We related ACE2 gene expression to host and environmental factors in the SPIROMICS cohort of smokers with and without chronic obstructive pulmonary disease (COPD) and replicated these associations in two asthma cohorts, SARP and MAST. To identify airway biology beyond ACE2 binding that may contribute to increased susceptibility, we used gene set enrichment analyses to determine if gene expression changes indicative of a suppressed airway immune response observed early in SARS-CoV-2 infection are also observed in association with host factors. To identify host genetic variants affecting COVID-19 susceptibility in SPIROMICS, we performed expression quantitative trait locus (eQTL) mapping and investigated the phenotypic associations of the eQTL variants.

RESULTS: We found that ACE2 expression was higher in relation to active smoking, obesity, and hypertension, which are known risk factors for COVID-19 severity, while an association with interferon-related inflammation was driven by the truncated, non-binding ACE2 isoform. We discovered that expression patterns of a suppressed airway immune response to early SARS-CoV-2 infection, compared to other viruses, are similar to patterns associated with obesity, hypertension, and cardiovascular disease, which may thus contribute to a COVID-19-susceptible airway environment. eQTL mapping identified regulatory variants for genes implicated in COVID-19, some of which had pheWAS evidence for their potential role in respiratory infections.

CONCLUSIONS: These data provide evidence that clinically relevant variation in the expression of COVID-19-related genes is associated with host factors, environmental exposures, and likely host genetic variation.

%B Genome Med %V 13 %P 66 %8 2021 04 21 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33883027?dopt=Abstract %R 10.1186/s13073-021-00866-2 %0 Journal Article %J NPJ Genom Med %D 2021 %T Genetic profiles of 103,106 individuals in the Taiwan Biobank provide insights into the health and history of Han Chinese. %A Wei, Chun-Yu %A Yang, Jenn-Hwai %A Yeh, Erh-Chan %A Tsai, Ming-Fang %A Kao, Hsiao-Jung %A Lo, Chen-Zen %A Chang, Lung-Pao %A Lin, Wan-Jia %A Hsieh, Feng-Jen %A Belsare, Saurabh %A Bhaskar, Anand %A Su, Ming-Wei %A Lee, Te-Chang %A Lin, Yi-Ling %A Liu, Fu-Tong %A Shen, Chen-Yang %A Li, Ling-Hui %A Chen, Chien-Hsiun %A Wall, Jeffrey D %A Wu, Jer-Yuarn %A Kwok, Pui-Yan %X

Personalized medical care focuses on prediction of disease risk and response to medications. To build the risk models, access to both large-scale genomic resources and human genetic studies is required. The Taiwan Biobank (TWB) has generated high-coverage, whole-genome sequencing data from 1492 individuals and genome-wide SNP data from 103,106 individuals of Han Chinese ancestry using custom SNP arrays. Principal components analysis of the genotyping data showed that the full range of Han Chinese genetic variation was found in the cohort. The arrays also include thousands of known functional variants, allowing for simultaneous ascertainment of Mendelian disease-causing mutations and variants that affect drug metabolism. We found that 21.2% of the population are mutation carriers of autosomal recessive diseases, 3.1% have mutations in cancer-predisposing genes, and 87.3% carry variants that affect drug response. We highlight how TWB data provide insight into both population history and disease burden, while showing how widespread genetic testing can be used to improve clinical care.

%B NPJ Genom Med %V 6 %P 10 %8 2021 Feb 11 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33574314?dopt=Abstract %R 10.1038/s41525-021-00178-9 %0 Journal Article %J Science %D 2021 %T Haplotype-resolved diverse human genomes and integrated analysis of structural variation. %A Ebert, Peter %A Audano, Peter A %A Zhu, Qihui %A Rodriguez-Martin, Bernardo %A Porubsky, David %A Bonder, Marc Jan %A Sulovari, Arvis %A Ebler, Jana %A Zhou, Weichen %A Serra Mari, Rebecca %A Yilmaz, Feyza %A Zhao, Xuefang %A Hsieh, PingHsun %A Lee, Joyce %A Kumar, Sushant %A Lin, Jiadong %A Rausch, Tobias %A Chen, Yu %A Ren, Jingwen %A Santamarina, Martin %A Höps, Wolfram %A Ashraf, Hufsah %A Chuang, Nelson T %A Yang, Xiaofei %A Munson, Katherine M %A Lewis, Alexandra P %A Fairley, Susan %A Tallon, Luke J %A Clarke, Wayne E %A Basile, Anna O %A Byrska-Bishop, Marta %A Corvelo, André %A Evani, Uday S %A Lu, Tsung-Yu %A Chaisson, Mark J P %A Chen, Junjie %A Li, Chong %A Brand, Harrison %A Wenger, Aaron M %A Ghareghani, Maryam %A Harvey, William T %A Raeder, Benjamin %A Hasenfeld, Patrick %A Regier, Allison A %A Abel, Haley J %A Hall, Ira M %A Flicek, Paul %A Stegle, Oliver %A Gerstein, Mark B %A Tubio, Jose M C %A Mu, Zepeng %A Li, Yang I %A Shi, Xinghua %A Hastie, Alex R %A Ye, Kai %A Chong, Zechen %A Sanders, Ashley D %A Zody, Michael C %A Talkowski, Michael E %A Mills, Ryan E %A Devine, Scott E %A Lee, Charles %A Korbel, Jan O %A Marschall, Tobias %A Eichler, Evan E %K Female %K Genetic Variation %K Genome, Human %K Genotype %K Haplotypes %K High-Throughput Nucleotide Sequencing %K Humans %K INDEL Mutation %K Interspersed Repetitive Sequences %K Male %K Population Groups %K Quantitative Trait Loci %K Retroelements %K Sequence Analysis, DNA %K Sequence Inversion %K Whole Genome Sequencing %X

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.

%B Science %V 372 %8 2021 04 02 %G eng %N 6537 %1 https://www.ncbi.nlm.nih.gov/pubmed/33632895?dopt=Abstract %R 10.1126/science.abf7117 %0 Journal Article %J Eur J Hum Genet %D 2021 %T Phylogenetic history of patrilineages rare in northern and eastern Europe from large-scale re-sequencing of human Y-chromosomes. %A Ilumäe, Anne-Mai %A Post, Helen %A Flores, Rodrigo %A Karmin, Monika %A Sahakyan, Hovhannes %A Mondal, Mayukh %A Montinaro, Francesco %A Saag, Lauri %A Bormans, Concetta %A Sanchez, Luisa Fernanda %A Ameur, Adam %A Gyllensten, Ulf %A Kals, Mart %A Mägi, Reedik %A Pagani, Luca %A Behar, Doron M %A Rootsi, Siiri %A Villems, Richard %X

The most frequent Y-chromosomal (chrY) haplogroups in northern and eastern Europe (NEE) are well-known and thoroughly characterised. Yet a considerable number of men in every population carry rare paternal lineages with estimated frequencies around 5%. So far, limited sample sizes and insufficient resolution of genotyping have obstructed a truly comprehensive look into the variety of rare paternal lineages segregating within populations and potential signals of population history that such lineages might convey. Here we harness the power of massive re-sequencing of human Y chromosomes to identify previously unknown population-specific clusters among rare paternal lineages in NEE. We construct dated phylogenies for haplogroups E2-M215, J2-M172, G-M201 and Q-M242 on the basis of 421 (of them 282 novel) high-coverage chrY sequences collected from large-scale databases focusing on populations of NEE. Within these otherwise rare haplogroups we disclose lineages that began to radiate ~1-3 thousand years ago in Estonia and Sweden and reveal male phylogenetic patterns testifying to comparatively recent local demographic expansions. Conversely, haplogroup Q lineages bear evidence of ancient Siberian influence lingering in the modern paternal gene pool of northern Europe. We assess the possible direction of influx of ancestral carriers for some of these male lineages. In addition, we demonstrate the congruency of paternal haplogroup composition of our dataset with two independent population-based cohorts from Estonia and Sweden.

%B Eur J Hum Genet %8 2021 May 07 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/33958743?dopt=Abstract %R 10.1038/s41431-021-00897-8 %0 Journal Article %J Am J Hum Genet %D 2021 %T Whole-genome sequencing of African Americans implicates differential genetic architecture in inflammatory bowel disease. %A Somineni, Hari K %A Nagpal, Sini %A Venkateswaran, Suresh %A Cutler, David J %A Okou, David T %A Haritunians, Talin %A Simpson, Claire L %A Begum, Ferdouse %A Datta, Lisa W %A Quiros, Antonio J %A Seminerio, Jenifer %A Mengesha, Emebet %A Alexander, Jonathan S %A Baldassano, Robert N %A Dudley-Brown, Sharon %A Cross, Raymond K %A Dassopoulos, Themistocles %A Denson, Lee A %A Dhere, Tanvi A %A Iskandar, Heba %A Dryden, Gerald W %A Hou, Jason K %A Hussain, Sunny Z %A Hyams, Jeffrey S %A Isaacs, Kim L %A Kader, Howard %A Kappelman, Michael D %A Katz, Jeffry %A Kellermayer, Richard %A Kuemmerle, John F %A Lazarev, Mark %A Li, Ellen %A Mannon, Peter %A Moulton, Dedrick E %A Newberry, Rodney D %A Patel, Ashish S %A Pekow, Joel %A Saeed, Shehzad A %A Valentine, John F %A Wang, Ming-Hsi %A McCauley, Jacob L %A Abreu, Maria T %A Jester, Traci %A Molle-Rios, Zarela %A Palle, Sirish %A Scherl, Ellen J %A Kwon, John %A Rioux, John D %A Duerr, Richard H %A Silverberg, Mark S %A Zwick, Michael E %A Stevens, Christine %A Daly, Mark J %A Cho, Judy H %A Gibson, Greg %A McGovern, Dermot P B %A Brant, Steven R %A Kugathasan, Subra %K African Americans %K Aged %K Aged, 80 and over %K Calbindin 2 %K Colitis, Ulcerative %K Crohn Disease %K European Continental Ancestry Group %K Female %K Gene Frequency %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Inflammatory Bowel Diseases %K Male %K Multifactorial Inheritance %K Polymorphism, Single Nucleotide %K Receptors, Prostaglandin E, EP4 Subtype %K Whole Genome Sequencing %X

Whether or not populations diverge with respect to the genetic contribution to risk of specific complex diseases is relevant to understanding the evolution of susceptibility and origins of health disparities. Here, we describe a large-scale whole-genome sequencing study of inflammatory bowel disease encompassing 1,774 affected individuals and 1,644 healthy control Americans with African ancestry (African Americans). Although no new loci for inflammatory bowel disease are discovered at genome-wide significance levels, we identify numerous instances of differential effect sizes in combination with divergent allele frequencies. For example, the major effect at PTGER4 fine maps to a single credible interval of 22 SNPs corresponding to one of four independent associations at the locus in European ancestry individuals but with an elevated odds ratio for Crohn disease in African Americans. A rare variant aggregate analysis implicates Ca-binding neuro-immunomodulator CALB2 in ulcerative colitis. Highly significant overall overlap of common variant risk for inflammatory bowel disease susceptibility between individuals with African and European ancestries was observed, with 41 of 241 previously known lead variants replicated and overall correlations in effect sizes of 0.68 for combined inflammatory bowel disease. Nevertheless, subtle differences influence the performance of polygenic risk scores, and we show that ancestry-appropriate weights significantly improve polygenic prediction in the highest percentiles of risk. The median amount of variance explained per locus remains the same in African and European cohorts, providing evidence for compensation of effect sizes as allele frequencies diverge, as expected under a highly polygenic model of disease.

%B Am J Hum Genet %V 108 %P 431-445 %8 2021 03 04 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/33600772?dopt=Abstract %R 10.1016/j.ajhg.2021.02.001 %0 Journal Article %J Nat Commun %D 2020 %T Analysis of cardiac magnetic resonance imaging in 36,000 individuals yields genetic insights into dilated cardiomyopathy. %A Pirruccello, James P %A Bick, Alexander %A Wang, Minxian %A Chaffin, Mark %A Friedman, Samuel %A Yao, Jie %A Guo, Xiuqing %A Venkatesh, Bharath Ambale %A Taylor, Kent D %A Post, Wendy S %A Rich, Stephen %A Lima, Joao A C %A Rotter, Jerome I %A Philippakis, Anthony %A Lubitz, Steven A %A Ellinor, Patrick T %A Khera, Amit V %A Kathiresan, Sekar %A Aragam, Krishna G %K Cardiomyopathy, Dilated %K Genome-Wide Association Study %K Heart %K Humans %K Magnetic Resonance Imaging %K Myocardium %K Polymorphism, Single Nucleotide %X

Dilated cardiomyopathy (DCM) is an important cause of heart failure and the leading indication for heart transplantation. Many rare genetic variants have been associated with DCM, but common variant studies of the disease have yielded few associated loci. As structural changes in the heart are a defining feature of DCM, we report a genome-wide association study of cardiac magnetic resonance imaging (MRI)-derived left ventricular measurements in 36,041 UK Biobank participants, with replication in 2184 participants from the Multi-Ethnic Study of Atherosclerosis. We identify 45 previously unreported loci associated with cardiac structure and function, many near well-established genes for Mendelian cardiomyopathies. A polygenic score of MRI-derived left ventricular end systolic volume strongly associates with incident DCM in the general population. Even among carriers of TTN truncating mutations, this polygenic score influences the size and function of the human heart. These results further implicate common genetic polymorphisms in the pathogenesis of DCM.

%B Nat Commun %V 11 %P 2254 %8 2020 05 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32382064?dopt=Abstract %R 10.1038/s41467-020-15823-7 %0 Journal Article %J JAMA Netw Open %D 2020 %T Association of Rare Pathogenic DNA Variants for Familial Hypercholesterolemia, Hereditary Breast and Ovarian Cancer Syndrome, and Lynch Syndrome With Disease Risk in Adults According to Family History. %A Patel, Aniruddh P %A Wang, Minxian %A Fahed, Akl C %A Mason-Suares, Heather %A Brockman, Deanna %A Pelletier, Renee %A Amr, Sami %A Machini, Kalotina %A Hawley, Megan %A Witkowski, Leora %A Koch, Christopher %A Philippakis, Anthony %A Cassa, Christopher A %A Ellinor, Patrick T %A Kathiresan, Sekar %A Ng, Kenney %A Lebo, Matthew %A Khera, Amit V %K Aged %K Cohort Studies %K Colorectal Neoplasms, Hereditary Nonpolyposis %K Female %K Genetic Predisposition to Disease %K Hereditary Breast and Ovarian Cancer Syndrome %K Heterozygote %K Humans %K Hyperlipoproteinemia Type II %K Male %K Middle Aged %K Pedigree %K Proportional Hazards Models %K United Kingdom %K Whole Exome Sequencing %X

Importance: Pathogenic DNA variants associated with familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome are widely recognized as clinically important and actionable when identified, leading some clinicians to recommend population-wide genomic screening.

Objectives: To assess the prevalence and clinical importance of pathogenic or likely pathogenic variants associated with each of 3 genomic conditions (familial hypercholesterolemia, hereditary breast and ovarian cancer syndrome, and Lynch syndrome) within the context of contemporary clinical care.

Design, Setting, and Participants: This cohort study used gene-sequencing data from 49 738 participants in the UK Biobank who were recruited from 22 sites across the UK between March 21, 2006, and October 1, 2010. Inpatient hospital data date back to 1977; cancer registry data, to 1957; and death registry data, to 2006. Statistical analysis was performed from July 22, 2019, to November 15, 2019.

Exposures: Pathogenic or likely pathogenic DNA variants classified by a clinical laboratory geneticist.

Main Outcomes and Measures: Composite end point specific to each genomic condition based on atherosclerotic cardiovascular disease events for familial hypercholesterolemia, breast or ovarian cancer for hereditary breast and ovarian cancer syndrome, and colorectal or uterine cancer for Lynch syndrome.

Results: Among 49 738 participants (mean [SD] age, 57 [8] years; 27 144 female [55%]), 441 (0.9%) harbored a pathogenic or likely pathogenic variant associated with any of 3 genomic conditions, including 131 (0.3%) for familial hypercholesterolemia, 235 (0.5%) for hereditary breast and ovarian cancer syndrome, and 76 (0.2%) for Lynch syndrome. Presence of these variants was associated with increased risk of disease: for familial hypercholesterolemia, 28 of 131 carriers (21.4%) vs 4663 of 49 607 noncarriers (9.4%) developed atherosclerotic cardiovascular disease; for hereditary breast and ovarian cancer syndrome, 32 of 116 female carriers (27.6%) vs 2080 of 27 028 female noncarriers (7.7%) developed associated cancers; and for Lynch syndrome, 17 of 76 carriers (22.4%) vs 929 of 49 662 noncarriers (1.9%) developed colorectal or uterine cancer. The predicted probability of disease at age 75 years despite contemporary clinical care was 45.3% for carriers of familial hypercholesterolemia, 41.1% for hereditary breast and ovarian cancer syndrome, and 38.3% for Lynch syndrome. Across the 3 conditions, 39.7% (175 of 441) of the carriers reported a family history of disease vs 23.2% (34 517 of 148 772) of noncarriers.

Conclusions and Relevance: The findings suggest that approximately 1% of the middle-aged adult population in the UK Biobank harbored a pathogenic variant associated with any of 3 genomic conditions. These variants were associated with an increased risk of disease despite contemporary clinical care and were not reliably detected by family history.

%B JAMA Netw Open %V 3 %P e203959 %8 2020 04 01 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/32347951?dopt=Abstract %R 10.1001/jamanetworkopen.2020.3959 %0 Journal Article %J Science %D 2020 %T Cell type-specific genetic regulation of gene expression across human tissues. %A Kim-Hellmuth, Sarah %A Aguet, François %A Oliva, Meritxell %A Muñoz-Aguirre, Manuel %A Kasela, Silva %A Wucher, Valentin %A Castel, Stephane E %A Hamel, Andrew R %A Viñuela, Ana %A Roberts, Amy L %A Mangul, Serghei %A Wen, Xiaoquan %A Wang, Gao %A Barbeira, Alvaro N %A Garrido-Martín, Diego %A Nadel, Brian B %A Zou, Yuxin %A Bonazzola, Rodrigo %A Quan, Jie %A Brown, Andrew %A Martinez-Perez, Angel %A Soria, José Manuel %A Getz, Gad %A Dermitzakis, Emmanouil T %A Small, Kerrin S %A Stephens, Matthew %A Xi, Hualin S %A Im, Hae Kyung %A Guigo, Roderic %A Segrè, Ayellet V %A Stranger, Barbara E %A Ardlie, Kristin G %A Lappalainen, Tuuli %K Cells %K Gene Expression Regulation %K Humans %K Organ Specificity %K Quantitative Trait Loci %K RNA, Long Noncoding %K Transcriptome %X

The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enable the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913075?dopt=Abstract %R 10.1126/science.aaz8528 %0 Journal Article %J Stroke %D 2020 %T Combining Imaging and Genetics to Predict Recurrence of Anticoagulation-Associated Intracerebral Hemorrhage. %A Biffi, Alessandro %A Urday, Sebastian %A Kubiszewski, Patryk %A Gilkerson, Lee %A Sekar, Padmini %A Rodriguez-Torres, Axana %A Bettin, Margaret %A Charidimou, Andreas %A Pasi, Marco %A Kourkoulis, Christina %A Schwab, Kristin %A DiPucchio, Zora %A Behymer, Tyler %A Osborne, Jennifer %A Morgan, Misty %A Moomaw, Charles J %A James, Michael L %A Greenberg, Steven M %A Viswanathan, Anand %A Gurol, M Edip %A Worrall, Bradford B %A Testai, Fernando D %A McCauley, Jacob L %A Falcone, Guido J %A Langefeld, Carl D %A Anderson, Christopher D %A Kamel, Hooman %A Woo, Daniel %A Sheth, Kevin N %A Rosand, Jonathan %K Aged %K Anticoagulants %K Apolipoprotein E4 %K Cerebral Hemorrhage %K Female %K Humans %K Magnetic Resonance Imaging %K Male %K Middle Aged %K Neuroimaging %K Recurrence %X

BACKGROUND AND PURPOSE: For survivors of oral anticoagulation therapy (OAT)-associated intracerebral hemorrhage (OAT-ICH) who are at high risk for thromboembolism, the benefits of OAT resumption must be weighed against increased risk of recurrent hemorrhagic stroke. The ε2/ε4 alleles of the APOE gene, MRI-defined cortical superficial siderosis, and cerebral microbleeds are the most potent risk factors for recurrent ICH. We sought to determine whether combining MRI markers and genotype could have clinical impact by identifying ICH survivors in whom the risks of OAT resumption are highest.

METHODS: Joint analysis of data from 2 longitudinal cohort studies of OAT-ICH survivors: (1) MGH-ICH study (Massachusetts General Hospital ICH) and (2) longitudinal component of the ERICH study (Ethnic/Racial Variations of Intracerebral Hemorrhage). We evaluated whether MRI markers and genotype predict ICH recurrence. We then developed and validated a combined APOE-MRI classification scheme to predict ICH recurrence, using Classification and Regression Tree analysis.

RESULTS: Cortical superficial siderosis, cerebral microbleed, and ε2/ε4 variants were independently associated with ICH recurrence after OAT-ICH (all p<0.05). Combining genotype and MRI data resulted in improved prediction of ICH recurrence (Harrell C: 0.79 versus 0.55 for clinical data alone, p=0.033). In the MGH (training) data set, cortical superficial siderosis, cerebral microbleed, and ε2/ε4 stratified likelihood of ICH recurrence into high-, medium-, and low-risk categories. In the ERICH (validation) data set, yearly ICH recurrence rates for high-, medium-, and low-risk individuals were 6.6%, 2.5%, and 0.9%, respectively, with overall area under the curve of 0.91 for prediction of recurrent ICH.

CONCLUSIONS: Combining MRI and genotype stratifies likelihood of ICH recurrence into high, medium, and low risk. If confirmed in prospective studies, this combined APOE-MRI classification scheme may prove useful for selecting individuals for OAT resumption after ICH.

%B Stroke %V 51 %P 2153-2160 %8 2020 07 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/32517581?dopt=Abstract %R 10.1161/STROKEAHA.120.028310 %0 Journal Article %J J Hepatol %D 2020 %T A common variant in PNPLA3 is associated with age at diagnosis of NAFLD in patients from a multi-ethnic biobank. %A Walker, Ryan W %A Belbin, Gillian M %A Sorokin, Elena P %A Van Vleck, Tielman %A Wojcik, Genevieve L %A Moscati, Arden %A Gignoux, Christopher R %A Cho, Judy %A Abul-Husn, Noura S %A Nadkarni, Girish %A Kenny, Eimear E %A Loos, Ruth J F %X

BACKGROUND & AIMS: The Ile148Met variant (rs738409) in the PNPLA3 gene has the largest effect on non-alcoholic fatty liver disease (NAFLD), increasing the risk of progression to severe forms of liver disease. It remains unknown whether the variant plays a role in age of NAFLD onset. We aimed to determine whether rs738409 affects the age of NAFLD diagnosis.

METHODS: We applied a novel natural language processing (NLP) algorithm to a longitudinal electronic health records (EHR) dataset of >27,000 individuals with genetic data from a multi-ethnic biobank, defining NAFLD cases (n = 1,703) and confirming controls (n = 8,119). We conducted i) a survival analysis to determine if age at diagnosis differed by rs738409 genotype, ii) a receiver operating characteristics analysis to assess the utility of the rs738409 genotype in discriminating NAFLD cases from controls, and iii) a phenome-wide association study (PheWAS) between rs738409 and 10,095 EHR-derived disease diagnoses.

RESULTS: The PNPLA3 G risk allele was associated with: i) earlier age of NAFLD diagnosis, with the strongest effect in Hispanics (hazard ratio 1.33; 95% CI 1.15-1.53; p <0.0001) among whom a NAFLD diagnosis was 15% more likely in risk allele carriers vs. non-carriers; ii) increased NAFLD risk (odds ratio 1.61; 95% CI 1.349-1.73; p <0.0001), with the strongest effect among Hispanics (odds ratio 1.43; 95% CI 1.28-1.59; p <0.0001); iii) additional liver diseases in a PheWAS (p <4.95 × 10) where the risk variant also associated with earlier age of diagnosis.

CONCLUSION: Given the role of rs738409 in NAFLD diagnosis age, our results suggest that stratifying risk within populations known to have an enhanced risk of liver disease, such as Hispanic carriers of the rs738409 variant, would be effective in earlier identification of those who would benefit most from early NAFLD prevention and treatment strategies.

LAY SUMMARY: Despite clear associations between the PNPLA3 rs738409 variant and elevated risk of progression from non-alcoholic fatty liver disease (NAFLD) to more severe forms of liver disease, it remains unknown whether PNPLA3 rs738409 plays a role in the age of NAFLD onset. Herein, we found that this risk variant is associated with an earlier age of NAFLD and other liver disease diagnoses, an observation most pronounced in Hispanic Americans. We conclude that PNPLA3 rs738409 could be used to better understand liver disease risk within vulnerable populations and identify patients who may benefit from early prevention strategies.

%B J Hepatol %V 72 %P 1070-1081 %8 2020 06 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/32145261?dopt=Abstract %R 10.1016/j.jhep.2020.01.029 %0 Journal Article %J Genome Res %D 2020 %T Complex mosaic structural variations in human fetal brains. %A Sekar, Shobana %A Tomasini, Livia %A Proukakis, Christos %A Bae, Taejeong %A Manlove, Logan %A Jang, Yeongjun %A Scuderi, Soraya %A Zhou, Bo %A Kalyva, Maria %A Amiri, Anahita %A Mariani, Jessica %A Sedlazeck, Fritz J %A Urban, Alexander E %A Vaccarino, Flora M %A Abyzov, Alexej %X

Somatic mosaicism, manifesting as single nucleotide variants (SNVs), mobile element insertions, and structural changes in the DNA, is a common phenomenon in human brain cells, with potential functional consequences. Using a clonal approach, we previously detected 200-400 mosaic SNVs per cell in three human fetal brains (15-21 wk postconception). However, structural variation in the human fetal brain has not yet been investigated. Here, we discover and validate four mosaic structural variants (SVs) in the same brains and resolve their precise breakpoints. The SVs were of kilobase scale and complex, consisting of deletion(s) and rearranged genomic fragments, which sometimes originated from different chromosomes. Sequences at the breakpoints of these rearrangements had microhomologies, suggesting their origin from replication errors. One SV was found in two clones, and we timed its origin to ∼14 wk postconception. No large-scale mosaic copy number variants (CNVs) were detectable in normal fetal human brains, suggesting that previously reported megabase-scale CNVs in neurons arise at later stages of development. By reanalysis of public single nuclei data from adult brain neurons, we detected an extrachromosomal circular DNA event. Our study reveals the existence of mosaic SVs in the developing human brain, likely arising from cell proliferation during mid-neurogenesis. Although relatively rare compared to SNVs and present in ∼10% of neurons, SVs in developing human brain affect a comparable number of bases in the genome (∼6200 vs. ∼4000 bp), implying that they may have similar functional consequences.

%B Genome Res %V 30 %P 1695-1704 %8 2020 12 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/33122304?dopt=Abstract %R 10.1101/gr.262667.120 %0 Journal Article %J PLoS Genet %D 2020 %T Copy number variants and fixed duplications among 198 rhesus macaques (Macaca mulatta). %A Brasó-Vives, Marina %A Povolotskaya, Inna S %A Hartasánchez, Diego A %A Farré, Xavier %A Fernandez-Callejo, Marcos %A Raveendran, Muthuswamy %A Harris, R Alan %A Rosene, Douglas L %A Lorente-Galdos, Belen %A Navarro, Arcadi %A Marques-Bonet, Tomas %A Rogers, Jeffrey %A Juan, David %K Animals %K Chromosome Mapping %K DNA Copy Number Variations %K Female %K Gene Duplication %K Genetics, Population %K Genome %K High-Throughput Nucleotide Sequencing %K Humans %K Macaca mulatta %K Male %K Open Reading Frames %K Phylogeny %K Sequence Analysis, DNA %K Species Specificity %X

The rhesus macaque is an abundant species of Old World monkeys and a valuable model organism for biomedical research due to its close phylogenetic relationship to humans. Copy number variation is one of the main sources of genomic diversity within and between species and a widely recognized cause of inter-individual differences in disease risk. However, copy number differences among rhesus macaques and between the human and macaque genomes, as well as the relevance of this diversity to research involving this nonhuman primate, remain understudied. Here we present a high-resolution map of sequence copy number for the rhesus macaque genome constructed from a dataset of 198 individuals. Our results show that about one-eighth of the rhesus macaque reference genome is composed of recently duplicated regions, either copy number variable regions or fixed duplications. Comparison with human genomic copy number maps based on previously published data shows that, despite overall similarities in the genome-wide distribution of these regions, there are specific differences at the chromosome level. Some of these create differences in the copy number profile between human disease genes and their rhesus macaque orthologs. Our results highlight the importance of addressing the number of copies of target genes in the design of experiments and caution against human-centered assumptions in research conducted with model organisms. Overall, we present a genome-wide copy number map from a large sample of rhesus macaque individuals, representing an important novel contribution concerning the evolution of copy number in primate genomes.

%B PLoS Genet %V 16 %P e1008742 %8 2020 05 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32392208?dopt=Abstract %R 10.1371/journal.pgen.1008742 %0 Journal Article %J Hum Mutat %D 2020 %T dbMTS: A comprehensive database of putative human microRNA target site SNVs and their functional predictions. %A Li, Chang %A Mou, Chengcheng %A Swartz, Michael D %A Yu, Bing %A Bai, Yongsheng %A Tu, Yicheng %A Liu, Xiaoming %X

MicroRNAs (miRNA) are short noncoding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3'-untranslated region (UTR) of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3'-UTR of the mRNAs can disrupt miRNA regulation. In this study, we presented dbMTS, a database for miRNA target site (MTS) SNVs and their functional annotations. This database can help studies easily identify putative SNVs that affect miRNA targeting and facilitate the prioritization of their functional importance. dbMTS is freely available for academic use at http://database.liulab.science/dbMTS as a web service or a downloadable attached database of dbNSFP.

%B Hum Mutat %V 41 %P 1123-1130 %8 2020 06 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/32227657?dopt=Abstract %R 10.1002/humu.24020 %0 Journal Article %J Forensic Sci Int Genet %D 2020 %T Development and validation of the VISAGE AmpliSeq basic tool to predict appearance and ancestry from DNA. %A Xavier, Catarina %A de la Puente, María %A Mosquera-Miguel, Ana %A Freire-Aradas, Ana %A Kalamara, Vivian %A Vidaki, Athina %A E Gross, Theresa %A Revoir, Andrew %A Pośpiech, Ewelina %A Kartasińska, Ewa %A Spólnicka, Magdalena %A Branicki, Wojciech %A E Ames, Carole %A M Schneider, Peter %A Hohoff, Carsten %A Kayser, Manfred %A Phillips, Christopher %A Parson, Walther %X

Forensic DNA phenotyping is gaining interest as the number of applications increases within the forensic genetics community. The possibility of providing investigative leads in addition to conventional DNA profiling for human identification provides new insights into otherwise "cold" police investigations. The ability to report on the bio-geographical ancestry (BGA), appearance characteristics and age based on DNA obtained from a crime scene sample of an unknown donor makes the exploration of such markers and the development of new methods meaningful for criminal investigations. The VISible Attributes through GEnomics (VISAGE) Consortium aims to disseminate and broaden the use of predictive markers and develop fully optimized and validated prototypes for forensic casework implementation. Here, the development, performance and validation of the first VISAGE appearance and ancestry tool are reported. A total of 153 SNPs (96.84% assay conversion rate) were successfully incorporated into a single multiplex reaction using the AmpliSeq™ design pipeline, and applied for massively parallel sequencing with the Ion S5 platform. A collaborative effort involving six VISAGE laboratory partners was devised to perform all validation tests. An extensive validation plan was carefully organized to explore the assay's overall performance with optimum and low-input samples, as well as with challenging and casework mock samples. In addition, forensic validation studies such as concordance and mixture tests using the Coriell sample set with known genotypes were performed. Finally, inhibitor tolerance and specificity were also evaluated. Results showed a robust, highly sensitive assay with good overall concordance between laboratories.

%B Forensic Sci Int Genet %V 48 %P 102336 %8 2020 09 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/32619960?dopt=Abstract %R 10.1016/j.fsigen.2020.102336 %0 Journal Article %J Nat Commun %D 2020 %T Discovery and population genomics of structural variation in a songbird genus. %A Weissensteiner, Matthias H %A Bunikis, Ignas %A Catalán, Ana %A Francoijs, Kees-Jan %A Knief, Ulrich %A Heim, Wieland %A Peona, Valentina %A Pophaly, Saurabh D %A Sedlazeck, Fritz J %A Suh, Alexander %A Warmuth, Vera M %A Wolf, Jochen B W %K Animals %K Chromosome Inversion %K Gene Deletion %K Genetic Variation %K Genetics, Population %K Genome %K Genomic Structural Variation %K Genotype %K Phylogeny %K Polymorphism, Single Nucleotide %K Retroelements %K Sequence Analysis, DNA %K Songbirds %X

Structural variation (SV) constitutes an important type of genetic mutation providing the raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly- and read-mapping approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionarily young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping.

%B Nat Commun %V 11 %P 3403 %8 2020 07 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32636372?dopt=Abstract %R 10.1038/s41467-020-17195-4 %0 Journal Article %J Trends Mol Med %D 2020 %T Emerging Targets for Cardiovascular Disease Prevention in Diabetes. %A Stitziel, Nathan O %A Kanter, Jenny E %A Bornfeldt, Karin E %X

Type 1 and type 2 diabetes mellitus (T1DM and T2DM) increase the risk of atherosclerotic cardiovascular disease (CVD), resulting in acute cardiovascular events, such as heart attack and stroke. Recent clinical trials point toward new treatment and prevention strategies for cardiovascular complications of T2DM. New antidiabetic agents show unexpected cardioprotective benefits. Moreover, genetic and reverse translational strategies have revealed potential novel targets for CVD prevention in diabetes, including inhibition of apolipoprotein C3 (APOC3). Modeling and pharmacology-based approaches to improve insulin action provide additional potential strategies to combat CVD. The development of new strategies for improved diabetes and lipid control fuels hope for future prevention of CVD associated with diabetes.

%B Trends Mol Med %V 26 %P 744-757 %8 2020 08 %G eng %N 8 %1 https://www.ncbi.nlm.nih.gov/pubmed/32423639?dopt=Abstract %R 10.1016/j.molmed.2020.03.011 %0 Journal Article %J Science %D 2020 %T Genetics of schizophrenia in the South African Xhosa. %A Gulsuner, S %A Stein, D J %A Susser, E S %A Sibeko, G %A Pretorius, A %A Walsh, T %A Majara, L %A Mndini, M M %A Mqulwana, S G %A Ntola, O A %A Casadei, S %A Ngqengelele, L L %A Korchina, V %A van der Merwe, C %A Malan, M %A Fader, K M %A Feng, M %A Willoughby, E %A Muzny, D %A Baldinger, A %A Andrews, H F %A Gur, R C %A Gibbs, R A %A Zingela, Z %A Nagdee, M %A Ramesar, R S %A King, M-C %A McClellan, J M %K Age Factors %K Autistic Disorder %K Bipolar Disorder %K Dopamine %K Female %K gamma-Aminobutyric Acid %K Genetic Variation %K Glutamine %K Humans %K Male %K Mutation %K Neural Pathways %K Schizophrenia %K Sex Factors %K South Africa %K Synapses %K Synaptic Transmission %X

Africa, the ancestral home of all modern humans, is the most informative continent for understanding the human genome and its contribution to complex disease. To better understand the genetics of schizophrenia, we studied the illness in the Xhosa population of South Africa, recruiting 909 cases and 917 age-, gender-, and residence-matched controls. Individuals with schizophrenia were significantly more likely than controls to harbor private, severely damaging mutations in genes that are critical to synaptic function, including neural circuitry mediated by the neurotransmitters glutamine, γ-aminobutyric acid, and dopamine. Schizophrenia is genetically highly heterogeneous, involving severe ultrarare mutations in genes that are critical to synaptic plasticity. The depth of genetic variation in Africa revealed this relationship with a moderate sample size and informed our understanding of the genetics of schizophrenia worldwide.

%B Science %V 367 %P 569-573 %8 2020 01 31 %G eng %N 6477 %1 https://www.ncbi.nlm.nih.gov/pubmed/32001654?dopt=Abstract %R 10.1126/science.aay8833 %0 Journal Article %J BMC Med Genomics %D 2020 %T Genome-wide association meta-analysis for early age-related macular degeneration highlights novel loci and insights for advanced disease. %A Winkler, Thomas W %A Grassmann, Felix %A Brandl, Caroline %A Kiel, Christina %A Günther, Felix %A Strunz, Tobias %A Weidner, Lorraine %A Zimmermann, Martina E %A Korb, Christina A %A Poplawski, Alicia %A Schuster, Alexander K %A Müller-Nurasyid, Martina %A Peters, Annette %A Rauscher, Franziska G %A Elze, Tobias %A Horn, Katrin %A Scholz, Markus %A Cañadas-Garre, Marisa %A McKnight, Amy Jayne %A Quinn, Nicola %A Hogg, Ruth E %A Küchenhoff, Helmut %A Heid, Iris M %A Stark, Klaus J %A Weber, Bernhard H F %K Case-Control Studies %K Genetic Loci %K Genetic Markers %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Macular Degeneration %K Polymorphism, Single Nucleotide %X

BACKGROUND: Advanced age-related macular degeneration (AMD) is a leading cause of blindness. While around half of the genetic contribution to advanced AMD has been uncovered, little is known about the genetic architecture of early AMD.

METHODS: To identify genetic factors for early AMD, we conducted a genome-wide association study (GWAS) meta-analysis (14,034 cases, 91,214 controls, 11 sources of data including the International AMD Genomics Consortium, IAMDGC, and UK Biobank, UKBB). We ascertained early AMD via color fundus photographs by manual grading for 10 sources and via an automated machine learning approach for > 170,000 photographs from UKBB. We searched for early AMD loci via GWAS and via a candidate approach based on 14 previously suggested early AMD variants.

RESULTS: Altogether, we identified 10 independent loci with statistical significance for early AMD: (i) 8 from our GWAS with genome-wide significance (P < 5 × 10⁻⁸), (ii) one previously suggested locus with experiment-wise significance (P < 0.05/14) in our non-overlapping data and with genome-wide significance when combining the reported and our non-overlapping data (together 17,539 cases, 105,395 controls), and (iii) one further previously suggested locus with experiment-wise significance in our non-overlapping data. Of these 10 identified loci, 8 were novel and 2 known for early AMD. Most of the 10 loci overlapped with known advanced AMD loci (near ARMS2/HTRA1, CFH, C2, C3, CETP, TNFRSF10A, VEGFA, APOE), except two that have not yet been identified with statistical significance for any AMD. Among the 17 genes within these two loci, in-silico functional annotation suggested CD46 and TYR as the most likely responsible genes. Presence or absence of an early AMD effect distinguished the known pathways of advanced AMD genetics (complement/lipid pathways versus extracellular matrix metabolism).

CONCLUSIONS: Our GWAS on early AMD identified novel loci, highlighted shared and distinct genetics between early and advanced AMD, and provided insights into AMD etiology. Our data provide a resource comparable in size to the existing IAMDGC data on advanced AMD genetics, enabling a joint view. The biological relevance of this joint view is underscored by the ability of early AMD effects to differentiate the major pathways for advanced AMD.

%B BMC Med Genomics %V 13 %P 120 %8 2020 08 26 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32843070?dopt=Abstract %R 10.1186/s12920-020-00760-7 %0 Journal Article %J Arterioscler Thromb Vasc Biol %D 2020 %T Genome-Wide Polygenic Score, Clinical Risk Factors, and Long-Term Trajectories of Coronary Artery Disease. %A Hindy, George %A Aragam, Krishna G %A Ng, Kenney %A Chaffin, Mark %A Lotta, Luca A %A Baras, Aris %A Drake, Isabel %A Orho-Melander, Marju %A Melander, Olle %A Kathiresan, Sekar %A Khera, Amit V %K Adult %K Aged %K Coronary Artery Disease %K Female %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Heart Disease Risk Factors %K Heredity %K Humans %K Incidence %K Male %K Middle Aged %K Multifactorial Inheritance %K Phenotype %K Prognosis %K Risk Assessment %K Sweden %K Time Factors %K United Kingdom %X

OBJECTIVE: To determine the relationship of a genome-wide polygenic score for coronary artery disease (GPS) with lifetime trajectories of CAD risk, directly compare its predictive capacity to traditional risk factors, and assess its interplay with the Pooled Cohort Equations (PCE) clinical risk estimator.

APPROACH AND RESULTS: We studied GPS in 28 556 middle-aged participants of the Malmö Diet and Cancer Study, of whom 4122 (14.4%) developed CAD over a median follow-up of 21.3 years. A pronounced gradient in lifetime risk of CAD was observed, from 16% for those in the lowest GPS decile to 48% in the highest. We evaluated the discriminative capacity of the GPS, as assessed by change in the C-statistic from a baseline model including age and sex, among 5685 individuals with PCE risk estimates available. The increment for the GPS (+0.045, P<0.001) was higher than for any of 11 traditional risk factors (range +0.007 to +0.032). Minimal correlation was observed between GPS and 10-year risk defined by the PCE (r=0.03), and addition of GPS improved the C-statistic of the PCE model by 0.026. A significant gradient in lifetime risk was observed for the GPS, even among individuals within a given PCE clinical risk stratum. We replicated key findings, noting strikingly consistent results, in 325 003 participants of the UK Biobank.

CONCLUSIONS: GPS, a risk estimator available from birth, stratifies individuals into varying trajectories of clinical risk for CAD. Implementation of GPS may enable identification of high-risk individuals early in life, decades in advance of manifest risk factors or disease.

%B Arterioscler Thromb Vasc Biol %V 40 %P 2738-2746 %8 2020 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/32957805?dopt=Abstract %R 10.1161/ATVBAHA.120.314856 %0 Journal Article %J Circ Genom Precis Med %D 2020 %T Heterozygous ABCG5 Gene Deficiency and Risk of Coronary Artery Disease. %A Nomura, Akihiro %A Emdin, Connor A %A Won, Hong Hee %A Peloso, Gina M %A Natarajan, Pradeep %A Ardissino, Diego %A Danesh, John %A Schunkert, Heribert %A Correa, Adolfo %A Bown, Matthew J %A Samani, Nilesh J %A Erdmann, Jeanette %A McPherson, Ruth %A Watkins, Hugh %A Saleheen, Danish %A Elosua, Roberto %A Kawashiri, Masa-Aki %A Tada, Hayato %A Gupta, Namrata %A Shah, Svati H %A Rader, Daniel J %A Gabriel, Stacey %A Khera, Amit V %A Kathiresan, Sekar %X

BACKGROUND: Familial sitosterolemia is a rare Mendelian disorder characterized by hyperabsorption and decreased biliary excretion of dietary sterols. Affected individuals typically have complete genetic deficiency (homozygous loss-of-function [LoF] variants) in the ABCG5 or ABCG8 genes and have substantially elevated plasma sitosterol and LDL (low-density lipoprotein) cholesterol (LDL-C) levels. The impact of partial genetic deficiency of ABCG5 or ABCG8, as occurs in heterozygous carriers of LoF variants, on LDL-C and risk of coronary artery disease (CAD) has remained uncertain.

METHODS: We first recruited 9 sitosterolemia families, identified causative LoF variants in ABCG5 or ABCG8, and evaluated the associations of these ABCG5 or ABCG8 LoF variants with plasma phytosterols and lipid levels. We next assessed for LoF variants in ABCG5 or ABCG8 in CAD cases (n=29 321) versus controls (n=357 326). We tested the association of rare LoF variants in ABCG5 or ABCG8 with blood lipids and risk for CAD. Rare LoF variants were defined as protein-truncating variants with minor allele frequency <0.1% in ABCG5 or ABCG8.

RESULTS: In sitosterolemia families, 7 pedigrees harbored causative LoF variants in ABCG5 and 2 pedigrees in ABCG8. Homozygous LoF variants in either ABCG5 or ABCG8 led to marked elevations in sitosterol and LDL-C. Within those sitosterolemia families, heterozygous carriers of LoF variants exhibited increased sitosterol and LDL-C levels compared with noncarriers. Within large-scale CAD case-control cohorts, prevalence of rare LoF variants in ABCG5 and in ABCG8 was ≈0.1% each. ABCG5 heterozygous LoF variant carriers had significantly elevated LDL-C levels (25 mg/dL [95% CI, 14-35]; P=1.1×10) and were at 2-fold increased risk of CAD (odds ratio, 2.06 [95% CI, 1.27-3.35]; P=0.004). By contrast, ABCG8 heterozygous LoF carrier status was not associated with increased LDL-C or risk of CAD.

CONCLUSIONS: Although familial sitosterolemia is traditionally considered a recessive disorder, we observed that heterozygous carriers of an LoF variant in ABCG5 had significantly increased sitosterol and LDL-C levels and a 2-fold increase in risk of CAD.

%B Circ Genom Precis Med %V 13 %P 417-423 %8 2020 10 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32862661?dopt=Abstract %R 10.1161/CIRCGEN.119.002871 %0 Journal Article %J Science %D 2020 %T The impact of sex on gene expression across human tissues. %A Oliva, Meritxell %A Muñoz-Aguirre, Manuel %A Kim-Hellmuth, Sarah %A Wucher, Valentin %A Gewirtz, Ariel D H %A Cotter, Daniel J %A Parsana, Princy %A Kasela, Silva %A Balliu, Brunilda %A Viñuela, Ana %A Castel, Stephane E %A Mohammadi, Pejman %A Aguet, François %A Zou, Yuxin %A Khramtsova, Ekaterina A %A Skol, Andrew D %A Garrido-Martín, Diego %A Reverter, Ferran %A Brown, Andrew %A Evans, Patrick %A Gamazon, Eric R %A Payne, Anthony %A Bonazzola, Rodrigo %A Barbeira, Alvaro N %A Hamel, Andrew R %A Martinez-Perez, Angel %A Soria, José Manuel %A Pierce, Brandon L %A Stephens, Matthew %A Eskin, Eleazar %A Dermitzakis, Emmanouil T %A Segrè, Ayellet V %A Im, Hae Kyung %A Engelhardt, Barbara E %A Ardlie, Kristin G %A Montgomery, Stephen B %A Battle, Alexis J %A Lappalainen, Tuuli %A Guigo, Roderic %A Stranger, Barbara E %K Chromosomes, Human, X %K Disease %K Epigenesis, Genetic %K Female %K Gene Expression %K Gene Expression Regulation %K Genetic Variation %K Genome-Wide Association Study %K Humans %K Male %K Organ Specificity %K Promoter Regions, Genetic %K Quantitative Trait Loci %K Sex Characteristics %K Sex Factors %X

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913072?dopt=Abstract %R 10.1126/science.aba3066 %0 Journal Article %J Nature %D 2020 %T Inherited causes of clonal haematopoiesis in 97,691 whole genomes. %A Bick, Alexander G %A Weinstock, Joshua S %A Nandakumar, Satish K %A Fulco, Charles P %A Bao, Erik L %A Zekavat, Seyedeh M %A Szeto, Mindy D %A Liao, Xiaotian %A Leventhal, Matthew J %A Nasser, Joseph %A Chang, Kyle %A Laurie, Cecelia %A Burugula, Bala Bharathi %A Gibson, Christopher J %A Lin, Amy E %A Taub, Margaret A %A Aguet, François %A Ardlie, Kristin %A Mitchell, Braxton D %A Barnes, Kathleen C %A Moscati, Arden %A Fornage, Myriam %A Redline, Susan %A Psaty, Bruce M %A Silverman, Edwin K %A Weiss, Scott T %A Palmer, Nicholette D %A Vasan, Ramachandran S %A Burchard, Esteban G %A Kardia, Sharon L R %A He, Jiang %A Kaplan, Robert C %A Smith, Nicholas L %A Arnett, Donna K %A Schwartz, David A %A Correa, Adolfo %A de Andrade, Mariza %A Guo, Xiuqing %A Konkle, Barbara A %A Custer, Brian %A Peralta, Juan M %A Gui, Hongsheng %A Meyers, Deborah A %A McGarvey, Stephen T %A Chen, Ida Yii-Der %A Shoemaker, M Benjamin %A Peyser, Patricia A %A Broome, Jai G %A Gogarten, Stephanie M %A Wang, Fei Fei %A Wong, Quenna %A Montasser, May E %A Daya, Michelle %A Kenny, Eimear E %A North, Kari E %A Launer, Lenore J %A Cade, Brian E %A Bis, Joshua C %A Cho, Michael H %A Lasky-Su, Jessica %A Bowden, Donald W %A Cupples, L Adrienne %A Mak, Angel C Y %A Becker, Lewis C %A Smith, Jennifer A %A Kelly, Tanika N %A Aslibekyan, Stella %A Heckbert, Susan R %A Tiwari, Hemant K %A Yang, Ivana V %A Heit, John A %A Lubitz, Steven A %A Johnsen, Jill M %A Curran, Joanne E %A Wenzel, Sally E %A Weeks, Daniel E %A Rao, Dabeeru C %A Darbar, Dawood %A Moon, Jee-Young %A Tracy, Russell P %A Buth, Erin J %A Rafaels, Nicholas %A Loos, Ruth J F %A Durda, Peter %A Liu, Yongmei %A Hou, Lifang %A Lee, Jiwon %A Kachroo, Priyadarshini %A Freedman, Barry I %A Levy, Daniel %A Bielak, Lawrence F %A Hixson, James E %A Floyd, James S %A Whitsel, Eric A %A Ellinor, Patrick T %A Irvin, Marguerite R %A Fingerlin, Tasha E %A Raffield, Laura M %A Armasu, Sebastian M %A Wheeler, Marsha M %A Sabino, Ester C %A Blangero, John %A Williams, L Keoki %A Levy, Bruce D %A Sheu, Wayne Huey-Herng %A Roden, Dan M %A Boerwinkle, Eric %A Manson, JoAnn E %A Mathias, Rasika A %A Desai, Pinkal %A Taylor, Kent D %A Johnson, Andrew D %A Auer, Paul L %A Kooperberg, Charles %A Laurie, Cathy C %A Blackwell, Thomas W %A Smith, Albert V %A Zhao, Hongyu %A Lange, Ethan %A Lange, Leslie %A Rich, Stephen S %A Rotter, Jerome I %A Wilson, James G %A Scheet, Paul %A Kitzman, Jacob O %A Lander, Eric S %A Engreitz, Jesse M %A Ebert, Benjamin L %A Reiner, Alexander P %A Jaiswal, Siddhartha %A Abecasis, Gonçalo %A Sankaran, Vijay G %A Kathiresan, Sekar %A Natarajan, Pradeep %K Adult %K Africa %K African Continental Ancestry Group %K Aged %K Aged, 80 and over %K alpha Karyopherins %K Cell Self Renewal %K Clonal Hematopoiesis %K DNA-Binding Proteins %K Female %K Genetic Predisposition to Disease %K Genome, Human %K Germ-Line Mutation %K Hematopoietic Stem Cells %K Humans %K Intracellular Signaling Peptides and Proteins %K Male %K Middle Aged %K National Heart, Lung, and Blood Institute (U.S.) %K Phenotype %K Precision Medicine %K Proto-Oncogene Proteins %K Tripartite Motif Proteins %K United States %K Whole Genome Sequencing %X

Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer and coronary heart disease; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP). Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.

%B Nature %V 586 %P 763-768 %8 2020 10 %G eng %N 7831 %1 https://www.ncbi.nlm.nih.gov/pubmed/33057201?dopt=Abstract %R 10.1038/s41586-020-2819-2 %0 Journal Article %J Cell %D 2020 %T Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. %A Satterstrom, F Kyle %A Kosmicki, Jack A %A Wang, Jiebiao %A Breen, Michael S %A De Rubeis, Silvia %A An, Joon-Yong %A Peng, Minshi %A Collins, Ryan %A Grove, Jakob %A Klei, Lambertus %A Stevens, Christine %A Reichert, Jennifer %A Mulhern, Maureen S %A Artomov, Mykyta %A Gerges, Sherif %A Sheppard, Brooke %A Xu, Xinyi %A Bhaduri, Aparna %A Norman, Utku %A Brand, Harrison %A Schwartz, Grace %A Nguyen, Rachel %A Guerrero, Elizabeth E %A Dias, Caroline %A Betancur, Catalina %A Cook, Edwin H %A Gallagher, Louise %A Gill, Michael %A Sutcliffe, James S %A Thurm, Audrey %A Zwick, Michael E %A Børglum, Anders D %A State, Matthew W %A Cicek, A Ercument %A Talkowski, Michael E %A Cutler, David J %A Devlin, Bernie %A Sanders, Stephan J %A Roeder, Kathryn %A Daly, Mark J %A Buxbaum, Joseph D %K Autistic Disorder %K Case-Control Studies %K Cell Lineage %K Cerebral Cortex %K Cohort Studies %K Exome %K Female %K Gene Expression Regulation, Developmental %K Gene Frequency %K Genetic Predisposition to Disease %K Humans %K Male %K Mutation, Missense %K Neurobiology %K Neurons %K Phenotype %K Sex Factors %K Single-Cell Analysis %K Whole Exome Sequencing %X

We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n = 35,584 total samples, 11,986 with ASD). Using an enhanced analytical framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate of 0.1 or less. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained to have severe neurodevelopmental delay, whereas 53 show higher frequencies in individuals ascertained to have ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In cells from the human cortex, expression of risk genes is enriched in excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory-inhibitory imbalance underlying ASD.

%B Cell %V 180 %P 568-584.e23 %8 2020 02 06 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31981491?dopt=Abstract %R 10.1016/j.cell.2019.12.036 %0 Journal Article %J Genetics %D 2020 %T Lung Function in African American Children with Asthma Is Associated with Novel Regulatory Variants of the KIT Ligand and Gene-By-Air-Pollution Interaction. %A Mak, Angel C Y %A Sajuthi, Satria %A Joo, Jaehyun %A Xiao, Shujie %A Sleiman, Patrick M %A White, Marquitta J %A Lee, Eunice Y %A Saef, Benjamin %A Hu, Donglei %A Gui, Hongsheng %A Keys, Kevin L %A Lurmann, Fred %A Jain, Deepti %A Abecasis, Gonçalo %A Kang, Hyun Min %A Nickerson, Deborah A %A Germer, Soren %A Zody, Michael C %A Winterkorn, Lara %A Reeves, Catherine %A Huntsman, Scott %A Eng, Celeste %A Salazar, Sandra %A Oh, Sam S %A Gilliland, Frank D %A Chen, Zhanghua %A Kumar, Rajesh %A Martínez, Fernando D %A Wu, Ann Chen %A Ziv, Elad %A Hakonarson, Hakon %A Himes, Blanca E %A Williams, L Keoki %A Seibold, Max A %A Burchard, Esteban G %X

Baseline lung function, quantified as forced expiratory volume in the first second of exhalation (FEV1), is a standard diagnostic criterion used by clinicians to identify and classify lung diseases. Using whole-genome sequencing data from the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine project, we identified a novel genetic association with FEV1 on chromosome 12 in 867 African American children with asthma (P = 1.26 × 10, β = 0.302). Conditional analysis within 1 Mb of the tag signal (rs73429450) yielded one major and two other weaker independent signals within this peak. We explored statistical and functional evidence for all variants in linkage disequilibrium with the three independent signals and identified nine variants as the most likely candidates responsible for the association with FEV1. Hi-C data and expression QTL analysis demonstrated that these variants physically interacted with KITLG (KIT ligand, also known as SCF), and their minor alleles were associated with increased expression of the KITLG gene in nasal epithelial cells. Gene-by-air-pollution interaction analysis found that the candidate variant rs58475486 interacted with past-year ambient sulfur dioxide exposure (P = 0.003, β = 0.32). This study identified a novel protective genetic association with FEV1, possibly mediated through KITLG, in African American children with asthma. This is the first study that has identified a genetic association between lung function and KITLG, which has an established role in orchestrating allergic inflammation in asthma.

%B Genetics %V 215 %P 869-886 %8 2020 07 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/32327564?dopt=Abstract %R 10.1534/genetics.120.303231 %0 Journal Article %J Cell %D 2020 %T Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato. %A Alonge, Michael %A Wang, Xingang %A Benoit, Matthias %A Soyk, Sebastian %A Pereira, Lara %A Zhang, Lei %A Suresh, Hamsini %A Ramakrishnan, Srividya %A Maumus, Florian %A Ciren, Danielle %A Levy, Yuval %A Harel, Tom Hai %A Shalev-Schlosser, Gili %A Amsellem, Ziva %A Razifard, Hamid %A Caicedo, Ana L %A Tieman, Denise M %A Klee, Harry %A Kirsche, Melanie %A Aganezov, Sergey %A Ranallo-Benavidez, T Rhyker %A Lemmon, Zachary H %A Kim, Jennifer %A Robitaille, Gina %A Kramer, Melissa %A Goodwin, Sara %A McCombie, W Richard %A Hutton, Samuel %A Van Eck, Joyce %A Gillis, Jesse %A Eshed, Yuval %A Sedlazeck, Fritz J %A van der Knaap, Esther %A Schatz, Michael C %A Lippman, Zachary B %K Alleles %K Crops, Agricultural %K Cytochrome P-450 Enzyme System %K Ecotype %K Epistasis, Genetic %K Fruit %K Gene Duplication %K Gene Expression Regulation, Plant %K Genome, Plant %K Genomic Structural Variation %K Genotype %K Inbreeding %K Lycopersicon esculentum %K Molecular Sequence Annotation %K Phenotype %K Plant Breeding %K Quantitative Trait Loci %X

Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.

%B Cell %V 182 %P 145-161.e23 %8 2020 07 09 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32553272?dopt=Abstract %R 10.1016/j.cell.2020.05.021 %0 Journal Article %J Nature %D 2020 %T Mapping and characterization of structural variation in 17,795 human genomes. %A Abel, Haley J %A Larson, David E %A Regier, Allison A %A Chiang, Colby %A Das, Indraniel %A Kanchi, Krishna L %A Layer, Ryan M %A Neale, Benjamin M %A Salerno, William J %A Reeves, Catherine %A Buyske, Steven %A Matise, Tara C %A Muzny, Donna M %A Zody, Michael C %A Lander, Eric S %A Dutcher, Susan K %A Stitziel, Nathan O %A Hall, Ira M %K Alleles %K Case-Control Studies %K Continental Population Groups %K Epigenesis, Genetic %K Female %K Gene Dosage %K Genetic Variation %K Genetics, Population %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Male %K Molecular Sequence Annotation %K Quantitative Trait Loci %K Software %K Whole Genome Sequencing %X

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

%B Nature %V 583 %P 83-89 %8 2020 07 %G eng %N 7814 %1 https://www.ncbi.nlm.nih.gov/pubmed/32460305?dopt=Abstract %R 10.1038/s41586-020-2371-0 %0 Journal Article %J PLoS Genet %D 2020 %T A missense variant in Mitochondrial Amidoxime Reducing Component 1 gene and protection against liver disease. %A Emdin, Connor A %A Haas, Mary E %A Khera, Amit V %A Aragam, Krishna %A Chaffin, Mark %A Klarin, Derek %A Hindy, George %A Jiang, Lan %A Wei, Wei-Qi %A Feng, Qiping %A Karjalainen, Juha %A Havulinna, Aki %A Kiiskinen, Tuomo %A Bick, Alexander %A Ardissino, Diego %A Wilson, James G %A Schunkert, Heribert %A McPherson, Ruth %A Watkins, Hugh %A Elosua, Roberto %A Bown, Matthew J %A Samani, Nilesh J %A Baber, Usman %A Erdmann, Jeanette %A Gupta, Namrata %A Danesh, John %A Saleheen, Danish %A Chang, Kyong-Mi %A Vujkovic, Marijana %A Voight, Ben %A Damrauer, Scott %A Lynch, Julie %A Kaplan, David %A Serper, Marina %A Tsao, Philip %A Mercader, Josep %A Hanis, Craig %A Daly, Mark %A Denny, Joshua %A Gabriel, Stacey %A Kathiresan, Sekar %K Alleles %K Cholesterol, LDL %K Coronary Artery Disease %K Datasets as Topic %K Fatty Liver %K Female %K Genetic Predisposition to Disease %K Homozygote %K Humans %K Liver %K Liver Cirrhosis %K Liver Cirrhosis, Alcoholic %K Loss of Function Mutation %K Male %K Middle Aged %K Mitochondrial Proteins %K Mutation, Missense %K Oxidoreductases %X

Analyzing 12,361 all-cause cirrhosis cases and 790,095 controls from eight cohorts, we identify a common missense variant in the Mitochondrial Amidoxime Reducing Component 1 gene (MARC1 p.A165T) that associates with protection from all-cause cirrhosis (OR 0.91, p = 2.3 × 10^-11). This same variant also associates with lower levels of hepatic fat on computed tomographic imaging and lower odds of physician-diagnosed fatty liver, as well as lower blood levels of alanine transaminase (-0.025 SD, 3.7 × 10^-43), alkaline phosphatase (-0.025 SD, 1.2 × 10^-37), total cholesterol (-0.030 SD, p = 1.9 × 10^-36) and LDL cholesterol (-0.027 SD, p = 5.1 × 10^-30). We identified a series of additional MARC1 alleles (low-frequency missense p.M187K and rare protein-truncating p.R200Ter) that also associated with lower cholesterol levels, liver enzyme levels and reduced risk of cirrhosis (0 cirrhosis cases for 238 R200Ter carriers versus 17,046 cases of cirrhosis among 759,027 non-carriers, p = 0.04), suggesting that deficiency of the MARC1 enzyme may lower blood cholesterol levels and protect against cirrhosis.

%B PLoS Genet %V 16 %P e1008629 %8 2020 04 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/32282858?dopt=Abstract %R 10.1371/journal.pgen.1008629 %0 Journal Article %J Mov Disord %D 2020 %T The Parkinson's Disease Genome-Wide Association Study Locus Browser. %A Grenn, Francis P %A Kim, Jonggeol J %A Makarious, Mary B %A Iwaki, Hirotaka %A Illarionova, Anastasia %A Brolin, Kajsa %A Kluss, Jillian H %A Schumacher-Schuh, Artur F %A Leonard, Hampton %A Faghri, Faraz %A Billingsley, Kimberley %A Krohn, Lynne %A Hall, Ashley %A Diez-Fairen, Monica %A Periñán, Maria Teresa %A Foo, Jia Nee %A Sandor, Cynthia %A Webber, Caleb %A Fiske, Brian K %A Gibbs, J Raphael %A Nalls, Mike A %A Singleton, Andrew B %A Bandres-Ciga, Sara %A Reed, Xylena %A Blauwendraat, Cornelis %K Age of Onset %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Humans %K Neurodegenerative Diseases %K Parkinson Disease %K Risk Factors %X

BACKGROUND: Parkinson's disease (PD) is a neurodegenerative disease with an often complex component identifiable by genome-wide association studies. The most recent large-scale PD genome-wide association studies have identified more than 90 independent risk variants for PD risk and progression across more than 80 genomic regions. One major challenge in current genomics is the identification of the causal gene(s) and variant(s) at each genome-wide association study locus. The objective of the current study was to create a tool that would display data for relevant PD risk loci and provide guidance with the prioritization of causal genes and potential mechanisms at each locus.

METHODS: We included all significant genome-wide signals from multiple recent PD genome-wide association studies including the most recent PD risk genome-wide association study, age-at-onset genome-wide association study, progression genome-wide association study, and Asian population PD risk genome-wide association study. We gathered data for all genes 1 Mb up and downstream of each variant to allow users to assess which gene(s) are most associated with the variant of interest based on a set of self-ranked criteria. Multiple databases were queried for each gene to collect additional causal data.

RESULTS: We created a PD genome-wide association study browser tool (https://pdgenetics.shinyapps.io/GWASBrowser/) to assist the PD research community with the prioritization of genes for follow-up functional studies to identify potential therapeutic targets.

CONCLUSIONS: Our PD genome-wide association study browser tool provides users with a useful method of identifying potential causal genes at all known PD risk loci from large-scale PD genome-wide association studies. We plan to update this tool with new relevant data as sample sizes increase and new PD risk loci are discovered. © 2020 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society. This article has been contributed to by US Government employees and their work is in the public domain in the USA.

%B Mov Disord %V 35 %P 2056-2067 %8 2020 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/32864809?dopt=Abstract %R 10.1002/mds.28197 %0 Journal Article %J Gigascience %D 2020 %T Parliament2: Accurate structural variant calling at scale. %A Zarate, Samantha %A Carroll, Andrew %A Mahmoud, Medhat %A Krasheninina, Olga %A Jun, Goo %A Salerno, William J %A Schatz, Michael C %A Boerwinkle, Eric %A Gibbs, Richard A %A Sedlazeck, Fritz J %X

BACKGROUND: Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current standard approach requires the use of short paired-end reads, from which SVs remain challenging to detect, especially at the scale of hundreds to thousands of samples.

FINDINGS: We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured across the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in <1 day on GRCh38. Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2), and a Docker image, as well as a WDL implementation, is available.

CONCLUSION: Parliament2 provides both a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application over cloud or cluster environments, processing thousands of samples.

%B Gigascience %V 9 %8 2020 12 21 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/33347570?dopt=Abstract %R 10.1093/gigascience/giaa145 %0 Journal Article %J Nat Commun %D 2020 %T Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. %A Fahed, Akl C %A Wang, Minxian %A Homburger, Julian R %A Patel, Aniruddh P %A Bick, Alexander G %A Neben, Cynthia L %A Lai, Carmen %A Brockman, Deanna %A Philippakis, Anthony %A Ellinor, Patrick T %A Cassa, Christopher A %A Lebo, Matthew %A Ng, Kenney %A Lander, Eric S %A Zhou, Alicia Y %A Kathiresan, Sekar %A Khera, Amit V %K Aged %K Breast Neoplasms %K Case-Control Studies %K Colorectal Neoplasms %K Coronary Artery Disease %K Female %K Genetic Predisposition to Disease %K Genome, Human %K Humans %K Male %K Middle Aged %K Multifactorial Inheritance %K Odds Ratio %K Penetrance %K Risk Factors %X

Genetic variation can predispose to disease both through (i) monogenic risk variants that disrupt a physiologic pathway with large effect on disease and (ii) polygenic risk that involves many variants of small effect in different pathways. Few studies have explored the interplay between monogenic and polygenic risk. Here, we study 80,928 individuals to examine whether polygenic background can modify penetrance of disease in tier 1 genomic conditions - familial hypercholesterolemia, hereditary breast and ovarian cancer, and Lynch syndrome. Among carriers of a monogenic risk variant, we estimate substantial gradients in disease risk based on polygenic background - the probability of disease by age 75 years ranged from 17% to 78% for coronary artery disease, 13% to 76% for breast cancer, and 11% to 80% for colon cancer. We propose that accounting for polygenic background is likely to increase accuracy of risk estimation for individuals who inherit a monogenic risk variant.

%B Nat Commun %V 11 %P 3635 %8 2020 08 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32820175?dopt=Abstract %R 10.1038/s41467-020-17374-3 %0 Journal Article %J N Engl J Med %D 2020 %T RNA Identification of PRIME Cells Predicting Rheumatoid Arthritis Flares. %A Orange, Dana E %A Yao, Vicky %A Sawicka, Kirsty %A Fak, John %A Frank, Mayu O %A Parveen, Salina %A Blachère, Nathalie E %A Hale, Caryn %A Zhang, Fan %A Raychaudhuri, Soumya %A Troyanskaya, Olga G %A Darnell, Robert B %K Adult %K Arthritis, Rheumatoid %K B-Lymphocytes %K Female %K Fibroblasts %K Flow Cytometry %K Gene Expression %K Humans %K Male %K Mesenchymal Stem Cells %K Middle Aged %K Patient Acuity %K Sequence Analysis, RNA %K Surveys and Questionnaires %K Symptom Flare Up %K Synovial Fluid %X

BACKGROUND: Rheumatoid arthritis, like many inflammatory diseases, is characterized by episodes of quiescence and exacerbation (flares). The molecular events leading to flares are unknown.

METHODS: We established a clinical and technical protocol for repeated home collection of blood in patients with rheumatoid arthritis to allow for longitudinal RNA sequencing (RNA-seq). Specimens were obtained from 364 time points during eight flares over a period of 4 years in our index patient, as well as from 235 time points during flares in three additional patients. We identified transcripts that were differentially expressed before flares and compared these with data from synovial single-cell RNA-seq. Flow cytometry and sorted-blood-cell RNA-seq in additional patients were used to validate the findings.

RESULTS: Consistent changes were observed in blood transcriptional profiles 1 to 2 weeks before a rheumatoid arthritis flare. B-cell activation was followed by expansion of circulating CD45-CD31-PDPN+ preinflammatory mesenchymal, or PRIME, cells in the blood from patients with rheumatoid arthritis; these cells shared features of inflammatory synovial fibroblasts. Levels of circulating PRIME cells decreased during flares in all 4 patients, and flow cytometry and sorted-cell RNA-seq confirmed the presence of PRIME cells in 19 additional patients with rheumatoid arthritis.

CONCLUSIONS: Longitudinal genomic analysis of rheumatoid arthritis flares revealed PRIME cells in the blood during the period before a flare and suggested a model in which these cells become activated by B cells in the weeks before a flare and subsequently migrate out of the blood into the synovium. (Funded by the National Institutes of Health and others.).

%B N Engl J Med %V 383 %P 218-228 %8 2020 07 16 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/32668112?dopt=Abstract %R 10.1056/NEJMoa2004114 %0 Journal Article %J Am J Clin Nutr %D 2020 %T Serum sphingolipids and incident diabetes in a US population with high diabetes burden: the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). %A Chen, Guo-Chong %A Chai, Jin Choul %A Yu, Bing %A Michelotti, Gregory A %A Grove, Megan L %A Fretts, Amanda M %A Daviglus, Martha L %A Garcia-Bedoya, Olga L %A Thyagarajan, Bharat %A Schneiderman, Neil %A Cai, Jianwen %A Kaplan, Robert C %A Boerwinkle, Eric %A Qi, Qibin %K Adolescent %K Adult %K Aged %K Diabetes Mellitus %K Female %K Hispanic Americans %K Humans %K Male %K Middle Aged %K Prospective Studies %K Risk Factors %K Sphingolipids %K United States %K Young Adult %X

BACKGROUND: Genetic or pharmacological inhibition of de novo sphingolipid synthases prevented diabetes in animal studies.

OBJECTIVES: We sought to evaluate prospective associations of serum sphingolipids with incident diabetes in a population-based cohort.

METHODS: We included 2010 participants of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) aged 18-74 y who were free of diabetes and other major chronic diseases at baseline (2008-2011). Metabolomic profiling of fasting serum was performed using a global, untargeted approach. A total of 43 sphingolipids were quantified and, considering subclasses and chemical structures of individual species, 6 sphingolipid scores were constructed. Diabetes status was assessed using standard procedures including blood tests. Multivariable survey Poisson regressions were applied to estimate RR and 95% CI of incident diabetes associated with individual sphingolipids or sphingolipid scores.

RESULTS: There were 224 incident cases of diabetes identified during, on average, 6 y of follow-up. After adjustment for socioeconomic and lifestyle factors, a ceramide score (RR Q4 versus Q1 = 2.40; 95% CI: 1.24, 4.65; P-trend = 0.003) and a score of sphingomyelins with fully saturated sphingoid-fatty acid pairs (RR Q4 versus Q1 = 3.15; 95% CI: 1.75, 5.67; P-trend <0.001) both were positively associated with risk of diabetes, whereas scores of glycosylceramides, lactosylceramides, or other unsaturated sphingomyelins (even if having an SFA base) were not associated with risk of diabetes. After additional adjustment for numerous traditional risk factors (especially triglycerides), both associations were attenuated and only the saturated-sphingomyelin score remained associated with risk of diabetes (RR Q4 versus Q1 = 1.98; 95% CI: 1.09, 3.59; P-trend = 0.031).

CONCLUSIONS: Our findings suggest that a cluster of saturated sphingomyelins may be associated with elevated risk of diabetes beyond traditional risk factors, which needs to be verified in other population studies. This study was registered at clinicaltrials.gov as NCT02060344.

%B Am J Clin Nutr %V 112 %P 57-65 %8 2020 07 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32469399?dopt=Abstract %R 10.1093/ajcn/nqaa114 %0 Journal Article %J Genet Med %D 2020 %T Spinal muscular atrophy diagnosis and carrier screening from genome sequencing data. %A Chen, Xiao %A Sanchis-Juan, Alba %A French, Courtney E %A Connell, Andrew J %A Delon, Isabelle %A Kingsbury, Zoya %A Chawla, Aditi %A Halpern, Aaron L %A Taft, Ryan J %A Bentley, David R %A Butchbach, Matthew E R %A Raymond, F Lucy %A Eberle, Michael A %K Base Sequence %K Child %K Child, Preschool %K Humans %K Muscular Atrophy, Spinal %K Survival of Motor Neuron 1 Protein %X

PURPOSE: Spinal muscular atrophy (SMA), caused by loss of the SMN1 gene, is a leading cause of early childhood death. Due to the near identical sequences of SMN1 and SMN2, analysis of this region is challenging. Population-wide SMA screening to quantify the SMN1 copy number (CN) is recommended by the American College of Medical Genetics and Genomics.

METHODS: We developed a method that accurately identifies the CN of SMN1 and SMN2 using genome sequencing (GS) data by analyzing read depth and eight informative reference genome differences between SMN1/2.

RESULTS: We characterized SMN1/2 in 12,747 genomes, identified 1568 samples with SMN1 gains or losses and 6615 samples with SMN2 gains or losses, and calculated a pan-ethnic carrier frequency of 2%, consistent with previous studies. Additionally, 99.8% of our SMN1 and 99.7% of SMN2 CN calls agreed with orthogonal methods, with a recall of 100% for SMA and 97.8% for carriers, and a precision of 100% for both SMA and carriers.

CONCLUSION: This SMN copy-number caller can be used to identify both carrier and affected status of SMA, enabling SMA testing to be offered as a comprehensive test in neonatal care and an accurate carrier screening tool in GS projects.

%B Genet Med %V 22 %P 945-953 %8 2020 05 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32066871?dopt=Abstract %R 10.1038/s41436-020-0754-0 %0 Journal Article %J Nature %D 2020 %T A structural variation reference for medical and population genetics. %A Collins, Ryan L %A Brand, Harrison %A Karczewski, Konrad J %A Zhao, Xuefang %A Alföldi, Jessica %A Francioli, Laurent C %A Khera, Amit V %A Lowther, Chelsea %A Gauthier, Laura D %A Wang, Harold %A Watts, Nicholas A %A Solomonson, Matthew %A O'Donnell-Luria, Anne %A Baumann, Alexander %A Munshi, Ruchi %A Walker, Mark %A Whelan, Christopher W %A Huang, Yongqing %A Brookings, Ted %A Sharpe, Ted %A Stone, Matthew R %A Valkanas, Elise %A Fu, Jack %A Tiao, Grace %A Laricchia, Kristen M %A Ruano-Rubio, Valentin %A Stevens, Christine %A Gupta, Namrata %A Cusick, Caroline %A Margolin, Lauren %A Taylor, Kent D %A Lin, Henry J %A Rich, Stephen S %A Post, Wendy S %A Chen, Yii-Der Ida %A Rotter, Jerome I %A Nusbaum, Chad %A Philippakis, Anthony %A Lander, Eric %A Gabriel, Stacey %A Neale, Benjamin M %A Kathiresan, Sekar %A Daly, Mark J %A Banks, Eric %A MacArthur, Daniel G %A Talkowski, Michael E %K Continental Population Groups %K Disease %K Female %K Genetic Testing %K Genetic Variation %K Genetics, Medical %K Genetics, Population %K Genome, Human %K Genotyping Techniques %K Humans %K Male %K Middle Aged %K Mutation %K Polymorphism, Single Nucleotide %K Reference Standards %K Selection, Genetic %K Whole Genome Sequencing %X

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

%B Nature %V 581 %P 444-451 %8 2020 05 %G eng %N 7809 %1 https://www.ncbi.nlm.nih.gov/pubmed/32461652?dopt=Abstract %R 10.1038/s41586-020-2287-8 %0 Journal Article %J J Am Coll Cardiol %D 2020 %T Titin Truncating Variants in Adults Without Known Congestive Heart Failure. %A Pirruccello, James P %A Bick, Alexander %A Chaffin, Mark %A Aragam, Krishna G %A Choi, Seung Hoan %A Lubitz, Steven A %A Ho, Carolyn Y %A Ng, Kenney %A Philippakis, Anthony %A Ellinor, Patrick T %A Kathiresan, Sekar %A Khera, Amit V %K Adult %K Aged %K Asymptomatic Diseases %K Connectin %K Female %K Genetic Variation %K Heart Failure %K Humans %K Male %K Middle Aged %B J Am Coll Cardiol %V 75 %P 1239-1241 %8 2020 03 17 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/32164899?dopt=Abstract %R 10.1016/j.jacc.2020.01.013 %0 Journal Article %J Science %D 2020 %T Transcriptomic signatures across human tissues identify functional rare genetic variation. %A Ferraro, Nicole M %A Strober, Benjamin J %A Einson, Jonah %A Abell, Nathan S %A Aguet, François %A Barbeira, Alvaro N %A Brandt, Margot %A Bucan, Maja %A Castel, Stephane E %A Davis, Joe R %A Greenwald, Emily %A Hess, Gaelen T %A Hilliard, Austin T %A Kember, Rachel L %A Kotis, Bence %A Park, YoSon %A Peloso, Gina %A Ramdas, Shweta %A Scott, Alexandra J %A Smail, Craig %A Tsang, Emily K %A Zekavat, Seyedeh M %A Ziosi, Marcello %A Ardlie, Kristin G %A Assimes, Themistocles L %A Bassik, Michael C %A Brown, Christopher D %A Correa, Adolfo %A Hall, Ira %A Im, Hae Kyung %A Li, Xin %A Natarajan, Pradeep %A Lappalainen, Tuuli %A Mohammadi, Pejman %A Montgomery, Stephen B %A Battle, Alexis %K Genetic Variation %K Genome, Human %K Humans %K Multifactorial Inheritance %K Organ Specificity %K Transcriptome %X

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.

%B Science %V 369 %8 2020 09 11 %G eng %N 6509 %1 https://www.ncbi.nlm.nih.gov/pubmed/32913073?dopt=Abstract %R 10.1126/science.aaz5900 %0 Journal Article %J Nat Commun %D 2020 %T Type 2 and interferon inflammation regulate SARS-CoV-2 entry factor expression in the airway epithelium. %A Sajuthi, Satria P %A DeFord, Peter %A Li, Yingchun %A Jackson, Nathan D %A Montgomery, Michael T %A Everman, Jamie L %A Rios, Cydney L %A Pruesse, Elmar %A Nolin, James D %A Plender, Elizabeth G %A Wechsler, Michael E %A Mak, Angel C Y %A Eng, Celeste %A Salazar, Sandra %A Medina, Vivian %A Wohlford, Eric M %A Huntsman, Scott %A Nickerson, Deborah A %A Germer, Soren %A Zody, Michael C %A Abecasis, Gonçalo %A Kang, Hyun Min %A Rice, Kenneth M %A Kumar, Rajesh %A Oh, Sam %A Rodriguez-Santana, Jose %A Burchard, Esteban G %A Seibold, Max A %K Angiotensin-Converting Enzyme 2 %K Betacoronavirus %K Child %K Coronavirus Infections %K COVID-19 %K Epithelial Cells %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Variation %K Host-Pathogen Interactions %K Humans %K Inflammation %K Interferons %K Interleukin-13 %K Middle Aged %K Nasal Mucosa %K Pandemics %K Peptidyl-Dipeptidase A %K Pneumonia, Viral %K SARS-CoV-2 %K Serine Endopeptidases %K Virus Internalization %X

Coronavirus disease 2019 (COVID-19) is caused by SARS-CoV-2, an emerging virus that utilizes host proteins ACE2 and TMPRSS2 as entry factors. Understanding the factors affecting the pattern and levels of expression of these genes is important for deeper understanding of SARS-CoV-2 tropism and pathogenesis. Here we explore the role of genetics and co-expression networks in regulating these genes in the airway, through the analysis of nasal airway transcriptome data from 695 children. We identify expression quantitative trait loci for both ACE2 and TMPRSS2, that vary in frequency across world populations. We find TMPRSS2 is part of a mucus secretory network, highly upregulated by type 2 (T2) inflammation through the action of interleukin-13, and that the interferon response to respiratory viruses highly upregulates ACE2 expression. IL-13 and virus infection mediated effects on ACE2 expression were also observed at the protein level in the airway epithelium. Finally, we define airway responses to common coronavirus infections in children, finding that these infections generate host responses similar to other viral species, including upregulation of IL6 and ACE2. Our results reveal possible mechanisms influencing SARS-CoV-2 infectivity and COVID-19 clinical outcomes.

%B Nat Commun %V 11 %P 5139 %8 2020 10 12 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33046696?dopt=Abstract %R 10.1038/s41467-020-18781-2 %0 Journal Article %J bioRxiv %D 2020 %T Type 2 and interferon inflammation strongly regulate SARS-CoV-2 related gene expression in the airway epithelium. %A Sajuthi, Satria P %A DeFord, Peter %A Jackson, Nathan D %A Montgomery, Michael T %A Everman, Jamie L %A Rios, Cydney L %A Pruesse, Elmar %A Nolin, James D %A Plender, Elizabeth G %A Wechsler, Michael E %A Mak, Angel Cy %A Eng, Celeste %A Salazar, Sandra %A Medina, Vivian %A Wohlford, Eric M %A Huntsman, Scott %A Nickerson, Deborah A %A Germer, Soren %A Zody, Michael C %A Abecasis, Gonçalo %A Kang, Hyun Min %A Rice, Kenneth M %A Kumar, Rajesh %A Oh, Sam %A Rodriguez-Santana, Jose %A Burchard, Esteban G %A Seibold, Max A %X

Coronavirus disease 2019 (COVID-19) outcomes vary from asymptomatic infection to death. This disparity may reflect different airway levels of the SARS-CoV-2 receptor, ACE2, and the spike protein activator, TMPRSS2. Here we explore the role of genetics and co-expression networks in regulating these genes in the airway, through the analysis of nasal airway transcriptome data from 695 children. We identify expression quantitative trait loci (eQTL) for both ACE2 and TMPRSS2 that vary in frequency across world populations. Importantly, we find TMPRSS2 is part of a mucus secretory network, highly upregulated by T2 inflammation through the action of interleukin-13, and that interferon response to respiratory viruses highly upregulates ACE2 expression. Finally, we define airway responses to coronavirus infections in children, finding that these infections upregulate while also stimulating a more pronounced cytotoxic immune response relative to other respiratory viruses. Our results reveal mechanisms likely influencing SARS-CoV-2 infectivity and COVID-19 clinical outcomes.

%B bioRxiv %8 2020 Apr 10 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/32511326?dopt=Abstract %R 10.1101/2020.04.09.034454 %0 Journal Article %J J Am Coll Cardiol %D 2020 %T Validation of a Genome-Wide Polygenic Score for Coronary Artery Disease in South Asians. %A Wang, Minxian %A Menon, Ramesh %A Mishra, Sanghamitra %A Patel, Aniruddh P %A Chaffin, Mark %A Tanneeru, Deepak %A Deshmukh, Manjari %A Mathew, Oshin %A Apte, Sanika %A Devanboo, Christina S %A Sundaram, Sumathi %A Lakshmipathy, Praveena %A Murugan, Sakthivel %A Sharma, Krishna Kumar %A Rajendran, Karthikeyan %A Santhosh, Sam %A Thachathodiyl, Rajesh %A Ahamed, Hisham %A Balegadde, Aniketh Vijay %A Alexander, Thomas %A Swaminathan, Krishnan %A Gupta, Rajeev %A Mullasari, Ajit S %A Sigamani, Alben %A Kanchi, Muralidhar %A Peterson, Andrew S %A Butterworth, Adam S %A Danesh, John %A Di Angelantonio, Emanuele %A Naheed, Aliya %A Inouye, Michael %A Chowdhury, Rajiv %A Vedam, Ramprasad L %A Kathiresan, Sekar %A Gupta, Ravi %A Khera, Amit V %K Adult %K Aged %K Bangladesh %K Case-Control Studies %K Coronary Artery Disease %K Female %K Genome-Wide Association Study %K Humans %K India %K Male %K Middle Aged %K Multifactorial Inheritance %X

BACKGROUND: Genome-wide polygenic scores (GPS) integrate information from many common DNA variants into a single number. Because rates of coronary artery disease (CAD) are substantially higher among South Asians, a GPS to identify high-risk individuals may be particularly useful in this population.

OBJECTIVES: This analysis used summary statistics from a prior genome-wide association study to derive a new GPS for South Asians.

METHODS: This GPS was validated in 7,244 South Asian UK Biobank participants and tested in 491 individuals from a case-control study in Bangladesh. Next, a static ancestry and GPS reference distribution was built using whole-genome sequencing from 1,522 Indian individuals, and a framework was tested for projecting individuals onto this static ancestry and GPS reference distribution using 1,800 CAD cases and 1,163 control subjects newly recruited in India.

RESULTS: The GPS, containing 6,630,150 common DNA variants, had an odds ratio (OR) per SD of 1.58 in South Asian UK Biobank participants and 1.60 in the Bangladeshi study (p < 0.001 for each). Next, individuals of the Indian case-control study were projected onto static reference distributions, observing an OR/SD of 1.66 (p < 0.001). Compared with the middle quintile, risk for CAD was most pronounced for those in the top 5% of the GPS distribution: ORs of 4.16, 2.46, and 3.22 in the South Asian UK Biobank, Bangladeshi, and Indian studies, respectively (p < 0.05 for each).

CONCLUSIONS: The new GPS has been developed and tested using 3 distinct South Asian studies, and provides a generalizable framework for ancestry-specific GPS assessment.

%B J Am Coll Cardiol %V 76 %P 703-714 %8 2020 08 11 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/32762905?dopt=Abstract %R 10.1016/j.jacc.2020.06.024 %0 Journal Article %J Elife %D 2020 %T A variant-centric perspective on geographic patterns of human allele frequency variation. %A Biddanda, Arjun %A Rice, Daniel P %A Novembre, John %K Gene Frequency %K Genetic Variation %K Genetics, Population %K Geography %K Humans %X

A key challenge in human genetics is to understand the geographic distribution of human genetic variation. Often genetic variation is described by showing relationships among populations or individuals, drawing inferences over many variants. Here, we introduce an alternative representation of genetic variation that reveals the relative abundance of different allele frequency patterns. This approach allows viewers to easily see several features of human genetic structure: (1) most variants are rare and geographically localized, (2) variants that are common in a single geographic region are more likely to be shared across the globe than to be private to that region, and (3) where two individuals differ, it is most often due to variants that are found globally, regardless of whether the individuals are from the same region or different regions. Our variant-centric visualization clarifies the geographic patterns of human variation and can help address misconceptions about genetic differentiation among populations.

%B Elife %V 9 %8 2020 12 22 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/33350384?dopt=Abstract %R 10.7554/eLife.60107 %0 Journal Article %J Clin Genet %D 2020 %T Whole-exome sequencing in adult patients with developmental and epileptic encephalopathy: It is never too late. %A Minardi, Raffaella %A Licchetta, Laura %A Baroni, Maria Chiara %A Pippucci, Tommaso %A Stipa, Carlotta %A Mostacci, Barbara %A Severi, Giulia %A Toni, Francesco %A Bergonzini, Luca %A Carelli, Valerio %A Seri, Marco %A Tinuper, Paolo %A Bisulli, Francesca %X

Developmental and epileptic encephalopathies (DEE) encompass rare, sporadic neurodevelopmental disorders, usually with pediatric onset. As these conditions are characterized by marked clinical and genetic heterogeneity, whole-exome sequencing (WES) represents the strategy of choice for the molecular diagnosis. While its usefulness is well established in pediatric DEE cohorts, our study is aimed at assessing the feasibility of WES in adult DEE patients who experienced a diagnostic odyssey prior to the advent of this technique. We analyzed exomes from 71 unrelated adult DEE patients, consecutively recruited from an Italian cohort for the EPI25 Project. All patients underwent accurate clinical and electrophysiological characterization. An overwhelming percentage (90.1%) had already undergone negative genetic testing. Variants were classified according to the American College of Medical Genetics and Genomics guidelines. WES disclosed 24 (likely) pathogenic variants among 18 patients in epilepsy-related genes with either autosomal dominant, recessive or X-linked inheritance. Ten of these were novel. We obtained a diagnostic yield of 25.3%, higher among patients with brain malformations, early-onset epilepsy and dysmorphisms. Despite a median diagnostic delay of 38.7 years, WES analysis provided the long-awaited diagnosis for 18 adult patients, which also had an impact on the clinical management of 50% of them.

%B Clin Genet %V 98 %P 477-485 %8 2020 11 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/32725632?dopt=Abstract %R 10.1111/cge.13823 %0 Journal Article %J Am J Hum Genet %D 2019 %T Aberrant Function of the C-Terminal Tail of HIST1H1E Accelerates Cellular Senescence and Causes Premature Aging. %A Flex, Elisabetta %A Martinelli, Simone %A Van Dijck, Anke %A Ciolfi, Andrea %A Cecchetti, Serena %A Coluzzi, Elisa %A Pannone, Luca %A Andreoli, Cristina %A Radio, Francesca Clementina %A Pizzi, Simone %A Carpentieri, Giovanna %A Bruselles, Alessandro %A Catanzaro, Giuseppina %A Pedace, Lucia %A Miele, Evelina %A Carcarino, Elena %A Ge, Xiaoyan %A Chijiwa, Chieko %A Lewis, M E Suzanne %A Meuwissen, Marije %A Kenis, Sandra %A Van der Aa, Nathalie %A Larson, Austin %A Brown, Kathleen %A Wasserstein, Melissa P %A Skotko, Brian G %A Begtrup, Amber %A Person, Richard %A Karayiorgou, Maria %A Roos, J Louw %A Van Gassen, Koen L %A Koopmans, Marije %A Bijlsma, Emilia K %A Santen, Gijs W E %A Barge-Schaapveld, Daniela Q C M %A Ruivenkamp, Claudia A L %A Hoffer, Mariette J V %A Lalani, Seema R %A Streff, Haley %A Craigen, William J %A Graham, Brett H %A van den Elzen, Annette P M %A Kamphuis, Daan J %A Õunap, Katrin %A Reinson, Karit %A Pajusalu, Sander %A Wojcik, Monica H %A Viberti, Clara %A Di Gaetano, Cornelia %A Bertini, Enrico %A Petrucci, Simona %A De Luca, Alessandro %A Rota, Rossella %A Ferretti, Elisabetta %A Matullo, Giuseppe %A Dallapiccola, Bruno %A Sgura, Antonella %A Walkiewicz, Magdalena %A Kooy, R Frank %A Tartaglia, Marco %X

Histones mediate dynamic packaging of nuclear DNA in chromatin, a process that is precisely controlled to guarantee efficient compaction of the genome and proper chromosomal segregation during cell division and to accomplish DNA replication, transcription, and repair. Due to the important structural and regulatory roles played by histones, it is not surprising that histone functional dysregulation or aberrant levels of histones can have severe consequences for multiple cellular processes and ultimately might affect development or contribute to cell transformation. Recently, germline frameshift mutations involving the C-terminal tail of HIST1H1E, which is a widely expressed member of the linker histone family and facilitates higher-order chromatin folding, have been causally linked to an as-yet poorly defined syndrome that includes intellectual disability. We report that these mutations result in stable proteins that reside in the nucleus, bind to chromatin, disrupt proper compaction of DNA, and are associated with a specific methylation pattern. Cells expressing these mutant proteins have a dramatically reduced proliferation rate and competence, hardly enter into the S phase, and undergo accelerated senescence. Remarkably, clinical assessment of a relatively large cohort of subjects sharing these mutations revealed a premature aging phenotype as a previously unrecognized feature of the disorder. Our findings identify a direct link between aberrant chromatin remodeling, cellular senescence, and accelerated aging.

%B Am J Hum Genet %V 105 %P 493-508 %8 2019 Sep 05 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31447100?dopt=Abstract %R 10.1016/j.ajhg.2019.07.007 %0 Journal Article %J Am J Hum Genet %D 2019 %T ACAT: A Fast and Powerful p Value Combination Method for Rare-Variant Analysis in Sequencing Studies. %A Liu, Yaowu %A Chen, Sixing %A Li, Zilin %A Morrison, Alanna C %A Boerwinkle, Eric %A Lin, Xihong %X

Set-based analysis that jointly tests the association of variants in a group has emerged as a popular tool for analyzing rare and low-frequency variants in sequencing studies. The existing set-based tests can suffer significant power loss when only a small proportion of variants are causal, and their powers can be sensitive to the number, effect sizes, and effect directions of the causal variants and the choices of weights. Here we propose an aggregated Cauchy association test (ACAT), a general, powerful, and computationally efficient p value combination method for boosting power in sequencing studies. First, by combining variant-level p values, we use ACAT to construct a set-based test (ACAT-V) that is particularly powerful in the presence of only a small number of causal variants in a variant set. Second, by combining different variant-set-level p values, we use ACAT to construct an omnibus test (ACAT-O) that combines the strength of multiple complementary set-based tests, including the burden test, sequence kernel association test (SKAT), and ACAT-V. Through analysis of extensively simulated data and the whole-genome sequencing data from the Atherosclerosis Risk in Communities (ARIC) study, we demonstrate that ACAT-V complements the SKAT and the burden test, and that ACAT-O has a substantially more robust and higher power than those of the alternative tests.

%B Am J Hum Genet %V 104 %P 410-421 %8 2019 Mar 07 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/30849328?dopt=Abstract %R 10.1016/j.ajhg.2019.01.002 %0 Journal Article %J Genet Med %D 2019 %T Atlas-CNV: a validated approach to call single-exon CNVs in the eMERGESeq gene panel. %A Chiang, Theodore %A Liu, Xiuping %A Wu, Tsung-Jung %A Hu, Jianhong %A Sedlazeck, Fritz J %A White, Simon %A Schaid, Daniel %A Andrade, Mariza de %A Jarvik, Gail P %A Crosslin, David %A Stanaway, Ian %A Carrell, David S %A Connolly, John J %A Hakonarson, Hakon %A Groopman, Emily E %A Gharavi, Ali G %A Fedotov, Alexander %A Bi, Weimin %A Leduc, Magalie S %A Murdock, David R %A Jiang, Yunyun %A Meng, Linyan %A Eng, Christine M %A Wen, Shu %A Yang, Yaping %A Muzny, Donna M %A Boerwinkle, Eric %A Salerno, William %A Venner, Eric %A Gibbs, Richard A %X

PURPOSE: To provide a validated method to confidently identify exon-containing copy-number variants (CNVs), with a low false discovery rate (FDR), in targeted sequencing data from a clinical laboratory with particular focus on single-exon CNVs.

METHODS: DNA sequence coverage data are normalized within each sample and subsequently exonic CNVs are identified in a batch of samples, when the target log ratio of the sample to the batch median exceeds defined thresholds. The quality of exonic CNV calls is assessed by C-scores (Z-like scores) using thresholds derived from gold standard samples and simulation studies. We integrate an ExonQC threshold to lower FDR and compare performance with alternate software (VisCap).

RESULTS: Thirteen CNVs were used as a truth set to validate Atlas-CNV and compared with VisCap. We demonstrated FDR reduction in validation, simulation, and 10,926 eMERGESeq samples without sensitivity loss. Sixty-four multiexon and 29 single-exon CNVs with high C-scores were assessed by Multiplex Ligation-dependent Probe Amplification (MLPA).

CONCLUSION: Atlas-CNV is validated as a method to identify exonic CNVs in targeted sequencing data generated in the clinical laboratory. The ExonQC and C-score assignment can reduce FDR (identification of targets with high variance) and improve calling accuracy of single-exon CNVs respectively. We propose guidelines and criteria to identify high confidence single-exon CNVs.

%B Genet Med %V 21 %P 2135-2144 %8 2019 Sep %G eng %N 9 %1 https://www.ncbi.nlm.nih.gov/pubmed/30890783?dopt=Abstract %R 10.1038/s41436-019-0475-4 %0 Journal Article %J Am J Med Genet A %D 2019 %T Biallelic and De Novo Variants in DONSON Reveal a Clinical Spectrum of Cell Cycle-opathies with Microcephaly, Dwarfism and Skeletal Abnormalities. %A Karaca, Ender %A Posey, Jennifer E %A Bostwick, Bret %A Liu, Pengfei %A Gezdirici, Alper %A Yesil, Gozde %A Coban Akdemir, Zeynep %A Bayram, Yavuz %A Harms, Frederike L %A Meinecke, Peter %A Alawi, Malik %A Bacino, Carlos A %A Sutton, V Reid %A Kortüm, Fanny %A Lupski, James R %X

Co-occurrence of primordial dwarfism and microcephaly together with particular skeletal findings is seen in a wide range of Mendelian syndromes including microcephaly micromelia syndrome (MMS, OMIM 251230), microcephaly, short stature, and limb abnormalities (MISSLA, OMIM 617604), and microcephalic primordial dwarfisms (MPDs). Genes associated with these syndromes encode proteins that have crucial roles in DNA replication or in other critical steps of the cell cycle that link DNA replication to cell division. We identified four unrelated families with five affected individuals having biallelic or de novo variants in DONSON presenting with a core phenotype of severe short stature (z score < -3 SD), additional skeletal abnormalities, and microcephaly. Two apparently unrelated families with an identical homozygous c.631C>T p.(Arg211Cys) variant had clinical features typical of Meier-Gorlin syndrome (MGS), while two siblings with compound heterozygous c.346delG p.(Asp116Ile*62) and c.1349A>G p.(Lys450Arg) variants presented with a Seckel-like phenotype. We also identified a de novo c.683G>T p.(Trp228Leu) variant in DONSON in a patient with prominent micrognathia, short stature and hypoplastic femur and tibia, clinically diagnosed with Femoral-Facial syndrome (FFS, OMIM 134780). Biallelic variants in DONSON have been recently described in individuals with microcephalic dwarfism. These studies also demonstrated that DONSON has an essential conserved role in the cell cycle. Here we describe novel biallelic and de novo variants that are associated with MGS, Seckel-like phenotype and FFS, the last of which has not been associated with any disease gene to date.

%B Am J Med Genet A %V 179 %P 2056-2066 %8 2019 Oct %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/31407851?dopt=Abstract %R 10.1002/ajmg.a.61315 %0 Journal Article %J Am J Hum Genet %D 2019 %T Bi-allelic GOT2 Mutations Cause a Treatable Malate-Aspartate Shuttle-Related Encephalopathy. %A van Karnebeek, Clara D M %A Ramos, Rúben J %A Wen, Xiao-Yan %A Tarailo-Graovac, Maja %A Gleeson, Joseph G %A Skrypnyk, Cristina %A Brand-Arzamendi, Koroboshka %A Karbassi, Farhad %A Issa, Mahmoud Y %A van der Lee, Robin %A Drögemöller, Britt I %A Koster, Janet %A Rousseau, Justine %A Campeau, Philippe M %A Wang, Youdong %A Cao, Feng %A Li, Meng %A Ruiter, Jos %A Ciapaite, Jolita %A Kluijtmans, Leo A J %A Willemsen, Michel A A P %A Jans, Judith J %A Ross, Colin J %A Wintjes, Liesbeth T %A Rodenburg, Richard J %A Huigen, Marleen C D G %A Jia, Zhengping %A Waterham, Hans R %A Wasserman, Wyeth W %A Wanders, Ronald J A %A Verhoeven-Duif, Nanda M %A Zaki, Maha S %A Wevers, Ron A %X

Early-infantile encephalopathies with epilepsy are devastating conditions mandating an accurate diagnosis to guide proper management. Whole-exome sequencing was used to investigate the disease etiology in four children from independent families with intellectual disability and epilepsy, revealing bi-allelic GOT2 mutations. In-depth metabolic studies in individual 1 showed low plasma serine, hypercitrullinemia, hyperlactatemia, and hyperammonemia. The epilepsy was serine and pyridoxine responsive. Functional consequences of observed mutations were tested by measuring enzyme activity and by cell and animal models. Zebrafish and mouse models were used to validate brain developmental and functional defects and to test therapeutic strategies. GOT2 encodes the mitochondrial glutamate oxaloacetate transaminase. GOT2 enzyme activity was deficient in fibroblasts with bi-allelic mutations. GOT2, a member of the malate-aspartate shuttle, plays an essential role in the intracellular NAD(H) redox balance. De novo serine biosynthesis was impaired in fibroblasts with GOT2 mutations and GOT2-knockout HEK293 cells. Correcting the highly oxidized cytosolic NAD-redox state by pyruvate supplementation restored serine biosynthesis in GOT2-deficient cells. Knockdown of got2a in zebrafish resulted in a brain developmental defect associated with seizure-like electroencephalography spikes, which could be rescued by supplying pyridoxine in embryo water. Both pyridoxine and serine synergistically rescued embryonic developmental defects in zebrafish got2a morphants. The two treated individuals reacted favorably to their treatment. Our data provide a mechanistic basis for the biochemical abnormalities in GOT2 deficiency that may also hold for other MAS defects.

%B Am J Hum Genet %V 105 %P 534-548 %8 2019 Sep 05 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31422819?dopt=Abstract %R 10.1016/j.ajhg.2019.07.015 %0 Journal Article %J Eur J Hum Genet %D 2019 %T Genetic architecture of laterality defects revealed by whole exome sequencing. %A Li, Alexander H %A Hanchard, Neil A %A Azamian, Mahshid %A D'Alessandro, Lisa C A %A Coban-Akdemir, Zeynep %A Lopez, Keila N %A Hall, Nancy J %A Dickerson, Heather %A Nicosia, Annarita %A Fernbach, Susan %A Boone, Philip M %A Gambin, Tomaz %A Karaca, Ender %A Gu, Shen %A Yuan, Bo %A Jhangiani, Shalini N %A Doddapaneni, HarshaVardhan %A Hu, Jianhong %A Dinh, Huyen %A Jayaseelan, Joy %A Muzny, Donna %A Lalani, Seema %A Towbin, Jeffrey %A Penny, Daniel %A Fraser, Charles %A Martin, James %A Lupski, James R %A Gibbs, Richard A %A Boerwinkle, Eric %A Ware, Stephanie M %A Belmont, John W %X

Aberrant left-right patterning in the developing human embryo can lead to a broad spectrum of congenital malformations. The causes of most laterality defects are not known, with variants in established genes accounting for <20% of cases. We sought to characterize the genetic spectrum of these conditions by performing whole-exome sequencing of 323 unrelated laterality cases. We investigated the role of rare, predicted-damaging variation in 1726 putative laterality candidate genes derived from model organisms, pathway analyses, and human phenotypes. We also evaluated the contribution of homo/hemizygous exon deletions and gene-based burden of rare variation. A total of 28 candidate variants (26 rare predicted-damaging variants and 2 hemizygous deletions) were identified, including variants in genes known to cause heterotaxy and primary ciliary dyskinesia (ACVR2B, NODAL, ZIC3, DNAI1, DNAH5, HYDIN, MMP21), and genes without a human phenotype association, but with prior evidence for a role in embryonic laterality or cardiac development. Sanger validation of the latter variants in probands and their parents revealed no de novo variants, but apparent transmitted heterozygous (ROCK2, ISL1, SMAD2), and hemizygous (RAI2, RIPPLY1) variant patterns. Collectively, these variants account for 7.1% of our study subjects. We also observe evidence for an excess burden of rare, predicted loss-of-function variation in PXDNL and BMS1, two genes relevant to the broader laterality phenotype. These findings highlight potential new genes in the development of laterality defects, and suggest extensive locus heterogeneity and complex genetic models in this class of birth defects.

%B Eur J Hum Genet %V 27 %P 563-573 %8 2019 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/30622330?dopt=Abstract %R 10.1038/s41431-018-0307-z %0 Journal Article %J Science %D 2019 %T Genetic regulatory variation in populations informs transcriptome analysis in rare disease. %A Mohammadi, Pejman %A Castel, Stephane E %A Cummings, Beryl B %A Einson, Jonah %A Sousa, Christina %A Hoffman, Paul %A Donkervoort, Sandra %A Jiang, Zhuoxun %A Mohassel, Payam %A Foley, A Reghan %A Wheeler, Heather E %A Im, Hae Kyung %A Bonnemann, Carsten G %A MacArthur, Daniel G %A Lappalainen, Tuuli %X

Transcriptome data can facilitate the interpretation of the effects of rare genetic variants. Here, we introduce ANEVA (analysis of expression variation) to quantify genetic variation in gene dosage from allelic expression (AE) data in a population. Application of ANEVA to the Genotype-Tissue Expression (GTEx) data showed that this variance estimate is robust and correlated with selective constraint in a gene. Using these variance estimates in a dosage outlier test (ANEVA-DOT) applied to AE data from 70 Mendelian muscular disease patients showed accuracy in detecting genes with pathogenic variants in previously resolved cases and led to one confirmed and several potential new diagnoses. Using our reference estimates from GTEx data, ANEVA-DOT can be incorporated in rare disease diagnostic pipelines to use RNA-sequencing data more effectively.

%B Science %V 366 %P 351-356 %8 2019 10 18 %G eng %N 6463 %1 https://www.ncbi.nlm.nih.gov/pubmed/31601707?dopt=Abstract %R 10.1126/science.aay0256 %0 Journal Article %J Genet Med %D 2019 %T Genome sequencing identifies multiple deleterious variants in autism patients with more severe phenotypes. %A Guo, Hui %A Duyzend, Michael H %A Coe, Bradley P %A Baker, Carl %A Hoekzema, Kendra %A Gerdts, Jennifer %A Turner, Tychele N %A Zody, Michael C %A Beighley, Jennifer S %A Murali, Shwetha C %A Nelson, Bradley J %A Bamshad, Michael J %A Nickerson, Deborah A %A Bernier, Raphael A %A Eichler, Evan E %X

PURPOSE: To maximize the discovery of potentially pathogenic variants to better understand the diagnostic utility of genome sequencing (GS) and to assess how the presence of multiple risk events might affect the phenotypic severity in autism spectrum disorders (ASD).

METHODS: GS was applied to 180 simplex and multiplex ASD families (578 individuals, 213 patients) with exome sequencing and array comparative genomic hybridization further applied to a subset for validation and cross-platform comparisons.

RESULTS: We found that 40.8% of patients carried variants with evidence of disease risk, including a de novo frameshift variant in NR4A2 and two de novo missense variants in SYNCRIP, while 21.1% carried clinically relevant pathogenic or likely pathogenic variants. Patients with more than one risk variant (9.9%) were more severely affected with respect to cognitive ability compared with patients with a single or no-risk variant. We observed no instance among the 27 multiplex families where a pathogenic or likely pathogenic variant was transmitted to all affected members in the family.

CONCLUSION: The study demonstrates the diagnostic utility of GS, especially for multiple risk variants that contribute to the phenotypic severity, shows the genetic heterogeneity in multiplex families, and provides evidence for new genes for follow up.

%B Genet Med %V 21 %P 1611-1620 %8 2019 Jul %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/30504930?dopt=Abstract %R 10.1038/s41436-018-0380-2 %0 Journal Article %J Cell %D 2019 %T Genomic Analysis in the Age of Human Genome Sequencing. %A Lappalainen, Tuuli %A Scott, Alexandra J %A Brandt, Margot %A Hall, Ira M %X

Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands, and soon millions, of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.

%B Cell %V 177 %P 70-84 %8 2019 Mar 21 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30901550?dopt=Abstract %R 10.1016/j.cell.2019.02.032 %0 Journal Article %J Genet Med %D 2019 %T Insights into genetics, human biology and disease gleaned from family based genomic studies. %A Posey, Jennifer E %A O'Donnell-Luria, Anne H %A Chong, Jessica X %A Harel, Tamar %A Jhangiani, Shalini N %A Coban Akdemir, Zeynep H %A Buyske, Steven %A Pehlivan, Davut %A Carvalho, Claudia M B %A Baxter, Samantha %A Sobreira, Nara %A Liu, Pengfei %A Wu, Nan %A Rosenfeld, Jill A %A Kumar, Sushant %A Avramopoulos, Dimitri %A White, Janson J %A Doheny, Kimberly F %A Witmer, P Dane %A Boehm, Corinne %A Sutton, V Reid %A Muzny, Donna M %A Boerwinkle, Eric %A Günel, Murat %A Nickerson, Deborah A %A Mane, Shrikant %A MacArthur, Daniel G %A Gibbs, Richard A %A Hamosh, Ada %A Lifton, Richard P %A Matise, Tara C %A Rehm, Heidi L %A Gerstein, Mark %A Bamshad, Michael J %A Valle, David %A Lupski, James R %X

Identifying genes and variants contributing to rare disease phenotypes and Mendelian conditions informs biology and medicine, yet potential phenotypic consequences for variation of >75% of the ~20,000 annotated genes in the human genome are lacking. Technical advances to assess rare variation genome-wide, particularly exome sequencing (ES), enabled establishment in the United States of the National Institutes of Health (NIH)-supported Centers for Mendelian Genomics (CMGs) and have facilitated collaborative studies resulting in novel "disease gene" discoveries. Pedigree-based genomic studies and rare variant analyses in families with suspected Mendelian conditions have led to the elucidation of hundreds of novel disease genes and highlighted the impact of de novo mutational events, somatic variation underlying nononcologic traits, incompletely penetrant alleles, phenotypes with high locus heterogeneity, and multilocus pathogenic variation. Herein, we highlight CMG collaborative discoveries that have contributed to understanding both rare and common diseases and discuss opportunities for future discovery in single-locus Mendelian disorder genomics. Phenotypic annotation of all human genes; development of bioinformatic tools and analytic methods; exploration of non-Mendelian modes of inheritance including reduced penetrance, multilocus variation, and oligogenic inheritance; construction of allelic series at a locus; enhanced data sharing worldwide; and integration with clinical genomics are explored. Realizing the full contribution of rare disease research to functional annotation of the human genome, and further illuminating human biology and health, will lay the foundation for the Precision Medicine Initiative.

%B Genet Med %V 21 %P 798-812 %8 2019 04 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/30655598?dopt=Abstract %R 10.1038/s41436-018-0408-7 %0 Journal Article %J Am J Hum Genet %D 2019 %T Mendelian Gene Discovery: Fast and Furious with No End in Sight. %A Bamshad, Michael J %A Nickerson, Deborah A %A Chong, Jessica X %X

Gene discovery for Mendelian conditions (MCs) offers a direct path to understanding genome function. Approaches based on next-generation sequencing applied at scale have dramatically accelerated gene discovery and transformed genetic medicine. Finding the genetic basis of ∼6,000-13,000 MCs yet to be delineated will require both technical and computational innovation, but will rely to a larger extent on meaningful data sharing.

%B Am J Hum Genet %V 105 %P 448-455 %8 2019 Sep 05 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31491408?dopt=Abstract %R 10.1016/j.ajhg.2019.07.011 %0 Journal Article %J Kidney Int %D 2019 %T Monogenic causes of chronic kidney disease in adults. %A Connaughton, Dervla M %A Kennedy, Claire %A Shril, Shirlee %A Mann, Nina %A Murray, Susan L %A Williams, Patrick A %A Conlon, Eoin %A Nakayama, Makiko %A van der Ven, Amelie T %A Ityel, Hadas %A Kause, Franziska %A Kolvenbach, Caroline M %A Dai, Rufeng %A Vivante, Asaf %A Braun, Daniela A %A Schneider, Ronen %A Kitzler, Thomas M %A Moloney, Brona %A Moran, Conor P %A Smyth, John S %A Kennedy, Alan %A Benson, Katherine %A Stapleton, Caragh %A Denton, Mark %A Magee, Colm %A O'Seaghdha, Conall M %A Plant, William D %A Griffin, Matthew D %A Awan, Atif %A Sweeney, Clodagh %A Mane, Shrikant M %A Lifton, Richard P %A Griffin, Brenda %A Leavey, Sean %A Casserly, Liam %A de Freitas, Declan G %A Holian, John %A Dorman, Anthony %A Doyle, Brendan %A Lavin, Peter J %A Little, Mark A %A Conlon, Peter J %A Hildebrandt, Friedhelm %X

Approximately 500 monogenic causes of chronic kidney disease (CKD) have been identified, mainly in pediatric populations. The frequency of monogenic causes among adults with CKD has been less extensively studied. To determine the likelihood of detecting monogenic causes of CKD in adults presenting to nephrology services in Ireland, we conducted whole exome sequencing (WES) in a multi-centre cohort of 114 families including 138 affected individuals with CKD. Affected adults were recruited from 78 families with a positive family history, 16 families with extra-renal features, and 20 families with neither a family history nor extra-renal features. We detected a pathogenic mutation in a known CKD gene in 42 of 114 families (37%). A monogenic cause was identified in 36% of affected families with a positive family history of CKD, 69% of those with extra-renal features, and only 15% of those without a family history or extra-renal features. There was no difference in the rate of genetic diagnosis in individuals with childhood versus adult onset CKD. Among the 42 families in whom a monogenic cause was identified, WES confirmed the clinical diagnosis in 17 (40%), corrected the clinical diagnosis in 9 (22%), and established a diagnosis for the first time in 16 families referred with CKD of unknown etiology (38%). In this multi-centre study of adults with CKD, a molecular genetic diagnosis was established in over one-third of families. In the evolving era of precision medicine, WES may be an important tool to identify the cause of CKD in adults.

%B Kidney Int %V 95 %P 914-928 %8 2019 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/30773290?dopt=Abstract %R 10.1016/j.kint.2018.10.031 %0 Journal Article %J Acta Neuropathol %D 2019 %T MSTO1 mutations cause mtDNA depletion, manifesting as muscular dystrophy with cerebellar involvement. %A Donkervoort, S %A Sabouny, R %A Yun, P %A Gauquelin, L %A Chao, K R %A Hu, Y %A Al Khatib, I %A Töpf, A %A Mohassel, P %A Cummings, B B %A Kaur, R %A Saade, D %A Moore, S A %A Waddell, L B %A Farrar, M A %A Goodrich, J K %A Uapinyoying, P %A Chan, S H S %A Javed, A %A Leach, M E %A Karachunski, P %A Dalton, J %A Medne, L %A Harper, A %A Thompson, C %A Thiffault, I %A Specht, S %A Lamont, R E %A Saunders, C %A Racher, H %A Bernier, F P %A Mowat, D %A Witting, N %A Vissing, J %A Hanson, R %A Coffman, K A %A Hainlen, M %A Parboosingh, J S %A Carnevale, A %A Yoon, G %A Schnur, R E %A Boycott, K M %A Mah, J K %A Straub, V %A Foley, A Reghan %A Innes, A M %A Bönnemann, C G %A Shutt, T E %X

MSTO1 encodes a cytosolic mitochondrial fusion protein, misato homolog 1 or MSTO1. While the full genotype-phenotype spectrum remains to be explored, pathogenic variants in MSTO1 have recently been reported in a small number of patients presenting with a phenotype of cerebellar ataxia, congenital muscle involvement with histologic findings ranging from myopathic to dystrophic and pigmentary retinopathy. The proposed underlying pathogenic mechanism of MSTO1-related disease is suggestive of impaired mitochondrial fusion secondary to a loss of function of MSTO1. Disorders of mitochondrial fusion and fission have been shown to also lead to mitochondrial DNA (mtDNA) depletion, linking them to the mtDNA depletion syndromes, a clinically and genetically diverse class of mitochondrial diseases characterized by a reduction of cellular mtDNA content. However, the consequences of pathogenic variants in MSTO1 on mtDNA maintenance remain poorly understood. We present extensive phenotypic and genetic data from 12 independent families, including 15 new patients harbouring a broad array of bi-allelic MSTO1 pathogenic variants, and we provide functional characterization from seven MSTO1-related disease patient fibroblasts. Bi-allelic loss-of-function variants in MSTO1 manifest clinically with a remarkably consistent phenotype of childhood-onset muscular dystrophy, corticospinal tract dysfunction and early-onset non-progressive cerebellar atrophy. MSTO1 protein was not detectable in the cultured fibroblasts of all seven patients evaluated, suggesting that pathogenic variants result in a loss of protein expression and/or affect protein stability. Consistent with impaired mitochondrial fusion, mitochondrial networks in fibroblasts were found to be fragmented. Furthermore, all fibroblasts were found to have depletion of mtDNA ranging from 30 to 70% along with alterations to mtDNA nucleoids. 
Our data corroborate the role of MSTO1 as a mitochondrial fusion protein and highlight a previously unrecognized link to mtDNA regulation. As impaired mitochondrial fusion is a recognized cause of mtDNA depletion syndromes, this novel link to mtDNA depletion in patient fibroblasts suggests that MSTO1-deficiency should also be considered a mtDNA depletion syndrome. Thus, we provide mechanistic insight into the disease pathogenesis associated with MSTO1 mutations and further define the clinical spectrum and the natural history of MSTO1-related disease.

%B Acta Neuropathol %8 2019 Aug 29 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/31463572?dopt=Abstract %R 10.1007/s00401-019-02059-z %0 Journal Article %J Am J Med Genet A %D 2019 %T Novel homozygous ENPP1 mutation causes generalized arterial calcifications of infancy, thrombocytopenia, and cardiovascular and central nervous system syndrome. %A Staretz-Chacham, Orna %A Shukrun, Rachel %A Barel, Ortal %A Pode-Shakked, Ben %A Pleniceanu, Oren %A Anikster, Yair %A Shalva, Nechama %A Ferreira, Carlos R %A Ben-Haim Kadosh, Admit %A Richardson, Justin %A Mane, Shrikant M %A Hildebrandt, Friedhelm %A Vivante, Asaf %X

Generalized arterial calcifications of infancy (GACI) is caused by mutations in ENPP1. Other ENPP1-related phenotypes include pseudoxanthoma elasticum, hypophosphatemic rickets, and Cole disease. We studied four children from two Bedouin consanguineous families who presented with severe clinical phenotype including thrombocytopenia, hypoglycemia, hepatic, and neurologic manifestations. Initial working diagnosis included congenital infection; however, patients remained without a definitive diagnosis despite extensive workup. Consequently, we investigated a potential genetic etiology. Whole exome sequencing (WES) was performed for affected children and their parents. Following the identification of a novel mutation in the ENPP1 gene, we characterized this novel multisystemic presentation and revised relevant imaging studies. Using WES, we identified a novel homozygous mutation (c.556G > C; p.Gly186Arg) in ENPP1 which affects a highly conserved protein domain (somatomedin B2). ENPP1-associated genetic diseases exhibit phenotypic heterogeneity depending on mutation type and location. Follow-up clinical characterization of these families allowed us to revise and detect new features of systemic calcifications, which established the diagnosis of GACI, expanding the phenotypic spectrum associated with ENPP1 mutations. Our findings demonstrate that this novel ENPP1 founder mutation can cause a fatal multisystemic phenotype, mimicking severe congenital infection. This also represents the first reported mutation affecting the SMB2 domain, associated with GACI.

%B Am J Med Genet A %V 179 %P 2112-2118 %8 2019 Oct %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/31444901?dopt=Abstract %R 10.1002/ajmg.a.61334 %0 Journal Article %J Am J Hum Genet %D 2019 %T Pathogenic Abnormal Splicing Due to Intronic Deletions that Induce Biophysical Space Constraint for Spliceosome Assembly. %A Bryen, Samantha J %A Joshi, Himanshu %A Evesson, Frances J %A Girard, Cyrille %A Ghaoui, Roula %A Waddell, Leigh B %A Testa, Alison C %A Cummings, Beryl %A Arbuckle, Susan %A Graf, Nicole %A Webster, Richard %A MacArthur, Daniel G %A Laing, Nigel G %A Davis, Mark R %A Lührmann, Reinhard %A Cooper, Sandra T %X

A precise genetic diagnosis is the single most important step for families with genetic disorders to enable personalized and preventative medicine. In addition to genetic variants in coding regions (exons) that can change a protein sequence, abnormal pre-mRNA splicing can be devastating for the encoded protein, inducing a frameshift or in-frame deletion/insertion of multiple residues. Non-coding variants that disrupt splicing are extremely challenging to identify. Stemming from an initial clinical discovery in two index Australian families, we define 25 families with genetic disorders caused by a class of pathogenic non-coding splice variant due to intronic deletions. These pathogenic intronic deletions spare all consensus splice motifs, though they critically shorten the minimal distance between the 5' splice-site (5'SS) and branchpoint. The mechanistic basis for abnormal splicing is due to biophysical constraint precluding U1/U2 spliceosome assembly, which stalls in A-complexes (that bridge the 5'SS and branchpoint). Substitution of deleted nucleotides with non-specific sequences restores spliceosome assembly and normal splicing, arguing against loss of an intronic element as the primary causal basis. Incremental lengthening of 5'SS-branchpoint length in our index EMD case subject defines 45-47 nt as the critical elongation enabling (inefficient) spliceosome assembly for EMD intron 5. The 5'SS-branchpoint space constraint mechanism, not currently factored by genomic informatics pipelines, is relevant to diagnosis and precision medicine across the breadth of Mendelian disorders and cancer genomics.

%B Am J Hum Genet %V 105 %P 573-587 %8 2019 Sep 05 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31447096?dopt=Abstract %R 10.1016/j.ajhg.2019.07.013 %0 Journal Article %J Genet Med %D 2019 %T The pleiotropy associated with de novo variants in CHD4, CNOT3, and SETD5 extends to moyamoya angiopathy. %A Pinard, Amélie %A Guey, Stéphanie %A Guo, Dongchuan %A Cecchi, Alana C %A Kharas, Natasha %A Wallace, Stephanie %A Regalado, Ellen S %A Hostetler, Ellen M %A Sharrief, Anjail Z %A Bergametti, Françoise %A Kossorotoff, Manoelle %A Hervé, Dominique %A Kraemer, Markus %A Bamshad, Michael J %A Nickerson, Deborah A %A Smith, Edward R %A Tournier-Lasserve, Elisabeth %A Milewicz, Dianna M %X

PURPOSE: Moyamoya angiopathy (MMA) is a cerebrovascular disease characterized by occlusion of large arteries, which leads to strokes starting in childhood. Twelve altered genes predispose to MMA but the majority of cases of European descent do not have an identified genetic trigger.

METHODS: Exome sequencing data from 39 trios were analyzed.

RESULTS: We identified four de novo variants in three genes not previously associated with MMA: CHD4, CNOT3, and SETD5. Identification of additional rare variants in these genes in 158 unrelated MMA probands provided further support that rare pathogenic variants in CHD4 and CNOT3 predispose to MMA. Previous studies identified de novo variants in these genes in children with developmental disorders (DD), intellectual disability, and congenital heart disease.

CONCLUSION: These genes encode proteins involved in chromatin remodeling, and taken together with previously reported genes leading to MMA-like cerebrovascular occlusive disease (YY1AP1, SMARCAL1), implicate disrupted chromatin remodeling as a molecular pathway predisposing to early onset, large artery occlusive cerebrovascular disease. Furthermore, these data expand the spectrum of phenotypic pleiotropy due to alterations of CHD4, CNOT3, and SETD5 beyond DD to later onset disease in the cerebrovascular arteries and emphasize the need to assess clinical complications into adulthood for genes associated with DD.

%B Genet Med %8 2019 Sep 02 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/31474762?dopt=Abstract %R 10.1038/s41436-019-0639-2 %0 Journal Article %J Am J Hum Genet %D 2019 %T Redefining the Etiologic Landscape of Cerebellar Malformations. %A Aldinger, Kimberly A %A Timms, Andrew E %A Thomson, Zachary %A Mirzaa, Ghayda M %A Bennett, James T %A Rosenberg, Alexander B %A Roco, Charles M %A Hirano, Matthew %A Abidi, Fatima %A Haldipur, Parthiv %A Cheng, Chi V %A Collins, Sarah %A Park, Kaylee %A Zeiger, Jordan %A Overmann, Lynne M %A Alkuraya, Fowzan S %A Biesecker, Leslie G %A Braddock, Stephen R %A Cathey, Sara %A Cho, Megan T %A Chung, Brian H Y %A Everman, David B %A Zarate, Yuri A %A Jones, Julie R %A Schwartz, Charles E %A Goldstein, Amy %A Hopkin, Robert J %A Krantz, Ian D %A Ladda, Roger L %A Leppig, Kathleen A %A McGillivray, Barbara C %A Sell, Susan %A Wusik, Katherine %A Gleeson, Joseph G %A Nickerson, Deborah A %A Bamshad, Michael J %A Gerrelli, Dianne %A Lisgo, Steven N %A Seelig, Georg %A Ishak, Gisele E %A Barkovich, A James %A Curry, Cynthia J %A Glass, Ian A %A Millen, Kathleen J %A Doherty, Dan %A Dobyns, William B %X

Cerebellar malformations are diverse congenital anomalies frequently associated with developmental disability. Although genetic and prenatal non-genetic causes have been described, no systematic analysis has been performed. Here, we present a large exome sequencing study of Dandy-Walker malformation (DWM) and cerebellar hypoplasia (CBLH). We performed exome sequencing in 282 individuals from 100 families with DWM or CBLH, and we established a molecular diagnosis in 36 of 100 families, with a significantly higher yield for CBLH (51%) than for DWM (16%). The 41 variants impact 27 neurodevelopmental-disorder-associated genes, thus demonstrating that CBLH and DWM are often features of monogenic neurodevelopmental disorders. Though only seven monogenic causes (19%) were identified in more than one individual, neuroimaging review of 131 additional individuals confirmed cerebellar abnormalities in 23 of 27 genetic disorders (85%). Prenatal risk factors were frequently found among individuals without a genetic diagnosis (30 of 64 individuals [47%]). Single-cell RNA sequencing of prenatal human cerebellar tissue revealed gene enrichment in neuronal and vascular cell types; this suggests that defective vasculogenesis may disrupt cerebellar development. Further, de novo gain-of-function variants in PDGFRB, a tyrosine kinase receptor essential for vascular progenitor signaling, were associated with CBLH, and this discovery links genetic and non-genetic etiologies. Our results suggest that genetic defects impact specific cerebellar cell types and implicate abnormal vascular development as a mechanism for cerebellar malformations. We also confirmed a major contribution for non-genetic prenatal factors in individuals with cerebellar abnormalities, substantially influencing diagnostic evaluation and counseling regarding recurrence risk and prognosis.

%B Am J Hum Genet %V 105 %P 606-615 %8 2019 Sep 05 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/31474318?dopt=Abstract %R 10.1016/j.ajhg.2019.07.019 %0 Journal Article %J Bioinformatics %D 2019 %T svtools: population-scale analysis of structural variation. %A Larson, David E %A Abel, Haley J %A Chiang, Colby %A Badve, Abhijit %A Das, Indraniel %A Eldred, James M %A Layer, Ryan M %A Hall, Ira M %X

SUMMARY: Large-scale human genetics studies are now employing whole genome sequencing with the goal of conducting comprehensive trait mapping analyses of all forms of genome variation. However, methods for structural variation (SV) analysis have lagged far behind those for smaller scale variants, and there is an urgent need to develop more efficient tools that scale to the size of human populations. Here, we present a fast and highly scalable software toolkit (svtools) and cloud-based pipeline for assembling high quality SV maps (including deletions, duplications, mobile element insertions, inversions and other rearrangements) in many thousands of human genomes. We show that this pipeline achieves similar variant detection performance to established per-sample methods (e.g. LUMPY), while providing fast and affordable joint analysis at the scale of ≥100 000 genomes. These tools will help enable the next generation of human genetics studies.

AVAILABILITY AND IMPLEMENTATION: svtools is implemented in Python and freely available (MIT) from https://github.com/hall-lab/svtools.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 35 %P 4782-4787 %8 2019 Nov 01 %G eng %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/31218349?dopt=Abstract %R 10.1093/bioinformatics/btz492 %0 Journal Article %J Nat Commun %D 2018 %T Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease. %A Emdin, Connor A %A Khera, Amit V %A Chaffin, Mark %A Klarin, Derek %A Natarajan, Pradeep %A Aragam, Krishna %A Haas, Mary %A Bick, Alexander %A Zekavat, Seyedeh M %A Nomura, Akihiro %A Ardissino, Diego %A Wilson, James G %A Schunkert, Heribert %A McPherson, Ruth %A Watkins, Hugh %A Elosua, Roberto %A Bown, Matthew J %A Samani, Nilesh J %A Baber, Usman %A Erdmann, Jeanette %A Gupta, Namrata %A Danesh, John %A Chasman, Daniel %A Ridker, Paul %A Denny, Joshua %A Bastarache, Lisa %A Lichtman, Judith H %A D'Onofrio, Gail %A Mattera, Jennifer %A Spertus, John A %A Sheu, Wayne H-H %A Taylor, Kent D %A Psaty, Bruce M %A Rich, Stephen S %A Post, Wendy %A Rotter, Jerome I %A Chen, Yii-Der Ida %A Krumholz, Harlan %A Saleheen, Danish %A Gabriel, Stacey %A Kathiresan, Sekar %K Databases, Genetic %K Diabetes Mellitus, Type 2 %K Disease %K Gene Frequency %K Genetic Testing %K Genetic Variation %K Humans %K Obesity %K Phenotype %K Proteins %K Respiratory Hypersensitivity %K United Kingdom %X

Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.

%B Nat Commun %V 9 %P 1613 %8 2018 04 24 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29691411?dopt=Abstract %R 10.1038/s41467-018-03911-8 %0 Journal Article %J Stroke %D 2018 %T Cardioembolic Stroke Risk and Recovery After Anticoagulation-Related Intracerebral Hemorrhage. %A Murphy, Meredith P %A Kuramatsu, Joji B %A Leasure, Audrey %A Falcone, Guido J %A Kamel, Hooman %A Sansing, Lauren H %A Kourkoulis, Christina %A Schwab, Kristin %A Elm, Jordan J %A Gurol, M Edip %A Tran, Huy %A Greenberg, Steven M %A Viswanathan, Anand %A Anderson, Christopher D %A Schwab, Stefan %A Rosand, Jonathan %A Shi, Fu-Dong %A Kittner, Steven J %A Testai, Fernando D %A Woo, Daniel %A Langefeld, Carl D %A James, Michael L %A Koch, Sebastian %A Huttner, Hagen B %A Biffi, Alessandro %A Sheth, Kevin N %X

Background and Purpose- Whether to resume oral anticoagulation treatment after intracerebral hemorrhage (ICH) remains an unresolved question. Previous studies focused primarily on recurrent stroke after ICH. We sought to investigate the association between cardioembolic stroke risk, oral anticoagulation therapy resumption, and functional recovery among ICH survivors in the absence of recurrent stroke. Methods- We conducted a joint analysis of 3 observational studies: (1) the multicenter RETRACE study (German-Wide Multicenter Analysis of Oral Anticoagulation Associated Intracerebral Hemorrhage); (2) the Massachusetts General Hospital ICH study (n=166); and (3) the ERICH study (Ethnic/Racial Variations of Intracerebral Hemorrhage; n=131). We included 941 survivors of ICH in the setting of active oral anticoagulation therapy for prevention of cardioembolic stroke because of nonvalvular atrial fibrillation and without evidence of ischemic stroke and recurrent ICH at 1 year from the index event. We created univariable and multivariable models to explore associations between cardioembolic stroke risk (based on CHA2DS2-VASc scores) and functional recovery after ICH, defined as achieving modified Rankin Scale score of ≤3 at 1 year for participants with modified Rankin Scale score of >3 at discharge. Results- In multivariable analyses, the CHA2DS2-VASc score was associated with a decreased likelihood of functional recovery (odds ratio, 0.83 per 1 point increase; 95% CI, 0.79-0.86) at 1 year. Anticoagulation resumption was independently associated with a higher likelihood of recovery, regardless of CHA2DS2-VASc score (odds ratio, 1.89; 95% CI, 1.32-2.70). We found an interaction between CHA2DS2-VASc score and anticoagulation resumption in terms of association with increased likelihood of functional recovery (interaction P=0.011).
Conclusions- Increasing cardioembolic stroke risk is associated with a decreased likelihood of functional recovery at 1 year after ICH, but this association was weaker among participants resuming oral anticoagulation therapy. These findings support including recovery metrics in future studies of anticoagulation resumption after ICH.

%B Stroke %V 49 %P 2652-2658 %8 2018 Nov %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/30355194?dopt=Abstract %R 10.1161/STROKEAHA.118.021799 %0 Journal Article %J BMC Genomics %D 2018 %T Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms. %A Costello, Maura %A Fleharty, Mark %A Abreu, Justin %A Farjoun, Yossi %A Ferriera, Steven %A Holmes, Laurie %A Granger, Brian %A Green, Lisa %A Howd, Tom %A Mason, Tamara %A Vicente, Gina %A Dasilva, Michael %A Brodeur, Wendy %A DeSmet, Timothy %A Dodge, Sheila %A Lennon, Niall J %A Gabriel, Stacey %K DNA %K Gene Library %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis %K Sequence Analysis, DNA %X

BACKGROUND: Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination due to the phenomenon of index swapping that impacts Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeqX, HiSeq4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.

RESULTS: Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross contamination by utilizing non-redundant dual indexing for complete filtering of index swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods we provide a greater insight into the mechanism of index swapping.

CONCLUSIONS: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeqX, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.

%B BMC Genomics %V 19 %P 332 %8 2018 May 08 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29739332?dopt=Abstract %R 10.1186/s12864-018-4703-0 %0 Journal Article %J Genet Med %D 2018 %T Characterizing reduced coverage regions through comparison of exome and genome sequencing data across 10 centers. %A Sanghvi, Rashesh V %A Buhay, Christian J %A Powell, Bradford C %A Tsai, Ellen A %A Dorschner, Michael O %A Hong, Celine S %A Lebo, Matthew S %A Sasson, Ariella %A Hanna, David S %A McGee, Sean %A Bowling, Kevin M %A Cooper, Gregory M %A Gray, David E %A Lonigro, Robert J %A Dunford, Andrew %A Brennan, Christine A %A Cibulskis, Carrie %A Walker, Kimberly %A Carneiro, Mauricio O %A Sailsbery, Joshua %A Hindorff, Lucia A %A Robinson, Dan R %A Santani, Avni %A Sarmady, Mahdi %A Rehm, Heidi L %A Biesecker, Leslie G %A Nickerson, Deborah A %A Hutter, Carolyn M %A Garraway, Levi %A Muzny, Donna M %A Wagle, Nikhil %K Base Sequence %K Chromosome Mapping %K Exome %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis, DNA %K Software %K Whole Exome Sequencing %K Whole Genome Sequencing %X

PURPOSE: As massively parallel sequencing is increasingly being used for clinical decision making, it has become critical to understand parameters that affect sequencing quality and to establish methods for measuring and reporting clinical sequencing standards. In this report, we propose a definition for reduced coverage regions and describe a set of standards for variant calling in clinical sequencing applications.

METHODS: To enable sequencing centers to assess the regions of poor sequencing quality in their own data, we optimized and used a tool (ExCID) to identify reduced coverage loci within genes or regions of particular interest. We used this framework to examine sequencing data from 500 patients generated in 10 projects at sequencing centers in the National Human Genome Research Institute/National Cancer Institute Clinical Sequencing Exploratory Research Consortium.

RESULTS: This approach identified reduced coverage regions in clinically relevant genes, including known clinically relevant loci that were uniquely missed at individual centers, in multiple centers, and in all centers.

CONCLUSION: This report provides a process road map for clinical sequencing centers looking to perform similar analyses on their data.

%B Genet Med %V 20 %P 855-866 %8 2018 08 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/29144510?dopt=Abstract %R 10.1038/gim.2017.192 %0 Journal Article %J Genome Res %D 2018 %T Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line. %A Nattestad, Maria %A Goodwin, Sara %A Ng, Karen %A Baslan, Timour %A Sedlazeck, Fritz J %A Rescheneder, Philipp %A Garvin, Tyler %A Fang, Han %A Gurtowski, James %A Hutton, Elizabeth %A Tseng, Elizabeth %A Chin, Chen-Shan %A Beck, Timothy %A Sundaravadanam, Yogi %A Kramer, Melissa %A Antoniou, Eric %A McPherson, John D %A Hicks, James %A McCombie, W Richard %A Schatz, Michael C %K Breast Neoplasms %K Female %K Gene Amplification %K Gene Rearrangement %K Genome, Human %K Genomic Structural Variation %K High-Throughput Nucleotide Sequencing %K Humans %K MCF-7 Cells %K Oncogenes %K Receptor, ErbB-2 %K Repetitive Sequences, Nucleic Acid %K Transcriptome %X

The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and develop one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discover a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution.

%B Genome Res %V 28 %P 1126-1135 %8 2018 08 %G eng %N 8 %1 http://www.ncbi.nlm.nih.gov/pubmed/29954844?dopt=Abstract %R 10.1101/gr.231100.117 %0 Journal Article %J Nat Commun %D 2018 %T Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. %A Regier, Allison A %A Farjoun, Yossi %A Larson, David E %A Krasheninina, Olga %A Kang, Hyun Min %A Howrigan, Daniel P %A Chen, Bo-Juen %A Kher, Manisha %A Banks, Eric %A Ames, Darren C %A English, Adam C %A Li, Heng %A Xing, Jinchuan %A Zhang, Yeting %A Matise, Tara %A Abecasis, Goncalo R %A Salerno, Will %A Zody, Michael C %A Neale, Benjamin M %A Hall, Ira M %K Genome, Human %K Human Genetics %K Humans %K Whole Genome Sequencing %X

Hundreds of thousands of human whole genome sequencing (WGS) datasets will be generated over the next few years. These data are more valuable in aggregate: joint analysis of genomes from many sources increases sample size and statistical power. A central challenge for joint analysis is that different WGS data processing pipelines cause substantial differences in variant calling in combined datasets, necessitating computationally expensive reprocessing. This approach is no longer tenable given the scale of current studies and data volumes. Here, we define WGS data processing standards that allow different groups to produce functionally equivalent (FE) results, yet still innovate on data processing pipelines. We present initial FE pipelines developed at five genome centers and show that they yield similar variant calling results and produce significantly less variability than sequencing replicates. This work alleviates a key technical bottleneck for genome aggregation and helps lay the foundation for community-wide human genetics studies.

%B Nat Commun %V 9 %P 4038 %8 2018 10 02 %G eng %N 1 %1 http://www.ncbi.nlm.nih.gov/pubmed/30279509?dopt=Abstract %R 10.1038/s41467-018-06159-4 %0 Journal Article %J Hum Genet %D 2018 %T Genetic variants in microRNA genes and targets associated with cardiovascular disease risk factors in the African-American population. %A Li, Chang %A Grove, Megan L %A Yu, Bing %A Jones, Barbara C %A Morrison, Alanna %A Boerwinkle, Eric %A Liu, Xiaoming %K 3' Untranslated Regions %K Adult %K African Americans %K Cardiovascular Diseases %K Female %K Genetic Predisposition to Disease %K Genotyping Techniques %K Humans %K Male %K MicroRNAs %K Middle Aged %K Polymorphism, Single Nucleotide %K Risk Factors %K Whole Genome Sequencing %X

The purpose of this study is to identify microRNA (miRNA)-related polymorphisms, including single nucleotide variants (SNVs) in mature miRNA-encoding sequences or in miRNA-target sites, and their association with cardiovascular disease (CVD) risk factors in the African-American population. To achieve our objective, we examined 1900 African-Americans from the Atherosclerosis Risk in Communities study using SNVs identified from whole-genome sequencing data. A total of 971 SNVs found in 726 different mature miRNA-encoding sequences and 16,057 SNVs found in the three prime untranslated region (3'UTR) of 3647 protein-coding genes were identified and interrogated for their associations with 17 CVD risk factors. Using a single-variant-based approach, we found 5 SNVs in miRNA-encoding sequences to be associated with serum Lipoprotein(a) [Lp(a)], high-density lipoprotein (HDL) or triglycerides, and 2 SNVs in miRNA-target sites to be associated with Lp(a) and HDL, all with false discovery rates of 5%. Using a gene-based approach, we identified 3 pairs of associations between gene NSD1 and platelet count, gene HSPA4L and cardiac troponin T, and gene AHSA2 and magnesium. We successfully validated the association between a variant specific to the African-American population, NR_039880.1:n.18A>C, in the mature hsa-miR-4727-5p encoding sequence and serum HDL level in an independent sample of 2135 African-Americans. Our study provided candidate miRNAs and their targets for further investigation of their potential contribution to ethnic disparities in CVD risk factors.

%B Hum Genet %V 137 %P 85-94 %8 2018 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29264654?dopt=Abstract %R 10.1007/s00439-017-1858-8 %0 Journal Article %J Am J Respir Crit Care Med %D 2018 %T Whole-Genome Sequencing of Pharmacogenetic Drug Response in Racially Diverse Children with Asthma. %A Mak, Angel C Y %A White, Marquitta J %A Eckalbar, Walter L %A Szpiech, Zachary A %A Oh, Sam S %A Pino-Yanes, Maria %A Hu, Donglei %A Goddard, Pagé %A Huntsman, Scott %A Galanter, Joshua %A Wu, Ann Chen %A Himes, Blanca E %A Germer, Soren %A Vogel, Julia M %A Bunting, Karen L %A Eng, Celeste %A Salazar, Sandra %A Keys, Kevin L %A Liberto, Jennifer %A Nuckton, Thomas J %A Nguyen, Thomas A %A Torgerson, Dara G %A Kwok, Pui-Yan %A Levin, Albert M %A Celedón, Juan C %A Forno, Erick %A Hakonarson, Hakon %A Sleiman, Patrick M %A Dahlin, Amber %A Tantisira, Kelan G %A Weiss, Scott T %A Serebrisky, Denise %A Brigino-Buenaventura, Emerita %A Farber, Harold J %A Meade, Kelley %A Lenoir, Michael A %A Avila, Pedro C %A Sen, Saunak %A Thyne, Shannon M %A Rodriguez-Cintron, William %A Winkler, Cheryl A %A Moreno-Estrada, Andrés %A Sandoval, Karla %A Rodriguez-Santana, Jose R %A Kumar, Rajesh %A Williams, L Keoki %A Ahituv, Nadav %A Ziv, Elad %A Seibold, Max A %A Darnell, Robert B %A Zaitlen, Noah %A Hernandez, Ryan D %A Burchard, Esteban G %X

RATIONALE: Albuterol, a bronchodilator medication, is the first-line therapy for asthma worldwide. There are significant racial/ethnic differences in albuterol drug response.

OBJECTIVES: To identify genetic variants important for bronchodilator drug response (BDR) in racially diverse children.

METHODS: We performed the first whole-genome sequencing pharmacogenetics study from 1,441 children with asthma from the tails of the BDR distribution to identify genetic association with BDR.

MEASUREMENTS AND MAIN RESULTS: We identified population-specific and shared genetic variants associated with BDR, including genome-wide significant (P < 3.53 × 10) and suggestive (P < 7.06 × 10) loci near genes previously associated with lung capacity (DNAH5), immunity (NFKB1 and PLCB1), and β-adrenergic signaling (ADAMTS3 and COX18). Functional analyses of the BDR-associated SNP in NFKB1 revealed potential regulatory function in bronchial smooth muscle cells. The SNP is also an expression quantitative trait locus for a neighboring gene, SLC39A8. The lack of other asthma study populations with BDR and whole-genome sequencing data on minority children makes it impossible to perform replication of our rare variant associations. Minority underrepresentation also poses significant challenges to identify age-matched and population-matched cohorts of sufficient sample size for replication of our common variant findings.

CONCLUSIONS: The lack of minority data, despite a collaboration of eight universities and 13 individual laboratories, highlights the urgent need for a dedicated national effort to prioritize diversity in research. Our study expands the understanding of pharmacogenetic analyses in racially/ethnically diverse populations and advances the foundation for precision medicine in at-risk and understudied minority populations.

%B Am J Respir Crit Care Med %V 197 %P 1552-1564 %8 2018 Jun 15 %G eng %N 12 %1 http://www.ncbi.nlm.nih.gov/pubmed/29509491?dopt=Abstract %R 10.1164/rccm.201712-2529OC %0 Journal Article %J J Am Coll Cardiol %D 2017 %T ANGPTL3 Deficiency and Protection Against Coronary Artery Disease. %A Stitziel, Nathan O %A Khera, Amit V %A Wang, Xiao %A Bierhals, Andrew J %A Vourakis, A Christina %A Sperry, Alexandra E %A Natarajan, Pradeep %A Klarin, Derek %A Emdin, Connor A %A Zekavat, Seyedeh M %A Nomura, Akihiro %A Erdmann, Jeanette %A Schunkert, Heribert %A Samani, Nilesh J %A Kraus, William E %A Shah, Svati H %A Yu, Bing %A Boerwinkle, Eric %A Rader, Daniel J %A Gupta, Namrata %A Frossard, Philippe M %A Rasheed, Asif %A Danesh, John %A Lander, Eric S %A Gabriel, Stacey %A Saleheen, Danish %A Musunuru, Kiran %A Kathiresan, Sekar %K Adult %K Angiopoietin-Like Protein 3 %K Angiopoietin-like Proteins %K Angiopoietins %K Animals %K Atherosclerosis %K Case-Control Studies %K Coronary Artery Disease %K Female %K Humans %K Lipids %K Male %K Mice, Inbred C57BL %K Mice, Knockout %K Middle Aged %K Mutation, Missense %K Myocardial Infarction %K Risk Factors %X

BACKGROUND: Familial combined hypolipidemia, a Mendelian condition characterized by substantial reductions in all 3 major lipid fractions, is caused by mutations that inactivate the gene angiopoietin-like 3 (ANGPTL3). Whether ANGPTL3 deficiency reduces risk of coronary artery disease (CAD) is unknown.

OBJECTIVES: The study goal was to leverage 3 distinct lines of evidence (a family that included individuals with complete [compound heterozygote] ANGPTL3 deficiency, a population-based study of humans with partial [heterozygote] ANGPTL3 deficiency, and biomarker levels in patients with myocardial infarction [MI]) to test whether ANGPTL3 deficiency is associated with lower risk for CAD.

METHODS: We assessed coronary atherosclerotic burden in 3 individuals with complete ANGPTL3 deficiency and 3 wild-type first-degree relatives using computed tomography angiography. In the population, ANGPTL3 loss-of-function (LOF) mutations were ascertained in up to 21,980 people with CAD and 158,200 control subjects. LOF mutations were defined as nonsense, frameshift, and splice-site variants, along with missense variants resulting in <25% of wild-type ANGPTL3 activity in a mouse model. In a biomarker study, circulating ANGPTL3 concentration was measured in 1,493 people who presented with MI and 3,232 control subjects.

RESULTS: The 3 individuals with complete ANGPTL3 deficiency showed no evidence of coronary atherosclerotic plaque. ANGPTL3 gene sequencing demonstrated that approximately 1 in 309 people was a heterozygous carrier for an LOF mutation. Compared with those without mutation, heterozygous carriers of ANGPTL3 LOF mutations demonstrated a 17% reduction in circulating triglycerides and a 12% reduction in low-density lipoprotein cholesterol. Carrier status was associated with a 34% reduction in odds of CAD (odds ratio: 0.66; 95% confidence interval: 0.44 to 0.98; p = 0.04). Individuals in the lowest tertile of circulating ANGPTL3 concentrations, compared with the highest, had reduced odds of MI (adjusted odds ratio: 0.65; 95% confidence interval: 0.55 to 0.77; p < 0.001).

CONCLUSIONS: ANGPTL3 deficiency is associated with protection from CAD.

%B J Am Coll Cardiol %V 69 %P 2054-2063 %8 2017 Apr 25 %G eng %N 16 %1 https://www.ncbi.nlm.nih.gov/pubmed/28385496?dopt=Abstract %R 10.1016/j.jacc.2017.02.030 %0 Journal Article %J Neuron %D 2017 %T cTag-PAPERCLIP Reveals Alternative Polyadenylation Promotes Cell-Type Specific Protein Diversity and Shifts Araf Isoforms with Microglia Activation. %A Hwang, Hun-Way %A Saito, Yuhki %A Park, Christopher Y %A Blachère, Nathalie E %A Tajima, Yoko %A Fak, John J %A Zucker-Scharff, Ilana %A Darnell, Robert B %K Animals %K Antigens, Neoplasm %K Astrocytes %K Brain %K Cells, Cultured %K Female %K Humans %K Male %K Mice %K Microglia %K Nerve Tissue Proteins %K Neuro-Oncological Ventral Antigen %K Neurons %K Organ Specificity %K Polyadenylation %K Polypyrimidine Tract-Binding Protein %K Protein Isoforms %K Protein Serine-Threonine Kinases %K RNA-Binding Proteins %X

Alternative polyadenylation (APA) is increasingly recognized to regulate gene expression across different cell types, but obtaining APA maps from individual cell types typically requires prior purification, a stressful procedure that can itself alter cellular states. Here, we describe a new platform, cTag-PAPERCLIP, that generates APA profiles from single cell populations in intact tissues; cTag-PAPERCLIP requires no tissue dissociation and preserves transcripts in native states. Applying cTag-PAPERCLIP to profile four major cell types in the mouse brain revealed common APA preferences between excitatory and inhibitory neurons distinct from astrocytes and microglia, regulated in part by neuron-specific RNA-binding proteins NOVA2 and PTBP2. We further identified a role of APA in switching Araf protein isoforms during microglia activation, impacting production of downstream inflammatory cytokines. Our results demonstrate the broad applicability of cTag-PAPERCLIP and a previously undiscovered role of APA in contributing to protein diversity between different cell types and cellular states within the brain.

%B Neuron %V 95 %P 1334-1349.e5 %8 2017 Sep 13 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/28910620?dopt=Abstract %R 10.1016/j.neuron.2017.08.024 %0 Journal Article %J Nat Genet %D 2017 %T Disruption of the ATXN1-CIC complex causes a spectrum of neurobehavioral phenotypes in mice and humans. %A Lu, Hsiang-Chih %A Tan, Qiumin %A Rousseaux, Maxime W C %A Wang, Wei %A Kim, Ji-Yoen %A Richman, Ronald %A Wan, Ying-Wooi %A Yeh, Szu-Ying %A Patel, Jay M %A Liu, Xiuyun %A Lin, Tao %A Lee, Yoontae %A Fryer, John D %A Han, Jing %A Chahrour, Maria %A Finnell, Richard H %A Lei, Yunping %A Zurita-Jimenez, Maria E %A Ahimaz, Priyanka %A Anyane-Yeboa, Kwame %A Van Maldergem, Lionel %A Lehalle, Daphne %A Jean-Marcais, Nolwenn %A Mosca-Boidron, Anne-Laure %A Thevenon, Julien %A Cousin, Margot A %A Bro, Della E %A Lanpher, Brendan C %A Klee, Eric W %A Alexander, Nora %A Bainbridge, Matthew N %A Orr, Harry T %A Sillitoe, Roy V %A Ljungberg, M Cecilia %A Liu, Zhandong %A Schaaf, Christian P %A Zoghbi, Huda Y %K Animals %K Ataxin-1 %K Autism Spectrum Disorder %K Cerebellum %K Female %K Humans %K Intellectual Disability %K Interpersonal Relations %K Male %K Mice %K Nerve Tissue Proteins %K Neurodegenerative Diseases %K Nuclear Proteins %K Phenotype %K Repressor Proteins %X

Gain-of-function mutations in some genes underlie neurodegenerative conditions, whereas loss-of-function mutations in the same genes have distinct phenotypes. This appears to be the case with the protein ataxin 1 (ATXN1), which forms a transcriptional repressor complex with capicua (CIC). Gain of function of the complex leads to neurodegeneration, but ATXN1-CIC is also essential for survival. We set out to understand the functions of the ATXN1-CIC complex in the developing forebrain and found that losing this complex results in hyperactivity, impaired learning and memory, and abnormal maturation and maintenance of upper-layer cortical neurons. We also found that CIC activity in the hypothalamus and medial amygdala modulates social interactions. Informed by these neurobehavioral features in mouse mutants, we identified five individuals with de novo heterozygous truncating mutations in CIC who share similar clinical features, including intellectual disability, attention deficit/hyperactivity disorder (ADHD), and autism spectrum disorder. Our study demonstrates that loss of ATXN1-CIC complexes causes a spectrum of neurobehavioral phenotypes.

%B Nat Genet %V 49 %P 527-536 %8 2017 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/28288114?dopt=Abstract %R 10.1038/ng.3808 %0 Journal Article %J Obesity (Silver Spring) %D 2017 %T Exome sequencing reveals novel genetic loci influencing obesity-related traits in Hispanic children. %A Sabo, Aniko %A Mishra, Pamela %A Dugan-Perez, Shannon %A Voruganti, V Saroja %A Kent, Jack W %A Kalra, Divya %A Cole, Shelley A %A Comuzzie, Anthony G %A Muzny, Donna M %A Gibbs, Richard A %A Butte, Nancy F %K Adolescent %K ATPases Associated with Diverse Cellular Activities %K Body Mass Index %K Body Weight %K Child %K Child, Preschool %K Cohort Studies %K Exome %K Genetic Loci %K Genome-Wide Association Study %K Hispanic or Latino %K Humans %K Membrane Proteins %K Pediatric Obesity %K Polymorphism, Single Nucleotide %K Risk Factors %K Sequence Analysis, DNA %K Software %K Waist Circumference %K Young Adult %X

OBJECTIVE: To perform whole exome sequencing in 928 Hispanic children and identify variants and genes associated with childhood obesity.

METHODS: Single-nucleotide variants (SNVs) were identified from Illumina whole exome sequencing data using integrated read mapping, variant calling, and an annotation pipeline (Mercury). Association analyses of 74 obesity-related traits and exonic variants were performed using SeqMeta software. Rare autosomal variants were analyzed using gene-based association analyses, and common autosomal variants were analyzed at the SNV level.

RESULTS: (1) Rare exonic variants in 10 genes and 16 common SNVs in 11 genes that were associated with obesity traits in a cohort of Hispanic children were identified, (2) novel rare variants in peroxisome biogenesis factor 1 (PEX1) associated with several obesity traits (weight, weight z score, BMI, BMI z score, waist circumference, fat mass, trunk fat mass) were discovered, and (3) previously reported SNVs associated with childhood obesity were replicated.

CONCLUSIONS: Convergence of whole exome sequencing, a family-based design, and extensive phenotyping discovered novel rare and common variants associated with childhood obesity. Linking PEX1 to obesity phenotypes poses a novel mechanism of peroxisomal biogenesis and metabolism underlying the development of childhood obesity.

%B Obesity (Silver Spring) %V 25 %P 1270-1276 %8 2017 07 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28508493?dopt=Abstract %R 10.1002/oby.21869 %0 Journal Article %J Nature %D 2017 %T Genetic effects on gene expression across human tissues. %A Battle, Alexis %A Brown, Christopher D %A Engelhardt, Barbara E %A Montgomery, Stephen B %K Alleles %K Chromosomes, Human %K Disease %K Female %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Variation %K Genome, Human %K Genotype %K Humans %K Male %K Organ Specificity %K Quantitative Trait Loci %X

Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

%B Nature %V 550 %P 204-213 %8 2017 10 11 %G eng %N 7675 %1 https://www.ncbi.nlm.nih.gov/pubmed/29022597?dopt=Abstract %R 10.1038/nature24277 %0 Journal Article %J Nat Commun %D 2017 %T Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. %A Kim-Hellmuth, Sarah %A Bechheim, Matthias %A Pütz, Benno %A Mohammadi, Pejman %A Nédélec, Yohann %A Giangreco, Nicholas %A Becker, Jessica %A Kaiser, Vera %A Fricker, Nadine %A Beier, Esther %A Boor, Peter %A Castel, Stephane E %A Nöthen, Markus M %A Barreiro, Luis B %A Pickrell, Joseph K %A Müller-Myhsok, Bertram %A Lappalainen, Tuuli %A Schumacher, Johannes %A Hornung, Veit %K Acetylmuramyl-Alanyl-Isoglutamine %K Adjuvants, Immunologic %K Adolescent %K Adult %K Autoimmune Diseases %K Gene Expression %K Gene Expression Profiling %K Gene Expression Regulation %K Genetic Predisposition to Disease %K Healthy Volunteers %K Humans %K Indicators and Reagents %K Lipids %K Lipopolysaccharides %K Male %K Monocytes %K Quantitative Trait Loci %K Regulatory Sequences, Nucleic Acid %K RNA, Double-Stranded %K RNA, Messenger %K Young Adult %X

The immune system plays a major role in human health and disease, and understanding genetic causes of interindividual variability of immune responses is vital. Here, we isolate monocytes from 134 genotyped individuals, stimulate these cells with three defined microbe-associated molecular patterns (LPS, MDP, and 5'-ppp-dsRNA), and profile the transcriptomes at three time points. Mapping expression quantitative trait loci (eQTL), we identify 417 response eQTLs (reQTLs) with varying effects between conditions. We characterize the dynamics of genetic regulation on early and late immune response and observe an enrichment of reQTLs in distal cis-regulatory elements. In addition, reQTLs are enriched for recent positive selection with an evolutionary trend towards enhanced immune response. Finally, we uncover reQTL effects in multiple GWAS loci and show a stronger enrichment for response than constant eQTLs in GWAS signals of several autoimmune diseases. This demonstrates the importance of infectious stimuli in modifying genetic predisposition to disease. Insight into the genetic influence on the immune response is important for the understanding of interindividual variability in human pathologies. Here, the authors generate transcriptome data from human blood monocytes stimulated with various immune stimuli and provide a time-resolved response eQTL map.

%B Nat Commun %V 8 %P 266 %8 2017 08 16 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28814792?dopt=Abstract %R 10.1038/s41467-017-00366-1 %0 Journal Article %J Nat Genet %D 2017 %T The impact of structural variation on human gene expression. %A Chiang, Colby %A Scott, Alexandra J %A Davis, Joe R %A Tsang, Emily K %A Li, Xin %A Kim, Yungil %A Hadzic, Tarik %A Damani, Farhan N %A Ganel, Liron %A Montgomery, Stephen B %A Battle, Alexis %A Conrad, Donald F %A Hall, Ira M %K Algorithms %K Chromosome Mapping %K Gene Expression Regulation %K Genetic Variation %K Genome, Human %K Genome-Wide Association Study %K Humans %K INDEL Mutation %K Linear Models %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Sequence Analysis, DNA %X

Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs, a substantially higher fraction than prior estimates, and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.

%B Nat Genet %V 49 %P 692-699 %8 2017 May %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/28369037?dopt=Abstract %R 10.1038/ng.3834 %0 Journal Article %J Circulation %D 2017 %T Polygenic Risk Score Identifies Subgroup With Higher Burden of Atherosclerosis and Greater Relative Benefit From Statin Therapy in the Primary Prevention Setting. %A Natarajan, Pradeep %A Young, Robin %A Stitziel, Nathan O %A Padmanabhan, Sandosh %A Baber, Usman %A Mehran, Roxana %A Sartori, Samantha %A Fuster, Valentin %A Reilly, Dermot F %A Butterworth, Adam %A Rader, Daniel J %A Ford, Ian %A Sattar, Naveed %A Kathiresan, Sekar %K Adolescent %K Adult %K Aged %K Aged, 80 and over %K Atherosclerosis %K Cohort Studies %K Cost of Illness %K Female %K Humans %K Hydroxymethylglutaryl-CoA Reductase Inhibitors %K Male %K Middle Aged %K Multifactorial Inheritance %K Primary Prevention %K Risk Factors %K Young Adult %X

BACKGROUND: Relative risk reduction with statin therapy has been consistent across nearly all subgroups studied to date. However, in analyses of 2 randomized controlled primary prevention trials (ASCOT [Anglo-Scandinavian Cardiac Outcomes Trial-Lipid-Lowering Arm] and JUPITER [Justification for the Use of Statins in Prevention: An Intervention Trial Evaluating Rosuvastatin]), statin therapy led to a greater relative risk reduction among a subgroup at high genetic risk. Here, we aimed to confirm this observation in a third primary prevention randomized controlled trial. In addition, we assessed whether those at high genetic risk had a greater burden of subclinical coronary atherosclerosis.

METHODS: We studied participants from a randomized controlled trial of primary prevention with statin therapy (WOSCOPS [West of Scotland Coronary Prevention Study]; n=4910) and 2 observational cohort studies (CARDIA [Coronary Artery Risk Development in Young Adults] and BioImage; n=1154 and 4392, respectively). For each participant, we calculated a polygenic risk score derived from up to 57 common DNA sequence variants previously associated with coronary heart disease. We compared the relative efficacy of statin therapy in those at high genetic risk (top quintile of polygenic risk score) versus all others (WOSCOPS), as well as the association between the polygenic risk score and coronary artery calcification (CARDIA) and carotid artery plaque burden (BioImage).

RESULTS: Among WOSCOPS trial participants at high genetic risk, statin therapy was associated with a relative risk reduction of 44% (95% confidence interval [CI], 22-60; P<0.001), whereas in all others, the relative risk reduction was 24% (95% CI, 8-37; P=0.004) despite similar low-density lipoprotein cholesterol lowering. In a study-level meta-analysis across the WOSCOPS, ASCOT, and JUPITER primary prevention trials, relative risk reduction in those at high genetic risk was 46% versus 26% in all others (P for heterogeneity=0.05). Across all 3 studies, the absolute risk reduction with statin therapy was 3.6% (95% CI, 2.0-5.1) among those in the high genetic risk group and 1.3% (95% CI, 0.6-1.9) in all others. Each 1-SD increase in the polygenic risk score was associated with 1.32-fold (95% CI, 1.04-1.68) greater likelihood of having coronary artery calcification and 9.7% higher (95% CI, 2.2-17.8) burden of carotid plaque.

CONCLUSIONS: Those at high genetic risk have a greater burden of subclinical atherosclerosis and derive greater relative and absolute benefit from statin therapy to prevent a first coronary heart disease event.

CLINICAL TRIAL REGISTRATION: URL: http://www.clinicaltrials.gov. Unique identifiers: NCT00738725 (BioImage) and NCT00005130 (CARDIA). WOSCOPS was carried out and completed before the requirement for clinical trial registration.

%B Circulation %V 135 %P 2091-2101 %8 2017 May 30 %G eng %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/28223407?dopt=Abstract %R 10.1161/CIRCULATIONAHA.116.024436 %0 Journal Article %J Am J Hum Genet %D 2017 %T Practical Approaches for Whole-Genome Sequence Analysis of Heart- and Blood-Related Traits. %A Morrison, Alanna C %A Huang, Zhuoyi %A Yu, Bing %A Metcalf, Ginger %A Liu, Xiaoming %A Ballantyne, Christie %A Coresh, Josef %A Yu, Fuli %A Muzny, Donna %A Feofanova, Elena %A Rustagi, Navin %A Gibbs, Richard %A Boerwinkle, Eric %K Black or African American %K C-Reactive Protein %K Cholesterol, HDL %K Cholesterol, LDL %K Chromosomes, Human, Pair 9 %K Gene Frequency %K Genome, Human %K Genome-Wide Association Study %K Genomics %K Hemoglobins %K Humans %K Introns %K Leukocyte Count %K Lipoprotein(a) %K Magnesium %K Natriuretic Peptide, Brain %K Neutrophils %K Peptide Fragments %K Phosphorus %K Platelet Count %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Troponin T %K White People %X

Whole-genome sequencing (WGS) allows for a comprehensive view of the sequence of the human genome. We present and apply integrated methodologic steps for interrogating WGS data to characterize the genetic architecture of 10 heart- and blood-related traits in a sample of 1,860 African Americans. In order to evaluate the contribution of regulatory and non-protein coding regions of the genome, we conducted aggregate tests of rare variation across the entire genomic landscape using a sliding window, complemented by an annotation-based assessment of the genome using predefined regulatory elements and within the first intron of all genes. These tests were performed treating all variants equally as well as with individual variants weighted by a measure of predicted functional consequence. Significant findings were assessed in 1,705 individuals of European ancestry. After these steps, we identified and replicated components of the genomic landscape significantly associated with heart- and blood-related traits. For two traits, lipoprotein(a) levels and neutrophil count, aggregate tests of low-frequency and rare variation were significantly associated across multiple motifs. For a third trait, cardiac troponin T, investigation of regulatory domains identified a locus on chromosome 9. These practical approaches for WGS analysis led to the identification of informative genomic regions and also showed that defined non-coding regions, such as first introns of genes and regulatory domains, are associated with important risk factor phenotypes. This study illustrates the tractable nature of WGS data and outlines an approach for characterizing the genetic architecture of complex traits.

%B Am J Hum Genet %V 100 %P 205-215 %8 2017 Feb 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/28089252?dopt=Abstract %R 10.1016/j.ajhg.2016.12.009 %0 Journal Article %J Genome Res %D 2017 %T Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change. %A Mohammadi, Pejman %A Castel, Stephane E %A Brown, Andrew A %A Lappalainen, Tuuli %K Alleles %K Databases, Genetic %K Gene Expression %K Gene Expression Profiling %K Gene Regulatory Networks %K Genetic Variation %K Humans %K Models, Theoretical %K Quantitative Trait Loci %X

Mapping cis-acting expression quantitative trait loci (cis-eQTL) has become a popular approach for characterizing proximal genetic regulatory variants. In this paper, we describe and characterize log allelic fold change (aFC), the magnitude of expression change associated with a given genetic variant, as a biologically interpretable unit for quantifying the effect size of cis-eQTLs and a mathematically convenient approach for systematic modeling of cis-regulation. This measure is mathematically independent from expression level and allele frequency, additive, applicable to multiallelic variants, and generalizable to multiple independent variants. We provide efficient tools and guidelines for estimating aFC from both eQTL and allelic expression data sets and apply it to Genotype Tissue Expression (GTEx) data. We show that aFC estimates independently derived from eQTL and allelic expression data are highly consistent, and identify technical and biological correlates of eQTL effect size. We generalize aFC to analyze genes with two eQTLs in GTEx and show that in nearly all cases the two eQTLs act independently in regulating gene expression. In summary, aFC is a solid measure of cis-regulatory effect size that allows quantitative interpretation of cellular regulatory events from population data, and it is a valuable approach for investigating novel aspects of eQTL data sets.

%B Genome Res %V 27 %P 1872-1884 %8 2017 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/29021289?dopt=Abstract %R 10.1101/gr.216747.116 %0 Journal Article %J N Engl J Med %D 2017 %T Resolution of Disease Phenotypes Resulting from Multilocus Genomic Variation. %A Posey, Jennifer E %A Harel, Tamar %A Liu, Pengfei %A Rosenfeld, Jill A %A James, Regis A %A Coban Akdemir, Zeynep H %A Walkiewicz, Magdalena %A Bi, Weimin %A Xiao, Rui %A Ding, Yan %A Xia, Fan %A Beaudet, Arthur L %A Muzny, Donna M %A Gibbs, Richard A %A Boerwinkle, Eric %A Eng, Christine M %A Sutton, V Reid %A Shaw, Chad A %A Plon, Sharon E %A Yang, Yaping %A Lupski, James R %K Exome %K Genetic Diseases, Inborn %K Genetic Variation %K Genotyping Techniques %K High-Throughput Nucleotide Sequencing %K Humans %K Phenotype %K Retrospective Studies %K Sequence Analysis, DNA %X

BACKGROUND: Whole-exome sequencing can provide insight into the relationship between observed clinical phenotypes and underlying genotypes.

METHODS: We conducted a retrospective analysis of data from a series of 7374 consecutive unrelated patients who had been referred to a clinical diagnostic laboratory for whole-exome sequencing; our goal was to determine the frequency and clinical characteristics of patients for whom more than one molecular diagnosis was reported. The phenotypic similarity between molecularly diagnosed pairs of diseases was calculated with the use of terms from the Human Phenotype Ontology.

RESULTS: A molecular diagnosis was rendered for 2076 of 7374 patients (28.2%); among these patients, 101 (4.9%) had diagnoses that involved two or more disease loci. We also analyzed parental samples, when available, and found that de novo variants accounted for 67.8% (61 of 90) of pathogenic variants in autosomal dominant disease genes and 51.7% (15 of 29) of pathogenic variants in X-linked disease genes; both variants were de novo in 44.7% (17 of 38) of patients with two monoallelic variants. Causal copy-number variants were found in 12 patients (11.9%) with multiple diagnoses. Phenotypic similarity scores were significantly lower among patients in whom the phenotype resulted from two distinct mendelian disorders that affected different organ systems (50 patients) than among patients with disorders that had overlapping phenotypic features (30 patients) (median score, 0.21 vs. 0.36; P=1.77×10).

CONCLUSIONS: In our study, we found multiple molecular diagnoses in 4.9% of cases in which whole-exome sequencing was informative. Our results show that structured clinical ontologies can be used to determine the degree of overlap between two mendelian diseases in the same patient; the diseases can be distinct or overlapping. Distinct disease phenotypes affect different organ systems, whereas overlapping disease phenotypes are more likely to be caused by two genes encoding proteins that interact within the same pathway. (Funded by the National Institutes of Health and the Ting Tsung and Wei Fong Chao Foundation.).

%B N Engl J Med %V 376 %P 21-31 %8 2017 Jan 05 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/27959697?dopt=Abstract %R 10.1056/NEJMoa1516767 %0 Journal Article %J N Engl J Med %D 2016 %T Genetic Risk, Adherence to a Healthy Lifestyle, and Coronary Disease. %A Khera, Amit V %A Emdin, Connor A %A Drake, Isabel %A Natarajan, Pradeep %A Bick, Alexander G %A Cook, Nancy R %A Chasman, Daniel I %A Baber, Usman %A Mehran, Roxana %A Rader, Daniel J %A Fuster, Valentin %A Boerwinkle, Eric %A Melander, Olle %A Orho-Melander, Marju %A Ridker, Paul M %A Kathiresan, Sekar %K Aged %K Cohort Studies %K Coronary Disease %K Cross-Sectional Studies %K Female %K Genetic Predisposition to Disease %K Healthy Lifestyle %K Humans %K Incidence %K Male %K Middle Aged %K Multifactorial Inheritance %K Patient Compliance %K Polymorphism, Genetic %K Risk %X

BACKGROUND: Both genetic and lifestyle factors contribute to individual-level risk of coronary artery disease. The extent to which increased genetic risk can be offset by a healthy lifestyle is unknown.

METHODS: Using a polygenic score of DNA sequence polymorphisms, we quantified genetic risk for coronary artery disease in three prospective cohorts - 7814 participants in the Atherosclerosis Risk in Communities (ARIC) study, 21,222 in the Women's Genome Health Study (WGHS), and 22,389 in the Malmö Diet and Cancer Study (MDCS) - and in 4260 participants in the cross-sectional BioImage Study for whom genotype and covariate data were available. We also determined adherence to a healthy lifestyle among the participants using a scoring system consisting of four factors: no current smoking, no obesity, regular physical activity, and a healthy diet.

RESULTS: The relative risk of incident coronary events was 91% higher among participants at high genetic risk (top quintile of polygenic scores) than among those at low genetic risk (bottom quintile of polygenic scores) (hazard ratio, 1.91; 95% confidence interval [CI], 1.75 to 2.09). A favorable lifestyle (defined as at least three of the four healthy lifestyle factors) was associated with a substantially lower risk of coronary events than an unfavorable lifestyle (defined as no or only one healthy lifestyle factor), regardless of the genetic risk category. Among participants at high genetic risk, a favorable lifestyle was associated with a 46% lower relative risk of coronary events than an unfavorable lifestyle (hazard ratio, 0.54; 95% CI, 0.47 to 0.63). This finding corresponded to a reduction in the standardized 10-year incidence of coronary events from 10.7% for an unfavorable lifestyle to 5.1% for a favorable lifestyle in ARIC, from 4.6% to 2.0% in WGHS, and from 8.2% to 5.3% in MDCS. In the BioImage Study, a favorable lifestyle was associated with significantly less coronary-artery calcification within each genetic risk category.

CONCLUSIONS: Across four studies involving 55,685 participants, genetic and lifestyle factors were independently associated with susceptibility to coronary artery disease. Among participants at high genetic risk, a favorable lifestyle was associated with a nearly 50% lower relative risk of coronary artery disease than was an unfavorable lifestyle. (Funded by the National Institutes of Health and others.).

%B N Engl J Med %V 375 %P 2349-2358 %8 2016 Dec 15 %G eng %N 24 %1 https://www.ncbi.nlm.nih.gov/pubmed/27959714?dopt=Abstract %R 10.1056/NEJMoa1605086 %0 Journal Article %J Am J Hum Genet %D 2016 %T Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA. %A Turner, Tychele N %A Hormozdiari, Fereydoun %A Duyzend, Michael H %A McClymont, Sarah A %A Hook, Paul W %A Iossifov, Ivan %A Raja, Archana %A Baker, Carl %A Hoekzema, Kendra %A Stessman, Holly A %A Zody, Michael C %A Nelson, Bradley J %A Huddleston, John %A Sandstrom, Richard %A Smith, Joshua D %A Hanna, David %A Swanson, James M %A Faustman, Elaine M %A Bamshad, Michael J %A Stamatoyannopoulos, John %A Nickerson, Deborah A %A McCallion, Andrew S %A Darnell, Robert %A Eichler, Evan E %K Autistic Disorder %K DNA %K Exome %K Female %K Genome, Human %K Humans %K Male %K Pedigree %K Polymorphism, Single Nucleotide %X

We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.

%B Am J Hum Genet %V 98 %P 58-74 %8 2016 Jan 07 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26749308?dopt=Abstract %R 10.1016/j.ajhg.2015.11.023