Efficient phasing and imputation of low-coverage sequencing data using large reference panels.

Title: Efficient phasing and imputation of low-coverage sequencing data using large reference panels.
Publication Type: Journal Article
Year of Publication: 2021
Authors: Rubinacci, S, Ribeiro, DM, Hofmeister, RJ, Delaneau, O
Journal: Nat Genet
Volume: 53
Issue: 1
Pagination: 120-126
Date Published: 2021 01
ISSN: 1546-1718
Keywords: Genome, Human; Genotype; Humans; Likelihood Functions; Polymorphism, Single Nucleotide; Reference Standards; Sequence Analysis, DNA
Abstract

Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1× coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.

DOI: 10.1038/s41588-020-00756-0
Alternate Journal: Nat Genet
PubMed ID: 33414550
Grant List: UM1 HG008901 / HG / NHGRI NIH HHS / United States