First studies of human genetic variation released by the gnomAD Consortium

The Genome Aggregation Database (gnomAD) Consortium has released seven papers leveraging its database to study genetic variants and their potential for guiding discovery of safer drugs.

The Genome Aggregation Database (gnomAD) Consortium has announced the release of the first seven papers based on discoveries from their database of more than 125,000 exomes and 15,000 whole genomes from populations around the world.

Since 2012 the consortium, originally the Exome Aggregation Consortium (ExAC), has expanded upon the work of the 1000 Genomes Project and other similar efforts to catalogue human genetic variation. Since the initial release of whole exome data in October 2014, the database has grown into the gnomAD v2.1.1 dataset, which includes genomes and exomes from more than 25,000 people of East and South Asian descent, nearly 18,000 of Latino descent and 12,000 of African or African-American descent.

According to the consortium, more than 100 scientists and groups internationally have contributed data and/or analytical effort.

The studies cover several areas, including:

  • how the catalogue can be used to understand important and rare types of genetic variation, such as structural variants and loss-of-function (LoF) variants
  • how the catalogue can be leveraged to help clinical geneticists diagnose patients with rare genetic disease
  • how population-scale datasets like gnomAD can help evaluate proposed drug targets.

“These studies represent the first significant wave of discovery to come out of the gnomAD Consortium,” said Daniel MacArthur, scientific lead of the gnomAD project, a senior author on six of the studies, an institute member in the Program in Medical and Population Genetics at the Broad Institute of MIT and Harvard, and director of the Centre for Population Genomics at the Garvan Institute of Medical Research and Murdoch Children’s Research Institute in Australia. “The power of this database comes from its sheer size and population diversity, which we were able to reach thanks to the generosity of the investigators who contributed data to it, and of the research participants in those contributing studies.”

gnomAD and loss-of-function (LoF) variants

Two of the seven papers demonstrate the utility of large genomic datasets for learning about rare or understudied types of genetic variants.

One such study, the flagship paper published in Nature, led by MacArthur and Konrad Karczewski, first author of the paper and a computational biologist at the Broad Institute and Massachusetts General Hospital’s (MGH’s) Analytic and Translational Genetics Unit, maps loss-of-function (LoF) variants.

LoFs are genetic changes that are thought to completely disrupt the function of protein-coding genes.

By comparing the number of variants in each gene across the more than 443,000 LoF variants the team identified in the gnomAD dataset, the authors were also able to classify all protein-coding genes according to how tolerant they are to disruptive mutations. The classification system pinpoints genes that are more likely to be involved in severe diseases such as intellectual disability.

“The gnomAD catalog gives us our best look so far at the spectrum of genes’ sensitivity to variation and provides a resource to support gene discovery in common and rare disease,” Karczewski explained.

Structural variants

In their paper, also published in Nature, graduate student Ryan Collins, Broad associated scientist Harrison Brand, institute member Michael Talkowski and colleagues used gnomAD to explore structural variants.

Structural variants include duplications, deletions, inversions and other changes involving larger DNA segments (generally more than 50-100 bases long). Their study presents gnomAD-SV, a catalogue of more than 433,000 structural variants identified within nearly 15,000 of the gnomAD genomes, which represents most of the major known classes of structural variation.

“Structural variants are notoriously challenging to identify within whole genome data, and have not previously been surveyed at this scale,” noted Talkowski, who is also a faculty member in the Center for Genomic Medicine at MGH. “But they alter more individual bases in the genome than any other form of variation, and are well established drivers of human evolution and disease.”

The authors were surprised to find that more than 25 percent of all rare LoF variants in the average individual genome are actually structural variants, and that many people carry what should be deleterious or harmful structural alterations without the expected phenotypes or clinical outcomes. They also highlighted that genes were just as sensitive to duplications as they were to deletions.

“We learned a great deal by building this catalogue in gnomAD, but we’ve clearly only scratched the surface of understanding the influence of genome structure on biology and disease,” Talkowski said.

gnomAD guiding drug development

Two of the studies describe how the diverse, population-scale data could be used by researchers to pick drug targets.

One of these studies was based on musings by Broad associated scientist Eric Minikel about whether genes with naturally occurring predicted LoF variants could be used to assess the safety of targeting those genes with drugs. He suggested that if a gene is naturally deactivated without harmful effects, then it could possibly be safe to inhibit with a drug.

Minikel, MacArthur and colleagues leveraged the gnomAD dataset to explore this question. The results, along with their suggestions for how insights about LoF variants can be incorporated into the drug development process, were published in Nature Medicine.

The collaborators on the study used the data on LoF variants to study the potential safety liabilities of reducing the expression of a gene called LRRK2. This gene is associated with risk of developing Parkinson’s disease and so is a desirable target for intervention strategies.

The team predicted from the data that drugs able to reduce LRRK2 protein levels or partially block the gene’s activity are unlikely to have severe side effects.

“We’ve cataloged large amounts of gene-disrupting variation in gnomAD,” MacArthur said. “And with these two studies we’ve shown how you can then leverage those variants to illuminate and assess potential drug targets.”