1000 Genomes Project team create largest catalogue of genomic differences

Posted: 2 October 2015 | Victoria White

An international team of scientists from the 1000 Genomes Project Consortium has created the world’s largest catalogue of genomic differences among humans, providing researchers with powerful clues to help them establish why some people are susceptible to various diseases.

Understanding how genomic variants contribute to disease may help clinicians develop improved diagnostics and treatments, in addition to new methods of prevention.

In two studies, investigators examined the genomes of 2,504 people from 26 populations across Africa, East and South Asia, Europe and the Americas.

In the main study, investigators identified about 88 million sites in the human genome that vary among people, establishing a database available to researchers as a standard reference for how the genomic make-up of people varies in populations and around the world. The catalogue more than doubles the number of known variant sites in the human genome, and can now be used in a wide range of studies of human biology and medicine.

“The 1000 Genomes Project was an ambitious, historically significant effort that has produced a valuable resource about human genomic variation,” said Eric Green, M.D., Ph.D., director of the US National Human Genome Research Institute (NHGRI). “The latest data and insights add to a growing understanding of the patterns of variation in individuals’ genomes, and provide a foundation for gaining greater insights into the genomics of human disease.”

“Some 88 million sites in the genome differ among people. About one-quarter of these variants are common and occur in many or all populations, while about three-quarters occur in only 1% of people or are even more rare,” said Lisa Brooks, Ph.D., programme director in the NHGRI Genomic Variation Programme. “The 1000 Genomes Project data are a resource for any study in which scientists are looking for genomic contributions to disease, including the study of both common and rare variants.”

Scientists can use the 1000 Genomes Project data to home in on regions affecting disease

One of the more immediate uses of 1000 Genomes Project data is for genome-wide association studies (GWAS), which compare the genomes of people with and without a disease to search for regions of the genome that contain genomic variants associated with that disease. Such studies generally find several genomic regions associated with a disease and many variants in each of those regions. Scientists can now combine GWAS data with the more detailed 1000 Genomes Project data to home in on regions affecting disease more precisely. Instead of sequencing the genomes of all the people in a study, which remains expensive, researchers can use the 1000 Genomes Project data to find most of the variants in those regions that are associated with the disease.
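As a rough illustration of the comparison a GWAS makes at a single variant, the sketch below runs a basic chi-square test on allele counts from people with and without a disease. It is a minimal example under stated assumptions, not the Consortium's method: real studies typically use regression models with covariates and strict multiple-testing thresholds. The function name, variant names and counts are all hypothetical.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are the variant (alt) and
    reference (ref) alleles. Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one observation")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were unrelated to disease status.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref).
variants = {
    "variant_A": (300, 700, 200, 800),  # variant allele enriched in cases
    "variant_B": (250, 750, 248, 752),  # almost no difference between groups
}

results = {name: allelic_association(*counts) for name, counts in variants.items()}

for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2={chi2:.3f}, p={p_value:.3g}")
```

In this made-up data, the first variant shows a strong association with the disease and the second shows none. A reference catalogue such as the 1000 Genomes Project data would then be used to look up the other known variants in the region around an associated site.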

In the second study, scientists examined differences in the structure of the genome in the 2,504 samples. They found nearly 69,000 differences, known as structural variants. The researchers created a map of eight classes of structural variants that potentially contribute to disease.

“Structural variation is responsible for a large percentage of differences in the DNA among human genomes,” said Jan Korbel, Ph.D., group leader and European Research Council Investigator in the Genome Biology Unit of the European Molecular Biology Laboratory in Heidelberg, Germany. “No study has ever looked at genomic structural variation with this kind of broad representation of populations around the world.”

Dr Korbel and colleagues discovered that structural variants were often more complicated than they had originally thought. For example, most inversions, in which a DNA sequence changes its orientation in the genome, occur alongside other structural changes.

The 1000 Genomes Project team developed new methods for large-scale DNA sequencing

To Gonçalo Abecasis, Ph.D., chair of biostatistics at the University of Michigan in Ann Arbor and co-principal investigator for the main study, the value of the 1000 Genomes Project extends far beyond the data. Advances in DNA sequencing and bioinformatics were vital to completing the project.

“We’ve learned a great deal about how to do genomics on a large scale,” said Dr Abecasis. “Over the course of the 1000 Genomes Project, we developed new, improved methods for large-scale DNA sequencing, analysis and interpretation of genomic information, in addition to how to store this much data. We learned how to do quality genomic studies in different contexts and parts of the world.”
