Genome differences among 2,504 people worldwide charted
The 1000 Genomes Project Consortium has collected information on genetic differences from the DNA maps of more than 2,500 people, representing 26 population groups from around the globe.
The result is the largest catalog so far of genetic variation in the human genome. It is a reference tool for comparing the genetic make-up of individuals and population groups.
Findings from the cataloging project were reported in two papers Sept. 30 in the scientific journal Nature.
The project located about 88 million places on the human genome where variations occurred. This is double the number of previously known genetic differences. About 12 million of these sites contain common variants shared by many population groups. The researchers also recorded rare variants occurring in fewer than one percent of the people. These make up the bulk of the genetic differences.
African populations show the greatest genetic diversity. This was not unexpected, because humans originated in Africa. Later, ancestors of Asians, Europeans and Native Americans moved away and settled other parts of the world.
Most of the differences among people’s DNA codes are insignificant with respect to human health. A small fraction contributes to an individual’s susceptibility or resistance to diseases, to the effectiveness of treatment options or to pregnancy outcomes. Understanding this variation will be useful in helping to explain why only some people develop common disorders like obesity, cancer, autism, cognitive impairment or heart disease. Some variation may be benign in parents but lead to serious conditions when passed to their offspring.
The catalog was compiled from DNA sequencing of people from Africa, East and South Asia, Europe and the Americas. Their ethnic origins were in Nigeria, Colombia, Spain, Beijing, Sri Lanka and many other places and cultures. Consent for broad release of the data for research was obtained from each individual.
The DNA sequencing and analysis were conducted over seven years by an international group of more than 100 scientists from the United Kingdom, Germany, China, Canada and the United States. Several UW Medicine and UW scientists were involved. The work led to the publication last week of two papers in the scientific journal Nature.
Evan Eichler, UW professor of genome sciences, and his laboratory team were among the leaders of one of the reports.
The cataloging looked at all known types of human genetic variation, from tiny, single alterations or omissions in the DNA four-chemical code, to major structural changes in the genome. These included completely missing genes noticed in an analysis performed by UW programmer John Huddleston, the co-first author of the paper. In cases where no apparent consequences were seen, it's possible that those genes are dispensable. Many of these missing genes had similar copies elsewhere in the genome and are potentially important in immune function.
Eichler’s group concentrated on structural variation. This can result from deletions, insertions, duplications or inversions of DNA sequences. These were more complicated than previously imagined. In some cases, inverted DNA sequences were associated with duplicated segments. This finding hinted at a potential new mutational mechanism. Multiple breakpoints, possibly from separate mutational events, were also discovered. The researchers also detected clusters of repeated rearrangements.
Structural variations, the researchers explained in their paper, are implicated in numerous diseases. They may also reveal the more complex mutational and selective forces that shaped the human genome.
“Most of the analyses ranging from gene expression to disease association reveal that structural variation has far greater impact than previously appreciated,” Eichler said. “Ironically, most of the variation still remains undiscovered because of the limits of short read technology and low sequence coverage of these genomes. There is a lot more work to be done.”
As genomics technology continues to improve, the researchers anticipate that additional layers of complexity in structural variations will be detected.
“Until this happens,” said Peter Sudmant, former UW graduate student and lead author of the paper, “the set of structural variations in this project remain an important starting point for constructing and analyzing personalized genomes.” It is currently the most comprehensive set available for disease and population genetics studies.
The National Human Genome Research Institute, part of the National Institutes of Health, helped fund and direct the public-private 1000 Genomes consortium. Several other institutes at the NIH, and several additional agencies and organizations, also provided funding.