Genetics Project Update: Over 1,000 Genomes and Counting
Your mama always told you that you were special. Now, an international consortium of researchers has clarified just how unique we all are. Each human being harbors several thousand genetic variants, hundreds of which are rare, according to a progress report from the ongoing 1000 Genomes Project in the November 1 Nature. Despite overshooting the titular quantity—it reports on 1,092 genomes from 14 nations—the sequencing consortium is still collecting and reading DNA in its effort to record 2,500 total genomes from a wider geographical area.
While genomewide association studies (GWAS) help researchers discover common genetic variants, the genome and exome sequencing used by the 1000 Genomes group can pick out uncommon ones. Based on a pilot project, the consortium’s leaders elected to perform both low-coverage whole-genome sequencing, reading each strand a handful of times, and deep sequencing, with up to 100 reads of exomes only (see ARF related news story on 1000 Genomes Project Consortium, 2010). While exomes encode proteins, scientists in another large project, the Encyclopedia of DNA Elements (ENCODE), recently reported that most of the rest of the genome has important functions, too (see ARF related news story on ENCODE Project Consortium et al., 2012).
The consortium sequenced DNA from 14 different populations, ranging from the Yoruba of Ibadan, Nigeria, to Han Chinese in Beijing, to African-Americans in the southwestern United States. The 1,092 genomes included 38 million single nucleotide polymorphisms (SNPs) and 1.4 million insertions and deletions. That accounts for most common variants, as well as 98 percent of the rare SNPs that occur at a frequency of 1 percent. While common variants are global, rare ones are often limited to small populations. Rare variants that fall within genes also tend to be deleterious to the encoded proteins, the authors reported.
“This type of massive resource helps us to improve our disease research,” said Gerard Schellenberg of the University of Pennsylvania in Philadelphia, who was not involved in the consortium. He and others have been using the 1000 Genomes data for imputation, a method of deduction geneticists use to fill in gaps in GWAS datasets. The typical gene chips used in GWAS identify a long list of single nucleotide polymorphisms, but do not cover all possible variant sites. Since nearby sequences are co-inherited in chunks, knowing some of the linked variants typically allows researchers to predict the others. Imputation accuracy, according to the Nature paper, is 90-95 percent for common variants. Researchers can use sequencing to confirm imputed variants, wrote Philippe Amouyel of the Institut Pasteur de Lille, France, in an e-mail to Alzforum (see full comment, below).
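To make the idea of imputation concrete, here is a deliberately simplified sketch. It assumes a tiny, made-up reference panel of haplotypes and fills in untyped sites by copying from the reference haplotypes that best match the typed sites. The function name, the panel, and the nearest-match voting rule are all illustrative assumptions; real imputation software relies on probabilistic models of linkage and recombination, not on this shortcut.

```python
from collections import Counter


def impute_haplotype(target, reference_panel, missing="?"):
    """Fill untyped sites in `target` from the best-matching reference haplotypes.

    Toy illustration only: picks the reference haplotypes that agree with the
    target at the most typed sites, then takes a majority vote at each gap.
    """
    # Positions the genotyping chip actually measured.
    typed = [i for i, allele in enumerate(target) if allele != missing]

    def n_matches(haplotype):
        return sum(haplotype[i] == target[i] for i in typed)

    best = max(n_matches(h) for h in reference_panel)
    closest = [h for h in reference_panel if n_matches(h) == best]

    imputed = list(target)
    for i, allele in enumerate(target):
        if allele == missing:
            votes = Counter(h[i] for h in closest)
            imputed[i] = votes.most_common(1)[0][0]
    return "".join(imputed)


# Hypothetical reference panel (stand-in for sequenced genomes) and a
# chip-genotyped sample with two untyped sites marked "?".
reference_panel = ["ACGTA", "ACGTA", "TCGAA", "TGCAG"]
print(impute_haplotype("A?G?A", reference_panel))
```

Because the typed alleles in the example agree best with the first two panel entries, the gaps are copied from them. The sketch also suggests why imputation works less well for rare variants: a rare allele may simply be absent from the closest reference haplotypes.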
It is no surprise, Schellenberg said, that the consortium found that rare variants tend to appear only in specific populations. However, the 1000 Genomes findings underscore how careful geneticists will have to be in matching control populations precisely to case populations, Schellenberg said. Scientists studying a rare variant must take care to show it is truly linked to disease, and not simply the ethnicity or background of the population they are studying, he said. Large numbers of study samples will also be required to find rare, disease-linked variants, added Amouyel, who was not part of the 1000 Genomes team.
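A back-of-the-envelope calculation gives a feel for why sample size matters so much for rare variants. The sketch below assumes each person contributes two independently sampled chromosomes and ignores sequencing error and population structure; the function names are illustrative. It estimates only the chance of seeing a variant at least once, which is a far lower bar than demonstrating an association with disease.

```python
def prob_variant_observed(allele_freq, n_individuals):
    """Chance that at least one copy of the variant appears in the sample.

    Assumes 2 * n_individuals independently drawn chromosomes.
    """
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


def expected_copies(allele_freq, n_individuals):
    """Average number of variant copies expected in the sample."""
    return 2 * n_individuals * allele_freq


# Compare a few sample sizes (1,092 is the number of genomes in the report)
# across allele frequencies of 1, 0.1, and 0.01 percent.
for n in (100, 1092, 10000):
    for freq in (0.01, 0.001, 0.0001):
        print(
            f"n={n:>6}  freq={freq:.4f}  "
            f"P(seen)={prob_variant_observed(freq, n):.3f}  "
            f"expected copies={expected_copies(freq, n):.1f}"
        )
```

Under these assumptions, the expected number of copies shrinks in direct proportion to allele frequency, so comparing cases with controls for a very rare variant requires correspondingly larger cohorts.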
Researchers working on Alzheimer’s and other neurodegenerative diseases are doing just that. While GWAS have uncovered common variants that confer relatively low risk for AD, some researchers believe that rare variants will be found that are of much higher risk. Even if these polymorphisms only affect a few people or a single population, they can provide valuable clues about pathological pathways, Schellenberg said. For example, a newly discovered, protective APP variant in Icelanders helped support the β amyloid hypothesis (see ARF related news story on Jonsson et al., 2012).—Amber Dance
References
News Citations
- Next-Generation Sequencing: Boldly Going Where No Geneticist...
- ENCODE Turns Human Genome From Sequence to Machine
- Protective APP Mutation Found—Supports Amyloid Hypothesis
Paper Citations
- 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28;467(7319):1061-73. PubMed.
- ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74. PubMed.
- Jonsson T, Atwal JK, Steinberg S, Snaedal J, Jonsson PV, Bjornsson S, Stefansson H, Sulem P, Gudbjartsson D, Maloney J, Hoyte K, Gustafson A, Liu Y, Lu Y, Bhangale T, Graham RR, Huttenlocher J, Bjornsdottir G, Andreassen OA, Jönsson EG, Palotie A, Behrens TW, Magnusson OT, Kong A, Thorsteinsdottir U, Watts RJ, Stefansson K. A mutation in APP protects against Alzheimer's disease and age-related cognitive decline. Nature. 2012 Aug 2;488(7409):96-9. PubMed.
Primary Papers
- 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov 1;491(7422):56-65. PubMed.
Comments
Institute of Neurology, UCL
Clearly, the 1000 Genomes Project and related projects are very important for those of us who are interested in the genetic determinants of all diseases. These projects give us the background information for disease association studies. One of the major goals of our lab and other similar labs is to find variants that are present in the order of 0.1-5.0 percent of the general population and substantively increase disease risk. These types of projects tell us what variability is out there in this range. Obviously, as more and more people are sequenced, the lower limit for our studies can fall below even the 0.1 percent level.
Institut Pasteur de Lille
This study is a major achievement.
In the International Genomics of Alzheimer’s Project (IGAP), which gathers the four largest GWAS consortia on AD, we have been using the 1000 Genomes information since the beginning of our collaboration to perform imputations. This has allowed us to identify around eight million SNPs in common for all the studies. Depending on the study, we were able to impute SNPs with a minor allele frequency lower than 0.01 percent for half of the samples. In our present GWAS, 1000 Genomes and similar projects help us to identify specific genome areas where such rare SNPs are located. Once these are found, we have to confirm this imputed information by performing genotyping or deep sequencing. But it is clearly progress in deciphering what is called "hidden heritability."
Indeed, this 1000 Genomes map will help to more efficiently identify rare variants that may be implicated in neurodegenerative diseases. However, you will need very large population samples (as we have obtained in IGAP) to be able to detect variations in rare variants between cases and controls.