An integrated encyclopedia of DNA elements in the human genome

The ENCODE Project Consortium, Data production leads (data production), Lead analysts (data analysis), Writing group, NHGRI project management (scientific management), Principal investigators (steering committee), Boise State University and University of North Carolina at Chapel Hill Proteomics groups (data production and analysis), Broad Institute Group (data production and analysis), Cold Spring Harbor University of Geneva Center for Genomic Regulation Barcelona RIKEN Sanger Institute University of Lausanne Genome Institute of Singapore group (data production and analysis), Data coordination center at UC Santa Cruz (production data coordination), Duke University EBI University of Texas Austin University of North Carolina-Chapel Hill group (data production and analysis), Genome Institute of Singapore group (data production and analysis), HudsonAlpha Institute Caltech UC Irvine Stanford group (data production and analysis), Lawrence Berkeley National Laboratory group (targeted experimental validation), NHGRI groups (data production and analysis), Sanger Institute Washington University Yale University Center for Genomic Regulation Barcelona UCSC MIT University of Lausanne CNIO group (data production and analysis), Stanford-Yale Harvard University of Massachusetts Medical School University of Southern California/UC Davis group (data production and analysis), University of Albany SUNY group (data production and analysis), University of Chicago Stanford group (data production and analysis), University of Heidelberg group (targeted experimental validation), University of Massachusetts Medical School Bioinformatics group (data production and analysis), University of Washington University of Massachusetts Medical Center group (data production and analysis), Data Analysis Center (data analysis)

Research output: Contribution to journal › Article

7031 Citations (Scopus)

Abstract

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Original language: English (US)
Pages (from-to): 57-74
Number of pages: 18
Journal: Nature
Volume: 489
Issue number: 7414
DOIs: https://doi.org/10.1038/nature11247
State: Published - Sep 6 2012

Fingerprint

Encyclopedias
Human Genome
DNA
Histone Code
Genome
Genes
Open Reading Frames
Chromatin
Biomedical Research
Transcription Factors

ASJC Scopus subject areas

  • General

Cite this

The ENCODE Project Consortium, Data production leads (data production), Lead analysts (data analysis), Writing group, NHGRI project management (scientific management), Principal investigators (steering committee), ... Data Analysis Center (data analysis) (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414), 57-74. https://doi.org/10.1038/nature11247

@article{27b3beecf5b34dff9243ffe2fda69c8c,
title = "An integrated encyclopedia of DNA elements in the human genome",
abstract = "The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80{\%} of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.",
author = "{The ENCODE Project Consortium} and {Data production leads (data production)} and {Lead analysts (data analysis)} and {Writing group} and {NHGRI project management (scientific management)} and {Principal investigators (steering committee)} and {Boise State University and University of North Carolina at Chapel Hill Proteomics groups (data production and analysis)} and {Broad Institute Group (data production and analysis)} and {Cold Spring Harbor University of Geneva Center for Genomic Regulation Barcelona RIKEN Sanger Institute University of Lausanne Genome Institute of Singapore group (data production and analysis)} and {Data coordination center at UC Santa Cruz (production data coordination)} and {Duke University EBI University of Texas Austin University of North Carolina-Chapel Hill group (data production and analysis)} and {Genome Institute of Singapore group (data production and analysis)} and {HudsonAlpha Institute Caltech UC Irvine Stanford group (data production and analysis)} and {Lawrence Berkeley National Laboratory group (targeted experimental validation)} and {NHGRI groups (data production and analysis)} and {Sanger Institute Washington University Yale University Center for Genomic Regulation Barcelona UCSC MIT University of Lausanne CNIO group (data production and analysis)} and {Stanford-Yale Harvard University of Massachusetts Medical School University of Southern California/UC Davis group (data production and analysis)} and {University of Albany SUNY group (data production and analysis)} and {University of Chicago Stanford group (data production and analysis)} and {University of Heidelberg group (targeted experimental validation)} and {University of Massachusetts Medical School Bioinformatics group (data production and analysis)} and {University of Washington University of Massachusetts Medical Center group (data production and analysis)} and {Data Analysis Center (data analysis)} and Ian Dunham and Anshul Kundaje and Aldred, {Shelley F.} and Collins, {Patrick J.} and Davis, {Carrie A.} and Francis Doyle and Epstein, {Charles B.} and Seth Frietze and Jennifer Harrow and Rajinder Kaul and Jainab Khatun and Lajoie, {Bryan R.} and Landt, {Stephen G.} and Lee, {Bum Kyu} and Florencia Pauli and Rosenbloom, {Kate R.} and Peter Sabo and Alexias Safi and Amartya Sanyal and Noam Shoresh and Simon, {Jeremy M.} and Lingyun Song and Trinklein, {Nathan D.} and Altshuler, {Robert C.} and Ewan Birney and Brown, {James B.} and Chao Cheng and Sarah Djebali and Xianjun Dong and Jason Ernst and Furey, {Terrence S.} and Mark Gerstein and Belinda Giardine and Melissa Greven and Hardison, {Ross C.} and Harris, {Robert S.} and Javier Herrero and Hoffman, {Michael M.} and Sowmya Iyer and Manolis Kellis and Pouya Kheradpour and Timo Lassmann and Qunhua Li and Xinying Lin and Marinov, {Georgi K.} and Angelika Merkel and Ali Mortazavi and Parker, {Stephen C J} and Jin, {Victor X} and Penalva, {Luiz O}",
year = "2012",
month = "9",
day = "6",
doi = "10.1038/nature11247",
language = "English (US)",
volume = "489",
pages = "57--74",
journal = "Nature",
issn = "0028-0836",
publisher = "Nature Publishing Group",
number = "7414",

}

TY - JOUR

T1 - An integrated encyclopedia of DNA elements in the human genome

AU - The ENCODE Project Consortium

AU - Data production leads (data production)

AU - Lead analysts (data analysis)

AU - Writing group

AU - NHGRI project management (scientific management)

AU - Principal investigators (steering committee)

AU - Boise State University and University of North Carolina at Chapel Hill Proteomics groups (data production and analysis)

AU - Broad Institute Group (data production and analysis)

AU - Cold Spring Harbor University of Geneva Center for Genomic Regulation Barcelona RIKEN Sanger Institute University of Lausanne Genome Institute of Singapore group (data production and analysis)

AU - Data coordination center at UC Santa Cruz (production data coordination)

AU - Duke University EBI University of Texas Austin University of North Carolina-Chapel Hill group (data production and analysis)

AU - Genome Institute of Singapore group (data production and analysis)

AU - HudsonAlpha Institute Caltech UC Irvine Stanford group (data production and analysis)

AU - Lawrence Berkeley National Laboratory group (targeted experimental validation)

AU - NHGRI groups (data production and analysis)

AU - Sanger Institute Washington University Yale University Center for Genomic Regulation Barcelona UCSC MIT University of Lausanne CNIO group (data production and analysis)

AU - Stanford-Yale Harvard University of Massachusetts Medical School University of Southern California/UC Davis group (data production and analysis)

AU - University of Albany SUNY group (data production and analysis)

AU - University of Chicago Stanford group (data production and analysis)

AU - University of Heidelberg group (targeted experimental validation)

AU - University of Massachusetts Medical School Bioinformatics group (data production and analysis)

AU - University of Washington University of Massachusetts Medical Center group (data production and analysis)

AU - Data Analysis Center (data analysis)

AU - Dunham, Ian

AU - Kundaje, Anshul

AU - Aldred, Shelley F.

AU - Collins, Patrick J.

AU - Davis, Carrie A.

AU - Doyle, Francis

AU - Epstein, Charles B.

AU - Frietze, Seth

AU - Harrow, Jennifer

AU - Kaul, Rajinder

AU - Khatun, Jainab

AU - Lajoie, Bryan R.

AU - Landt, Stephen G.

AU - Lee, Bum Kyu

AU - Pauli, Florencia

AU - Rosenbloom, Kate R.

AU - Sabo, Peter

AU - Safi, Alexias

AU - Sanyal, Amartya

AU - Shoresh, Noam

AU - Simon, Jeremy M.

AU - Song, Lingyun

AU - Trinklein, Nathan D.

AU - Altshuler, Robert C.

AU - Birney, Ewan

AU - Brown, James B.

AU - Cheng, Chao

AU - Djebali, Sarah

AU - Dong, Xianjun

AU - Ernst, Jason

AU - Furey, Terrence S.

AU - Gerstein, Mark

AU - Giardine, Belinda

AU - Greven, Melissa

AU - Hardison, Ross C.

AU - Harris, Robert S.

AU - Herrero, Javier

AU - Hoffman, Michael M.

AU - Iyer, Sowmya

AU - Kellis, Manolis

AU - Kheradpour, Pouya

AU - Lassmann, Timo

AU - Li, Qunhua

AU - Lin, Xinying

AU - Marinov, Georgi K.

AU - Merkel, Angelika

AU - Mortazavi, Ali

AU - Parker, Stephen C J

AU - Jin, Victor X

AU - Penalva, Luiz O

PY - 2012/9/6

Y1 - 2012/9/6

N2 - The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

UR - http://www.scopus.com/inward/record.url?scp=84865790047&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84865790047&partnerID=8YFLogxK

U2 - 10.1038/nature11247

DO - 10.1038/nature11247

M3 - Article

VL - 489

SP - 57

EP - 74

JO - Nature

JF - Nature

SN - 0028-0836

IS - 7414

ER -