Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes

Andrew R. Wood, Marcus A. Tuke, Mike Nalls, Dena Hernandez, J. Raphael Gibbs, Haoxiang Lin, Christopher S. Xu, Qibin Li, Juan Shen, Goo Jun, Marcio Almeida, Toshiko Tanaka, John R B Perry, Kyle Gaulton, Manny Rivas, Richard Pearson, Joanne E. Curran, Matthew P. Johnson, Harald H H Göring, Ravindranath Duggirala, John Blangero, Mark I. McCarthy, Stefania Bandinelli, Anna Murray, Michael N. Weedon, Andrew Singleton, David Melzer, Luigi Ferrucci, Timothy M. Frayling

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Initial results from sequencing studies suggest that there are relatively few low-frequency (<5%) variants associated with large effects on common phenotypes. We performed low-pass whole-genome sequencing in 680 individuals from the InCHIANTI study to test two primary hypotheses: (i) that sequencing would detect single low-frequency, large-effect variants that explained similar amounts of phenotypic variance as single common variants, and (ii) that some common variant associations could be explained by low-frequency variants. We tested two sets of disease-related common phenotypes for which we had statistical power to detect large numbers of common variant-common phenotype associations: 11 132 cis-gene expression traits in 450 individuals and 93 circulating biomarkers in all 680 individuals. From a total of 11 657 229 high-quality variants, of which 6 129 221 and 5 528 008 were common and low-frequency (<5%), respectively, low-frequency, large-effect associations comprised 7% of detectable cis-gene expression traits [89 of 1314 cis-eQTLs at P < 1 × 10^-6 (false discovery rate ~5%)] and one of eight biomarker associations at P < 8 × 10^-10. Very few (30 of 1232; 2%) common variant associations were fully explained by low-frequency variants. Our data show that whole-genome sequencing can identify low-frequency variants undetected by genotyping-based approaches when sample sizes are sufficiently large to detect substantial numbers of common variant associations, and that common variant associations are rarely explained by single low-frequency variants of large effect.
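As a quick consistency check on the figures quoted in the abstract, the short sketch below recomputes the variant total and the two proportions. It is illustrative only: the variable and function names are assumptions introduced here, and the numbers are simply those stated above, not a reanalysis of the study data.

```python
# Counts copied from the abstract; all names here are hypothetical.
total_variants = 11_657_229
common_variants = 6_129_221
low_freq_variants = 5_528_008

low_freq_eqtls, all_eqtls = 89, 1314      # low-frequency cis-eQTLs / all cis-eQTLs
explained, common_assocs = 30, 1232       # common associations fully explained


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# The common and low-frequency counts should sum to the stated total.
counts_consistent = common_variants + low_freq_variants == total_variants
eqtl_share = percent(low_freq_eqtls, all_eqtls)
explained_share = percent(explained, common_assocs)

print(counts_consistent)
print(f"{eqtl_share:.1f}%")
print(f"{explained_share:.1f}%")
```

Rounded to whole percentages, the two shares agree with the 7% and 2% reported in the abstract.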

Original language: English (US)
Pages (from-to): 1504-1512
Number of pages: 9
Journal: Human Molecular Genetics
Volume: 24
Issue number: 5
DOIs: https://doi.org/10.1093/hmg/ddu560
State: Published - Mar 1 2015
Externally published: Yes

Fingerprint

Biomarkers
Genome
Phenotype
Gene Expression
Sample Size

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
  • Molecular Biology

Cite this

Wood, A. R., Tuke, M. A., Nalls, M., Hernandez, D., Gibbs, J. R., Lin, H., ... Frayling, T. M. (2015). Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes. Human Molecular Genetics, 24(5), 1504-1512. https://doi.org/10.1093/hmg/ddu560

@article{3ab0e19f8acd4c86bf7923949248a564,
title = "Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes",
author = "Wood, {Andrew R.} and Tuke, {Marcus A.} and Mike Nalls and Dena Hernandez and Gibbs, {J. Raphael} and Haoxiang Lin and Xu, {Christopher S.} and Qibin Li and Juan Shen and Goo Jun and Marcio Almeida and Toshiko Tanaka and Perry, {John R B} and Kyle Gaulton and Manny Rivas and Richard Pearson and Curran, {Joanne E.} and Johnson, {Matthew P.} and G{\"o}ring, {Harald H H} and Ravindranath Duggirala and John Blangero and McCarthy, {Mark I.} and Stefania Bandinelli and Anna Murray and Weedon, {Michael N.} and Andrew Singleton and David Melzer and Luigi Ferrucci and Frayling, {Timothy M.}",
year = "2015",
month = "3",
day = "1",
doi = "10.1093/hmg/ddu560",
language = "English (US)",
volume = "24",
pages = "1504--1512",
journal = "Human Molecular Genetics",
issn = "0964-6906",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR

T1 - Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes

AU - Wood, Andrew R.

AU - Tuke, Marcus A.

AU - Nalls, Mike

AU - Hernandez, Dena

AU - Gibbs, J. Raphael

AU - Lin, Haoxiang

AU - Xu, Christopher S.

AU - Li, Qibin

AU - Shen, Juan

AU - Jun, Goo

AU - Almeida, Marcio

AU - Tanaka, Toshiko

AU - Perry, John R B

AU - Gaulton, Kyle

AU - Rivas, Manny

AU - Pearson, Richard

AU - Curran, Joanne E.

AU - Johnson, Matthew P.

AU - Göring, Harald H H

AU - Duggirala, Ravindranath

AU - Blangero, John

AU - McCarthy, Mark I.

AU - Bandinelli, Stefania

AU - Murray, Anna

AU - Weedon, Michael N.

AU - Singleton, Andrew

AU - Melzer, David

AU - Ferrucci, Luigi

AU - Frayling, Timothy M.

PY - 2015/3/1

Y1 - 2015/3/1

UR - http://www.scopus.com/inward/record.url?scp=84924430270&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84924430270&partnerID=8YFLogxK

U2 - 10.1093/hmg/ddu560

DO - 10.1093/hmg/ddu560

M3 - Article

VL - 24

SP - 1504

EP - 1512

JO - Human Molecular Genetics

JF - Human Molecular Genetics

SN - 0964-6906

IS - 5

ER -