TY - JOUR
T1 - Prioritizing causal disease genes using unbiased genomic features
AU - Deo, Rahul C.
AU - Musso, Gabriel
AU - Tasan, Murat
AU - Tang, Paul
AU - Poon, Annie
AU - Yuan, Christiana
AU - Felix, Janine F.
AU - Vasan, Ramachandran S.
AU - Beroukhim, Rameen
AU - De Marco, Teresa
AU - Kwok, Pui Yan
AU - MacRae, Calum A.
AU - Roth, Frederick P.
N1 - Funding Information:
The work was funded by NIH awards K08 HL098361 (RCD), DP2 OD017483 (RCD), U01 HL107440-03 (RCD and FPR), and by an NIH/NHGRI Center of Excellence in Genomic Science (CEGS) grant (P50 HG004233) (FPR). FPR was also supported by NIH/NHGRI grant HG001715, an Ontario Research Fund – Research Excellence Award, the Krembil Foundation, the Avon Foundation and by the Canada Excellence Research Chairs Program.
PY - 2014
Y1 - 2014
N2 - BACKGROUND: Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. RESULTS: To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. CONCLUSION: Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
AB - BACKGROUND: Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. RESULTS: To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. CONCLUSION: Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
UR - http://www.scopus.com/inward/record.url?scp=84965189206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84965189206&partnerID=8YFLogxK
U2 - 10.1186/s13059-014-0534-8
DO - 10.1186/s13059-014-0534-8
M3 - Article
C2 - 25633252
AN - SCOPUS:84965189206
SN - 1474-7596
VL - 15
SP - 534
JO - Genome biology
JF - Genome biology
IS - 12
M1 - 534
ER -