A variance component method for integrated pathway analysis of gene expression data

Ellen E. Quillen, John Blangero, Laura Almasy

    Research output: Contribution to journal › Article

    Abstract

    Background: The application of pathway and gene-set based analyses to high-throughput data is increasingly common and represents an effort to understand underlying biology where single-gene or single-marker analyses have failed. Many such analyses rely on the a priori identification of genes associated with the trait of interest. In contrast, this variance-component-based approach creates a similarity matrix of individuals based on the expression of genes in each pathway.

    Methods: We compared 16 methods of calculating similarity for positive control matrices based on probes for the genes used to model the simulated Genetic Analysis Workshop phenotypes.

    Results: A simple correlation matrix outperforms the other methods by identifying pathways associated with the simulated phenotypes at nearly twice the rate expected based on the associations of the component transcripts and an approximate false-positive rate of 0.05.

    Conclusions: This method has a number of additional advantages compared to single-transcript and pathway overrepresentation analyses, including the ability to estimate the proportion of variation explained by each pathway and the logistical advantage of only calculating the distance matrices once for each messenger RNA data set regardless of the number of phenotypes. Additionally, it offers a significant reduction in the multiple testing burden over individual consideration of each probe.
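As a rough illustration of the similarity-matrix idea in the abstract, the sketch below builds a correlation matrix between individuals from the expression of one pathway's probes. It is only a minimal sketch under stated assumptions: the abstract does not say how probes were scaled or which software was used, so the per-probe standardization, the function name `pathway_similarity`, and the random demo data are all hypothetical.

```python
import numpy as np


def pathway_similarity(expr):
    """Correlation-based similarity between individuals for one pathway.

    expr: array of shape (n_individuals, n_probes) holding expression
    values for the probes assigned to a single pathway.
    Returns an (n_individuals, n_individuals) Pearson correlation matrix.
    """
    expr = np.asarray(expr, dtype=float)
    # Assumption: drop probes with no variation, since they cannot be scaled.
    sd = expr.std(axis=0, ddof=1)
    expr = expr[:, sd > 0]
    # Assumption: standardize each probe across individuals so that no
    # single highly expressed probe dominates the similarity.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    # np.corrcoef treats rows as variables, so rows (individuals) are
    # correlated with each other across the pathway's probes.
    return np.corrcoef(z)


# Hypothetical demo data: 6 individuals, 20 probes in one pathway.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 20))
K = pathway_similarity(expr)
```

Because the matrix depends only on the expression data, it could be computed once per pathway and reused for every phenotype, which is the logistical advantage the abstract mentions.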

    Original language: English (US)
    Article number: 90
    Journal: BMC Proceedings
    Volume: 10
    DOI: 10.1186/s12919-016-0053-6
    State: Published - 2016

    Fingerprint

    Gene expression
    Genes
    Phenotype
    Genetic Models
    Education
    Messenger RNA
    Throughput
    Testing

    ASJC Scopus subject areas

    • Medicine(all)
    • Biochemistry, Genetics and Molecular Biology(all)

    Cite this

    A variance component method for integrated pathway analysis of gene expression data. / Quillen, Ellen E.; Blangero, John; Almasy, Laura.

    In: BMC Proceedings, Vol. 10, 90, 2016.

    @article{9b136159286c42959510db6ec2f6b892,
    title = "A variance component method for integrated pathway analysis of gene expression data",
    abstract = "Background: The application of pathway and gene-set based analyses to high-throughput data is increasingly common and represents an effort to understand underlying biology where single-gene or single-marker analyses have failed. Many such analyses rely on the a priori identification of genes associated with the trait of interest. In contrast, this variance-component-based approach creates a similarity matrix of individuals based on the expression of genes in each pathway. Methods: We compared 16 methods of calculating similarity for positive control matrices based on probes for the genes used to model the simulated Genetic Analysis Workshop phenotypes. Results: A simple correlation matrix outperforms the other methods by identifying pathways associated with the simulated phenotypes at nearly twice the rate expected based on the associations of the component transcripts and an approximate false-positive rate of 0.05. Conclusions: This method has a number of additional advantages compared to single-transcript and pathway overrepresentation analyses, including the ability to estimate the proportion of variation explained by each pathway and the logistical advantage of only calculating the distance matrices once for each messenger RNA data set regardless of the number of phenotypes. Additionally, it offers a significant reduction in the multiple testing burden over individual consideration of each probe.",
    author = "Quillen, {Ellen E.} and John Blangero and Laura Almasy",
    year = "2016",
    doi = "10.1186/s12919-016-0053-6",
    language = "English (US)",
    volume = "10",
    journal = "BMC Proceedings",
    issn = "1753-6561",
    publisher = "BioMed Central",

    }

    TY - JOUR

    T1 - A variance component method for integrated pathway analysis of gene expression data

    AU - Quillen, Ellen E.

    AU - Blangero, John

    AU - Almasy, Laura

    PY - 2016

    Y1 - 2016

    AB - Background: The application of pathway and gene-set based analyses to high-throughput data is increasingly common and represents an effort to understand underlying biology where single-gene or single-marker analyses have failed. Many such analyses rely on the a priori identification of genes associated with the trait of interest. In contrast, this variance-component-based approach creates a similarity matrix of individuals based on the expression of genes in each pathway. Methods: We compared 16 methods of calculating similarity for positive control matrices based on probes for the genes used to model the simulated Genetic Analysis Workshop phenotypes. Results: A simple correlation matrix outperforms the other methods by identifying pathways associated with the simulated phenotypes at nearly twice the rate expected based on the associations of the component transcripts and an approximate false-positive rate of 0.05. Conclusions: This method has a number of additional advantages compared to single-transcript and pathway overrepresentation analyses, including the ability to estimate the proportion of variation explained by each pathway and the logistical advantage of only calculating the distance matrices once for each messenger RNA data set regardless of the number of phenotypes. Additionally, it offers a significant reduction in the multiple testing burden over individual consideration of each probe.

    UR - http://www.scopus.com/inward/record.url?scp=85016048918&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85016048918&partnerID=8YFLogxK

    U2 - 10.1186/s12919-016-0053-6

    DO - 10.1186/s12919-016-0053-6

    M3 - Article

    VL - 10

    JO - BMC Proceedings

    JF - BMC Proceedings

    SN - 1753-6561

    M1 - 90

    ER -