TY - JOUR
T1 - A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks
AU - Gao, Zhen
AU - Zhao, Ruizhe
AU - Ruan, Jianhua
N1 - Funding Information:
The publication costs for this article were funded by NIH grant SC3GM086305. This article has been published as part of BMC Genomics Volume 14 Supplement 1, 2013: Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/14/S1.
Funding Information:
This work was supported by NSF grants IOS-0848135, IIS-1218201, NIH grants SC3GM086305, U54CA113001, G12MD007591 (Computational Systems Biology Core), and a UTSA Tenure-track Research Award.
Publisher Copyright:
© 2013 Gao et al.
PY - 2013/1/21
Y1 - 2013/1/21
N2 - Background: Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results: Using promoter sequences and gene expression profiles as input, our method utilizes co-expression neighborhood information for each individual gene rather than clustering the genes by their expression data, thereby overcoming a disadvantage of current clustering-based models, which may miss information specific to individual genes. In addition, rather than using a motif database as input, it builds a simple motif count table covering each enumerated k-mer in each gene promoter sequence. Thus, it can be used for species in which no prior knowledge of cis-regulatory motifs is available, and it has the potential to discover new transcription factor binding sites. Applications to Saccharomyces cerevisiae and Arabidopsis show that our method has good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top-ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by the literature. Conclusions: Since this method is simple and gene-specific, it can be readily applied to insufficiently studied species or flexibly used as an additional step or data source for existing transcriptional regulatory network discovery models.
AB - Background: Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results: Using promoter sequences and gene expression profiles as input, our method utilizes co-expression neighborhood information for each individual gene rather than clustering the genes by their expression data, thereby overcoming a disadvantage of current clustering-based models, which may miss information specific to individual genes. In addition, rather than using a motif database as input, it builds a simple motif count table covering each enumerated k-mer in each gene promoter sequence. Thus, it can be used for species in which no prior knowledge of cis-regulatory motifs is available, and it has the potential to discover new transcription factor binding sites. Applications to Saccharomyces cerevisiae and Arabidopsis show that our method has good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top-ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by the literature. Conclusions: Since this method is simple and gene-specific, it can be readily applied to insufficiently studied species or flexibly used as an additional step or data source for existing transcriptional regulatory network discovery models.
UR - http://www.scopus.com/inward/record.url?scp=84920615037&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920615037&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-14-S1-S4
DO - 10.1186/1471-2164-14-S1-S4
M3 - Article
C2 - 23368633
AN - SCOPUS:84920615037
VL - 14
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
M1 - S4
ER -