TY - JOUR
T1 - Leveraging cross-link modification events in CLIP-seq for motif discovery
AU - Bahrami-Samani, Emad
AU - Penalva, Luiz O.F.
AU - Smith, Andrew D.
AU - Uren, Philip J.
N1 - Publisher Copyright: © 2015 The Author(s).
PY - 2015/9/1
Y1 - 2015/9/1
N2 - High-throughput protein-RNA interaction data generated by CLIP-seq has provided an unprecedented depth of access to the activities of RNA-binding proteins (RBPs), the key players in co- and post-transcriptional regulation of gene expression. Motif discovery forms part of the necessary follow-up data analysis for CLIP-seq, both to refine the exact locations of RBP binding sites, and to characterize them. The specific properties of RBP binding sites, and the CLIP-seq methods, provide additional information not usually present in the classic motif discovery problem: the binding site structure, and cross-linking induced events in reads. We show that CLIP-seq data contains clear secondary structure signals, as well as technology- and RBP-specific cross-link signals. We introduce Zagros, a motif discovery algorithm specifically designed to leverage this information and explore its impact on the quality of recovered motifs. Our results indicate that using both secondary structure and cross-link modifications can greatly improve motif discovery on CLIP-seq data. Further, the motifs we recover provide insight into the balance between sequence- and structure-specificity struck by RBP binding.
AB - High-throughput protein-RNA interaction data generated by CLIP-seq has provided an unprecedented depth of access to the activities of RNA-binding proteins (RBPs), the key players in co- and post-transcriptional regulation of gene expression. Motif discovery forms part of the necessary follow-up data analysis for CLIP-seq, both to refine the exact locations of RBP binding sites, and to characterize them. The specific properties of RBP binding sites, and the CLIP-seq methods, provide additional information not usually present in the classic motif discovery problem: the binding site structure, and cross-linking induced events in reads. We show that CLIP-seq data contains clear secondary structure signals, as well as technology- and RBP-specific cross-link signals. We introduce Zagros, a motif discovery algorithm specifically designed to leverage this information and explore its impact on the quality of recovered motifs. Our results indicate that using both secondary structure and cross-link modifications can greatly improve motif discovery on CLIP-seq data. Further, the motifs we recover provide insight into the balance between sequence- and structure-specificity struck by RBP binding.
UR - http://www.scopus.com/inward/record.url?scp=84941068123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84941068123&partnerID=8YFLogxK
U2 - 10.1093/nar/gku1288
DO - 10.1093/nar/gku1288
M3 - Article
C2 - 25505146
AN - SCOPUS:84941068123
SN - 0305-1048
VL - 43
SP - 95
EP - 103
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 1
ER -