Logo2PWM: A tool to convert sequence logo to position weight matrix

Zhen Gao, Lu Liu, Jianhua Ruan

    Research output: Contribution to journal › Article › peer-review

    2 Scopus citations

    Abstract

    Background: Position weight matrices (PWMs) and sequence logos are the most widely used representations of transcription factor binding sites (TFBSs) in biological sequences. The sequence logo, a graphical representation of a PWM, is widely used in scientific publications and reports because it is easy to read, rich in information, and simple in format. The PWM, by contrast, is a precise and compact numerical form that a variety of motif analysis software can use directly. A few tools are available to generate sequence logos from PWMs; however, no tool does the reverse. A tool that converts a sequence logo back to a PWM is needed in order to scan for a TFBS that a publication presents only in logo format, where the PWM is not provided or is hard to obtain. A major difficulty in developing such a tool is handling the diversity of sequence logo images.

    Results: We propose logo2PWM for reconstructing PWMs from a large variety of sequence logo images. Evaluation on over one thousand logos from three sources with different logo formats shows that the correlation between the reconstructed PWMs and the original PWMs is consistently high, with a median correlation greater than 0.97.

    Conclusion: Because of its high recognition accuracy, its ease of use, and its availability as both a web-based service and a stand-alone application, we believe that logo2PWM can readily benefit the study of transcription by filling the gap between sequence logos and PWMs.
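
The numerical relationship between a logo and its PWM can be illustrated with a short sketch. It assumes the conventional DNA logo, in which each letter's height equals its probability times the position's information content (2 bits minus the Shannon entropy), with no small-sample correction. It is not logo2PWM's image-recognition method: it starts from letter heights that are already measured, and all function names are hypothetical.

```python
import numpy as np

def pwm_to_logo_heights(pwm):
    """Letter heights in bits; rows are positions, columns are A, C, G, T.

    Assumes the conventional logo: height = p * (2 - entropy),
    with no small-sample correction.
    """
    pwm = np.asarray(pwm, dtype=float)
    # Treat 0 * log2(0) as 0 by substituting 1 where p == 0.
    safe = np.where(pwm > 0, pwm, 1.0)
    entropy = -(pwm * np.log2(safe)).sum(axis=1)
    info = 2.0 - entropy  # log2(4) minus entropy, per position
    return pwm * info[:, None]

def logo_heights_to_pwm(heights, eps=1e-12):
    """Recover probabilities by normalizing each stack of letter heights."""
    heights = np.asarray(heights, dtype=float)
    totals = heights.sum(axis=1, keepdims=True)
    has_info = totals > eps
    # A stack of zero height carries no information, so fall back to uniform.
    uniform = np.full_like(heights, 0.25)
    return np.where(has_info, heights / np.where(has_info, totals, 1.0), uniform)

def pwm_correlation(a, b):
    """Pearson correlation between two PWMs, flattened to vectors."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])

# Illustrative three-position motif (made-up values, not from the paper).
original = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.00, 0.00, 1.00, 0.00],
])
heights = pwm_to_logo_heights(original)
recovered = logo_heights_to_pwm(heights)
score = pwm_correlation(original, recovered)
```

Under this assumption the letter heights within a stack are proportional to the letter probabilities, so normalizing each stack recovers the column of the PWM. In practice the heights would come from image measurements and carry error, which is why a correlation score such as the one above is a natural way to compare a reconstructed PWM with the original.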

    Original language: English (US)
    Article number: 709
    Journal: BMC Genomics
    Volume: 18
    State: Published - Oct 3 2017

    Keywords

    • Binding site
    • Convert
    • Motif finding
    • Position weight matrix
    • Sequence logo
    • Transcription

    ASJC Scopus subject areas

    • Biotechnology
    • Genetics
