DISCOVER: a feature-based discriminative method for motif search in complex genomes
DNA binding site
Discriminative model
DOI: 10.1093/bioinformatics/btp230
Publication Date: 2009-05-28T15:48:54Z
AUTHORS (3)
ABSTRACT
Motivation: Identifying transcription factor binding sites (TFBSs) encoding complex regulatory signals in metazoan genomes remains a challenging problem in computational genomics. Due to degeneracy of nucleotide content among binding site instances or motifs, and intricate ‘grammatical organization’ of motifs within cis-regulatory modules (CRMs), extant pattern matching-based in silico motif search methods often suffer from impractically high false positive rates, especially in the context of analyzing large genomic datasets, and noisy position weight matrices which characterize the binding sites. Here, we try to address this problem by using a framework to maximally utilize the information about the DNA region in query, taking cues from values of various biologically meaningful genetic and epigenetic factors in the query region such as clade-specific evolutionary parameters, presence/absence of nearby coding regions, etc. We present a new method for TFBS prediction in metazoan genomes that utilizes both the CRM architecture of sequences and a variety of features of individual motifs. Our proposed approach is based on a discriminative probabilistic model known as conditional random fields that explicitly optimizes the predictive probability of motif presence in large sequences, based on the joint effect of all such features.

Results: This model overcomes weaknesses in earlier methods based on less effective statistical formalisms that are sensitive to spurious signals in the data. We evaluate our method on both simulated CRMs and real Drosophila sequences in comparison with a wide spectrum of existing models, and outperform the state of the art by 22% in F1 score.

Availability and Implementation: The code is publicly available at http://www.sailing.cs.cmu.edu/discover.html.

Contact: epxing@cs.cmu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
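For readers unfamiliar with the terms, the following minimal Python sketch illustrates two ideas the abstract mentions: position weight matrix (PWM) scanning, the pattern-matching baseline whose false positives motivate the work, and the F1 score used for evaluation. It is not the DISCOVER method and does not implement a conditional random field. The PWM values, the uniform background, the thresholds, the toy sequence, and the function names `log_odds`, `scan`, and `f1_score` are all illustrative assumptions.

```python
import math

# Hypothetical 4-column PWM for the motif "ACGT" (toy probabilities, not from the paper).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background nucleotide frequency


def log_odds(window):
    """Sum of per-position log2 ratios of motif vs. background probability."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def scan(sequence, threshold):
    """Return start positions whose window scores at or above the threshold."""
    width = len(PWM)
    return {
        i
        for i in range(len(sequence) - width + 1)
        if log_odds(sequence[i:i + width]) >= threshold
    }


def f1_score(predicted, actual):
    """Harmonic mean of precision and recall over sets of site positions."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


sequence = "TTACGTGGACGATT"  # toy sequence with one planted site at position 2
true_sites = {2}

strict = scan(sequence, threshold=4.0)  # only the exact match passes
loose = scan(sequence, threshold=0.0)   # a near-match at position 8 also passes

print(sorted(strict), f1_score(strict, true_sites))
print(sorted(loose), f1_score(loose, true_sites))
```

In this toy case, lowering the threshold admits a degenerate near-match as a false positive and lowers F1, which is the kind of trade-off that a purely PWM-based scan faces on large, noisy genomic sequences.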
REFERENCES (57)
CITATIONS (10)