Stability selection for regression-based models of transcription factor–DNA binding specificity
DNA binding site
DOI: 10.1093/bioinformatics/btt221
Publication Date: 2013-06-27T05:33:26Z
AUTHORS (5)
ABSTRACT
The DNA binding specificity of a transcription factor (TF) is typically represented using a position weight matrix model, which implicitly assumes that individual bases in a TF binding site contribute independently to the binding affinity, an assumption that does not always hold. For this reason, more complex models of binding specificity have been developed. However, these models have their own caveats: they have a large number of parameters, which makes them hard to learn and interpret.

We propose novel regression-based models of TF-DNA binding specificity, trained using high resolution in vitro data from custom protein-binding microarray (PBM) experiments. Our PBMs are specifically designed to cover putative DNA binding sites for the TFs of interest (yeast TFs Cbf1 and Tye7, and human TFs c-Myc, Max and Mad2) in their native genomic context. These high-throughput quantitative data are well suited for training complex models that take into account not only independent contributions from individual bases, but also contributions from di- and trinucleotides at various positions within or near the binding sites. To ensure that our models remain interpretable, we use feature selection to identify a small number of sequence features that accurately predict TF-DNA binding specificity. To further illustrate the accuracy of our regression models, we show that even in the case of paralogous TFs with highly similar position weight matrices, our new models can distinguish the specificities of individual factors. Thus, our work represents an important step toward better sequence-based models of individual TF-DNA binding specificity.

Our code is available at http://genome.duke.edu/labs/gordan/ISMB2013. The PBM data used in this article are available in the Gene Expression Omnibus under accession number GSE47026.
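
The two ideas in the abstract, position-specific mono- and dinucleotide features and selection of a small stable feature set, can be illustrated with a minimal Python sketch. This is not the authors' code, which is distributed at the URL above. It assumes one common form of stability selection (repeated Lasso fits on random half-samples, keeping features chosen in a large fraction of fits). The function names, the penalty `alpha`, the threshold, and the synthetic 6-bp sequences are all hypothetical choices made for illustration.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"


def feature_index(length, max_k=3):
    """List (k-mer, start position) pairs for k = 1..max_k over a fixed-length site."""
    return [("".join(p), pos)
            for k in range(1, max_k + 1)
            for pos in range(length - k + 1)
            for p in product(ALPHABET, repeat=k)]


def encode(seq, feats):
    """Binary indicator per feature: 1.0 if the k-mer occurs at that position."""
    return np.array([float(seq.startswith(km, pos)) for km, pos in feats])


def lasso_cd(X, y, alpha, n_iter=50):
    """Minimise (1/2n)||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - Xw throughout, since w starts at zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:  # feature is constant in this subsample
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * w[j]
            new_w = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            resid += X[:, j] * (w[j] - new_w)
            w[j] = new_w
    return w


def stability_selection(X, y, alpha=0.02, n_rounds=30, threshold=0.8, seed=0):
    """Return indices of features selected in >= threshold of half-sample Lasso fits."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n // 2, replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)  # centring stands in for an intercept
        ys = y[idx] - y[idx].mean()
        counts += lasso_cd(Xs, ys, alpha) != 0.0
    freq = counts / n_rounds
    return np.flatnonzero(freq >= threshold), freq


def demo(seed=0):
    """Synthetic check: the signal depends on one base and one dinucleotide."""
    rng = np.random.default_rng(seed)
    seqs = ["".join(rng.choice(list(ALPHABET), size=6)) for _ in range(400)]
    feats = feature_index(6, max_k=2)  # mono- and dinucleotides keep the demo fast
    names = [f"{km}@{pos}" for km, pos in feats]
    X = np.array([encode(s, feats) for s in seqs])
    y = 2.0 * X[:, names.index("A@0")] + 3.0 * X[:, names.index("CG@2")]
    y = y + rng.normal(scale=0.1, size=len(seqs))  # made-up measurement noise
    chosen, _ = stability_selection(X, y)
    return [names[j] for j in chosen]


selected_names = demo()
print(selected_names)
```

In this toy setting the dinucleotide feature `CG@2` carries signal that no sum of independent per-base terms can reproduce exactly, which is the kind of dependency a position weight matrix cannot represent. Passing `max_k=3` to `feature_index` would add trinucleotide features at the cost of a longer run.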
REFERENCES (51)
CITATIONS (45)