Deciphering the Preference and Predicting the Viability of Circular Permutations in Proteins

Keywords: Models, Molecular; Support Vector Machine; Proteins; Protein Engineering; Computer Simulation; Amino Acid Sequence; Science (Q); Medicine (R); 03 medical and health sciences; 0301 basic medicine; Research Article
DOI: 10.1371/journal.pone.0031791
Publication Date: 2012-02-16T22:26:41Z
ABSTRACT
Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability, and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable for creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, the permutation test, and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp, and Asn. Positions preferred by CP lie within coils, loops, and turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combines four machine learning methods: artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many unreported preferences of CP are revealed in this study. The system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
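To make the definition of circular permutation concrete, the following minimal Python sketch shows the operation at the sequence level only: the original termini are joined (optionally through a linker) and new termini are opened at a chosen position. The function name `circular_permute`, the 1-based `new_start` argument, the toy sequence, and the `GGS` linker are illustrative assumptions and do not come from the paper; the sketch says nothing about whether the resulting permutant would be viable.

```python
def circular_permute(sequence: str, new_start: int, linker: str = "") -> str:
    """Return the circular permutant that begins at residue `new_start`.

    `new_start` is 1-based (hypothetical convention chosen for this sketch).
    The original C-terminus is joined to the original N-terminus, optionally
    through `linker`, and the chain is reopened just before `new_start`.
    """
    if not 1 <= new_start <= len(sequence):
        raise ValueError("new_start must be between 1 and len(sequence)")
    cut = new_start - 1  # convert to a 0-based slice index
    # Residues from the new N-terminus to the old C-terminus come first,
    # followed by the linker and the residues that used to precede the cut.
    return sequence[cut:] + linker + sequence[:cut]


# Toy 10-residue sequence (made up for illustration, not a real protein).
toy = "MKTAYIAKQR"
print(circular_permute(toy, 4))                # AYIAKQRMKT
print(circular_permute(toy, 4, linker="GGS"))  # AYIAKQRGGSMKT
```

With `new_start=1` and no linker the function returns the input unchanged, which corresponds to the native arrangement of the termini.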