Quantification of biases in predictions of protein stability changes upon mutations

Keywords: Server; Protein Folding; Bias; Protein Stability; Mutation; Proteins; Biomedical and agricultural sciences; Performance; Engineering sciences; 3. Good health
DOI: 10.1093/bioinformatics/bty348 Publication Date: 2018-04-25T19:59:04Z
ABSTRACT
Motivation: Bioinformatics tools that predict protein stability changes upon point mutations have progressed considerably in recent decades and are now accurate and fast enough to make computational mutagenesis experiments feasible, even on a proteome scale. Despite these achievements, they still suffer from important issues that must be solved before their performance can be improved further and before they can be used to deepen our insight into protein folding and stability mechanisms. One of these problems is their bias toward the learning datasets: because these datasets are dominated by destabilizing mutations, predictions are better for destabilizing than for stabilizing mutations.

Results: We thoroughly analyzed the biases in the prediction of folding free energy changes upon point mutations (ΔΔG⁰) and proposed some unbiased solutions. We started by constructing a dataset, Ssym, of experimentally measured ΔΔG⁰ values with an equal number of stabilizing and destabilizing mutations, by collecting mutations for which the structures of both the wild-type and mutant proteins are available. On this balanced dataset, we assessed the performance of 15 widely used ΔΔG⁰ predictors. After the astonishing observation that almost all these methods are strongly biased toward destabilizing mutations, especially those that use black-box machine learning, we proposed an elegant way to solve the bias issue by imposing physical symmetries under inverse mutations on the model structure, which we implemented in PoPMuSiCsym. This new predictor constitutes an efficient trade-off between accuracy and absence of biases. Some final considerations and suggestions for further improvement of the predictors are discussed.

Supplementary information: Supplementary data are available at Bioinformatics online.

Note: The article 10.1093/bioinformatics/bty340, published alongside this paper, also addresses the problem of biases in protein stability change predictions.
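The symmetry referred to in the abstract is that the free energy change of an inverse mutation should be the opposite of that of the direct mutation: ΔΔG(A→B) = −ΔΔG(B→A). The Python sketch below illustrates one way such a bias could be quantified on paired direct and inverse predictions, and how a generic predictor could be made antisymmetric by construction. It is an illustration under stated assumptions, not the procedure used for PoPMuSiCsym: the function names, the toy predictor, the residue scores, and the sign convention (positive values taken as destabilizing) are all hypothetical.

```python
from math import sqrt
from statistics import mean


def antisymmetry_bias(ddg_direct, ddg_inverse):
    """Mean of ddG(A->B) + ddG(B->A) over paired mutations.

    The value is zero for a perfectly antisymmetric predictor.
    """
    if len(ddg_direct) != len(ddg_inverse):
        raise ValueError("direct and inverse lists must be paired")
    return mean(d + i for d, i in zip(ddg_direct, ddg_inverse))


def pearson(xs, ys):
    """Pearson correlation; -1 is expected between direct and inverse values."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def symmetrize(predict):
    """Wrap any predictor so that its output is antisymmetric by construction."""
    def predict_sym(wild_type, mutant):
        return 0.5 * (predict(wild_type, mutant) - predict(mutant, wild_type))
    return predict_sym


# Hypothetical toy predictor: a difference of made-up residue scores plus a
# constant offset that mimics a bias toward destabilizing predictions.
TOY_SCORES = {"A": 0.0, "V": 1.0, "L": 2.5}


def toy_predict(wild_type, mutant):
    return TOY_SCORES[mutant] - TOY_SCORES[wild_type] + 0.8


pairs = [("A", "V"), ("V", "L"), ("L", "A")]
direct = [toy_predict(wt, mut) for wt, mut in pairs]
inverse = [toy_predict(mut, wt) for wt, mut in pairs]

toy_sym = symmetrize(toy_predict)
direct_sym = [toy_sym(wt, mut) for wt, mut in pairs]
inverse_sym = [toy_sym(mut, wt) for wt, mut in pairs]

print("bias before:", antisymmetry_bias(direct, inverse))
print("bias after: ", antisymmetry_bias(direct_sym, inverse_sym))
print("r(direct, inverse):", pearson(direct, inverse))
```

In this toy case the constant offset appears directly in the bias measure, and averaging the direct prediction with the negated inverse prediction cancels it. A real predictor would also need the wild-type and mutant structures as input, which is why a dataset such as Ssym, where both are available, makes this kind of check possible.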