NFDI4DS | UHH-SEMS - Publication Details

Benchmarking uncertainty quantification for protein engineering

Keywords: Benchmark (surveying); Uncertainty Quantification; Bayesian Optimization; Benchmarking

DOI: 10.1371/journal.pcbi.1012639
Publication Date: 2025-01-07T18:45:57Z


AUTHORS (3)

Kevin P. Greenman

Ava P. Amini

Kevin K. Yang

ABSTRACT

Machine learning sequence-function models for proteins could enable significant advances in protein engineering, especially when paired with state-of-the-art methods to select new sequences for property optimization and/or model improvement. Such methods (Bayesian optimization and active learning) require calibrated estimations of model uncertainty. While studies have benchmarked a variety of deep learning uncertainty quantification (UQ) methods on standard and molecular machine-learning datasets, it is not clear if these results extend to protein datasets. In this work, we implemented a panel of deep learning UQ methods on regression tasks from the Fitness Landscape Inference for Proteins (FLIP) benchmark. We compared results across different degrees of distributional shift using metrics that assess each UQ method’s accuracy, calibration, coverage, width, and rank correlation. Additionally, we compared these metrics using one-hot encoding and pretrained language model representations, and we tested the UQ methods in retrospective active learning and Bayesian optimization settings. Our results indicate that there is no single best UQ method across all datasets, splits, and metrics, and that uncertainty-based sampling is often unable to outperform greedy sampling in Bayesian optimization. These benchmarks enable us to provide recommendations for more effective design of biological sequences using machine learning.
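
The abstract names several evaluation metrics (accuracy, coverage, width, rank correlation) without defining them on this page. The sketch below shows one common way such quantities are computed for a regression model that returns a predictive mean and standard deviation. The function names, the Gaussian 95% interval (z = 1.96), and the toy numbers are assumptions made for illustration, not the definitions or data used in the paper; calibration is omitted.

```python
import numpy as np

def ordinal_rank(x):
    # Ordinal ranks; ties are broken by position (a simplification).
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks

def uq_metrics(y_true, y_pred, y_std, z=1.96):
    """Hypothetical helper: summary UQ metrics for Gaussian-style predictions."""
    y_true, y_pred, y_std = (np.asarray(a, dtype=float) for a in (y_true, y_pred, y_std))
    abs_err = np.abs(y_true - y_pred)
    # Accuracy: root-mean-square error of the predictive mean.
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    # Coverage: fraction of targets inside mean +/- z * std.
    coverage = float(np.mean(abs_err <= z * y_std))
    # Width: average size of that interval.
    width = float(np.mean(2.0 * z * y_std))
    # Rank correlation: Spearman-style correlation between error and uncertainty.
    rank_corr = float(np.corrcoef(ordinal_rank(abs_err), ordinal_rank(y_std))[0, 1])
    return {"rmse": rmse, "coverage": coverage, "width": width, "rank_corr": rank_corr}

# Toy values, invented for illustration only.
metrics = uq_metrics(
    y_true=[0.0, 1.0, 2.0, 3.0],
    y_pred=[0.1, 1.2, 1.7, 3.4],
    y_std=[0.1, 0.2, 0.3, 0.4],
)
print(metrics)
```

A well-behaved UQ method would ideally show coverage close to the nominal level of the interval, a narrow width, and a positive rank correlation between predicted uncertainty and observed error.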


REFERENCES (44)

CITATIONS (1)
