Contrastive losses as generalized models of global epistasis
Epistasis
Benchmark (surveying)
Fitness landscape
DOI: 10.48550/arxiv.2305.03136
Publication Date: 2023-01-01
AUTHORS (3)
ABSTRACT
Fitness functions map large combinatorial spaces of biological sequences to properties of interest. Inferring these multimodal functions from experimental data is a central task in modern protein engineering. Global epistasis models are an effective and physically-grounded class of models for estimating fitness functions from observed data. These models assume that a sparse latent function is transformed by a monotonic nonlinearity to emit measurable fitness. Here we demonstrate that minimizing contrastive loss functions, such as the Bradley-Terry loss, is a simple and flexible technique for extracting the sparse latent function implied by global epistasis. We argue by way of a fitness-epistasis uncertainty principle that the nonlinearities in global epistasis models can produce observed fitness functions that do not admit sparse representations, and thus may be inefficient to learn from observations when using a Mean Squared Error (MSE) loss (a common practice). We show that contrastive losses are able to accurately estimate a ranking function from limited data even in regimes where MSE is ineffective. We validate the practical utility of this insight by showing that contrastive loss functions result in consistently improved performance on benchmark tasks.
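As a rough illustration of the contrast drawn in the abstract, the sketch below compares a pairwise Bradley-Terry loss with MSE on a toy example. It is not the authors' implementation: the function names, the five-point latent vector, and the exponential nonlinearity are all assumptions chosen only to show that the Bradley-Terry loss depends on the ranking of observed fitness, while MSE depends on its scale.

```python
import numpy as np

def bradley_terry_loss(scores, fitness):
    """Mean of -log(sigmoid(s_i - s_j)) over all pairs with fitness_i > fitness_j."""
    scores = np.asarray(scores, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    diff = scores[:, None] - scores[None, :]       # s_i - s_j for every pair
    wins = fitness[:, None] > fitness[None, :]     # pairs where i outranks j
    if not wins.any():
        return 0.0
    # log(1 + exp(-x)) computed stably; equals -log(sigmoid(x))
    return float(np.logaddexp(0.0, -diff[wins]).mean())

def mse_loss(scores, fitness):
    """Ordinary mean squared error between model scores and observed fitness."""
    scores = np.asarray(scores, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    return float(np.mean((scores - fitness) ** 2))

# Hypothetical latent values and a hypothetical monotonic nonlinearity.
latent = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
observed = np.exp(3.0 * latent)

# Scoring sequences by the latent function is penalized by MSE, because the
# observed values were rescaled, but not by the ranking-based loss.
bt_on_observed = bradley_terry_loss(latent, observed)
bt_on_latent = bradley_terry_loss(latent, latent)
mse_on_observed = mse_loss(latent, observed)
mse_on_latent = mse_loss(latent, latent)
```

In this toy setting the Bradley-Terry loss is identical whether the targets are the latent values or their nonlinear transform, since only the ordering of the targets enters the loss. In practice the scores would come from a trainable model and the pairs would usually be sampled per batch rather than enumerated exhaustively.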