A primer on model-guided exploration of fitness landscapes for biological sequence design
Sequence (biology)
Sequence space
Fitness landscape
DOI: 10.48550/arxiv.2010.10614
Publication Date: 2020-01-01
AUTHORS (2)
ABSTRACT
Machine learning methods are increasingly employed to address challenges faced by biologists. One area that will greatly benefit from this cross-pollination is the problem of biological sequence design, which has massive potential for therapeutic applications. However, significant inefficiencies remain in communication between these fields, which result in biologists finding the progress in machine learning inaccessible, and hinder machine learning scientists from contributing to impactful problems in bioengineering. Sequence design can be seen as a search process on a discrete, high-dimensional space, where each sequence is associated with a function. This sequence-to-function map is known as a "Fitness Landscape". Designing a sequence with a particular function is hence a matter of "discovering" such a (often rare) sequence within this space. Today we can build predictive models with good interpolation ability due to impressive progress in the synthesis and testing of biological sequences in large numbers, which enables model training and validation. However, it often remains a challenge to find useful sequences with the properties that we like using these models. In particular, in this primer we highlight that algorithms for experimental design, what we call "exploration strategies", are a related, yet distinct problem from building good models of sequence-to-function maps. We review advances and insights from current literature, by no means a complete treatment, while highlighting desirable features of optimal model-guided exploration, and cover potential pitfalls drawn from our own experience. This primer can serve as a starting point for researchers from different domains that are interested in the problem of searching a sequence space with a model, but are perhaps unaware of approaches that originate outside their field.
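To make the distinction between a model and an exploration strategy concrete, here is a minimal Python sketch of one model-guided search loop over a discrete sequence space. It is an illustration under stated assumptions, not the authors' method: the ground-truth landscape `true_fitness`, its hidden `target`, the additive surrogate in `fit_model`, and the greedy batch selection in `explore` are all hypothetical names and choices introduced here.

```python
import random

ALPHABET = "ACGT"  # assumed nucleotide alphabet, for illustration only


def true_fitness(seq, target="ACGTACGT"):
    """Hypothetical ground-truth landscape: fraction of positions matching a hidden target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def mutate(seq, rng):
    """Return a copy of seq with one random position changed to a different letter."""
    i = rng.randrange(len(seq))
    choices = [c for c in ALPHABET if c != seq[i]]
    return seq[:i] + rng.choice(choices) + seq[i + 1:]


def fit_model(measured):
    """Toy additive surrogate: mean measured fitness for each (position, letter) pair."""
    sums, counts = {}, {}
    for seq, y in measured.items():
        for key in enumerate(seq):  # key is (position, letter)
            sums[key] = sums.get(key, 0.0) + y
            counts[key] = counts.get(key, 0) + 1
    site_mean = {k: sums[k] / counts[k] for k in sums}
    default = sum(measured.values()) / len(measured)  # fallback for unseen pairs

    def predict(seq):
        return sum(site_mean.get(k, default) for k in enumerate(seq)) / len(seq)

    return predict


def explore(start, rounds=10, batch_size=8, candidates_per_round=200, seed=0):
    """Greedy exploration: each round, measure the candidates the model ranks highest."""
    rng = random.Random(seed)
    measured = {start: true_fitness(start)}  # stands in for experimental measurements
    for _ in range(rounds):
        predict = fit_model(measured)
        # Propose candidates by mutating the best sequences measured so far.
        parents = sorted(measured, key=measured.get, reverse=True)[:batch_size]
        proposals = (mutate(rng.choice(parents), rng) for _ in range(candidates_per_round))
        pool = [s for s in dict.fromkeys(proposals) if s not in measured]
        # The exploration strategy proper: which candidates get "tested" next.
        batch = sorted(pool, key=predict, reverse=True)[:batch_size]
        for seq in batch:
            measured[seq] = true_fitness(seq)
    return measured
```

Calling `explore("AAAAAAAA")` returns every sequence "measured" so far with its fitness. Replacing the greedy `sorted(...)[:batch_size]` step with a different selection rule changes the exploration strategy while leaving the model untouched, which is the separation the abstract emphasizes.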