The simplicity of protein sequence-function relationships

Epistasis; Sequence (biology); Genetic architecture
DOI: 10.1101/2023.09.02.556057 Publication Date: 2023-09-06T04:10:10Z
ABSTRACT
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence, causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions, or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
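The idea of measuring effects against the average of all genotypes, rather than against one reference sequence, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a complete library with only two states per site, simulated phenotypes with made-up effect sizes, and no global nonlinearity, and it uses a standard +1/-1 (Walsh-Hadamard style) coding as a stand-in for the reference-free formulation. The names `design_matrix` and `variance_explained` are hypothetical.

```python
import itertools
import numpy as np


def design_matrix(n_sites, max_order):
    """Build a +/-1 coded design matrix for a complete two-state library.

    One column per subset of sites with size <= max_order; the empty
    subset gives the intercept column of ones (the global mean term).
    """
    genotypes = np.array(list(itertools.product([-1, 1], repeat=n_sites)))
    subsets = [s for k in range(max_order + 1)
               for s in itertools.combinations(range(n_sites), k)]
    X = np.column_stack([genotypes[:, list(s)].prod(axis=1) for s in subsets])
    return X, subsets


def variance_explained(y, X):
    """Fraction of phenotypic variance captured by a least-squares fit on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ coef
    return 1.0 - residual.var() / y.var()


rng = np.random.default_rng(0)
n_sites = 6  # 2**6 = 64 genotypes in the simulated library

# Full model: every order from 0 (mean) up to n_sites.
X_full, subsets = design_matrix(n_sites, n_sites)

# Assumed (illustrative) effect sizes: large additive, moderate pairwise,
# and very small third- and higher-order terms.
scale = {0: 1.0, 1: 1.0, 2: 0.3}
true_effects = np.array(
    [rng.normal(0.0, scale.get(len(s), 0.02)) for s in subsets]
)
phenotype = X_full @ true_effects

# Truncated models: additive only, then additive plus pairwise.
X_add, _ = design_matrix(n_sites, 1)
X_pair, _ = design_matrix(n_sites, 2)
r2_additive = variance_explained(phenotype, X_add)
r2_pairwise = variance_explained(phenotype, X_pair)

print(f"additive only:       {r2_additive:.4f}")
print(f"additive + pairwise: {r2_pairwise:.4f}")
```

Because the columns are mutually orthogonal on a complete library, adding higher-order terms does not change the lower-order estimates, so noise or idiosyncrasy in a single genotype cannot propagate into every higher-order term the way it can when all effects are measured from one reference genotype.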