- Monoclonal and Polyclonal Antibodies Research
- Protein Structure and Dynamics
- Immunodeficiency and Autoimmune Disorders
- RNA and Protein Synthesis Mechanisms
- Evolution and Genetic Dynamics
- Pharmacogenetics and Drug Metabolism
- Chronic Lymphocytic Leukemia Research
- Interferon and Immune Responses
- SARS-CoV-2 and COVID-19 Research
- CRISPR and Genetic Engineering
- Machine Learning in Bioinformatics
- Vaccines and Immunoinformatics Approaches
- Genomics and Phylogenetic Studies
- Vitamin K Research Studies
- Genetic Associations and Epidemiology
- Protein Purification and Stability
- Genomics and Rare Diseases
- Receptor Mechanisms and Signaling
- Bioinformatics and Genomic Networks
- Bacterial Genetics and Biotechnology
- Glycosylation and Glycoproteins Research
- Microtubule and Mitosis Dynamics
- Biosimilars and Bioanalytical Methods
- Biochemical and Molecular Research
- Nanoparticle-Based Drug Delivery
Advanced Radiation Therapy (United States)
2023-2025
Center for Systems Biology
2018-2022
Harvard University
2018-2022
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we...
Recent developments in protein design rely on large neural networks with up to 100s of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues, without accounting for mutation interactions, explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R
Vitamin K epoxide reductase (VKOR) drives the vitamin K cycle, activating vitamin K-dependent blood clotting factors. VKOR is also the target of the widely used anticoagulant drug, warfarin. Despite VKOR’s pivotal role in coagulation, its structure and active site remain poorly understood. In addition, VKOR variants can cause vitamin K-dependent clotting factor deficiency or alter warfarin response. Here, we used multiplexed, sequencing-based assays to measure the effects of 2,695 VKOR missense variants on abundance and 697 variants on activity in cultured human cells. The large-scale...
The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating annotations from partner resources into a knowledge graph that can provide valuable insights. Since we described PDBe-KB in 2019, there have been significant...
Multiplexed assays of variant effect (MAVEs) are a critical tool for researchers and clinicians to understand genetic variants. Here we describe the 2024 update to MaveDB (https://www.mavedb.org/) with four key improvements to the MAVE community’s database of record: more available data, including over 7 million variant effect measurements; an improved data model supporting assays such as saturation genome editing; new built-in exploration and visualization tools; and powerful APIs for data federation and streamlined submission and access....
Encapsulins are recently discovered protein compartments able to specifically encapsulate cargo proteins in vivo. Encapsulation is dependent on C-terminal targeting peptides (TPs). Here, we characterize and engineer TP-shell interactions in the Thermotoga maritima and Myxococcus xanthus encapsulin systems. Using force-field modeling and particle fluorescence measurements, we show that TPs vary in native specificity and binding strength, determined by hydrophobic and ionic interactions as well as TP flexibility. We design a...
A central problem in genomics is understanding the effect of individual DNA variants. Multiplexed Assays of Variant Effect (MAVEs) can help address this challenge by measuring all possible single nucleotide variant effects in a gene or regulatory sequence simultaneously. Here we describe MaveDB v2, which has become the database of record for MAVEs. MaveDB now contains a large fraction of published studies, comprising over two hundred datasets and three million variant effect measurements. We created tools and APIs to...
Unsupervised sequence models for protein fitness have emerged as powerful tools for protein design in order to engineer therapeutics and industrial enzymes, yet they are strongly biased towards potential designs that are close to their training data. This hinders their ability to generate functional sequences that are far away from natural sequences, as is often desired to design new functions. To address this problem, we introduce a de-biasing approach that enables the comparison of protein fitness across mutational depths to overcome extant sequence similarity...
The grand challenge of protein engineering is the development of computational models to characterize and generate protein sequences for arbitrary functions. Progress is limited by a lack of 1) benchmarking opportunities, 2) large protein function datasets, and 3) access to experimental protein characterization. We introduce the Protein Engineering Tournament, a fully-remote competition designed to foster the development and evaluation of computational approaches in protein engineering. The tournament consists of an in silico round, predicting biophysical properties from sequences,...
Many organisms can survive extreme conditions and successfully recover to normal life. This extremotolerant behavior has been attributed in part to repetitive, amphipathic, and intrinsically disordered proteins that are upregulated in the protected state. Here, we assemble a library of approximately 300 naturally occurring and designed extremotolerance-associated proteins to assess their ability to protect human cells from chemically induced apoptosis. We show that several proteins from tardigrades, nematodes, and the Chinese giant salamander...
Effective pandemic preparedness relies on anticipating viral mutations that are able to evade host immune responses in order to facilitate vaccine and therapeutic design. However, current strategies for viral evolution prediction are not available early in a pandemic: experimental approaches require polyclonal antibodies to test against, and existing computational methods draw heavily from strain prevalence to make reliable predictions of variants of concern. To address this, we developed EVEscape, a generalizable, modular...
Medium-chain fatty acids are commodity chemicals. Increasing and modifying the activity of thioesterases (TEs) on medium-chain acyl-acyl carrier protein (acyl-ACP) esters may enable high-yield microbial production of these molecules. The plant Cuphea palustris harbors two distinct TEs: C. palustris FatB1 (CpFatB1) (C8 specificity, lower activity) and C. palustris FatB2 (CpFatB2) (C14 specificity, higher activity), with 78% sequence identity. We combined structural features from these two enzymes to create several chimeric TEs, some of which showed...
Recent developments in protein design have adapted large neural networks with up to 100s of millions of parameters to learn complex sequence-function mappings. However, it is unclear which dependencies between residues are critical for determining protein function, and a better empirical understanding could enable high quality models that are also more data- and resource-efficient. Here, we observe that the per residue amino acid preferences, without considering interactions between mutations, are sufficient to explain...
Infectious diseases caused by viral pathogens exacerbate health care and economic burdens. Numerous biomolecules suppress the human innate immune system, enabling viruses to evade an immune response from the host.
Suppression of the host intracellular innate immune system is an essential aspect of viral replication. Here, we developed a suite of medium-throughput high-content cell-based assays to reveal the effect of individual coronavirus proteins on antiviral pathways. Using these assays, we screened the 196 protein products of seven coronaviruses (SARS-CoV-2, SARS-CoV-1, 229E, NL63, OC43, HKU1 and MERS). This includes a previously unidentified gene in SARS-CoV-2 encoded within the Spike gene. We observe...
To assess the pharmacokinetics (PK), pharmacodynamics (PD) and preclinical efficacy of an engineered pan-IgG protease. Pathogenic autoantibodies are key effectors of inflammation, promoting complement activation and immune cell responses that cause tissue damage in autoantibody-mediated diseases such as myasthenia gravis and chronic inflammatory demyelinating polyneuropathy. IgG-proteases represent a new therapeutic opportunity. S-1117, a novel Fc-fused pan-IgG protease, was engineered using a proprietary machine learning...
Pathogenic autoantibodies are key effectors of inflammation, promoting complement activation and immune cell responses that cause tissue damage in autoantibody-mediated diseases such as myasthenia gravis. IgG-proteases represent a new therapeutic opportunity. Here we present S-1117, a novel Fc-fused pan-IgG protease, engineered using a proprietary machine learning enabled platform to reduce immunogenicity and augment manufacturability and stability while maintaining enzyme activity. S-1117...
Steroids can be difficult to modify through traditional organic synthesis methods, but many enzymes regio- and stereoselectively process a wide variety of steroid substrates. We tested whether steroid-modifying enzymes could make novel steroids from non-native substrates. Numerous genes encoding steroid-modifying enzymes, including some bacterial enzymes, were expressed in mammalian cells by transient transfection and found to be active. We made three unusual steroids by stable expression, in HEK293 cells, of the 7α-hydroxylase CYP7B1, which was selected because of its...
Vitamin K epoxide reductase (VKOR) drives the vitamin K cycle, activating vitamin K-dependent blood clotting factors. VKOR is also the target of the widely used anticoagulant drug, warfarin. Despite VKOR’s pivotal role in coagulation, its structure and active site remain poorly understood. In addition, VKOR variants can cause vitamin K-dependent clotting factor deficiency 2 or alter warfarin response. Here, we used multiplexed, sequencing-based assays to measure the effects of 2,695 VKOR missense variants on abundance and 697 variants on activity in cultured human cells. The large-scale...
Proteases derived from human pathogens can specifically cleave IgG into F(ab’)2 and Fc fragments. IdeS, an IgG cleaving enzyme from Streptococcus pyogenes, has shown clinical proof of concept and is approved for use before kidney transplantation. Due to the immunogenic nature of these proteases, dosing is limited by the high prevalence of pre-existing antibodies and the induction of anti-drug antibodies after dosing. Therefore, to mitigate the impact of the immune system on our enzyme, we identify and remove putative T and B cell epitopes using...
Proteases derived from human pathogens can specifically cleave IgG into F(ab′)2 and Fc fragments. This unique trait suggests a novel opportunity to use these molecules to treat autoantibody-mediated disease. IdeS, an IgG cleaving enzyme from Streptococcus pyogenes, has shown clinical proof of concept and is approved for use before kidney transplant. Due to the immunogenic nature of these proteases, the dosing regimen is impacted by pre-existing antibodies and the induction of anti-drug antibodies after dosing. To mitigate the impact of the immune system...