NFDI4DS | UHH-SEMS - Publication Details

Samuel Sledzieski

ORCID: 0000-0002-0170-3029

About

Contact & Profiles

A5020346838

Research Areas

Machine Learning in Bioinformatics
Protein Structure and Dynamics
RNA and protein synthesis mechanisms
Genomics and Phylogenetic Studies
Bioinformatics and Genomic Networks
Evolution and Genetic Dynamics
Computational Drug Discovery Methods
Microbial Metabolic Engineering and Bioproduction
Coral and Marine Ecosystems Studies
vaccines and immunoinformatics approaches
Gene Regulatory Network Analysis
Aquaculture disease management and microbiota
Gut microbiota and health
Monoclonal and Polyclonal Antibodies Research
Marine Sponges and Natural Products
Mosquito-borne diseases and control
Cancer Genomics and Diagnostics
Aquaculture Nutrition and Growth
Biomedical Text Mining and Ontologies
Neurobiology and Insect Physiology Research
Lipid Membrane Structure and Behavior
CRISPR and Genetic Engineering
Single-cell and spatial transcriptomics
Animal Virus Infections Studies
Bat Biology and Ecology Studies

Massachusetts Institute of Technology
2021-2025

Microsoft (United States)
2023-2025

Flatiron Health (United States)
2024-2025

Flatiron Institute
2024

Moscow Institute of Thermal Technology
2024

Tufts University
2022

Broad Institute
2022

University of Connecticut
2019-2020

D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions

OPENALEX - Publications

Samuel Sledzieski Rohit Singh Lenore Cowen Bonnie Berger

We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model, which predicts interaction between two proteins using only their sequence and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human PPIs enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating same protein complexes...

10.1016/j.cels.2021.08.010 article EN cc-by Cell Systems 2021-10-01

Contrastive learning in protein language space predicts interactions between drugs and protein targets

Rohit Singh Samuel Sledzieski Bryan D. Bryson Lenore Cowen Bonnie Berger

Sequence-based prediction of drug-target interactions has the potential to accelerate drug discovery by complementing experimental screens. Such computational prediction needs to be generalizable and scalable while remaining sensitive to subtle variations in the inputs. However, current techniques fail to simultaneously meet these goals, often sacrificing performance of one to achieve the others. We develop a deep learning model, ConPLex, successfully leveraging the advances in pretrained protein language models ("PLex") and employing...

10.1073/pnas.2220778120 article EN cc-by-nc-nd Proceedings of the National Academy of Sciences 2023-06-08

Democratizing protein language models with parameter-efficient fine-tuning

Samuel Sledzieski Meghana Kshirsagar Minkyung Baek Rahul Dodhia Juan Lavista Ferres and 1 more

Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt the model to specific downstream tasks. However, the computational and memory footprint of fine-tuning (FT) large PLMs presents a barrier for many research groups with limited resources. Natural language processing has seen a similar explosion in the size of models, where these challenges have been addressed by methods...

10.1073/pnas.2405840121 article EN cc-by Proceedings of the National Academy of Sciences 2024-06-20

Topsy-Turvy: integrating a global view into sequence-based PPI prediction

Rohit Singh Kapil Devkota Samuel Sledzieski Bonnie Berger Lenore Cowen

Abstract Summary: Computational methods to predict protein–protein interaction (PPI) typically segregate into sequence-based ‘bottom-up’ methods that infer properties from the characteristics of the individual protein sequences, or global ‘top-down’ methods that infer properties from the pattern of already known PPIs in the species of interest. However, a way to incorporate top-down insights into sequence-based bottom-up PPI prediction methods has been elusive. We thus introduce Topsy-Turvy, a method that newly synthesizes both views in a sequence-based, multi-scale, deep-learning model for...

10.1093/bioinformatics/btac258 article EN cc-by Bioinformatics 2022-04-14

Rapid and accurate prediction of protein homo-oligomer symmetry using Seq2Symm

Meghana Kshirsagar Artur Meller Ian R. Humphreys Samuel Sledzieski Yixi Xu and 7 more

Abstract: The majority of proteins must form higher-order assemblies to perform their biological functions, yet few machine learning models can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by finetuning several classes of protein foundation models to predict homo-oligomer symmetry. Our best model named Seq2Symm, which utilizes ESM2, outperforms existing template-based and deep learning methods, achieving an average AUC-PR of 0.47, 0.44 and 0.49 across symmetries...

10.1038/s41467-025-57148-3 article EN cc-by Nature Communications 2025-02-27

TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions

Samuel Sledzieski Kapil Devkota Rohit Singh Lenore Cowen Bonnie Berger

High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di).

10.1093/bioinformatics/btad663 article EN cc-by Bioinformatics 2023-10-27

Learning the Language of Antibody Hypervariability

Rohit Singh Chiho Im Yu Qiu Brian C. Mackness Abhinav Gupta and 7 more

Protein language models (PLMs) based on machine learning have demonstrated impressive success in predicting protein structure and function. However, general-purpose (“foundational”) PLMs have limited performance on antibodies due to the latter’s hypervariable regions, which do not conform to the evolutionary conservation principles that such models rely on. In this study, we propose a new transfer learning framework called AbMAP, which fine-tunes foundational models for antibody-sequence inputs by supervising antibody binding...

10.1101/2023.04.26.538476 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-04-28

Sequence-based prediction of protein-protein interactions: a structure-aware interpretable deep learning model

Samuel Sledzieski Rohit Singh Lenore Cowen Bonnie Berger

Abstract: Protein-protein interaction (PPI) networks have proven to be a valuable tool in systems biology to facilitate the discovery and understanding of protein function. Unfortunately, experimental PPI data remains sparse in most model organisms and even more so in other species. Existing methods for computational prediction of PPIs seek to address this limitation, and while they perform well when sufficient within-species training data is available, they generalize poorly to new species or often require specific types and sizes...

10.1101/2021.01.22.427866 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-01-25

Rapid and accurate prediction of protein homo-oligomer symmetry with Seq2Symm

Meghana Kshirsagar Artur Meller Ian R. Humphreys Samuel Sledzieski Yixi Xu and 7 more

Abstract: The majority of proteins must form higher-order assemblies to perform their biological functions. Despite the importance of protein quaternary structure, there are few machine learning models that can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by training several classes of protein foundation models, including ESM-MSA, ESM2, and RoseTTAFold2, to predict homo-oligomer symmetry. Our best model named Seq2Symm, which utilizes ESM2, outperforms...

10.21203/rs.3.rs-4215086/v1 preprint EN Research Square (Research Square) 2024-04-26

Transfer of knowledge from model organisms to evolutionarily distant non-model organisms: The coral Pocillopora damicornis membrane signaling receptome

Lokender Kumar Nathanael Brenner Samuel Sledzieski Monsurat Olaosebikan Liza M. Roger and 10 more

With the ease of gene sequencing and the technology available to study and manipulate non-model organisms, the extension of the methodological toolbox required to translate our understanding of model organisms to non-model organisms has become an urgent problem. For example, mining of large coral and their symbiont sequence data is a challenge, but also provides an opportunity for understanding the functionality and evolution of these and other non-model organisms. Much more information than for any other eukaryotic species is available for humans, especially related to signal transduction and diseases. However,...

10.1371/journal.pone.0270965 article EN cc-by PLoS ONE 2023-02-03

Democratizing Protein Language Models with Parameter-Efficient Fine-Tuning

Samuel Sledzieski Meghana Kshirsagar Minkyung Baek Bonnie Berger Rahul Dodhia and 1 more

Proteomics has been revolutionized by large pre-trained protein language models, which learn unsupervised representations from large corpora of sequences. The parameters of these models are then fine-tuned in a supervised setting to tailor the model to a specific downstream task. However, as model size increases, the computational and memory footprint of fine-tuning becomes a barrier for many research groups. In the field of natural language processing, which has seen a similar explosion in the size of models, these challenges have been addressed by methods for parameter-efficient...

10.1101/2023.11.09.566187 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-11-10

BPS2025 - Structure quantization enables ultra-fast prediction of protein conformational flexibility

Samuel Sledzieski Pilar Cossio Olga G. Troyanskaya Sonya M. Hanson

10.1016/j.bpj.2024.11.1124 article EN Biophysical Journal 2025-02-01

BPS2025 - Structure quantization enables ultra-fast prediction of protein conformational flexibility

Samuel Sledzieski Pilar Cossio Olga G. Troyanskaya Sonya M. Hanson

10.1016/j.bpj.2024.11.2085 article EN Biophysical Journal 2025-02-01

Learning the language of protein–protein interactions

Varun Ullanat Bowen Jing Samuel Sledzieski Bonnie Berger

Protein Language Models (PLMs) trained on large databases of protein sequences have proven effective in modeling protein biology across a wide range of applications. However, while PLMs excel at capturing individual protein properties, they face challenges in natively representing protein–protein interactions (PPIs), which are crucial to understanding cellular processes and disease mechanisms. Here, we introduce MINT, a PLM specifically designed to model sets of interacting proteins in a contextual and scalable manner. Using...

10.1101/2025.03.09.642188 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2025-03-10

Learning the language of antibody hypervariability

Rohit Singh Chiho Im Yu Qiu Brian C. Mackness Abhinav Gupta and 7 more

Protein language models (PLMs) have demonstrated impressive success in modeling proteins. However, general-purpose “foundational” PLMs have limited performance on antibodies due to the latter’s hypervariable regions, which do not conform to the evolutionary conservation principles that such models rely on. In this study, we propose a transfer learning framework called Antibody Mutagenesis-Augmented Processing (AbMAP), which fine-tunes foundational models for antibody-sequence inputs by supervising on antibody structure and...

10.1073/pnas.2418918121 article EN cc-by-nc-nd Proceedings of the National Academy of Sciences 2024-12-30

Adapting protein language models for rapid DTI prediction

Samuel Sledzieski Rohit Singh Lenore Cowen Bonnie Berger

Abstract: We consider the problem of sequence-based drug-target interaction (DTI) prediction, showing that a straightforward deep learning architecture that leverages pre-trained protein language models (PLMs) for protein embedding outperforms state-of-the-art approaches, achieving higher accuracy, expanded generalizability, and an order of magnitude faster training. PLM embeddings are found to contain general information that is especially useful in few-shot (small training data set) and zero-shot instances (unseen...

10.1101/2022.11.03.515084 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-11-04

TreeFix-TP: Phylogenetic Error-Correction for Infectious Disease Transmission Network Inference

Samuel Sledzieski Chengchen Zhang Ion Măndoiu Mukul S. Bansal

10.1142/9789811232701_0012 article EN Biocomputing 2020-11-01

Single-cell mosaicism analysis reveals cell-type-specific somatic mutational burden in Alzheimer’s Dementia

Maria Kousi Carles A. Boix Yongjin Park Hansruedi Mathys Samuel Sledzieski and 4 more

Abstract: Despite significant advances in identifying genetic drivers of neurodegenerative disorders, the majority of affected individuals lack a molecular diagnosis, with somatic mutations proposed as one potential contributor to increased risk. Here, we report the first cell-type-specific map of somatic mosaicism in Alzheimer’s Dementia (AlzD), using 4,014 cells from prefrontal cortex samples of 19 AlzD and 17 non-AlzD individuals. We integrate full-transcript single-nucleus RNA-seq (SMART-Seq) with matched...

10.1101/2022.04.21.489103 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-04-22

TreeFix-TP: Phylogenetic error-correction for infectious disease transmission network inference

Samuel Sledzieski Chengchen Zhang Ion Măndoiu Mukul S. Bansal

Many existing methods for estimation of infectious disease transmission networks use a phylogeny of the infecting strains as the basis for network inference, and accurate inference relies on the accuracy of this underlying evolutionary history. However, phylogenetic reconstruction can be highly error prone, and more sophisticated methods fail to scale to larger outbreaks, negatively impacting downstream inference. We introduce a new method, TreeFix-TP, for scalable reconstruction of phylogenies based on an error-correction framework. Our method uses...

10.7490/f1000research.1118422.1 article EN 2020-12-08

virDTL: Viral Recombination Analysis Through Phylogenetic Reconciliation and Its Application to Sarbecoviruses and SARS-CoV-2

Sumaira Zaman Samuel Sledzieski Bonnie Berger Yi-Chieh Wu Mukul S. Bansal

An accurate understanding of the evolutionary history of rapidly-evolving viruses like SARS-CoV-2, responsible for the COVID-19 pandemic, is crucial to tracking and preventing the spread of emerging pathogens. However, viruses undergo frequent recombination, which makes it difficult to trace their evolutionary history using traditional phylogenetic methods. In this study, we present a phylogenetic workflow, virDTL, for analyzing viral evolution in the presence of recombination. Our approach leverages reconciliation methods developed for inferring horizontal gene...

10.1089/cmb.2021.0507 article EN cc-by-nc Journal of Computational Biology 2022-09-20

TreeFix-TP: Phylogenetic Error-Correction for Infectious Disease Transmission Network Inference

Samuel Sledzieski Chengchen Zhang Ion Măndoiu Mukul S. Bansal

Abstract Background: Many existing methods for estimation of infectious disease transmission networks use a phylogeny of the infecting strains as the basis for network inference, and accurate inference relies on the accuracy of this underlying evolutionary history. However, phylogenetic reconstruction can be highly error prone, and more sophisticated methods fail to scale to larger outbreaks, negatively impacting downstream inference. Additionally, there are no currently available methods which are able to use within-host diversity to improve...

10.1101/813931 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2019-10-22

Phylogenetic reconciliation reveals extensive ancestral recombination in Sarbecoviruses and the SARS-CoV-2 lineage

Sumaira Zaman Samuel Sledzieski Bonnie Berger Yi-Chieh Wu Mukul S. Bansal

Abstract: An accurate understanding of the evolutionary history of rapidly-evolving viruses like SARS-CoV-2, responsible for the COVID-19 pandemic, is crucial to tracking and preventing the spread of emerging pathogens. However, viruses undergo frequent recombination, which makes it difficult to trace their evolutionary history using traditional phylogenetic methods. Here, we present a phylogenetic workflow, virDTL, for analyzing viral evolution in the presence of recombination. Our approach leverages reconciliation methods developed for inferring horizontal gene...

10.1101/2021.08.12.456131 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2021-08-13

Insulin Signaling and Pharmacology in Corals

Whitney Vizgaudis Lokender Kumar Monsurat Olaosebikan Liza M. Roger Nathanael Brenner and 7 more

Once thought to be a unique capability of the Langerhans Islands in the pancreas of mammals, insulin production is now recognized as an evolutionarily ancient function going back to prokaryotes, ubiquitously present in unicellular eukaryotes, fungi, worm, Drosophila and of course human. While the functionality of the insulin signaling pathway has been experimentally demonstrated in some of these organisms, it has not yet been exploited for pharmacological applications. To enable such applications, we need to understand the extent to which structure...

10.22541/au.170666200.07483513/v1 preprint EN Authorea (Authorea) 2024-01-31

Decoding the Functional Interactome of Non-Model Organisms with PHILHARMONIC

Samuel Sledzieski C Versavel Rohit Singh Faith Ocitti Kapil Devkota and 9 more

Protein-protein interaction (PPI) networks are a fundamental resource for modeling cellular and molecular function, and a large and sophisticated toolbox has been developed to leverage their structure and topological organization to predict the functional roles of under-studied genes, proteins, and pathways. However, the overwhelming majority of experimentally-determined interactions from which such networks are constructed come from a small number of well-studied model organisms. Indeed, most species lack even a single interaction in these databases,...

10.1101/2024.10.25.620267 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-10-29
