NFDI4DS | UHH-SEMS - Publication Details

Bozitao Zhong

ORCID: 0000-0001-9363-6099

Contact & Profiles

OpenAlex author ID: A5090728932

Research Areas

Protein Structure and Dynamics
Machine Learning in Bioinformatics
RNA and protein synthesis mechanisms
Enzyme Structure and Function
Genomics and Phylogenetic Studies
Advanced Proteomics Techniques and Applications
Microbial Metabolic Engineering and Bioproduction
Monoclonal and Polyclonal Antibodies Research
Machine Learning in Materials Science
vaccines and immunoinformatics approaches
Ubiquitin and proteasome pathways
Genetics, Bioinformatics, and Biomedical Research
Protein purification and stability
Viral Infectious Diseases and Gene Expression in Insects
Transgenic Plants and Applications
Bacteriophages and microbial interactions
Glycosylation and Glycoproteins Research
Parallel Computing and Optimization Techniques
Computational Drug Discovery Methods
Bacterial Genetics and Biotechnology
Peptidase Inhibition and Analysis
Microfluidic and Capillary Electrophoresis Applications
Cell Image Analysis Techniques
Enzyme Production and Characterization
HIV/AIDS drug development and treatment

Shanghai Jiao Tong University
2019-2025

Center for Life Sciences
2021-2025

Mila - Quebec Artificial Intelligence Institute
2023-2024

Université de Montréal
2023-2024

Precise Generation of Conformational Ensembles for Intrinsically Disordered Proteins via Fine-tuned Diffusion Models

OPENALEX - Publications

Junjie Zhu, Zhengxin Li, Bo Zhang, Zhuoqi Zheng, Bozitao Zhong, and 6 more

Intrinsically disordered proteins (IDPs) play pivotal roles in various biological functions and are closely linked to many human diseases, including cancer, diabetes, and Alzheimer's disease. Structural investigations of IDPs typically involve a combination of molecular dynamics (MD) simulations and experimental data to correct for intrinsic biases in simulation methods. However, these investigations are hindered by their high computational cost and the scarcity of experimental data, severely limiting their applicability. Despite the recent advancements...

10.1101/2024.05.05.592611 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-05-07

Pretrainable geometric graph neural network for antibody affinity maturation

Huiyu Cai, Zuobai Zhang, Mingkai Wang, Bozitao Zhong, Quanxiao Li, and 4 more

10.1038/s41467-024-51563-8 article EN cc-by-nc-nd Nature Communications 2024-09-06

ProSST: Protein Language Modeling with Quantized Structure and Disentangled Attention

Mingchen Li, Pan Tan, Xinzhu Ma, Bozitao Zhong, Huiqun Yu, and 5 more

Protein language models (PLMs) have shown remarkable capabilities in various protein function prediction tasks. However, while protein function is intricately tied to structure, most existing PLMs do not incorporate protein structure information. To address this issue, we introduce ProSST, a Transformer-based protein language model that seamlessly integrates both protein sequences and structures. ProSST incorporates a structure quantization module and a Transformer architecture with disentangled attention. The structure quantization module translates a 3D protein structure into a sequence of discrete...

10.1101/2024.04.15.589672 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-04-17

A conditional protein diffusion model generates artificial programmable endonuclease sequences with enhanced activity

Bingxin Zhou, Lirong Zheng, Banghao Wu, Kai Yi, Bozitao Zhong, and 4 more

10.1038/s41421-024-00728-2 article EN cc-by Cell Discovery 2024-09-10

Simple, Efficient, and Scalable Structure-Aware Adapter Boosts Protein Language Models

Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, and 5 more

Fine-tuning pretrained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a widely applied and powerful technique in natural language processing, parameter-efficient fine-tuning could potentially enhance the performance of PLMs. However, the direct transfer to life science tasks is nontrivial due to different training strategies and data forms. To address this gap, we...

10.1021/acs.jcim.4c00689 article EN Journal of Chemical Information and Modeling 2024-08-07

Mechanism of zinc ejection by disulfiram in nonstructural protein 5A

Ashfaq Ur Rehman, Guodong Zhen, Bozitao Zhong, Duan Ni, Jiayi Li, and 7 more

Hepatitis C virus (HCV) is a notorious member of the Flaviviridae family of enveloped, positive-strand RNA viruses. Non-structural protein 5A (NS5A) plays a key role in HCV replication and assembly. NS5A is a multi-domain protein which includes an N-terminal amphipathic membrane anchoring alpha helix, a highly structured domain-1, and two intrinsically disordered domains 2-3. The highly structured domain-1 contains a zinc finger (Zf)-site, and zinc binding stabilizes the overall structure, while ejection of this zinc from the Zf-site destabilizes the overall structure....

10.1039/d0cp06360f article EN Physical Chemistry Chemical Physics 2021-01-01

ParaFold: Paralleling AlphaFold for Large-Scale Predictions

Bozitao Zhong, Xiaoming Su, Minhua Wen, Si-Cheng Zuo, Liang Hong, and 1 more

AlphaFold, developed by DeepMind, predicts protein structures from the amino acid sequence at or near experimental resolution, solving the 50-year-old protein folding challenge and leading to progress in transforming large-scale genomics data into protein structures. AlphaFold will also greatly change the scientific research model from a low-throughput to a high-throughput manner. The overall AlphaFold prediction process consists of two stages: 1) MSA construction based on CPUs and 2) model inferences on GPUs. In the first stage, AlphaFold uses CPUs only, taking up hours for a...

10.1145/3503470.3503471 preprint EN 2022-01-11

Proteome-wide 3D structure prediction provides insights into the ancestral metabolism of ancient archaea and bacteria

Weishu Zhao, Bozitao Zhong, Lirong Zheng, Pan Tan, Yinzhao Wang, and 5 more

Ancestral metabolism has remained controversial due to a lack of evidence beyond sequence-based reconstructions. Although prebiotic chemists have provided hints that metabolism might originate from non-enzymatic protometabolic pathways, gaps between ancestral reconstruction and prebiotic processes mean there is much still unknown. Here, we apply proteome-wide 3D structure predictions and comparisons to investigate the ancestorial metabolism of ancient bacteria and archaea, to provide information beyond sequence as a bridge to the prebiotic processes. We compare...

10.1038/s41467-022-35523-8 article EN cc-by Nature Communications 2022-12-21

Phosphorylation Modification Force Field FB18CMAP Improving Conformation Sampling of Phosphoproteins

Ge Song, Bozitao Zhong, Bo Zhang, Ashfaq Ur Rehman, Haifeng Chen

Phosphorylation of proteins plays an important regulatory role at almost all levels of cellular organization. Molecular dynamics (MD) simulation is a promising tool to reveal the mechanism of how phosphorylation regulates many key biological processes at the atomistic level. MD simulation accuracy depends on force field precision, while current force fields for phospho-amino acids have resulted in notable inconsistency with experimental data. Here, a new force field parameter (named FB18CMAP) is generated by fitting against quantum...

10.1021/acs.jcim.3c00112 article EN Journal of Chemical Information and Modeling 2023-02-17

DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

Yangtian Zhang, Zuobai Zhang, Bozitao Zhong, Sanchit Misra, J. Tang

Proteins play a critical role in carrying out biological functions, and their 3D structures are essential in determining their functions. Accurately predicting the conformation of protein side-chains given their backbones is important for applications in protein structure prediction, design, and protein-protein interactions. Traditional methods are computationally intensive and have limited accuracy, while existing machine learning methods treat the problem as a regression task and overlook the restrictions imposed by the constant covalent bond lengths...

10.48550/arxiv.2306.01794 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Mn2+-induced structural flexibility enhances the entire catalytic cycle and the cleavage of mismatches in prokaryotic argonaute proteins

Lirong Zheng, Bingxin Zhou, Yu Yang, Bing Zan, Bozitao Zhong, and 4 more

Prokaryotic Argonaute (pAgo) proteins, a class of DNA/RNA-guided programmable endonucleases, have been extensively utilized in nucleic acid-based biosensors.

10.1039/d3sc06221j article EN cc-by-nc Chemical Science 2024-01-01

Discovery of Expression-Governing Residues in Proteins

Fan Jiang, Mingchen Li, Banghao Wu, Liang Zhang, Bozitao Zhong, and 2 more

Understanding how amino acids influence protein expression is crucial for advancements in biotechnology and synthetic biology. In this study, we introduce Venus-TIGER, a deep learning model designed to accurately identify amino acids critical for protein expression. By constructing a two-dimensional matrix that links representations to experimental fitness, Venus-TIGER achieves improved predictive accuracy and enhanced extrapolation capability. We validated our approach on both public mutational scanning datasets...

10.1101/2025.01.06.631498 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-01-07

Entropy-driven zero-shot deep learning model selection for viral proteins

Yuanxi Yu, Fan Jiang, Bozitao Zhong, Liang Hong, Mingchen Li

Predicting the fitness of viral proteins is fundamental to understanding viral evolution and developing antiviral strategies. This study introduces Venus-EEM, an entropy-driven ensemble model, aimed at improving the performance of zero-shot predictions for viral protein fitness across diverse datasets. We demonstrate that entropy serves as an effective criterion for selecting optimal models, enabling adaptive model selection for different prediction tasks. By incorporating entropy-weighted learning from multiple language...

10.1103/physrevresearch.7.013229 article EN cc-by Physical Review Research 2025-02-28

Precise Generation of Conformational Ensembles for Intrinsically Disordered Proteins with IDPFold

Jun‐Jie Zhu, Zhengxin Li, Zhuoqi Zheng, Bo Zhang, Bozitao Zhong, and 6 more

10.2139/ssrn.5178914 preprint EN 2025-01-01

Personalized Energy Adaptation through Reweighting Learning (PEARL) Force Field for Intrinsically Disordered Proteins

Xiaoyue Ji, Jun‐Jie Zhu, Bozitao Zhong, Zhengxin Li, Taeyoung Choi, and 3 more

Intrinsically disordered proteins (IDPs) have garnered significant attention due to their critical roles in complex human diseases. Molecular dynamics (MD) simulations have emerged as a valuable approach for studying IDPs, whose accuracy heavily depends on the accuracy of force fields. Despite this, the high conformational flexibility of IDPs presents limitations for current force fields in precisely capturing their conformational features. Here, we developed a tool for generating force field parameters, consisting of two main components: construction and...

10.1021/acs.jcim.5c00140 article EN Journal of Chemical Information and Modeling 2025-04-02

Quantum‐Enhanced Computing for the Antiferromagnetic J1−J2 Heisenberg Model

Yuheng Guo, Fangmin Guo, Bozitao Zhong, Xingyu Chen, Xiao-Zhong Yuan, and 2 more

The variational quantum eigensolver (VQE) has recently been demonstrated for solving the challenging Heisenberg Antiferromagnet (HAFM) models. Apart from the ground state energy, many important issues such as excited states and general frustration of HAFM are worth investigating, which have been only partially solved by classical methods and rarely by quantum approaches. Here, VQE is applied to a GPU simulator to calculate the excited states of a J1‐J2 model on both square and kagome lattices. The invariant subspace property is analyzed during...

10.1002/qute.202300240 article EN Advanced Quantum Technologies 2025-05-14

Venus-MAXWELL: Efficient Learning of Protein-Mutation Stability Landscapes using Protein Language Models

Yuanxi Yu, Fan Jiang, Xinzhu Ma, Liang Zhang, Bozitao Zhong, and 5 more

In-silico prediction of protein mutant stability, measured by the difference in Gibbs free energy change (ΔΔG), is fundamental for protein engineering. Current sequence-to-label methods typically employ a two-stage pipeline: (i) encoding sequences using neural networks (e.g., transformers), followed by (ii) ΔΔG regression from the latent representations. Although these methods have demonstrated promising performance, their dependence on specialized neural network encoders significantly increases complexity. Additionally,...

10.1101/2025.05.30.656964 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-06-02

Autoregressive Enzyme Function Prediction with Multi-scale Multi-modality Fusion

Dingyi Rong, Wenzhuo Zheng, Bozitao Zhong, Zhouhan Lin, Liang Hong, and 1 more

Accurate prediction of enzyme function is crucial for elucidating biological mechanisms and driving innovation across various sectors. Existing deep learning methods tend to rely solely on either sequence data or structural data and predict the EC number as a whole, neglecting the intrinsic hierarchical structure of EC numbers. To address these limitations, we introduce MAPred, a novel multi-modality and multi-scale model designed to autoregressively predict the EC number of proteins. MAPred integrates both the primary amino acid sequence and the 3D tokens...

10.48550/arxiv.2408.06391 preprint EN arXiv (Cornell University) 2024-08-11

E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking

Yangtian Zhang, Huiyu Cai, Chence Shi, Bozitao Zhong, Jian Tang

In silico prediction of the ligand binding pose to a given protein target is a crucial but challenging task in drug discovery. This work focuses on blind flexible self-docking, where we aim to predict the positions, orientations and conformations of docked molecules. Traditional physics-based methods usually suffer from inaccurate scoring functions and high inference cost. Recently, data-driven methods based on deep learning techniques are attracting growing interest thanks to their efficiency during inference and promising...

10.48550/arxiv.2210.06069 preprint EN cc-by arXiv (Cornell University) 2022-01-01

A conditional protein diffusion model generates artificial programmable endonuclease sequences with enhanced activity

Bingxin Zhou, Lirong Zheng, Banghao Wu, Kai Yi, Bozitao Zhong, and 4 more

Deep learning-based methods for generating functional proteins address the growing need for novel biocatalysts, allowing for precise tailoring of functionalities to meet specific requirements. This emergence leads to the creation of highly efficient and specialized proteins with wide-ranging applications in scientific, technological, and biomedical domains. This study establishes a pipeline for protein sequence generation with a conditional protein diffusion model, namely CPDiffusion, to deliver diverse protein sequences with enhanced functions....

10.1101/2023.08.10.552783 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2023-08-14

Score-based Enhanced Sampling for Protein Molecular Dynamics

Jiarui Lu, Bozitao Zhong, J. Tang

The dynamic nature of proteins is crucial for determining their biological functions and properties, for which Monte Carlo (MC) and molecular dynamics (MD) simulations stand as predominant tools to study such phenomena. By utilizing empirically derived force fields, MC or MD simulations explore the conformational space through numerically evolving the system via Markov chain or Newtonian mechanics. However, the high-energy barrier of the force fields can hamper the exploration of both methods by the rare event, resulting in inadequately sampled...

10.48550/arxiv.2306.03117 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?

Yang Tan, Lirong Zheng, Bozitao Zhong, Liang Hong, Bingxin Zhou

Deep learning has become a crucial tool in studying proteins. While the significance of modeling protein structure has been discussed extensively in the literature, amino acid types are typically included in the input as a default operation for many inference tasks. This study demonstrates with a structure alignment task that embedding amino acid types in some cases may not help a deep learning model learn better representation. To this end, we propose ProtLOCA, a local geometry alignment method based solely on amino acid structure representation. The effectiveness of ProtLOCA is examined by a global...

10.48550/arxiv.2406.19755 preprint EN arXiv (Cornell University) 2024-06-28

Harnessing Protein Language Model for Structure-Based Discovery of Highly Efficient and Robust PET Hydrolases

Banghao Wu, Bozitao Zhong, Lirong Zheng, Runye Huang, Shudong Jiang, and 3 more

Plastic waste, particularly polyethylene terephthalate (PET), presents significant environmental challenges, prompting extensive research into enzymatic biodegradation. Existing PET hydrolases are limited to a narrow sequence space and demonstrate insufficient performance for biodegradation. This study introduces a novel protein discovery pipeline that combines protein language models (PLMs) with a structural representation tree to identify enzymes based on structural similarity. Using the crystal structure of IsPETase as...

10.1101/2024.11.13.623508 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-11-15

Harnessing Protein Language Model for Structure-Based Discovery of Highly Efficient and Robust PET Hydrolases

Lirong Zheng, Banghao Wu, Bozitao Zhong, Runye Huang, Shudong Jiang, and 3 more

Plastic waste, particularly polyethylene terephthalate (PET), poses significant environmental challenges, prompting extensive research into enzymatic biodegradation. However, existing PET hydrolases (PETases) are constrained to a narrow sequence space and exhibited limited performance for effective biodegradation. This study introduces a protein discovery pipeline, ProMine, which integrates protein language models (PLMs) with a representation tree to identify PETases based on structural similarity...

10.21203/rs.3.rs-5492523/v1 preprint EN cc-by Research Square (Research Square) 2024-12-16
