- Computational Drug Discovery Methods
- Machine Learning in Materials Science
- Protein Structure and Dynamics
- Bioinformatics and Genomic Networks
- Chemical Synthesis and Analysis
- Advanced Graph Neural Networks
- Microbial Natural Products and Biosynthesis
- RNA and Protein Synthesis Mechanisms
- Genetics, Bioinformatics, and Biomedical Research
- Biomedical Text Mining and Ontologies
- Topic Modeling
- Asymmetric Hydrogenation and Catalysis
- Chemistry and Chemical Engineering
- Machine Learning in Bioinformatics
- Microbial Metabolic Engineering and Bioproduction
- Artificial Intelligence in Healthcare
- Monoclonal and Polyclonal Antibodies Research
- Advanced Fluorescence Microscopy Techniques
- Enzyme Catalysis and Immobilization
- SARS-CoV-2 and COVID-19 Research
- Interferon and Immune Responses
- Analytical Chemistry and Sensors
- Vaccines and Immunoinformatics Approaches
- Click Chemistry and Applications
- Cell Image Analysis Techniques
Shanghai Jiao Tong University
2021-2025
Sun Yat-sen University
2018-2023
National Supercomputing Center in Shenzhen
2019-2021
Guangzhou Regenerative Medicine and Health Guangdong Laboratory
2021
Tsinghua University
2020
Tencent (China)
2020
The University of Texas at Arlington
2020
Green Circle
2020
A novel coronavirus (COVID-19) recently emerged as an acute respiratory syndrome and has caused a pneumonia outbreak worldwide. As COVID-19 continues to spread rapidly across the world, computed tomography (CT) has become essential for fast diagnosis. Thus, it is urgent to develop an accurate computer-aided method to assist clinicians in identifying COVID-19-infected patients by CT images. Here, we have collected chest CT scans of 88 patients diagnosed with COVID-19 from hospitals of two provinces in China, 100...
Background: A novel coronavirus (COVID-19) has emerged recently as an acute respiratory syndrome. The outbreak was originally reported in Wuhan, China, but has subsequently spread worldwide. As COVID-19 continues to spread rapidly across the world, computed tomography (CT) has become essential for fast diagnosis. Thus, it is urgent to develop an accurate computer-aided method to assist clinicians in identifying COVID-19-infected patients by CT images. Materials and Methods: We collected chest CT scans of...
Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes; however, at present, it is cumbersome and cannot provide satisfactory results. In this study, we have developed a template-free self-corrected retrosynthesis predictor (SCROP) to predict retrosynthesis using transformer neural networks. In the method, retrosynthesis was converted to a machine translation problem from the products to molecular linear notations...
A reduced removal of dysfunctional mitochondria is common to aging and age-related neurodegenerative pathologies such as Alzheimer’s disease (AD). Strategies for treating such impaired mitophagy would benefit from the identification of mitophagy modulators. Here we report the combined use of unsupervised machine learning (involving vector representations of molecular structures, pharmacophore fingerprinting and conformer fingerprinting) and a cross-species approach for the screening and experimental validation of new...
The wide application of smart devices enables the availability of multimodal data, which can be utilized in many tasks. In the field of multimodal sentiment analysis, most previous works focus on exploring intra- and inter-modal interactions. However, training a network with cross-modal information (language, audio and visual) is still challenging due to the modality gap. Besides, while learning the dynamics within each sample draws great attention, the learning of inter-sample and inter-class relationships is neglected. Moreover, the size of datasets...
Illuminating interactions between proteins and small drug molecules is a longstanding challenge in the field of drug discovery. Despite the importance of understanding these interactions, most previous works are limited by hand-designed scoring functions and insufficient conformation sampling. The recently proposed graph neural network-based methods provide alternatives to predict the protein-ligand complex conformation in a one-shot manner. However, these methods neglect the geometric constraints of the complex structure and weaken the role of local functional...
The complete biosynthetic pathways are unknown for most natural products (NPs); it is thus valuable to make computer-aided bio-retrosynthesis predictions. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm...
While significant advances have been made in predicting static protein structures, the inherent dynamics of proteins, modulated by ligands, are crucial for understanding protein function and facilitating drug discovery. Traditional docking methods, frequently used for studying protein-ligand interactions, typically treat proteins as rigid. While molecular dynamics simulations can propose appropriate protein conformations, they're computationally demanding due to rare transitions between biologically relevant equilibrium...
Constructing proper representations of molecules lies at the core of numerous tasks such as molecular property prediction and drug design. Graph neural networks, especially the message passing neural network (MPNN) and its variants, have recently made remarkable achievements in molecular graph modeling. Albeit powerful, the one-sided focus on atom (node) or bond (edge) information in existing MPNN methods leads to insufficient representations of the attributed molecular graphs. Herein, we propose a Communicative Message Passing Neural Network (CMPNN)...
Biomedical knowledge graphs (KGs), which can help with the understanding of complex biological systems and pathologies, have begun to play a critical role in medical practice and research. However, challenges remain in their embedding and use due to their complex nature and the specific demands of their construction. Existing studies often suffer from problems such as sparse and noisy datasets, insufficient modeling methods and non-uniform evaluation metrics. In this work, we established a comprehensive KG system for the biomedical field in an attempt...
Recognizing substructures and their relations embedded in a molecular structure representation is a key process for structure–activity or structure–property relationship (SAR/SPR) studies. A molecular structure can be explicitly represented as either a connection table (CT) or a linear notation, such as SMILES, which is a language describing the connectivity of atoms in the molecular structure. Conventional SAR/SPR approaches rely on partitioning the CT into a set of predefined structural descriptors. In this work, we propose a new method for identifying...
Linking fragments to generate a focused compound library for a specific drug target is one of the challenges in fragment-based drug design (FBDD).
Protein-DNA interactions play crucial roles in biological systems, and identifying protein-DNA binding sites is the first step for mechanistic understanding of various biological activities (such as transcription and repair) and for designing novel drugs. How to accurately identify DNA-binding residues from only the protein sequence remains a challenging task. Currently, most existing sequence-based methods only consider contextual features of the sequential neighbors, which are limited in capturing spatial information. Based on...
Protein solubility is significant in producing new soluble proteins that can reduce the cost of biocatalysts or therapeutic agents. Therefore, a computational model is highly desired to accurately predict protein solubility from the amino acid sequence. Many methods have been developed, but they are mostly based on the one-dimensional embedding of amino acids, which is limited in capturing spatial structural information. In this study, we have developed a structure-aware method, GraphSol, to predict protein solubility by an attentive graph convolutional network (GCN),...
Relation prediction for knowledge graphs aims at predicting missing relationships between entities. Despite the importance of inductive relation prediction, most previous works are limited to a transductive setting and cannot process previously unseen entities. The recently proposed subgraph-based relation reasoning models provided alternatives to predict links from the subgraph structure surrounding a candidate triplet inductively. However, we observe that these methods often neglect the directed nature of the extracted subgraph and weaken...
Identifying drug–protein interactions (DPIs) is crucial in drug discovery, and a number of machine learning methods have been developed to predict DPIs. Existing methods usually use unrealistic data sets with hidden bias, which will limit the accuracy of virtual screening methods. Meanwhile, most DPI prediction methods pay more attention to molecular representation but lack effective research on protein representation and high-level associations between different instances. To this end, we present a novel structure-aware multimodal...
Graph neural networks (GNNs) have received increasing attention because of their expressive power on topological data, but they are still criticized for their lack of interpretability. To interpret GNN models, explainable artificial intelligence (XAI) methods have been developed. However, these methods are limited to qualitative analyses without quantitative assessments from real-world datasets due to a lack of ground truths. In this study, we have established five XAI-specific molecular property benchmarks, including two...
The CACHE challenges are a series of prospective benchmarking exercises to evaluate progress in the field of computational hit-finding. Here we report the results of the inaugural CACHE challenge, in which 23 teams each selected up to 100 commercially available compounds that they predicted would bind to the WDR domain of the Parkinson's disease target LRRK2, a domain with no known ligand and only an apo structure in the PDB. The lack of binding data and presumably low druggability is a challenge to hit-finding methods. Of the 1955 molecules selected by participants in Round 1 of the challenge,...
Biogenic compounds are important materials for drug discovery and chemical biology. In this work, we report a quasi-biogenic molecule generator (QBMG) to compose virtual quasi-biogenic compound libraries by means of gated recurrent unit recurrent neural networks. The library includes stereo-chemical properties, which are crucial features of natural products. QBMG can reproduce the property distribution of the underlying training set, while being able to generate realistic, novel molecules outside of the training set. Furthermore, these compounds are associated...
Scaffold hopping is a central task of modern medicinal chemistry for rational drug design, which aims to design molecules of novel scaffolds sharing similar target biological activities toward known hit molecules. Traditionally, scaffold hopping depends on searching databases of available compounds, which cannot exploit the vast chemical space. In this study, we have re-formulated this task as a supervised molecule-to-molecule translation to generate hopped molecules novel in 2D structure but similar in 3D structure, inspired by the fact that candidate...
Fragment-based drug discovery is a widely used strategy for drug design in both academic and pharmaceutical industries. Although fragments can be linked to generate candidate compounds by the latest deep generative models, generating linkers with specified attributes remains underdeveloped. In this study, we presented a novel framework, DRlinker, to control fragment linking toward compounds with given attributes through reinforcement learning. The method has been shown to be effective for many tasks, from controlling the linker length and log P,...
Illuminating synthetic pathways is essential for producing valuable chemicals, such as bioactive molecules. Chemical and biological syntheses are crucial, and their integration often leads to more efficient and sustainable pathways. Despite the rapid development of retrosynthesis models, few of them consider both chemical and biological syntheses, hindering the pathway design for high-value chemicals. Here, we propose BioNavi by innovating multitask learning and reaction templates into the deep learning-driven model to design hybrid synthesis...
AlphaFold3 has set the new state-of-the-art in predicting protein-protein complex structures. However, the complete picture of biomolecular interactions cannot be fully captured by static structures alone. In the field of protein engineering and antibody discovery, the connection from structure to function is often mediated by binding energy. This work benchmarks AlphaFold3 against SKEMPI, a commonly used binding energy dataset. We demonstrate that AlphaFold3 learns unique information and synergizes with force field, profile-based,...