- Protein Structure and Dynamics
- Graph Theory and Algorithms
- Advanced Graph Neural Networks
- Machine Learning in Materials Science
- Computational Drug Discovery Methods
- Usability and User Interface Design
- Service-Oriented Architecture and Web Services
- Data Management and Algorithms
- Distributed Systems and Fault Tolerance
- Advanced Database Systems and Queries
- Enzyme Structure and Function
- Caching and Content Delivery
- Peer-to-Peer Network Technologies
- Machine Learning in Bioinformatics
- Genomics and Phylogenetic Studies
- Botanical Research and Chemistry
- SARS-CoV-2 and COVID-19 Research
- Topic Modeling
- Web Data Mining and Analysis
- Cloud Computing and Resource Management
- Mass Spectrometry Techniques and Applications
- Semantic Web and Ontologies
- RNA Research and Splicing
- Interconnection Networks and Systems
- Distributed and Parallel Computing Systems
Microsoft Research Asia (China)
2014-2025
West China Hospital of Sichuan University
2023
Kunming University of Science and Technology
2023
Sichuan University
2023
Peking University
2023
Peking University Cancer Hospital
2023
Microsoft (United States)
2014-2015
Fudan University
2007-2011
Concordia University
2005
Computations performed by graph algorithms are data driven, and require a high degree of random data access. Despite the great progress made in disk technology, it still cannot provide the level of efficient random access required by graph computation. On the other hand, memory-based approaches usually do not scale due to the capacity limit of single machines. In this paper, we introduce Trinity, a general purpose graph engine over a distributed memory cloud. Through optimized memory management and network communication, Trinity supports fast...
The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a...
Much work has been devoted to supporting RDF data. But state-of-the-art systems and methods still cannot handle web scale RDF data effectively. Furthermore, many useful and general purpose graph-based operations (e.g., random walk, reachability, community discovery) on RDF data are not supported, as most existing systems store and index data in particular ways (e.g., as relational tables or as a bitmap matrix) to maximize one particular operation on RDF data: SPARQL query processing. In this paper, we introduce Trinity.RDF, a distributed, memory-based graph...
Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low...
Billion-node graphs pose significant challenges at all levels from storage infrastructures to programming models. It is critical to develop a general purpose platform for graph processing. A distributed memory system is considered a feasible platform for supporting online query processing as well as offline graph analytics. In this paper, we study the problem of partitioning billion-node graphs on such a platform, an important consideration because it has a direct impact on load balancing and communication overhead. It is challenging not...
Drug-drug interaction (DDI) prediction identifies interactions of drug combinations in which the adverse side effects caused by physicochemical incompatibility have attracted much attention. Previous studies usually model drug information from single or dual views of the whole drug molecules but ignore the detailed interactions among atoms, which leads to incomplete and noisy information and limits the accuracy of DDI prediction. In this work, we propose a novel dual-view drug representation learning network for DDI prediction ('DSN-DDI'), which employs local and global representation learning modules...
Residue co-evolution has become the primary principle for estimating inter-residue distances of a protein, which are crucially important for predicting protein structure. Most existing approaches adopt an indirect strategy, i.e., inferring residue co-evolution based on some hand-crafted features, say, a covariance matrix, calculated from the multiple sequence alignment (MSA) of the target protein. This indirect strategy, however, cannot fully exploit the information carried by the MSA. Here, we report an end-to-end deep neural network,...
Biomolecular dynamics simulation is a fundamental technology for life sciences research, and its usefulness depends on its accuracy and efficiency.
Machine learning force fields (MLFFs) have gained popularity in recent years as they provide a cost-effective alternative to ab initio molecular dynamics (MD) simulations. Despite a small error on the test set, MLFFs inherently suffer from generalization and robustness issues during MD simulations. To alleviate these issues, we propose global metrics and fine-grained metrics from the element and conformation aspects to systematically measure MLFFs for every atom of molecules. We selected three state-of-the-art MLFFs (ET, NequIP, ViSNet)...
We are facing challenges at all levels ranging from infrastructures to programming models for managing and mining large graphs. A lot of algorithms on large graphs are ad-hoc in the sense that each of them assumes that the underlying graph data can be organized in a certain way that maximizes the performance of the algorithm. In other words, there are no standard graph systems on which graph algorithms are developed and optimized. In response to this situation, a lot of graph systems have been proposed recently. In this tutorial, we discuss several representative systems. Still, we focus on providing...
Hamiltonian matrix prediction is pivotal in computational chemistry, serving as the foundation for determining a wide range of molecular properties. While SE(3) equivariant graph neural networks have achieved remarkable success in this domain, their substantial computational cost, driven by high-order tensor product (TP) operations, restricts their scalability to large molecular systems with extensive basis sets. To address this challenge, we introduce SPHNet, an efficient and scalable equivariant network that incorporates adaptive sparsity...
In this paper we investigate the effect of exceptional points (EPs) on the violation of the Leggett–Garg inequality (LGI) and the no-signaling-in-time (NSIT) conditions, and compare the different effects of the Hamiltonian EP (HEP) and the Liouvillian EP (LEP) on those violations. We consider an open system consisting of two coupled qubits, in which each qubit is in contact with a thermal bath at a different temperature. In the case of omitting quantum jumps, we find that the system exhibits a second order HEP, which separates the parameter space into the overdamped regime...
Since the SARS-CoV-2 Omicron variant (B.1.1.529) was reported in November 2021, it has quickly spread to many countries and outcompeted the globally dominant Delta variant in several countries. The Omicron variant contains the largest number of mutations to date, with 32 mutations located at the spike (S) glycoprotein, which raised great concern for its enhanced viral fitness and immune escape [1–4]. In this study, we report the crystal structure of the receptor binding domain (RBD) of the Omicron variant S glycoprotein bound to human ACE2 at a resolution of 2.6 Å. Structural...
Molecular dynamics (MD) simulations have revolutionized the modeling of biomolecular conformations and provided unprecedented insight into molecular interactions. Due to the prohibitive computational overheads of ab initio simulation for large biomolecules, dynamic modeling of proteins is generally constrained to force fields based on molecular mechanics, which suffer from low accuracy and ignore electronic effects. Here, we report AIMD-Chig, an MD dataset including 2 million conformations of the 166-atom protein Chignolin sampled...
The emergence of real life graphs with billions of nodes poses significant challenges for managing and querying these graphs. One of the fundamental queries submitted to graphs is the shortest distance query. Online BFS (breadth-first search) and offline pre-computing of pairwise shortest distances are prohibitive in time or space complexity for billion-node graphs. In this paper, we study the feasibility of building distance oracles for billion-node graphs. A distance oracle provides approximate answers to shortest distance queries by using a pre-computed data structure for the graph. Sketch-based distance oracles are good candidates...
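To make the idea of a distance oracle concrete, here is a minimal sketch of a generic landmark-based oracle. It is an illustration only, not the method of the paper above (whose abstract is truncated): the function names, the choice of landmarks, and the toy graph are all assumptions made for this example. Each node stores its BFS distance to a few landmark nodes, and a query returns the smallest sum over shared landmarks, which is an upper bound on the true shortest distance.

```python
from collections import deque


def bfs_distances(graph, source):
    """Unweighted shortest distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_sketches(graph, landmarks):
    """Pre-compute, for each node, its distance to every reachable landmark.

    Hypothetical helper: one BFS per landmark on an undirected graph.
    """
    sketches = {v: {} for v in graph}
    for lm in landmarks:
        for v, d in bfs_distances(graph, lm).items():
            sketches[v][lm] = d
    return sketches


def estimate_distance(sketches, u, v):
    """Upper-bound estimate of d(u, v) via the best shared landmark."""
    if u == v:
        return 0
    common = sketches[u].keys() & sketches[v].keys()
    if not common:
        return None  # no shared landmark: the oracle cannot answer
    return min(sketches[u][lm] + sketches[v][lm] for lm in common)


# Toy undirected path graph 0 - 1 - 2 - 3 - 4 (assumed example data).
example_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
example_sketches = build_sketches(example_graph, landmarks=[2])
```

With node 2 as the only landmark, the estimate for the pair (0, 4) is exact because the landmark lies on their shortest path, while the estimate for (0, 1) overshoots the true distance. Landmark selection is therefore the main design choice in oracles of this kind.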
SARS-CoV-2 is the virus that caused the COVID-19 pandemic. Early viral infection is mediated by the SARS-CoV-2 homo-trimeric Spike (S) protein with its receptor binding domains (RBDs) in the receptor-accessible state. Molecular dynamics simulation on the S protein with a focus on the function of its N-terminal domains (NTDs) is performed. The study reveals that the NTD acts as a "wedge" and plays a crucial regulatory role in the conformational changes of the S protein. The complete RBD structural transition is allowed only when the neighboring NTD that typically prohibits the RBD's movements as a wedge...
The identification of active binding drugs for target proteins (referred to as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role drug discovery. Although recent deep learning-based approaches achieve better performance than molecular docking, existing models often neglect topological or spatial intermolecular information, hindering prediction performance. We recognize this problem and propose a novel approach called Intermolecular...
In recent years, knowledge graph embedding has become a hot research topic in artificial intelligence and plays an increasingly vital role in various downstream applications, such as recommendation and question answering. However, existing methods for knowledge graph embedding cannot make a proper trade-off between model complexity and model expressiveness, which leaves them far from satisfactory. To mitigate this problem, we propose a lightweight modeling framework that can achieve highly competitive relational expressiveness...
Current Web 2.0 services are making mass collaboration a reality. Using a browser, people can participate in cooperative work anytime, anywhere from any computing device as long as there is an Internet connection. At the heart of some well-known Web 2.0 services lies an optimistic consistency control technique called operational transformation (OT). This paper proposes TIPS, a novel sync protocol that adapts OT for Web 2.0 applications. Based on a recent theoretical framework, ABT, it ensures not only convergence but also the right...
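As a rough illustration of what operational transformation does, the sketch below transforms two concurrent text insertions so that both sites converge on the same document. It is a textbook-style toy, not the TIPS protocol or the ABT framework named above. The `Insert` type, the `site` tie-breaker, and the function names are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text is inserted
    text: str   # inserted string
    site: int   # hypothetical site id, used only to break position ties


def apply_op(doc: str, op: Insert) -> str:
    """Apply an insertion to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite op so it can be applied after other.

    Both operations are assumed to have been generated on the same
    document state. If other inserted at an earlier position (or at the
    same position with a lower site id), op shifts right by the length
    of the text that other inserted.
    """
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


# Two users edit "abc" concurrently at the same position.
base = "abc"
a = Insert(pos=1, text="X", site=1)
b = Insert(pos=1, text="Y", site=2)

# Site 1 applies a, then b transformed against a.
at_site_1 = apply_op(apply_op(base, a), transform(b, a))
# Site 2 applies b, then a transformed against b.
at_site_2 = apply_op(apply_op(base, b), transform(a, b))
```

Both `at_site_1` and `at_site_2` end up as the same string, which is the convergence property the abstract refers to. Deletions, and the further guarantees the abstract goes on to mention, need additional machinery that this sketch omits.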