Bin Shao

ORCID: 0000-0002-9790-5687
Research Areas
  • Protein Structure and Dynamics
  • Graph Theory and Algorithms
  • Advanced Graph Neural Networks
  • Machine Learning in Materials Science
  • Computational Drug Discovery Methods
  • Usability and User Interface Design
  • Service-Oriented Architecture and Web Services
  • Data Management and Algorithms
  • Distributed Systems and Fault Tolerance
  • Advanced Database Systems and Queries
  • Enzyme Structure and Function
  • Caching and Content Delivery
  • Peer-to-Peer Network Technologies
  • Machine Learning in Bioinformatics
  • Genomics and Phylogenetic Studies
  • Botanical Research and Chemistry
  • SARS-CoV-2 and COVID-19 Research
  • Topic Modeling
  • Web Data Mining and Analysis
  • Cloud Computing and Resource Management
  • Mass Spectrometry Techniques and Applications
  • Semantic Web and Ontologies
  • RNA Research and Splicing
  • Interconnection Networks and Systems
  • Distributed and Parallel Computing Systems

Microsoft Research Asia (China)
2014-2025

West China Hospital of Sichuan University
2023

Kunming University of Science and Technology
2023

Sichuan University
2023

Peking University
2023

Peking University Cancer Hospital
2023

Microsoft (United States)
2014-2015

Fudan University
2007-2011

Concordia University
2005

Computations performed by graph algorithms are data driven and require a high degree of random access. Despite the great progress made in disk technology, it still cannot provide the level of efficient random access required by graph computation. On the other hand, memory-based approaches usually do not scale due to the capacity limit of single machines. In this paper, we introduce Trinity, a general purpose graph engine over a distributed memory cloud. Through optimized memory management and network communication, Trinity supports fast...

10.1145/2463676.2467799 article EN 2013-06-22

The ability to handle large scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a...

10.14778/2311906.2311907 article EN Proceedings of the VLDB Endowment 2012-05-01

Much work has been devoted to supporting RDF data. But state-of-the-art systems and methods still cannot handle web scale RDF data effectively. Furthermore, many useful and general purpose graph-based operations (e.g., random walk, reachability, community discovery) on RDF data are not supported, as most existing systems store and index data in particular ways (e.g., as relational tables or as a bitmap matrix) to maximize one particular operation on RDF data: SPARQL query processing. In this paper, we introduce Trinity.RDF, a distributed, memory-based graph...

10.14778/2535570.2488333 article EN Proceedings of the VLDB Endowment 2013-02-01

Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low...

10.1038/s41467-023-43720-2 article EN cc-by Nature Communications 2024-01-05

Billion-node graphs pose significant challenges at all levels, from storage infrastructures to programming models. It is critical to develop a general purpose platform for graph processing. A distributed memory system is considered a feasible platform for supporting online query processing as well as offline graph analytics. In this paper, we study the problem of partitioning billion-node graphs on such a platform, an important consideration because it has a direct impact on load balancing and communication overhead. It is challenging not...

10.1109/icde.2014.6816682 article EN 2014-03-01

Drug-drug interaction (DDI) prediction identifies interactions of drug combinations in which the adverse side effects caused by physicochemical incompatibility have attracted much attention. Previous studies usually model drug information from single or dual views of the whole drug molecules but ignore the detailed interactions among atoms, which leads to incomplete and noisy information and limits the accuracy of DDI prediction. In this work, we propose a novel dual-view drug representation learning network for DDI prediction ('DSN-DDI'), which employs local and global representation learning modules...

10.1093/bib/bbac597 article EN Briefings in Bioinformatics 2022-12-05

Residue co-evolution has become the primary principle for estimating inter-residue distances of a protein, which are crucially important for predicting protein structure. Most existing approaches adopt an indirect strategy, i.e., inferring residue co-evolution based on some hand-crafted features, say, a covariance matrix, calculated from the multiple sequence alignment (MSA) of the target protein. This indirect strategy, however, cannot fully exploit the information carried by the MSA. Here, we report an end-to-end deep neural network,...

10.1038/s41467-021-22869-8 article EN cc-by Nature Communications 2021-05-05

Biomolecular dynamics simulation is a fundamental technology for life sciences research, and its usefulness depends on its accuracy and efficiency.

10.1038/s41586-024-08127-z article EN cc-by-nc-nd Nature 2024-11-06

Machine learning force fields (MLFFs) have gained popularity in recent years as they provide a cost-effective alternative to ab initio molecular dynamics (MD) simulations. Despite a small error on the test set, MLFFs inherently suffer from generalization and robustness issues during MD simulations. To alleviate these issues, we propose global metrics and fine-grained metrics from element and conformation aspects to systematically measure MLFFs for every atom of molecules. We selected three state-of-the-art MLFFs (ET, NequIP, and ViSNet)...

10.1063/5.0147023 article EN The Journal of Chemical Physics 2023-07-17

We are facing challenges at all levels, ranging from infrastructures to programming models, for managing and mining large graphs. A lot of algorithms on large graphs are ad-hoc in the sense that each of them assumes that the underlying graph data can be organized in a certain way that maximizes the performance of the algorithm. In other words, there is no standard graph system on which graph algorithms are developed and optimized. In response to this situation, many graph systems have been proposed recently. In this tutorial, we discuss several representative systems. Still, we focus on providing...

10.1145/2213836.2213907 article EN 2012-05-20

Hamiltonian matrix prediction is pivotal in computational chemistry, serving as the foundation for determining a wide range of molecular properties. While SE(3) equivariant graph neural networks have achieved remarkable success in this domain, their substantial computational cost, driven by high-order tensor product (TP) operations, restricts their scalability to large molecular systems with extensive basis sets. To address this challenge, we introduce SPHNet, an efficient and scalable equivariant network that incorporates adaptive sparsity...

10.48550/arxiv.2502.01171 preprint EN arXiv (Cornell University) 2025-02-03

In this paper we investigate the effect of exceptional points (EPs) on the violation of the Leggett–Garg inequality (LGI) and the no-signaling-in-time (NSIT) conditions, and compare the different effects of the Hamiltonian EP (HEP) and the Liouvillian EP (LEP) on those violations. We consider an open system consisting of two coupled qubits, where each qubit is in contact with a thermal bath at temperature. In the case of omitting quantum jumps, we find that the system exhibits a second order HEP, which separates the parameter space into an overdamped regime...

10.1088/1572-9494/adb562 article EN Communications in Theoretical Physics 2025-02-13

Since the SARS-CoV-2 Omicron variant (B.1.1.529) was reported in November 2021, it has quickly spread to many countries and outcompeted the globally dominant Delta variant in several countries. The Omicron variant contains the largest number of mutations to date, with 32 mutations located at the spike (S) glycoprotein, which raised great concern for its enhanced viral fitness and immune escape [1–4]. In this study, we report the crystal structure of the receptor binding domain (RBD) of the Omicron variant S glycoprotein bound to human ACE2 at a resolution of 2.6 Å. Structural...

10.1101/2022.01.03.474855 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2022-01-04

Molecular dynamics (MD) simulations have revolutionized the modeling of biomolecular conformations and provided unprecedented insight into molecular interactions. Due to the prohibitive computational overheads of ab initio simulation for large biomolecules, dynamic modeling of proteins is generally constrained to molecular mechanics force fields, which suffer from low accuracy and ignore electronic effects. Here, we report AIMD-Chig, an MD dataset including 2 million conformations of the 166-atom protein Chignolin sampled...

10.1038/s41597-023-02465-9 article EN cc-by Scientific Data 2023-08-22

The emergence of real life graphs with billions of nodes poses significant challenges for managing and querying these graphs. One of the fundamental queries submitted to graphs is the shortest distance query. Online BFS (breadth-first search) and offline pre-computing of pairwise shortest distances are prohibitive in time or space complexity for billion-node graphs. In this paper, we study the feasibility of building distance oracles for billion-node graphs. A distance oracle provides approximate answers to shortest distance queries by using a pre-computed data structure for the graph. Sketch-based distance oracles are good candidates...

10.14778/2732219.2732225 article EN Proceedings of the VLDB Endowment 2013-09-01

SARS-CoV-2 caused the COVID-19 pandemic. Early viral infection is mediated by the SARS-CoV-2 homo-trimeric Spike (S) protein with its receptor binding domains (RBDs) in the receptor-accessible state. Molecular dynamics simulation on the S protein, with a focus on the function of its N-terminal domains (NTDs), is performed. The study reveals that the NTD acts as a "wedge" and plays a crucial regulatory role in the conformational changes of the S protein. The complete RBD structural transition is allowed only when the neighboring NTD, which typically prohibits the RBD's movements as a wedge...

10.1002/adts.202100152 article EN Advanced Theory and Simulations 2021-09-02

The identification of active binding drugs for target proteins (referred to as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery. Although recent deep learning-based approaches achieve better performance than molecular docking, existing models often neglect topological or spatial intermolecular information, hindering prediction performance. We recognize this problem and propose a novel approach called Intermolecular...

10.1093/bib/bbac162 article EN Briefings in Bioinformatics 2022-04-14

In recent years, knowledge graph embedding has become a popular research topic in artificial intelligence and plays increasingly vital roles in various downstream applications, such as recommendation and question answering. However, existing methods for knowledge graph embedding cannot make a proper trade-off between model complexity and model expressiveness, which leaves them far from satisfactory. To mitigate this problem, we propose a lightweight modeling framework that can achieve highly competitive relational expressiveness...

10.18653/v1/2020.acl-main.358 preprint EN cc-by 2020-01-01

Current Web 2.0 services are making mass collaboration a reality. Using a browser, people can participate in cooperative work anytime, anywhere, from any computing device, as long as there is an Internet connection. Lying at the heart of some well-known services is an optimistic consistency control technique called operational transformation (OT). This paper proposes TIPS, a novel sync protocol that adapts OT for Web applications. Based on a recent theoretical framework, ABT, it ensures not only convergence but also the right...

10.1145/1958824.1958910 article EN 2011-03-19

10.1016/j.jpdc.2005.06.010 article EN Journal of Parallel and Distributed Computing 2005-08-04

10.1016/j.ins.2015.04.016 article EN Information Sciences 2015-04-18