- Advanced Graph Neural Networks
- Data Management and Algorithms
- Graph Theory and Algorithms
- Complex Network Analysis Techniques
- Algorithms and Data Compression
- Computational Drug Discovery Methods
- Advanced Database Systems and Queries
- Machine Learning in Materials Science
- Complexity and Algorithms in Graphs
- Topic Modeling
- Caching and Content Delivery
- Data Stream Mining Techniques
- Cancer-related molecular mechanisms research
- Time Series Analysis and Forecasting
- Gastric Cancer Management and Outcomes
- Protein Structure and Dynamics
- RNA modifications and cancer
- Computational Geometry and Mesh Generation
- Metastasis and carcinoma case studies
- Human Mobility and Location-Based Analysis
- Immune cells in cancer
- Chemokine receptors and signaling
- Neural Networks and Applications
- Helicobacter pylori-related gastroenterology studies
- Recommender Systems and Techniques
Renmin University of China
2016-2025
The First Affiliated Hospital, Sun Yat-sen University
2018-2024
Sun Yat-sen University
2015-2024
Beijing Institute of Big Data Research
2023
Beihang University
2018
Sun Yat-sen University Cancer Center
2018
Aarhus University
2013-2014
Danish National Research Foundation
2013
Center for Massive Data Algorithmics
2013
Hong Kong University of Science and Technology
2008-2012
Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown potential for human-level intelligence, leading to a surge in research on LLM-based autonomous agents. In this paper, we present...
Molecular representation learning (MRL) has gained tremendous attention due to its critical role in learning from limited supervised data for applications like drug design. In most MRL methods, molecules are treated as 1D sequential tokens or 2D topology graphs, limiting their ability to incorporate 3D information for downstream tasks and, in particular, making 3D geometry prediction/generation almost impossible. In this paper, we propose a universal MRL framework, called Uni-Mol, that significantly enlarges the representation ability and...
Autonomous agents have long been a prominent research focus in both academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an...
The matching of similar pairs of objects, called similarity join, is fundamental functionality in data management. We consider the case of trajectory similarity join (TS-Join), where the objects are trajectories of vehicles moving in road networks. Thus, given two sets of trajectories and a threshold θ, the TS-Join returns all pairs of trajectories from the two sets with similarity above θ. This join targets applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide a purposeful definition of similarity. To...
We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two datasets, there is a way to merge the two summaries into a single summary on the two datasets combined together, while preserving the error and size guarantees. This property means that the summaries can be merged in a way akin to other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the datasets. But some...
Background: Long non-coding RNA H19 (H19) was demonstrated to be significantly correlated with tumor metastasis. However, the specific functions of H19 in colorectal cancer (CRC) metastasis and the underlying mechanism are still largely unclear. Methods: A public database was used to screen for potential lncRNAs crucial for metastasis in colorectal cancer. The expression of H19 in clinical CRC specimens was detected by qRT-PCR. The effect of H19 on the metastasis of CRC cells was investigated by transwell assays, wound healing assays, CCK-8 assays and animal studies. The proteins binding to H19 were identified...
Travel planning and recommendation are important aspects of transportation. We propose and investigate a novel Collective Travel Planning (CTP) query that finds the lowest-cost route connecting multiple sources and a destination, via at most $k$ meeting points. When multiple travelers target the same destination (e.g., a stadium or a theater), they may want to assemble at meeting points and then go together to the destination by public transport to reduce their global travel cost, e.g., energy,...
Purpose: C-X-C chemokine receptor type 2 (CXCR2) is a key regulator that drives immune suppression and inflammation in the tumor microenvironment. CXCR2-targeted therapy has shown promising results in several solid tumors. However, the underlying mechanism of CXCR2-mediated cross-talk between gastric cancer cells and macrophages still remains unclear. Experimental Design: The expression of CXCR2 and its ligands in 155 human gastric cancer tissues was analyzed via immunohistochemistry, and the correlations with clinical...
Given a graph G, a source node s and a target node t, the personalized PageRank (PPR) of t with respect to s is the probability that a random walk starting from s terminates at t. A single-source PPR (SSPPR) query enumerates all nodes in G and returns the top-k nodes with the highest PPR values with respect to a given source node s. SSPPR has important applications in web search and social networks, e.g., in Twitter's Who-To-Follow recommendation service. However, SSPPR computation is immensely expensive and, at the same time, resistant to indexing and materialization. So far, existing solutions either...
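The random-walk definition of PPR above can be illustrated with a small Monte Carlo estimator. This is only a sketch under stated assumptions, not the method of the paper: the per-step termination probability `alpha`, the adjacency-dict graph format, the dead-end rule, and the function names `estimate_ppr` and `top_k_ppr` are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def estimate_ppr(graph, source, alpha=0.2, num_walks=10000, seed=0):
    """Estimate PPR values w.r.t. `source` by simulating terminating random walks.

    `graph` maps each node to a list of out-neighbors (assumed format).
    `alpha` is the assumed probability that a walk stops at each step.
    """
    rng = random.Random(seed)
    end_counts = Counter()
    for _ in range(num_walks):
        node = source
        # Stop with probability alpha; otherwise move to a random out-neighbor.
        while rng.random() >= alpha:
            neighbors = graph.get(node, [])
            if not neighbors:
                break  # assumption: a walk at a dead end terminates there
            node = rng.choice(neighbors)
        end_counts[node] += 1
    # The fraction of walks ending at t estimates the PPR of t w.r.t. source.
    return {node: count / num_walks for node, count in end_counts.items()}


def top_k_ppr(graph, source, k, **kwargs):
    """Return the k nodes with the highest estimated PPR (ties broken by node name)."""
    ppr = estimate_ppr(graph, source, **kwargs)
    return sorted(ppr.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Plain simulation like this needs many walks for accurate estimates on large graphs, which is consistent with the cost concern raised in the abstract.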
Many representative graph neural networks, e.g., GPR-GNN and ChebNet, approximate graph convolutions with graph spectral filters. However, existing work either applies predefined filter weights or learns them without necessary constraints, which may lead to oversimplified or ill-posed filters. To overcome these issues, we propose BernNet, a novel graph neural network with theoretical support that provides a simple but effective scheme for designing and learning arbitrary graph spectral filters. In particular, for any filter over the normalized Laplacian spectrum of a graph,...
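As a rough illustration of a spectral filter built from a Bernstein basis, the sketch below evaluates an order-K Bernstein polynomial at a single eigenvalue of the normalized Laplacian. It assumes the spectrum lies in [0, 2] and uses the standard Bernstein basis rescaled to that interval; the function name `bernstein_filter` and the coefficient list `theta` are hypothetical, and the truncated abstract does not confirm that this matches the paper's exact parameterization.

```python
from math import comb


def bernstein_filter(lam, theta):
    """Evaluate h(lam) = sum_k theta[k] * C(K, k) * (2 - lam)^(K - k) * lam^k / 2^K.

    `lam` is an eigenvalue in [0, 2]; `theta` holds K + 1 coefficients (assumed inputs).
    """
    K = len(theta) - 1
    total = 0.0
    for k, coeff in enumerate(theta):
        # k-th Bernstein basis polynomial of order K, rescaled from [0, 1] to [0, 2].
        basis = comb(K, k) * (2 - lam) ** (K - k) * lam ** k / 2 ** K
        total += coeff * basis
    return total
```

Because the basis functions are non-negative on [0, 2] and sum to one, non-negative coefficients give a non-negative filter response. For example, `theta = [1, 0, 0]` yields a response that decays from 1 at `lam = 0` to 0 at `lam = 2`, i.e. a low-pass shape.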
Molecular representation learning (MRL) has gained tremendous attention due to its critical role in learning from limited supervised data for applications like drug design. In most MRL methods, molecules are treated as 1D sequential tokens or 2D topology graphs, limiting their ability to incorporate 3D information for downstream tasks and, in particular, making 3D geometry prediction or generation almost impossible. Herein, we propose Uni-Mol, a universal MRL framework that significantly enlarges the representation ability and application...
In long-term time series forecasting, most Transformer-based methods adopt the standard point-wise attention mechanism, which not only has high complexity but also cannot explicitly capture the predictive dependencies from contexts, since the corresponding key and value are transformed from the same point. This paper proposes a Transformer-based model called Preformer. Preformer introduces a novel efficient Multi-Scale Segment-Correlation mechanism that divides time series into segments and utilizes segment-wise correlation-based attention to replace...
We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones like those for heavy...
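The remark that linear sketches are trivially mergeable can be made concrete with a small Count-Min sketch, used here only as one familiar example of a linear sketch. The class name `CountMin`, the SHA-256-based hashing, and the default `width` and `depth` are assumptions for illustration, not details taken from the paper.

```python
import hashlib


class CountMin:
    """A minimal Count-Min sketch: a depth x width table of counters."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _column(self, row, item):
        # Deterministic per-row hash, so two sketches of equal shape are compatible.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._column(row, item)] += count

    def estimate(self, item):
        # With non-negative updates, the estimate never underestimates the true count.
        return min(self.table[row][self._column(row, item)] for row in range(self.depth))

    def merge(self, other):
        """Merge by adding counters entry-wise; size and shape stay unchanged."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must have the same shape")
        merged = CountMin(self.width, self.depth)
        merged.table = [
            [a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(self.table, other.table)
        ]
        return merged
```

Since each counter is a linear function of the input, merging two sketches gives exactly the sketch of the combined data, so the size and error guarantees carry over unchanged.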
The matching between trajectories and locations, called Trajectory-to-Location join (TL-Join), is fundamental functionality in spatiotemporal data management. Given a set of trajectories, a set of locations, and a threshold θ, the TL-Join finds all (trajectory, location) pairs from the two sets with spatiotemporal correlation above θ. This join targets diverse applications, including location recommendation, event tracking, and trajectory activity analyses. We address three challenges in relation to the TL-Join: how to define the correlation between trajectories and locations, how to prune the search space...
Long noncoding RNAs (lncRNAs) are implicated in various cancers, including colon cancer. Liver metastasis is the main cause of colon cancer-related death. However, the roles of lncRNAs in colon cancer liver metastasis are still largely unclear. In this study, we identified a novel lncRNA, B3GALT5-AS1, which was reduced in colon cancer tissues and further reduced in colon cancer liver metastasis tissues. Reduced expression of B3GALT5-AS1 was associated with poor outcome of colon cancer patients. Gain-of-function and loss-of-function assays revealed that B3GALT5-AS1 inhibited proliferation but promoted migration and invasion of colon cancer cells....
Personalized PageRank (PPR) is a classic metric that measures the relevance of graph nodes with respect to a source node. Given a graph G, a source node s, and a parameter k, a top-k PPR query returns a set of k nodes with the highest PPR values with respect to s. This type of query serves as an important building block for numerous applications in web search and social networks, such as Twitter's Who-To-Follow recommendation service. Existing techniques for top-k PPR, however, suffer from two major deficiencies. First, they either incur prohibitive space and time overheads...
Given a social network G with n nodes and m edges, a positive integer k, and a cascade model C, the influence maximization (IM) problem asks for k nodes in G such that the expected number of nodes influenced by the k nodes under cascade model C is maximized. The state-of-the-art approximate solutions run in O(k(n+m)log(n)/ε²) expected time while returning a (1 - 1/e - ε) approximate solution with at least 1 - 1/n probability. A key phase of these IM algorithms is the random reverse reachable (RR) set generation, and this phase significantly affects the efficiency and scalability of the state-of-the-art IM algorithms. In this paper, we...
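To make the RR set generation phase concrete, here is a minimal sketch of sampling one random RR set. It assumes the independent cascade model, which is only one possible choice of cascade model C, and a hypothetical input format: `in_edges` maps a node to a list of `(in_neighbor, probability)` pairs. It shows the baseline sampling procedure, not the improved generation technique the abstract goes on to describe.

```python
import random
from collections import deque


def random_rr_set(in_edges, nodes, rng):
    """Sample one reverse reachable set under the independent cascade model (assumed).

    Picks a uniformly random root, then walks edges backwards, keeping each
    incoming edge (u, v) independently with its propagation probability.
    """
    root = rng.choice(nodes)
    rr_set = {root}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u, prob in in_edges.get(v, []):
            # Each incoming edge is examined once, when its head v is dequeued.
            if u not in rr_set and rng.random() < prob:
                rr_set.add(u)
                queue.append(u)
    return rr_set
```

A call such as `random_rr_set(in_edges, nodes, random.Random(0))` returns the set of nodes that could have influenced the sampled root in one random realization. RR-set-based IM algorithms typically sample many such sets and then greedily pick the k nodes that cover the most of them.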
Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data. Recently, GCNs and subsequent variants have shown superior performance in various application areas on real-world datasets. Despite their success, most of the current GCN models are shallow, due to the over-smoothing problem. In this paper, we study the problem of designing and analyzing deep graph convolutional networks. We propose GCNII, an extension of the vanilla GCN model with two simple yet effective techniques: Initial...
Designing spectral convolutional networks is a challenging problem in graph learning. ChebNet, one of the early attempts, approximates the spectral graph convolutions using Chebyshev polynomials. GCN simplifies ChebNet by utilizing only the first two Chebyshev polynomials while still outperforming it on real-world datasets. GPR-GNN and BernNet demonstrate that the Monomial and Bernstein bases also outperform the Chebyshev basis in terms of learning the spectral graph convolutions. Such conclusions are counter-intuitive in the field of approximation theory, where it is established...
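For readers unfamiliar with the Chebyshev basis mentioned above, the sketch below evaluates a Chebyshev polynomial filter at one Laplacian eigenvalue using the three-term recurrence T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x). It assumes eigenvalues in [0, 2] that are shifted to [-1, 1] by x = lam - 1; the function name `chebyshev_filter` and the `weights` list are hypothetical, and this is an illustration of the basis rather than any specific model from the abstract.

```python
def chebyshev_filter(lam, weights):
    """Evaluate h(lam) = sum_k weights[k] * T_k(lam - 1) for lam in [0, 2].

    `weights` must contain at least one coefficient (assumed input format).
    """
    x = lam - 1.0  # shift the assumed spectrum [0, 2] onto [-1, 1]
    t_prev, t_curr = 1.0, x  # T_0(x) = 1 and T_1(x) = x
    total = weights[0] * t_prev
    if len(weights) > 1:
        total += weights[1] * t_curr
    for w in weights[2:]:
        # Three-term recurrence: T_k = 2 x T_{k-1} - T_{k-2}.
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        total += w * t_curr
    return total
```

Passing only two weights keeps just T_0 and T_1, which corresponds to the first-two-polynomials simplification the abstract attributes to GCN.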
Integrating scientific principles into machine learning models to enhance their predictive performance and generalizability is a central challenge in the development of AI for Science. Herein, we introduce Uni-pKa, a novel framework that successfully incorporates thermodynamic principles into machine learning modeling, achieving high-precision predictions of acid dissociation constants (pKa), a crucial task in the rational design of drugs and catalysts, as well as a modeling challenge in computational physical chemistry for small organic molecules. Uni-pKa utilizes...
The extracellular matrix (ECM) has been demonstrated to be dysregulated and crucial for malignant progression in gastric cancer (GC), but the mechanism is not well understood. Here, it is reported that discoidin domain receptor 1 (DDR1), a principal ECM receptor, is recognized as a key driver of GC progression. Mechanistically, DDR1 directly interacts with the PAS domain of hypoxia-inducible factor-1α (HIF-1α), suppresses its ubiquitination and subsequently strengthens its transcriptional regulation of angiogenesis. Additionally,...