- Complex Network Analysis Techniques
- Advanced Graph Neural Networks
- Data Management and Algorithms
- Caching and Content Delivery
- Graph Theory and Algorithms
- Mobile Crowdsensing and Crowdsourcing
- Advanced Image and Video Retrieval Techniques
- Bayesian Modeling and Causal Inference
- Multi-Criteria Decision Making
- Recommender Systems and Techniques
- Expert Finding and Q&A Systems
- Algorithms and Data Compression
- Opinion Dynamics and Social Influence
- Topic Modeling
- Privacy-Preserving Technologies in Data
- Explainable Artificial Intelligence (XAI)
- Multimodal Machine Learning Applications
- Peer-to-Peer Network Technologies
- Network Packet Processing and Optimization
- Data-Driven Disease Surveillance
- Data Quality and Management
- Metabolomics and Mass Spectrometry Studies
- Semantic Web and Ontologies
- Complexity and Algorithms in Graphs
- Social Media and Politics
Zhejiang University
2023-2025
Ningbo University
2024
Hangzhou Dianzi University
2023
Nanyang Technological University
2017-2022
Generating explanations for graph neural networks (GNNs) has been studied to understand their behavior in analytical tasks such as graph classification. Existing approaches aim to explain the overall results of GNNs rather than providing explanations for specific class labels of interest, and may return explanation structures that are hard to access and not directly queryable. We propose GVEX, a novel paradigm that generates Graph Views for EXplanation. (1) We design a two-tier structure called explanation views. An explanation view consists of a set of patterns induced...
The ubiquity of machine learning, particularly deep learning, applied to graphs is evident in applications ranging from cheminformatics (drug discovery) and bioinformatics (protein interaction prediction) to knowledge graph-based query answering, fraud detection, and social network analysis. Concurrently, graph data management deals with the research and development of effective, efficient, scalable, robust, and user-friendly systems and algorithms for storing, processing, and analyzing vast quantities of heterogeneous and complex...
Community search on attributed graphs (CSAG) is a fundamental topic in graph data mining. Given an attributed graph G and a query node q, CSAG seeks a structural- and attribute-cohesive subgraph from G that contains q. Exact methods based on traversal are time-consuming, especially for large graphs. Approximate methods improve efficiency by pruning the search space with heuristics but still take hundreds of milliseconds to tens of seconds to respond, hindering their use in time-sensitive applications. Moreover, these strategies are typically tailored...
High-dimensional vector similarity search (HVSS) is gaining prominence as a powerful tool for various data science and AI applications. As data scales up, in-memory indexes pose a significant challenge due to the substantial increase in main memory requirements. A potential solution involves leveraging a disk-based implementation, which stores and searches data on high-performance devices like NVMe SSDs. However, implementing HVSS for segments proves to be intricate in databases where a single machine comprises multiple...
We study the novel problem of jointly finding the top-k seed nodes and the top-r relevant tags for targeted influence maximization in a social network. The bulk of the research on influence maximization assumes that the diffusion probabilities across edges are fixed, and that the seed users are identified to maximize the cascade in the entire graph. However, in real-world applications, edge probabilities typically depend on the information being cascaded, e.g., in social networks, the probability that a tweet from some user will be re-tweeted by her followers depends on whether the tweet contains specific hashtags. In...
Graph summarization is beneficial in a wide range of applications, such as visualization, interactive and exploratory analysis, approximate query processing, reducing the on-disk storage footprint, and graph processing in modern hardware. However, the bulk of the literature on graph summarization surprisingly overlooks the possibility of having edges of different types. In this article, we study the novel problem of producing summaries of multi-relation networks, i.e., graphs where multiple edges of different types may exist between any pair of nodes. Multi-relation graphs are...
Bipartite graphs characterize relationships between two different sets of entities, like actor-movie, user-item, and author-paper. The butterfly, a 4-vertex, 4-edge (2,2)-biclique, is the simplest cohesive motif in a bipartite graph and the fundamental component of higher-order substructures. Counting and enumerating butterflies offer significant benefits across various applications, including fraud detection, graph embedding, and community search. While the corresponding motif, the triangle, in the unipartite graph has been widely...
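To make the butterfly definition above concrete, here is a minimal Python sketch that counts (2,2)-bicliques by brute-force pair counting. It is an illustration only, not the method of the paper summarized above; the function name `count_butterflies` and the `(left, right)` edge-pair input format are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Count butterflies in a bipartite graph.

    `edges` is an iterable of (left, right) pairs (hypothetical input format).
    Left-side labels must be mutually comparable so pairs can be ordered.
    """
    # Map each right-side vertex to the set of left-side vertices adjacent to it.
    right_to_left = defaultdict(set)
    for left, right in edges:
        right_to_left[right].add(left)

    # For every pair of left vertices, count the right vertices they share.
    shared = defaultdict(int)
    for lefts in right_to_left.values():
        for pair in combinations(sorted(lefts), 2):
            shared[pair] += 1

    # A pair of left vertices with c common neighbours forms C(c, 2) butterflies.
    return sum(c * (c - 1) // 2 for c in shared.values())
```

For example, the four edges of a complete 2-by-2 bipartite graph form exactly one butterfly. This pair-counting approach is quadratic in the degree of each right-side vertex, so it is only suitable for small graphs.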
Expert finding is crucial for a wealth of applications in both academia and industry. Given a user query and a trove of academic papers, expert finding aims at retrieving the most relevant experts for the query, from the academic papers. Existing studies focus on embedding-based solutions that consider academic papers’ textual semantic similarities to a query via document representation and extract the top-n similar papers. Beyond implicit textual semantics, however, explicit relationships (e.g., co-authorship) in a heterogeneous graph (e.g., DBLP) are critical for expert finding, because...
Computing the densest subgraph is a primitive graph operation with critical applications in detecting communities, events, and anomalies in biological, social, Web, and financial networks. In this paper, we study the novel problem of Most Probable Densest Subgraph (MPDS) discovery in uncertain graphs: Find the node set that is the most likely to induce a densest subgraph in an uncertain graph. We further extend our problem by considering various notions of density, e.g., clique and pattern densities, studying the top-k MPDSs, and finding the node set with the largest containment probability...
Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging applications, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is s-t reliability, which measures the probability that a target node t is reachable from a source node s in a probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned a probability of existence. Due to the inherent complexity of the s-t reliability estimation problem (#P-hard), various sampling and indexing based...
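As an illustration of the s-t reliability definition above, the following Python sketch estimates it with plain Monte Carlo sampling. It is a baseline written for this example, not any of the specific estimators the abstract refers to. It assumes directed edges that exist independently of one another, and the function name `estimate_reliability` and the `(u, v, p)` edge format are hypothetical.

```python
import random
from collections import defaultdict, deque


def estimate_reliability(edges, source, target, samples=10_000, seed=0):
    """Monte Carlo estimate of the probability that `target` is reachable from `source`.

    `edges` is an iterable of directed (u, v, p) triples; each edge is assumed
    to exist independently with probability p (an assumption of this sketch).
    """
    rng = random.Random(seed)
    adjacency = defaultdict(list)
    for u, v, p in edges:
        adjacency[u].append((v, p))

    hits = 0
    for _ in range(samples):
        # BFS over one sampled possible world. Each edge is sampled lazily,
        # at most once per world, when its tail node is dequeued.
        seen = {source}
        queue = deque([source])
        while queue and target not in seen:
            node = queue.popleft()
            for nxt, p in adjacency[node]:
                if nxt not in seen and rng.random() < p:
                    seen.add(nxt)
                    queue.append(nxt)
        hits += target in seen
    return hits / samples
```

On a two-edge path s → a → t where each edge has probability 0.5, the estimate should be close to 0.25. The sampling error shrinks only with the square root of the number of samples, which makes this baseline slow when high accuracy is needed.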
Similarity search, the task of identifying objects most similar to a given query object under a specific metric, has gathered significant attention due to its practical applications. However, the absence of coordinate information to accelerate similarity search and the high computational cost of measuring similarity hinder the efficiency of existing CPU-based methods. Additionally, these methods struggle to meet the demand for high throughput data management. To address these challenges, we propose GTS, a GPU-based tree index designed for parallel...
Generating explanations for graph neural networks (GNNs) is a crucial aspect of understanding their decision-making processes, especially for complex analytical tasks such as graph classification [1]–[3]. Existing approaches [4]–[13] in this field are limited to providing explanations for individual instances or specific class labels. The main focus of these methods is on defining input features, often in the shape of numerical encoding [14]. These methods fall short of providing targeted and configurable explanations for multiple class labels of interest. Additionally, existing...
Given a graph $G$ and a query node $q$, community search (CS) aims to find a cohesive subgraph from $G$ that contains $q$ as the desired community of $q$. CS is a fundamental problem in graph data analytics and has gained much research interest. Recently, a new thought of using a deep learning model to support CS has emerged. Supervised models using Graph Neural Networks are presented (i.e., neural community search). However, the lack of explicit consideration for features results in suboptimal embeddings in online inference, which adversely...
Retrieval-augmented Large Language Models (LLMs) have reshaped traditional query-answering systems, offering unparalleled user experiences. However, existing retrieval techniques often struggle to handle multi-modal query contexts. In this paper, we present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed framework and navigation graph index, integrated with cutting-edge LLMs. It comprises five core components: Data Preprocessing, Vector...
We investigate the novel problem of voting-based opinion maximization in a social network: Find a given number of seed nodes for a target campaigner, in the presence of other competing campaigns, so as to maximize a voting-based score for the target campaigner at a given time horizon. The bulk of the influence maximization literature assumes that social network users can switch between only two discrete states, inactive and active, and that their choice is frozen upon one-time activation. In reality, even when having a preferred opinion, a user may not completely despise the other opinions, and the preference...
Explaining the behavior of graph neural networks (GNNs) has become critical due to their "black-box" nature, especially in the context of analytical tasks such as graph classification. Current approaches are limited to providing explanations for individual instances or specific class labels and may return large explanation structures that are hard to access and not directly queryable. In this paper, we present GVEX [1] (Graph Views for GNN EXplanation), our system developed to offer user-friendly,...
Crowdsourcing is becoming increasingly important in entity resolution tasks due to their inherent complexity, such as in clustering of images and natural language processing. Humans can provide more insightful information for these difficult problems compared to machine-based automatic techniques. Nevertheless, human workers can make mistakes due to lack of domain expertise or seriousness, ambiguity, or even malicious intents. The bulk of the literature usually deals with human errors via majority voting or by assigning a...
The social network host has knowledge of the network structure and user characteristics and can earn a profit by providing merchants with viral marketing campaigns. We investigate the problem of profit maximization by leveraging performance incentives and flexibility. To incentivize the host's performance, we propose setting a desired influence threshold that would allow the host to receive full payment, with the possibility of a small bonus for exceeding the threshold. Unlike existing works that assume a user's choice is frozen once they are activated, we introduce...
Finding relevant experts in specified areas is often crucial for a wide range of applications in both academia and industry. Given a user input query and a large amount of academic knowledge (e.g., academic papers), expert finding aims to find and rank the experts who are most relevant to the given query, from the academic knowledge. Existing studies mainly focus on embedding-based solutions that (1) consider academic papers' textual semantic similarities to a query through document representation models and (2) extract <tex xmlns:mml="http://www.w3.org/1998/Math/MathML"...