Xiangyu Ke

ORCID: 0000-0001-8082-7398
Research Areas
  • Complex Network Analysis Techniques
  • Advanced Graph Neural Networks
  • Data Management and Algorithms
  • Caching and Content Delivery
  • Graph Theory and Algorithms
  • Mobile Crowdsensing and Crowdsourcing
  • Advanced Image and Video Retrieval Techniques
  • Bayesian Modeling and Causal Inference
  • Multi-Criteria Decision Making
  • Recommender Systems and Techniques
  • Expert finding and Q&A systems
  • Algorithms and Data Compression
  • Opinion Dynamics and Social Influence
  • Topic Modeling
  • Privacy-Preserving Technologies in Data
  • Explainable Artificial Intelligence (XAI)
  • Multimodal Machine Learning Applications
  • Peer-to-Peer Network Technologies
  • Network Packet Processing and Optimization
  • Data-Driven Disease Surveillance
  • Data Quality and Management
  • Metabolomics and Mass Spectrometry Studies
  • Semantic Web and Ontologies
  • Complexity and Algorithms in Graphs
  • Social Media and Politics

Zhejiang University
2023-2025

Ningbo University
2024

Hangzhou Dianzi University
2023

Nanyang Technological University
2017-2022

Generating explanations for graph neural networks (GNNs) has been studied to understand their behavior in analytical tasks such as graph classification. Existing approaches aim to explain the overall results of GNNs rather than providing explanations for specific class labels of interest, and may return explanation structures that are hard to access and not directly queryable. We propose GVEX, a novel paradigm that generates Graph Views for EXplanation. (1) We design a two-tier explanation structure called explanation views. An explanation view consists of a set of graph patterns and a set of induced...

10.1145/3639295 article EN Proceedings of the ACM on Management of Data 2024-03-12

The ubiquity of machine learning, particularly deep learning, applied to graphs is evident in applications ranging from cheminformatics (drug discovery) and bioinformatics (protein interaction prediction) to knowledge graph-based query answering, fraud detection, and social network analysis. Concurrently, graph data management deals with the research and development of effective, efficient, scalable, robust, and user-friendly systems and algorithms for storing, processing, and analyzing vast quantities of heterogeneous and complex...

10.48550/arxiv.2502.00529 preprint EN arXiv (Cornell University) 2025-02-01

Community search on attributed graphs (CSAG) is a fundamental topic in graph data mining. Given an attributed graph G and a query node q, CSAG seeks a structural- and attribute-cohesive subgraph from G that contains q. Exact methods based on graph traversal are time-consuming, especially for large graphs. Approximate methods improve efficiency by pruning the search space with heuristics but still take hundreds of milliseconds to tens of seconds to respond, hindering their use in time-sensitive applications. Moreover, pruning strategies are typically tailored...

10.1145/3709672 article EN Proceedings of the ACM on Management of Data 2025-02-10

High-dimensional vector similarity search (HVSS) is gaining prominence as a powerful tool for various data science and AI applications. As vector data scales up, in-memory indexes pose a significant challenge due to the substantial increase in main memory requirements. A potential solution involves leveraging a disk-based implementation, which stores and searches vector data on high-performance devices like NVMe SSDs. However, implementing HVSS for data segments proves to be intricate in vector databases where a single machine comprises multiple...

10.1145/3639269 article EN Proceedings of the ACM on Management of Data 2024-03-12

10.1145/3626772.3657771 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

10.1109/icde60146.2024.00214 article EN 2024 IEEE 40th International Conference on Data Engineering (ICDE) 2024-05-13

We study the novel problem of jointly finding the top-k seed nodes and the top-r relevant tags for targeted influence maximization in a social network. The bulk of the research on influence maximization assumes that the influence diffusion probabilities across edges are fixed, and that the top-k seed users are identified to maximize the cascade in the entire graph. However, in real-world applications, edge probabilities typically depend on the information being cascaded, e.g., in social networks, the probability that a tweet from some user will be re-tweeted by her followers depends on whether the tweet contains specific hashtags. In...

10.1145/3183713.3199670 article EN Proceedings of the 2018 International Conference on Management of Data 2018-05-25

Graph summarization is beneficial in a wide range of applications, such as visualization, interactive and exploratory analysis, approximate query processing, reducing the on-disk storage footprint, and graph processing in modern hardware. However, the bulk of the literature on graph summarization surprisingly overlooks the possibility of having edges of different types. In this article, we study the novel problem of producing summaries of multi-relation networks, i.e., graphs where multiple edges of different types may exist between any pair of nodes. Multi-relation graphs are...

10.1145/3494561 article EN ACM Transactions on Knowledge Discovery from Data 2022-03-09

Bipartite graphs characterize relationships between two different sets of entities, like actor-movie, user-item, and author-paper. The butterfly, a 4-vertex, 4-edge (2,2)-biclique, is the simplest cohesive motif in a bipartite graph and the fundamental component of higher-order substructures. Counting and enumerating butterflies offer significant benefits across various applications, including fraud detection, graph embedding, and community search. While the corresponding motif in the unipartite graph, the triangle, has been widely...

10.14778/3636218.3636223 article EN Proceedings of the VLDB Endowment 2023-12-01
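
To make the butterfly definition in the abstract above concrete, here is a minimal sketch of exact butterfly counting by pairing same-side vertices and counting their shared neighbors. It is a textbook baseline, not the method of the cited paper; the function name `count_butterflies` and the adjacency-dictionary input format are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def count_butterflies(adj):
    """Count butterflies, i.e. (2,2)-bicliques, in a bipartite graph.

    `adj` (hypothetical input format) maps each vertex on one side
    to the set of its neighbors on the other side.
    """
    total = 0
    # Every pair of same-side vertices with c shared neighbors
    # contributes C(c, 2) butterflies.
    for u, v in combinations(adj, 2):
        shared = len(adj[u] & adj[v])
        total += comb(shared, 2)
    return total


# Toy actor-movie graph, invented for illustration.
example = {
    "a1": {"m1", "m2", "m3"},
    "a2": {"m1", "m2"},
    "a3": {"m2", "m3"},
}
print(count_butterflies(example))
```

In the toy graph, a1 and a2 share two movies and a1 and a3 share two movies, so each pair forms one butterfly. The pairwise loop is quadratic in the number of same-side vertices, so it is only suitable for small graphs.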

Expert finding is crucial for a wealth of applications in both academia and industry. Given a user query and a trove of academic papers, expert finding aims at retrieving the most relevant experts for the query, from the academic papers. Existing studies focus on embedding-based solutions that consider academic papers’ textual semantic similarities to a query via document representation and extract the top-n experts from the most similar papers. Beyond implicit textual semantics, however, explicit relationships (e.g., co-authorship) in a heterogeneous graph (e.g., DBLP) are critical for expert finding, because...

10.1145/3578365 article EN ACM Transactions on Knowledge Discovery from Data 2023-02-09

Computing the densest subgraph is a primitive graph operation with critical applications in detecting communities, events, and anomalies in biological, social, Web, and financial networks. In this paper, we study the novel problem of Most Probable Densest Subgraph (MPDS) discovery in uncertain graphs: Find the node set that is the most likely to induce a densest subgraph in an uncertain graph. We further extend our problem by considering various notions of density, e.g., clique and pattern densities, studying the top-k MPDSs, and finding the node set with the largest containment probability...

10.1109/icde55515.2023.00115 article EN 2023 IEEE 39th International Conference on Data Engineering (ICDE) 2023-04-01

Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging applications, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is the s-t reliability, which measures the probability that a target node t is reachable from a source node s in a probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned a probability of existence. Due to the inherent complexity of the s-t reliability estimation problem (#P-hard), various sampling and indexing based...

10.14778/3324301.3324304 article EN Proceedings of the VLDB Endowment 2019-04-01
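
The s-t reliability defined in the abstract above can be illustrated with a plain Monte Carlo estimator: sample possible worlds by keeping each edge independently with its existence probability, then check reachability with BFS. This is a minimal sketch of the basic sampling baseline, not any specific estimator evaluated in the cited paper; the function name `estimate_reliability`, the edge-dictionary format, and the directed-edge assumption are all illustrative.

```python
import random
from collections import deque


def estimate_reliability(edges, s, t, samples=10000, seed=0):
    """Monte Carlo estimate of the probability that t is reachable from s.

    `edges` (hypothetical input format) maps a directed edge (u, v)
    to its independent existence probability.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one possible world: keep each edge with its probability.
        adj = {}
        for (u, v), p in edges.items():
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        # BFS from s in the sampled world.
        seen = {s}
        queue = deque([s])
        while queue:
            node = queue.popleft()
            if node == t:
                hits += 1
                break
            for nxt in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return hits / samples


# Two-edge path with probability 0.5 per edge; the exact reliability is 0.25.
path = {("s", "a"): 0.5, ("a", "t"): 0.5}
print(estimate_reliability(path, "s", "t", samples=20000))
```

The estimate converges to the true reliability as the number of samples grows, at the cost of one graph traversal per sample.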

Similarity search, the task of identifying the objects most similar to a given query object under a specific metric, has gathered significant attention due to its practical applications. However, the absence of coordinate information to accelerate similarity search and the high computational cost of measuring object similarity hinder the efficiency of existing CPU-based methods. Additionally, these methods struggle to meet the demand for high-throughput data management. To address these challenges, we propose GTS, a GPU-based tree index designed for parallel...

10.1145/3654945 preprint EN arXiv (Cornell University) 2024-04-01

Generating explanations for graph neural networks (GNNs) is a crucial aspect of understanding their decision-making processes, especially for complex analytical tasks such as graph classification [1]–[3]. Existing approaches [4]–[13] in this field are limited to providing explanations for individual instances or specific class labels. The main focus of these methods is on defining explanations as input features, often in the shape of numerical encoding [14]. These methods fall short of providing targeted and configurable explanations for multiple class labels of interest. Additionally, existing...

10.1109/icdew61823.2024.00058 article EN 2024-05-13

Given a graph $G$ and a query node $q$, community search (CS) aims to find a cohesive subgraph from $G$ that contains $q$ as the desired community of $q$. CS is a fundamental problem in graph data analytics and has gained much research interest. Recently, a new thought of using a deep learning model to support CS has emerged. Supervised models using Graph Neural Networks are presented (i.e., neural community search). However, the lack of explicit consideration for features results in suboptimal embeddings in online inference, which adversely...

10.21203/rs.3.rs-4640804/v1 preprint EN Research Square (Research Square) 2024-07-18

10.1109/icde60146.2024.00247 article EN 2024 IEEE 40th International Conference on Data Engineering (ICDE) 2024-05-13

Retrieval-augmented Large Language Models (LLMs) have reshaped traditional query-answering systems, offering unparalleled user experiences. However, existing retrieval techniques often struggle to handle multi-modal query contexts. In this paper, we present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed framework and navigation graph index, integrated with cutting-edge LLMs. It comprises five core components: Data Preprocessing, Vector...

10.14778/3685800.3685868 article EN Proceedings of the VLDB Endowment 2024-08-01

We investigate the novel problem of voting-based opinion maximization in a social network: Find a given number of seed nodes for a target campaigner, in the presence of other competing campaigns, so as to maximize a voting-based score for the target campaigner at a given time horizon. The bulk of the influence maximization literature assumes that social network users can switch between only two discrete states, inactive and active, and that the choice is frozen upon one-time activation. In reality, even when having a preferred opinion, a user may not completely despise the other opinions, and the preference...

10.1109/icde55515.2023.00048 article EN 2023 IEEE 39th International Conference on Data Engineering (ICDE) 2023-04-01

Explaining the behavior of graph neural networks (GNNs) has become critical due to their "black-box" nature, especially in the context of analytical tasks such as graph classification. Current approaches are limited to providing explanations for individual instances or specific class labels and may return large explanation structures that are hard to access and not directly queryable. In this paper, we present GVEX [1] (Graph Views for GNN EXplanation), our system developed to offer user-friendly,...

10.1145/3626246.3654735 article EN 2024-05-23

Crowdsourcing is becoming increasingly important in entity resolution tasks due to their inherent complexity, such as clustering of images and natural language processing. Humans can provide more insightful information for these difficult problems compared to machine-based automatic techniques. Nevertheless, human workers can make mistakes due to lack of domain expertise or seriousness, ambiguity, or even malicious intents. The bulk of the literature usually deals with human errors via majority voting or by assigning a...

10.1145/3132847.3132876 article EN 2017-11-06

The social network host has knowledge of the network structure and user characteristics and can earn a profit by providing merchants with viral marketing campaigns. We investigate the problem of host profit maximization by leveraging performance incentives and user flexibility. To incentivize the host's performance, we propose setting a desired influence threshold that would allow the host to receive full payment, with the possibility of a small bonus for exceeding the threshold. Unlike existing works that assume a user's choice is frozen once they are activated, we introduce...

10.14778/3617838.3617843 article EN Proceedings of the VLDB Endowment 2023-09-01

Finding relevant experts in specified areas is often crucial for a wide range of applications in both academia and industry. Given a user input query and a large amount of academic knowledge (e.g., academic papers), expert finding aims to find and rank the experts who are most relevant to the given query, from the academic knowledge. Existing studies mainly focus on embedding-based solutions that (1) consider academic papers' textual semantic similarities through document representation models and (2) extract...

10.1109/icde53745.2022.00030 article EN 2022 IEEE 38th International Conference on Data Engineering (ICDE) 2022-05-01