NFDI4DS | UHH-SEMS - Publication Details

Jeffrey Xu Yu

ORCID: 0000-0002-9738-827X

About

Contact & Profiles

A5075642293

Research Areas

Data Management and Algorithms
Advanced Database Systems and Queries
Graph Theory and Algorithms
Complex Network Analysis Techniques
Advanced Graph Neural Networks
Data Mining Algorithms and Applications
Semantic Web and Ontologies
Caching and Content Delivery
Algorithms and Data Compression
Web Data Mining and Analysis
Peer-to-Peer Network Technologies
Complexity and Algorithms in Graphs
Cloud Computing and Resource Management
Advanced Data Storage Technologies
Advanced Graph Theory Research
Rough Sets and Fuzzy Logic
Data Stream Mining Techniques
Time Series Analysis and Forecasting
Advanced Image and Video Retrieval Techniques
Opinion Dynamics and Social Influence
Distributed Systems and Fault Tolerance
Data Quality and Management
Optimization and Search Problems
Geographic Information Systems Studies
Human Mobility and Location-Based Analysis

Chinese University of Hong Kong
2016-2025

University of Hong Kong
2003-2024

Nanjing Normal University
2024

Kaiser Permanente
2023

Guangzhou University
2023

University of Technology Sydney
2023

University of California, San Diego
2020-2022

Baycrest Hospital
2022

Health Sciences Centre
2022

The University of Texas at Arlington
2021

Graph clustering based on structural/attribute similarities

OPENALEX - Publications

Yang Zhou Hong Cheng Jeffrey Xu Yu

The goal of graph clustering is to partition vertices in a large graph into different clusters based on various criteria such as vertex connectivity or neighborhood similarity. Graph clustering techniques are very useful for detecting densely connected groups in a large graph. Many existing graph clustering methods mainly focus on the topological structure for clustering, but largely ignore the vertex properties, which are often heterogeneous. In this paper, we propose a novel graph clustering algorithm, SA-Cluster, based on both structural and attribute similarities through a unified...

10.14778/1687627.1687709 article EN Proceedings of the VLDB Endowment 2009-08-01

Efficient similarity joins for near duplicate detection

OPENALEX - Publications

Chuan Xiao Wei Wang Xuemin Lin Jeffrey Xu Yu

With the increasing amount of data and the need to integrate data from multiple sources, a challenging issue is to find near duplicate records efficiently. In this paper, we focus on efficient algorithms to find pairs of records such that their similarities are above a given threshold. Several existing algorithms rely on the prefix filtering principle to avoid computing similarity values for all possible pairs of records. We propose new filtering techniques by exploiting the ordering information; they are integrated into the existing methods and drastically reduce the candidate sizes and hence...

10.1145/1367497.1367516 article EN 2008-04-21
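
The prefix filtering principle mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes Jaccard similarity over non-empty token sets and a global token order from rare to frequent; the function names `jaccard` and `prefix_filter_join` are hypothetical, and the sketch shows only the baseline principle, not the new filtering techniques the paper proposes.

```python
from collections import defaultdict
from math import ceil


def jaccard(a, b):
    """Jaccard similarity of two non-empty token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def prefix_filter_join(records, threshold):
    """Return index pairs (i, j), i < j, whose Jaccard similarity >= threshold.

    Assumption: every record is a non-empty list of tokens and
    0 < threshold <= 1.
    """
    # Global order: rarer tokens first, ties broken by the token itself.
    freq = defaultdict(int)
    for rec in records:
        for tok in set(rec):
            freq[tok] += 1
    sorted_recs = [sorted(set(rec), key=lambda t: (freq[t], t)) for rec in records]

    index = defaultdict(list)  # token -> ids of records with that token in their prefix
    results = []
    for rid, toks in enumerate(sorted_recs):
        # Two records can only reach the threshold if their prefixes share a token.
        # The small epsilon guards against float round-up before ceil.
        prefix_len = len(toks) - ceil(threshold * len(toks) - 1e-9) + 1
        candidates = set()
        for tok in toks[:prefix_len]:
            candidates.update(index[tok])
            index[tok].append(rid)
        # Verification step: compute the exact similarity for candidates only.
        for cid in candidates:
            if jaccard(toks, sorted_recs[cid]) >= threshold:
                results.append((cid, rid))
    return sorted(results)
```

Only record pairs that share a prefix token are verified, so unrelated records never have their similarity computed.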

Querying k-truss community in large and dynamic graphs

OPENALEX - Publications

Xin Huang Hong Cheng Lu Qin Wentao Tian Jeffrey Xu Yu

Community detection, which discovers densely connected structures in a network, has been studied a lot. In this paper, we study online community search, which is practically useful but less studied in the literature. Given a query vertex in a graph, the problem is to find meaningful communities that the vertex belongs to in an online manner. We propose a novel community model based on the k-truss concept, which brings nice structural and computational properties. We design a compact and elegant index structure which supports the efficient search of k-truss communities with a linear cost with respect to the community size. In addition,...

10.1145/2588555.2610495 article EN 2014-06-18

Taming verification hardness

OPENALEX - Publications

Haichuan Shang Ying Zhang Xuemin Lin Jeffrey Xu Yu

Graphs are widely used to model complicated data semantics in many applications. In this paper, we aim to develop efficient techniques to retrieve graphs, containing a given query graph, from a large set of graphs. Considering that the problem of testing subgraph isomorphism is generally NP-hard, most of the existing techniques are based on the framework of filtering-and-verification to reduce the precise computation costs; consequently, various novel feature-based indexes have been developed. While the existing techniques work well for small query graphs, the verification phase becomes...

10.14778/1453856.1453899 article EN Proceedings of the VLDB Endowment 2008-08-01

Finding Top-k Min-Cost Connected Trees in Databases

OPENALEX - Publications

Bolin Ding Jeffrey Xu Yu Shan Wang Lu Qin Xiao Zhang and 1 more

It is widely realized that the integration of database and information retrieval techniques will provide users with a wide range of high quality services. In this paper, we study processing an l-keyword query, p1, p2, ···, pl, against a relational database which can be modeled as a weighted graph, G(V, E). Here V is a set of nodes (tuples) and E is a set of edges representing foreign key references between tuples. Let Vi be a set of nodes that contain the keyword pi. We study finding top-k minimum cost connected trees that contain at least one node in every subset Vi, and denote...

10.1109/icde.2007.367929 article EN 2007-04-01

Efficient similarity joins for near-duplicate detection

OPENALEX - Publications

Chuan Xiao Wei Wang Xuemin Lin Jeffrey Xu Yu Guoren Wang

With the increasing amount of data and the need to integrate data from multiple sources, one of the challenging issues is to identify near-duplicate records efficiently. In this article, we focus on efficient algorithms to find a pair of records such that their similarities are no less than a given threshold. Several existing algorithms rely on the prefix filtering principle to avoid computing similarity values for all possible pairs of records. We propose new filtering techniques by exploiting the token ordering information; they are integrated into the existing methods...

10.1145/2000824.2000825 article EN ACM Transactions on Database Systems 2011-08-01

Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs

OPENALEX - Publications

Sen Hu Lei Zou Jeffrey Xu Yu Haixun Wang Dongyan Zhao

RDF question/answering (Q/A) allows users to ask questions in natural languages over a knowledge base represented by RDF. To answer a natural language question, the existing work takes a two-stage approach: question understanding and query evaluation. Their focus is on question understanding to deal with the disambiguation of the natural language phrases. The most common technique is joint disambiguation, which has an exponential search space. In this paper, we propose a systematic framework to answer natural language questions over an RDF repository (RDF Q/A) from a graph data-driven perspective. We propose a semantic...

10.1109/tkde.2017.2766634 article EN IEEE Transactions on Knowledge and Data Engineering 2017-10-26

Natural language question answering over RDF

OPENALEX - Publications

Lei Zou Ruizhe Huang Haixun Wang Jeffrey Xu Yu Wenqiang He and 1 more

RDF question/answering (Q/A) allows users to ask questions in natural languages over a knowledge base represented by RDF. To answer a natural language question, the existing work takes a two-stage approach: question understanding and query evaluation. Their focus is on question understanding to deal with the disambiguation of the natural language phrases. The most common technique is joint disambiguation, which has an exponential search space. In this paper, we propose a systematic framework to answer natural language questions over an RDF repository (RDF Q/A) from a graph data-driven perspective. We...

10.1145/2588555.2610525 article EN 2014-06-18

Influential community search in large networks

OPENALEX - Publications

Rong-Hua Li Lu Qin Jeffrey Xu Yu Rui Mao

Community search is a problem of finding densely connected subgraphs that satisfy the query conditions in a network, which has attracted much attention in recent years. However, all the previous studies on community search do not consider the influence of a community. In this paper, we introduce a novel community model called k-influential community based on the concept of k-core, which can capture the influence of a community. Based on the new community model, we propose a linear-time online search algorithm to find the top-r k-influential communities in a network. To further speed up the influential community search algorithm, we devise a linear-space...

10.14778/2735479.2735484 article EN Proceedings of the VLDB Endowment 2015-01-01

Approximate closest community search in networks

OPENALEX - Publications

Xin Huang Laks V. S. Lakshmanan Jeffrey Xu Yu Hong Cheng

Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from the query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive...

10.14778/2856318.2856323 article EN Proceedings of the VLDB Endowment 2015-12-01

Target-aware Holistic Influence Maximization in Spatial Social Networks

OPENALEX - Publications

Taotao Cai Jianxin Li Ajmal Mian Ronghua Li Timos Sellis and 1 more

Influence maximization has recently received significant attention for scheduling online campaigns or advertisements on social network platforms. However, most studies only focus on user influence via cyber interactions while ignoring their physical interactions, which are also essential to gauge influence propagation. Additionally, targeted campaigns or advertisements have not received sufficient attention. To address these issues, we first devise a novel holistic influence diffusion model that takes into account both cyber and physical user interactions in an effective and practical way. Based on the...

10.1109/tkde.2020.3003047 article EN IEEE Transactions on Knowledge and Data Engineering 2020-01-01

Dual Labeling: Answering Graph Reachability Queries in Constant Time

OPENALEX - Publications

Haixun Wang Hao He Jun Yang Philip S. Yu Jeffrey Xu Yu

Graph reachability is fundamental to a wide range of applications, including XML indexing, geographic navigation, Internet routing, ontology queries based on RDF/OWL, etc. Many applications involve huge graphs and require fast answering of reachability queries. Several reachability labeling methods have been proposed for this purpose. They assign labels to the vertices, such that the reachability between any two vertices may be decided using their labels only. For sparse graphs, 2-hop based reachability labeling schemes answer reachability queries efficiently using relatively small label space....

10.1109/icde.2006.53 article EN 2006-01-01

Finding time-dependent shortest paths over large graphs

OPENALEX - Publications

Bolin Ding Jeffrey Xu Yu Lu Qin

The spatial and temporal databases have been studied widely and intensively over years. In this paper, we study how to answer queries of finding the best departure time that minimizes the total travel time from a place to another, over a road network, where the traffic conditions dynamically change from time to time. We study a generalized form of this problem, called the time-dependent shortest-path problem. A time-dependent graph GT is a graph that has an edge-delay function, wi,j(t), associated with each edge (vi, vj), and can be stored in a database. The edge-delay function wi,j(t) specifies how much time it...

10.1145/1353343.1353371 article EN 2008-03-25
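
To make the edge-delay idea in the abstract above concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes FIFO (non-overtaking) delay functions so that a Dijkstra-style search is valid for a fixed departure time, and it picks the best departure time only from a finite list of candidates. The names `earliest_arrival` and `best_departure`, and the graph layout, are hypothetical.

```python
import heapq


def earliest_arrival(graph, source, target, depart):
    """Earliest arrival time at target when leaving source at time depart.

    graph maps a node to a list of (neighbor, delay_fn) pairs, where
    delay_fn(t) is the travel time of that edge when entered at time t.
    Assumption: t + delay_fn(t) is non-decreasing in t (FIFO property).
    """
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, delay in graph.get(u, []):
            arrival = t + delay(t)
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return float("inf")


def best_departure(graph, source, target, candidates):
    """Candidate departure time with the smallest total travel time."""
    return min(
        candidates,
        key=lambda t: earliest_arrival(graph, source, target, t) - t,
    )
```

A usage example under the same assumptions, where the edge a-b is congested early and clears later:

```python
road = {
    "a": [("b", lambda t: max(2, 10 - t))],
    "b": [("c", lambda t: 1)],
}
print(best_departure(road, "a", "c", [0, 4, 8, 9]))
```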

Text classification without negative examples revisit

OPENALEX - Publications

Gabriel Pui Cheong Fung Jeffrey Xu Yu Hongjun Lü Philip S. Yu

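Traditionally, building a classifier requires two sets of examples: positive examples and negative examples. This paper studies the problem of building a text classifier using positive examples (P) and unlabeled examples (U). The unlabeled examples are mixed with both positive and negative examples. Since no negative example is given explicitly, the task of building a reliable text classifier becomes far more challenging. Simply treating all of the unlabeled examples as negative examples and building a classifier thereafter is undoubtedly a poor approach to tackling this problem. Generally speaking, most of the studies solved this problem by a two-step heuristic: first, extract negative examples (N) from U. Second, build a classifier based on P and N. Surprisingly, most studies did not...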

10.1109/tkde.2006.16 article EN IEEE Transactions on Knowledge and Data Engineering 2006-01-01

Clustering Large Attributed Graphs: An Efficient Incremental Approach

OPENALEX - Publications

Yang Zhou Hong Cheng Jeffrey Xu Yu

In recent years, many networks have become available for analysis, including social networks, sensor networks, biological networks, etc. Graph clustering has shown its effectiveness in analyzing and visualizing large networks. The goal of graph clustering is to partition vertices in a large graph into clusters based on various criteria such as vertex connectivity or neighborhood similarity. Many existing graph clustering methods mainly focus on the topological structures, but largely ignore the vertex properties, which are often heterogeneous. Recently, a new graph clustering algorithm,...

10.1109/icdm.2010.41 article EN 2010-12-01

Fast Graph Pattern Matching

OPENALEX - Publications

Jiefeng Cheng Jeffrey Xu Yu Bolin Ding Philip S. Yu Haixun Wang

Due to the rapid growth of Internet technology and new scientific/technological advances, the number of applications that model data as graphs increases, because graphs have high expressive power to model complicated structures. The dominance of graphs in real-world applications asks for new graph data management so that users can access graph data effectively and efficiently. In this paper, we study a graph pattern matching problem over a large data graph. The problem is to find all patterns in a large data graph that match a user-given graph pattern. We propose a new two-step R-join (reachability join) algorithm with a filter step...

10.1109/icde.2008.4497500 article EN 2008-04-01

Efficient Core Maintenance in Large Dynamic Graphs

OPENALEX - Publications

Rong-Hua Li Jeffrey Xu Yu Rui Mao

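The k-core decomposition in a graph is a fundamental problem for social network analysis. The problem of k-core decomposition is to calculate the core number for every node in a graph. Previous studies mainly focus on k-core decomposition in a static graph. There exists a linear time algorithm for k-core decomposition in a static graph. However, in many real-world applications such as online social networks and the Internet, the graph typically evolves over time. In such applications, a key issue is to maintain the core numbers of nodes when the graph changes over time. A simple implementation is to perform the linear time algorithm to recompute the core number for every node after the graph is updated. Such a simple implementation is expensive when the graph is very large. In this paper, we propose a new...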

10.1109/tkde.2013.158 article EN IEEE Transactions on Knowledge and Data Engineering 2013-09-27
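
The core number computation described in the abstract above can be sketched in Python with the standard peeling idea: repeatedly remove a vertex of minimum remaining degree. This is a simplified illustration for a static, undirected graph, not the paper's maintenance algorithm, and it runs in quadratic time rather than the linear time of the bucket-based version; the function name `core_numbers` and the adjacency format are assumptions.

```python
def core_numbers(adj):
    """Core number of every vertex in an undirected graph.

    adj maps each vertex to the set of its neighbors (assumed symmetric).
    """
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off a vertex with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

Recomputing from scratch like this after every edge update is the simple approach the abstract calls expensive for very large graphs.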

Community detection in social networks

OPENALEX - Publications

Meng Wang Chaokun Wang Jeffrey Xu Yu Jun Zhang

Revealing the latent community structure, which is crucial to understanding the features of networks, is an important problem in network and graph analysis. During the last decade, many approaches have been proposed to solve this challenging problem in diverse ways, i.e. different measures or data structures. Unfortunately, experimental reports on existing techniques fell short in validity and integrity since many comparisons were not based on a unified code base or merely discussed in theory. We engage in an in-depth benchmarking study...

10.14778/2794367.2794370 article EN Proceedings of the VLDB Endowment 2015-06-01

Clustering Large Attributed Graphs

OPENALEX - Publications

Hong Cheng Yang Zhou Jeffrey Xu Yu

Social networks, sensor networks, biological networks, and many other information networks can be modeled as a large graph. Graph vertices represent entities, and graph edges represent their relationships or interactions. In many large graphs, there is usually one or more attributes associated with every graph vertex to describe its properties. In many application domains, graph clustering techniques are very useful for detecting densely connected groups in a large graph as well as for understanding and visualizing a large graph. The goal of graph clustering is to partition vertices in a large graph into different clusters based on various criteria...

10.1145/1921632.1921638 article EN ACM Transactions on Knowledge Discovery from Data 2011-02-01

Sliding-window top-k queries on uncertain streams

OPENALEX - Publications

Cheqing Jin Ke Yi Lei Chen Jeffrey Xu Yu Xuemin Lin

Query processing on uncertain data streams has attracted a lot of attention lately, due to the imprecise nature of the data generated from a variety of streaming applications, such as readings from a sensor network. However, all of the existing works on uncertain data streams study unbounded streams. This paper takes the first step towards the important and challenging problem of answering sliding-window queries on uncertain data streams, with a focus on arguably one of the most important types of queries: top-k queries. The challenge of answering sliding-window top-k queries on uncertain data streams stems from the strict space and time requirements of processing both arriving...

10.14778/1453856.1453892 article EN Proceedings of the VLDB Endowment 2008-08-01

Finding maximal cliques in massive networks

OPENALEX - Publications

James Cheng Yiping Ke Ada Wai-Chee Fu Jeffrey Xu Yu Linhong Zhu

Maximal clique enumeration is a fundamental problem in graph theory and has important applications in many areas such as social network analysis and bioinformatics. The problem is extensively studied; however, the best existing algorithms require memory space linear in the size of the input graph. This has become a serious concern in view of the massive volume of today's fast-growing networks. We propose a general framework for designing external-memory algorithms for maximal clique enumeration in large graphs. The general framework enables maximal clique enumeration to be processed recursively in small subgraphs of the input graph, thus...

10.1145/2043652.2043654 article EN ACM Transactions on Database Systems 2011-12-01
