NFDI4DS | UHH-SEMS - Publication Details

Zhifeng Bao

ORCID: 0000-0003-2477-381X

About

Contact & Profiles

A5080660416

Research Areas

Data Management and Algorithms
Advanced Database Systems and Queries
Human Mobility and Location-Based Analysis
Geographic Information Systems Studies
Semantic Web and Ontologies
Web Data Mining and Analysis
Traffic Prediction and Management Techniques
Data Quality and Management
Transportation Planning and Optimization
Advanced Graph Neural Networks
Time Series Analysis and Forecasting
Recommender Systems and Techniques
Complex Network Analysis Techniques
Algorithms and Data Compression
Topic Modeling
Data Mining Algorithms and Applications
Caching and Content Delivery
Anomaly Detection Techniques and Applications
Graph Theory and Algorithms
Transportation and Mobility Innovations
Privacy-Preserving Technologies in Data
Data Visualization and Analytics
Data Stream Mining Techniques
Peer-to-Peer Network Technologies
Advanced Image and Video Retrieval Techniques

MIT University
2015-2025

RMIT University
2016-2025

The Royal Melbourne Hospital
2017-2025

Zhejiang University
2019

Nanjing University of Aeronautics and Astronautics
2019

ResearchWorks (United States)
2019

University of Tasmania
2014-2015

National University of Singapore
2007-2014

Institute for Infocomm Research
2013

Yanshan University
2012

A Survey on Modern Deep Neural Network for Traffic Prediction: Trends, Methods and Challenges

OPENALEX - Publications

David Alexander Tedjopurnomo, Zhifeng Bao, Baihua Zheng, Farhana M. Choudhury, A. K. Qin

In this modern era, traffic congestion has become a major source of severe negative economic and environmental impact for urban areas worldwide. One of the most efficient ways to mitigate traffic congestion is through future traffic prediction. The research field of traffic prediction has evolved greatly ever since its inception in the late 70s. Earlier studies mainly use classical statistical models such as ARIMA and its variants. Recently, researchers have started to focus on machine learning models because of their power and flexibility. As theoretical...

10.1109/tkde.2020.3001195 article EN IEEE Transactions on Knowledge and Data Engineering 2020-01-01

Effective XML Keyword Search with Relevance Oriented Ranking

OPENALEX - Publications

Zhifeng Bao, Tok Wang Ling, Bo Chen, Jiaheng Lu

Inspired by the great success of information retrieval (IR) style keyword search on the Web, keyword search on XML has emerged recently. The difference between a text database and an XML database results in three new challenges: (1) Identify the user's search intention, i.e., identify the XML node types that the user wants to search for and search via. (2) Resolve keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of different XML node types and carry different meanings. (3) As the search results are sub-trees of the XML document, a new scoring function is needed to estimate their relevance to a given query. However, existing...

10.1109/icde.2009.16 article EN Proceedings - International Conference on Data Engineering 2009-03-01

DITA

OPENALEX - Publications

Zeyuan Shang, Guoliang Li, Zhifeng Bao

Trajectory analytics can benefit many real-world applications, e.g., frequent trajectory based navigation systems, road planning, car pooling, and transportation optimizations. Existing algorithms focus on optimizing this problem in a single machine. However, the amount of trajectories exceeds the storage and processing capability of a single machine, and it calls for large-scale trajectory analytics in distributed environments. Distributed trajectory analytics faces the challenges of data locality aware partitioning, load balance, easy-to-use interface, and versatility to...

10.1145/3183713.3183743 article EN Proceedings of the 2018 International Conference on Management of Data 2018-05-25

Crowdsourced POI labelling: Location-aware result inference and Task Assignment

OPENALEX - Publications

Huiqi Hu, Yudian Zheng, Zhifeng Bao, Guoliang Li, Jianhua Feng, and 1 more

Identifying the labels of points of interest (POIs), aka POI labelling, provides significant benefits in location-based services. However, the quality of raw labels manually added by users or generated by artificial algorithms cannot be guaranteed. Such low-quality labels decrease the usability and result in bad user experiences. In this paper, observing that crowdsourcing is a best-fit for computer-hard tasks, we leverage crowdsourcing to improve the quality of POI labelling. To our best knowledge, this is the first work on crowdsourced POI labelling tasks. In particular,...

10.1109/icde.2016.7498229 article EN 2016-05-01

Online Anomalous Trajectory Detection with Deep Generative Sequence Modeling

OPENALEX - Publications

Yiding Liu, Kaiqi Zhao, Gao Cong, Zhifeng Bao

Detecting anomalous trajectories has become an important and fundamental concern in many real-world applications. However, most of the existing studies 1) cannot handle the complexity and variety of trajectory data and 2) do not support efficient anomaly detection in an online manner. To this end, we propose a novel model, namely Gaussian Mixture Variational Sequence AutoEncoder (GM-VSAE), to tackle these challenges. Our GM-VSAE model is able to (1) capture complex sequential information enclosed in trajectories, (2) discover...

10.1109/icde48307.2020.00087 article EN 2020 IEEE 36th International Conference on Data Engineering (ICDE) 2020-04-01

Effective Travel Time Estimation: When Historical Trajectories over Road Networks Matter

OPENALEX - Publications

Haitao Yuan, Guoliang Li, Zhifeng Bao, Ling Feng

In this paper, we study the problem of origin-destination (OD) travel time estimation, where the OD input consists of an OD pair and a departure time. We propose a novel neural network based prediction model that fully exploits an important fact neglected by the literature: for a past OD trip, its travel time is usually affiliated with the trajectory it travels along, whereas the trajectory does not exist during prediction. At the training phase, our goal is to design representations for the OD input and its affiliated trajectory, such that they are close to each other in the latent space. First,...

10.1145/3318464.3389771 article EN 2020-05-29

Fast large-scale trajectory clustering

OPENALEX - Publications

Sheng Wang, Zhifeng Bao, J. Shane Culpepper, Timos Sellis, Xiaolin Qin

In this paper, we study the problem of large-scale trajectory data clustering, k-paths, which aims to efficiently identify k "representative" paths in a road network. Unlike traditional clustering approaches that require multiple data-dependent hyperparameters, k-paths can be used for visual exploration in applications such as traffic monitoring, public transit planning, and site selection. By combining map matching with an efficient intermediate representation of trajectories and a novel edge-based...

10.14778/3357377.3357380 article EN Proceedings of the VLDB Endowment 2019-09-01

A Framework of Joint Graph Embedding and Sparse Regression for Dimensionality Reduction

OPENALEX - Publications

Xiaoshuang Shi, Zhenhua Guo, Zhihui Lai, Yujiu Yang, Zhifeng Bao, and 1 more

Over the past few decades, a large number of algorithms have been developed for dimensionality reduction. Despite the different motivations of these algorithms, they can be interpreted by a common framework known as graph embedding. In order to explore the significant features of data, some sparse regression algorithms have been proposed based on graph embedding. However, the problem is that these algorithms include two separate steps: (1) embedding learning and (2) sparse regression. Thus their performance is largely determined by the effectiveness of the constructed graph. In this paper,...

10.1109/tip.2015.2405474 article EN IEEE Transactions on Image Processing 2015-02-19

Torch

OPENALEX - Publications

Sheng Wang, Zhifeng Bao, J. Shane Culpepper, Zizhe Xie, Qizhi Liu, and 1 more

This paper presents a new trajectory search engine called Torch for querying road network trajectory data. Torch is able to efficiently process two types of typical queries (similarity search and Boolean search), and to support a wide variety of trajectory similarity functions. Additionally, we propose a new similarity function, LORS, in Torch to measure the similarity in a more effective and efficient manner. Indexing works as follows. First, each raw vehicle trajectory is transformed to a set of road segments (edges) and a set of crossings (vertices) on the road network. Then a lightweight edge and vertex index called LEVI is built. Given a query,...

10.1145/3209978.3209989 article EN 2018-06-27

An Index Advisor Using Deep Reinforcement Learning

OPENALEX - Publications

Hai Lan, Zhifeng Bao, Yuwei Peng

We study the problem of index selection to maximize workload performance, which is critical to database systems. In contrast to existing methods, we seamlessly integrate index recommendation rules and deep reinforcement learning, such that we can recommend single-attribute and multi-attribute indexes together for complex queries and meanwhile support multiple-index access for a table. Specifically, we first propose five heuristic rules to generate the index candidates. Then, we formulate the index selection problem as a reinforcement learning task and employ Deep Q Network (DQN) on it....

10.1145/3340531.3412106 article EN 2020-10-19

Location-Centered House Price Prediction: A Multi-Task Learning Approach

OPENALEX - Publications

Guangliang Gao, Zhifeng Bao, Jie Cao, A. K. Qin, Timos Sellis

Accurate house price prediction is of great significance to various real estate stakeholders such as house owners, buyers, and investors. We propose a location-centered prediction framework that differs from existing work in terms of data profiling and prediction model. Regarding data profiling, we make an important observation as follows: besides the in-house features such as floor area, the location plays a critical role in house price prediction. Unfortunately, existing work either overlooked it or had a coarse grained measurement of locations. Thereby, we define and capture...

10.1145/3501806 article EN ACM Transactions on Intelligent Systems and Technology 2022-01-05

Semantic-Enhanced Representation Learning for Road Networks with Temporal Dynamics

OPENALEX - Publications

Yile Chen, Xiucheng Li, Gao Cong, Zhifeng Bao, Cheng Long

10.1109/tmc.2025.3562656 article EN IEEE Transactions on Mobile Computing 2025-01-01

DDE

OPENALEX - Publications

Liang Xu, Tok Wang Ling, Huayu Wu, Zhifeng Bao

Labeling schemes lie at the core of query processing for many XML database management systems. Designing labeling schemes for dynamic XML documents is an important problem that has received a lot of research attention. Existing dynamic labeling schemes, however, often sacrifice query performance and introduce additional labeling cost to facilitate arbitrary updates even when the documents actually seldom get updated. Since the line between static and dynamic XML documents is often blurred in practice, we believe it is important to design a labeling scheme that is compact and efficient regardless of whether the documents are frequently updated or...

10.1145/1559845.1559921 article EN 2009-06-29

Extended XML Tree Pattern Matching: Theories and Algorithms

OPENALEX - Publications

Jiaheng Lu, Tok Wang Ling, Zhifeng Bao, Chen Wang

As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that the holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes...

10.1109/tkde.2010.126 article EN IEEE Transactions on Knowledge and Data Engineering 2010-08-24

Location-Aware Pub/Sub System

OPENALEX - Publications

Long Guo, Dongxiang Zhang, Guoliang Li, Kian‐Lee Tan, Zhifeng Bao

In this paper, we propose a new location-aware pub/sub system, Elaps, that continuously monitors moving users subscribing to dynamic event streams from social media and E-commerce applications. Users are notified instantly when there is a matching event nearby. To the best of our knowledge, Elaps is the first to take into account continuous moving queries against dynamic event streams. Like existing works on continuous moving query processing, Elaps employs the concept of safe region to reduce communication overhead. However, unlike existing works which assume data...

10.1145/2723372.2746481 article EN 2015-05-27

Crowdsourcing-based real-time urban traffic speed estimation: From trends to speeds

OPENALEX - Publications

Huiqi Hu, Guoliang Li, Zhifeng Bao, Yan Cui, Jianhua Feng

Real-time urban traffic speed estimation provides significant benefits in many real-world applications. However, existing traffic information acquisition systems only obtain coarse-grained traffic information on a small number of roads but cannot acquire fine-grained traffic information on every road. To address this problem, in this paper we study the traffic speed estimation problem, which, given a budget K, identifies K roads (called seeds) where the real traffic speeds on these seeds can be obtained using crowdsourcing, and infers the speeds of other roads (called non-seed roads) based on the speeds of these seeds. This problem includes two...

10.1109/icde.2016.7498298 article EN 2016-05-01

Adaptive task scheduling strategy in cloud: when energy consumption meets performance guarantee

OPENALEX - Publications

Yao Shen, Zhifeng Bao, Xiaolin Qin, Jian Shen

10.1007/s11280-016-0382-4 article EN World Wide Web 2016-02-18

Efficient Selection of Geospatial Data on Maps for Interactive and Visualized Exploration

OPENALEX - Publications

Tao Guo, Kaiyu Feng, Gao Cong, Zhifeng Bao

With the proliferation of mobile devices, large collections of geospatial data are becoming available, such as geo-tagged photos. Map rendering systems play an important role in presenting such large geospatial datasets to end users. We propose that such systems should support the following desirable features: representativeness, visibility constraint, zooming consistency, and panning consistency. The first two constraints are fundamental challenges to a map exploration system, which aims to efficiently select a small set of representative objects...

10.1145/3183713.3183738 article EN Proceedings of the 2018 International Conference on Management of Data 2018-05-25

Robust Road Network Representation Learning

OPENALEX - Publications

Yile Chen, Xiucheng Li, Gao Cong, Zhifeng Bao, Cheng Long, and 3 more

In this work, we propose a robust road network representation learning framework called Toast, which comes to be a cornerstone to boost the performance of numerous demanding transport planning tasks. Specifically, we first propose a traffic context aware skip-gram module to incorporate auxiliary tasks of predicting the traffic context of a target road segment. Furthermore, we propose a trajectory-enhanced Transformer module that utilizes trajectory data to extract traveling semantics on road networks. Apart from obtaining effective road segment representations, this module also enables us...

10.1145/3459637.3482293 article EN 2021-10-26

An Effective Joint Prediction Model for Travel Demands and Traffic Flows

OPENALEX - Publications

Haitao Yuan, Guoliang Li, Zhifeng Bao, Ling Feng

In this paper, we study how to jointly predict travel demands and traffic flows for all regions of a city at a future time interval. From an empirical analysis of traffic data, we outline three desired properties, namely region-level correlations, temporal periodicity, and inter-traffic correlations. Then, we propose a comprehensive neural network based traffic prediction model, where various effective embeddings or encodings are designed to capture the aforementioned properties. First, we design region embeddings to capture two forms of region-level correlations:...

10.1109/icde51399.2021.00037 article EN 2021 IEEE 37th International Conference on Data Engineering (ICDE) 2021-04-01

Towards an Effective XML Keyword Search

OPENALEX - Publications

Zhifeng Bao, Jiaheng Lu, Tok Wang Ling, Bo Chen

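Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has emerged recently. The difference between a text database and an XML database results in three new challenges: 1) Identify the user's search intention, i.e., identify the XML node types that the user wants to search for and search via. 2) Resolve keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of different XML node types and carry different meanings; a keyword can appear as the tag name of different XML node types with different meanings. 3) As the search results are subtrees of the XML document, a new scoring function is needed to estimate their relevance to a given query. However,...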

10.1109/tkde.2010.63 article EN IEEE Transactions on Knowledge and Data Engineering 2010-04-28

Reverse $k$ Nearest Neighbor Search over Trajectories

OPENALEX - Publications

Sheng Wang, Zhifeng Bao, J. Shane Culpepper, Timos Sellis, Gao Cong

GPS enables mobile devices to continuously provide new opportunities to improve our daily lives. For example, the data collected in applications created by Uber or Public Transport Authorities can be used to plan transportation routes, estimate capacities, and proactively identify low coverage areas. In this paper, we study a new kind of query, Reverse k Nearest Neighbor Search over Trajectories (RkNNT), which can be used for route planning and capacity estimation. Given a set of existing routes D...

10.1109/tkde.2017.2776268 article EN IEEE Transactions on Knowledge and Data Engineering 2017-11-22

Differentially Private Triangle Counting in Large Graphs

OPENALEX - Publications

Xiaofeng Ding, Shujun Sheng, Huajian Zhou, Xiaodong Zhang, Zhifeng Bao, and 2 more

Triangle count is a critical parameter in mining relationships among people in social networks. However, directly publishing the findings obtained from triangle counts may bring potential privacy concerns, which raises great challenges and opportunities for privacy-preserving triangle counting. In this paper, we choose to use differential privacy to protect triangle counting for large scale graphs. To reduce the large sensitivity caused in large graphs, we propose a novel graph projection method that can be used to obtain an upper bound for sensitivity in different...

10.1109/tkde.2021.3052827 article EN IEEE Transactions on Knowledge and Data Engineering 2022-10-06

Updatable Learned Indexes Meet Disk-Resident DBMS - From Evaluations to Design Choices

OPENALEX - Publications

Hai Lan, Zhifeng Bao, J. Shane Culpepper, Renata Borovica‐Gajic

Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes on disk, and compare them against the B+-tree under a wide range of settings. Through our evaluation, we make some key observations: 1) Overall, the B+-tree performs well across a range of workload types and datasets. 2) A learned index could outperform the B+-tree or other learned indexes on disk for a specific workload. For example, PGM achieves the best performance...

10.1145/3589284 article EN Proceedings of the ACM on Management of Data 2023-06-13
