- Data Management and Algorithms
- Advanced Database Systems and Queries
- Human Mobility and Location-Based Analysis
- Geographic Information Systems Studies
- Semantic Web and Ontologies
- Web Data Mining and Analysis
- Traffic Prediction and Management Techniques
- Data Quality and Management
- Transportation Planning and Optimization
- Advanced Graph Neural Networks
- Time Series Analysis and Forecasting
- Recommender Systems and Techniques
- Complex Network Analysis Techniques
- Algorithms and Data Compression
- Topic Modeling
- Data Mining Algorithms and Applications
- Caching and Content Delivery
- Anomaly Detection Techniques and Applications
- Graph Theory and Algorithms
- Transportation and Mobility Innovations
- Privacy-Preserving Technologies in Data
- Data Visualization and Analytics
- Data Stream Mining Techniques
- Peer-to-Peer Network Technologies
- Advanced Image and Video Retrieval Techniques
MIT University
2015-2025
RMIT University
2016-2025
The Royal Melbourne Hospital
2017-2025
Zhejiang University
2019
Nanjing University of Aeronautics and Astronautics
2019
ResearchWorks (United States)
2019
University of Tasmania
2014-2015
National University of Singapore
2007-2014
Institute for Infocomm Research
2013
Yanshan University
2012
In this modern era, traffic congestion has become a major source of severe negative economic and environmental impact for urban areas worldwide. One of the most efficient ways to mitigate traffic congestion is through future traffic prediction. The research field of traffic prediction has evolved greatly ever since its inception in the late 70s. Earlier studies mainly use classical statistical models such as ARIMA and its variants. Recently, researchers have started to focus on machine learning models because of their power and flexibility. As theoretical...
Inspired by the great success of information retrieval (IR) style keyword search on the Web, keyword search on XML has emerged recently. The difference between a text database and an XML database results in three new challenges: (1) Identify the user search intention, i.e., identify the XML node types that the user wants to search for and search via. (2) Resolve keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of different XML node types and carry different meanings. (3) As the search results are sub-trees of the XML document, a new scoring function is needed to estimate its relevance to a given query. However, existing...
Trajectory analytics can benefit many real-world applications, e.g., frequent trajectory based navigation systems, road planning, car pooling, and transportation optimizations. Existing algorithms focus on optimizing this problem on a single machine. However, the amount of trajectories exceeds the storage and processing capability of a single machine, and it calls for large-scale trajectory analytics in distributed environments. Distributed trajectory analytics faces challenges of data locality aware partitioning, load balance, easy-to-use interface, and versatility to...
Identifying the labels of points of interest (POIs), aka POI labelling, provides significant benefits in location-based services. However, the quality of raw labels manually added by users or generated by artificial algorithms cannot be guaranteed. Such low-quality labels decrease the usability and result in bad user experiences. In this paper, observing that crowdsourcing is a best-fit for computer-hard tasks, we leverage crowdsourcing to improve the quality of POI labelling. To our best knowledge, this is the first work on crowdsourced POI labelling tasks. In particular,...
Detecting anomalous trajectories has become an important and fundamental concern in many real-world applications. However, most of the existing studies 1) cannot handle the complexity and variety of trajectory data and 2) do not support efficient anomaly detection in an online manner. To this end, we propose a novel model, namely Gaussian Mixture Variational Sequence AutoEncoder (GM-VSAE), to tackle these challenges. Our GM-VSAE model is able to (1) capture complex sequential information enclosed in trajectories, (2) discover...
In this paper, we study the problem of origin-destination (OD) travel time estimation where the OD input consists of an OD pair and a departure time. We propose a novel neural network based prediction model that fully exploits an important fact neglected by the literature -- for a past OD trip its travel time is usually affiliated with the trajectory it travels along, whereas the trajectory does not exist during prediction. At the training phase, our goal is to design representations of the OD input and the trajectory, such that they are close to each other in the latent space. First,...
In this paper, we study the problem of large-scale trajectory data clustering, k-paths, which aims to efficiently identify k "representative" paths in a road network. Unlike traditional clustering approaches that require multiple data-dependent hyperparameters, k-paths can be used for visual exploration in applications such as traffic monitoring, public transit planning, and site selection. By combining map matching with an efficient intermediate representation of trajectories and a novel edge-based...
Over the past few decades, a large number of algorithms have been developed for dimensionality reduction. Despite the different motivations of these algorithms, they can be interpreted by a common framework known as graph embedding. In order to explore the significant features of data, some sparse regression algorithms have been proposed based on graph embedding. However, the problem is that these algorithms include two separate steps: (1) embedding learning and (2) sparse regression. Thus their performance is largely determined by the effectiveness of the constructed graph. In this paper,...
This paper presents a new trajectory search engine called Torch for querying road network trajectory data. Torch is able to efficiently process two types of typical queries (similarity search and Boolean search), and supports a wide variety of trajectory similarity functions. Additionally, we propose a new similarity function LORS in Torch to measure the similarity in a more effective and efficient manner. Indexing and search in Torch works as follows. First, each raw vehicle trajectory is transformed to a set of road segments (edges) and a set of crossings (vertices) on the road network. Then a lightweight edge and vertex index called LEVI is built. Given a query,...
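The edge-based indexing idea described in the Torch abstract can be illustrated with a minimal sketch. This is not Torch's LEVI index; it is a plain inverted index, assuming trajectories have already been map-matched to integer edge IDs. The class name `EdgeInvertedIndex` and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class EdgeInvertedIndex:
    """Hypothetical inverted index: road-segment (edge) ID -> trajectory IDs."""

    def __init__(self) -> None:
        self._postings: Dict[int, Set[int]] = defaultdict(set)

    def add(self, traj_id: int, edge_ids: Iterable[int]) -> None:
        """Register a map-matched trajectory under every edge it traverses."""
        for edge_id in edge_ids:
            self._postings[edge_id].add(traj_id)

    def query_all(self, edge_ids: Iterable[int]) -> Set[int]:
        """Boolean AND: trajectories that traverse every given edge."""
        posting_sets = [self._postings.get(e, set()) for e in edge_ids]
        if not posting_sets:
            return set()
        # Intersect starting from the shortest posting list to keep work small.
        posting_sets.sort(key=len)
        result = set(posting_sets[0])
        for postings in posting_sets[1:]:
            result &= postings
            if not result:
                break
        return result

    def query_any(self, edge_ids: Iterable[int]) -> Set[int]:
        """Boolean OR: trajectories that traverse at least one given edge."""
        result: Set[int] = set()
        for edge_id in edge_ids:
            result |= self._postings.get(edge_id, set())
        return result
```

For example, after adding trajectory 1 with edges `[10, 11, 12]` and trajectory 2 with edges `[11, 12, 13]`, `query_all([11, 12])` returns both IDs, while `query_any([10])` returns only trajectory 1.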
We study the problem of index selection to maximize the workload performance, which is critical to database systems. In contrast to existing methods, we seamlessly integrate index recommendation rules and deep reinforcement learning, such that we can recommend single-attribute and multi-attribute indexes together for complex queries and meanwhile support multiple-index access for a table. Specifically, we first propose five heuristic rules to generate index candidates. Then, we formulate the index selection problem as a reinforcement learning task and employ Deep Q Network (DQN) on it....
Accurate house price prediction is of great significance to various real estate stakeholders such as house owners, buyers, and investors. We propose a location-centered prediction framework that differs from existing work in terms of data profiling and prediction model. Regarding data profiling, we make an important observation as follows: besides the in-house features such as floor area, the location plays a critical role in house price prediction. Unfortunately, existing work either overlooked it or had coarse grained measurement of locations. Thereby, we define and capture...
Labeling schemes lie at the core of query processing for many XML database management systems. Designing labeling schemes for dynamic XML documents is an important problem that has received a lot of research attention. Existing dynamic labeling schemes, however, often sacrifice query performance and introduce additional labeling cost to facilitate arbitrary updates even when the documents actually seldom get updated. Since the line between static and dynamic XML documents is often blurred in practice, we believe it is important to design a labeling scheme that is compact and efficient regardless of whether the documents are frequently updated or...
As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that the holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes...
In this paper, we propose a new location-aware pub/sub system, Elaps, that continuously monitors moving users subscribing to dynamic event streams from social media and E-commerce applications. Users are notified instantly when there is a matching event nearby. To the best of our knowledge, Elaps is the first to take into account continuous moving queries against dynamic event streams. Like existing works on continuous moving query processing, Elaps employs the concept of safe region to reduce communication overhead. However, unlike existing works which assume data...
Real-time urban traffic speed estimation provides significant benefits in many real-world applications. However, existing traffic information acquisition systems only obtain coarse-grained traffic information on a small number of roads but cannot acquire fine-grained traffic information on every road. To address this problem, in this paper we study the traffic speed estimation problem, which, given a budget K, identifies K roads (called seeds) where the real traffic speeds on these seeds can be obtained using crowdsourcing, and infers the speeds of other roads (called non-seed roads) based on the speeds of these seeds. This problem includes two...
With the proliferation of mobile devices, large collections of geospatial data are becoming available, such as geo-tagged photos. Map rendering systems play an important role in presenting such large geospatial datasets to end users. We propose that such systems should support the following desirable features: representativeness, visibility constraint, zooming consistency, and panning consistency. The first two constraints are fundamental challenges to a map exploration system, which aims to efficiently select a small set of representative objects...
In this work, we propose a robust road network representation learning framework called Toast, which comes to be a cornerstone to boost the performance of numerous demanding transport planning tasks. Specifically, we first propose a traffic context aware skip-gram module to incorporate auxiliary tasks of predicting the traffic context of a target road segment. Furthermore, we propose a trajectory-enhanced Transformer module that utilizes trajectory data to extract traveling semantics on road networks. Apart from obtaining effective road segment representations, this module also enables us...
In this paper, we study how to jointly predict travel demands and traffic flows for all regions of a city at a future time interval. From an empirical analysis of traffic data, we outline three desired properties, namely region-level correlations, temporal periodicity and inter-traffic correlations. Then, we propose a comprehensive neural network based traffic prediction model, where various effective embeddings or encodings are designed to capture the aforementioned properties. First, we design region embeddings to capture two forms of region-level correlations:...
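Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has emerged recently. The difference between a text database and an XML database results in three new challenges: 1) Identify the user search intention, i.e., identify the XML node types that the user wants to search for and search via. 2) Resolve keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of different XML node types and carry different meanings; a keyword can appear as the tag name of different XML node types with different meanings. 3) As the search results are subtrees of the XML document, a new scoring function is needed to estimate its relevance to a given query. However,...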
GPS enables mobile devices to continuously provide new opportunities to improve our daily lives. For example, the data collected in applications created by Uber or Public Transport Authorities can be used to plan transportation routes, estimate capacities, and proactively identify low coverage areas. In this paper, we study a new kind of query, Reverse k Nearest Neighbor Search over Trajectories (RkNNT), which can be used for route planning and capacity estimation. Given a set of existing routes D <sub...
Triangle count is a critical parameter in mining relationships among people in social networks. However, directly publishing the findings obtained from triangle counts may bring potential privacy concern, which raises great challenges and opportunities for privacy-preserving triangle counting. In this paper, we choose to use differential privacy to protect triangle counting for large scale graphs. To reduce the large sensitivity caused in large graphs, we propose a novel graph projection method that can be used to obtain an upper bound for sensitivity in different...
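To make the role of sensitivity concrete, here is a minimal sketch of releasing a triangle count with the Laplace mechanism. It does not reproduce the paper's graph projection method. It assumes edge-level neighbouring graphs and a publicly known degree bound `max_degree` that every admissible graph satisfies, so adding or removing one edge changes the count by at most `max_degree - 1` (the two endpoints share at most that many neighbours). All function names are hypothetical.

```python
import random
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, int]


def build_adjacency(edges: Iterable[Edge]) -> Dict[int, Set[int]]:
    """Build an undirected adjacency map, ignoring self-loops."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj: Dict[int, Set[int]]) -> int:
    """Count each triangle {u < v < w} exactly once."""
    total = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:
                total += sum(1 for w in nbrs & adj[v] if w > v)
    return total


def noisy_triangle_count(edges: Iterable[Edge], max_degree: int,
                         epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism under an ASSUMED public degree bound (edge-level DP)."""
    if max_degree < 2 or epsilon <= 0:
        raise ValueError("need max_degree >= 2 and epsilon > 0")
    adj = build_adjacency(edges)
    if any(len(nbrs) > max_degree for nbrs in adj.values()):
        raise ValueError("graph violates the assumed degree bound")
    sensitivity = max_degree - 1
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return count_triangles(adj) + noise
```

The noise scale grows linearly with the degree bound, which is why a tighter upper bound on sensitivity directly improves the accuracy of the published count.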
Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes on disk, and compare them against the B+-tree under a wide range of settings. Through our evaluation, we make some key observations: 1) Overall, the B+-tree performs well across a range of workload types and datasets. 2) A learned index could outperform the B+-tree or other learned indexes on disk for a specific workload. For example, PGM achieves the best performance...