- Data Management and Algorithms
- Advanced Database Systems and Queries
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Algorithms and Data Compression
- Cloud Computing and Resource Management
- Caching and Content Delivery
- Graph Theory and Algorithms
- Distributed and Parallel Computing Systems
- Data Stream Mining Techniques
- Computer Graphics and Visualization Techniques
- Complexity and Algorithms in Graphs
- Geographic Information Systems Studies
- Advanced Graph Neural Networks
- Human Mobility and Location-Based Analysis
- Time Series Analysis and Forecasting
Nanyang Technological University
2016-2021
National University of Singapore
2019-2021
Graphics Processing Units (GPUs) have evolved as a powerful query co-processor for main memory On-Line Analytical Processing (OLAP) databases. However, existing GPU-based query processors adopt a kernel-based execution approach which optimizes individual kernels for resource utilization and executes the GPU kernels involved in the query plan one by one. Such a kernel-based approach cannot utilize all GPU resources efficiently due to the resource underutilization of individual kernels and memory ping-pong across kernel executions. In this paper, we propose GPL, a novel pipelined query execution engine to improve...
The release of OpenCL support for FPGAs represents a significant improvement in extending database applications to the reconfigurable domain. Taking advantage of the programmability offered by the OpenCL HLS tool, an OpenCL database can be easily ported and re-designed for FPGAs. A single SQL query in these database systems usually consists of multiple operators, and each one of the operators in turn consists of multiple OpenCL kernels. Due to the specific properties of FPGAs, each OpenCL kernel can have different FPGA-specific optimization combinations, in terms of CU (compute unit) and SIMD (kernel vectorization),...
Data stream processing systems (DSPSs) enable users to express and run stream applications that continuously process data streams. To achieve real-time data analytics, recent research has kept focusing on optimizing system latency and throughput. Witnessing the recent great achievements in the computer architecture community, researchers and practitioners have investigated the potential of adopting hardware-conscious stream processing by better utilizing modern hardware capacity in DSPSs. In this paper, we conduct a systematic survey of recent work in the field,...
The recent scale-up of GPU hardware through the integration of multiple GPUs into a single machine and the introduction of higher bandwidth interconnects like NVLink 2.0 has enabled new opportunities for relational query processing on multiple GPUs. However, due to the unique characteristics of GPUs and the interconnects, existing hash join implementations spend up to 66% of their execution time moving data between the GPUs and achieve lower than 50% utilization of the newer high bandwidth interconnects. This leads to extremely poor scalability of hash join performance on multiple GPUs, which can...
In recent years, we have witnessed significant efforts to improve the performance of Online Analytical Processing (OLAP) on graphics processing units (GPUs). Most existing studies have focused on improving memory efficiency, since memory stalls can play an essential role in query processing performance on GPUs. Motivated by the recent rise of just-in-time (JIT) compilation in query processing, we investigate whether and how we can further improve query processing performance on the GPU. Specifically, we study the execution of state-of-the-art JIT compile-based query processing systems. We find that thanks to advanced techniques such as...
Recently, field-programmable gate array (FPGA) vendors (such as Altera) have started to address the programmability issues of FPGAs via OpenCL SDKs. In this paper, we analyze the performance of relational database applications on FPGAs using OpenCL. In particular, we study how to improve the performance of data partitioning, which is a very important building block in relational databases. Since data partitioning causes random memory accesses, it is time-consuming and has been a major bottleneck for relational database operators such as the partitioned hash join. import...
As large graph processing emerges, we observe a costly fork-processing pattern (FPP) that is common in many graph algorithms. The unique feature of the FPP is that it launches many independent queries from different source vertices on the same graph. For example, an algorithm analyzing the network community profile can execute Personalized PageRanks that start from tens of thousands of source vertices at the same time. We study the efficiency of handling FPPs in state-of-the-art graph processing systems on multi-core architectures. We find that those systems suffer from a severe cache miss penalty because...
The large number of computational cores and the high memory bandwidth provided by modern graphics processors (GPUs) make them an ideal hardware accelerator for in-memory hash joins. Over the last decade, significant research effort has been put into improving the performance of the hash join operation on GPUs. Looking back at the literature, we find that the fundamentals of the GPU hash join have remained unchanged. In spite of this, recent implementations have managed to achieve over 5.3x end-to-end performance improvement over the original implementation by taking...
The release of OpenCL support for FPGAs represents a significant improvement in extending database applications to the reconfigurable domain. Taking advantage of the programmability offered by the OpenCL HLS tool, an OpenCL database can be easily ported and re-designed for FPGAs. A single SQL query in these database systems usually consists of multiple operators, and each one of the operators in turn consists of multiple OpenCL kernels. Due to the specific properties of FPGAs, each OpenCL kernel can have different optimization combinations (in terms of CU and SIMD), which is critical to the overall performance of query processing. In...
The proliferation of modern GPS-enabled devices like smartphones has led to significant research interest in large-scale trajectory exploration, which aims to identify all nearby trajectories of a given input trajectory. Trajectory exploration is useful in many scenarios, for example, in identifying incorrect road network information or in assisting users when traveling in unfamiliar geographical regions, as it can reveal the popularity of certain routes/trajectories. In this study, we develop an interactive...
Interaction-based systems have been widely used in many enterprises like Grab to enable quick and easy analysis of large-scale spatial data. Unlike traditional instruction-based query processing systems, modern interaction-based systems allow users to issue complex queries through simple interactions with a Graphical User Interface (GUI). While such systems have significantly transformed the process of query processing, they still rely on the process-after-query approach for executing queries. Even though the user is continuously...
Traditionally, FPGAs were programmed using low-level Hardware Description Languages (HDLs) like Verilog or VHDL, which made it extremely difficult to design, build and maintain systems for FPGAs. However, the recent release of OpenCL SDKs by FPGA vendors like Xilinx and Altera has significantly improved the programmability of FPGAs and brought new research opportunities for query processing on FPGAs. It remains an open question whether and how we can optimize OpenCL based database engines on FPGAs. There is a gap in optimizations and tuning between OpenCL and FPGA,...
Data stream processing systems (DSPSs) enable users to express and run stream applications that continuously process data streams. To achieve real-time data analytics, recent research has kept focusing on optimizing system latency and throughput. Witnessing the recent great achievements in the computer architecture community, researchers and practitioners have investigated the potential of adopting hardware-conscious stream processing by better utilizing modern hardware capacity in DSPSs. In this paper, we conduct a systematic survey of recent work in the field,...