- Infrared Target Detection Methodologies
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Advanced Neural Network Applications
- Computer Graphics and Visualization Techniques
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Topic Modeling
- Human Pose and Action Recognition
- Underwater Vehicles and Communication Systems
- Maritime Navigation and Safety
- 3D Surveying and Cultural Heritage
- Anomaly Detection Techniques and Applications
- Video Surveillance and Tracking Methods
- Image Retrieval and Classification Techniques
- Image Enhancement Techniques
- Image Processing and 3D Reconstruction
- Domain Adaptation and Few-Shot Learning
- Natural Language Processing Techniques
- Seismic Imaging and Inversion Techniques
- Fire Detection and Safety Systems
- Visual Attention and Saliency Detection
- Drilling and Well Engineering
- Expert Finding and Q&A Systems
- Geophysical Methods and Applications
Hong Kong University of Science and Technology
2020-2025
University of Hong Kong
2020-2025
Vietnam National University Ho Chi Minh City
2017-2019
International University
2018
The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation, and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings...
The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenge categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, and (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection category features three sub-challenges, including a new embedded challenge...
Data augmentation is a powerful technique to enhance the performance of a deep learning task but has received less attention in 3D learning. It is well known that when shapes are sparsely represented with low point density, the performance of downstream tasks drops significantly. This work explores test-time augmentation (TTA) for point clouds. We are inspired by the recent revolution in implicit representation and point cloud upsampling, which can produce high-quality surface reconstruction and proximity-to-surface, respectively. Our idea is to leverage the implicit field...
Architectural photography is a genre of photography that focuses on capturing a building or structure in the foreground with dramatic lighting in the background. Inspired by recent successes in image-to-image translation methods, we aim to perform style transfer for architectural photographs. However, the special composition in architectural photography poses great challenges for style transfer in this type of photograph. Existing neural style transfer methods treat the images as a single entity, which would generate mismatched chrominance and destroy the geometric features of the original architecture, yielding...
Large language models (LLMs) have demonstrated a powerful ability to answer various queries as a general-purpose assistant. The continuous multi-modal large language models (MLLM) empower LLMs with the ability to perceive visual signals. The launch of GPT-4 (Generative Pre-trained Transformers) has generated significant interest in the research communities. GPT-4V(ision) has demonstrated significant power in both academia and industry fields, as a focal point in a new artificial intelligence generation. Though significant success was achieved by GPT-4V, exploring MLLMs...
Vehicle detection and classification is an essential application in traffic surveillance systems (TSS). However, recognizing moving vehicles at nighttime is more challenging because of either poor (lack of street lights) or bright illumination and the chaos of motorbikes. Adding to this, various types of vehicles travel on the same road, which falsifies the pairing results. So, this research proposes an algorithm for nighttime scenes that consists of headlight segmentation, detection, tracking, and classification (two-wheeled and four-wheeled vehicles). First,...
Large language models (LLMs), such as ChatGPT/GPT-4, have proven to be powerful tools in promoting the user experience as an AI assistant. Continuing works are proposing multi-modal large language models (MLLM), empowering LLMs with the ability to sense multiple modality inputs through constructing a joint semantic space (e.g. visual-text space). Though significant success was achieved in LLMs and MLLMs, exploring LLMs and MLLMs in domain-specific applications that require domain-specific knowledge and expertise has been less conducted, especially for...
The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation, and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by...
The main goal of traffic surveillance systems (TSSs) is to extract useful information by analyzing signals from cameras. This paper presents a system for vehicle detection and classification from static pole-mounted roadside cameras on busy streets in the presence of different kinds of vehicles. There has been considerable research to accommodate this subject since the 90s, but most studies have only been carried out in developed countries where infrastructures are built around automobiles, whereas in developing...
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions. This indicates that there exists a strong correlation between the visual and textual domains. In addition, text-image discriminative models such as CLIP excel in image labelling from text prompts, thanks to the rich and diverse information available from open concepts. In this paper, we leverage these technical advances to solve a challenging problem in computer vision: camouflaged instance...
Referring image segmentation is a challenging task that involves generating pixel-wise segmentation masks based on natural language descriptions. Existing methods have relied mostly on visual features to generate the segmentation masks while treating text features as supporting components. This over-reliance on visual features can lead to suboptimal results, especially in complex scenarios where text prompts are ambiguous or context-dependent. To overcome these challenges, we present a novel framework, VATEX, to improve referring image segmentation by enhancing object and context...
Object reconstruction from 3D point clouds has been a long-standing research topic in computer vision and computer graphics, and has achieved impressive progress. However, reconstruction from time-varying point clouds (a.k.a. 4D point clouds) remains overlooked. In this paper, we propose a new network architecture, namely RFNet-4D++, that jointly reconstructs objects and their motion flows from 4D point clouds. The key insight is that simultaneously performing both tasks can leverage the individual ones, leading to improved overall performance. To...
The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenge categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, and (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection category features three sub-challenges, including a new embedded challenge addressing efficient inference on real-world devices. This report offers a comprehensive overview of the...
Creating large-scale virtual urban scenes with variant styles is inherently challenging. To facilitate prototypes of virtual production and bypass the need for complex materials and lighting setups, we introduce the first vision-and-text-driven texture stylization system for large-scale urban scenes, StyleCity. Taking an image and text as references, StyleCity stylizes a 3D textured mesh of an urban scene in a semantics-aware fashion and generates a harmonic omnidirectional sky background. To achieve that, we propose to stylize a neural texture field by transferring...
Building a video retrieval system that is robust and reliable, especially for the marine environment, is a challenging task due to several factors such as dealing with massive amounts of dense and repetitive data, occlusion, blurriness, low lighting conditions, and abstract queries. To address these challenges, we present MarineVRS, a novel and flexible video retrieval system designed explicitly for the marine domain. MarineVRS integrates state-of-the-art methods for visual and linguistic object representation to enable efficient and accurate search and analysis...