- Video Surveillance and Tracking Methods
- Multimodal Machine Learning Applications
- Autonomous Vehicle Technology and Safety
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Fire Detection and Safety Systems
- Natural Language Processing Techniques
- Topic Modeling
- Advanced Vision and Imaging
- Anomaly Detection Techniques and Applications
- Advanced Neural Network Applications
- Calibration and Measurement Techniques
- Visual Attention and Saliency Detection
- Industrial Vision Systems and Defect Detection
- Traffic Prediction and Management Techniques
- Domain Adaptation and Few-Shot Learning
- Currency Recognition and Detection
- Infrared Target Detection Methodologies
- CCD and CMOS Imaging Sensors
- Vehicle License Plate Recognition
- Image Enhancement Techniques
Ho Chi Minh City International University
2020-2024
Vietnam National University Ho Chi Minh City
2019-2024
MARK Resources (United States)
1992
Due to the rapid growth in the number of vehicles over the last decade, there has been a dramatic increase in demand for highway capacity analysis. Vehicle counting, in particular, has become a key element of vision-based intelligent traffic systems deployed across metropolitan areas. Most methods have solved the vehicle counting problem under the assumption of state-of-the-art computing systems. However, large-scale deployment of such multi-camera processing is very inefficient. With the recent advancement of cost-efficient...
This paper presents a solution for Track 1 of the AI City Challenge 2023, which involves Multi-Camera People Tracking in indoor scenarios. The proposed framework comprises four modules: Vehicle detection, ReID feature extraction, single-camera multi-target tracking (SCMT), matching, and multi-camera matching. A significant contribution of our approach is the introduction of ID switch detection and splitting using a Gaussian mixture model, which efficiently addresses the problem of tracklets with ID switches. Furthermore,...
This paper introduces our solution for Track 2 in the AI City Challenge 2022. The task is Tracked-Vehicle Retrieval by Natural Language Descriptions with a real-world dataset of various scenarios and cameras. We mainly focus on developing a robust natural language-based vehicle retrieval system to address the domain bias problem due to unseen multi-view, multi-camera tracks. Specifically, we apply CLIP [16] to effectively extract both visual and textual representations for contrastive representation learning....
Background subtraction (BgS) is the problem of pixel-level identification of changing or moving entities in the field of view of a static camera system. Recent works have discovered superior generalization to unseen realistic scenarios by an approach called deep BgS, which employs deep neural networks (DNNs) on concatenations of image inputs and their backgrounds. However, due to the lack of large-scale foreground-background datasets and the challenges of manually creating background masks for existing Internet-scale...
Decades of ongoing research have shown that background modelling is a very powerful technique, which is used in intelligent surveillance systems in order to extract features of interest, known as foregrounds. In order to work with the dynamic nature of different scenes, many techniques have adopted the unsupervised approach of the Gaussian Mixture Model with an iterative paradigm. Although the technique has had much success, a problem occurs in cases of sudden scene changes with high variation (e.g., illumination changes, camera jittering), where the model...
This paper introduces our solution for Track 2 in the AI City Challenge 2023. The task is tracked-vehicle retrieval by natural language descriptions with a real-world dataset of various scenarios and cameras. Our solution mainly focuses on four points: (1) To address the linguistic ambiguity of the query, we leverage a proposed standardized version of the text in the domain-adaptive training and post-processing stage. (2) The baseline vehicle model utilizes CLIP to extract robust visual and textual feature representations and learn a unified...
In this paper, we propose a system for Multi-Camera Multi-Target (MCMT) Vehicle Tracking in Track 1 of the AI City Challenge 2022. There are many technical difficulties in the MCMT problem, such as the common lack of labeled data in real scenarios, the distortion of detailed vehicle appearances in recording, and the ambiguity between highly similar vehicles. Taking those into account, we develop a 3-component system that exploits behavior, leverages synthetic data and multiple augmentation techniques, and enforces contextual constraints....
The main goal of traffic surveillance systems (TSSs) is to extract useful information by analyzing signals from cameras. This paper presents a system for vehicle detection and classification from static pole-mounted roadside cameras on busy streets in the presence of different kinds of vehicles. There has been considerable research to accommodate this subject since the 90s, but most studies have only been carried out in developed countries where infrastructures are built around automobiles, whereas developing...
Shadows are among the most critical problems for traffic surveillance systems (TSSs). In a TSS, shadow regions significantly affect the extraction of vehicles' attributes for vehicle detection, classification and tracking. Although many methods have been proposed to address this key problem, the dilemma of accurate shadow removal with boundary recovery and real-time processing still poses a great challenge. In this paper, we propose a new method that utilizes edge features to eliminate shadows and refine images regardless of changes...
Text-to-image diffusion techniques have shown an exceptional capability of producing high-quality images from text descriptions. This indicates that there exists a strong correlation between the visual and textual domains. In addition, text-image discriminative models such as CLIP excel in image labelling from text prompts, thanks to the rich and diverse information available from open concepts. In this paper, we leverage these technical advances to solve a challenging problem in computer vision: camouflaged instance...
A traffic surveillance system (TSS) is an essential tool to extract necessary information (count, type, speed, etc.) from cameras for traffic monitoring in many metro cities. In a TSS, vehicle detection plays a pivotal role, as it is a vital process for further analysis such as classification and tracking. So far, there has been a considerable amount of research proposed with single-pipeline Convolutional Neural Networks (CNNs) to accommodate this subject. Although these studies achieved results with high accuracy, they...
In high-resolution imaging, weak target pixel amplitudes may not be detected in the presence of clutter containing strong nonhomogeneities when conventional approaches are used. The authors describe a constant false alarm rate (CFAR) approach that avoids the elimination of these significant returns. nonhomogeneous as well components with this approach. The targets could then be discriminated from the homogeneities by discrimination techniques. It is shown how lower amplitude background noise and homogeneous...
Given Natural Language (NL) text descriptions, NL-based vehicle retrieval aims to extract target vehicles from a multi-view multi-camera traffic video pool. Due to the inherent distinctions between textual and visual data, this is a challenging multi-modal task that requires robust feature extractors (e.g. neural networks) to well-align the abstract representations of texts and images in the same domain. However, solutions to the problem have been challenged by the high data complexities of not only multi-view, attributes...
Background modeling and subtraction is a promising research area with a variety of applications for video surveillance. Recent years have witnessed a proliferation of effective learning-based deep neural networks in this area. However, the techniques have only provided limited descriptions of scenes' properties while requiring heavy computations, as their single-valued mapping functions are learned to approximate the temporal conditional averages of observed target backgrounds and foregrounds. On the other hand,...