- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Visual Attention and Saliency Detection
- Hand Gesture Recognition Systems
- Generative Adversarial Networks and Image Synthesis
- Fish Ecology and Management Studies
- Image Retrieval and Classification Techniques
- Video Surveillance and Tracking Methods
- Anomaly Detection Techniques and Applications
- Colorectal Cancer Screening and Detection
- Video Analysis and Summarization
- Image Enhancement Techniques
- Traffic Prediction and Management Techniques
- Fish Biology and Ecology Studies
- Radiomics and Machine Learning in Medical Imaging
- Identification and Quantification in Food
- Water Quality Monitoring Technologies
- AI in cancer detection
- 3D Surveying and Cultural Heritage
- Hearing Impairment and Communication
- Vehicular Ad Hoc Networks (VANETs)
- Antenna Design and Analysis
Vietnam National University Ho Chi Minh City
2021-2024
Stony Brook University
2024
Ho Chi Minh City University of Science
2020-2023
Hanoi University of Science and Technology
2021-2022
To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning (SSL) representations encode rich semantic and visual information. In this paper, we posit that such representations are expressive...
This paper pushes the envelope on decomposing camouflaged regions in an image into meaningful components, namely, camouflaged instances. To promote the new task of camouflaged instance segmentation of in-the-wild images, we introduce a dataset, dubbed CAMO++, that extends our preliminary CAMO dataset (camouflaged object segmentation) in terms of quantity and diversity. The new dataset substantially increases the number of images with hierarchical pixel-wise ground truths. We also provide a benchmark suite for the task of camouflaged instance segmentation. In particular, we present...
While diffusion models are powerful in generating high-quality, diverse synthetic data for object-centric tasks, existing methods struggle with scene-aware tasks such as Visual Question Answering (VQA) and Human-Object Interaction (HOI) Reasoning, where it is critical to preserve scene attributes in generated images consistent with a multimodal context, i.e. a reference image with an accompanying text guidance query. To address this, we introduce Hummingbird, the first diffusion-based generator which, given...
Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which tries to segment instance objects from a query image with a few annotated examples of novel categories. Conventional approaches have attempted to address the task via prototype learning, known as point estimation. However, this mechanism depends on prototypes (e.g. the mean of K-shot) for prediction, leading to performance instability. To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying...
Traffic flow analysis is essential for intelligent transportation systems. In this paper, we introduce our Intelligent Traffic Analysis Software Kit (iTASK) to tackle three challenging problems: vehicle flow counting, vehicle re-identification, and abnormal event detection. For the first problem, we propose to track vehicles in real time as they move along the desired direction in corresponding motion-of-interests (MOIs). For the second problem, we consider each vehicle as a document with multiple semantic words (i.e., vehicle attributes) and transform the given problem...
Camouflaged object detection (COD) and camouflaged instance segmentation (CIS) aim to recognize and segment objects that are blended into their surroundings, respectively. While several deep neural network models have been proposed to tackle those tasks, augmentation methods for COD and CIS have not been thoroughly explored. Augmentation strategies can help improve the performance of models by increasing the size and diversity of the training data and exposing the model to a wider range of variations in the data. Besides, we automatically learn...
Synthesizing high-resolution images from intricate, domain-specific information remains a significant challenge in generative modeling, particularly for applications in large-image domains such as digital histopathology and remote sensing. Existing methods face critical limitations: conditional diffusion models in pixel or latent space cannot exceed the resolution on which they were trained without losing fidelity, and computational demands increase significantly for larger image sizes. Patch-based methods offer...
Person search by natural language description is a challenging task as it has to model and learn a visual-text semantic embedding. While several works have been dedicated to person search by English descriptions, few attempts have been made for other languages. As a result, person search lacks available resources in these languages. Inspired by the transfer learning idea in image classification, in this paper, we propose a method for person search by Vietnamese descriptions using a network whose weights are trained on a large-scale dataset in English. To this end, first, the published network architecture...
Artificial Intelligence (AI) has played an increasingly crucial part in our daily lives in recent years. The use of convolutional neural networks (CNNs) in medical image processing has lately received a lot of interest. With the introduction of modern endoscopic technologies, doctors can diagnose patients more accurately. Consequently, it becomes crucial and advantageous to use computer-aided support during procedures. This paper proposes a framework for automatic classification of Upper Gastrointestinal tract diseases...
In this paper, we introduce a practical system for interactive video object mask annotation, which can support multiple back-end methods. To demonstrate the generalization of our system, we introduce a novel approach for video object annotation. Our proposed system takes scribbles at a chosen key-frame from the end-users via a user-friendly interface and produces masks of the corresponding objects at the key-frame via the Control-Point-based Scribbles-to-Mask (CPSM) module. The object masks are then propagated to other frames and refined through the Multi-Referenced Guided Segmentation...
Gesture recognition is a fundamental tool to enable novel interaction paradigms in a variety of application scenarios like Mixed Reality environments, touchless public kiosks, entertainment systems, and more. Recognition of hand gestures can nowadays be performed directly from the stream of hand skeletons estimated by software provided by low-cost trackers (Ultraleap) and MR headsets (Hololens, Oculus Quest) or by video processing software modules (e.g. Google Mediapipe). Despite the recent advancements in gesture and action...
Big cities are well-known for their traffic congestion and high density of vehicles such as cars, buses, trucks, and even a swarm of motorbikes that overwhelm city streets. Large-scale development projects have exacerbated urban conditions, making traffic congestion more severe. In this paper, we propose a data-driven city traffic planning simulator. In particular, we make use of the city camera system for traffic analysis. It seeks to recognize the traffic flows, with reduced intervention from monitoring staff. Then, we develop a city traffic planning simulator upon the analyzed traffic data. The simulator is used...
Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which tries to segment instance objects from a query image with a few annotated examples of novel categories. Conventional approaches have attempted to address the task via prototype learning, known as point estimation. However, this mechanism depends on prototypes (e.g. the mean of K-shot) for prediction, leading to performance instability. To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying...
The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this end, we introduce a novel SHREC challenge track that focuses on retrieving relevant 3D animal models from a dataset using sketch queries and expedites accessing 3D models through available sketches....
3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC challenge track focusing on text-based fine-grained retrieval of 3D animal models. Unlike previous SHREC challenge tracks, the proposed task is considerably...
Recent blockchain-based systems for managing credentials show advantages over paper-based procedures. However, issuing credentials with blockchain could conflict with current credential management rules and policies. One of the possible conflicts is auditability. Most systems focus on security, efficiency, and privacy while ignoring the auditability of the system. In this paper, we propose a new system, IU-TransCert, for issuing, verifying, and auditing academic credentials. The system uses a data structure named Auditable Merkle Tree that enables credential...