Minh-Quan Le

ORCID: 0000-0003-0023-6235
Research Areas
  • Human Pose and Action Recognition
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Visual Attention and Saliency Detection
  • Hand Gesture Recognition Systems
  • Generative Adversarial Networks and Image Synthesis
  • Fish Ecology and Management Studies
  • Image Retrieval and Classification Techniques
  • Video Surveillance and Tracking Methods
  • Anomaly Detection Techniques and Applications
  • Colorectal Cancer Screening and Detection
  • Video Analysis and Summarization
  • Image Enhancement Techniques
  • Traffic Prediction and Management Techniques
  • Fish Biology and Ecology Studies
  • Radiomics and Machine Learning in Medical Imaging
  • Identification and Quantification in Food
  • Water Quality Monitoring Technologies
  • AI in cancer detection
  • 3D Surveying and Cultural Heritage
  • Hearing Impairment and Communication
  • Vehicular Ad Hoc Networks (VANETs)
  • Antenna Design and Analysis

Vietnam National University Ho Chi Minh City
2021-2024

Stony Brook University
2024

Ho Chi Minh City University of Science
2020-2023

Hanoi University of Science and Technology
2021-2022

To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning (SSL) representations encode rich semantic and visual information. In this paper, we posit that such representations are expressive...

10.1109/cvpr52733.2024.00815 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

This paper pushes the envelope on decomposing camouflaged regions in an image into meaningful components, namely, camouflaged instances. To promote the new task of camouflaged instance segmentation of in-the-wild images, we introduce a dataset, dubbed CAMO++, that extends our preliminary CAMO dataset (camouflaged object segmentation) in terms of quantity and diversity. The new dataset substantially increases the number of images with hierarchical pixel-wise ground truths. We also provide a benchmark suite for the task of camouflaged instance segmentation. In particular, we present...

10.1109/tip.2021.3130490 article EN publisher-specific-oa IEEE Transactions on Image Processing 2021-12-02

While diffusion models are powerful in generating high-quality, diverse synthetic data for object-centric tasks, existing methods struggle with scene-aware tasks such as Visual Question Answering (VQA) and Human-Object Interaction (HOI) Reasoning, where it is critical to preserve scene attributes in generated images consistent with a multimodal context, i.e. a reference image with an accompanying text guidance query. To address this, we introduce Hummingbird, the first diffusion-based image generator which, given...

10.48550/arxiv.2502.05153 preprint EN arXiv (Cornell University) 2025-02-07

10.1109/wacv61041.2025.00338 article EN 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which tries to segment instance objects from a query image with a few annotated examples of novel categories. Conventional approaches have attempted to address the task via prototype learning, known as point estimation. However, this mechanism depends on prototypes (e.g. the mean of K-shot) for prediction, leading to performance instability. To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying...

10.1609/aaai.v38i3.28068 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Traffic flow analysis is essential for intelligent transportation systems. In this paper, we introduce our Intelligent Traffic Analysis Software Kit (iTASK) to tackle three challenging problems: vehicle flow counting, vehicle re-identification, and abnormal event detection. For the first problem, we propose to track vehicles in real time as they move along the desired direction in corresponding motion-of-interests (MOIs). For the second problem, we consider each vehicle as a document with multiple semantic words (i.e., vehicle attributes) and transform the given problem...

10.1109/cvprw50498.2020.00314 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020-06-01

Camouflaged object detection (COD) and camouflaged instance segmentation (CIS) aim to recognize and segment objects that are blended into their surroundings, respectively. While several deep neural network models have been proposed to tackle those tasks, augmentation methods for COD and CIS have not been thoroughly explored. Augmentation strategies can help improve the performance of models by increasing the size and diversity of the training data and exposing the model to a wider range of variations in the data. Besides, we automatically learn...

10.48550/arxiv.2308.15660 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Synthesizing high-resolution images from intricate, domain-specific information remains a significant challenge in generative modeling, particularly for applications in large-image domains such as digital histopathology and remote sensing. Existing methods face critical limitations: conditional diffusion models in pixel or latent space cannot exceed the resolution on which they were trained without losing fidelity, and computational demands increase significantly for larger image sizes. Patch-based methods offer...

10.48550/arxiv.2407.14709 preprint EN arXiv (Cornell University) 2024-07-19

Person search by natural language description is a challenging task as it has to model and learn a visual-text semantic embedding. While several works have been dedicated to person search by English descriptions, few attempts have been made for other languages. As a result, there is a lack of available resources in these languages. Inspired by the transfer learning idea in image classification, in this paper, we propose a method for person search by Vietnamese descriptions using a network whose weights are trained on a large-scale dataset in English. To this end, first, the published network architecture...

10.1109/kse53942.2021.9648695 article EN 2021-11-10

Artificial Intelligence (AI) has played an increasingly crucial part in our daily lives in recent years. Convolutional neural networks (CNNs) in medical image processing have lately received a lot of interest. With the introduction of modern endoscopic technologies, a doctor could be able to diagnose a patient more accurately. Consequently, it becomes crucial and advantageous to use computer-aided support during endoscopic procedures. This paper proposes a framework for automatic classification of Upper Gastrointestinal tract diseases...

10.1109/iccais56082.2022.9990445 article EN 2022 11th International Conference on Control, Automation and Information Sciences (ICCAIS) 2022-11-21

In this paper, we introduce a practical system for interactive video object mask annotation, which can support multiple back-end methods. To demonstrate the generalization of our system, we introduce a novel approach for video object annotation. Our proposed system takes scribbles at a chosen key-frame from the end-users via a user-friendly interface and produces masks of the corresponding objects via the Control-Point-based Scribbles-to-Mask (CPSM) module. The masks are then propagated to other frames and refined through the Multi-Referenced Guided Segmentation...

10.1609/aaai.v35i18.18014 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Gesture recognition is a fundamental tool to enable novel interaction paradigms in a variety of application scenarios like Mixed Reality environments, touchless public kiosks, entertainment systems, and more. Recognition of hand gestures can nowadays be performed directly from the stream of hand skeletons estimated by software provided by low-cost trackers (Ultraleap) and MR headsets (Hololens, Oculus Quest) or by video processing software modules (e.g. Google Mediapipe). Despite the recent advancements in gesture and action...

10.48550/arxiv.2106.10980 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01

Big cities are well-known for their traffic congestion and high density of vehicles such as cars, buses, trucks, and even swarms of motorbikes that overwhelm city streets. Large-scale development projects have exacerbated urban conditions, making congestion more severe. In this paper, we propose a data-driven planning simulator. In particular, we make use of the city camera system for traffic analysis. It seeks to recognize traffic flows with reduced intervention from monitoring staff. Then, we develop a simulator upon the analyzed data. The simulator is used...

10.1109/ismar-adjunct57072.2022.00185 article EN 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct) 2022-10-01

Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which tries to segment instance objects from a query image with a few annotated examples of novel categories. Conventional approaches have attempted to address the task via prototype learning, known as point estimation. However, this mechanism depends on prototypes (e.g. the mean of K-shot) for prediction, leading to performance instability. To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying...

10.48550/arxiv.2303.05105 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this end, we introduce a novel SHREC challenge track that focuses on retrieving relevant 3D animal models from a dataset using sketch queries and expedites accessing 3D models through available sketches....

10.48550/arxiv.2304.05731 preprint EN cc-by arXiv (Cornell University) 2023-01-01

3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC challenge track focusing on text-based fine-grained retrieval of 3D animal models. Unlike previous SHREC challenge tracks, the proposed task is considerably...

10.48550/arxiv.2304.06053 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Recent blockchain-based systems for managing credentials show advantages over paper-based procedures. However, issuing credentials with blockchain could conflict with current management rules and policies. One of the possible conflicts is auditability. Most systems focus on security, efficiency, and privacy while ignoring the auditability of the system. In this paper, we propose a new system, IU-TransCert, for issuing, verifying, and auditing academic credentials. The system uses a data structure named Auditable Merkle Tree that enables credential...

10.1145/3628797.3628822 article EN 2023-12-06

To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning (SSL) representations encode rich semantic and visual information. In this paper, we posit that such representations are expressive...

10.48550/arxiv.2312.07330 preprint EN cc-by arXiv (Cornell University) 2023-01-01