Thang Vu

ORCID: 0000-0003-0486-6349
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image Processing Techniques
  • Advanced Neural Network Applications
  • Image Enhancement Techniques
  • 3D Shape Modeling and Analysis
  • Image and Signal Denoising Methods
  • Advanced Image and Video Retrieval Techniques
  • Reinforcement Learning in Robotics
  • Advanced Vision and Imaging
  • Robotics and Sensor-Based Localization
  • Adversarial Robustness in Machine Learning
  • Vehicle License Plate Recognition
  • Generative Adversarial Networks and Image Synthesis
  • Adaptive Dynamic Programming Control
  • Image Processing and 3D Reconstruction
  • Digital Media Forensic Detection
  • Neural dynamics and brain function
  • Anomaly Detection Techniques and Applications
  • Image and Video Quality Assessment
  • Robot Manipulation and Learning
  • COVID-19 diagnosis using AI
  • Computer Graphics and Visualization Techniques
  • Imbalanced Data Classification Techniques
  • Cancer-related molecular mechanisms research
  • 3D Surveying and Cultural Heritage

Korea Advanced Institute of Science and Technology
2018-2023

Existing state-of-the-art 3D instance segmentation methods perform semantic segmentation followed by grouping. Hard predictions are made when performing semantic segmentation, such that each point is associated with a single class. However, the errors stemming from hard decisions propagate into grouping, which results in (1) low overlaps between the predicted instances and the ground truth and (2) substantial false positives. To address the aforementioned problems, this paper proposes a 3D instance segmentation method referred to as SoftGroup, which performs bottom-up soft grouping followed by top-down refinement. SoftGroup allows each point to be...

10.1109/cvpr52688.2022.00273 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Cascaded architectures have brought significant performance improvement in object detection and instance segmentation. However, there are lingering issues regarding the disparity in the Intersection-over-Union (IoU) distribution of the samples between training and inference. This disparity can potentially degrade detection accuracy. This paper proposes an architecture referred to as Sample Consistency Network (SCNet) to ensure that the IoU distribution of the samples at training time is close to that at inference time. Furthermore, SCNet incorporates feature relay and utilizes global...

10.1609/aaai.v35i3.16374 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18
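
The SCNet abstract above is framed in terms of the IoU of samples. As a reminder of the quantity involved, here is a minimal sketch of IoU for two axis-aligned 2D boxes. The `(x1, y1, x2, y2)` box format and the function name `box_iou` are assumptions made for illustration; this is not code from the paper.

```python
def box_iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes.

    Each box is assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.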

This paper considers an architecture referred to as Cascade Region Proposal Network (Cascade RPN) for improving the region-proposal quality and detection performance by systematically addressing the limitation of the conventional RPN that heuristically defines the anchors and aligns the features to the anchors. First, instead of using multiple anchors with predefined scales and aspect ratios, Cascade RPN relies on a single anchor per location and performs multi-stage refinement. Each stage is progressively more...

10.48550/arxiv.1909.06720 preprint EN other-oa arXiv (Cornell University) 2019-01-01

A learning algorithm referred to as Maximum Margin (MM) is proposed for considering the class-imbalance data issue: the trained model tends to predict the majority classes rather than the minority ones. That is, underfitting for the minority classes seems to be one of the challenges of generalization. For a good generalization of the minority classes, we design a new loss function, motivated by minimizing a margin-based bound through shifting the decision bound. The theoretically-principled label-distribution-aware margin (LDAM) loss was successfully applied with prior...

10.1109/icip42928.2021.9506389 article EN 2021 IEEE International Conference on Image Processing (ICIP) 2021-08-23

Self-supervised learning (SSL) has gained remarkable success, for which contrastive learning (CL) plays a key role. However, the recent development of new non-CL frameworks has achieved comparable or better performance with high improvement potential, prompting researchers to enhance these frameworks further. Assimilating CL into non-CL frameworks has been thought to be beneficial, but empirical evidence indicates no visible improvements. In view of that, this paper proposes a strategy of performing CL along the dimensional direction instead of along the batch direction as...

10.1109/access.2023.3236087 article EN cc-by IEEE Access 2023-01-01

Developing an agent in reinforcement learning (RL) that is capable of performing complex control tasks directly from high-dimensional observation such as raw pixels is a challenge, as efforts still need to be made towards improving sample efficiency and generalization of the RL algorithm. This paper considers a learning framework for a Curiosity Contrastive Forward Dynamics Model (CCFDM) to achieve a more sample-efficient RL based directly on raw pixels. CCFDM incorporates a forward dynamics model (FDM) and performs contrastive learning to train its deep...

10.1109/iros51168.2021.9636536 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021-09-27

This paper considers a network referred to as SoftGroup for accurate and scalable 3D instance segmentation. Existing state-of-the-art methods produce hard semantic predictions followed by grouping to obtain instance segmentation results. Unfortunately, errors stemming from hard decisions propagate into the grouping, resulting in poor overlap between predicted instances and ground truth and substantial false positives. To address the abovementioned problems, SoftGroup allows each point to be associated with multiple classes to mitigate...

10.1109/tpami.2023.3326189 article EN cc-by-nc-nd IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-10-20
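
The SoftGroup abstract above contrasts hard semantic predictions (one class per point) with letting each point be associated with multiple classes. A minimal sketch of that contrast follows. The score-threshold rule, the value `0.2`, and both function names are assumptions chosen for illustration; the abstract does not state how the association is made.

```python
def hard_class_assignment(scores):
    """One class per point: the index of the highest score (argmax)."""
    return [max(range(len(point)), key=point.__getitem__) for point in scores]


def soft_class_assignment(scores, threshold=0.2):
    """Possibly several classes per point.

    A point keeps every class whose score is at least `threshold`.
    The threshold rule and its default value are hypothetical.
    """
    return [
        [cls for cls, score in enumerate(point) if score >= threshold]
        for point in scores
    ]


# Per-point class scores for two points and three classes (made-up numbers).
scores = [
    [0.55, 0.40, 0.05],  # ambiguous between class 0 and class 1
    [0.90, 0.05, 0.05],  # clearly class 0
]
hard = hard_class_assignment(scores)
soft = soft_class_assignment(scores)
```

Here the ambiguous first point is forced into class 0 by the hard assignment but stays a candidate for both class 0 and class 1 under the soft assignment, so a wrong semantic decision need not exclude it from later grouping.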

In an attempt to overcome the limitations of reward-driven representation learning in vision-based reinforcement learning (RL), an unsupervised learning framework referred to as visual pretraining via contrastive predictive model (VPCPM) is proposed to learn representations detached from policy learning. Our method enables the convolutional encoder to perceive the underlying dynamics through a pair of forward and inverse models under the supervision of the contrastive loss, thus resulting in better representations. In experiments with a diverse set of vision...

10.3390/s22176504 article EN cc-by Sensors 2022-08-29

Action repeat has become the de-facto mechanism in deep reinforcement learning (RL) for stabilizing training and enhancing exploration. Here, the action is taken at the action-decision point and executed repeatedly for a designated number of times until the next action-decision point. Although showing several advantages, in this mechanism the intermediate states which stem from repeated actions are discarded in training agents, causing sample inefficiency. Utilizing the discarded states as training data is nontrivial, as the action, which causes the transition between these states,...

10.1109/access.2022.3182107 article EN cc-by IEEE Access 2022-01-01
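
The action-repeat mechanism described in the abstract above can be sketched as a small loop: one action is chosen at a decision point and executed a fixed number of times, and the states visited in between are the ones a standard agent would discard. `ToyEnv`, its reward rule, and `step_with_action_repeat` are hypothetical names invented for this sketch, not the paper's implementation.

```python
class ToyEnv:
    """Hypothetical 1-D environment: the state is an integer position."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # Move by `action`; reward 1.0 only when position 3 is reached.
        self.state += action
        reward = 1.0 if self.state == 3 else 0.0
        return self.state, reward


def step_with_action_repeat(env, action, repeat):
    """Execute `action` `repeat` times.

    Returns the state at the next decision point, the summed reward,
    and the intermediate states visited before that point.
    """
    if repeat < 1:
        raise ValueError("repeat must be at least 1")

    total_reward = 0.0
    intermediate_states = []
    for i in range(repeat):
        state, reward = env.step(action)
        total_reward += reward
        if i < repeat - 1:
            # States between decision points; usually thrown away.
            intermediate_states.append(state)
    return state, total_reward, intermediate_states


env = ToyEnv()
final_state, total_reward, skipped = step_with_action_repeat(env, action=1, repeat=4)
```

With `repeat=4` the agent only observes the final state and the summed reward, while three intermediate states are skipped; those skipped states are the data the abstract describes as discarded.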

Identification of DeepFake video content is a challenging scientific problem that addresses a growing societal concern. We investigate the relationship between detection by humans and by automatic methods based on state-of-the-art deep learning algorithms. The main novelty of our work is the consideration of videos that are transmitted through noisy channels and arrive with distortions. This reflects many practical environments, including surveillance cameras connected via wireless links and videoconferencing in driving...

10.1109/icme52920.2022.9859954 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

A bounding box commonly serves as the proxy for 2D object detection. However, extending this practice to 3D detection raises sensitivity to localization error. This problem is acute on flat objects since a small localization error may lead to low overlaps between the prediction and ground truth. To address this problem, this paper proposes Sphere Region Proposal Network (SphereRPN), which detects objects by learning spheres as opposed to bounding boxes. We demonstrate that spherical proposals are more robust to localization error compared to bounding boxes. The proposed SphereRPN is not only...

10.1109/icip42928.2021.9506249 article EN 2021 IEEE International Conference on Image Processing (ICIP) 2021-08-23

This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user...

10.48550/arxiv.1810.01641 preprint EN other-oa arXiv (Cornell University) 2018-01-01