- Domain Adaptation and Few-Shot Learning
- Advanced Image Processing Techniques
- Advanced Neural Network Applications
- Image Enhancement Techniques
- 3D Shape Modeling and Analysis
- Image and Signal Denoising Methods
- Advanced Image and Video Retrieval Techniques
- Reinforcement Learning in Robotics
- Advanced Vision and Imaging
- Robotics and Sensor-Based Localization
- Adversarial Robustness in Machine Learning
- Vehicle License Plate Recognition
- Generative Adversarial Networks and Image Synthesis
- Adaptive Dynamic Programming Control
- Image Processing and 3D Reconstruction
- Digital Media Forensic Detection
- Neural dynamics and brain function
- Anomaly Detection Techniques and Applications
- Image and Video Quality Assessment
- Robot Manipulation and Learning
- COVID-19 diagnosis using AI
- Computer Graphics and Visualization Techniques
- Imbalanced Data Classification Techniques
- Cancer-related molecular mechanisms research
- 3D Surveying and Cultural Heritage
Korea Advanced Institute of Science and Technology
2018-2023
Existing state-of-the-art 3D instance segmentation methods perform semantic segmentation followed by grouping. Hard predictions are made when performing semantic segmentation, such that each point is associated with a single class. However, the errors stemming from hard decisions propagate into grouping, which results in (1) low overlaps between the predicted instances and the ground truth and (2) substantial false positives. To address the aforementioned problems, this paper proposes a 3D instance segmentation method referred to as SoftGroup, which performs bottom-up soft grouping followed by top-down refinement. SoftGroup allows each point to be...
Cascaded architectures have brought significant performance improvement in object detection and instance segmentation. However, there are lingering issues regarding the disparity in the Intersection-over-Union (IoU) distribution of samples between training and inference. This disparity can degrade accuracy. This paper proposes an architecture referred to as Sample Consistency Network (SCNet) to ensure that the IoU distribution of samples at training time is close to that at inference time. Furthermore, SCNet incorporates feature relay and utilizes global...
This paper considers an architecture referred to as Cascade Region Proposal Network (Cascade RPN) for improving the region-proposal quality and detection performance by systematically addressing the limitation of the conventional RPN, which heuristically defines the anchors and aligns the features to the anchors. First, instead of using multiple anchors with predefined scales and aspect ratios, Cascade RPN relies on a single anchor per location and performs multi-stage refinement. Each stage is progressively more...
A learning algorithm referred to as Maximum Margin (MM) is proposed for the class-imbalance data issue: a trained model tends to predict the majority classes rather than the minority ones. That is, underfitting on the minority classes seems to be one of the challenges of generalization. For good generalization on the minority classes, we design a new loss function, motivated by minimizing a margin-based generalization bound through shifting the decision bound. The theoretically-principled label-distribution-aware margin (LDAM) loss was successfully applied with prior...
Self-supervised learning (SSL) has gained remarkable success, for which contrastive learning (CL) plays a key role. However, recently developed non-CL frameworks have achieved comparable or better performance with high improvement potential, prompting researchers to enhance these frameworks further. Assimilating CL into non-CL frameworks has been thought to be beneficial, but empirical evidence indicates no visible improvements. In view of that, this paper proposes a strategy of performing CL along the dimensional direction instead of along the batch direction as...
Developing an agent in reinforcement learning (RL) that is capable of performing complex control tasks directly from high-dimensional observations such as raw pixels is a challenge, and efforts still need to be made towards improving the sample efficiency and generalization of RL algorithms. This paper considers a learning framework for a Curiosity Contrastive Forward Dynamics Model (CCFDM) to achieve more sample-efficient RL based directly on raw pixels. CCFDM incorporates a forward dynamics model (FDM) and performs contrastive learning to train its deep...
This paper considers a network referred to as SoftGroup for accurate and scalable 3D instance segmentation. Existing state-of-the-art methods produce hard semantic predictions followed by grouping to obtain instance segmentation results. Unfortunately, errors stemming from hard decisions propagate into the grouping, resulting in poor overlap between the predicted instances and the ground truth and in substantial false positives. To address the abovementioned problems, SoftGroup allows each point to be associated with multiple classes to mitigate...
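The contrast between hard and soft semantic predictions can be sketched in a few lines. This is only an illustration under stated assumptions, not the SoftGroup implementation: the function names, the example scores, and the threshold of 0.2 are all hypothetical.

```python
def hard_assign(scores):
    """Hard prediction: keep only the single highest-scoring class per point."""
    return [max(range(len(s)), key=lambda c: s[c]) for s in scores]


def soft_assign(scores, threshold=0.2):
    """Soft prediction: keep every class whose score exceeds the threshold.

    The threshold value is an assumption chosen for this example.
    """
    return [[c for c, p in enumerate(s) if p > threshold] for s in scores]


# Hypothetical per-point class scores for three points and three classes.
scores = [
    [0.90, 0.05, 0.05],  # confident point
    [0.45, 0.40, 0.15],  # ambiguous between classes 0 and 1
    [0.10, 0.30, 0.60],  # ambiguous between classes 1 and 2
]

hard = hard_assign(scores)  # one class per point
soft = soft_assign(scores)  # possibly several classes per point
```

With hard assignment, an ambiguous point is committed to one class, so a wrong choice is carried into the grouping step. With soft assignment, the same point stays a candidate for each plausible class and a later stage can resolve it.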
In an attempt to overcome the limitations of reward-driven representation learning in vision-based reinforcement learning (RL), an unsupervised learning framework referred to as visual pretraining via contrastive predictive model (VPCPM) is proposed to learn representations detached from policy learning. Our method enables the convolutional encoder to perceive the underlying dynamics through a pair of forward and inverse models under the supervision of a contrastive loss, thus resulting in better representations. In experiments with a diverse set of vision...
Action repeat has become the de facto mechanism in deep reinforcement learning (RL) for stabilizing training and enhancing exploration. Here, an action is taken at an action-decision point and executed repeatedly for a designated number of times until the next action-decision point. Although it shows several advantages, under this mechanism the intermediate states that stem from the repeated actions are discarded by agents, causing sample inefficiency. Utilizing the discarded states as training data is nontrivial, as the action that causes the transition between these states,...
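The action-repeat mechanism described above can be sketched as follows. This is a minimal sketch, not the paper's method: `CounterEnv` is a hypothetical toy environment, and `step_with_action_repeat` is an illustrative helper that also returns the intermediate states a standard agent would throw away.

```python
class CounterEnv:
    """Hypothetical toy environment: the state is an integer incremented by each action."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        self.state += action
        reward = 1.0
        done = self.state >= 10
        return self.state, reward, done


def step_with_action_repeat(env, action, repeat):
    """Execute `action` up to `repeat` times between two decision points.

    Returns the final state, the summed reward, the done flag, and the list of
    intermediate states that standard action repeat would discard.
    """
    total_reward = 0.0
    intermediate = []
    state, done = None, False
    for i in range(repeat):
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
        if i < repeat - 1:
            intermediate.append(state)  # visited, but not a decision point
    return state, total_reward, done, intermediate


env = CounterEnv()
state, total_reward, done, skipped = step_with_action_repeat(env, action=1, repeat=4)
```

In this run the agent sees only the state reached after the fourth step, while the three states in `skipped` are the samples that the abstract describes as discarded.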
Identification of DeepFake video content is a challenging scientific problem that addresses a growing societal concern. We investigate the relationship between DeepFake detection by humans and by automatic methods based on state-of-the-art deep learning algorithms. The main novelty of our work is the consideration of videos that are transmitted through noisy channels and arrive with distortions. This reflects many practical environments, including surveillance cameras connected via wireless links and videoconferencing in driving...
A bounding box commonly serves as the proxy for 2D object detection. However, extending this practice to 3D detection raises sensitivity to localization error. This problem is acute on flat objects, since a small localization error may lead to low overlaps between the prediction and the ground truth. To address this problem, this paper proposes Sphere Region Proposal Network (SphereRPN), which detects objects by learning spheres as opposed to bounding boxes. We demonstrate that spherical proposals are more robust to localization error compared to bounding boxes. The proposed SphereRPN is not only...
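The sensitivity of box overlap on flat objects can be checked with a short calculation. The sketch below rests on assumptions made only for illustration: the object size, the shift, and the choice of a sphere that circumscribes the box are hypothetical and are not taken from SphereRPN.

```python
import math


def box_iou_3d(size, shift):
    """IoU of two identical axis-aligned boxes whose centers differ by `shift`."""
    inter = 1.0
    for s, d in zip(size, shift):
        inter *= max(0.0, s - abs(d))  # overlap length along each axis
    volume = math.prod(size)
    return inter / (2.0 * volume - inter)


def sphere_iou(radius, distance):
    """IoU of two equal-radius spheres whose centers are `distance` apart."""
    if distance >= 2.0 * radius:
        return 0.0
    # Volume of the lens formed by two intersecting spheres of equal radius.
    inter = math.pi * (4.0 * radius + distance) * (2.0 * radius - distance) ** 2 / 12.0
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return inter / (2.0 * volume - inter)


# Hypothetical flat object: 1 x 1 x 0.1, with a 0.05 localization error along its thin axis.
flat_size = (1.0, 1.0, 0.1)
shift = (0.0, 0.0, 0.05)
iou_box = box_iou_3d(flat_size, shift)

# Assumption for illustration: compare against a sphere circumscribing the box.
radius = 0.5 * math.sqrt(sum(s * s for s in flat_size))
iou_sphere = sphere_iou(radius, 0.05)
```

For this flat box, a shift of half its thickness drops the box IoU to about one third, whereas the overlap of the two spheres under the same shift remains far higher.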
This paper reviews the first challenge on efficient perceptual image enhancement, with a focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user...