Junsuk Choe

ORCID: 0000-0003-4726-4436
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Machine Learning and Data Classification
  • Adversarial Robustness in Machine Learning
  • Handwritten Text Recognition Techniques
  • Anomaly Detection Techniques and Applications
  • Face recognition and analysis
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Vision and Imaging
  • Text and Document Classification Technologies
  • Neural Networks and Applications
  • Visual Attention and Saliency Detection
  • Image Enhancement Techniques
  • Optical measurement and interference techniques
  • Natural Language Processing Techniques
  • Human Pose and Action Recognition
  • Image Retrieval and Classification Techniques
  • Image Processing and 3D Reconstruction
  • COVID-19 diagnosis using AI
  • Robotics and Sensor-Based Localization
  • CCD and CMOS Imaging Sensors
  • Advancements in Photolithography Techniques
  • Educational Technology and Assessment

Sogang University
2021-2024

California State University, Fresno
2024

University of California, San Francisco
2024

Istituto Tecnico Industriale Alessandro Volta
2021

Weatherford College
2021

Naver (South Korea)
2019-2021

Yonsei University
2004-2020

Pohang University of Science and Technology
1998

Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend on less discriminative parts of objects (e.g. the leg as opposed to the head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it suffers...

10.1109/iccv.2019.00612 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
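
The entry above is the CutMix paper. As a rough illustration of the regional mixing it proposes in place of regional dropout, the sketch below cuts a patch from a partner image, pastes it into the current one, and mixes the labels in proportion to the patch area; the tensor names, the default `alpha`, and the `criterion` usage are illustrative assumptions rather than the authors' reference code.

```python
import numpy as np
import torch

def cutmix_batch(x, y, alpha=1.0):
    """CutMix-style sketch: swap a random patch between pairs of images in a batch."""
    lam = np.random.beta(alpha, alpha)             # mixing ratio drawn from Beta(alpha, alpha)
    index = torch.randperm(x.size(0))              # shuffled partner for each image
    _, _, H, W = x.shape

    # Patch side lengths follow sqrt(1 - lam), so the patch area fraction is (1 - lam).
    cut_h, cut_w = int(H * np.sqrt(1.0 - lam)), int(W * np.sqrt(1.0 - lam))
    cy, cx = np.random.randint(H), np.random.randint(W)
    y1, y2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
    x1, x2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)

    # Paste the partner's patch instead of blacking pixels out, so no information is removed.
    x[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    lam = 1.0 - (y2 - y1) * (x2 - x1) / float(H * W)   # re-adjust after clipping at image borders
    return x, y, y[index], lam

# Usage with a standard classifier and cross-entropy criterion (hypothetical names):
# inputs, target_a, target_b, lam = cutmix_batch(inputs, targets)
# outputs = model(inputs)
# loss = lam * criterion(outputs, target_a) + (1 - lam) * criterion(outputs, target_b)
```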

Vision Transformer (ViT) extends the application range of transformers from language processing to computer vision tasks as an alternative architecture against the existing convolutional neural networks (CNN). Since the transformer-based architecture has been innovative for vision modeling, the design convention towards an effective architecture has been less studied yet. From the successful design principles of CNN, we investigate the role of spatial dimension conversion and its effectiveness on transformer-based architecture. We particularly attend to the dimension reduction principle of CNNs;...

10.1109/iccv48922.2021.01172 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
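
The paper above studies the CNN-style principle of shrinking spatial resolution while widening channels, applied to ViT token maps. Below is a hedged sketch of such a pooling stage between transformer blocks: the token sequence is reshaped back to a 2-D grid, downsampled with a strided depthwise convolution, and widened; the specific layer choices are assumptions for illustration, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class TokenPooling(nn.Module):
    """Reduce spatial token resolution and increase channel width between transformer stages."""
    def __init__(self, dim_in, dim_out, stride=2):
        super().__init__()
        # Strided depthwise convolution halves H and W while expanding channels to dim_out.
        self.conv = nn.Conv2d(dim_in, dim_out, kernel_size=3, stride=stride,
                              padding=1, groups=dim_in)

    def forward(self, tokens, hw):
        h, w = hw                                   # current spatial grid of the tokens
        b, n, c = tokens.shape                      # (batch, h*w, dim_in)
        x = tokens.transpose(1, 2).reshape(b, c, h, w)
        x = self.conv(x)                            # (batch, dim_out, h//2, w//2)
        b, c2, h2, w2 = x.shape
        return x.reshape(b, c2, h2 * w2).transpose(1, 2), (h2, w2)

# Example: 196 tokens on a 14x14 grid become 49 tokens on a 7x7 grid with doubled width.
pool = TokenPooling(dim_in=256, dim_out=512)
tokens, hw = pool(torch.randn(2, 196, 256), (14, 14))
print(tokens.shape, hw)   # torch.Size([2, 49, 512]) (7, 7)
```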

Weakly Supervised Object Localization (WSOL) techniques learn the object location using only image-level labels, without location annotations. A common limitation of these techniques is that they cover only the most discriminative part of the object, not the entire object. To address this problem, we propose an Attention-based Dropout Layer (ADL), which utilizes the self-attention mechanism to process the feature maps of the model. The proposed method is composed of two key components: 1) hiding the most discriminative part from the model for capturing the integral extent of the object, and 2)...

10.1109/cvpr.2019.00232 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
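
Since the ADL abstract above is cut off before describing the mechanism in full, here is a hedged PyTorch-style sketch in that spirit: a self-attention map is obtained by channel-averaging the feature map, a drop mask erases the most discriminative region, an importance map highlights it, and one of the two is randomly applied during training. The threshold and selection probability are illustrative defaults, not the paper's tuned values.

```python
import torch
import torch.nn as nn

class AttentionBasedDropout(nn.Module):
    """Randomly hide or highlight the most discriminative region of a feature map."""
    def __init__(self, drop_rate=0.75, drop_threshold=0.9):
        super().__init__()
        self.drop_rate = drop_rate            # how often the drop mask is chosen over the importance map
        self.drop_threshold = drop_threshold  # fraction of the peak that counts as "most discriminative"

    def forward(self, feat):                  # feat: (B, C, H, W)
        if not self.training:
            return feat
        attention = feat.mean(dim=1, keepdim=True)                     # self-attention map (B, 1, H, W)
        peak = attention.amax(dim=(2, 3), keepdim=True)
        drop_mask = (attention < self.drop_threshold * peak).float()   # hide the peak region
        importance = torch.sigmoid(attention)                          # highlight the informative region
        use_drop = torch.rand(()) < self.drop_rate
        return feat * (drop_mask if use_drop else importance)
```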

Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend on less discriminative parts of objects (e.g. the leg as opposed to the head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads...

10.48550/arxiv.1905.04899 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Weakly-supervised object localization (WSOL) has gained popularity over the last years for its promise to train localization models with only image-level labels. Since the seminal WSOL work of class activation mapping (CAM), the field has focused on how to expand the attention regions to cover objects more broadly and localize them better. However, these strategies rely on full localization supervision to validate hyperparameters and for model selection, which is in principle prohibited under the WSOL setup. In this paper, we argue that the WSOL task is ill-posed with only image-level labels,...

10.1109/cvpr42600.2020.00320 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Weakly supervised semantic segmentation (WSSS) methods are often built on pixel-level localization maps obtained from a classifier. However, trained with class labels only, classifiers suffer from the spurious correlation between foreground and background cues (e.g. train and rail), fundamentally bounding the performance of WSSS. There have been previous endeavors to address this issue with additional supervision. We propose a novel source of information to distinguish foreground from the background: Out-of-Distribution...

10.1109/cvpr52688.2022.01639 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Both weakly supervised single object localization and semantic segmentation techniques learn an object's location using only image-level labels. However, these techniques are limited to covering only the most discriminative part of the object, not the entire object. To address this problem, we propose an attention-based dropout layer, which utilizes the attention mechanism to locate the entire object efficiently. To achieve this, we devise two key components: 1) hiding the most discriminative part from the model to capture the entire object, and 2) highlighting the informative region to improve the classification power...

10.1109/tpami.2020.2999099 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-06-01

ImageNet has been the most popular image classification benchmark, but it is also one with a significant level of label noise. Recent studies have shown that many samples contain multiple classes, despite being assumed to be a single-label benchmark. They have thus proposed to turn the evaluation into a multi-label task, with exhaustive multi-label annotations per image. However, they have not fixed the training set, presumably because of the formidable annotation cost. We argue that the mismatch between single-label annotations and effectively multi-label images is equally, if not more,...

10.1109/cvpr46437.2021.00237 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
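
The entry above argues that single global labels are a poor fit for effectively multi-label ImageNet images. One remedy in that spirit, sketched below under stated assumptions, is to keep a dense per-pixel label map for each training image (e.g. produced offline by a stronger teacher model) and derive a soft target for every random crop instead of reusing the single image-level label; the function names and the simple average pooling are illustrative, not the authors' exact pipeline.

```python
import torch
import torch.nn.functional as F

def crop_soft_label(label_map, crop_box):
    """Derive a soft multi-label target for a crop from a dense per-pixel label map.

    label_map: (num_classes, H, W) class scores for one training image (assumed precomputed).
    crop_box:  (top, left, height, width) of the random crop in label-map coordinates.
    """
    t, l, h, w = crop_box
    region = label_map[:, t:t + h, l:l + w]       # class scores inside the crop
    soft_target = region.mean(dim=(1, 2))         # pool over the cropped region
    return soft_target / soft_target.sum().clamp_min(1e-8)

def soft_cross_entropy(logits, soft_target):
    """Cross-entropy against a soft (multi-label) target distribution."""
    return -(soft_target * F.log_softmax(logits, dim=-1)).sum()
```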

Recently, low-shot learning has been proposed for handling the lack of training data in machine learning. Despite the importance of this issue, relatively few efforts have been made to study the problem. In this paper, we aim to increase the size of the dataset in various ways to improve the accuracy and robustness of face recognition. In detail, we adapt a generator from a Generative Adversarial Network (GAN) to enlarge the dataset, which includes a base set, a widely available dataset, and a novel set, a given limited dataset, while adopting transfer learning as the backend. Based on extensive...

10.1109/iccvw.2017.229 article EN 2017-10-01

Vision Transformer (ViT) extends the application range of transformers from language processing to computer vision tasks as an alternative architecture against the existing convolutional neural networks (CNN). Since the transformer-based architecture has been innovative for vision modeling, the design convention towards an effective architecture has been less studied yet. From the successful design principles of CNN, we investigate the role of spatial dimension conversion and its effectiveness on transformer-based architecture. We particularly attend to the dimension reduction principle of CNNs;...

10.48550/arxiv.2103.16302 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

Despite apparent human-level performances of deep neural networks (DNN), they behave fundamentally differently from humans. They easily change predictions when small corruptions such as blur and noise are applied to the input (lack of robustness), and they often produce confident predictions on out-of-distribution samples (improper uncertainty measure). While a number of researches have aimed to address those issues, the proposed solutions are typically expensive and complicated (e.g. Bayesian inference and adversarial training)...

10.48550/arxiv.2003.03879 preprint EN other-oa arXiv (Cornell University) 2020-01-01

10.1016/j.patrec.2025.01.012 article DA Pattern Recognition Letters 2025-01-01

Weakly-supervised object localization (WSOL) enables finding an object using a dataset without any localization information. By simply training a classification model using only image-level annotations, the feature map of the model can be utilized as a score map for localization. In spite of many WSOL methods proposing novel strategies, there has not been a de facto standard about how to normalize the class activation map (CAM). Consequently, many WSOL methods have failed to fully exploit their own capacity because of a misuse of the normalization method. In this paper, we review...

10.1109/iccv48922.2021.00341 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
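
Since the paper above studies how CAMs are normalized before thresholding, a small sketch of two common normalization choices may help; the functions below are generic illustrations of normalization schemes, not the paper's proposed method.

```python
import numpy as np

def minmax_normalize(cam):
    """Rescale a class activation map to [0, 1] using its minimum and maximum."""
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-8)

def max_normalize(cam):
    """Clip negative activations and divide by the maximum only."""
    cam = np.maximum(cam, 0)
    return cam / (cam.max() + 1e-8)

# The binarization threshold then interacts with the chosen normalization:
cam = np.random.rand(14, 14).astype(np.float32)
mask = minmax_normalize(cam) >= 0.5   # foreground estimate under min-max normalization
```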

State-of-the-art techniques in weakly-supervised semantic segmentation (WSSS) using image-level labels exhibit severe performance degradation on driving scene datasets such as Cityscapes. To address this challenge, we develop a new WSSS framework tailored to driving scene datasets. Based on extensive analysis of the dataset characteristics, we employ Contrastive Language-Image Pre-training (CLIP) as our baseline to obtain pseudo-masks. However, CLIP introduces two key challenges: (1) pseudo-masks from CLIP lack in representing...

10.1609/aaai.v38i3.28053 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Weakly-supervised object localization (WSOL) has gained popularity over the last years for its promise to train localization models with only image-level labels. Since the seminal WSOL work of class activation mapping (CAM), the field has focused on how to expand the attention regions to cover objects more broadly and localize them better. However, these strategies rely on full localization supervision for validating hyperparameters and model selection, which is in principle prohibited under the WSOL setup. In this paper, we argue that the task is ill-posed with only image-level labels,...

10.1109/tpami.2022.3169881 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-04-25

Weakly Supervised Object Localization (WSOL) techniques learn the object location using only image-level labels, without location annotations. A common limitation of these techniques is that they cover only the most discriminative part of the object, not the entire object. To address this problem, we propose an Attention-based Dropout Layer (ADL), which utilizes the self-attention mechanism to process the feature maps of the model. The proposed method is composed of two key components: 1) hiding the most discriminative part from the model for capturing the integral extent of the object, and 2)...

10.48550/arxiv.1908.10028 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The class activation mapping, or CAM, has been the cornerstone of feature attribution methods for multiple vision tasks. Its simplicity and effectiveness have led to wide applications in the explanation of visual predictions and weakly-supervised localization tasks. However, CAM has its own shortcomings. The computation of attribution maps relies on ad-hoc calibration steps that are not part of the training computational graph, making it difficult for us to understand the real meaning of the attribution values. In this paper, we improve CAM by explicitly incorporating a...

10.1109/iccv48922.2021.00824 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
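
For reference, the abstract above criticizes the post-hoc steps in the original CAM recipe. A minimal sketch of that recipe is given below: the last convolutional feature map is weighted by the classifier weights of the target class, then rectified and rescaled outside the training graph. Variable names are assumptions; this illustrates the baseline being improved, not the paper's proposed model.

```python
import torch

def class_activation_map(features, fc_weight, class_idx):
    """Original CAM recipe: class-weighted sum of the final feature map, then ad-hoc rescaling.

    features:  (C, H, W) output of the last convolutional layer for one image.
    fc_weight: (num_classes, C) weights of the final linear classifier.
    """
    weights = fc_weight[class_idx]                        # (C,) weights for the target class
    cam = torch.einsum('c,chw->hw', weights, features)    # class-specific evidence map
    cam = torch.relu(cam)                                  # calibration steps that sit outside
    cam = cam / (cam.max() + 1e-8)                         # the training computational graph
    return cam
```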

The goal of unsupervised co-localization is to locate the object in a scene under the assumptions that 1) the dataset consists of only one superclass, e.g., birds, and 2) there are no human-annotated labels in the dataset. The most recent method achieves impressive performance by employing self-supervised representation learning approaches such as predicting rotation. In this paper, we introduce a new contrastive objective defined directly on the attention maps to enhance performance. Our loss function exploits rich information...

10.1109/iccv48922.2021.00280 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
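
The abstract above introduces a contrastive objective on attention maps, but the text is cut off before the formulation. The sketch below is only a generic InfoNCE-style contrast between attention maps of two augmented views of the same image against other images in the batch; it illustrates the general idea, not the paper's loss, and ignores spatial alignment between views.

```python
import torch
import torch.nn.functional as F

def attention_contrastive_loss(att_a, att_b, temperature=0.1):
    """Generic InfoNCE over flattened attention maps of two views of each image.

    att_a, att_b: (B, H, W) attention maps from two augmentations of the same batch.
    Positives are the two views of the same image; negatives are other images' maps.
    """
    B = att_a.size(0)
    za = F.normalize(att_a.flatten(1), dim=1)          # (B, H*W), unit length
    zb = F.normalize(att_b.flatten(1), dim=1)
    logits = za @ zb.t() / temperature                  # pairwise cosine similarities
    targets = torch.arange(B, device=att_a.device)      # matching indices are the positives
    return F.cross_entropy(logits, targets)
```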