Bohyung Han

ORCID: 0000-0003-3099-3616
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Advanced Neural Network Applications
  • Advanced Vision and Imaging
  • Image Enhancement Techniques
  • Anomaly Detection Techniques and Applications
  • Video Analysis and Summarization
  • Advanced Image Processing Techniques
  • Machine Learning and Data Classification
  • Visual Attention and Saliency Detection
  • Generative Adversarial Networks and Image Synthesis
  • Adversarial Robustness in Machine Learning
  • Machine Learning and ELM
  • Image Processing Techniques and Applications
  • Gaussian Processes and Bayesian Inference
  • Image Retrieval and Classification Techniques
  • Target Tracking and Data Fusion in Sensor Networks
  • Face Recognition and Analysis
  • Infrared Target Detection Methodologies
  • Robotics and Sensor-Based Localization
  • Face and Expression Recognition
  • COVID-19 Diagnosis Using AI

State Grid Shanxi Electric Power Company (China)
2025

Seoul National University
2003-2024

Pohang University of Science and Technology
2011-2020

Google (United States)
2017-2018

Korea Post
2011-2017

Ulsan National Institute of Science and Technology
2010

Princeton University
2008-2009

University of Maryland, College Park
2003-2007

University of California, Irvine
2007

Samsung (United States)
2006

We propose a novel semantic segmentation algorithm by learning a deep deconvolution network. We learn the network on top of the convolutional layers adopted from the VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixelwise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image, and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of existing methods based on fully convolutional networks by integrating deep deconvolution network and proposal-wise prediction; our method...

10.1109/iccv.2015.178 preprint EN 2015-12-01
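A minimal sketch of the unpooling-plus-deconvolution idea described in this abstract, assuming PyTorch. The single pool/unpool pair, layer widths, and class count are illustrative placeholders, not the paper's full VGG-16-based architecture.

```python
import torch
import torch.nn as nn

class TinyDeconvNet(nn.Module):
    """Toy encoder-decoder: max-pooling switch locations are reused by unpooling,
    and a transposed convolution (deconvolution) densifies the coarse map into
    per-pixel class scores. Shapes and depths are illustrative only."""

    def __init__(self, num_classes=21):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # keep switch locations
        self.unpool = nn.MaxUnpool2d(2, stride=2)                    # place activations back
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, num_classes, 1),                           # pixel-wise class scores
        )

    def forward(self, x):
        f = self.conv(x)
        p, idx = self.pool(f)
        u = self.unpool(p, idx, output_size=f.size())
        return self.deconv(u)

scores = TinyDeconvNet()(torch.randn(1, 3, 64, 64))   # -> (1, 21, 64, 64)
```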

We propose a novel visual tracking algorithm based on the representations from a discriminatively trained Convolutional Neural Network (CNN). Our algorithm pretrains a CNN using a large set of videos with tracking ground-truths to obtain a generic target representation. The network is composed of shared layers and multiple branches of domain-specific layers, where domains correspond to individual training sequences and each branch is responsible for binary classification to identify the target in each domain. We train the network with respect to each domain iteratively to obtain generic target representations in the shared layers. When tracking a target in a new...

10.1109/cvpr.2016.465 preprint EN 2016-06-01
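A hedged sketch of the shared-plus-domain-specific layout this abstract describes, assuming PyTorch; the backbone, feature sizes, and crop size are placeholders rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MultiDomainNet(nn.Module):
    """Shared convolutional layers followed by one binary (target vs. background)
    classification branch per training domain (sequence); sizes are illustrative."""

    def __init__(self, num_domains):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # one target/background classifier head per domain
        self.branches = nn.ModuleList(nn.Linear(32, 2) for _ in range(num_domains))

    def forward(self, x, domain):
        return self.branches[domain](self.shared(x))

net = MultiDomainNet(num_domains=5)
logits = net(torch.randn(8, 3, 96, 96), domain=2)   # (8, 2) target-vs-background scores
```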

We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature). The new feature is based on convolutional neural networks, which are trained only with image-level annotations on a landmark image dataset. To identify semantically useful local features for image retrieval, we also propose an attention mechanism for keypoint selection, which shares most network layers with the descriptor. This framework can be used for image retrieval as a drop-in replacement for other keypoint detectors and descriptors,...

10.1109/iccv.2017.374 preprint EN 2017-10-01
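A rough sketch of the attention-over-local-features mechanism, assuming PyTorch. The dense feature map stands in for CNN activations, and the attention scores are used to select and weight local descriptors; the module name, channel sizes, and top-k selection are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AttentiveLocalFeatures(nn.Module):
    """Scores every spatial location of a dense feature map and keeps the
    top-k locations as local descriptors; sizes are purely illustrative."""

    def __init__(self, in_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Conv2d(in_dim, 64, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1), nn.Softplus(),          # non-negative relevance score
        )

    def forward(self, feats, k=100):
        scores = self.attention(feats)                   # (B, 1, H, W)
        B, C, H, W = feats.shape
        flat_feats = feats.flatten(2)                    # (B, C, H*W)
        flat_scores = scores.flatten(2).squeeze(1)       # (B, H*W)
        topk = flat_scores.topk(k, dim=1).indices        # most relevant locations
        idx = topk.unsqueeze(1).expand(-1, C, -1)        # (B, C, k)
        return flat_feats.gather(2, idx), flat_scores.gather(1, topk)

feats = torch.randn(2, 128, 32, 32)
descriptors, weights = AttentiveLocalFeatures()(feats)  # (2, 128, 100), (2, 100)
```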

We propose a novel semantic segmentation algorithm by learning a deconvolution network. We learn the network on top of the convolutional layers adopted from the VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixel-wise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image, and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of existing methods based on fully convolutional networks by integrating deep deconvolution network and proposal-wise prediction; our...

10.48550/arxiv.1505.04366 preprint EN other-oa arXiv (Cornell University) 2015-01-01

The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT 2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of VOT2015 that go beyond its VOT2014 predecessor are: (i) a new dataset twice as large with full annotation of targets by rotated bounding boxes and...

10.1109/iccvw.2015.79 preprint EN 2015-12-01

We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks. Our algorithm learns from video-level class labels and predicts temporal intervals of human actions with no requirement of temporal localization annotations. We design our network to identify a sparse subset of key segments associated with target actions in a video using an attention module and fuse the key segments through adaptive temporal pooling. Our loss function is comprised of two terms that minimize the video-level action classification error and enforce the sparsity of the segment selection. At inference time,...

10.1109/cvpr.2018.00706 article EN 2018-06-01
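A simplified sketch of the attention-weighted temporal pooling and sparsity term described above, assuming PyTorch. The feature dimension, attention head, and loss weight are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAttentionPooling(nn.Module):
    """Per-segment attention weights pool segment features into a video-level
    representation for classification; an L1 term encourages sparse attention."""

    def __init__(self, feat_dim=1024, num_classes=20):
        super().__init__()
        self.attention = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, segments):                       # segments: (T, feat_dim)
        a = torch.sigmoid(self.attention(segments))    # (T, 1) segment relevance
        video_feat = (a * segments).sum(0) / a.sum().clamp(min=1e-6)
        return self.classifier(video_feat), a

model = SparseAttentionPooling()
logits, attn = model(torch.randn(400, 1024))           # 400 segments of one video
video_labels = torch.zeros(20); video_labels[3] = 1.0  # video-level labels only
loss = F.binary_cross_entropy_with_logits(logits, video_labels) + 1e-4 * attn.abs().sum()
```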

We propose a novel unsupervised domain adaptation framework based on domain-specific batch normalization in deep neural networks. We aim to adapt to both domains by specializing batch normalization layers in convolutional neural networks while allowing them to share all other model parameters, which is realized by a two-stage algorithm. In the first stage, we estimate pseudo-labels for the examples in the target domain using an external unsupervised domain adaptation algorithm (for example, MSTN or CPUA) integrating the proposed domain-specific batch normalization. The second stage learns the final models...

10.1109/cvpr.2019.00753 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
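A minimal sketch of the domain-specific batch normalization idea, assuming PyTorch: one BatchNorm2d per domain, with all other parameters shared. The wrapper class and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DomainSpecificBatchNorm(nn.Module):
    """Keeps one BatchNorm2d per domain; the surrounding convolution weights are
    shared while normalization statistics and affine parameters are not."""

    def __init__(self, num_features, num_domains=2):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(num_features) for _ in range(num_domains))

    def forward(self, x, domain):
        return self.bns[domain](x)

conv = nn.Conv2d(3, 16, 3, padding=1)                     # shared parameters
dsbn = DomainSpecificBatchNorm(16, num_domains=2)
source = dsbn(conv(torch.randn(4, 3, 32, 32)), domain=0)  # source-domain statistics
target = dsbn(conv(torch.randn(4, 3, 32, 32)), domain=1)  # target-domain statistics
```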

We tackle the image question answering (ImageQA) problem by learning a convolutional neural network (CNN) with a dynamic parameter layer whose weights are determined adaptively based on questions. For the adaptive parameter prediction, we employ a separate parameter prediction network, which consists of gated recurrent units (GRU) taking a question as its input and a fully-connected layer generating a set of candidate weights as its output. However, it is challenging to construct a parameter prediction network for the large number of parameters in the CNN. We reduce the complexity of this problem by incorporating...

10.1109/cvpr.2016.11 preprint EN 2016-06-01
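A hedged sketch of the dynamic parameter layer, assuming PyTorch: a GRU encodes the question and a fully-connected head emits the weights of a layer applied to image features. Vocabulary, dimensions, and the class name are assumptions, and the hashing trick the abstract alludes to for reducing the number of predicted parameters is omitted.

```python
import torch
import torch.nn as nn

class DynamicParameterLayer(nn.Module):
    """A GRU encodes the question; a fully-connected head maps the encoding to
    the weight matrix of a dynamic layer applied to image features."""

    def __init__(self, vocab=1000, emb=64, hidden=128, img_dim=256, out_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.weight_head = nn.Linear(hidden, img_dim * out_dim)
        self.img_dim, self.out_dim = img_dim, out_dim

    def forward(self, question_tokens, img_feat):
        _, h = self.gru(self.embed(question_tokens))              # h: (1, B, hidden)
        W = self.weight_head(h[-1]).view(-1, self.out_dim, self.img_dim)
        return torch.bmm(W, img_feat.unsqueeze(-1)).squeeze(-1)   # question-conditioned output

layer = DynamicParameterLayer()
out = layer(torch.randint(0, 1000, (2, 12)), torch.randn(2, 256))   # (2, 32)
```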

We propose an online visual tracking algorithm by learning a discriminative saliency map using a Convolutional Neural Network (CNN). Given a CNN pre-trained offline on a large-scale image repository, our algorithm takes outputs from hidden layers of the network as feature descriptors since they show excellent representation performance in various general visual recognition problems. The features are used to learn discriminative target appearance models using an online Support Vector Machine (SVM). In addition, we construct a target-specific...

10.48550/arxiv.1502.06796 preprint EN other-oa arXiv (Cornell University) 2015-01-01

We present an online visual tracking algorithm by managing multiple target appearance models in a tree structure. The proposed algorithm employs Convolutional Neural Networks (CNNs) to represent target appearances, where multiple CNNs collaborate to estimate target states and determine the desirable paths for online model updates in the tree. By maintaining multiple CNNs in diverse branches of the tree structure, it is convenient to deal with multi-modality in target appearances and preserve model reliability through smooth updates along the tree paths. Since the CNNs share all parameters in the convolutional layers,...

10.48550/arxiv.1608.07242 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Prevailing video frame interpolation techniques rely heavily on optical flow estimation, which requires additional model complexity and computational cost and is also susceptible to error propagation in challenging scenarios with large motion and heavy occlusion. To alleviate the limitation, we propose a simple but effective deep neural network for video frame interpolation, which is end-to-end trainable and free from a motion estimation component. Our algorithm employs a special feature reshaping operation, referred to as PixelShuffle, with channel...

10.1609/aaai.v34i07.6693 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03
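A toy sketch of the flow-free idea sketched in the abstract, assuming PyTorch: space-to-depth reshaping (PixelUnshuffle) of the two input frames, a squeeze-and-excitation style channel attention, and PixelShuffle back to image space. The block layout, widths, and scale factor are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class ChannelAttentionInterp(nn.Module):
    """PixelUnshuffle both frames, reweight channels with a lightweight attention
    branch, then mix and PixelShuffle back to produce an intermediate frame."""

    def __init__(self, scale=4):
        super().__init__()
        c = 2 * 3 * scale * scale                        # two RGB frames after space-to-depth
        self.down = nn.PixelUnshuffle(scale)
        self.attn = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(c, c // 4, 1), nn.ReLU(inplace=True),
                                  nn.Conv2d(c // 4, c, 1), nn.Sigmoid())
        self.mix = nn.Conv2d(c, 3 * scale * scale, 3, padding=1)
        self.up = nn.PixelShuffle(scale)

    def forward(self, frame0, frame1):
        x = torch.cat([self.down(frame0), self.down(frame1)], dim=1)
        x = x * self.attn(x)                             # channel-wise reweighting
        return self.up(self.mix(x))                      # intermediate frame, same resolution

mid = ChannelAttentionInterp()(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```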

We propose Quadruplet Convolutional Neural Networks (Quad-CNN) for multi-object tracking, which learn to associate object detections across frames using quadruplet losses. The proposed networks consider target appearances together with their temporal adjacencies for data association. Unlike conventional ranking losses, the quadruplet loss enforces an additional constraint that makes temporally adjacent detections more closely located than the ones with large temporal gaps. We also employ a multi-task loss to jointly learn association and bounding box...

10.1109/cvpr.2017.403 article EN 2017-07-01
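An illustrative quadruplet-style ranking loss, assuming PyTorch, showing the temporal-adjacency constraint the abstract describes: a temporally adjacent detection of the same target should lie closer to the anchor than one with a large temporal gap, and both closer than a different object. The margins and exact formulation are placeholders, not the paper's loss.

```python
import torch
import torch.nn.functional as F

def quadruplet_loss(anchor, adjacent, distant, negative, margin1=0.5, margin2=0.5):
    """Hinge terms ordering embedding distances: adjacent < distant < negative."""
    d_adj = F.pairwise_distance(anchor, adjacent)    # same target, adjacent frame
    d_far = F.pairwise_distance(anchor, distant)     # same target, large temporal gap
    d_neg = F.pairwise_distance(anchor, negative)    # different object
    return (F.relu(d_adj - d_far + margin1) + F.relu(d_far - d_neg + margin2)).mean()

emb = lambda: torch.randn(16, 128)                   # placeholder appearance embeddings
loss = quadruplet_loss(emb(), emb(), emb(), emb())
```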

This paper addresses the problem of text-to-video temporal grounding, which aims to identify the time interval in a video that is semantically relevant to a text query. We tackle this problem using a novel regression-based model that learns to extract a collection of mid-level features for the semantic phrases in a query, which correspond to important entities described in the query (e.g., actors, objects, and actions), and reflect bi-modal interactions between the linguistic and visual features at multiple levels. The proposed method effectively predicts the target interval by...

10.1109/cvpr42600.2020.01082 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

We propose a novel weakly-supervised semantic segmentation algorithm based on a Deep Convolutional Neural Network (DCNN). Contrary to existing approaches, our algorithm exploits auxiliary segmentation annotations available for different categories to guide segmentations on images with only image-level class labels. To make the segmentation knowledge transferrable across categories, we design a decoupled encoder-decoder architecture with an attention model. In this architecture, the attention model generates spatial highlights of each category presented in...

10.1109/cvpr.2016.349 preprint EN 2016-06-01

We propose an extremely simple but effective regularization technique for convolutional neural networks (CNNs), referred to as BranchOut, for online ensemble tracking. Our algorithm employs a CNN for target representation, which has common convolutional layers but multiple branches of fully connected layers. For better regularization, a subset of branches in the CNN is selected randomly for online learning whenever the target appearance models need to be updated. Each branch may have a different number of layers to maintain variable abstraction levels of target appearances....

10.1109/cvpr.2017.63 article EN 2017-07-01
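A minimal sketch of the random-branch-subset regularization, assuming PyTorch; the head structure, subset ratio, and averaging at test time are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn

class BranchOutHead(nn.Module):
    """Multiple fully connected branches on top of shared features; an online
    update trains only a random subset of branches, while inference averages
    the scores of all branches."""

    def __init__(self, feat_dim=512, num_branches=10):
        super().__init__()
        self.branches = nn.ModuleList(nn.Linear(feat_dim, 2) for _ in range(num_branches))

    def forward(self, feats, update=False, subset_ratio=0.5):
        if update:  # regularization: train only a random subset of branches
            chosen = random.sample(list(self.branches),
                                   max(1, int(len(self.branches) * subset_ratio)))
        else:
            chosen = list(self.branches)
        return torch.stack([b(feats) for b in chosen]).mean(0)

scores = BranchOutHead()(torch.randn(32, 512))   # ensemble target/background scores
```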

We present a novel class incremental learning approach based on deep neural networks, which continually learns new tasks with limited memory for storing examples from the previous tasks. Our algorithm is based on knowledge distillation and provides a principled way to maintain the representations of old models while adjusting to new tasks effectively. The proposed method estimates the relationship between the representation changes and the resulting loss increases incurred by model updates. It minimizes the upper bound of the loss increases using the representations...

10.1109/cvpr52688.2022.01560 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
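For context, a generic knowledge-distillation term of the kind the abstract builds on, assuming PyTorch: the updated model's predictions on old classes are kept close to the frozen previous model's predictions. This is the standard KD loss, not the paper's specific representation-level upper bound; the temperature is an assumption.

```python
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, T=2.0):
    """KL divergence between softened predictions of the current and old models."""
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * T * T

old = torch.randn(8, 10)   # frozen previous-task model outputs on replayed examples
new = torch.randn(8, 10)   # current model outputs on the same examples
loss = distillation_loss(new, old)
```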

We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation. The proposed approach minimizes potential reconstruction inconsistency that happens due to insufficient viewpoints by imposing an entropy constraint on the density in each ray. In addition, to alleviate the potential degenerate issue when all training images are acquired from almost redundant viewpoints, we further incorporate a spatial smoothness constraint into the estimated...

10.1109/cvpr52688.2022.01257 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
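A simplified version of the ray entropy regularizer described above, written in PyTorch under the assumption that per-sample densities along each ray are available; the normalization and weighting are illustrative.

```python
import torch

def ray_entropy(densities, eps=1e-8):
    """Entropy of the normalized density distribution along each ray; penalizing
    it encourages the mass to concentrate on few samples per ray."""
    p = densities / (densities.sum(dim=-1, keepdim=True) + eps)   # (rays, samples)
    return -(p * torch.log(p + eps)).sum(dim=-1)

sigmas = torch.rand(1024, 64)        # per-sample densities for 1024 rays (placeholder)
reg = ray_entropy(sigmas).mean()     # added to the reconstruction loss with a small weight
```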

Visual features are commonly modeled with probability density functions in computer vision problems, but current methods such as a mixture of Gaussians and kernel density estimation suffer from either a lack of flexibility, by fixing or limiting the number of Gaussian components in the mixture, or a large memory requirement, by maintaining a non-parametric representation of the density. These problems are aggravated in real-time applications since the density functions are required to be updated as new data becomes available. We present a novel density approximation technique...

10.1109/tpami.2007.70771 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2008-06-10

Background modeling and subtraction is a natural technique for object detection in videos captured by a static camera, and is also a critical preprocessing step in various high-level computer vision applications. However, there have not been many studies concerning useful features and binary segmentation algorithms for this problem. We propose a pixelwise background modeling and subtraction technique using multiple features, where generative and discriminative techniques are combined for classification. In our algorithm, color, gradient, and Haar-like...

10.1109/tpami.2011.243 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2011-12-13
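A toy sketch of the basic background modeling-and-subtraction loop, in NumPy. It uses a simple running-average background model and a fixed threshold as stand-ins; the paper's combination of color, gradient, and Haar-like features with learned classifiers is not reproduced here, and all parameter values are assumptions.

```python
import numpy as np

def update_background(model, frame, alpha=0.05, thresh=30.0):
    """Pixelwise background subtraction: compare the frame against the background
    model to get a binary foreground mask, then slowly adapt the model."""
    mask = np.abs(frame.astype(np.float32) - model).max(axis=-1) > thresh
    model = (1 - alpha) * model + alpha * frame          # slow adaptation to scene changes
    return model.astype(np.float32), mask

background = np.zeros((240, 320, 3), dtype=np.float32)
frame = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)
background, foreground = update_background(background, frame)   # foreground: (240, 320) bool
```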