Kihyuk Sohn

ORCID: 0000-0003-4303-8319
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis
  • Face recognition and analysis
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Face and Expression Recognition
  • Machine Learning and Data Classification
  • Biometric Identification and Security
  • Adversarial Robustness in Machine Learning
  • COVID-19 diagnosis using AI
  • Anomaly Detection Techniques and Applications
  • Advanced Vision and Imaging
  • Image Retrieval and Classification Techniques
  • Computer Graphics and Visualization Techniques
  • Digital Media Forensic Detection
  • Video Surveillance and Tracking Methods
  • Topic Modeling
  • Speech Recognition and Synthesis
  • Data-Driven Disease Surveillance
  • Human Pose and Action Recognition
  • Imbalanced Data Classification Techniques
  • Privacy-Preserving Technologies in Data
  • Human Motion and Animation
  • Video Analysis and Summarization

Google (United States)
2019-2024

Korea Advanced Institute of Science and Technology
2021

NEC (United States)
2017-2020

NEC (Japan)
2015-2020

University of Michigan
2011-2015

Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed...

10.48550/arxiv.2001.07685 preprint EN other-oa arXiv (Cornell University) 2020-01-01
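
As an illustration of the confidence-thresholded pseudo-labeling step described in the abstract, here is a minimal PyTorch-style sketch of the unlabeled-data loss. The model, the weak/strong augmentation pipeline, and the threshold value are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_images, strong_images, threshold=0.95):
    """Confidence-thresholded pseudo-labeling: the label comes from the weakly
    augmented view; only high-confidence labels contribute to the loss on the
    strongly augmented view."""
    with torch.no_grad():
        probs = F.softmax(model(weak_images), dim=-1)   # predictions on weak views
        conf, pseudo_labels = probs.max(dim=-1)         # confidence and hard labels
        mask = (conf >= threshold).float()              # keep only confident ones
    logits_strong = model(strong_images)                # predictions on strong views
    loss = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (loss * mask).mean()
```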

Convolutional neural network-based approaches for semantic segmentation rely on supervision with pixel-level ground truth, but may not generalize well to unseen image domains. As the labeling process is tedious and labor intensive, developing algorithms that can adapt source ground-truth labels to a target domain is of great interest. In this paper, we propose an adversarial learning method for domain adaptation in the context of semantic segmentation. Considering semantic segmentations as structured outputs that contain spatial similarities...

10.1109/cvpr.2018.00780 article EN 2018-06-01
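
A compact sketch of output-space adversarial adaptation in the spirit of the abstract above, assuming a segmentation network that produces per-pixel class scores and a small fully convolutional discriminator. The architecture and names below are illustrative guesses, not the paper's exact networks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_output_space_discriminator(num_classes):
    """Small fully convolutional discriminator over softmax segmentation maps
    (layer sizes are illustrative)."""
    return nn.Sequential(
        nn.Conv2d(num_classes, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 1, 4, stride=2, padding=1),  # per-patch source/target logit
    )

def adversarial_target_loss(seg_net, discriminator, target_images):
    """Push target-domain segmentation outputs toward the source output
    distribution by fooling the discriminator (1 = 'looks like source')."""
    target_probs = F.softmax(seg_net(target_images), dim=1)
    d_logits = discriminator(target_probs)
    return F.binary_cross_entropy_with_logits(d_logits, torch.ones_like(d_logits))
```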

We aim at constructing a high-performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on the learned representations. We learn representations by classifying normal data from CutPaste, a simple data augmentation strategy that cuts an image patch and pastes it at a random location of a large image. Our empirical study...

10.1109/cvpr46437.2021.00954 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
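
The cut-and-paste augmentation itself is simple enough to sketch in NumPy; the patch-size range below is an illustrative assumption rather than the paper's exact configuration.

```python
import numpy as np

def cutpaste(image, rng=None):
    """Cut a rectangular patch from a (presumably normal) image and paste it
    back at a random location, creating the synthetic irregularity used for
    the self-supervised classification task."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    ph = int(rng.integers(h // 8, h // 4 + 1))   # patch height (illustrative range)
    pw = int(rng.integers(w // 8, w // 4 + 1))   # patch width
    ys, xs = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)  # cut corner
    yd, xd = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)  # paste corner
    out = image.copy()
    out[yd:yd + ph, xd:xd + pw] = image[ys:ys + ph, xs:xs + pw]
    return out
```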

Despite recent advances in face recognition using deep learning, severe accuracy drops are observed for large pose variations in unconstrained environments. Learning pose-invariant features is one solution, but needs expensively labeled large-scale data and carefully designed feature learning algorithms. In this work, we focus on frontalizing faces in the wild under various head poses, including extreme profile views. We propose a novel 3D Morphable Model (3DMM) conditioned Face Frontalization...

10.1109/iccv.2017.430 article EN 2017-10-01

Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks. However, models trained on one data domain may not generalize well to other domains without annotations for model finetuning. To avoid the labor-intensive process of annotation, we develop a domain adaptation method to adapt the source data to the unlabeled target domain. We propose to learn discriminative feature representations of patches in the source domain by discovering multiple modes of patch-wise output...

10.1109/iccv.2019.00154 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Despite the large volume of face recognition datasets, there is a significant portion of subjects whose samples are insufficient and thus under-represented. Ignoring such a significant portion results in insufficient training data. Training with under-represented data leads to biased classifiers in conventionally-trained deep networks. In this paper, we propose a center-based feature transfer framework to augment the feature space of under-represented subjects from the regular subjects that have sufficiently diverse samples. A Gaussian prior of the variance is assumed across all subjects, and the variance from regular ones...

10.1109/cvpr.2019.00585 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

We improve the recently-proposed "MixMatch" semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly-augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns...

10.48550/arxiv.1911.09785 preprint EN other-oa arXiv (Cornell University) 2019-01-01
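
Distribution alignment, as described in the abstract, lends itself to a short sketch: scale each unlabeled prediction by the ratio of the labeled class marginal to a running average of model predictions and renormalize. The variable names and the running-average estimate are assumptions for illustration.

```python
import numpy as np

def distribution_alignment(pred_probs, labeled_marginal, running_pred_marginal, eps=1e-6):
    """Scale predictions by p(y) / E[q(y)] and renormalize so the marginal of
    unlabeled predictions moves toward the labeled class distribution."""
    aligned = pred_probs * (labeled_marginal + eps) / (running_pred_marginal + eps)
    return aligned / aligned.sum(axis=-1, keepdims=True)

# Tiny usage example with made-up numbers: a prediction skewed toward class 0
# is softened because the running prediction marginal already over-covers it.
q = np.array([[0.7, 0.2, 0.1]])
print(distribution_alignment(q, np.full(3, 1 / 3), np.array([0.5, 0.3, 0.2])))
```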

Semi-supervised learning (SSL) has a potential to improve the predictive performance of machine learning models using unlabeled data. Although there has been remarkable recent progress, the scope of demonstration in SSL has mainly been on image classification tasks. In this paper, we propose STAC, a simple yet effective SSL framework for visual object detection along with a data augmentation strategy. STAC deploys highly confident pseudo labels of localized objects from an unlabeled image and updates the model by enforcing consistency via strong...

10.48550/arxiv.2005.04757 preprint EN other-oa arXiv (Cornell University) 2020-01-01
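
A minimal sketch of the pseudo-label selection step for detection described above, assuming the detector returns (box, class, score) triples; the threshold value and data structures are illustrative, and the strong-augmentation consistency training is omitted.

```python
def select_detection_pseudo_labels(detections, score_threshold=0.9):
    """Keep only highly confident detections from an unlabeled image to serve
    as pseudo ground truth for consistency training."""
    return [(box, label) for box, label, score in detections if score >= score_threshold]

# Usage with made-up detections: only the confident box survives filtering.
dets = [((10, 20, 80, 120), "person", 0.97), ((5, 5, 40, 40), "dog", 0.42)]
print(select_detection_pseudo_labels(dets))
```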

Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision. In this paper, we aim to provide insight on the property of convolutional neural networks, as well as a generic method to improve the performance of CNN architectures. Specifically, we first examine existing CNN models and observe an intriguing property that the filters in the lower layers form pairs (i.e., filters with opposite phase). Inspired by our observation, we propose a novel, simple yet effective activation scheme called...

10.48550/arxiv.1603.05201 preprint EN other-oa arXiv (Cornell University) 2016-01-01
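
The concatenated activation the abstract alludes to (CReLU) is easy to state in code: keep both the positive and the negated phase of the pre-activation, doubling the channel dimension. This PyTorch-style helper is a sketch for illustration.

```python
import torch
import torch.nn.functional as F

def crelu(x, dim=1):
    """Concatenated ReLU: concatenate ReLU(x) and ReLU(-x) along the channel
    dimension, preserving both phases of the response."""
    return torch.cat([F.relu(x), F.relu(-x)], dim=dim)
```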

Object detection systems based on the deep convolutional neural network (CNN) have recently made ground-breaking advances on several object detection benchmarks. While the features learned by these high-capacity networks are discriminative for categorization, inaccurate localization is still a major source of error for detection. Building upon CNN architectures, we address the localization problem by 1) using a search algorithm based on Bayesian optimization that sequentially proposes candidate regions for an object bounding box, and 2) training with...

10.1109/cvpr.2015.7298621 article EN 2015-06-01

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been under studied. While existing semi-supervised learning (SSL) methods are known to perform poorly on minority classes, we find that they still generate high precision pseudo-labels on minority classes. By exploiting this property, in this work, we propose Class-Rebalancing Self-Training (CReST), a simple yet effective framework to improve existing SSL methods on class-imbalanced data. CReST iteratively retrains a baseline SSL model with the labeled set expanded by adding pseudo-labeled...

10.1109/cvpr46437.2021.01071 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
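
A sketch of the class-rebalanced selection that drives the self-training loop: rarer classes keep a larger fraction of their pseudo-labeled samples at each round. The keep-rate rule below (mirrored class-frequency order raised to a power alpha) is a simplified, illustrative version of such a schedule.

```python
import numpy as np

def class_rebalanced_keep_rates(class_counts, alpha=0.5):
    """Per-class keep rates for pseudo-labeled samples: the rarest class keeps
    the most, the most frequent class keeps the least (simplified sketch)."""
    counts = np.asarray(class_counts, dtype=float)
    order = np.argsort(-counts)            # class ids from most to least frequent
    mirrored = counts[order][::-1]         # pair each class with its "mirror" count
    rates_sorted = (mirrored / counts.max()) ** alpha
    rates = np.empty_like(rates_sorted)
    rates[order] = rates_sorted            # map rates back to original class ids
    return rates

# Usage with made-up counts: the frequent class gets the smallest keep rate.
print(class_rebalanced_keep_rates([1000, 100, 10]))
```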

Despite the remarkable progress in deep generative models, synthesizing high-resolution and temporally coherent videos still remains a challenge due to their high-dimensionality and complex temporal dynamics along with large spatial variations. Recent works on diffusion models have shown their potential to solve this challenge, yet they suffer from severe computation- and memory-inefficiency that limit scalability. To handle this issue, we propose a novel generative model for videos, coined projected latent video diffusion model (PVDM),...

10.1109/cvpr52729.2023.01770 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model. We introduce a 3D tokenizer to quantize a video into spatial-temporal visual tokens and propose an embedding method for masked video token modeling to facilitate multi-task learning. We conduct extensive experiments to demonstrate the quality, efficiency, and flexibility of MAGVIT. Our experiments show that (i) MAGVIT performs favorably against state-of-the-art approaches and establishes the best-published FVD on three video generation...

10.1109/cvpr52729.2023.01008 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
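
A tiny sketch of the masked-token modeling idea mentioned in the abstract: randomly replace a fraction of the quantized spatial-temporal token ids with a mask id and train a transformer to predict the originals at those positions. The mask ratio, ids, and function names are assumptions; the tokenizer and multi-task conditioning are omitted.

```python
import numpy as np

def mask_video_tokens(token_ids, mask_id, mask_ratio=0.5, rng=None):
    """Randomly corrupt quantized video token ids with a [MASK] id; return the
    corrupted sequence and the boolean mask marking positions to predict."""
    rng = rng or np.random.default_rng()
    token_ids = np.asarray(token_ids)
    mask = rng.random(token_ids.shape) < mask_ratio
    corrupted = np.where(mask, mask_id, token_ids)
    return corrupted, mask
```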

Conditional random fields (CRFs) provide powerful tools for building models to label image segments. They are particularly well-suited to modeling local interactions among adjacent regions (e.g., superpixels). However, CRFs are limited in dealing with complex, global (long-range) interactions between regions. Complementary to this, restricted Boltzmann machines (RBMs) can be used to model global shapes produced by segmentation models. In this work, we present a new model that uses the combined power of these two network types...

10.1109/cvpr.2013.263 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

Deep neural networks (DNNs) trained on large-scale datasets have recently achieved impressive improvements in face recognition. But a persistent challenge remains to develop methods capable of handling large pose variations that are relatively under-represented in training data. This paper presents a method for learning a feature representation that is invariant to pose, without requiring extensive pose coverage in training data. We first propose to generate non-frontal views from a single frontal face, in order to increase the diversity...

10.1109/iccv.2017.180 article EN 2017-10-01

Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or introduce unlabeled target data to adapt from the training data. Instead, we propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge. We firstly synthesize training data alongside some semantically meaningful variations, such as low resolution, occlusion and head pose...

10.1109/cvpr42600.2020.00685 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Informative image representations are important in achieving state-of-the-art performance in object recognition tasks. Among feature learning algorithms that are used to develop image representations, restricted Boltzmann machines (RBMs) have good expressive power and build effective representations. However, the difficulty of training RBMs has been a barrier to their wide use. To address this difficulty, we show connections between mixture models and RBMs and present an efficient training method that utilizes these connections...

10.1109/iccv.2011.6126554 article EN International Conference on Computer Vision 2011-11-01

Despite rapid advances in face recognition, there remains a clear gap between the performance of still image-based recognition and video-based recognition, due to the vast difference in visual quality between the two domains and the difficulty of curating diverse large-scale video datasets. This paper addresses both of those challenges, through an image to video feature-level domain adaptation approach, to learn discriminative video frame representations. The framework utilizes unlabeled video data to reduce the gap between the different domains while transferring knowledge from labeled still images...

10.1109/iccv.2017.630 article EN 2017-10-01