Jun-Yan Zhu

ORCID: 0000-0001-8504-3410
Research Areas
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Vision and Imaging
  • Computer Graphics and Visualization Techniques
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Processing Techniques
  • Image Retrieval and Classification Techniques
  • 3D Shape Modeling and Analysis
  • Digital Media Forensic Detection
  • Video Analysis and Summarization
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Cell Image Analysis Techniques
  • Anomaly Detection Techniques and Applications
  • Image Processing and 3D Reconstruction
  • Multimodal Machine Learning Applications
  • Tactile and Sensory Interactions
  • Image Enhancement Techniques
  • Topic Modeling
  • Visual Attention and Saliency Detection
  • Medical Image Segmentation Techniques
  • Machine Learning and Data Classification
  • Model Reduction and Neural Networks
  • Interactive and Immersive Displays
  • Artificial Intelligence in Games
  • AI in cancer detection

Carnegie Mellon University
2021-2025

Baylor University
2025

Central China Normal University
2024

Institute of Microelectronics
2020-2024

University of Chinese Academy of Sciences
2020-2024

Chinese Academy of Sciences
2020-2024

Queen Mary University of London
2023

Wuhan Institute of Technology
2022

Adobe Systems (United States)
2018-2021

Massachusetts Institute of Technology
2018-2020

We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since...

10.1109/cvpr.2017.632 article EN 2017-07-01
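
The combined objective described above can be sketched as a toy NumPy example (illustrative only, not the authors' code; `generator_loss` is a hypothetical helper). The generator minimizes an adversarial term plus a weighted L1 reconstruction term, L_G = -log D(x, G(x)) + lambda * ||y - G(x)||_1:

```python
import numpy as np

# Toy sketch of a conditional-GAN generator objective (illustrative):
# an adversarial term plus a weighted pixel-wise L1 term.

def generator_loss(d_fake, fake, target, lam=100.0):
    """d_fake: discriminator score in (0, 1) for the generated pair."""
    adv = -np.log(d_fake + 1e-8)         # non-saturating GAN term
    l1 = np.mean(np.abs(target - fake))  # pixel-wise reconstruction term
    return adv + lam * l1

# A perfect reconstruction that fools the discriminator gives ~zero loss.
loss = generator_loss(d_fake=1.0, fake=np.zeros((4, 4)), target=np.zeros((4, 4)))
```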

Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F and introduce a cycle consistency loss...

10.1109/iccv.2017.244 article EN 2017-10-01
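
The cycle consistency idea can be sketched as follows (a minimal illustrative version with stand-in translators; `cycle_loss` is a hypothetical helper name): translating X → Y → X and Y → X → Y should return the original image, penalized under the L1 norm:

```python
import numpy as np

# Minimal sketch of a cycle consistency loss (illustrative only).

def cycle_loss(x, y, G, F, lam=10.0):
    forward = np.mean(np.abs(F(G(x)) - x))   # x -> G(x) -> F(G(x)) ~ x
    backward = np.mean(np.abs(G(F(y)) - y))  # y -> F(y) -> G(F(y)) ~ y
    return lam * (forward + backward)

# With mutually inverse stand-in translators the loss vanishes.
G = lambda a: a + 1.0  # stand-in for the X -> Y generator
F = lambda a: a - 1.0  # stand-in for the Y -> X generator
loss = cycle_loss(np.ones((2, 2)), np.zeros((2, 2)), G, F)
```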

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048 × 1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual...

10.1109/cvpr.2018.00917 article EN 2018-06-01

We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the network, forcing the network to memorize the information throughout all layers. Instead, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned affine transformation. Experiments on several challenging datasets demonstrate the superiority of our method compared to existing...

10.1109/cvpr.2019.00244 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
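
The modulation described above can be sketched in a few lines of NumPy (a minimal illustrative version; in the paper the per-pixel gamma and beta maps are predicted from the semantic layout by small convolutional networks, which are omitted here):

```python
import numpy as np

# Sketch of spatially-adaptive normalization: per-channel normalization
# followed by a spatially-varying learned affine transformation.

def spade(x, gamma, beta, eps=1e-5):
    """x: (C, H, W) activations; gamma, beta: (C, H, W) modulation maps."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / np.sqrt(var + eps)  # per-channel normalization
    return gamma * x_norm + beta            # spatially-varying affine

x = np.random.randn(3, 8, 8)
out = spade(x, gamma=np.ones_like(x), beta=np.zeros_like(x))
```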

Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models applied in feature spaces discover domain invariant representations, but are difficult to visualize and sometimes fail to capture pixel-level and low-level domain shifts. Recent work has shown that generative adversarial networks combined with cycle-consistency constraints are surprisingly effective at mapping images between domains, even without the use of aligned image pairs. We propose a novel discriminatively-trained...

10.48550/arxiv.1711.03213 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \rightarrow X$...

10.48550/arxiv.1703.10593 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a \emph{distribution} of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent...

10.48550/arxiv.1711.11586 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once...

10.24963/ijcai.2018/543 article EN 2018-07-01

A commonly observed failure mode of Neural Radiance Field (NeRF) is fitting incorrect geometries when given an insufficient number of input views. One potential reason is that standard volumetric rendering does not enforce the constraint that most of a scene's geometry consist of empty space and opaque surfaces. We formalize the above assumption through DS-NeRF (Depth-supervised Neural Radiance Fields), a loss for learning radiance fields that takes advantage of readily-available depth supervision. We leverage the fact that current NeRF pipelines...

10.1109/cvpr52688.2022.01254 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
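
The depth supervision idea can be sketched as follows (helper names are hypothetical, not the released code): the expected ray termination depth, computed from the volume-rendering weights, is pulled toward a sparse depth estimate such as one recovered by structure-from-motion:

```python
import numpy as np

# Illustrative sketch of a depth supervision loss for a radiance field.

def expected_depth(weights, t_vals):
    """weights: (N,) rendering weights along a ray; t_vals: (N,) sample depths."""
    return np.sum(weights * t_vals) / (np.sum(weights) + 1e-8)

def depth_loss(weights, t_vals, sfm_depth):
    # Squared error between rendered depth and the sparse supervision.
    return (expected_depth(weights, t_vals) - sfm_depth) ** 2

# A ray whose weight mass sits at depth 2.0 matches supervision at 2.0.
w = np.array([0.0, 1.0, 0.0])
t = np.array([1.0, 2.0, 3.0])
loss = depth_loss(w, t, sfm_depth=2.0)
```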

We propose a deep learning approach for user-guided image colorization. The system directly maps a grayscale image, along with sparse, local user "hints" to an output colorization with a Convolutional Neural Network (CNN). Rather than using hand-defined rules, the network propagates user edits by fusing low-level cues along with high-level semantic information, learned from large-scale data. We train on a million images, with simulated user inputs. To guide the user towards efficient input selection, the system recommends likely colors based on the input and...

10.1145/3072959.3073703 article EN ACM Transactions on Graphics 2017-07-20

Deep neural networks excel at finding hierarchical representations that solve complex tasks over large data sets. How can we humans understand these learned representations? In this work, we present network dissection, an analytic framework to systematically identify the semantics of individual hidden units within image classification and image generation networks. First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts. We find evidence that the network has learned many object classes that play...

10.1073/pnas.1907375117 article EN Proceedings of the National Academy of Sciences 2020-09-01

Despite the recent success of GANs in synthesizing images conditioned on inputs such as a user sketch, text, or semantic labels, manipulating the high-level attributes of an existing natural photograph with GANs is challenging for two reasons. First, it is hard for GANs to precisely reproduce an input image. Second, after manipulation, the newly synthesized pixels often do not fit the original image. In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image. Our method can accurately reconstruct and...

10.1145/3306346.3323023 article EN ACM Transactions on Graphics 2019-07-12

Despite the success of Generative Adversarial Networks (GANs), mode collapse remains a serious issue during GAN training. To date, little work has focused on understanding and quantifying which modes have been dropped by a model. In this work, we visualize mode collapse at both the distribution level and the instance level. First, we deploy a semantic segmentation network to compare the distribution of segmented objects in generated images with that in the target training set. Differences in statistics reveal object classes that are omitted by a GAN. Second,...

10.1109/iccv.2019.00460 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

The performance of generative adversarial networks (GANs) heavily deteriorates given a limited amount of training data. This is mainly because the discriminator is memorizing the exact training set. To combat it, we propose Differentiable Augmentation (DiffAugment), a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples. Previous attempts to directly augment the training data manipulate the distribution of real images, yielding little benefit; DiffAugment...

10.48550/arxiv.2006.10738 preprint EN other-oa arXiv (Cornell University) 2020-01-01
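
The core idea above can be sketched as a toy example (illustrative only; `diff_augment` is a hypothetical helper, and a simple brightness shift stands in for the paper's color/translation/cutout policies): the same differentiable transform is applied to both real and fake batches before they reach the discriminator, so generator gradients still flow:

```python
import numpy as np

# Sketch of applying an identical differentiable augmentation to both
# the real and the generated batch.

def diff_augment(batch, brightness=0.2, seed=0):
    """batch: (N, C, H, W) images in [0, 1]."""
    rng = np.random.default_rng(seed)
    shift = rng.uniform(-brightness, brightness, size=(batch.shape[0], 1, 1, 1))
    return batch + shift  # differentiable w.r.t. `batch`

real = np.random.rand(4, 3, 8, 8)
fake = np.random.rand(4, 3, 8, 8)
# Sharing the seed applies the identical policy to both streams.
real_aug, fake_aug = diff_augment(real), diff_augment(fake)
```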

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful...

10.1109/cvpr52729.2023.00192 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Recent studies show that widely used deep neural networks (DNNs) are vulnerable to carefully crafted adversarial examples. Many advanced algorithms have been proposed to generate adversarial examples by leveraging the $\mathcal{L}_p$ distance for penalizing perturbations. Researchers have explored different defense methods to defend against such adversarial attacks. While the effectiveness of $\mathcal{L}_p$ distance as a metric of perceptual quality remains an active research area, in this paper we will instead focus on a different type of perturbation, namely spatial...

10.48550/arxiv.1801.02612 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Generative Adversarial Networks (GANs) have recently achieved impressive results for many real-world applications, and many GAN variants have emerged with improvements in sample quality and training stability. However, they have not been well visualized or understood. How does a GAN represent our visual world internally? What causes the artifacts in GAN results? How do architectural choices affect GAN learning? Answering such questions could enable us to develop new insights and better models. In this work, we present an analytic...

10.48550/arxiv.1811.10597 preprint EN other-oa arXiv (Cornell University) 2018-01-01

The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture to design generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL·E 2, autoregressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? We find...

10.1109/cvpr52729.2023.00976 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is...

10.48550/arxiv.1801.02610 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Large-scale text-to-image generative models have shown their remarkable ability to synthesize diverse, high-quality images. However, directly applying these models for real image editing remains challenging for two reasons. First, it is hard for users to craft a perfect text prompt depicting every visual detail in the input image. Second, while existing models can introduce desirable changes in certain regions, they often dramatically alter the input content and introduce unexpected changes in unwanted regions. In this work, we propose pix2pix-zero, an...

10.1145/3588432.3591513 article EN cc-by 2023-07-19