NFDI4DS | UHH-SEMS - Publication Details

Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

OPENALEX - Publications

Yinhuai Wang Jiwen Yu Jian Zhang

Most existing Image Restoration (IR) models are task-specific and cannot be generalized to different degradation operators. In this work, we propose the Denoising Diffusion Null-Space Model (DDNM), a novel zero-shot framework for arbitrary linear IR problems, including but not limited to image super-resolution, colorization, inpainting, compressed sensing, and deblurring. DDNM only needs a pre-trained off-the-shelf diffusion model as the generative prior, without any extra training or network...

10.48550/arxiv.2212.00490 preprint EN other-oa arXiv (Cornell University) 2022-01-01

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

OPENALEX - Publications

Jiwen Yu Yinhuai Wang Chen Zhao Bernard Ghanem Jian Zhang

Recently, conditional diffusion models have gained popularity in numerous applications due to their exceptional generation ability. However, many existing methods are training-required. They need to train a time-dependent classifier or a condition-dependent score estimator, which increases the cost of constructing conditional diffusion models and is inconvenient to transfer across different conditions. Some current works aim to overcome this limitation by proposing training-free solutions, but most can only be applied to specific...

10.1109/iccv51070.2023.02118 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

GAN Prior Based Null-Space Learning for Consistent Super-resolution

OPENALEX - Publications

Yinhuai Wang Yujie Hu Jiwen Yu Jian Zhang

Consistency and realness have always been the two critical issues of image super-resolution. While realness has dramatically improved with the use of GAN prior, state-of-the-art methods still suffer from inconsistencies in local structures and colors (e.g., teeth and eyes). In this paper, we show that these inconsistencies can be analytically eliminated by learning only the null-space component while fixing the range-space part. Further, we design a pooling-based decomposition (PD), a universal range-null space decomposition for super-resolution tasks, which is...

10.1609/aaai.v37i3.25372 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

DEAR-GAN: Degradation-Aware Face Restoration With GAN Prior

OPENALEX - Publications

Yujie Hu Yinhuai Wang Jian Zhang

With the development of generative adversarial networks (GANs), recent face restoration (FR) methods often utilize pre-trained GAN models (i.e., StyleGAN2) as a prior to generate rich details. However, these methods usually struggle to balance realness and fidelity when facing various degradation levels. In this paper, we propose a novel DEgradation-Aware Restoration network with GAN prior, dubbed DEAR-GAN, for FR tasks by explicitly learning the degradation representations (DR) to adapt to degradation. Specifically, an...

10.1109/tcsvt.2023.3244786 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-02-16

Unlimited-Size Diffusion Restoration

OPENALEX - Publications

Yinhuai Wang Jiwen Yu Runyi Yu Jian Zhang

Recently, using diffusion models for zero-shot image restoration (IR) has become a new hot paradigm. This type of method only needs to use the pre-trained off-the-shelf diffusion models, without any finetuning, and can directly handle various IR tasks. The upper limit of the restoration performance depends on the pre-trained diffusion models, which are in rapid evolution. However, current methods only discuss how to deal with fixed-size images, but dealing with images of arbitrary sizes is very important for practical applications. This paper focuses on using those diffusion-based methods to handle images of any size...

10.1109/cvprw59228.2023.00123 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2023-06-01

DiffLLE: Diffusion-based Domain Calibration for Weak Supervised Low-light Image Enhancement

OPENALEX - Publications

Shuzhou Yang Xuanyu Zhang Yinhuai Wang Jiwen Yu Yuhan Wang and 1 more

10.1007/s11263-024-02292-4 article EN International Journal of Computer Vision 2024-11-27

Panini-Net: GAN Prior Based Degradation-Aware Feature Interpolation for Face Restoration

OPENALEX - Publications

Yinhuai Wang Yujie Hu Jian Zhang

Emerging high-quality face restoration (FR) methods often utilize pre-trained GAN models (i.e., StyleGAN2) as GAN Prior. However, these methods usually struggle to balance realness and fidelity when facing various degradation levels. Besides, there is still a noticeable visual quality gap compared with pre-trained GAN models. In this paper, we propose a novel GAN Prior based degradation-aware feature interpolation network, dubbed Panini-Net, for FR tasks by explicitly learning the abstract representations to distinguish...

10.1609/aaai.v36i3.20159 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Null-Space Diffusion Sampling for Zero-Shot Point Cloud Completion

OPENALEX - Publications

Xinhua Cheng Nan Zhang Jiwen Yu Yinhuai Wang Ge Li and 1 more

Point cloud completion aims at estimating the complete data of objects from degraded observations. Despite existing methods achieving impressive performances, they rely heavily on degraded-complete pairs for supervision. In this work, we propose a novel framework named Null-Space Diffusion Sampling (NSDS) to solve the point cloud completion task in a zero-shot manner. By leveraging a pre-trained diffusion model as an off-the-shelf generator, our sampling approach can generate the desired outputs with the guidance of the observed...

10.24963/ijcai.2023/69 article EN 2023-08-01

LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization

OPENALEX - Publications

Runyi Yu Zhennan Wang Yinhuai Wang Kehan Li Chang Liu and 3 more

Position information is critical for Vision Transformers (VTs) due to the permutation-invariance of self-attention operations. A typical way to introduce position information is adding the absolute Position Embedding (PE) to the patch embedding before entering VTs. However, this approach applies the same Layer Normalization (LN) to the token embedding and the PE, and delivers the same PE to each layer. This results in restricted and monotonic PE across layers, as the shared LN affine parameters are not dedicated to PE, and the PE cannot be adjusted on a per-layer basis. To overcome these...

10.1109/iccv51070.2023.00541 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

NeRFocus: Neural Radiance Field for 3D Synthetic Defocus

OPENALEX - Publications

Yinhuai Wang Shuzhou Yang Yujie Hu Jian Zhang

Neural radiance fields (NeRF) bring a new wave for 3D interactive experiences. However, as an important part of the immersive experiences, defocus effects have not been fully explored within NeRF. Some recent NeRF-based methods generate defocus effects in a post-process fashion by utilizing multiplane technology. Still, they are either time-consuming or memory-consuming. This paper proposes a novel thin-lens-imaging-based NeRF framework that can directly render various defocus effects, dubbed NeRFocus. Unlike pinhole,...

10.48550/arxiv.2203.05189 preprint EN other-oa arXiv (Cornell University) 2022-01-01

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

OPENALEX - Publications

Jiwen Yu Yinhuai Wang Chen Zhao Bernard Ghanem Jian Zhang

Recently, conditional diffusion models have gained popularity in numerous applications due to their exceptional generation ability. However, many existing methods are training-required. They need to train a time-dependent classifier or a condition-dependent score estimator, which increases the cost of constructing conditional diffusion models and is inconvenient to transfer across different conditions. Some current works aim to overcome this limitation by proposing training-free solutions, but most can only be applied to specific...

10.48550/arxiv.2303.09833 preprint EN other-oa arXiv (Cornell University) 2023-01-01

DiffLLE: Diffusion-guided Domain Calibration for Unsupervised Low-light Image Enhancement

OPENALEX - Publications

Shuzhou Yang Xuanyu Zhang Yinhuai Wang Jiwen Yu Yuhan Wang and 1 more

Existing unsupervised low-light image enhancement methods lack enough effectiveness and generalization in practical applications. We suppose this is because of the absence of explicit supervision and the inherent gap between real-world scenarios and the training data domain. In this paper, we develop Diffusion-based domain calibration to realize more robust and effective unsupervised Low-Light Enhancement, called DiffLLE. Since the diffusion model shows impressive denoising capability and has been trained on massive clean images, we adopt...

10.48550/arxiv.2308.09279 preprint EN other-oa arXiv (Cornell University) 2023-01-01

PhysHOI: Physics-Based Imitation of Dynamic Human-Object Interaction

OPENALEX - Publications

Yinhuai Wang Jing Lin Ailing Zeng Zhengyi Luo Jian Zhang and 1 more

Humans interact with objects all the time. Enabling a humanoid to learn human-object interaction (HOI) is a key step for future smart animation and intelligent robotics systems. However, recent progress in physics-based HOI requires carefully designed task-specific rewards, making the system unscalable and labor-intensive. This work focuses on dynamic HOI imitation: teaching humanoids dynamic interaction skills through imitating kinematic HOI demonstrations. It is quite challenging because of the complexity of the interaction between body parts and objects and the lack of dynamic HOI data. To...

10.48550/arxiv.2312.04393 preprint EN other-oa arXiv (Cornell University) 2023-01-01

SkillMimic: Learning Reusable Basketball Skills from Demonstrations

OPENALEX - Publications

Yinhuai Wang Qihan Zhao Runyi Yu Ailing Zeng Jing Lin and 8 more

Mastering basketball skills such as diverse layups and dribbling involves complex interactions with the ball and requires real-time adjustments. Traditional reinforcement learning methods for interaction skills rely on labor-intensive, manually designed rewards that do not generalize well across different skills. Inspired by how humans learn from demonstrations, we propose SkillMimic, a data-driven approach that mimics both human and ball motions to learn a wide variety of basketball skills. SkillMimic employs a unified configuration to learn diverse skills from human-ball...

10.48550/arxiv.2408.15270 preprint EN arXiv (Cornell University) 2024-08-12

Position Embedding Needs an Independent Layer Normalization

OPENALEX - Publications

Runyi Yu Zhennan Wang Yinhuai Wang Kehan Li Y. Zhao and 3 more

The Position Embedding (PE) is critical for Vision Transformers (VTs) due to the permutation-invariance of the self-attention operation. By analyzing the input and output of each encoder layer in VTs using reparameterization and visualization, we find that the default PE joining method (simply adding the PE and patch embedding together) applies the same affine transformation to the token embedding and the PE, which limits the expressiveness of PE and hence constrains the performance of VTs. To overcome this limitation, we propose a simple, effective, and robust method....

10.48550/arxiv.2212.05262 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Unlimited-Size Diffusion Restoration

OPENALEX - Publications

Yinhuai Wang Jiwen Yu Runyi Yu Jian Zhang

Recently, using diffusion models for zero-shot image restoration (IR) has become a new hot paradigm. This type of method only needs to use the pre-trained off-the-shelf diffusion models, without any finetuning, and can directly handle various IR tasks. The upper limit of the restoration performance depends on the pre-trained diffusion models, which are in rapid evolution. However, current methods only discuss how to deal with fixed-size images, but dealing with images of arbitrary sizes is very important for practical applications. This paper focuses on using those diffusion-based methods to handle images of any size...

10.48550/arxiv.2303.00354 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Panini-Net: GAN Prior Based Degradation-Aware Feature Interpolation for Face Restoration

OPENALEX - Publications

Yinhuai Wang Yujie Hu Jian Zhang

Emerging high-quality face restoration (FR) methods often utilize pre-trained GAN models (i.e., StyleGAN2) as GAN Prior. However, these methods usually struggle to balance realness and fidelity when facing various degradation levels. Besides, there is still a noticeable visual quality gap compared with pre-trained GAN models. In this paper, we propose a novel GAN Prior based degradation-aware feature interpolation network, dubbed Panini-Net, for FR tasks by explicitly learning the abstract representations...

10.48550/arxiv.2203.08444 preprint EN other-oa arXiv (Cornell University) 2022-01-01

GAN Prior based Null-Space Learning for Consistent Super-Resolution

OPENALEX - Publications

Yinhuai Wang Yujie Hu Jiwen Yu Jian Zhang

Consistency and realness have always been the two critical issues of image super-resolution. While realness has dramatically improved with the use of GAN prior, state-of-the-art methods still suffer from inconsistencies in local structures and colors (e.g., teeth and eyes). In this paper, we show that these inconsistencies can be analytically eliminated by learning only the null-space component while fixing the range-space part. Further, we design a pooling-based decomposition (PD), a universal range-null space decomposition for super-resolution tasks, which is...

10.48550/arxiv.2211.13524 preprint EN other-oa arXiv (Cornell University) 2022-01-01