Richard Zhang

ORCID: 0000-0003-2507-4674
Research Areas
  • Computer Graphics and Visualization Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Vision and Imaging
  • Generative Adversarial Networks and Image Synthesis
  • Topological and Geometric Data Analysis
  • Medical Image Segmentation Techniques
  • Philosophy, Science, and History
  • Neural dynamics and brain function
  • Cinema and Media Studies
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Fusion Techniques
  • Multidisciplinary Warburg-centric Studies

Adobe Systems (United States)
2023-2024

Large-scale text-to-image generative models have shown their remarkable ability to synthesize diverse, high-quality images. However, directly applying these models to real image editing remains challenging for two reasons. First, it is hard for users to craft a perfect text prompt depicting every visual detail in the input image. Second, while existing models can introduce desirable changes in certain regions, they often dramatically alter the input content and introduce unexpected changes in unwanted regions. In this work, we propose pix2pix-zero, an...

10.1145/3588432.3591513 article EN cc-by 2023-07-19
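One ingredient of text-based editing methods in this vein is an edit direction in text-embedding space, computed as the difference between the mean embeddings of sentences describing the source and target concepts. A minimal sketch, using a hypothetical hash-seeded toy encoder in place of a real text encoder such as CLIP:

```python
import zlib

import numpy as np

def embed(sentence: str, dim: int = 64) -> np.ndarray:
    # Stand-in for a real text encoder (e.g., CLIP); a deterministic,
    # hash-seeded random unit vector per sentence. Illustrative only.
    rng = np.random.default_rng(zlib.crc32(sentence.encode()))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def edit_direction(source_sents, target_sents) -> np.ndarray:
    # Average embeddings within each group, subtract, and renormalize:
    # the unit vector pointing from the source concept to the target concept.
    src = np.mean([embed(s) for s in source_sents], axis=0)
    tgt = np.mean([embed(t) for t in target_sents], axis=0)
    d = tgt - src
    return d / np.linalg.norm(d)

direction = edit_direction(
    ["a photo of a cat", "a cat sitting on a sofa"],
    ["a photo of a dog", "a dog sitting on a sofa"],
)
print(direction.shape)  # (64,)
```

Averaging over several sentences per concept reduces sensitivity to any single prompt's wording; in a real pipeline the direction would steer the conditioning embedding during denoising.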

Model customization introduces new concepts to existing text-to-image models, enabling the generation of these concepts/objects in novel contexts. However, such methods lack accurate camera view control with respect to the object, and users must resort to prompt engineering (e.g., adding "top-view") to achieve coarse control. In this work, we introduce a new task – explicit control of the object viewpoint in diffusion models. This allows us to modify a custom object's properties and generate it in various background scenes via text...

10.1145/3680528.3687564 article EN cc-by 2024-12-03

Current video diffusion models achieve impressive generation quality but struggle in interactive applications due to bidirectional attention dependencies. The generation of a single frame requires the model to process the entire sequence, including the future. We address this limitation by adapting a pretrained bidirectional transformer to a causal transformer that generates frames on-the-fly. To further reduce latency, we extend distribution matching distillation (DMD) to videos, distilling a 50-step diffusion model into a 4-step generator. To enable stable and...

10.48550/arxiv.2412.07772 preprint EN arXiv (Cornell University) 2024-12-10
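The bidirectional-vs-causal distinction above can be made concrete with an attention mask: a causal video transformer lets tokens attend freely within their own frame but only to past frames across time, so a frame can be emitted without processing the future. A minimal sketch (function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def block_causal_mask(num_frames: int, tokens_per_frame: int) -> np.ndarray:
    # Boolean attention mask, True = "may attend". Tokens attend
    # bidirectionally within their own frame, causally across frames.
    n = num_frames * tokens_per_frame
    frame_idx = np.arange(n) // tokens_per_frame  # frame index of each token
    return frame_idx[:, None] >= frame_idx[None, :]

mask = block_causal_mask(num_frames=3, tokens_per_frame=2)
print(mask.astype(int))
# [[1 1 0 0 0 0]
#  [1 1 0 0 0 0]
#  [1 1 1 1 0 0]
#  [1 1 1 1 0 0]
#  [1 1 1 1 1 1]
#  [1 1 1 1 1 1]]
```

A fully bidirectional model would use an all-True mask instead, which is why generating any single frame there requires the entire sequence.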

With rapid advancements in virtual reality (VR) headsets, effectively measuring stereoscopic quality of experience (SQoE) has become essential for delivering immersive and comfortable 3D experiences. However, most existing stereo metrics focus on isolated aspects of the viewing experience, such as visual discomfort or image quality, and have traditionally faced data limitations. To address these gaps, we present SCOPE (Stereoscopic COntent Preference Evaluation), a new dataset comprised of real and synthetic images...

10.48550/arxiv.2412.21127 preprint EN arXiv (Cornell University) 2024-12-30