- Advanced Vision and Imaging
- Generative Adversarial Networks and Image Synthesis
- Face Recognition and Analysis
- Computer Graphics and Visualization Techniques
- Image Enhancement Techniques
- 3D Shape Modeling and Analysis
- Image Processing and 3D Reconstruction
- Industrial Vision Systems and Defect Detection
- Advanced Image Processing Techniques
- Advanced Optical Sensing Technologies
- Color Science and Applications
- Image and Signal Denoising Methods
- Gaze Tracking and Assistive Technology
- Advanced Image and Video Retrieval Techniques
- Advanced Image Fusion Techniques
- Human Pose and Action Recognition
- Optical Systems and Laser Technology
- Architecture and Computational Design
- Human Motion and Animation
- Remote-Sensing Image Classification
- Robotics and Sensor-Based Localization
- Virtual Reality Applications and Impacts
- Sparse and Compressive Sensing Techniques
Google (United States)
2019-2024
Max Planck Institute for Informatics
2016-2021
Max Planck Society
2018
Saarland University
2018
Indian Institute of Technology Bombay
2014
Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects. Existing approaches gain controllability of generative adversarial networks (GANs) via manually annotated training data or a prior 3D model, which often lack flexibility, precision, and generality. In this work, we study a powerful yet much less explored way of controlling GANs, that is, to "drag" any points of the image to precisely reach target points in a user-interactive manner, as...
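The core idea of drag-based editing is to optimize a latent code so that a generated handle point moves toward a user-specified target. A minimal sketch of that optimization loop, using a toy linear "generator" p(w) = A @ w + b in place of a GAN's feature maps (A, b, the target, and the step size are all illustrative, not from the paper):

```python
import numpy as np

# Toy stand-in for a generator: maps a 4-D latent w to a 2-D point.
A = np.array([[1.0, 0.0, 0.5, 0.0],
              [0.0, 1.0, 0.0, 0.5]])
b = np.array([0.5, -0.5])
w = np.zeros(4)                      # initial latent code
target = np.array([1.0, -2.0])      # user-specified drag target

for _ in range(300):
    p = A @ w + b                            # current handle-point position
    grad = 2.0 * A.T @ (p - target)          # gradient of ||p - target||^2 w.r.t. w
    w -= 0.05 * grad                         # latent update ("motion supervision")
```

After the loop, `A @ w + b` has converged to `target`; in the real method the same principle drives intermediate GAN features rather than an explicit point.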
We present the first end-to-end approach for real-time material estimation for general object shapes with uniform material that only requires a single color image as input. In addition to Lambertian surface properties, our approach fully automatically computes the specular albedo, material shininess, and a foreground segmentation. We tackle this challenging and ill-posed inverse rendering problem using recent advances in image-to-image translation techniques based on deep convolutional encoder-decoder architectures. The underlying...
We present a novel technique to relight images of human faces by learning a model of facial reflectance from a database of 4D reflectance field data of several subjects in a variety of expressions and viewpoints. Using our learned model, a face can be relit in arbitrary illumination environments using only two original images recorded under spherical color gradient illumination. The output of our deep network indicates that the two images contain the information needed to estimate the full reflectance field, including specular reflections and high-frequency details. While...
Intrinsic video decomposition refers to the fundamentally ambiguous task of separating a video stream into its constituent layers, in particular reflectance and shading layers. Such a decomposition is the basis for a variety of video manipulation applications, such as realistic recoloring or retexturing of objects. We present a novel variational approach to tackle this underconstrained inverse problem at real-time frame rates, which enables on-line processing of live video footage. The problem of finding the intrinsic decomposition is formulated as a mixed ℓ2-ℓp optimization...
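The decomposition model is I = R * S: each observed pixel is the product of reflectance and shading. A crude sketch of the idea (not the paper's mixed ℓ2-ℓp energy) is to work in the log domain, where the product becomes a sum, and to take the smooth component as shading and the residual as reflectance:

```python
import numpy as np

def box_blur(img, k=3):
    """Mean filter over a k x k window with edge padding."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    acc = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return acc / (k * k)

def intrinsic_decompose(I, k=3, eps=1e-6):
    """Split a grayscale image I into reflectance R and shading S with R * S ~= I."""
    log_I = np.log(I + eps)
    log_S = box_blur(log_I, k)      # smooth component -> shading estimate
    log_R = log_I - log_S           # residual -> reflectance estimate
    return np.exp(log_R), np.exp(log_S)

rng = np.random.default_rng(0)
I = rng.random((16, 16)) + 0.1      # toy positive image
R, S = intrinsic_decompose(I)
```

By construction the two layers multiply back to the input exactly; the real method replaces the blur prior with a regularized optimization that separates the layers far more faithfully.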
The increasing demand for 3D content in augmented and virtual reality has motivated the development of volumetric performance capture systems such as the Light Stage. Recent advances are pushing free-viewpoint relightable videos of dynamic human performances closer to photorealistic quality. However, despite significant efforts, these sophisticated systems are limited by reconstruction and rendering algorithms which do not fully model complex 3D structures and higher-order light transport effects such as global...
We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field for fine-grained control and photorealism. To reduce over-smoothing and improve out-of-model synthesis, we predict local features anchored on the 3DMM geometry. These features are driven by the 3DMM deformation and interpolated in 3D space to yield...
We propose VoLux-GAN, a generative framework to synthesize 3D-aware faces with convincing relighting. Our main contribution is a volumetric HDRI relighting method that can efficiently accumulate albedo, diffuse and specular lighting contributions along each 3D ray for any desired HDR environmental map. Additionally, we show the importance of supervising the image decomposition process using multiple discriminators. In particular, we propose a data augmentation technique that leverages recent advances in single...
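Accumulating contributions along a 3D ray follows the standard emission-absorption compositing used in NeRF-style volume rendering. A small sketch for one ray, with made-up densities and per-sample radiance (in the paper, albedo, diffuse, and specular channels would each be composited with the same weights):

```python
import numpy as np

def composite_along_ray(densities, radiance, deltas):
    """Emission-absorption compositing of samples along a single ray."""
    alphas = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # transmittance to each sample
    weights = trans * alphas                                         # final per-sample weights
    return weights @ radiance, weights

densities = np.array([0.1, 0.5, 2.0, 0.3])        # toy volume densities
radiance = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [1.0, 1.0, 1.0]])            # toy per-sample RGB
deltas = np.full(4, 0.25)                         # sample spacing along the ray
rgb, weights = composite_along_ray(densities, radiance, deltas)
```

The weights are non-negative and sum to at most one (the remainder is the light transmitted through the whole ray), which is what lets several decomposed channels be accumulated consistently.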
Deep generative models can synthesize photorealistic images of human faces with novel identities. However, a key challenge to the wide applicability of such techniques is to provide independent control over semantically meaningful parameters: appearance, head pose, face shape, and facial expressions. In this paper, we propose VariTex - to the best of our knowledge the first method that learns a variational latent feature space of neural face textures, which allows sampling of novel identities. We combine this generative model with a parametric face model and gain explicit control over head pose...
A unique challenge in creating high-quality animatable and relightable 3D avatars of people is modeling human eyes. The challenge of synthesizing eyes is multifold as it requires 1) appropriate representations for the various components of the eye and the periocular region for coherent viewpoint synthesis, capable of representing diffuse, refractive and highly reflective surfaces, 2) disentangling skin and eye appearance from environmental illumination such that it may be rendered under novel lighting conditions, and 3) capturing eyeball motion...
We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained representation serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in 3D space. User constraints, in the form of constant shading strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or...
We propose the first approach for the decomposition of a monocular color video into direct and indirect illumination components in real time. We retrieve, in separate layers, the contributions made to the scene appearance by the reflectance, the light sources, and the reflections from various coherent scene regions to one another. Existing techniques that invert global light transport require image capture under multiplexed controlled lighting, or only enable the decomposition of a single image at slow off-line frame rates. In contrast, our approach works on regular videos and produces...
NeRFs have enabled highly realistic synthesis of human faces including complex appearance and reflectance effects of hair and skin. These methods typically require a large number of multi-view input images, making the process hardware intensive and cumbersome, limiting applicability to unconstrained settings. We propose a novel volumetric face prior that enables ultra high-resolution novel view synthesis of subjects that are not part of the prior's training distribution. This model consists of an identity-conditioned NeRF, trained on...
Eye gaze and expressions are crucial non-verbal signals in face-to-face communication. Visual effects and telepresence demand significant improvements in personalized tracking, animation, and synthesis of the eye region to achieve true immersion. Morphable face models, in combination with coordinate-based neural volumetric representations, show promise in solving the difficult problem of reconstructing intricate geometry (eyelashes) and synthesizing photorealistic appearance variations (wrinkles, specularities)...
High-fidelity, photorealistic 3D capture of a human face is a long-standing problem in computer graphics – the complex material of skin, the intricate geometry of hair, and fine-scale textural details make it challenging. Traditional techniques rely on very large and expensive capture rigs to reconstruct explicit mesh and appearance maps, and are limited by the accuracy of hand-crafted reflectance models. More recent volumetric methods (e.g., NeRFs) have enabled view synthesis and sometimes relighting by learning an implicit...
In this paper, we propose an optimization-based method for simultaneous fusion and unsupervised segmentation of hyperspectral remote sensing images by exploiting redundancy in the data. The data set is visualized as a single image obtained by weighted addition of all spectral points at each pixel location of the set. The weights are optimized to improve those statistical characteristics of the fused image which invoke an enhanced response from the human observer. A piecewise-constant smoothness constraint is imposed on...
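The fusion step reduces a hyperspectral cube to one displayable image by a weighted sum over bands at every pixel. A minimal sketch with a simple per-band variance heuristic standing in for the paper's optimized weights (the heuristic and all names are illustrative):

```python
import numpy as np

def fuse_bands(cube):
    """cube: (H, W, B) hyperspectral cube -> (H, W) fused grayscale image."""
    band_var = cube.reshape(-1, cube.shape[-1]).var(axis=0)  # per-band variance
    weights = band_var / band_var.sum()                      # normalize to convex weights
    return cube @ weights, weights                           # weighted sum over bands

rng = np.random.default_rng(1)
cube = rng.random((8, 8, 16))        # toy 16-band cube
fused, weights = fuse_bands(cube)
```

Because the weights are non-negative and sum to one, the fused image is a convex combination of the bands and stays within the cube's value range; the paper instead optimizes the weights against perceptual statistics of the fused image.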
Traditional methods for constructing high-quality, personalized head avatars from monocular videos demand extensive face captures and training time, posing a significant challenge to scalability. This paper introduces a novel approach to create a high-quality head avatar utilizing only a single or a few images per user. We learn a generative model for 3D animatable photo-realistic head avatars from a multi-view dataset of expressions from 2407 subjects, and leverage it as a prior for creating personalized avatars from few-shot images. Different from previous 3D-aware face generative models,...
3D rendering of dynamic face captures is a challenging problem, and it demands improvements on several fronts: photorealism, efficiency, compatibility, and configurability. We present a novel representation that enables high-quality volumetric rendering of an actor's dynamic facial performances with minimal compute and memory footprint. It runs natively on commodity graphics soft- and hardware, and allows for a graceful trade-off between quality and efficiency. Our method utilizes recent advances in neural rendering, particularly...
High-resolution texture maps are essential to render photoreal digital humans for visual effects or to generate data for machine learning. The acquisition of high-resolution assets at scale is cumbersome, as it involves enrolling a large number of human subjects, using expensive multi-view camera setups, and significant manual artistic effort to align the textures. To alleviate these problems, we introduce GANtlitz (a play on the German noun Antlitz, meaning face), a generative model that can synthesize...
Figure 1: We present Lite2Relight, a method that can relight monocular portrait images given HDRI environment maps. Our method demonstrates strong generalization to in-the-wild images, maintains 3D-consistent pose synthesis of the subjects, and performs physically accurate relighting. Moreover, courtesy of our lightweight architecture, Lite2Relight can relight portraits captured by a live webcam at interactive rates. Image credits: Flickr.
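Relighting against an HDRI environment map reduces, for the diffuse term, to integrating clamped-cosine-weighted radiance over incoming light directions. A small Lambertian sketch with two directional lights standing in for the environment map (all names and values are illustrative, not the paper's pipeline):

```python
import numpy as np

def relight_diffuse(albedo, normals, light_dirs, light_rgb):
    """Lambertian shading: per-pixel albedo times summed clamped-cosine irradiance."""
    cos = np.clip(normals @ light_dirs.T, 0.0, None)   # (N, L) cosine terms, back-facing clamped
    irradiance = cos @ light_rgb                       # (N, 3) summed light contributions
    return albedo * irradiance

# Two directional lights standing in for an HDRI environment map.
light_dirs = np.array([[0.0, 0.0, 1.0],
                       [1.0, 0.0, 0.0]])
light_rgb = np.array([[1.0, 1.0, 1.0],
                      [0.5, 0.1, 0.1]])
normals = np.array([[0.0, 0.0, 1.0],
                    [-1.0, 0.0, 0.0]])   # second pixel faces away from both lights
albedo = np.full((2, 3), 0.8)
shaded = relight_diffuse(albedo, normals, light_dirs, light_rgb)
```

The first pixel receives the full white light (shaded value 0.8 per channel); the second faces away from both lights and stays black, which is exactly the behavior the clamped cosine encodes.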
Volumetric modeling and neural radiance field representations have revolutionized 3D face capture and photorealistic novel view synthesis. However, these methods often require hundreds of multi-view input images and are thus inapplicable to cases with less than a handful of inputs. We present a volumetric prior on human faces that allows for high-fidelity expressive novel view synthesis from as few as three views captured in the wild. Our key insight is that an implicit prior trained on synthetic data alone can generalize extremely...