- Face recognition and analysis
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Human Motion and Animation
- Computer Graphics and Visualization Techniques
- Generative Adversarial Networks and Image Synthesis
- Human Pose and Action Recognition
- Advanced Image Processing Techniques
- Facial Rejuvenation and Surgery Techniques
- Hand Gesture Recognition Systems
- Gaze Tracking and Assistive Technology
- Advanced Image and Video Retrieval Techniques
- Optical measurement and interference techniques
- Speech and Audio Processing
- Retinal Imaging and Analysis
- Image Processing Techniques and Applications
- Digital Imaging in Medicine
- Image Enhancement Techniques
- Multimodal Machine Learning Applications
- Medical Imaging and Analysis
- Artificial Intelligence in Games
- Computational and Text Analysis Methods
- Biometric Identification and Security
- Glaucoma and retinal disorders
- Virtual Reality Applications and Impacts
Google (Switzerland)
2020-2024
Google (United States)
2020-2024
Weatherford College
2021
Walt Disney (United States)
2012-2020
Walt Disney (Switzerland)
2010-2020
Faculty of 1000 (United States)
2018
ETH Zurich
2010-2014
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research, and highlighting the broad range of current applications.
This paper describes a passive stereo system for capturing the 3D geometry of a face in a single shot under standard light sources. The system is low-cost and easy to deploy. Results are submillimeter accurate and commensurate with those from state-of-the-art systems based on active lighting, and the models meet the quality requirements of a demanding domain like the movie industry. Recovered models are shown for captures from both high-end cameras in a studio setting and from a consumer binocular-stereo camera, demonstrating scalability across a spectrum of camera...
The computer graphics and vision communities have dedicated long-standing efforts to building computerized tools for reconstructing, tracking, and analyzing human faces based on visual input. Over the past years rapid progress has been made, which led to novel and powerful algorithms that obtain impressive results even in the very challenging case of reconstruction from a single RGB or RGB-D camera. The range of applications is vast and steadily growing as these technologies are further improving in speed,...
We present a new technique for passive and markerless facial performance capture based on anchor frames. Our method starts with high resolution per-frame geometry acquisition using state-of-the-art stereo reconstruction, and proceeds to establish a single triangle mesh that is propagated through the entire performance. Leveraging the fact that facial performances often contain repetitive subsequences, we identify anchor frames as those which contain similar facial expressions to a manually chosen reference expression. Anchor frames are...
We present the first real-time high-fidelity facial capture method. The core idea is to enhance a global real-time face tracker, which provides a low-resolution face mesh, with local regressors that add in medium-scale details, such as expression wrinkles. Our main observation is that although wrinkles appear at different scales and at different locations on the face, they are locally very self-similar and their visual appearance is a direct consequence of their local shape. We therefore train local regressors from high-resolution capture data in order to predict the local geometry at runtime....
We present a new technique for passive and markerless facial performance capture based on anchor frames. Our method starts with high resolution per-frame geometry acquisition using state-of-the-art stereo reconstruction, and proceeds to establish a single triangle mesh that is propagated through the entire performance. Leveraging the fact that facial performances often contain repetitive subsequences, we identify anchor frames as those which contain similar facial expressions to a manually chosen reference expression. Anchor frames are automatically...
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner. Existing approaches rely on high-quality training sequences that contain captured human motions and the 3D scenes they interact with. However, such interaction data are costly, difficult to capture, and can hardly cover the full range of plausible human-scene interactions in complex environments. To address these challenges, we propose a reinforcement learning-based approach that enables...
We present a new anatomically-constrained local face model and fitting approach for tracking 3D faces from 2D motion data in very high quality. In contrast to traditional global face models, often built from a large set of blendshapes, we propose a local deformation model composed of many small subspaces spatially distributed over the face. Our local model offers far more flexibility and expressiveness than global blendshape models, even with a much smaller model size. This flexibility would typically come at the cost of reduced robustness, in particular during the under-constrained...
Even though the human eye is one of the central features of individual appearance, its shape has so far been mostly approximated in our community with gross simplifications. In this paper we demonstrate that there is a lot of individuality to every eye, a fact that common practices for 3D eye generation do not consider. To faithfully reproduce all the intricacies of the human eye we propose a novel capture system that is capable of accurately reconstructing all the visible parts of the eye: the white sclera, the transparent cornea and the non-rigidly deforming colored iris....
Although facial hair plays an important role in individual expression, facial-hair reconstruction is not addressed by current face-capture systems. Our research addresses this limitation with an algorithm that treats hair and skin surface capture together in a coupled fashion so that a high-quality representation of hair fibers as well as the underlying skin surface can be reconstructed. We propose a passive, camera-based system that is robust against arbitrary motion since all data is acquired within the time period of a single exposure. The algorithm detects...
In recent years, sophisticated image-based reconstruction methods for the human face have been developed. These methods capture highly detailed static and dynamic geometry of the whole face, or specific models of face regions, such as hair, eyes or eyelids. Unfortunately, image-based methods to capture the mouth cavity in general, and the teeth in particular, have received very little attention. The accurate rendering of teeth, however, is crucial for the realistic display of facial expressions, and currently high quality face animations resort to tooth row models created by tedious manual...
We present a method to acquire dynamic properties of facial skin appearance, including dynamic diffuse albedo encoding blood flow, dynamic specular intensity, and per-frame high resolution normal maps for a facial performance sequence. The method reconstructs these maps from a purely passive multi-camera setup, without the need for polarization or requiring temporally multiplexed illumination. Hence, it is very well suited for integration with existing passive systems for facial performance capture. To solve this seemingly underconstrained problem, we demonstrate that...
Facial landmark detection is a fundamental task for many consumer and high-end applications and is almost entirely solved by machine learning methods today. Existing datasets used to train such algorithms are primarily made up of only low resolution images, and current algorithms are limited to inputs of comparable quality and resolution as the training dataset. On the other hand, high resolution imagery is becoming increasingly more common as consumer cameras improve in quality every year. Therefore, there is a need for algorithms that can leverage the rich information available in high resolution imagery. Naively...
We propose a new light-weight face capture system capable of reconstructing both high-quality geometry and detailed appearance maps from a single exposure. Unlike currently employed appearance acquisition systems, the proposed technology does not require active illumination and hence can readily be integrated with passive photogrammetry solutions. These solutions are in widespread use for 3D scanning humans as they can be assembled from off-the-shelf hardware components, but lack the capability of estimating appearance. This...
Dubbing is a technique for translating video content from one language to another. However, state-of-the-art visual dubbing techniques directly copy facial expressions from source to target actors without considering identity-specific idiosyncrasies such as a unique type of smile. We present a style-preserving visual dubbing approach from single video inputs, which maintains the signature style of target actors when modifying facial expressions, including mouth motions, to match foreign languages. At the heart of our approach is the concept of motion style, in particular for facial expressions, i.e.,...
The increasing demand for 3D content in augmented and virtual reality has motivated the development of volumetric performance capture systems such as the Light Stage. Recent advances are pushing free viewpoint relightable videos of dynamic human performances closer to photorealistic quality. However, despite significant efforts, these sophisticated systems are limited by reconstruction and rendering algorithms which do not fully model complex 3D structures and higher order light transport effects such as global...
Text-to-image generative models often reflect the biases of the training data, leading to unequal representations of underrepresented groups. This study investigates inclusive text-to-image generative models that generate images based on human-written prompts and ensure the resulting images are uniformly distributed across attributes of interest. Unfortunately, directly expressing the desired attributes in the prompt often leads to sub-optimal results due to linguistic ambiguity or model misrepresentation. Hence, this paper proposes a drastically different...
Facial scanning has become ubiquitous in digital media, but so far most efforts have focused on reconstructing the skin. Eye reconstruction, on the other hand, has received only little attention, and the current state-of-the-art method is cumbersome for the actor, time-consuming, and requires carefully set up and calibrated hardware. These constraints currently make eye capture impractical for general use. We present the first approach for high-quality lightweight eye capture, which leverages a database of pre-captured eyes to guide...
Facial appearance capture is now firmly established within academic research and used extensively across various application domains, perhaps most prominently in the entertainment industry through the design of virtual characters in video games and films. While significant progress has occurred over the last two decades, no single survey currently exists that discusses the similarities, differences, and practical considerations of the available appearance capture techniques as applied to human faces. A central difficulty of facial...
We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and head poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism. To reduce over-smoothing and improve out-of-model expression synthesis, we propose to predict local features anchored on the 3DMM geometry. These features are driven by the 3DMM deformation and interpolated in 3D space to yield...
The physical properties of an object, such as mass, significantly affect how we manipulate it with our hands. Surprisingly, this aspect has so far been neglected in prior work on 3D motion synthesis. To improve the naturalness of the synthesized 3D hand-object motions, this work proposes MACS, the first MAss Conditioned 3D hand and object motion Synthesis approach. Our approach is based on cascaded diffusion models and generates interactions that plausibly adjust based on the object's mass and interaction type. MACS also accepts a manually drawn...