- 3D Shape Modeling and Analysis
- Advanced Vision and Imaging
- Human Pose and Action Recognition
- Human Motion and Animation
- Generative Adversarial Networks and Image Synthesis
- Computer Graphics and Visualization Techniques
- Video Surveillance and Tracking Methods
- Face recognition and analysis
- Advanced Image Fusion Techniques
- Anomaly Detection Techniques and Applications
- Medical Image Segmentation Techniques
- Image and Signal Denoising Methods
- Infrared Thermography in Medicine
- Cell Image Analysis Techniques
- Sports Performance and Training
- Anatomy and Medical Technology
- Color Science and Applications
- Advanced Numerical Analysis Techniques
- Image Enhancement Techniques
- Social Robot Interaction and HRI
- Remote Sensing and LiDAR Applications
- Building Energy and Comfort Optimization
- Thermoregulation and physiological responses
Google (United States)
2021-2024
Weatherford College
2021
Max Planck Society
2018-2020
Technische Universität Braunschweig
2017-2020
Max Planck Institute for Informatics
2018-2020
Media Design School
2017
Aalborg University
2016
While many works focus on 3D reconstruction from images, in this paper, we focus on 3D shape reconstruction and completion from a variety of 3D inputs, which are deficient in some respect: low and high resolution voxels, sparse and dense point clouds, complete or incomplete. Processing such inputs is an increasingly important problem as they are the output of 3D scanners, which are becoming more accessible, and the intermediate output of 3D computer vision algorithms. Recently, learned implicit functions have shown great promise as they produce continuous reconstructions. However,...
This paper describes a method to obtain accurate 3D body models and texture of arbitrary people from a single, monocular video in which a person is moving. Based on a parametric body model, we present a robust processing pipeline to infer 3D model shapes including clothed people with 4.5mm reconstruction accuracy. At the core of our approach is the transformation of dynamic body pose into a canonical frame of reference. Our main contribution is a method to transform the silhouette cones corresponding to dynamic human silhouettes to obtain a visual hull in a common reference frame....
We present Octopus, a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, with a reconstruction accuracy of 4 to 5mm, while being orders of magnitude faster than previous methods. From semantic segmentation images, our Octopus model reconstructs a 3D shape, including the parameters of SMPL plus clothing and hair, in 10 seconds or less. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose...
We present a simple yet effective method to infer detailed full human body shape from only a single photograph. Our model can infer full-body shape including face, hair, and clothing including wrinkles at interactive frame-rates. Results feature details even on parts that are occluded in the input image. Our main idea is to turn shape regression into an aligned image-to-image translation problem. The input to our method is a partial texture map of the visible region obtained from off-the-shelf methods. From the partial texture, we estimate normal and vector displacement...
We present a novel method for high detail-preserving human avatar creation from monocular video. A parameterized body model is refined and optimized to maximally resemble subjects from a video showing them from all sides. Our avatars feature a natural face, hairstyle, clothes with garment wrinkles, and high-resolution texture. The paper contributes facial landmark and shading-based shape refinement, a semantic texture prior, and a texture stitching strategy, resulting in the most sophisticated-looking avatars obtained from a single video to date. Numerous...
We present PHORHUM, a novel, end-to-end trainable, deep neural network methodology for photorealistic 3D human reconstruction given just a monocular RGB image. Our pixel-aligned method estimates detailed 3D geometry and, for the first time, the unshaded surface color together with the scene illumination. Observing that 3D supervision alone is not sufficient for high fidelity color reconstruction, we introduce patch-based rendering losses that enable reliable color reconstruction on visible parts of the human, and plausible color estimation for the non-visible...
In this paper, we present a simple yet effective method to automatically transfer textures of clothing images (front and back) to 3D garments worn on top of SMPL, in real time. We first compute training pairs of images with aligned 3D garments using a custom non-rigid 3D to 2D registration method, which is accurate but slow. Using these pairs, we learn a mapping from pixels to the 3D garment surface. Our idea is to learn dense correspondences from garment image silhouettes to a 2D-UV map of a 3D garment surface using shape information alone, completely ignoring texture, which allows us...
We present imGHUM, the first holistic generative model of 3D human shape and articulated pose, represented as a signed distance function. In contrast to prior work, we model the full human body implicitly as a function zero-level-set and without the use of an explicit template mesh. We propose a novel network architecture and a learning paradigm, which make it possible to learn a detailed implicit generative model of human pose, shape, and semantics, on par with state-of-the-art mesh-based models. Our model features desired detail for human models, such as articulated pose including hand motion...
We present neural radiance fields for rendering and temporal (4D) reconstruction of humans in motion (H-NeRF), as captured by a sparse set of cameras or even from a monocular video. Our approach combines ideas from neural scene representation, novel-view synthesis, and implicit statistical geometric human representations, coupled using novel loss functions. Instead of learning a radiance field with a uniform occupancy prior, we constrain it by a structured implicit human body model, represented using signed distance functions. This allows us to robustly fuse...
We present a new solution to egocentric 3D body pose estimation from monocular images captured from a downward looking fish-eye camera installed on the rim of a head mounted virtual reality device. This unusual viewpoint leads to images with unique visual appearance, characterized by severe self-occlusions and strong perspective distortions that result in a drastic difference in resolution between lower and upper body. We propose an encoder-decoder architecture with a novel multi-branch decoder designed specifically to account for...
We present DreamHuman, a method to generate realistic animatable 3D human avatar models solely from textual descriptions. Recent text-to-3D methods have made considerable strides in generation, but are still lacking in important aspects. Control and often spatial resolution remain limited, existing methods produce fixed rather than animated 3D human models, and anthropometric consistency for complex structures like people remains a challenge. DreamHuman connects large text-to-image synthesis models, neural radiance fields,...
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points have associated semantics and can move freely in 3D space. This allows for optimal coverage of the person of interest, beyond just the body shape, which, in turn, additionally helps modeling accessories, hair, and loose clothing. Owing to this, we present a complete 3D transformer-based attention framework which,...
In order to enable robust 24-h monitoring of traffic under changing environmental conditions, it is beneficial to observe the traffic scene using several sensors, preferably from different modalities. To fully benefit from multi-modal sensor output, however, one must fuse the data. This paper introduces a new approach for fusing color RGB and thermal video streams by using not only the information from the videos themselves, but also the available contextual information of a scene. The contextual information is used to judge the quality of a particular modality and guides the fusion of two parallel...
Noninvasive imaging of oxygen uptake may provide a useful tool for the quantification of energy expenditure during human locomotion. A novel thermal imaging method (optical flow) was validated against indirect calorimetry for the estimation of energy expenditure during walking and running. Fourteen endurance-trained subjects completed a discontinuous incremental exercise test on a treadmill. Subjects performed 4-min intervals at 3, 5, and 7 km·h⁻¹ (walking) and at 8, 10, 12, 14, 16, and 18 km·h⁻¹ (running) with 30 s of rest between intervals. Heart rate, gas exchange,...
We present PhoMoH ['foʊ.moʊ'], a neural network methodology to construct generative models of photorealistic 3D geometry and appearance of human heads including hair, beards, an oral cavity, and clothing. In contrast to prior work, we model the head using neural fields, thus supporting complex topology. Instead of learning a head model from scratch, we propose to augment an existing expressive head model with new features. Concretely, we learn a highly detailed geometry network layered on top of a mid-resolution head model together with a detailed, local geometry-aware, and disentangled...
The demand for automatically gathered data is a societal trend quickly extending to all aspects of human life. Knowledge on the utilization of public facilities is of interest for optimising the use and cutting the expenses for the owners. Manual observations are both cumbersome and expensive, and they have a risk of incorrect results due to subjective opinions or lack of interest in the given task. In this paper we present the main findings of a 5-year long research project revolving around the real-world application of automatic analysis of activities in sports arenas. Three...
Score Distillation Sampling (SDS) is a recent but already widely popular method that relies on an image diffusion model to control optimization problems using text prompts. In this paper, we conduct an in-depth analysis of the SDS loss function, identify an inherent problem with its formulation, and propose a surprisingly easy but effective fix. Specifically, we decompose the loss into different factors and isolate the component responsible for noisy gradients. In the original formulation, high text guidance is used to account for the noise, leading to unwanted...