Donglai Xiang

ORCID: 0000-0002-6487-1935
Research Areas
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • 3D Shape Modeling and Analysis
  • Computer Graphics and Visualization Techniques
  • Face recognition and analysis
  • Human Motion and Animation
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Image Processing Techniques and Applications
  • Advanced Image Processing Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Virtual Reality Applications and Impacts
  • Handwritten Text Recognition Techniques
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Anatomy and Medical Technology
  • Prosthetics and Rehabilitation Robotics
  • Energy Harvesting in Wireless Networks
  • Face and Expression Recognition
  • Advanced Memory and Neural Computing
  • Data Visualization and Analytics
  • Diabetic Foot Ulcer Assessment and Management
  • Radiation Effects in Electronics
  • Natural Language Processing Techniques
  • 3D Surveying and Cultural Heritage

Nvidia (United States)
2024

Carnegie Mellon University
2019-2024

Meta (Israel)
2021

Tsinghua University
2016-2017

National University of Singapore
2015-2016

We present the first method to capture the 3D total motion of a target person from monocular view input. Given an image or video, our method reconstructs the body, face, and fingers represented by a deformable mesh model. We use an efficient representation called Part Orientation Fields (POFs), which encode the orientations of all body parts in a common 2D space. POFs are predicted by a Fully Convolutional Network, along with joint confidence maps. To train the network, we collect a new human motion dataset capturing the diverse motion of 40 subjects in a multiview...

10.1109/cvpr.2019.01122 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
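
The Part Orientation Fields above can be pictured as storing a body part's 3D unit direction at every pixel the part covers in the 2D image. Below is a minimal NumPy sketch of that encoding, assuming a hypothetical per-part pixel mask and known parent/child joint positions; it is only an illustration of the representation, not the paper's network targets.

```python
import numpy as np

def encode_pof(part_mask: np.ndarray, joint_parent_3d: np.ndarray,
               joint_child_3d: np.ndarray) -> np.ndarray:
    """Encode one body part's 3D orientation into a 3-channel 2D map.

    part_mask:        (H, W) boolean mask of pixels covered by the part.
    joint_parent_3d:  (3,) 3D position of the part's parent joint.
    joint_child_3d:   (3,) 3D position of the part's child joint.
    Returns a (H, W, 3) map storing the part's unit orientation vector at
    every pixel inside the mask and zeros elsewhere.
    """
    direction = joint_child_3d - joint_parent_3d
    direction = direction / (np.linalg.norm(direction) + 1e-8)

    h, w = part_mask.shape
    pof = np.zeros((h, w, 3), dtype=np.float32)
    pof[part_mask] = direction  # broadcast the unit vector over the support
    return pof

# Toy usage: a small "limb" occupying the top-left corner of the image.
mask = np.zeros((64, 64), dtype=bool)
mask[:4, :4] = True
pof = encode_pof(mask, np.array([0.0, 0.0, 0.0]), np.array([1.0, 1.0, 1.0]))
print(pof.shape, pof[0, 0])  # (64, 64, 3), roughly [0.577, 0.577, 0.577]
```

In the actual method these maps are regressed by a Fully Convolutional Network together with joint confidence maps rather than constructed from known 3D joints.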

Semantic object parsing is a fundamental task for understanding objects in detail in the computer vision community, where incorporating multi-level contextual information is critical for achieving such fine-grained pixel-level recognition. Prior methods often leverage this information through post-processing of predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture to seamlessly incorporate short-distance and long-distance spatial dependencies into...

10.1109/cvpr.2016.347 article EN 2016-06-01

We present the first single-network approach for 2D whole-body pose estimation, which entails the simultaneous localization of body, face, hands, and feet keypoints. Due to the bottom-up formulation, our method maintains constant real-time performance regardless of the number of people in the image. The network is trained in a single stage using multi-task learning, through an improved architecture that can handle scale differences between body/foot and face/hand keypoints. Our approach considerably improves upon...

10.1109/iccv.2019.00708 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
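
The multi-task training described above can be illustrated as a weighted sum of per-group heatmap regression losses. The sketch below is a generic stand-in rather than the paper's exact loss; the group names, keypoint counts, and weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def whole_body_heatmap_loss(pred, target, weights=None):
    """Multi-task heatmap regression loss summed over keypoint groups.

    pred / target: dicts mapping a part group (e.g. "body", "foot", "face",
    "hand") to (B, K, H, W) confidence-map tensors. Optional weights can
    rebalance the groups, since face/hand keypoints occupy a much smaller
    image area than body keypoints.
    """
    weights = weights or {k: 1.0 for k in pred}
    return sum(weights[k] * F.mse_loss(pred[k], target[k]) for k in pred)

# Toy usage with random maps standing in for two of the network branches.
pred = {"body": torch.rand(2, 25, 64, 64, requires_grad=True),
        "hand": torch.rand(2, 42, 64, 64, requires_grad=True)}
target = {k: torch.rand_like(v) for k, v in pred.items()}
whole_body_heatmap_loss(pred, target).backward()
```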

The body pose of a person wearing a camera is of great interest for applications in augmented reality, healthcare, and robotics, yet much of the person's body is out of view for a typical wearable camera. We propose a learning-based approach to estimate the camera wearer's 3D body pose from egocentric video sequences. Our key insight is to leverage interactions with another person---whose pose we can directly observe---as a signal inherently linked to the pose of the first-person subject. We show that since interactions between individuals often induce a well-ordered series...

10.1109/cvpr42600.2020.00991 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

We present a method to capture temporally coherent dynamic clothing deformation from monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid space, we build statistical models for three types of clothing: T-shirt, short pants and long pants. A differentiable renderer is utilized to align the captured shapes to the input frames by minimizing the difference in both silhouette...

10.1109/3dv50981.2020.00042 article EN 2020 International Conference on 3D Vision (3DV) 2020-11-01
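
The silhouette term of a differentiable-rendering objective like the one described above can be sketched as a soft IoU between a rendered clothing mask and the observed segmentation. Here `pred_silhouette` is assumed to come from some differentiable renderer (not shown), and the loss form is an illustrative choice rather than the paper's exact formulation.

```python
import torch

def silhouette_loss(pred_silhouette: torch.Tensor,
                    target_mask: torch.Tensor) -> torch.Tensor:
    """Soft IoU-style loss between a rendered silhouette and an observed mask.

    pred_silhouette: (H, W) differentiable soft mask in [0, 1], produced by a
                     differentiable renderer from the current clothing mesh.
    target_mask:     (H, W) binary clothing segmentation of the input frame.
    """
    intersection = (pred_silhouette * target_mask).sum()
    union = (pred_silhouette + target_mask - pred_silhouette * target_mask).sum()
    return 1.0 - intersection / (union + 1e-6)

# Toy usage with random tensors standing in for the renderer output and mask.
pred = torch.rand(256, 256, requires_grad=True)
target = (torch.rand(256, 256) > 0.5).float()
loss = silhouette_loss(pred, target)
loss.backward()  # in the real pipeline, gradients reach the mesh through the renderer
print(float(loss))
```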

Despite recent progress in developing animatable full-body avatars, realistic modeling of clothing - one of the core aspects of human self-expression - remains an open challenge. State-of-the-art physical simulation methods can generate realistically behaving clothing geometry at interactive rates. Modeling photorealistic appearance, however, usually requires physically-based rendering, which is too expensive for interactive applications. On the other hand, data-driven deep appearance models are capable of efficiently producing...

10.1145/3550454.3555456 article EN ACM Transactions on Graphics 2022-11-30

We have recently seen great progress in building photorealistic animatable full-body codec avatars, but generating high-fidelity animation of clothing is still difficult. To address these difficulties, we propose a method to build an animatable clothed body avatar with an explicit representation of the clothing on the upper body from multi-view captured videos. We use a two-layer mesh representation to register each 3D scan separately with the body and clothing templates. In order to improve photometric correspondence across different frames, texture alignment is then...

10.1145/3478513.3480545 article EN ACM Transactions on Graphics 2021-12-01

We study the problem of single-image depth estimation for images in the wild. We collect human-annotated surface normals and use them to help train a neural network that directly predicts pixel-wise depth. We propose two novel loss functions for training with surface normal annotations. Experiments on NYU Depth, KITTI, and our own dataset demonstrate that our approach can significantly improve the quality of depth estimation.

10.1109/iccv.2017.173 article EN 2017-10-01
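
One plausible way to couple pixel-wise depth predictions to surface-normal annotations is to derive normals from the predicted depth by finite differences and penalize their angular disagreement with the annotations. The sketch below is a generic version of this idea, not necessarily either of the paper's two loss functions.

```python
import torch
import torch.nn.functional as F

def depth_to_normals(depth: torch.Tensor) -> torch.Tensor:
    """Approximate per-pixel surface normals from a (B, 1, H, W) depth map
    using finite differences, under a simplified image-plane parameterization."""
    dz_dx = depth[:, :, :, 1:] - depth[:, :, :, :-1]   # (B, 1, H, W-1)
    dz_dy = depth[:, :, 1:, :] - depth[:, :, :-1, :]   # (B, 1, H-1, W)
    dz_dx = F.pad(dz_dx, (0, 1, 0, 0))                 # pad back to (H, W)
    dz_dy = F.pad(dz_dy, (0, 0, 0, 1))
    # The normal of a surface z = d(x, y) is proportional to (-dz/dx, -dz/dy, 1).
    normals = torch.cat([-dz_dx, -dz_dy, torch.ones_like(depth)], dim=1)
    return F.normalize(normals, dim=1)

def normal_consistency_loss(pred_depth, gt_normals):
    """Penalize angular disagreement between normals derived from the
    predicted depth and annotated surface normals of shape (B, 3, H, W)."""
    pred_normals = depth_to_normals(pred_depth)
    cos = (pred_normals * F.normalize(gt_normals, dim=1)).sum(dim=1)
    return (1.0 - cos).mean()

# Toy usage with random tensors standing in for network output and annotations.
depth = torch.rand(2, 1, 32, 32, requires_grad=True)
normals = F.normalize(torch.rand(2, 3, 32, 32), dim=1)
normal_consistency_loss(depth, normals).backward()
```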

We propose a novel sparse constrained formulation and from it derive a real-time optimization method for 3D human pose and shape estimation. Our method, SCOPE (Sparse Constrained Optimization for Pose and shapE estimation), is orders of magnitude faster (avg. 4 ms convergence) than existing methods, while being mathematically equivalent to their dense unconstrained formulation under mild assumptions. We achieve this by exploiting the underlying sparsity and constraints of our formulation to efficiently compute the Gauss-Newton direction. We show...

10.1109/iccv48922.2021.01126 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
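
The central computational idea, exploiting sparsity when computing the Gauss-Newton direction, can be sketched with a generic sparse normal-equations solve. This is the textbook step, not SCOPE's specific constrained formulation; the damping term and toy Jacobian below are illustrative.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def gauss_newton_step(J: sp.csr_matrix, r: np.ndarray,
                      damping: float = 1e-6) -> np.ndarray:
    """One (damped) Gauss-Newton step for a least-squares problem
    min ||r(x)||^2, given the sparse Jacobian J = dr/dx at the current x.

    Keeping J sparse makes forming and solving the normal equations cheap
    even when the parameter vector (pose + shape) is large.
    """
    JtJ = (J.T @ J).tocsc()
    rhs = -J.T @ r
    H = JtJ + damping * sp.identity(JtJ.shape[0], format="csc")
    delta = spla.spsolve(H, rhs)
    return delta

# Toy usage: a random sparse Jacobian with ~1% non-zero entries.
rng = np.random.default_rng(0)
J = sp.random(500, 80, density=0.01, format="csr", random_state=0)
r = rng.standard_normal(500)
print(gauss_newton_step(J, r).shape)  # (80,)
```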

Semantic object parsing is a fundamental task for understanding objects in detail in the computer vision community, where incorporating multi-level contextual information is critical for achieving such fine-grained pixel-level recognition. Prior methods often leverage this information through post-processing of predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture to seamlessly incorporate short-distance and long-distance spatial dependencies into...

10.48550/arxiv.1511.04510 preprint EN other-oa arXiv (Cornell University) 2015-01-01

We propose a novel multi-view camera pipeline for the reconstruction and registration of dynamic clothing. Our proposed method relies on a specifically designed pattern that allows precise video tracking in each view. We triangulate the tracked points and register the cloth surface at fine-grained geometric resolution with low localization error. Compared to state-of-the-art methods, our registration exhibits stable correspondence, following the same points on the deforming surface along the temporal sequence. As an application, we demonstrate how its use greatly improves...

10.1145/3550454.3555448 article EN ACM Transactions on Graphics 2022-11-30
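
The triangulation stage can be illustrated with the standard linear (DLT) method for intersecting rays from calibrated views. The camera matrices and observations below are synthetic, and the pipeline's pattern tracking and registration steps are not shown.

```python
import numpy as np

def triangulate_point(projections, points_2d):
    """Linear (DLT) triangulation of one tracked point.

    projections: list of 3x4 camera projection matrices P_i.
    points_2d:   list of corresponding (x, y) observations, one per view.
    Returns the 3D point minimizing the algebraic reprojection error.
    """
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Toy usage: two views observing the point (0.1, 0.2, 3.0).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
X_true = np.array([0.1, 0.2, 3.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate_point([P1, P2], [x1, x2]))  # approximately [0.1, 0.2, 3.0]
```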

Clothing is an important part of human appearance but challenging to model in photorealistic avatars. In this work we present avatars with dynamically moving loose clothing that can be faithfully driven by sparse RGB-D inputs as well as body and face motion. We propose a Neural Iterative Closest Point (N-ICP) algorithm to efficiently track the coarse garment shape given depth input. Given the tracking results, the input images are then remapped to texel-aligned features, which are fed into the drivable avatar models...

10.1145/3610548.3618136 preprint EN cc-by 2023-12-10
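
For context, the classical rigid point-to-point ICP step that an ICP-style tracker builds on looks roughly as follows. This is the textbook baseline, not the paper's neural N-ICP, and the brute-force matching is used only for brevity.

```python
import numpy as np

def icp_step(source: np.ndarray, target: np.ndarray):
    """One rigid point-to-point ICP iteration (Kabsch/Procrustes alignment).

    source, target: (N, 3) and (M, 3) point clouds. Correspondences are taken
    by brute-force nearest neighbors; a KD-tree would be used in practice.
    Returns the rotation R and translation t aligning source to target.
    """
    # Nearest-neighbor correspondences.
    d2 = ((source[:, None, :] - target[None, :, :]) ** 2).sum(-1)
    matched = target[d2.argmin(axis=1)]

    # Closed-form rigid alignment of source to its matched points.
    mu_s, mu_m = source.mean(0), matched.mean(0)
    H = (source - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_m - R @ mu_s
    return R, t

# Toy usage: a full ICP loop would alternate icp_step and re-matching.
rng = np.random.default_rng(1)
src = rng.standard_normal((200, 3))
angle = 0.1
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
R, t = icp_step(src, src @ R_true.T + 0.05)
print(R.shape, np.round(np.linalg.det(R), 3))  # (3, 3), det = 1.0
```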

We present the first method to capture the 3D total motion of a target person from monocular view input. Given an image or video, our method reconstructs the body, face, and fingers represented by a deformable mesh model. We use an efficient representation called Part Orientation Fields (POFs), which encode the orientations of all body parts in a common 2D space. POFs are predicted by a Fully Convolutional Network (FCN), along with joint confidence maps. To train the network, we collect a new human motion dataset capturing the diverse motion of 40 subjects...

10.48550/arxiv.1812.01598 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Registering clothes from 4D scans with vertex-accurate correspondence is challenging, yet important for dynamic appearance modeling and physics parameter estimation from real-world data. However, previous methods either rely on texture information, which is not always reliable, or achieve only coarse-level alignment. In this work, we present a novel approach to enabling accurate surface registration of texture-less clothes with large deformation. Our key idea is to effectively leverage a shape prior learned from pre-captured...

10.1109/3dv62453.2024.00042 article EN 2024 International Conference on 3D Vision (3DV) 2024-03-18

Virtual telepresence is the future of online communication. Clothing is an essential part of a person's identity and self-expression. Yet, ground truth data of registered clothes is currently unavailable in the required resolution and accuracy for training models for realistic cloth animation. Here, we propose an end-to-end pipeline for building drivable representations of clothing. The core of our approach is a multi-view patterned cloth tracking algorithm capable of capturing deformations with high accuracy. We further rely on high-quality...

10.48550/arxiv.2206.03373 preprint EN cc-by arXiv (Cornell University) 2022-01-01

This paper investigates local discriminant training and global optimization methods for Convolutional Neural Networks (CNNs) to improve recognition accuracy. For training, we propose to combine a triplet loss with softmax cross-entropy as the loss function. The triplet loss is incorporated into an additional fully-connected layer before the final layer of a CNN model. For optimization, we use a Conditional Random Field (CRF) to further utilize the pairwise distances of feature vectors trained with the triplet loss. Experiments with different models on...

10.1109/icdar.2017.70 article EN 2017-11-01
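
Combining a triplet loss on an embedding layer with softmax cross-entropy on the classifier can be sketched as below. The batch-hard mining, margin, and weighting factor are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def combined_loss(embeddings, logits, labels, margin=0.2, weight=0.1):
    """Softmax cross-entropy on the classifier logits plus a triplet loss on
    the embedding layer, combined with a weighting factor.

    embeddings: (B, D) features from the fully-connected layer carrying the
                triplet loss.
    logits:     (B, C) outputs of the final classification layer.
    labels:     (B,) integer class labels.
    """
    ce = F.cross_entropy(logits, labels)

    # Batch-hard triplet mining: hardest positive and hardest negative per anchor.
    dist = torch.cdist(embeddings, embeddings)          # (B, B) pairwise L2
    same = labels[:, None] == labels[None, :]
    pos_dist = (dist * same.float()).max(dim=1).values  # farthest same-class sample
    neg_dist = dist.masked_fill(same, float("inf")).min(dim=1).values
    triplet = F.relu(pos_dist - neg_dist + margin).mean()

    return ce + weight * triplet

# Toy usage with random features standing in for a CNN's penultimate layer.
emb = torch.randn(16, 64, requires_grad=True)
logits = torch.randn(16, 10, requires_grad=True)
labels = torch.randint(0, 10, (16,))
combined_loss(emb, logits, labels).backward()
```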

Nonvolatile processors have manifested strong vitality in battery-less energy harvesting sensor nodes due to their characteristics of zero standby power, resilience to power failures, and fast read/write operations. However, I/O sensing operations cannot store system states after power-off, hence they are sensitive to the high switching overhead induced during power oscillation, which significantly degrades the performance. In this paper, we propose a novel performance-aware task scheduling technique...

10.1145/2897937.2898059 article EN 2016-05-25

Modeling and rendering photorealistic avatars is of crucial importance in many applications. Existing methods that build a 3D avatar from visual observations, however, struggle to reconstruct clothed humans. We introduce PhysAvatar, a novel framework that combines inverse rendering with physics to automatically estimate the shape and appearance of a human from multi-view video data, along with the physical parameters of the fabric of their clothes. For this purpose, we adopt a mesh-aligned 4D Gaussian technique for spatio-temporal mesh tracking...

10.48550/arxiv.2404.04421 preprint EN arXiv (Cornell University) 2024-04-05
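
The idea of recovering physical parameters by differentiating through a simulator can be illustrated with a toy one-dimensional damped spring standing in for a cloth simulator. This is only an analogy for the inverse-physics component, with made-up dynamics and constants.

```python
import torch

def simulate_spring(stiffness, steps=50, dt=0.01, mass=1.0, x0=1.0):
    """Tiny differentiable simulation of a damped spring, standing in for a
    cloth simulator: returns the trajectory of a single degree of freedom."""
    x = torch.tensor(x0)
    v = torch.tensor(0.0)
    traj = []
    for _ in range(steps):
        a = (-stiffness * x - 0.1 * v) / mass   # spring + damping force
        v = v + dt * a
        x = x + dt * v
        traj.append(x)
    return torch.stack(traj)

# "Observed" trajectory produced with a ground-truth stiffness of 4.0.
with torch.no_grad():
    observed = simulate_spring(torch.tensor(4.0))

# Estimate the stiffness by gradient descent through the simulator.
log_k = torch.tensor(0.0, requires_grad=True)   # optimize in log space
opt = torch.optim.Adam([log_k], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = ((simulate_spring(log_k.exp()) - observed) ** 2).mean()
    loss.backward()
    opt.step()
print(float(log_k.exp()))  # approaches the ground-truth value 4.0
```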

We introduce GenUSD, an end-to-end text-to-scene generation framework that transforms natural language queries into realistic 3D scenes, including objects and layouts. The process involves two main steps: 1) A Large Language Model (LLM) generates a scene layout hierarchically. It first proposes a high-level plan to decompose the scene into multiple functionally and spatially distinct subscenes. Then, for each subscene, the LLM specifies objects with detailed positions, poses, sizes, and descriptions. To manage complex object...

10.1145/3641520.3665306 article EN 2024-07-25
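
The hierarchical layout the LLM produces can be represented with a simple nested data structure. The schema below (subscenes containing object specifications) is a hypothetical illustration of the decomposition described above, not GenUSD's actual format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ObjectSpec:
    """One object proposed by the LLM for a subscene."""
    description: str          # e.g. "a wooden dining chair"
    position: List[float]     # [x, y, z] in subscene coordinates
    rotation_deg: float       # yaw around the up axis
    size: List[float]         # [width, depth, height] in meters

@dataclass
class SubScene:
    """A functionally and spatially distinct region of the full scene."""
    name: str                 # e.g. "dining area"
    origin: List[float]       # placement within the parent scene
    objects: List[ObjectSpec] = field(default_factory=list)

@dataclass
class SceneLayout:
    """Top level of the hierarchy: the LLM's high-level decomposition."""
    prompt: str
    subscenes: List[SubScene] = field(default_factory=list)

# Toy layout that a hierarchical LLM response might be parsed into.
layout = SceneLayout(
    prompt="a cozy studio apartment",
    subscenes=[
        SubScene(
            name="dining area",
            origin=[0.0, 0.0, 0.0],
            objects=[ObjectSpec("a small round table", [0.0, 0.0, 0.0], 0.0,
                                [0.9, 0.9, 0.75])],
        )
    ],
)
print(layout.subscenes[0].objects[0].description)
```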

We present a method to reconstruct time-consistent human body models from monocular videos, focusing on extremely loose clothing or handheld object interactions. Prior work in human reconstruction is either limited to tight clothing with no interactions, or requires calibrated multi-view captures or personalized template scans, which are costly to collect at scale. Our key insight for high-quality yet flexible reconstruction is the careful combination of generic priors about articulated body shape (learned from large-scale training data)...

10.48550/arxiv.2409.20563 preprint EN arXiv (Cornell University) 2024-09-30