- Human Pose and Action Recognition
- Human Motion and Animation
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Video Analysis and Summarization
- Hand Gesture Recognition Systems
- Computer Graphics and Visualization Techniques
- Robotics and Sensor-Based Localization
- Generative Adversarial Networks and Image Synthesis
- Video Surveillance and Tracking Methods
- Face Recognition and Analysis
- Robot Manipulation and Learning
- Music and Audio Processing
- Interactive and Immersive Displays
- Digital Holography and Microscopy
- Virtual Reality Applications and Impacts
- Augmented Reality Applications
- Music Technology and Sound Studies
- Advanced Optical Imaging Technologies
- Optical Measurement and Interference Techniques
- Stroke Rehabilitation and Recovery
- Advanced Neural Network Applications
- Image Processing Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- Artificial Intelligence in Games
Seattle University
2024
META Health
2022-2023
Meta (United States)
2016-2022
Meta (Israel)
2018-2021
Seoul National University
2020
Microsoft Research (United Kingdom)
2015
Microsoft (United States)
2013-2015
Microsoft Research Asia (China)
2013-2015
Waseda University
2015
Saitama University
2014
3D human pose estimation can facilitate various applications, such as assistive technologies and AR/VR. Most existing monocular 3D pose estimation approaches only focus on a single body part, neglecting the fact that the essential nuance of human motion is conveyed through a concert of subtle movements of the face, hands, and body. In this paper, we present FrankMocap, a fast and accurate whole-body 3D pose estimation system that can produce 3D face, hands, and body simultaneously from in-the-wild monocular images. The idea of FrankMocap is its modular design: We first run 3D pose regression methods for...
Motion capture technology generally requires that recordings be performed in a laboratory or closed stage setting with controlled lighting. This restriction precludes the capture of motions that require an outdoor setting or the traversal of large areas. In this paper, we present the theory and practice of using body-mounted cameras to reconstruct the motion of a subject. Outward-looking cameras are attached to the limbs of the subject, and the joint angles and root pose are estimated through non-linear optimization. The optimization objective function incorporates terms for...
We present a learning-based method for building driving-signal aware full-body avatars. Our model is a conditional variational autoencoder that can be animated with incomplete driving signals, such as human pose and facial keypoints, and produces a high-quality representation of human geometry and view-dependent appearance. The core intuition behind our method is that better drivability and generalization can be achieved by disentangling the driving signals and the remaining generative factors, which are not available during animation. To this end,...
Photorealistic avatars of human faces have come a long way in recent years, yet research in this area is limited by a lack of publicly available, high-quality datasets covering both dense multi-view camera captures and rich facial expressions of the captured subjects. In this work, we present Multiface, a new multi-view, high-resolution face dataset collected from 13 identities at Reality Labs Research for neural rendering. We introduce Mugsy, a large scale multi-camera apparatus to capture synchronized...
In computer graphics, considerable research has been conducted on realistic human motion synthesis. However, most research does not consider human emotional aspects, which often strongly affect human motion. This paper presents a new approach for synthesizing dance performance matched to input music, based on the emotional aspects of dance performance. Our method consists of motion analysis, music analysis, and motion synthesis based on the extracted features. In the analysis steps, motion and music feature vectors are acquired. Motion features are derived from motion rhythm and intensity, while musical...
In late 2006, Nintendo released a new game controller, the Wiimote, which included a three-axis accelerometer. Since then, a large variety of novel applications for these controllers have been developed by both independent and commercial developers. We add to this growing library with three performance interfaces that allow the user to control the motion of a dynamically simulated, animated character through the motion of his or her arms, wrists, or legs. For comparison, we also implement a traditional joystick/button interface....
Hand-drawn animation is a major art form and communication medium, but can be challenging to produce. We present a system to help people create frame-by-frame animations through manual sketches. We design our interface to be minimalistic: it contains only a canvas and a few controls. When users draw on the canvas, our system silently analyzes all past sketches and predicts what might be drawn in the future across spatial locations and temporal frames. The system also offers suggestions to beautify existing drawings. Our system can reduce workload and improve...
Many of the actions that we take with our hands involve self-contact and occlusion: shaking hands, making a fist, or interlacing our fingers while thinking. This use of our hands illustrates the importance of tracking hands through self-contact and occlusion for many applications in computer vision and graphics, but existing methods for tracking hands and faces are not designed to treat the extreme amounts of self-contact and self-occlusion exhibited by common hand gestures. By extending recent advances in vision-based tracking and physically based animation, we present the first algorithm capable...
Although the essential nuance of human motion is often conveyed as a combination of body movements and hand gestures, existing monocular motion capture approaches mostly focus either on body motion capture only, ignoring hand parts, or on hand motion capture only, without considering body motion. In this paper, we present FrankMocap, a motion capture system that can estimate both 3D hand and body motion from in-the-wild monocular inputs with faster speed (9.5 fps) and better accuracy than previous work. Our method works in near real-time and produces 3D body and hand motion capture outputs as a unified parametric model structure. Our method aims to simultaneously...
We present a 16.2-million frame (50-hour) multimodal dataset of two-person face-to-face spontaneous conversations. Our dataset features synchronized body and finger motion as well as audio data. To the best of our knowledge, it represents the largest motion capture and audio dataset of natural conversations to date. The statistical analysis verifies strong intraperson and interperson covariance of arm, hand, and speech features, potentially enabling new directions on data-driven social behavior analysis, prediction, and synthesis. As an illustration,...
Natural hand manipulations exhibit complex finger maneuvers adaptive to object shapes and the tasks at hand. Learning dexterous manipulation from data in a brute force way would require a prohibitive amount of examples to effectively cover the combinatorial space of 3D shapes and activities. In this paper, we propose a hand-object spatial representation that can achieve generalization from limited data. Our representation combines the global object shape as voxel occupancies with local geometric details as samples of closest distances. This representation is used by...
Despite recent progress in developing animatable full-body avatars, realistic modeling of clothing, one of the core aspects of human self-expression, remains an open challenge. State-of-the-art physical simulation methods can generate realistically behaving clothing geometry at interactive rates. Modeling photorealistic appearance, however, usually requires physically-based rendering, which is too expensive for interactive applications. On the other hand, data-driven deep appearance models are capable of efficiently producing...
HideOut is a mobile projector-based system that enables new applications and interaction techniques with tangible objects and surfaces. HideOut uses a device-mounted camera to detect hidden markers applied with infrared-absorbing ink. The obtrusive appearance of fiducial markers is avoided, and the hidden marker surface doubles as a functional projection surface. We present example applications that demonstrate a wide range of interaction scenarios, including media navigation tools, interactive storytelling applications, and mobile games. We explore the design space enabled by the HideOut system and describe...
We present the MotionBeam metaphor for character interaction with handheld projectors. Our work draws from the tradition of pre-cinema handheld projectors that use direct physical manipulation to control projected imagery. With our prototype system, users interact with and control projected characters by moving and gesturing with the handheld projector itself. This creates a unified interaction style where input and output are tied together within a single device. We introduce a set of principles and present applications that provide clear examples of the metaphor in use. Finally, we describe observations...
Existing methods for video completion typically rely on periodic color transitions, layer extraction, or temporally local motion. However, periodicity may be imperceptible or absent, layer extraction is difficult, and temporally local motion cannot handle large holes. This paper presents a new approach for video completion using motion field transfer to avoid such problems. Unlike prior methods, we fill in missing video parts by sampling spatio-temporal patches of local motion instead of directly sampling color. Once the local motion field has been computed within the missing parts of the video, color can then be propagated...
Improvements in data-capture and face modeling techniques have enabled us to create high-fidelity realistic face models. However, driving these realistic face models requires special input data, e.g., 3D meshes and unwrapped textures. Also, these face models expect clean input data taken under controlled lab environments, which is very different from data collected in the wild. All these constraints make it challenging to use the high-fidelity models in tracking for commodity cameras. In this paper, we propose a self-supervised domain adaptation approach to enable the animation of high-fidelity face models from a commodity camera....
Currently, many important intangible cultural properties of the world are being lost because of the lack of successive performers. Digital archiving technology is one of the effective solutions for this issue, and we have started our digital archiving project including these intangible ones. For human motion archives, a method of automatic motion structure analysis is vital for a variety of purposes. We believe that dance motion consists of "primitive motions", and that it is necessary to detect these components. Particularly for dance motions, we think the primitives must be synchronized to the musical...
Intuitive and efficient retrieval of motion capture data is essential for effective use of motion capture databases. In this paper, we describe a system that allows the user to retrieve a particular sequence by performing an approximation of the motion with an instrumented puppet. This interface is intuitive because both adults and children have experience playacting with puppets and toys to express particular behaviors or to tell stories with style and emotion. The puppet has 17 degrees of freedom and can therefore represent a variety of motions. We develop a novel similarity...
In this paper, we present an incremental learning framework for efficient and accurate facial performance tracking. Our approach is to alternate the modeling step, which takes tracked meshes and texture maps to train our deep learning-based statistical model, and the tracking step, which takes the predictions of geometry and texture that our model infers from measured images and optimizes the predicted geometry by minimizing image, geometry, and facial landmark errors. Our Geo-Tex VAE model extends the convolutional variational autoencoder for face tracking, and jointly learns and represents deformations...