- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Robotics and Sensor-Based Localization
- Computer Graphics and Visualization Techniques
- Augmented Reality Applications
- 3D Surveying and Cultural Heritage
- Face Recognition and Analysis
- Remote Sensing and LiDAR Applications
- Human Motion and Animation
- Advanced Image and Video Retrieval Techniques
- Human Pose and Action Recognition
- Virtual Reality Applications and Impacts
- Interactive and Immersive Displays
- Video Surveillance and Tracking Methods
- Advanced Image Processing Techniques
- Generative Adversarial Networks and Image Synthesis
- Advanced Neural Network Applications
- Spatial Cognition and Navigation
- Hand Gesture Recognition Systems
- Speech and Audio Processing
- Data Management and Algorithms
- Data Visualization and Analytics
- Image Retrieval and Classification Techniques
- Optical Measurement and Interference Techniques
- Image Processing and 3D Reconstruction
University of Southern California
2015-2024
Southern California University for Professional Studies
2011-2023
Creative Technologies (United States)
2006-2023
Adobe Systems (United States)
2019
University of Kassel
2013
LAC+USC Medical Center
2005-2008
IEEE Computer Society
2006
North Carolina State University
2002
Southern States University
1998-2002
University of North Carolina at Chapel Hill
1992-1994
The interaction between human beings and computers will be more natural if computers are able to perceive and respond to human non-verbal communication such as emotions. Although several approaches have been proposed to recognize human emotions based on facial expressions or speech, relatively limited work has been done to fuse these two, and other, modalities to improve the accuracy and robustness of the emotion recognition system. This paper analyzes the strengths and limitations of systems based only on facial expressions or acoustic information. It also discusses two approaches used...
We introduce the Similarity Group Proposal Network (SGPN), a simple and intuitive deep learning framework for 3D object instance segmentation on point clouds. SGPN uses a single network to predict point grouping proposals and a corresponding semantic class for each proposal, from which we can directly extract instance segmentation results. Important to the effectiveness of SGPN is its novel representation of 3D instance segmentation results in the form of a similarity matrix that indicates the similarity between each pair of points in embedded feature space, thus producing an accurate grouping proposal for each point...
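The similarity-matrix idea can be sketched outside any network: embed each point, compute pairwise feature distances, and threshold a point's row to form its grouping proposal. A minimal NumPy sketch, assuming toy hand-picked embeddings rather than learned features:

```python
import numpy as np

def similarity_matrix(embeddings):
    # Pairwise squared L2 distances between per-point feature embeddings.
    # A low distance suggests two points belong to the same object instance.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def group_proposal(sim, seed, threshold):
    # Grouping proposal for one seed point: all points whose embedding
    # lies within `threshold` of the seed's embedding.
    return np.where(sim[seed] < threshold)[0]

# Toy example: two well-separated clusters in a 2-D feature space.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
sim = similarity_matrix(feats)
print(group_proposal(sim, seed=0, threshold=1.0))  # points 0 and 1
```

In the actual framework the embeddings come from a network trained so that same-instance points are close in feature space; here they are fixed vectors purely for illustration.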
Point clouds are an efficient data format for 3D data. However, existing 3D segmentation methods for point clouds either do not model local dependencies [21] or require added computations [14, 23]. This work presents a novel 3D segmentation framework, RSNet, to efficiently model local structures in point clouds. The key component of RSNet is a lightweight local dependency module, a combination of a novel slice pooling layer, Recurrent Neural Network (RNN) layers, and a slice unpooling layer. The slice pooling layer is designed to project features of unordered points onto an ordered...
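The slice pooling step can be illustrated without the rest of the network: bin unordered points into ordered slices along one axis and max-pool features within each slice, yielding an ordered sequence that RNN layers can consume. A simplified sketch, not the paper's implementation:

```python
import numpy as np

def slice_pool(xyz, feats, n_slices, axis=2):
    # Bin points into `n_slices` ordered slices along `axis` and
    # max-pool the features inside each slice. The per-point bin
    # index also drives the later slice unpooling step.
    lo, hi = xyz[:, axis].min(), xyz[:, axis].max()
    t = (xyz[:, axis] - lo) / max(hi - lo, 1e-9)
    idx = np.minimum((t * n_slices).astype(int), n_slices - 1)
    pooled = np.zeros((n_slices, feats.shape[1]))
    for s in range(n_slices):
        mask = idx == s
        if mask.any():
            pooled[s] = feats[mask].max(axis=0)
    return pooled, idx

# Four points, two height slices; per-point features are 1-D here.
xyz = np.array([[0, 0, 0.0], [0, 0, 0.2], [0, 0, 0.8], [0, 0, 1.0]])
feats = np.array([[1.0], [3.0], [2.0], [5.0]])
pooled, idx = slice_pool(xyz, feats, n_slices=2)
print(pooled.ravel())  # max of each slice: [3. 5.]
```

Because the slices are ordered along the axis, the pooled sequence has a well-defined order even though the input points do not.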
Volumetric neural rendering methods like NeRF [34] generate high-quality view synthesis results but are optimized per-scene, leading to prohibitive reconstruction time. On the other hand, deep multi-view stereo methods can quickly reconstruct scene geometry via direct network inference. Point-NeRF combines the advantages of these two approaches by using neural 3D point clouds, with associated neural features, to model a radiance field. Point-NeRF can be rendered efficiently by aggregating neural point features near scene surfaces, in a ray marching-based...
Reconstructing 3D shapes from single-view images has been a long-standing research problem. In this paper, we present DISN, a Deep Implicit Surface Network which can generate a high-quality detail-rich 3D mesh from a 2D image by predicting the underlying signed distance fields. In addition to utilizing global image features, DISN predicts the projected location for each 3D point on the 2D image and extracts local features from the image feature maps. Combining global and local features significantly improves the accuracy of signed distance field prediction, especially for detail-rich areas. To the best...
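The signed distance field representation itself is easy to illustrate: negative inside the surface, zero on it, positive outside. A toy analytic example for a unit sphere; DISN predicts such values with a network conditioned on the input image, whereas the sphere here is only an illustration of the representation:

```python
import numpy as np

def sphere_sdf(points, center, radius):
    # Signed distance to a sphere: distance to the center minus the
    # radius, so the value is negative inside, zero on the surface,
    # and positive outside.
    return np.linalg.norm(points - center, axis=-1) - radius

pts = np.array([[0.0, 0.0, 0.0],   # center -> inside
                [1.0, 0.0, 0.0],   # on the surface
                [2.0, 0.0, 0.0]])  # outside
d = sphere_sdf(pts, center=np.zeros(3), radius=1.0)
print(d)  # [-1.  0.  1.]
```

A mesh is then extracted as the zero level set of the predicted field (e.g., with marching cubes).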
Due to the sparsity and irregularity of point cloud data, methods that directly consume points have become popular. Among all point-based models, graph convolutional networks (GCNs) lead to notable performance by fully preserving the data granularity and exploiting point interrelation. However, point-based networks spend a significant amount of time on data structuring (e.g., Farthest Point Sampling (FPS) and neighbor querying), which limits their speed and scalability. In this paper, we present a method, named Grid-GCN, for fast and scalable point cloud learning...
Advances in LiDAR sensors provide rich 3D data that supports 3D scene understanding. However, due to occlusion and signal miss, LiDAR point clouds are in practice 2.5D as they cover only partial underlying shapes, which poses a fundamental challenge to 3D perception. To tackle the challenge, we present a novel LiDAR-based 3D object detection model, dubbed Behind the Curtain Detector (BtcDet), which learns object shape priors and estimates the complete shapes of partially occluded (curtained) objects in point clouds. BtcDet first identifies the regions affected...
The Virtual Environments Laboratory at the University of Southern California (USC) has initiated a research program aimed at developing virtual reality (VR) technology applications for the study, assessment, and rehabilitation of cognitive/functional processes. This technology is seen to offer many advantages for these aims, and an introductory section of this article discusses the specific rationale for VR in the area of clinical neuropsychology. A discussion of attention processes follows, along with issues in the development of a head-mounted display (HMD)...
We present a novel approach to producing facial expression animations for new models. Instead of creating animations from scratch for each new model, we take advantage of existing animation data in the form of vertex motion vectors. Our method allows animations created by any tools or methods to be easily retargeted to new models; we call this process expression cloning, and it provides a new alternative to creating animations for each character. Expression cloning makes it meaningful to compile a high-quality animation library, since animations can be reused. Our method transfers motion vectors from a source face to a target face having different geometric...
This paper presents cognitive studies and analyses relating to how augmented reality (AR) interacts with human abilities in order to benefit manufacturing and maintenance tasks. A specific set of applications is described in detail, as well as a prototype system and the software library that it is built upon. An integrated view of information flow to support AR is also presented, along with a proposal for an AR media language (ARML) that could provide interoperability between various AR implementations.
The biggest single obstacle to building effective augmented reality (AR) systems is the lack of accurate, wide-area sensors for trackers that report the locations and orientations of objects in an environment. Active (sensor-emitter) tracking technologies require powered-device installation, limiting their use to prepared areas that are relatively free of natural or man-made interference sources. Vision-based systems can use passive landmarks, but they are more computationally demanding and often exhibit erroneous behavior due...
Many state-of-the-art computer vision algorithms use large scale convolutional neural networks (CNNs) as basic building blocks. These CNNs are known for their huge number of parameters, high redundancy in weights, and tremendous computing resource consumption. This paper presents a learning algorithm to simplify and speed up these CNNs. Specifically, we introduce a "try-and-learn" algorithm to train pruning agents that remove unnecessary CNN filters in a data-driven way. With the help of a novel reward function, our...
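The goal of filter pruning can be illustrated with a much simpler criterion: rank convolutional filters by L1 norm and keep only the strongest. The paper instead trains pruning agents with a reward to make the keep/drop decision in a data-driven way; the magnitude-based sketch below is a common, simpler stand-in:

```python
import numpy as np

def prune_filters(weights, keep_ratio):
    # weights: (n_filters, in_channels, kh, kw) conv weight tensor.
    # Rank filters by L1 norm and keep the top `keep_ratio` fraction;
    # the dropped filters' output channels disappear from the layer.
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * weights.shape[0])))
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])
    return weights[keep], keep

w = np.random.RandomState(0).randn(8, 3, 3, 3)  # 8 filters of shape 3x3x3
pruned, kept = prune_filters(w, keep_ratio=0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Removing whole filters (rather than individual weights) shrinks the layer's output channels, so the following layer's input shrinks too, giving real speedups without sparse kernels.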
Recent advances in convolutional neural networks have shown promising results in 3D shape completion. But due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial Network (3D-ED-GAN) with a Long-term Recurrent Convolutional Network (LRCN). The 3D-ED-GAN is a 3D convolutional network trained with a generative adversarial paradigm to fill missing data...
Applications in virtual and augmented reality create a demand for rapid creation of and easy access to large sets of 3D models. An effective way to address this demand is to edit or deform existing 3D models based on a reference, e.g., a 2D image, which is very easy to acquire. Given such a source 3D model and a target, which can be a 2D image, a 3D model, or a point cloud acquired as a depth scan, we introduce 3DN, an end-to-end network that deforms the source model to resemble the target. Our method infers per-vertex offset displacements while keeping the mesh connectivity of the source model fixed. We...
The majority of prior monocular depth estimation methods without groundtruth depth guidance focus on driving scenarios. We show that such methods generalize poorly to unseen complex indoor scenes, where objects are cluttered and arbitrarily arranged in the near field. To obtain more robustness, we propose a structure distillation approach to learn knacks from an off-the-shelf relative depth estimator that produces structured but metric-agnostic depth. By combining structure distillation with a branch that learns metrics from left-right...
Augmented reality systems allow users to interact with real and computer-generated objects by displaying 3D virtual objects registered in a user's natural environment. Applications of this powerful visualization tool include previewing proposed buildings in their settings, interacting with complex machinery for purposes of construction or maintenance training, and visualizing in-patient medical data such as ultrasound. In all these applications, the virtual objects must be visually registered with respect to real-world objects in every image the user sees. If...
A novel framework enables accurate augmented reality (AR) registration with integrated inertial gyroscope and vision tracking technologies. The framework includes a two-channel complementary motion filter that combines the low-frequency stability of vision sensors with the high-frequency tracking of gyroscope sensors, hence achieving stable static and dynamic six-degree-of-freedom pose tracking. Our implementation uses an extended Kalman filter (EKF). Quantitative analysis and experimental results show that the fusion method achieves dramatic improvements in...
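The two-channel idea can be illustrated with a scalar complementary filter: integrate the gyro rate (trusted over short horizons) and blend in the absolute vision angle (drift-free over long horizons). This is a toy 1-D sketch with made-up rates, not the paper's six-degree-of-freedom EKF:

```python
def complementary_filter(angle, gyro_rate, vision_angle, dt, alpha=0.98):
    # High-frequency channel: gyro-integrated estimate.
    # Low-frequency channel: absolute vision measurement.
    # `alpha` close to 1 trusts the gyro between vision corrections.
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * vision_angle

# A biased gyro alone would drift by 0.05 rad/s * 10 s = 0.5 rad;
# blending with the (unbiased) vision angle keeps the estimate bounded.
angle = 0.0
for _ in range(1000):  # 10 s at 100 Hz
    angle = complementary_filter(angle, gyro_rate=0.05,
                                 vision_angle=0.0, dt=0.01)
print(abs(angle) < 0.05)  # True
```

An EKF generalizes this blend: instead of a fixed `alpha`, the gain is computed each step from the modeled sensor noise covariances.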
Natural scene features stabilize and extend the tracking range of augmented reality (AR) pose-tracking systems. We develop robust computer vision methods to detect and track natural features in video images. Point and region features are automatically and adaptively selected for properties that lead to robust tracking. A multistage tracking algorithm produces accurate motion estimates, and the entire system operates in a closed loop that stabilizes its performance and accuracy. We present demonstrations of the benefits of using tracked natural features for AR applications that illustrate direct...
Our work stems from a program focused on developing tracking technologies for wide-area augmented realities in unprepared outdoor environments. Other participants in the Defense Advanced Research Projects Agency (Darpa) funded Geospatial Registration of Information for Dismounted Soldiers (Grids) program included the University of North Carolina at Chapel Hill and Raytheon. We describe a hybrid orientation tracking system combining inertial sensors and computer vision, exploiting the complementary nature of these two sensing technologies to...
Animating 3D faces to achieve compelling realism is a challenging task in the entertainment industry. Previously proposed face transfer approaches generally require a high-quality animated source face in order to transfer its motion to new faces. In this work, we present a semi-automatic technique to directly animate popularized blendshape face models by mapping facial motion capture data spaces to blendshape spaces. After sparse markers on the face of a human subject are captured by motion capture systems while a video camera is simultaneously used to record his/her front face,...