Helge Rhodin

ORCID: 0000-0003-2692-0801
Research Areas
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • Video Surveillance and Tracking Methods
  • 3D Shape Modeling and Analysis
  • Human Motion and Animation
  • Diabetic Foot Ulcer Assessment and Management
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Neural Network Applications
  • Robotics and Sensor-Based Localization
  • Face recognition and analysis
  • Computer Graphics and Visualization Techniques
  • Visual Attention and Saliency Detection
  • Winter Sports Injuries and Performance
  • Multimodal Machine Learning Applications
  • 3D Surveying and Cultural Heritage
  • Video Analysis and Summarization
  • Hand Gesture Recognition Systems
  • Optical measurement and interference techniques
  • Advanced Image and Video Retrieval Techniques
  • Neurobiology and Insect Physiology Research
  • Domain Adaptation and Few-Shot Learning
  • Gait Recognition and Analysis
  • Zebrafish Biomedical Research Applications
  • Cell Image Analysis Techniques
  • Industrial Vision Systems and Defect Detection

University of British Columbia
2019-2024

University of British Columbia Hospital
2019-2023

École Polytechnique Fédérale de Lausanne
2017-2020

Laboratoire d’Imagerie Biomédicale
2019-2020

Max Planck Society
2013-2019

Max Planck Institute for Informatics
2013-2019

Universidad Braulio Carrillo
2019

We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our...

10.1145/3072959.3073596 article EN ACM Transactions on Graphics 2017-07-20

We propose a CNN-based approach for 3D human body pose estimation from single RGB images that addresses the issue of limited generalizability of models trained solely on the starkly limited publicly available 3D pose data. Using only the existing 3D pose data and 2D pose data, we show state-of-the-art performance on established benchmarks through transfer of learned features, while also generalizing to in-the-wild scenes. We further introduce a new training set for human body pose estimation from monocular images of real humans that has the ground truth captured with a multi-camera marker-less...

10.1109/3dv.2017.00064 article EN 2017 International Conference on 3D Vision (3DV) 2017-10-01

We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. It operates successfully in generic scenes which may contain occlusions by objects and by other people. Our method operates in subsequent stages. The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip...

10.1145/3386569.3392410 article EN ACM Transactions on Graphics 2020-08-12

Accurate 3D human pose estimation from single images is possible with sophisticated deep-net architectures that have been trained on very large datasets. However, this still leaves open the problem of capturing motions for which no such database exists. Manual annotation is tedious, slow, and error-prone. In this paper, we propose to replace most of the annotations by the use of multiple views, at training time only. Specifically, we train the system to predict the same pose in all views. Such a consistency constraint is necessary but...

10.1109/cvpr.2018.00880 article EN 2018-06-01

We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Human performance capture is a challenging problem due to the large range of articulation, potentially fast motion, and considerable non-rigid deformations, even from multi-view data. Reconstruction from monocular video alone is drastically more challenging, since strong occlusions and the inherent depth...

10.1145/3181973 article EN ACM Transactions on Graphics 2018-04-30

Studying how neural circuits orchestrate limbed behaviors requires the precise measurement of the positions of each appendage in three-dimensional (3D) space. Deep neural networks can estimate two-dimensional (2D) pose in freely behaving and tethered animals. However, the unique challenges associated with transforming these 2D measurements into reliable 3D poses have not been addressed for small animals including the fly, Drosophila melanogaster. Here, we present DeepFly3D, a software that infers the 3D pose of tethered, adult...

10.7554/elife.48571 article EN cc-by eLife 2019-10-04

Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. They often create discomfort with marker suits, and their recording volume is severely restricted and often constrained to indoor scenes with controlled backgrounds. Alternative suit-based systems use several inertial measurement units or an exoskeleton to capture motion with an inside-in setup, i.e. without external sensors. This makes capture independent of a confined...

10.1145/2980179.2980235 article EN ACM Transactions on Graphics 2016-11-11

We propose the first real-time system for the egocentric estimation of 3D human body pose in a wide range of unconstrained everyday activities. This setting has a unique set of challenges, such as mobility of the hardware setup, and robustness to long capture sessions with fast recovery from tracking failures. We tackle these challenges based on a novel lightweight setup that converts a standard baseball cap to a device for high-quality pose estimation based on a single cap-mounted fisheye camera. From the captured egocentric live stream, our CNN based 3D pose estimation approach runs at 60...

10.1109/tvcg.2019.2898650 article EN IEEE Transactions on Visualization and Computer Graphics 2019-03-16

Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single image 3D pose estimator from unlabeled multi-view data. To this end, we exploit multi-view consistency constraints to disentangle the observed 2D pose into...

10.1109/cvpr46437.2021.01309 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

While deep learning reshaped the classical motion capture pipeline with feed-forward networks, generative models are required to recover fine alignment via iterative refinement. Unfortunately, the existing models are usually hand-crafted or learned in controlled conditions, only applicable to limited domains. We propose a method to learn a generative neural body model from unlabelled monocular videos by extending Neural Radiance Fields (NeRFs). We equip them with a skeleton to apply to time-varying and articulated motion. A key insight is...

10.48550/arxiv.2102.06199 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Generative reconstruction methods compute the 3D configuration (such as pose and/or geometry) of a shape by optimizing the overlap of the projected 3D shape model with images. Proper handling of occlusions is a big challenge, since the visibility function that indicates if a surface point is seen from a camera can often not be formulated in closed form, and is in general discrete and non-differentiable at occlusion boundaries. We present a new scene representation that enables an analytically differentiable closed-form formulation of surface visibility....

10.1109/iccv.2015.94 article EN 2015-12-01

Human pose estimation from single images is a challenging problem that is typically solved by supervised learning. Unfortunately, labeled training data does not yet exist for many human activities since 3D annotation requires dedicated motion capture systems. Therefore, we propose an unsupervised approach that learns to predict a 3D human pose from a single image while only being trained with 2D pose data, which can be crowd-sourced and is already widely available. To this end, we estimate the 3D pose that is most likely over random projections,...

10.1109/cvpr52688.2022.00652 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Real-time marker-less hand tracking is of increasing importance in human-computer interaction. Robust and accurate tracking of arbitrary hand motion is a challenging problem due to the many degrees of freedom, frequent self-occlusions, fast motions, and uniform skin color. In this paper, we propose a new approach that tracks the full skeleton motion of the hand from multiple RGB cameras in real-time. The main contributions include a new generative tracking method which employs an implicit hand shape representation based on Sum of Anisotropic Gaussians (SAG), and a pose...

10.1109/3dv.2014.37 preprint EN 2014-12-01

Learning general image representations has proven key to the success of many computer vision tasks. For example, many approaches to image understanding problems rely on deep networks that were initially trained on ImageNet, mostly because the learned features are a valuable starting point to learn from limited labeled data. However, when it comes to 3D motion capture of multiple people, these features are only of limited use. In this paper, we therefore propose an approach to learning features that are useful for this purpose. To this end, we introduce a self-supervised approach to learning what we call...

10.1109/cvpr.2019.00789 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Reconstruction of a 3D shape from a single 2D image is a classical computer vision problem, whose difficulty stems from the inherent ambiguity of recovering occluded or only partially observed surfaces. Recent methods address this challenge through the use of largely unstructured neural networks that effectively distill conditional mapping and priors over 3D shape. In this work, we induce structure and geometric constraints by leveraging three core observations: (1) the surface of most everyday objects is often almost entirely...

10.1109/cvpr42600.2020.00061 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Our goal is to capture the pose of real animals using synthetic training examples, without using any manual supervision. Our focus is on neuroscience model organisms, to be able to study how neural circuits orchestrate behaviour. Human pose estimation attains remarkable accuracy when trained on real or simulated datasets consisting of millions of frames. However, for many applications simulated models are unrealistic and real training datasets with comprehensive annotations do not exist. We address this problem with a new sim2real domain transfer method. Our key...

10.1109/cvpr42600.2020.01317 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Background: Augmented reality (AR) glasses can be used for different medical indications. Primarily, a visual overlay on the optic screen offers additional operational information. A transfer of acoustic information via speech-to-text transcript using AR glasses presents a new non-surgical option to support patients with different forms of hearing loss. This study aimed to evaluate AR glasses for speech-to-text transcription. Methods: We compared four AR glasses systems (G1, MYVU, AIR, and Moverio 40) for speech-to-text transcription regarding speech capturing, design,...

10.20944/preprints202502.2258.v1 preprint EN 2025-02-28

10.1109/wacv61041.2025.00372 article EN 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26