NFDI4DS | UHH-SEMS - Publication Details

Stan Birchfield

ORCID: 0000-0001-7366-2441

About

Contact & Profiles

A5040617006

Research Areas

Robotics and Sensor-Based Localization
Robot Manipulation and Learning
Advanced Vision and Imaging
Advanced Neural Network Applications
Human Pose and Action Recognition
Robotic Path Planning Algorithms
Video Surveillance and Tracking Methods
Optical measurement and interference techniques
Domain Adaptation and Few-Shot Learning
Reinforcement Learning in Robotics
Multimodal Machine Learning Applications
Image and Object Detection Techniques
Advanced Image Processing Techniques
Image Processing Techniques and Applications
3D Surveying and Cultural Heritage
Speech and Audio Processing
Autonomous Vehicle Technology and Safety
Advanced Image and Video Retrieval Techniques
Image Processing and 3D Reconstruction
Tactile and Sensory Interactions
Remote Sensing and LiDAR Applications
Computer Graphics and Visualization Techniques
Hand Gesture Recognition Systems
Music and Audio Processing
Soft Robotics and Applications

Nvidia (United States)
2017-2024

Nvidia (United Kingdom)
2018-2024

Georgia Institute of Technology
2021

Istituto Tecnico Industriale Alessandro Volta
2021

Weatherford College
2021

Seattle University
2020

Microsoft (United States)
2013-2015

Clemson University
2005-2014

Aalto University
2011

Stanford University
1998-2002

Elliptical head tracking using intensity gradients and color histograms

OPENALEX - Publications

Stan Birchfield

An algorithm for tracking a person's head is presented. The head's projection onto the image plane is modeled as an ellipse whose position and size are continually updated by a local search combining the output of a module concentrating on the intensity gradient around the ellipse's perimeter with that of another module focusing on the color histogram of the ellipse's interior. Since these two modules have roughly orthogonal failure modes, they serve to complement one another. The result is a robust, real-time system that is able to track a person's head with enough accuracy...

10.1109/cvpr.1998.698614 article EN 2002-11-27

Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization

Jonathan Tremblay Aayush Prakash David Acuna Mark Brophy Varun Jampani and 5 more

We present a system for training deep neural networks for object detection using synthetic images. To handle the variability in real-world data, the system relies upon the technique of domain randomization, in which the parameters of the simulator (such as lighting, pose, object textures, etc.) are randomized in non-realistic ways to force the neural network to learn the essential features of the object of interest. We explore the importance of these parameters, showing that it is possible to produce a network with compelling performance using only non-artistically-generated synthetic data. With additional...

10.1109/cvprw.2018.00143 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018-06-01

A pixel dissimilarity measure that is insensitive to image sampling

Stan Birchfield Carlo Tomasi

Because of image sampling, traditional measures of pixel dissimilarity can assign a large value to two corresponding pixels in a stereo pair, even in the absence of noise and other degrading effects. We propose a measure of dissimilarity that is provably insensitive to sampling because it uses the linearly interpolated intensity functions surrounding the pixels. Experiments on real images show that our measure alleviates the problem of sampling with little additional computational overhead.

10.1109/34.677269 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 1998-04-01
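
The sampling-insensitive measure summarized in this abstract can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not the authors' code: a scanline is a plain list of intensities, border pixels are clamped (an assumption), and all function names are hypothetical. It follows the commonly cited formulation, in which a pixel in one scanline is compared against the range of the linearly interpolated intensity over the half-pixel neighborhood of the candidate pixel in the other scanline, and the smaller of the two one-sided values is kept.

```python
def half_sample_range(row, x):
    """Min and max of the linearly interpolated intensity on [x - 1/2, x + 1/2]."""
    center = row[x]
    left = row[x - 1] if x > 0 else center              # clamp at the border (assumption)
    right = row[x + 1] if x < len(row) - 1 else center  # clamp at the border (assumption)
    minus = 0.5 * (center + left)    # interpolated value half a pixel to the left
    plus = 0.5 * (center + right)    # interpolated value half a pixel to the right
    return min(minus, plus, center), max(minus, plus, center)


def one_sided(row_a, xa, row_b, xb):
    """Distance from the sample row_a[xa] to the interpolated range around row_b[xb]."""
    lo, hi = half_sample_range(row_b, xb)
    return max(0.0, row_a[xa] - hi, lo - row_a[xa])


def sampling_insensitive_dissimilarity(left, xl, right, xr):
    """Symmetric measure: the smaller of the two one-sided distances."""
    return min(one_sided(left, xl, right, xr), one_sided(right, xr, left, xl))


# The same step edge, sampled half a pixel apart in the two scanlines.
left_row = [0, 0, 10, 10]
right_row = [0, 5, 10, 10]
abs_diff = abs(left_row[1] - right_row[1])
robust = sampling_insensitive_dissimilarity(left_row, 1, right_row, 1)
```

In this example the absolute difference at position 1 is 5, while the interpolated measure is 0, because the right-hand sample falls inside the interpolated range of the left scanline. The extra cost per pixel pair is a handful of comparisons, consistent with the abstract's remark about little additional overhead.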

Stan Birchfield Carlo Tomasi

10.1023/a:1008160311296 article EN International Journal of Computer Vision 1999-01-01

Spatiograms versus Histograms for Region-Based Tracking

Stan Birchfield Shriram S. Rangarajan

We introduce the concept of a spatiogram, which is a generalization of a histogram that includes potentially higher order moments. A histogram is a zeroth-order spatiogram, while second-order spatiograms contain spatial means and covariances for each histogram bin. This spatial information still allows quite general transformations, as in a histogram, but captures a richer description of the target to increase robustness in tracking. We show how to use spatiograms in kernel-based trackers, deriving a mean shift procedure in which individual pixels vote not only for the amount of shift but also for its direction....

10.1109/cvpr.2005.330 article EN 2005-07-27
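
As an illustration of the second-order spatiogram described in this abstract, a minimal Python sketch follows. It assumes a grayscale image given as a list of rows, uniform intensity bins, and pixel coordinates (x, y) as the spatial variable; the function name and the return layout are hypothetical and not taken from the paper.

```python
def spatiogram(image, num_bins, max_value=256):
    """Per-bin (count, spatial mean, spatial covariance) for a grayscale image.

    The counts alone form an ordinary histogram (the zeroth-order case);
    the mean and covariance are the additional second-order information.
    """
    sums = [[0, 0.0, 0.0, 0.0, 0.0, 0.0] for _ in range(num_bins)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            b = min(int(value * num_bins / max_value), num_bins - 1)
            s = sums[b]
            s[0] += 1        # pixel count
            s[1] += x        # sum of x
            s[2] += y        # sum of y
            s[3] += x * x    # sum of x^2
            s[4] += x * y    # sum of x*y
            s[5] += y * y    # sum of y^2

    result = []
    for n, sx, sy, sxx, sxy, syy in sums:
        if n == 0:
            result.append((0, None, None))  # empty bin: no spatial statistics
            continue
        mx, my = sx / n, sy / n
        cxx = sxx / n - mx * mx
        cxy = sxy / n - mx * my
        cyy = syy / n - my * my
        result.append((n, (mx, my), ((cxx, cxy), (cxy, cyy))))
    return result


# Two-bin example: dark pixels on the left half, bright pixels on the right half.
example = spatiogram([[0, 0, 200, 200], [0, 0, 200, 200]], num_bins=2)
```

Here both bins hold four pixels, so a plain histogram cannot tell the two halves apart from a mirrored image, whereas the spatial means (0.5, 0.5) and (2.5, 0.5) record where each intensity range sits.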

CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification

Zheng Tang Milind Naphade Ming-Yu Liu Xiaodong Yang Stan Birchfield and 4 more

Urban traffic optimization using traffic cameras as sensors is driving the need to advance state-of-the-art multi-target multi-camera (MTMC) tracking. This work introduces CityFlow, a city-scale traffic camera dataset consisting of more than 3 hours of synchronized HD videos from 40 cameras across 10 intersections, with the longest distance between two simultaneous cameras being 2.5 km. To the best of our knowledge, CityFlow is the largest-scale dataset in terms of spatial coverage and the number of cameras/videos in an urban environment. The dataset contains more than 200K...

10.1109/cvpr.2019.00900 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects

Jonathan Tremblay Thang To Balakumar Sundaralingam Xiang Yu Dieter Fox and 1 more

Using synthetic data for training deep neural networks for robotic manipulation holds the promise of an almost unlimited amount of pre-labeled training data, generated safely out of harm's way. One of the key challenges of synthetic data, to date, has been to bridge the so-called reality gap, so that networks trained on synthetic data operate correctly when exposed to real-world data. We explore the reality gap in the context of 6-DoF pose estimation of known objects from a single RGB image. We show that for this problem the reality gap can be successfully spanned by a simple combination of domain randomized and...

10.48550/arxiv.1809.10790 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data

Aayush Prakash Shaad Boochoon Mark Brophy David Acuna Eric Cameracci and 3 more

We present structured domain randomization (SDR), a variant of domain randomization (DR) that takes into account the structure of the scene in order to add context to the generated data. In contrast to DR, which places objects and distractors randomly according to a uniform probability distribution, SDR places objects and distractors randomly according to probability distributions that arise from the specific problem at hand. In this manner, SDR-generated imagery enables the neural network to take the context around an object into consideration during detection. We demonstrate the power of SDR for the problem of 2D bounding box car detection, achieving...

10.1109/icra.2019.8794443 article EN International Conference on Robotics and Automation (ICRA) 2019-05-01

Toward low-flying autonomous MAV trail navigation using deep neural networks for environmental awareness

Nikolai Smolyanskiy A. Kamenev Jeffrey Smith Stan Birchfield

We present a micro aerial vehicle (MAV) system, built with inexpensive off-the-shelf hardware, for autonomously following trails in unstructured, outdoor environments such as forests. The system introduces a deep neural network (DNN) called TrailNet for estimating the view orientation and lateral offset of the MAV with respect to the trail center. The DNN-based controller achieves stable flight without oscillations by avoiding overconfident behavior through a loss function that includes both label smoothing and entropy...

10.1109/iros.2017.8206285 article EN IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017-09-01

PAMTRI: Pose-Aware Multi-Task Learning for Vehicle Re-Identification Using Highly Randomized Synthetic Data

Zheng Tang Milind Naphade Stan Birchfield Jonathan Tremblay William Hodge and 3 more

In comparison with person re-identification (ReID), which has been widely studied in the research community, vehicle ReID has received less attention. Vehicle ReID is challenging due to 1) high intra-class variability (caused by the dependency of shape and appearance on viewpoint), and 2) small inter-class variability (caused by the similarity in shape and appearance between vehicles produced by different manufacturers). To address these challenges, we propose a Pose-Aware Multi-Task Re-Identification (PAMTRI) framework. This approach includes two innovations...

10.1109/iccv.2019.00030 article EN IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

Yu-Wei Chao Wei Yang Xiang Yu Pavlo Molchanov Ankur Handa and 7 more

We introduce DexYCB, a new dataset for capturing hand grasping of objects. We first compare DexYCB with a related one through cross-dataset evaluation. We then present a thorough benchmark of state-of-the-art approaches on three relevant tasks: 2D object and keypoint detection, 6D object pose estimation, and 3D hand pose estimation. Finally, we evaluate a new robotics-relevant task: generating safe robot grasps in human-to-robot object handover.

10.1109/cvpr46437.2021.00893 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

Bowen Wen Jonathan Tremblay Valts Blukis Stephen Tyree Thomas Müller and 4 more

We present a near real-time (10Hz) method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object. Our method works for arbitrary rigid objects, even when visual texture is largely absent. The object is assumed to be segmented in the first frame only. No additional information is required, and no assumption is made about the interaction agent. Key to our method is a Neural Object Field that is learned concurrently with a pose graph optimization process...

10.1109/cvpr52729.2023.00066 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Bowen Wen Wei Yang Jan Kautz Stan Birchfield

10.1109/cvpr52733.2024.01692 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

DERVISH An Office-Navigating Robot

Illah Nourbakhsh Rob Powers Stan Birchfield

DERVISH won the Office Delivery event of the 1994 Robot Competition and Exhibition, held as part of the Thirteenth National Conference on Artificial Intelligence. Although the contest required DERVISH to navigate in an artificial office environment, the official goal of the contest was to push the technology of robot navigation in real office buildings with minimal domain information. DERVISH navigates reliably using retractable assumptions that simplify the planning problem. In this article, we present a short description of DERVISH's hardware and low-level...

10.1609/aimag.v16i2.1133 article EN AI Magazine 1995-06-15

Multiway cut for stereo and motion with slanted surfaces

Stan Birchfield Carlo Tomasi

Slanted surfaces pose a problem for correspondence algorithms utilizing search because of the greatly increased number of possibilities, when compared with fronto-parallel surfaces. In this paper we propose an algorithm to compute correspondence between stereo images or between frames of a motion sequence by minimizing an energy functional that accounts for slanted surfaces. The energy is minimized in a greedy strategy that alternates between segmenting the image into a number of non-overlapping regions (using the multiway-cut algorithm of Boykov, Veksler, and Zabih) and finding the affine parameters...

10.1109/iccv.1999.791261 article EN 1999-01-01

Depth discontinuities by pixel-to-pixel stereo

Stan Birchfield Carlo Tomasi

An algorithm to detect depth discontinuities from a stereo pair of images is presented. The algorithm matches individual pixels in corresponding scanline pairs while allowing occluded pixels to remain unmatched, then propagates the information between scanlines by means of a fast postprocessor. The algorithm handles large untextured regions, uses a measure of pixel dissimilarity that is insensitive to image sampling, and prunes bad search nodes to increase the speed of dynamic programming. The computation is relatively fast, taking about 1.5 microseconds...

10.1109/iccv.1998.710850 article EN 2002-11-27

Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation

Jonathan Tremblay Thang To Stan Birchfield

We present a new dataset, called Falling Things (FAT), for advancing the state-of-the-art in object detection and 3D pose estimation in the context of robotics. By synthetically combining object models and backgrounds of complex composition and high graphical quality, we are able to generate photorealistic images with accurate 3D pose annotations for all objects in all images. Our dataset contains 60k annotated photos of 21 household objects taken from the YCB dataset [2]. For each image, we provide the 3D poses, per-pixel class segmentation, and 2D/3D bounding box...

10.1109/cvprw.2018.00275 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018-06-01

Real-Time Incremental Segmentation and Tracking of Vehicles at Low Camera Angles Using Stable Features

Neeraj K. Kanhere Stan Birchfield

We present a method for segmenting and tracking vehicles on highways using a camera that is relatively low to the ground. At such low angles, 3-D perspective effects cause significant changes in appearance over time, as well as severe occlusions by vehicles in neighboring lanes. Traditional approaches to occlusion reasoning assume that the vehicles initially appear well separated in the image; however, in our sequences, it is not uncommon for vehicles to enter the scene partially occluded and remain so throughout. By utilizing a 3-D perspective mapping from the scene to the image, along with a plumb line...

10.1109/tits.2007.911357 article EN IEEE Transactions on Intelligent Transportation Systems 2008-02-29

Adaptive fragments-based tracking of non-rigid objects using level sets

Prakash Chockalingam N. Pradeep Stan Birchfield

We present an approach to visual tracking based on dividing a target into multiple regions, or fragments. The target is represented by a Gaussian mixture model in a joint feature-spatial space, with each ellipsoid corresponding to a different fragment. The fragments are automatically adapted to the image data, being selected by an efficient region-growing procedure and updated according to a weighted average of the past and present image statistics. Modeling of the target and background is performed in a Chan-Vese manner, using the framework of level sets to preserve accurate...

10.1109/iccv.2009.5459276 article EN 2009-09-01

DexPilot: Vision-Based Teleoperation of Dexterous Robotic Hand-Arm System

Ankur Handa Karl Van Wyk Wei Yang Jacky Liang Yu-Wei Chao and 4 more

Teleoperation offers the possibility of imparting robotic systems with sophisticated reasoning skills, intuition, and creativity to perform tasks. However, teleoperation solutions for high degree-of-actuation (DoA), multi-fingered robots are generally cost-prohibitive, while low-cost offerings usually offer reduced degrees of control. Herein, a low-cost, depth-based teleoperation system, DexPilot, was developed that allows for complete control over the full 23 DoA robotic system by merely observing the bare human hand. DexPilot...

10.1109/icra40945.2020.9197124 article EN 2020-05-01

On the Importance of Stereo for Accurate Depth Estimation: An Efficient Semi-Supervised Deep Neural Network Approach

Nikolai Smolyanskiy A. Kamenev Stan Birchfield

We revisit the problem of visual depth estimation in the context of autonomous vehicles. Despite the progress on monocular depth estimation in recent years, we show that the gap between monocular and stereo depth accuracy remains large, a particularly relevant result due to the prevalent reliance upon monocular cameras by vehicles that are expected to be self-driving. We argue that the challenges of removing this gap are significant, owing to fundamental limitations of monocular vision. As a result, we focus our efforts on depth estimation by stereo. We propose a novel semi-supervised learning approach to training a deep stereo neural network,...

10.1109/cvprw.2018.00147 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018-06-01

Camera-to-Robot Pose Estimation from a Single Image

Timothy E. Lee Jonathan Tremblay Thang To Jia Cheng Terry Mosier and 3 more

We present an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot. The image is processed by a deep neural network to detect 2D projections of keypoints (such as joints) associated with the robot. The network is trained entirely on simulated data using domain randomization to bridge the reality gap. Perspective-n-point (PnP) is then used to recover the camera extrinsics, assuming that the camera intrinsics and joint configuration of the robot manipulator are known. Unlike classic hand-eye calibration systems, our method does not require...

10.1109/icra40945.2020.9196596 article EN 2020-05-01

Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs

Yifeng Zhu Jonathan Tremblay Stan Birchfield Yuke Zhu

We present a visually grounded hierarchical planning algorithm for long-horizon manipulation tasks. Our algorithm offers a joint framework of neuro-symbolic task planning and low-level motion generation conditioned on the specified goal. At the core of our approach is a two-level scene graph representation, namely a geometric scene graph and a symbolic scene graph. This hierarchical representation serves as a structured, object-centric abstraction of manipulation scenes. Our model uses graph neural networks to process these scene graphs for predicting high-level task plans and low-level motions. We demonstrate that...

10.1109/icra48506.2021.9561548 article EN 2021-05-30
