NFDI4DS | UHH-SEMS - Publication Details

UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering

OPENALEX - Publications

Mohamed El Banani Luya Gao Justin Johnson

Aligning partial views of a scene into a single whole is essential to understanding one’s environment and is a key component of numerous robotics tasks such as SLAM and SfM. Recent approaches have proposed end-to-end systems that can outperform traditional methods by leveraging pose supervision. However, with the rising prevalence of cameras with depth sensors, we expect a new stream of raw RGB-D data without the annotations needed for supervision. We propose UnsupervisedR&R: an unsupervised approach to learning point cloud registration...

10.1109/cvpr46437.2021.00705 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Probing the 3D Awareness of Visual Foundation Models

Mohamed El Banani Amit Raj Kevis-Kokitsi Maninis Abhishek Kar Yuanzhen Li and 5 more

10.1109/cvpr52733.2024.02059 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Bootstrap Your Own Correspondences

Mohamed El Banani Justin Johnson

Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on pose or correspondence. Our key observation is that randomly-initialized CNNs readily provide...

10.1109/iccv48922.2021.00637 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Learning Visual Representations via Language-Guided Sampling

Mohamed El Banani Karan Desai Justin Johnson

Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and communicate concepts. Building on this intuition, we propose an alternative approach to visual representation learning: using language similarity to sample semantically similar image pairs for contrastive learning. Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of hand-crafted augmentations or learned clusters. Our approach also differs from image-text...

10.1109/cvpr52729.2023.01841 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Novel Object Viewpoint Estimation Through Reconstruction Alignment

Mohamed El Banani Jason J. Corso David F. Fouhey

The goal of this paper is to estimate the viewpoint for a novel object. Standard viewpoint estimation approaches generally fail on this task due to their reliance on a 3D model for alignment or large amounts of class-specific training data and their corresponding canonical pose. We overcome those limitations by learning a reconstruct and align approach. Our key insight is that although we do not have an explicit 3D model or a predefined canonical pose, we can still learn to estimate the object's shape in the viewer's frame and then use an image to provide our reference model or canonical pose. In particular, we propose...

10.1109/cvpr42600.2020.00318 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Self-supervised Correspondence Estimation via Multiview Registration

Mohamed El Banani Ignacio Rocco David Novotný Andrea Vedaldi Natalia Neverova and 2 more

Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by only relying on close-by frame pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization...

10.1109/wacv56688.2023.00127 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

Three-dimensional particle tracking in microfluidic channel flow using in and out of focus diffraction

Bushra Tasadduq Gonghao Wang Mohamed El Banani Wenbin Mao Wilbur A. Lam and 2 more

10.1016/j.flowmeasinst.2015.06.018 article EN publisher-specific-oa Flow Measurement and Instrumentation 2015-06-16

Probing the 3D Awareness of Visual Foundation Models

Mohamed El Banani Amit Raj Kevis-Kokitsi Maninis Abhishek Kar Yuanzhen Li and 5 more

Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other visual tasks such as detection and segmentation. Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure? In this work, we analyze the 3D awareness of visual foundation models. We posit that 3D awareness implies that representations (1) encode the 3D structure of the scene and (2) consistently represent the surface across...

10.48550/arxiv.2404.08636 preprint EN arXiv (Cornell University) 2024-04-12

Learning Visual Representations from Cross-Modal Correspondence

Mohamed El Banani

10.7302/22900 article EN Deep Blue (University of Michigan) 2024-01-01

Adviser Networks: Learning What Question to Ask for Human-In-The-Loop Viewpoint Estimation

Mohamed El Banani Jason J. Corso

Humans have an unparalleled visual intelligence and can overcome ambiguities that machines currently cannot. Recent works have shown that incorporating guidance from humans during inference for monocular viewpoint-estimation can help overcome difficult cases in which the computer-alone would have otherwise failed. These hybrid approaches are hence gaining traction. However, deciding what question to ask the human at inference time remains an unknown for these problems. We address this by formulating it as an Adviser Problem: we learn a...

10.48550/arxiv.1802.01666 preprint EN other-oa arXiv (Cornell University) 2018-01-01

A PILOT STUDY OF A MODIFIED BATHROOM SCALE TO MONITOR CARDIOVASCULAR HEMODYNAMIC IN PREGNANCY

Odayme Quesada Mohamed El Banani J. Alex Heller Shire Beach Mozziyar Etemadi and 4 more

10.1016/s0735-1097(16)31456-5 article EN publisher-specific-oa Journal of the American College of Cardiology 2016-04-01

Learning Visual Representations via Language-Guided Sampling

Mohamed El Banani Karan Desai Justin C. Johnson

Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and communicate concepts. Building on this intuition, we propose an alternative approach to visual representation learning: using language similarity to sample semantically similar image pairs for contrastive learning. Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of hand-crafted augmentations or learned clusters. Our approach also differs...

10.48550/arxiv.2302.12248 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Novel Object Viewpoint Estimation through Reconstruction Alignment

Mohamed El Banani Jason J. Corso David F. Fouhey

The goal of this paper is to estimate the viewpoint for a novel object. Standard viewpoint estimation approaches generally fail on this task due to their reliance on a 3D model for alignment or large amounts of class-specific training data and their corresponding canonical pose. We overcome those limitations by learning a reconstruct and align approach. Our key insight is that although we do not have an explicit 3D model or a predefined canonical pose, we can still learn to estimate the object's shape in the viewer's frame and then use an image to provide our reference model or canonical pose. In particular, we propose...

10.48550/arxiv.2006.03586 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Self-Supervised Correspondence Estimation via Multiview Registration

Mohamed El Banani Ignacio Rocco David Novotný Andrea Vedaldi Natalia Neverova and 2 more

Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by only relying on close-by frame pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization algorithm....

10.48550/arxiv.2212.03236 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Bootstrap Your Own Correspondences

Mohamed El Banani Justin C. Johnson

Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on pose or correspondence. Our key observation is that randomly-initialized CNNs readily provide...

10.48550/arxiv.2106.00677 preprint EN other-oa arXiv (Cornell University) 2021-01-01

UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering

Mohamed El Banani Luya Gao Justin C. Johnson

Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM. Recent approaches have proposed end-to-end systems that can outperform traditional methods by leveraging pose supervision. However, with the rising prevalence of cameras with depth sensors, we expect a new stream of raw RGB-D data without the annotations needed for supervision. We propose UnsupervisedR&R: an unsupervised approach to learning point cloud registration...

10.48550/arxiv.2102.11870 preprint EN other-oa arXiv (Cornell University) 2021-01-01