Mohamed El Banani

ORCID: 0000-0003-4686-6048
Research Areas
  • 3D Shape Modeling and Analysis
  • Robotics and Sensor-Based Localization
  • Advanced Vision and Imaging
  • 3D Surveying and Cultural Heritage
  • Advanced Image and Video Retrieval Techniques
  • Remote Sensing and LiDAR Applications
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Human Pose and Action Recognition
  • Thyroid and Parathyroid Surgery
  • Microfluidic and Bio-sensing Technologies
  • Digital Holography and Microscopy
  • Image Retrieval and Classification Techniques
  • Congenital Heart Disease Studies
  • Adversarial Robustness in Machine Learning
  • Cognitive and developmental aspects of mathematical skills
  • Cardiovascular Syncope and Autonomic Disorders
  • Design Education and Practice
  • Microfluidic and Capillary Electrophoresis Applications
  • Creativity in Education and Neuroscience
  • Cardiac Arrhythmias and Treatments
  • Anomaly Detection Techniques and Applications
  • Advanced Image Processing Techniques
  • Explainable Artificial Intelligence (XAI)

University of Michigan
2020-2024

Istituto Tecnico Industriale Alessandro Volta
2021

Weatherford College
2021

Georgia Institute of Technology
2015-2016

University of California, San Francisco
2016

Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM. Recent approaches have proposed end-to-end systems that can outperform traditional methods by leveraging pose supervision. However, with the rising prevalence of cameras with depth sensors, we can expect a new stream of raw RGB-D data without the annotations needed for supervision. We propose UnsupervisedR&R: an unsupervised approach to learning point cloud registration...

10.1109/cvpr46437.2021.00705 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
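
UnsupervisedR&R learns registration end-to-end from raw RGB-D video, so the alignment step itself has to be differentiable. As a rough sketch of that general recipe (not the paper's exact implementation), the code below pairs soft feature correspondences with a weighted Kabsch/Procrustes solver so that a consistency loss on the aligned points can train the features; the function names, temperature, and loss form are illustrative assumptions.

```python
import torch

def soft_correspondences(feat_a, feat_b, pts_b, temperature=0.1):
    """Match each point in view A to a softly-weighted point in view B.

    feat_a: (N, C) features for view A, feat_b: (M, C), pts_b: (M, 3).
    Returns matched 3D points (N, 3) and per-match confidence (N,).
    """
    sim = feat_a @ feat_b.t() / temperature          # (N, M) feature similarity
    weights = torch.softmax(sim, dim=1)              # soft assignment over view B
    matched = weights @ pts_b                        # expected match location
    conf = weights.max(dim=1).values                 # peakedness as confidence
    return matched, conf

def weighted_kabsch(src, dst, w):
    """Differentiable weighted Procrustes: R, t minimizing sum w |R src + t - dst|^2."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(0)                # weighted centroids
    dst_c = (w[:, None] * dst).sum(0)
    H = ((src - src_c) * w[:, None]).t() @ (dst - dst_c)
    U, S, Vh = torch.linalg.svd(H)
    d = torch.sign(torch.linalg.det(Vh.t() @ U.t())) # fix possible reflection
    D = torch.diag(torch.stack([torch.ones_like(d), torch.ones_like(d), d]))
    R = Vh.t() @ D @ U.t()
    t = dst_c - R @ src_c
    return R, t

def registration_loss(pts_a, feat_a, pts_b, feat_b):
    """Self-supervised loss: residual of view A brought into view B's frame."""
    matched, conf = soft_correspondences(feat_a, feat_b, pts_b)
    R, t = weighted_kabsch(pts_a, matched, conf)
    aligned = pts_a @ R.t() + t
    return (conf * (aligned - matched).norm(dim=1)).mean()
```

Because every step is differentiable, the residual's gradients flow back into whatever network produced feat_a and feat_b, which is what removes the need for pose labels.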

Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on pose or correspondence. Our key observation is that randomly-initialized CNNs readily provide...

10.1109/iccv48922.2021.00637 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
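
The observation that randomly-initialized CNNs already yield workable correspondences can be pictured with mutual nearest-neighbor matching between features of a frozen, untrained backbone; the surviving matches then serve as pseudo-labels for training the actual feature extractor. The backbone choice and matching rule below are assumptions for illustration, not the paper's exact pipeline.

```python
import torch
import torchvision

# A frozen, randomly-initialized backbone: its features are weak but, per the
# paper's observation, still good enough to seed correspondences.
random_backbone = torchvision.models.resnet18(weights=None)
random_backbone = torch.nn.Sequential(*list(random_backbone.children())[:-2]).eval()

@torch.no_grad()
def pseudo_correspondences(img_a, img_b):
    """Return mutual nearest-neighbor matches between two frames.

    img_a, img_b: (1, 3, H, W) tensors. Output: (K, 2) index pairs into the
    flattened feature maps, usable as pseudo ground-truth correspondences.
    """
    fa = random_backbone(img_a).flatten(2).squeeze(0).t()   # (Na, C)
    fb = random_backbone(img_b).flatten(2).squeeze(0).t()   # (Nb, C)
    fa = torch.nn.functional.normalize(fa, dim=1)
    fb = torch.nn.functional.normalize(fb, dim=1)
    sim = fa @ fb.t()
    nn_ab = sim.argmax(dim=1)            # best match in B for each A feature
    nn_ba = sim.argmax(dim=0)            # best match in A for each B feature
    idx_a = torch.arange(fa.shape[0])
    mutual = nn_ba[nn_ab] == idx_a       # keep only mutual nearest neighbors
    return torch.stack([idx_a[mutual], nn_ab[mutual]], dim=1)
```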

Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and communicate concepts. Building on this intuition, we propose an alternative approach to visual representation learning: using language similarity to sample semantically similar image pairs for contrastive learning. Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of handcrafted augmentations or learned clusters. It also differs from image-text...

10.1109/cvpr52729.2023.01841 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
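
A minimal sketch of language-guided sampling, under the assumption that caption embeddings have already been computed with some off-the-shelf text encoder: nearest neighbors in caption space define the positive pairs, and a standard InfoNCE loss does the rest. The function names and temperature are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def language_guided_pairs(caption_emb):
    """Pair each image with its nearest neighbor in caption-embedding space.

    caption_emb: (N, D) sentence embeddings of each image's caption (assumed
    precomputed with any off-the-shelf text encoder). Returns (N,) indices of
    the semantically closest *other* image, used as the positive view.
    """
    emb = F.normalize(caption_emb, dim=1)
    sim = emb @ emb.t()
    sim.fill_diagonal_(-float("inf"))        # exclude the image itself
    return sim.argmax(dim=1)

def info_nce(z_a, z_b, temperature=0.07):
    """Standard InfoNCE loss between embeddings of an image and its
    language-sampled positive; other images in the batch act as negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.shape[0])
    return F.cross_entropy(logits, targets)
```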

The goal of this paper is to estimate the viewpoint for a novel object. Standard viewpoint estimation approaches generally fail on this task due to their reliance on 3D model alignment or on large amounts of class-specific training data and a corresponding canonical pose. We overcome those limitations by learning a reconstruct-and-align approach. Our key insight is that although we do not have an explicit predefined pose, we can still learn the object's shape in the viewer's frame and then use an image to provide our reference. In particular, we propose...

10.1109/cvpr42600.2020.00318 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01
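
One way to picture the reconstruct-and-align idea is a brute-force search: reconstruct the object's shape in each viewer's frame, then test discretized rotations and keep the one that best aligns the two reconstructions. The discretization and Chamfer scoring below are stand-ins for illustration; the paper's alignment procedure is more involved.

```python
import itertools
import math
import torch

def rotation_matrix(yaw, pitch):
    """Rotation for a discretized (yaw, pitch) viewpoint, angles in radians."""
    cy, sy, cp, sp = math.cos(yaw), math.sin(yaw), math.cos(pitch), math.sin(pitch)
    Rz = torch.tensor([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = torch.tensor([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    return Rz @ Ry

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a:(Na,3) and b:(Nb,3)."""
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def align_reconstructions(shape_ref, shape_query, n_bins=24):
    """Brute-force search for the relative rotation that best aligns the two
    viewer-frame reconstructions; the argmin is the estimated viewpoint."""
    angles = [2 * math.pi * i / n_bins for i in range(n_bins)]
    best_R, best_cost = None, float("inf")
    for yaw, pitch in itertools.product(angles, angles[: n_bins // 2]):
        R = rotation_matrix(yaw, pitch)
        cost = chamfer(shape_ref @ R.t(), shape_query).item()
        if cost < best_cost:
            best_R, best_cost = R, cost
    return best_R
```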

Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by only relying on close-by pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization...

10.1109/wacv56688.2023.00127 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01
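
Transformation synchronization, mentioned in the abstract, turns noisy pairwise alignments into a single globally consistent set of camera poses. The sketch below shows the rotation part using a classical spectral method (not the paper's specific SE(3) algorithm) so the term is concrete; the input convention and scaling are assumptions.

```python
import numpy as np

def synchronize_rotations(rel_rots, n_views):
    """Spectral rotation synchronization (a classical baseline).

    rel_rots: dict mapping (i, j) -> 3x3 relative rotation R_ij with
    x_i ~= R_ij x_j. Returns absolute rotations R_i consistent with the
    pairwise estimates up to one global rotation.
    """
    M = np.zeros((3 * n_views, 3 * n_views))
    for (i, j), R_ij in rel_rots.items():
        M[3 * i:3 * i + 3, 3 * j:3 * j + 3] = R_ij
        M[3 * j:3 * j + 3, 3 * i:3 * i + 3] = R_ij.T
    for i in range(n_views):
        M[3 * i:3 * i + 3, 3 * i:3 * i + 3] = np.eye(3)

    # The three leading eigenvectors span the stacked absolute rotations.
    vals, vecs = np.linalg.eigh(M)
    basis = vecs[:, -3:] * np.sqrt(n_views)

    rotations = []
    for i in range(n_views):
        block = basis[3 * i:3 * i + 3, :]
        U, _, Vt = np.linalg.svd(block)       # project each block onto SO(3)
        R = U @ Vt
        if np.linalg.det(R) < 0:
            R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
        rotations.append(R)
    return rotations
```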

Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other tasks such as detection and segmentation. Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure? In this work, we analyze the 3D awareness of visual foundation models. We posit that 3D awareness implies that representations (1) encode the 3D structure of the scene and (2) consistently represent the surface across...

10.48550/arxiv.2404.08636 preprint EN arXiv (Cornell University) 2024-04-12
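
Probing frozen representations for 3D structure typically means attaching a small trainable read-out while the foundation model stays fixed. Below is a minimal sketch of such a probe with a hypothetical backbone and a linear depth head; the paper evaluates several models and tasks, and this only illustrates the general protocol.

```python
import torch
import torch.nn as nn

class DepthProbe(nn.Module):
    """Linear read-out that maps frozen per-patch features to depth.

    `backbone` is any frozen vision model returning a (B, N_patches, C)
    feature grid; the probe is just a linear projection, so its accuracy
    mostly reflects what the frozen features already encode about 3D.
    """
    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad_(False)          # the foundation model stays frozen
        self.head = nn.Linear(feat_dim, 1)   # only the probe is trained

    def forward(self, images):
        with torch.no_grad():
            feats = self.backbone(images)    # (B, N, C) patch features
        return self.head(feats).squeeze(-1)  # (B, N) per-patch depth estimate

def probe_loss(pred_depth, gt_depth):
    """Scale-invariant style objective in log-depth, a common choice for
    monocular depth probes (the exact objective here is illustrative)."""
    diff = torch.log(pred_depth.clamp(min=1e-3)) - torch.log(gt_depth.clamp(min=1e-3))
    return (diff ** 2).mean() - 0.5 * diff.mean() ** 2
```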

10.7302/22900 article EN Deep Blue (University of Michigan) 2024-01-01

Humans have an unparalleled visual intelligence and can overcome ambiguities that machines currently cannot. Recent works have shown that incorporating guidance from humans during inference for monocular viewpoint estimation can help in difficult cases in which the computer alone would have otherwise failed. These hybrid approaches are hence gaining traction. However, deciding what question to ask the human at inference time remains unknown for these problems. We address this by formulating it as the Adviser Problem: we learn a...

10.48550/arxiv.1802.01666 preprint EN other-oa arXiv (Cornell University) 2018-01-01
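
The Adviser Problem amounts to scoring candidate questions and asking the one expected to help the model most. The paper learns this selection; a simple expected-entropy heuristic, sketched below with made-up tensor shapes, conveys what the adviser has to decide.

```python
import torch

def entropy(p, eps=1e-8):
    """Shannon entropy of a categorical distribution, per row."""
    return -(p * (p + eps).log()).sum(dim=-1)

def choose_question(posteriors_per_question):
    """Pick the question whose (simulated) answer reduces uncertainty most.

    posteriors_per_question: (Q, A, K) tensor -- for each of Q candidate
    questions and each of its A possible human answers, the model's updated
    distribution over K outputs (e.g. viewpoint bins). A learned adviser
    network would replace this hand-built expected-entropy criterion.
    """
    # Expected entropy after asking each question, averaging over answers
    # (assumed equally likely here for simplicity).
    expected_entropy = entropy(posteriors_per_question).mean(dim=1)   # (Q,)
    return int(expected_entropy.argmin())
```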

Although an object may appear in numerous contexts, we often describe it in a limited number of ways. Language allows us to abstract away visual variation to represent and communicate concepts. Building on this intuition, we propose an alternative approach to visual representation learning: using language similarity to sample semantically similar image pairs for contrastive learning. Our approach diverges from image-based contrastive learning by sampling view pairs using language similarity instead of hand-crafted augmentations or learned clusters. It also differs from image-text...

10.48550/arxiv.2302.12248 preprint EN other-oa arXiv (Cornell University) 2023-01-01

The goal of this paper is to estimate the viewpoint for a novel object. Standard viewpoint estimation approaches generally fail on this task due to their reliance on 3D model alignment or on large amounts of class-specific training data and a corresponding canonical pose. We overcome those limitations by learning a reconstruct-and-align approach. Our key insight is that although we do not have an explicit predefined pose, we can still learn the object's shape in the viewer's frame and then use an image to provide our reference. In particular, we propose...

10.48550/arxiv.2006.03586 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by only relying on close-by pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization algorithm...

10.48550/arxiv.2212.03236 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on pose or correspondence. Our key observation is that randomly-initialized CNNs readily provide...

10.48550/arxiv.2106.00677 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM. Recent approaches have proposed end-to-end systems that can outperform traditional methods by leveraging pose supervision. However, with the rising prevalence of cameras with depth sensors, we can expect a new stream of raw RGB-D data without the annotations needed for supervision. We propose UnsupervisedR&R: an unsupervised approach to learning point cloud registration...

10.48550/arxiv.2102.11870 preprint EN other-oa arXiv (Cornell University) 2021-01-01