- 3D Shape Modeling and Analysis
- Robotics and Sensor-Based Localization
- Advanced Vision and Imaging
- Computer Graphics and Visualization Techniques
- Advanced Neural Network Applications
- Medical Image Segmentation Techniques
- Image Processing and 3D Reconstruction
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Image and Object Detection Techniques
- Medical Imaging and Analysis
- Geophysics and Gravity Measurements
- 3D Surveying and Cultural Heritage
- Computational Physics and Python Applications
- Optical Measurement and Interference Techniques
- Generative Adversarial Networks and Image Synthesis
- Multimodal Machine Learning Applications
- Neural Networks and Applications
- Remote Sensing and LiDAR Applications
- Pulsars and Gravitational Waves Research
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Machine Learning and Data Classification
- Advanced Data Compression Techniques
- Industrial Vision Systems and Defect Detection
Google (United States)
2021-2024
University of Pennsylvania
2017-2023
Hospital Israelita Albert Einstein
2022
Google (Canada)
2022
Philadelphia University
2019
California University of Pennsylvania
2017
Classical light field rendering for novel view synthesis can accurately reproduce view-dependent effects such as reflection, refraction, and translucency, but requires a dense sampling of the scene. Methods based on geometric reconstruction need only sparse views, but cannot accurately model non-Lambertian effects. We introduce a model that combines the strengths and mitigates the limitations of these two directions. By operating on a four-dimensional representation of the light field, our model learns to represent view-dependent effects accurately. By enforcing constraints...
Several popular approaches to 3D vision tasks process multiple views of the input independently with deep neural networks pre-trained on natural images, where view permutation invariance is achieved through a single round of pooling over all views. We argue that this operation discards important information and leads to subpar global descriptors. In this paper, we propose a group convolutional approach to view aggregation, where convolutions are performed over a discrete subgroup of the rotation group, enabling, thus, joint...
Convolutional neural networks (CNNs) are inherently equivariant to translation. Efforts to embed other forms of equivariance have concentrated solely on rotation. We expand the notion of equivariance in CNNs through the Polar Transformer Network (PTN). PTN combines ideas from the Spatial Transformer Network (STN) and canonical coordinate representations. The result is a network invariant to translation and equivariant to both rotation and scale. PTN is trained end-to-end and composed of three distinct stages: a polar origin predictor, the newly introduced polar transformer module...
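The canonical-coordinate idea mentioned in this abstract can be illustrated with a small sketch. It is not the PTN implementation. It only shows the standard property that, in log-polar coordinates about a fixed origin, a rotation and a uniform scaling of 2D points become a constant shift. The helper name `to_log_polar` and the sample points are assumptions made for illustration.

```python
import numpy as np

def to_log_polar(xy):
    """Map 2D points (x, y) to (log radius, angle) about the origin.

    Hypothetical helper for illustration; points must be non-zero.
    """
    x, y = xy[..., 0], xy[..., 1]
    return np.stack([np.log(np.hypot(x, y)), np.arctan2(y, x)], axis=-1)

# Sample points chosen so that no angle wraps around +/- pi after rotation.
points = np.array([[1.0, 0.5], [0.3, 0.2], [0.8, 0.1]])

scale, angle = 2.0, 0.4
c, s = np.cos(angle), np.sin(angle)
rot = np.array([[c, -s], [s, c]])

# Rotate by `angle` and scale by `scale` about the origin.
transformed = scale * points @ rot.T

# In log-polar coordinates every point moves by the same offset:
# (log(scale), angle).
shift = to_log_polar(transformed) - to_log_polar(points)
print(shift)
```

Because the offset is the same for every point, a translation-equivariant operator such as a planar convolution applied in these coordinates responds to rotation and scale with a shift, provided the origin is chosen consistently.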
We consider the problem of finding consistent matches across multiple images. Current state-of-the-art solutions use constraints on cycles of matches together with convex optimization, leading to computationally intensive iterative algorithms. In this paper, we instead propose a clustering-based formulation: we first rigorously show its equivalence with traditional approaches, and then propose QuickMatch, a novel algorithm that identifies multi-image matches from a density function in feature space. Specifically, QuickMatch uses...
Symmetric orthogonalization via SVD, and closely related procedures, are well-known techniques for projecting matrices onto $O(n)$ or $SO(n)$. These tools have long been used for applications in computer vision, for example optimal 3D alignment problems solved by orthogonal Procrustes, rotation averaging, or Essential matrix decomposition. Despite its utility in different settings, SVD orthogonalization as a procedure for producing rotation matrices is typically overlooked in deep learning models, where the preferences tend toward classic...
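The projection described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the standard special-orthogonal Procrustes construction, not code from the paper. The function name `project_to_so_n` and the random test matrix are assumptions.

```python
import numpy as np

def project_to_so_n(m):
    """Return the rotation matrix closest to `m` in Frobenius norm.

    Hypothetical helper: computes U diag(1, ..., 1, det(U V^T)) V^T
    from the SVD m = U S V^T. The sign correction on the last singular
    direction guarantees determinant +1 (a rotation, not a reflection).
    """
    u, _, vt = np.linalg.svd(m)
    d = np.ones(m.shape[0])
    d[-1] = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag(d) @ vt

# Example: orthogonalize an arbitrary 3x3 matrix, such as a raw
# 9-dimensional network output reshaped to 3x3.
rng = np.random.default_rng(0)
noisy = rng.normal(size=(3, 3))
r = project_to_so_n(noisy)

print(np.allclose(r @ r.T, np.eye(3)))  # orthogonality check
print(np.linalg.det(r))                 # determinant check
```

Dropping the sign correction (using `u @ vt` directly) gives the projection onto $O(n)$ instead, which may return a reflection.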
A critical obstacle preventing NeRF models from being deployed broadly in the wild is their reliance on accurate camera poses. Consequently, there is growing interest in extending NeRF models to jointly optimize camera poses and scene representation, which offers an alternative to off-the-shelf SfM pipelines that have well-understood failure modes. Existing approaches for unposed NeRF operate under limiting assumptions, such as a prior pose distribution or coarse pose initialization, making them less effective in a general setting. In...
Group equivariant neural networks have been explored in the past few years and are interesting from theoretical and practical standpoints. They leverage concepts from group representation theory, non-commutative harmonic analysis, and differential geometry that do not often appear in machine learning. In practice, they have been shown to reduce sample and model complexity, notably in challenging tasks where input transformations such as arbitrary rotations are present. We begin this work with an exposition of group representation theory and the machinery...
Single image pose estimation is a fundamental problem in many vision and robotics tasks, and existing deep learning approaches suffer by not completely modeling and handling: i) uncertainty about the predictions, and ii) symmetric objects with multiple (sometimes infinite) correct poses. To this end, we introduce a method to estimate arbitrary, non-parametric distributions on SO(3). Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose. Grid sampling or...
To evaluate whether a strategy of double-dose influenza vaccination during hospitalization for an acute coronary syndrome (ACS) compared with standard-dose outpatient vaccination (as recommended by current guidelines) would further reduce the risk of major cardiopulmonary events. Vaccination against Influenza to Prevent cardiovascular events after Acute Coronary Syndromes (VIP-ACS) was a pragmatic, randomized, multicentre, active-comparator, open-label trial with blinded outcome adjudication comparing two...
We present a method for joint alignment of sparse in-the-wild image collections of an object category. Most prior works assume either ground-truth keypoint annotations or a large dataset of images of a single object category. However, neither of the above assumptions holds true for long-tail objects in the world. We present a self-supervised technique that directly optimizes on a sparse collection of images of a particular object/object category to obtain consistent dense correspondences across the collection. We use pairwise nearest neighbors obtained from deep features...
Learning equivariant representations is a promising way to reduce sample and model complexity and improve the generalization performance of deep neural networks. Spherical CNNs are successful examples, producing SO(3)-equivariant representations of spherical inputs. There are two main types of spherical CNNs. The first type lifts the inputs to functions on the rotation group SO(3) and applies convolutions on the group, which are computationally expensive since SO(3) has one extra dimension. The second type applies convolutions directly on the sphere, which are limited to zonal (isotropic) filters, and thus have limited expressivity....
Spherical CNNs generalize CNNs to functions on the sphere, by using spherical convolutions as the main linear operation. The most accurate and efficient way to compute spherical convolutions is in the spectral domain (via the convolution theorem), which is still costlier than the usual planar convolutions. For this reason, applications of spherical CNNs have so far been limited to small problems that can be approached with low model capacity. In this work, we show how spherical CNNs can be scaled for much larger problems. To achieve this, we make critical improvements including novel...
Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations, making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should commute with rotations of the object. We introduce a cross-domain embedding from 2D images into a spherical CNN latent space. This embedding encodes 3D shape properties and is equivariant to 3D rotations of the object. The model is supervised only by...
With the recent proliferation of consumer-grade 360° cameras, it is worth revisiting visual perception challenges with spherical cameras given the potential benefit of their global field of view. To this end we introduce a spherical convolutional hourglass network (SCHN) for dense labeling on the sphere. The SCHN is invariant to camera orientation (lifting the usual requirement for "upright" panoramic images), and its design is scalable for larger practical datasets. Initial experiments show promising results on a spherical semantic segmentation task.