Abhishek Kar

ORCID: 0000-0003-4724-6545
Research Areas
  • Advanced Vision and Imaging
  • Robotics and Sensor-Based Localization
  • Advanced Image Processing Techniques
  • 3D Surveying and Cultural Heritage
  • Advanced Image and Video Retrieval Techniques
  • Image Enhancement Techniques
  • Computer Graphics and Visualization Techniques
  • Advanced Neural Network Applications
  • Image and Object Detection Techniques
  • 3D Shape Modeling and Analysis
  • Medical Image Segmentation Techniques
  • Image Processing Techniques and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Optical measurement and interference techniques
  • Human Pose and Action Recognition
  • Face recognition and analysis
  • Data Visualization and Analytics
  • Explainable Artificial Intelligence (XAI)
  • Advanced Numerical Analysis Techniques
  • Educational Games and Gamification
  • Advanced MRI Techniques and Applications
  • Gaze Tracking and Assistive Technology
  • Medical Imaging and Analysis

Google (United States)
2020-2024

University of California, Berkeley
2014-2017

Microsoft Research (India)
2012

Indian Institute of Technology Kanpur
2012

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) representation, then renders novel views by blending...

10.1145/3306346.3322980 article EN ACM Transactions on Graphics 2019-07-12

Object reconstruction from a single image - in the wild - is a problem where we can make progress and get meaningful results today. This is the main message of this paper, which introduces an automated pipeline with pixels as inputs and 3D surfaces of various rigid categories as outputs in images of realistic scenes. At the core of our approach are deformable 3D models that can be learned from 2D annotations available in existing object detection datasets, that can be driven by noisy automatic object segmentations, and which we complement with a bottom-up module for recovering...

10.1109/cvpr.2015.7298807 article EN 2015-06-01

We present a learnt system for multi-view stereopsis. In contrast to recent learning based methods for 3D reconstruction, we leverage the underlying 3D geometry of the problem through feature projection and unprojection along viewing rays. By formulating these operations in a differentiable manner, we are able to learn the system end-to-end for the task of metric 3D reconstruction. End-to-end learning allows us to jointly reason about shape priors while conforming to geometric constraints, enabling reconstruction from much fewer images (even a single...

10.48550/arxiv.1708.05375 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) representation, then renders novel views by blending...

10.48550/arxiv.1905.00889 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In recent years, there has been a proliferation of multimedia applications that leverage machine learning (ML) for interactive experiences. Prototyping ML-based applications is, however, still challenging, given complex workflows that are not ideal for design and experimentation. To better understand these challenges, we conducted a formative study with seven ML practitioners to gather insights about common ML evaluation workflows.

10.1145/3544548.3581338 article EN 2023-04-19

We address the problem of fully automatic object localization and reconstruction from a single image. This is both a very challenging and very important problem which has, until recently, received limited attention due to difficulties in segmenting objects and predicting their poses. Here we leverage recent advances in learning convolutional networks for object detection and segmentation and introduce a complementary network for the task of camera viewpoint prediction. These predictors are very powerful, but still not perfect given the stringent...

10.1109/tpami.2016.2574713 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2016-06-02

Single image 3D photography enables viewers to view a still image from novel viewpoints. Recent approaches combine monocular depth networks with inpainting networks to achieve compelling results. A drawback of these techniques is the use of hard depth layering, making them unable to model intricate appearance details such as thin hair-like structures. We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views. In addition, we propose a novel depth-aware...

10.1109/iccv48922.2021.01229 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We consider the problem of enriching current object detection systems with veridical object sizes and relative depth estimates from a single image. There are several technical challenges to this, such as occlusions, lack of calibration data, and the scale ambiguity between object size and distance. These have not been addressed in full generality in previous work. Here we propose to tackle these issues by building upon advances in object recognition and using recently created large-scale datasets. We first introduce the task of amodal bounding box...

10.1109/iccv.2015.23 article EN 2015-12-01

We present a system for learning motion maps of independently moving objects from stereo videos. The only annotations used in our system are 2D object bounding boxes, which introduce the notion of objects in our system. Unlike prior learning based approaches, which have focused on predicting dense optical flow fields and/or depth maps for images, we propose to predict instance specific 3D scene flow maps and instance masks from which we derive a factored 3D motion map for each object instance. Our network takes the 3D geometry of the problem into account, which allows it to correlate the input images and distinguish static...

10.1109/cvpr.2019.00574 preprint EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

All that structure from motion algorithms "see" are sets of 2D points. We show that these impoverished views of the world can be faked for the purpose of reconstructing objects in challenging settings, such as from a single image, or from a few ones far apart, by recognizing the object and getting help from a collection of images of other objects from the same class. We synthesize virtual views by computing geodesics on networks connecting objects with similar viewpoints, and introduce techniques to increase the specificity and robustness of factorization-based object reconstruction in this...

10.1109/cvpr.2015.7298912 article EN 2015-06-01

We present a touch-free interface for viewing large imagery on mobile devices. In particular, we focus on viewing paradigms for 360 degree panoramas, parallax image sequences, and long multi-perspective panoramas. We describe a sensor fusion methodology that combines face tracking using a front-facing camera with gyroscope data to produce a robust signal that defines the viewer's 3D position relative to the display. The gyroscopic data provides both low-latency feedback and allows extrapolation of the face position beyond the field-of-view of the front-facing camera. We also...

10.1145/2207676.2208375 article EN 2012-05-05

We present a system for learning motion of independently moving objects from stereo videos. The only human annotations used in our system are 2D object bounding boxes, which introduce the notion of objects to our system. Unlike prior learning based work, which has focused on predicting a dense pixel-wise optical flow field and/or a depth map for each image, we propose to predict object instance specific 3D scene flow maps and instance masks from which we are able to derive the motion direction and speed for each object instance. Our network takes the 3D geometry of the problem into account, which allows it to correlate the input images....

10.48550/arxiv.1901.01971 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Actions as simple as grasping an object or navigating around it require a rich understanding of that object's 3D shape from a given viewpoint. In this paper we repurpose powerful learning machinery, originally developed for object classification, to discover image cues relevant for recovering the 3D shape of potentially unfamiliar objects. We cast the problem as one of local prediction of surface normals and global detection of 3D reflection symmetry planes, which open the door for extrapolating occluded surfaces from visible ones. We demonstrate our...

10.48550/arxiv.1511.07845 preprint EN other-oa arXiv (Cornell University) 2015-01-01

We propose a system for free-viewpoint facial re-enactment from casual video capture of a target subject. Our system can render and re-enact the subject consistently in all the captured views. Furthermore, our system also enables interactive re-enactment from novel views. The re-enactment is driven by an expression sequence of a source subject, which is captured using a custom app running on an iPhone X. Our system handles large pose variations while keeping the re-enactment consistent. We demonstrate the efficacy of our system by showing various applications.

10.1145/3415264.3425453 article EN 2020-11-24

We propose NeRFiller, an approach that completes missing portions of a 3D capture via generative 3D inpainting using off-the-shelf 2D visual generative models. Often parts of a captured 3D scene or object are missing due to mesh reconstruction failures or a lack of observations (e.g., contact regions, such as the bottom of objects, or hard-to-reach areas). We approach this challenging 3D inpainting problem by leveraging a 2D inpainting diffusion model. We identify a surprising behavior of these models, where they generate more 3D consistent inpaints when images form a 2$\times$2 grid, and...

10.48550/arxiv.2312.04560 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Single image 3D photography enables viewers to view a still image from novel viewpoints. Recent approaches combine monocular depth networks with inpainting networks to achieve compelling results. A drawback of these techniques is the use of hard depth layering, making them unable to model intricate appearance details such as thin hair-like structures. We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views. In addition, we propose a novel depth-aware...

10.48550/arxiv.2109.01068 preprint EN cc-by-nc-sa arXiv (Cornell University) 2021-01-01