Sean Fanello

ORCID: 0000-0001-9726-4501
Research Areas
  • Advanced Vision and Imaging
  • Computer Graphics and Visualization Techniques
  • Image Enhancement Techniques
  • Human Pose and Action Recognition
  • Advanced Image and Video Retrieval Techniques
  • 3D Shape Modeling and Analysis
  • Robotics and Sensor-Based Localization
  • Advanced Image Processing Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Optical measurement and interference techniques
  • Face recognition and analysis
  • Image Processing Techniques and Applications
  • Multimodal Machine Learning Applications
  • Hand Gesture Recognition Systems
  • Video Surveillance and Tracking Methods
  • Tactile and Sensory Interactions
  • Domain Adaptation and Few-Shot Learning
  • Interactive and Immersive Displays
  • Color Science and Applications
  • Human Motion and Animation
  • Image and Signal Denoising Methods
  • Robot Manipulation and Learning
  • Industrial Vision Systems and Defect Detection
  • Augmented Reality Applications
  • Advanced Data Compression Techniques

Google (United States)
2018-2024

Italian Institute of Technology
2012-2017

Perceptive Engineering (United Kingdom)
2017

Microsoft Research (United Kingdom)
2014-2016

Microsoft (United States)
2014-2016

University of Genoa
2014

Microsoft Research (India)
2014

Sapienza University of Rome
2010

We present an end-to-end system for augmented and virtual reality telepresence, called Holoportation. Our system demonstrates high-quality, real-time 3D reconstructions of an entire space, including people, furniture and objects, using a set of new depth cameras. These models can also be transmitted in real-time to remote users. This allows users wearing virtual or augmented reality displays to see, hear and interact with remote participants in 3D, almost as if they were present in the same physical space. From an audio-visual perspective, communicating and interacting with remote users edges...

10.1145/2984511.2984517 article EN 2016-10-16

We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time. Our algorithm supports both incremental reconstruction, improving the surface estimation over time, as well as parameterizing the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an...

10.1145/2897824.2925969 article EN ACM Transactions on Graphics 2016-07-11

Abstract Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning have given rise to a new approach to image synthesis and editing, namely deep...

10.1111/cgf.14022 article EN publisher-specific-oa Computer Graphics Forum 2020-05-01

This paper presents HITNet, a novel neural network architecture for real-time stereo matching. Contrary to many recent approaches that operate on a full cost volume and rely on 3D convolutions, our approach does not explicitly build a volume and instead relies on a fast multi-resolution initialization step, differentiable 2D geometric propagation and warping mechanisms to infer disparity hypotheses. To achieve a high level of accuracy, our network not only geometrically reasons about disparities but also infers slanted plane hypotheses...

10.1109/cvpr46437.2021.01413 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
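The slanted-plane idea in the HITNet abstract can be illustrated with a tiny sketch (hypothetical helper names; not the paper's code): instead of one constant disparity per pixel, each tile is described by a plane whose per-pixel disparity varies linearly across the tile.

```python
import numpy as np

def slanted_plane_disparity(d0, dx, dy, tile_size=4):
    """Evaluate a slanted-plane disparity hypothesis over a tile.

    Rather than a single constant disparity, the tile is modeled as
    d(x, y) = d0 + dx*x + dy*y, which captures surfaces that are not
    fronto-parallel to the camera.
    """
    ys, xs = np.mgrid[0:tile_size, 0:tile_size]
    return d0 + dx * xs + dy * ys

tile = slanted_plane_disparity(d0=10.0, dx=0.5, dy=0.0)
# Disparity grows by 0.5 px per column across the tile's first row.
print(tile[0])
```

The real network predicts and refines (d0, dx, dy) per tile with learned propagation; this only shows what a single hypothesis encodes.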

We present "The Relightables", a volumetric capture system for photorealistic and high quality relightable full-body performance capture. While significant progress has been made on volumetric capture systems, focusing on 3D geometric reconstruction with high resolution textures, much less work has been done to recover the photometric properties needed for relighting. Results from such systems lack high-frequency details and the subject's shading is prebaked into the texture. In contrast, a large body of work has addressed the acquisition of image-based...

10.1145/3355089.3356571 article EN ACM Transactions on Graphics 2019-11-08

We present Motion2Fusion, a state-of-the-art 360 performance capture system that enables real-time reconstruction of arbitrary non-rigid scenes. We provide three major contributions over prior work: 1) a new non-rigid fusion pipeline allowing for far more faithful reconstruction of high frequency geometric details, avoiding the over-smoothing and visual artifacts observed previously. 2) high speed coupled with a machine learning technique for 3D correspondence field estimation, reducing tracking errors that are attributed to fast motions...

10.1145/3130800.3130801 article EN ACM Transactions on Graphics 2017-11-20

We present a novel machine learning based algorithm extending the interaction space around mobile devices. The technique uses only the RGB camera now commonplace on off-the-shelf mobile devices. Our algorithm robustly recognizes a wide range of in-air gestures, supporting user variation and varying lighting conditions. We demonstrate that our algorithm runs in real-time on unmodified mobile devices, including resource-constrained smartphones and smartwatches. Our goal is not to replace the touchscreen as the primary input device, but rather to augment and enrich...

10.1145/2642918.2647373 article EN 2014-10-01

Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus in real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take a novel approach to augment such systems with a deep architecture that takes a rendering from an arbitrary viewpoint, and jointly performs completion, super resolution, and denoising of the imagery...

10.1145/3272127.3275099 article EN ACM Transactions on Graphics 2018-11-28

Augmented reality (AR) for smartphones has matured from a technology for earlier adopters, available only on select high-end phones, to one that is truly available to the general public. One of the key breakthroughs has been in low-compute methods for six degree of freedom (6DoF) tracking on phones using only existing hardware (camera and inertial sensors). 6DoF tracking is the cornerstone of smartphone AR, allowing virtual content to be precisely locked on top of the real world. However, to really give users the impression of believable AR, one requires mobile depth. Without...

10.1145/3272127.3275041 article EN ACM Transactions on Graphics 2018-11-28

Structured light sensors are popular due to their robustness to untextured scenes and multipath. These systems triangulate depth by solving a correspondence problem between each camera and projector pixel. This is often framed as a local stereo matching task, correlating patches of pixels in the observed and reference image. However, this is computationally intensive, leading to reduced depth accuracy and framerate. We contribute an algorithm for solving this correspondence problem efficiently, without compromising depth accuracy. For the first time, we cast...

10.1109/cvpr.2016.587 article EN 2016-06-01
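The "local stereo matching" baseline this abstract refers to can be sketched as brute-force patch correlation (a generic illustration, not the paper's method, which avoids exactly this cost): for each pixel, slide a patch along the epipolar line of the reference image and keep the disparity with the best normalized correlation.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_patch(obs, ref, y, x, radius, max_disp):
    """Brute-force local stereo: correlate an observed patch against
    candidate positions on the same row of the reference image."""
    patch = obs[y - radius:y + radius + 1, x - radius:x + radius + 1]
    best_d, best_score = 0, -np.inf
    for d in range(max_disp):
        xr = x - d
        if xr - radius < 0:
            break
        cand = ref[y - radius:y + radius + 1, xr - radius:xr + radius + 1]
        score = zncc(patch, cand)
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# Toy example: the observed image is the reference shifted right by 3 px,
# so every pixel should match at disparity 3.
rng = np.random.default_rng(0)
ref = rng.random((32, 64))
obs = np.roll(ref, 3, axis=1)
print(match_patch(obs, ref, y=16, x=32, radius=3, max_disp=8))  # 3
```

The quadratic cost of this search (pixels × disparities × patch size) is exactly what motivates learning-based alternatives.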

We propose a novel system for portrait relighting and background replacement, which maintains high-frequency boundary details and accurately synthesizes the subject's appearance as lit by novel illumination, thereby producing realistic composite images for any desired scene. Our technique includes foreground estimation via alpha matting, relighting, and compositing. We demonstrate that each of these stages can be tackled in a sequential pipeline without the use of priors (e.g. known background or known illumination) and with no specialized...

10.1145/3450626.3459872 article EN ACM Transactions on Graphics 2021-07-19

The light transport (LT) of a scene describes how it appears under different lighting conditions from different viewing directions, and complete knowledge of a scene's LT enables the synthesis of novel views under arbitrary lighting. In this article, we focus on image-based LT acquisition, primarily for human bodies within a light stage setup. We propose a semi-parametric approach for learning a neural representation of the LT that is embedded in a texture atlas of known but possibly rough geometry. We model all non-diffuse and global LT as residuals added...

10.1145/3446328 article EN ACM Transactions on Graphics 2021-01-18

We present a novel technique to relight images of human faces by learning a model of facial reflectance from a database of 4D reflectance field data of several subjects in a variety of expressions and viewpoints. Using our learned model, a face can be relit in arbitrary illumination environments using only two original images recorded under spherical color gradient illumination. The output of our deep network indicates that the two images contain the information needed to estimate the full reflectance field, including specular reflections and high frequency details. While...

10.1145/3306346.3323027 article EN ACM Transactions on Graphics 2019-07-12

We present FlexSense, a new thin-film, transparent sensing surface based on printed piezoelectric sensors, which can reconstruct complex deformations without the need for any external sensing, such as cameras. FlexSense provides a fully self-contained setup which improves mobility and is not affected by occlusions. Using only a sparse set of sensors printed on the periphery of the substrate, we devise two algorithms to fully reconstruct the complex deformations of the sheet, using only these sparse sensor measurements. An evaluation shows that both proposed algorithms are capable of reconstructing...

10.1145/2642918.2647405 article EN 2014-10-01

We present a machine learning technique for estimating absolute, per-pixel depth using any conventional monocular 2D camera, with minor hardware modifications. Our approach targets close-range human capture and interaction where dense 3D estimation of hands and faces is desired. We use hybrid classification-regression forests to learn how to map from near infrared intensity images to absolute, metric depth in real-time. We demonstrate a variety of human-computer interaction scenarios. Experiments show an accuracy that...

10.1145/2601097.2601223 article EN ACM Transactions on Graphics 2014-07-22
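The physical cue behind mapping near-infrared intensity to depth can be sketched with a toy inverse-square model (my simplification; the paper instead *learns* the mapping with classification-regression forests to absorb albedo, pose, and sensor variation):

```python
import numpy as np

def depth_from_ir(intensity, k=1.0):
    """Toy inverse-square cue: with a co-located IR emitter, observed
    intensity from a roughly constant-albedo surface falls off as 1/z^2,
    so z ~ sqrt(k / I). Purely illustrative; real sensors need a learned
    or calibrated mapping."""
    return np.sqrt(k / np.maximum(intensity, 1e-8))

# A surface twice as far returns a quarter of the intensity.
print(depth_from_ir(np.array([1.0, 0.25, 0.0625])))  # [1. 2. 4.]
```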

The increasing demand for 3D content in augmented and virtual reality has motivated the development of volumetric performance capture systems such as the Light Stage. Recent advances are pushing free viewpoint relightable videos of dynamic human performances closer to photorealistic quality. However, despite significant efforts, these sophisticated systems are limited by reconstruction and rendering algorithms which do not fully model complex 3D structures and higher order light transport effects such as global...

10.1145/3414685.3417814 article EN ACM Transactions on Graphics 2020-11-27

Efficient estimation of depth from pairs of stereo images is one of the core problems in computer vision. We efficiently solve the specialized problem of stereo matching under active illumination using a new learning-based algorithm. This type of "active" stereo, i.e. where the scene texture is augmented by an active light projector, is proving compelling for designing depth cameras, largely due to improved robustness when compared to time of flight or traditional structured light techniques. Our algorithm uses an unsupervised greedy optimization scheme that learns...

10.1109/cvpr.2017.692 article EN 2017-07-01

The light stage has been widely used in computer graphics for the past two decades, primarily to enable the relighting of human faces. By capturing the appearance of a subject under different light sources, one obtains the light transport matrix of that subject, which enables image-based relighting in novel environments. However, due to the finite number of lights in the stage, the light transport matrix only represents a sparse sampling on the entire sphere of lighting directions. As a consequence, relighting with a point or directional light source that does not coincide exactly with one of the stage lights requires interpolation and resampling of the images...

10.1145/3414685.3417821 article EN ACM Transactions on Graphics 2020-11-27
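The transport-matrix relighting that this abstract builds on is linear, and can be sketched in a few lines (toy dimensions and weights of my choosing): each column of the transport matrix is the flattened image of the subject lit by a single stage light, and relighting under a novel environment is a weighted sum of those columns.

```python
import numpy as np

# Each column of T is the (flattened) one-light-at-a-time image of the
# subject; relighting is a linear combination of those columns.
n_pixels, n_lights = 6, 4
rng = np.random.default_rng(1)
T = rng.random((n_pixels, n_lights))      # transport matrix (OLAT images)
env = np.array([0.5, 0.0, 0.25, 0.25])    # novel lighting, as per-light weights

relit = T @ env                           # image under the new lighting
print(relit.shape)                        # (6,)
```

The paper's contribution addresses the limitation this sketch exposes: with only n_lights columns, a light direction between two stage lights has no exact column and must be super-resolved rather than naively interpolated.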

We describe a novel approach for compressing truncated signed distance fields (TSDF) stored in 3D voxel grids, and their corresponding textures. To compress the TSDF, our method relies on a block-based neural network architecture trained end-to-end, achieving state-of-the-art rate-distortion trade-off. To prevent topological errors, we losslessly compress the signs of the TSDF, which also upper bounds the reconstruction error by the voxel size. To compress the texture, we designed a fast UV parameterization, generating coherent texture maps...

10.1109/cvpr42600.2020.00137 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01
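The sign-preservation argument in this abstract can be made concrete with a small sketch (illustrative values, not the paper's pipeline): packing the TSDF signs into a lossless bitmask keeps every zero crossing inside its original voxel, so the extracted surface can be off by at most one voxel no matter how aggressively the magnitudes are compressed.

```python
import numpy as np

# Toy TSDF samples along one ray; the surface lies where the sign flips.
tsdf = np.array([0.9, 0.4, -0.1, -0.7, -1.0, 0.2], dtype=np.float32)

signs = tsdf < 0                        # boolean sign mask
packed = np.packbits(signs)             # 1 bit per voxel instead of 32
unpacked = np.unpackbits(packed)[:tsdf.size].astype(bool)

# Lossless round trip: every sign (hence every zero crossing's voxel)
# is preserved exactly.
print(np.array_equal(unpacked, signs))  # True
```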

We introduce Multiresolution Deep Implicit Functions (MDIF), a hierarchical representation that can recover fine geometry detail, while being able to perform global operations such as shape completion. Our model represents a complex 3D shape with a hierarchy of latent grids, which can be decoded into different levels of detail and also achieve better accuracy. For shape completion, we propose latent grid dropout to simulate partial data in the latent space and therefore defer the completion functionality to the decoder side. This, along with our...

10.1109/iccv48922.2021.01284 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and head poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism. To reduce over-smoothing and improve out-of-model expression synthesis, we predict local features anchored on the 3DMM geometry. These features are driven by the 3DMM deformation and interpolated in 3D space to yield...

10.1109/cvpr52729.2023.01620 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01