- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Computer Graphics and Visualization Techniques
- Generative Adversarial Networks and Image Synthesis
- Visual Perception and Processing Mechanisms
- Advanced Image Processing Techniques
- Image and Signal Denoising Methods
- Neural Networks and Applications
- Robotics and Sensor-Based Localization
- Advanced Neural Network Applications
- 3D Surveying and Cultural Heritage
- Visual Attention and Saliency Detection
- Image Processing and 3D Reconstruction
- Music and Audio Processing
- Remote Sensing and LiDAR Applications
- Advanced Numerical Analysis Techniques
- Robot Manipulation and Learning
- Model Reduction and Neural Networks
- Neural Networks and Reservoir Computing
- Face Recognition and Analysis
- Advanced Image Fusion Techniques
- Domain Adaptation and Few-Shot Learning
- Industrial Vision Systems and Defect Detection
- Olfactory and Sensory Function Studies
- Speech and Audio Processing
Massachusetts Institute of Technology
2017-2024
Moscow Institute of Thermal Technology
2022-2023
Stanford University
2017-2021
Stanford Medicine
2018
Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose...
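As a concrete illustration of the periodic-activation idea this abstract points toward, here is a minimal PyTorch sketch of a sine-activated implicit network. The layer widths, the frequency factor omega_0 = 30, and the initialization bounds are assumptions following common practice, not details taken from the abstract.

```python
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    """Linear layer followed by a sine nonlinearity."""
    def __init__(self, in_features, out_features, omega_0=30.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_features, out_features)
        # Initialization keeps pre-activations well-distributed across depth.
        with torch.no_grad():
            bound = 1.0 / in_features if is_first else (6.0 / in_features) ** 0.5 / omega_0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

# f: R^2 -> R^3, e.g. pixel coordinates -> RGB. Because sin is smooth,
# derivatives of the represented signal are available via autograd.
siren = nn.Sequential(
    SineLayer(2, 256, is_first=True),  # first layer sees raw (x, y) coordinates
    SineLayer(256, 256),
    SineLayer(256, 256),
    nn.Linear(256, 3),
)

coords = torch.rand(1024, 2, requires_grad=True)  # sample coordinates in [0, 1)^2
rgb = siren(coords)                               # differentiable w.r.t. coords
```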
Unsupervised learning with generative models has the potential of discovering rich representations of 3D scenes. While geometric deep learning has explored 3D-structure-aware representations of scene geometry, these models typically require explicit 3D supervision. Emerging neural scene representations can be trained only with posed 2D images, but existing methods ignore the three-dimensional structure of scenes. We propose Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance. SRNs represent scenes as continuous functions...
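To make the "scenes as continuous functions" idea concrete, the following hedged sketch shows a coordinate-to-feature network rendered by a learned ray-marching loop, loosely in the spirit of the abstract; the actual paper uses a more sophisticated recurrent marcher, and all sizes and step rules here are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

phi = nn.Sequential(nn.Linear(3, 256), nn.ReLU(), nn.Linear(256, 256))  # coordinate -> feature
step_net = nn.Linear(256, 1)  # predicts how far to march along each ray
rgb_net = nn.Linear(256, 3)   # decodes the final feature to a color

origins = torch.zeros(1024, 3)                    # camera center for 1024 rays
dirs = F.normalize(torch.randn(1024, 3), dim=-1)  # ray directions
depth = torch.full((1024, 1), 0.1)                # initial marching depth

for _ in range(10):                               # learned, differentiable ray marching
    feat = phi(origins + depth * dirs)
    depth = depth + torch.sigmoid(step_net(feat)) # always step forward by a positive amount
rgb = rgb_net(phi(origins + depth * dirs))        # color at the final surface estimate
```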
In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D scene without having to explicitly model its geometry. At its core, our approach is based on a Cartesian 3D grid of persistent embedded features that learn to make use of the underlying 3D scene structure. Our approach combines insights from 3D geometric computer vision with recent advances in learning image-to-image...
Convolutional neural networks (CNNs) excel in a wide variety of computer vision applications, but their high performance also comes at a high computational cost. Despite efforts to increase efficiency both algorithmically and with specialized hardware, it remains difficult to deploy CNNs in embedded systems due to tight power budgets. Here we explore a complementary strategy that incorporates a layer of optical computing prior to electronic computing, improving performance on image classification tasks while adding minimal electronic computational cost...
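A hedged sketch of the hybrid idea follows: a first convolutional layer intended for optical implementation, so its kernel is constrained non-negative (incoherent optics cannot realize negative weights), followed by a small electronic classifier. The constraint, kernel size, and classifier are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Kernel that would be realized as a passive optical element.
        self.optical_kernel = nn.Parameter(torch.rand(8, 1, 11, 11))
        self.electronic = nn.Sequential(nn.Flatten(), nn.Linear(8 * 28 * 28, 10))

    def forward(self, x):
        psf = F.relu(self.optical_kernel)   # enforce non-negative "optical" weights
        x = F.conv2d(x, psf, padding=5)     # this convolution would run in free-space optics
        return self.electronic(x)           # only this part consumes electronic power

logits = HybridNet()(torch.rand(4, 1, 28, 28))  # e.g. MNIST-sized input batch
```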
Understanding how people explore immersive virtual environments is crucial for many applications, such as designing virtual reality (VR) content, developing new compression algorithms, or learning computational models of saliency and visual attention. Whereas a body of recent work has focused on modeling saliency in desktop viewing conditions, VR is very different from these conditions in that viewing behavior is governed by stereoscopic vision and the complex interaction of head orientation, gaze, and other kinematic constraints. To...
In typical cameras the optical system is designed first; once it is fixed, the parameters of the image processing algorithm are tuned to get good image reproduction. In contrast to this sequential design approach, we consider the joint optimization of an optical system (for example, the physical shape of a lens) together with the parameters of the reconstruction algorithm. We build a fully-differentiable simulation model that maps the true source image to the reconstructed one. The model includes diffractive light propagation, depth and wavelength-dependent effects, noise...
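The following toy sketch shows the joint-optimization structure described above: a learnable point spread function stands in for the differentiable optics model, and a single convolution stands in for the reconstruction algorithm, with both receiving gradients from one loss. The PSF parameterization, noise level, and network are assumptions; the paper's simulation is far richer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

psf_logits = torch.zeros(1, 1, 9, 9, requires_grad=True)  # learnable "lens" parameters
recon_net = nn.Conv2d(1, 1, 5, padding=2)                 # stand-in reconstruction algorithm

opt = torch.optim.Adam([psf_logits, *recon_net.parameters()], lr=1e-3)

for _ in range(100):
    source = torch.rand(8, 1, 64, 64)                     # ground-truth source images
    # Softmax makes the PSF non-negative and energy-conserving.
    psf = F.softmax(psf_logits.flatten(), 0).view(1, 1, 9, 9)
    sensor = F.conv2d(source, psf, padding=4)             # differentiable image formation
    sensor = sensor + 0.01 * torch.randn_like(sensor)     # sensor noise
    loss = F.mse_loss(recon_net(sensor), source)          # one loss trains optics + algorithm
    opt.zero_grad(); loss.backward(); opt.step()
```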
Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep...
Recent advances in machine learning have led to increased interest in solving visual computing problems using methods that employ coordinate-based neural networks. These methods, which we call neural fields, parameterize physical properties of scenes or objects across space and time. They have seen widespread success in problems such as 3D shape and image synthesis, animation of human bodies, 3D reconstruction, and pose estimation. Rapid progress has led to numerous papers, but a consolidation of the discovered knowledge has not yet...
Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take specifically defined representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanying textures (e.g.,...
Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than its architecture details. But collecting, processing, and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness, and legal concerns. Synthetic data is a powerful tool with the potential to address these shortcomings: 1) it is cheap, 2) it supports rich ground-truth annotations, 3) it offers full control over the data, and 4) it can circumvent or mitigate problems regarding...
Emerging neural radiance fields (NeRF) are a promising scene representation for computer graphics, enabling high-quality 3D reconstruction and novel view synthesis from image observations. However, editing a scene represented by a NeRF is challenging, as the underlying connectionist representations such as MLPs or voxel grids are not object-centric or compositional. In particular, it has been difficult to selectively edit specific regions or objects. In this work, we tackle the problem of semantic decomposition of NeRFs...
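One way to picture region-selective editing of a field-based representation is sketched below: an auxiliary feature field is queried alongside the density field, points are selected by similarity to a query descriptor, and the selected object's density is masked out. The feature field, query, and threshold are hypothetical stand-ins, not the paper's specific machinery.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

feature_field = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 64))
density_field = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 1))

xyz = torch.randn(4096, 3)                   # 3D sample points along camera rays
query = torch.randn(64)                      # descriptor of the object to select (assumed given)
sim = F.cosine_similarity(feature_field(xyz), query.expand(4096, 64), dim=-1)
mask = (sim > 0.5).float().unsqueeze(-1)     # points assigned to the selected object
density = density_field(xyz).relu() * (1 - mask)  # "delete" the object before volume rendering
```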
We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target (such as a robot gripper or a rack used for hanging) via category-level descriptors. We employ this representation for object manipulation, where, given a task demonstration, we want to repeat the same task on a new object instance from the same category. We propose to achieve this objective by searching (via optimization) for the pose whose descriptor matches the one observed in the demonstration. NDFs are conveniently trained in a self-supervised fashion...
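The search-via-optimization step can be sketched as follows: gripper-attached query points are pushed through a descriptor field, and a pose is optimized until the descriptors match those recorded in the demonstration. The field is a toy stand-in, and the optimization is restricted to translation for brevity; the paper optimizes over full poses.

```python
import torch
import torch.nn as nn

ndf = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 64))  # stand-in descriptor field
query = torch.randn(32, 3)  # points rigidly attached to the gripper
with torch.no_grad():
    # Descriptors recorded at the demonstrated grasp pose (offset is illustrative).
    demo_desc = ndf(query + torch.tensor([0.3, 0.0, 0.1]))

translation = torch.zeros(3, requires_grad=True)  # pose parameters (translation only here)
opt = torch.optim.Adam([translation], lr=1e-2)
for _ in range(200):
    loss = (ndf(query + translation) - demo_desc).abs().sum()  # descriptor-matching energy
    opt.zero_grad(); loss.backward(); opt.step()
```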
A broad class of problems at the core of computational imaging, sensing, and low-level computer vision reduces to the inverse problem of extracting latent images that follow a prior distribution from measurements taken under a known physical image formation model. Traditionally, hand-crafted priors along with iterative optimization methods have been used to solve such problems. In this paper we present unrolled optimization with deep priors, a principled framework for infusing knowledge of the image formation into deep networks, inspired by classical...
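A minimal sketch of the unrolling pattern, under assumptions: each unrolled iteration alternates a gradient step on the data term ||Ax - y||^2 with a small learned network acting as a proximal/prior step. The operator, step sizes, and per-iteration networks are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnrolledSolver(nn.Module):
    def __init__(self, n_iters=5):
        super().__init__()
        # One learned prior step per unrolled iteration.
        self.prox = nn.ModuleList(nn.Conv2d(1, 1, 3, padding=1) for _ in range(n_iters))
        self.step = nn.Parameter(torch.full((n_iters,), 0.5))  # learnable step sizes

    def forward(self, y, A):
        x = y.clone()                      # initialize with the measurements
        for prox, t in zip(self.prox, self.step):
            grad = A(A(x) - y)             # gradient of the data term (A taken symmetric here)
            x = prox(x - t * grad)         # gradient step, then learned prior
        return x

blur = lambda im: F.avg_pool2d(im, 3, 1, 1)  # toy (approximately symmetric) forward operator
solver = UnrolledSolver()
y = blur(torch.rand(1, 1, 32, 32))           # simulated measurements
x_hat = solver(y, blur)                      # end-to-end trainable reconstruction
```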
Virtual reality systems are widely believed to be the next major computing platform. There are, however, some barriers to adoption that must be addressed, such as that of motion sickness, which can lead to undesirable symptoms including postural instability, headaches, and nausea. Motion sickness in virtual reality occurs as a result of moving visual stimuli that cause users to perceive self-motion while they remain stationary in the real world. There are several contributing factors to both this perception and the subsequent onset of sickness, including field of view,...
Neural implicit shape representations are an emerging paradigm that offers many potential benefits over conventional discrete representations, including memory efficiency at a high spatial resolution. Generalizing across shapes with such neural implicit representations amounts to learning priors over the respective function space and enables geometry reconstruction from partial or noisy observations. Existing generalization methods rely on conditioning a neural network on a low-dimensional latent code that is either regressed by an encoder...
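For reference, the latent-code conditioning scheme the abstract attributes to existing methods can be sketched as an "auto-decoder": per-shape codes are free parameters optimized jointly with a shared implicit network. Code dimension, network size, and the toy supervision are assumptions.

```python
import torch
import torch.nn as nn

n_shapes, code_dim = 100, 64
codes = nn.Parameter(torch.randn(n_shapes, code_dim) * 0.01)  # one latent code per shape
sdf_net = nn.Sequential(nn.Linear(3 + code_dim, 256), nn.ReLU(), nn.Linear(256, 1))

opt = torch.optim.Adam([codes, *sdf_net.parameters()], lr=1e-4)
shape_idx = torch.randint(0, n_shapes, (512,))       # which shape each sample belongs to
xyz = torch.randn(512, 3)                            # sample points with known SDF values
target_sdf = torch.zeros(512, 1)                     # e.g. points lying on the surface
pred = sdf_net(torch.cat([xyz, codes[shape_idx]], dim=-1))
loss = (pred - target_sdf).abs().mean()              # fit network and codes jointly
opt.zero_grad(); loss.backward(); opt.step()
```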
Inferring representations of 3D scenes from 2D observations is a fundamental problem of computer graphics, computer vision, and artificial intelligence. Emerging 3D-structured neural scene representations are a promising approach to 3D scene understanding. In this work, we propose a novel neural scene representation, Light Field Networks or LFNs, which represent both geometry and appearance of the underlying 3D scene in a 360-degree, four-dimensional light field parameterized via a neural implicit representation. Rendering a ray from an LFN requires only a single network...
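The single-evaluation rendering property can be illustrated as below: each ray is mapped to a color by one forward pass, with no marching or sampling along the ray. The Plücker parameterization of rays and the network size are assumptions consistent with the abstract's 4D light-field description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

lfn = nn.Sequential(
    nn.Linear(6, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 3),                         # 6D ray coordinates -> RGB
)

origins = torch.randn(1024, 3)                 # ray origins
dirs = F.normalize(torch.randn(1024, 3), dim=-1)
moment = torch.cross(origins, dirs, dim=-1)    # Plücker moment: o x d
rays = torch.cat([dirs, moment], dim=-1)       # line-invariant 6D ray coordinates
rgb = lfn(rays)                                # one network evaluation per ray
```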
Loss Functions for Neural Rendering (Jun-Yan Zhu)
We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair. In this challenging regime, 3D scene points are regularly observed only once, requiring prior-based reconstruction of scene geometry and appearance. We find that existing approaches to novel view synthesis from sparse observations fail due to recovering incorrect 3D geometry, and that the high cost of differentiable rendering precludes their scaling to large-scale training. We take a step towards resolving these shortcomings by formulating a multi-view transformer...
Traditional cinematography has relied for over a century on a well-established set of editing rules, called continuity editing, to create a sense of situational continuity. Despite massive changes in visual content across cuts, viewers in general experience no trouble perceiving the discontinuous flow of information as coherent events. However, Virtual Reality (VR) movies are intrinsically different from traditional movies in that the viewer controls the camera orientation at all times. As a consequence, common editing techniques...
Real-world imaging systems acquire measurements that are degraded by noise, optical aberrations, and other imperfections that make image processing for human viewing and higher-level perception tasks challenging. Conventional cameras address this problem by compartmentalizing imaging from high-level task processing. As such, conventional imaging involves processing the RAW sensor measurements in a sequential pipeline of steps, such as demosaicking, denoising, deblurring, tone-mapping, and compression. This pipeline is optimized to obtain a visually...
Denoising diffusion models are a powerful type of generative model used to capture complex distributions of real-world signals. However, their applicability is limited to scenarios where training samples are readily available, which is not always the case in real-world applications. For example, in inverse graphics, the goal is to generate samples from a distribution of 3D scenes that align with a given image, but ground-truth 3D scenes are unavailable and only 2D images are accessible. To address this limitation, we propose a novel class of denoising diffusion probabilistic models that learn...
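The loss structure this points toward can be sketched as follows: the denoiser's estimate of the unobserved latent signal is pushed through a differentiable forward model and compared with the available observation, so supervision lives entirely in observation space. This toy mirrors only the loss structure, not the paper's full construction; every component here is a stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

denoiser = nn.Sequential(nn.Linear(17, 128), nn.ReLU(), nn.Linear(128, 64))
forward_model = nn.Linear(64, 16)        # differentiable "renderer": latent -> observation
for p in forward_model.parameters():
    p.requires_grad_(False)              # fixed, known image formation model

opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)
x0 = torch.randn(32, 64)                 # latent signals, used ONLY to synthesize toy data
obs = forward_model(x0)                  # in practice, observations come from a dataset

t = torch.rand(32, 1)                                     # diffusion time
noisy_obs = (1 - t) * obs + t * torch.randn_like(obs)     # toy noising in observation space
x0_hat = denoiser(torch.cat([noisy_obs, t], dim=-1))      # estimate the clean latent
loss = F.mse_loss(forward_model(x0_hat), obs)             # loss through the forward model
opt.zero_grad(); loss.backward(); opt.step()              # no ground-truth latents needed
```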