Matthias Nießner

ORCID: 0000-0001-6093-5199
Research Areas
  • 3D Shape Modeling and Analysis
  • Advanced Vision and Imaging
  • Computer Graphics and Visualization Techniques
  • Generative Adversarial Networks and Image Synthesis
  • 3D Surveying and Cultural Heritage
  • Robotics and Sensor-Based Localization
  • Human Pose and Action Recognition
  • Face Recognition and Analysis
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Advanced Numerical Analysis Techniques
  • Optical measurement and interference techniques
  • Image Processing and 3D Reconstruction
  • Digital Media Forensic Detection
  • Remote Sensing and LiDAR Applications
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Speech and Audio Processing
  • Computational Geometry and Mesh Generation
  • Advanced Image Processing Techniques
  • Human Motion and Animation
  • Video Surveillance and Tracking Methods
  • Adversarial Robustness in Machine Learning
  • Hand Gesture Recognition Systems
  • Industrial Vision Systems and Defect Detection

Technical University of Munich
2017-2024

Association for Computing Machinery
2021

Stanford University
2013-2020

ETH Zurich
2018

Courant Institute of Mathematical Sciences
2018

New York University
2018

Tel Aviv University
2018

Czech Academy of Sciences, Institute of Computer Science
2018

Intel (United States)
2018

Palo Alto University
2014-2015

A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available - current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture...

10.1109/cvpr.2017.261 article EN 2017-07-01

The rapid progress in synthetic image generation and manipulation has now come to a point where it raises significant concerns for the implications towards society. At best, this leads to a loss of trust in digital content, but it could potentially cause further harm by spreading false information or fake news. This paper examines the realism of state-of-the-art image manipulations, and how difficult it is to detect them, either automatically or by humans. To standardize the evaluation of detection methods, we propose an automated...

10.1109/iccv.2019.00009 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved...

10.1109/cvpr.2016.262 article EN 2016-06-01
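The dense photometric consistency measure mentioned in the abstract above can be illustrated with a toy sketch (illustrative only, not the authors' implementation): the tracking energy is the summed squared per-pixel color difference between a synthesized rendering of the face model and the observed frame, and the model parameters are optimized to minimize it.

```python
import numpy as np

def photometric_energy(rendered, observed):
    """Sum of squared per-pixel color residuals between a rendering
    and an observed camera frame (toy dense photometric term)."""
    r = rendered.astype(np.float64) - observed.astype(np.float64)
    return float(np.sum(r * r))

# A 4x4 RGB rendering that is uniformly off by 1 from the observation
# yields one unit of squared error per channel per pixel.
a = np.zeros((4, 4, 3))
b = np.ones((4, 4, 3))
print(photometric_energy(a, b))  # 48.0 = 4 * 4 * 3 residuals of 1
```

In practice such a term is minimized over the parameters of a parametric face model with a nonlinear least-squares solver; the function above only shows the shape of the data term.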

The modern computer graphics pipeline can synthesize images at remarkable visual quality; however, it requires well-defined, high-quality 3D content as input. In this work, we explore the use of imperfect 3D content, for instance, obtained from photo-metric reconstructions with noisy and incomplete surface geometry, while still aiming to produce photo-realistic (re-)renderings. To address this challenging problem, we introduce Deferred Neural Rendering, a new paradigm for image synthesis that combines...

10.1145/3306346.3323035 article EN ACM Transactions on Graphics 2019-07-12

Matching local geometric features on real-world depth images is a challenging task due to the noisy, low-resolution, and incomplete nature of 3D scan data. These difficulties limit the performance of current state-of-the-art methods, which are typically based on histograms over geometric properties. In this paper, we present 3DMatch, a data-driven model that learns a local volumetric patch descriptor for establishing correspondences between partial 3D data. To amass training data for our model, we propose a self-supervised feature learning...

10.1109/cvpr.2017.29 article EN 2017-07-01
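Descriptor-based correspondence matching of the kind described above can be sketched as follows. This is a minimal illustration, not the paper's network: the hypothetical `embed()` stands in for the learned volumetric patch descriptor, and matching is a nearest-neighbor search in descriptor space.

```python
import numpy as np

def embed(patch):
    """Placeholder for a learned volumetric patch descriptor; here just
    a flatten-and-normalize stub so the matching logic is runnable."""
    v = patch.ravel().astype(np.float64)
    return v / (np.linalg.norm(v) + 1e-8)

def match(desc_a, descs_b):
    """Return the index of the nearest descriptor in descs_b by L2 distance."""
    d = np.linalg.norm(descs_b - desc_a, axis=1)
    return int(np.argmin(d))

rng = np.random.default_rng(0)
patches_b = [rng.random((30, 30, 30)) for _ in range(5)]  # candidate local patches
descs_b = np.stack([embed(p) for p in patches_b])

# A slightly perturbed copy of patch 3 should match back to index 3.
query = embed(patches_b[3] + 0.01 * rng.random((30, 30, 30)))
print(match(query, descs_b))
```

With a real learned descriptor, the same nearest-neighbor logic establishes correspondences between patches from different partial scans of the same surface.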

Online 3D reconstruction is gaining newfound interest due to the availability of real-time consumer depth cameras. The basic problem takes live overlapping depth maps as input and incrementally fuses these into a single 3D model. This is challenging, particularly when real-time performance is desired without trading quality or scale. We contribute an online system for large and fine scale volumetric reconstruction based on a memory and speed efficient data structure. Our system uses a simple spatial hashing scheme that compresses space, and allows for real-time access...

10.1145/2508363.2508374 article EN ACM Transactions on Graphics 2013-11-01
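The spatial hashing idea described in the abstract above can be sketched in a few lines (assumed block and voxel sizes; not the authors' GPU implementation): voxel blocks are stored sparsely in a hash map keyed by integer block coordinates, so only space near observed surfaces consumes memory, and fused distance values are maintained as a running weighted average.

```python
import numpy as np

BLOCK_SIZE = 8     # voxels per block edge (assumption for illustration)
VOXEL_SIZE = 0.01  # meters per voxel (assumption for illustration)

def block_coords(p):
    """Map a 3D point (meters) to integer voxel-block coordinates."""
    return tuple(np.floor(np.asarray(p) / (VOXEL_SIZE * BLOCK_SIZE)).astype(int))

class SparseTSDFVolume:
    def __init__(self):
        self.blocks = {}  # hash map: block coords -> (tsdf, weight) arrays

    def get_block(self, coords):
        # Allocate lazily: blocks exist only where data has been integrated.
        if coords not in self.blocks:
            tsdf = np.ones((BLOCK_SIZE,) * 3, dtype=np.float32)    # truncated distances
            weight = np.zeros((BLOCK_SIZE,) * 3, dtype=np.float32) # fusion weights
            self.blocks[coords] = (tsdf, weight)
        return self.blocks[coords]

    def integrate_point(self, p, sdf, w=1.0):
        """Fuse one signed-distance observation via a running weighted average."""
        tsdf, weight = self.get_block(block_coords(p))
        i, j, k = np.floor(np.asarray(p) / VOXEL_SIZE).astype(int) % BLOCK_SIZE
        tsdf[i, j, k] = (tsdf[i, j, k] * weight[i, j, k] + sdf * w) / (weight[i, j, k] + w)
        weight[i, j, k] += w

vol = SparseTSDFVolume()
vol.integrate_point((0.05, 0.02, 0.11), sdf=0.003)
print(len(vol.blocks))  # only one block allocated for the whole volume
```

The point of the scheme is that memory scales with the observed surface area rather than the bounding volume, and lookup is a single hash probe instead of a tree traversal.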

We present a novel approach that enables photo-realistic re-animation of portrait videos using only an input video. In contrast to existing approaches that are restricted to manipulations of facial expressions only, we are the first to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor. The core of our approach is a generative neural network with a space-time architecture. The network takes as input synthetic renderings of a parametric face model, based on which it predicts photo-realistic video frames for...

10.1145/3197517.3201283 article EN ACM Transactions on Graphics 2018-07-30

Real-time, high-quality, 3D scanning of large-scale scenes is key to mixed reality and robotic applications. However, scalability brings challenges of drift in pose estimation, introducing significant errors in the accumulated model. Approaches often require hours of offline processing to globally correct model errors. Recent online methods demonstrate compelling results but suffer from: (1) needing minutes to perform online correction, preventing true real-time use; (2) brittle frame-to-frame (or frame-to-model)...

10.1145/3072959.3054739 article EN ACM Transactions on Graphics 2017-07-16

A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available -- current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture...

10.48550/arxiv.1702.04405 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved...

10.1145/2929464.2929475 article EN 2016-07-19

We present a combined hardware and software solution for markerless reconstruction of non-rigidly deforming physical objects with arbitrary shape in real-time. Our system uses a single self-contained stereo camera unit built from off-the-shelf components and consumer graphics hardware to generate spatio-temporally coherent 3D models at 30 Hz. A new stereo matching algorithm estimates real-time RGB-D data. We start by scanning a smooth template model of the subject as they move rigidly. This geometric surface prior avoids strong...

10.1145/2601097.2601165 article EN ACM Transactions on Graphics 2014-07-22

We present a method for the real-time transfer of facial expressions from an actor in a source video to an actor in a target video, thus enabling ad-hoc control of the facial expressions of the target actor. The novelty of our approach lies in the transfer and photorealistic re-rendering of facial deformations and detail into the target video in a way that the newly-synthesized expressions are virtually indistinguishable from a real video. To achieve this, we accurately capture the facial performances of the source and target subjects in real-time using a commodity RGB-D sensor. For each frame, we jointly fit a parametric model for identity, expression, and skin reflectance to the input color...

10.1145/2816795.2818056 article EN ACM Transactions on Graphics 2015-10-27

Efficient rendering of photo-realistic virtual worlds is a long standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning have given rise to a new approach to image synthesis and editing, namely deep...

10.1111/cgf.14022 article EN publisher-specific-oa Computer Graphics Forum 2020-05-01

With recent advances in computer vision and graphics, it is now possible to generate videos with extremely realistic synthetic faces, even in real time. Countless applications are possible, some of which raise a legitimate alarm, calling for reliable detectors of fake videos. In fact, distinguishing between original and manipulated video can be a challenge for humans and computers alike, especially when the videos are compressed or have low resolution, as it often happens on social networks. Research on the detection of face...

10.48550/arxiv.1803.09179 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels. The key contribution of our method is its ability to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases. To this end, we devise a fully-convolutional generative 3D CNN model whose filter kernels are invariant to the overall scene size. The model can be trained on scene subvolumes but deployed on arbitrarily large scenes at test time. In...

10.1109/cvpr.2018.00481 preprint EN 2018-06-01
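The size invariance claimed in the abstract above follows from the nature of convolution, as a toy sketch shows (this is a naive single-filter illustration, not the paper's architecture): the same fixed kernel applies unchanged to a training-sized subvolume and to a much larger scene at test time, with the output simply growing with the input.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid'-mode 3D convolution: the kernel is fixed,
    so the same filter works for any input volume size."""
    k = kernel.shape[0]
    out_shape = tuple(s - k + 1 for s in volume.shape)
    out = np.zeros(out_shape)
    for i in range(out_shape[0]):
        for j in range(out_shape[1]):
            for l in range(out_shape[2]):
                out[i, j, l] = np.sum(volume[i:i+k, j:j+k, l:l+k] * kernel)
    return out

kernel = np.ones((3, 3, 3)) / 27.0   # fixed filter, independent of scene size
subvolume = np.ones((6, 6, 6))       # training-sized chunk
full_scene = np.ones((12, 12, 12))   # larger scene at test time

print(conv3d_valid(subvolume, kernel).shape)   # (4, 4, 4)
print(conv3d_valid(full_scene, kernel).shape)  # (10, 10, 10)
```

Because no layer in a fully-convolutional network has parameters tied to the input dimensions, training on small subvolumes and testing on whole scenes uses the exact same weights.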

Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and comprehensive, diverse panoramic set of views over entire buildings...

10.48550/arxiv.1709.06158 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The computer graphics and computer vision communities have dedicated long standing efforts in building computerized tools for reconstructing, tracking, and analyzing human faces based on visual input. Over the past years rapid progress has been made, which led to novel and powerful algorithms that obtain impressive results even in the very challenging case of reconstruction from a single RGB or RGB-D camera. The range of applications is vast and steadily growing as these technologies are further improving in speed,...

10.1111/cgf.13382 article EN Computer Graphics Forum 2018-05-01

The advent of affordable consumer grade RGB-D cameras has brought about a profound advancement of visual scene reconstruction methods. Both computer graphics and computer vision researchers spend significant effort to develop entirely new algorithms to capture comprehensive shape models of static and dynamic scenes with RGB-D cameras. This led to advances in the state of the art along several dimensions. Some methods achieve very high reconstruction detail, despite limited sensor resolution. Others even achieve real-time performance, yet...

10.1111/cgf.13386 article EN Computer Graphics Forum 2018-05-01

Face2Face is an approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast...

10.1145/3292039 article EN Communications of the ACM 2018-12-19

Distinguishing manipulated from real images is becoming increasingly difficult as new sophisticated image forgery approaches come out by the day. Naive classification approaches based on Convolutional Neural Networks (CNNs) show excellent performance in detecting image manipulations when they are trained on a specific manipulation method. However, on examples from unseen manipulation approaches, their performance drops significantly. To address this limitation in transferability, we introduce Forensic-Transfer (FT). We devise a learning-based...

10.48550/arxiv.1812.02510 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take specifically defined representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanied textures (e.g.,...

10.1111/cgf.14507 article EN Computer Graphics Forum 2022-05-01