- Generative Adversarial Networks and Image Synthesis
- 3D Shape Modeling and Analysis
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Computer Graphics and Visualization Techniques
- Image Processing Techniques and Applications
- Advanced Image Processing Techniques
- Face Recognition and Analysis
- Video Analysis and Summarization
- Domain Adaptation and Few-Shot Learning
- Optical Measurement and Interference Techniques
- Robotics and Sensor-Based Localization
- Advanced Neural Network Applications
- Time Series Analysis and Forecasting
- Interactive and Immersive Displays
- Visual Attention and Saliency Detection
- Human Motion and Animation
- Video Surveillance and Tracking Methods
Huawei Technologies (Sweden)
2023-2024
Technical University of Munich
2019-2022
Scene understanding has been of high interest in computer vision. It encompasses not only identifying objects in a scene, but also their relationships within the given context. With this goal, a recent line of works tackles 3D semantic segmentation and scene layout prediction. In our work we focus on scene graphs, a data structure that organizes the entities of a scene in a graph, where objects are nodes and their relationships are modeled as edges. We leverage inference on scene graphs as a way to carry out 3D scene understanding, mapping objects and their relationships. In particular, we propose a learned...
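As a concrete picture of the scene-graph data structure described above, here is a minimal sketch (illustrative only, not the paper's implementation): objects become nodes carrying a class label, and each relationship becomes a labelled directed edge.

    from dataclasses import dataclass, field

    @dataclass
    class SceneGraph:
        # Objects are nodes (id -> class label); relationships are labelled directed edges.
        nodes: dict = field(default_factory=dict)
        edges: list = field(default_factory=list)

        def add_object(self, node_id, label):
            self.nodes[node_id] = label

        def relate(self, subject_id, predicate, object_id):
            self.edges.append((subject_id, predicate, object_id))

    # Toy scene: a chair standing next to a table.
    g = SceneGraph()
    g.add_object(0, "chair")
    g.add_object(1, "table")
    g.relate(0, "standing next to", 1)
    print(g.nodes, g.edges)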
Image manipulation can be considered a special case of image generation where the image to be produced is a modification of an existing image. Image generation and manipulation have been, for the most part, tasks that operate on raw pixels. However, remarkable progress in learning rich image and object representations has opened the way for tasks such as text-to-image or layout-to-image generation that are mainly driven by semantics. In our work, we address the novel problem of image manipulation from scene graphs, in which a user can edit images by merely applying changes to the nodes or edges of a semantic graph generated from the image. Our...
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications. Thereby, these specifications should be abstract, i.e. allowing easy user interaction, whilst providing enough interface for detailed control. Scene graphs are representations of a scene, composed of objects (nodes) and inter-object relationships (edges), and have proven to be particularly suited to this task, as they allow for semantic control on the generated content. Previous works tackling this task often rely...
We present a method that tackles the challenge of predicting color and depth behind the visible content of an image. Our approach aims at building up a Layered Depth Image (LDI) from a single RGB input, which is an efficient representation that arranges the scene in layers, including originally occluded regions. Unlike previous work, we enable an adaptive scheme for the number of layers and incorporate semantic encoding for better hallucination of partly occluded objects. Additionally, our approach is object-driven, which especially boosts the accuracy of the intermediate...
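The Layered Depth Image mentioned above stores several colour/depth samples per pixel, so occluded content survives behind the front surface. Below is a minimal numpy sketch of that idea (an assumed layout with a fixed layer count, not the paper's adaptive scheme).

    import numpy as np

    H, W, L = 4, 4, 2                                   # toy resolution, two layers for illustration
    color = np.zeros((L, H, W, 3), dtype=np.float32)    # RGB per layer
    depth = np.full((L, H, W), np.inf, dtype=np.float32)

    # Layer 0: the visible surface; layer 1: content originally hidden behind it.
    color[0, 1:3, 1:3] = [1.0, 0.0, 0.0]                # red foreground object
    depth[0, 1:3, 1:3] = 1.0
    color[1] = [0.5, 0.5, 0.5]                          # grey background kept behind the object
    depth[1] = 3.0

    # Taking the nearest valid sample per pixel recovers the ordinary RGB view.
    front = depth.argmin(axis=0)                        # (H, W) index of the closest layer
    ys, xs = np.indices((H, W))
    visible = color[front, ys, xs]
    print(visible.shape)                                # (4, 4, 3)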
Despite recent advancements in single-domain or single-object image generation, it is still challenging to generate complex scenes containing diverse, multiple objects and their interactions. Scene graphs, composed of nodes as objects and directed edges as relationships among objects, offer an alternative representation of a scene that is more semantically grounded than images. We hypothesize that a generative model for scene graphs might be able to learn the underlying semantic structure of real-world scenes more effectively than images, and hence,...
Interactions between humans and objects are influenced not only by the object's pose and shape, but also by physical attributes such as object mass and surface friction. These introduce important motion nuances that are essential for diversity and realism. Despite advancements in recent kinematics-based methods, this aspect has been overlooked. Generating nuanced motion presents two challenges. First, it is non-trivial to learn from multi-modal information derived from both physical and non-physical attributes. Second, there exists no...
This manuscript presents the results of "A View Synthesis Challenge for Humans Heads (VSCHH)", which was part of the ICCV 2023 workshops. The paper describes the competition setup and provides details on replicating our initial baseline, TensoRF. Additionally, we provide a summary of the participants' methods and their results in a benchmark table. The challenge aimed to synthesize novel camera views of human heads using a given set of sparse training view images. The proposed solutions of the participants were evaluated and ranked based on objective...
Generation of images from scene graphs is a promising direction towards explicit scene generation and manipulation. However, the generated images lack quality, which in part comes due to the high difficulty and diversity of the data. We propose MIGS (Meta Image Generation from Scene Graphs), a meta-learning based approach for few-shot image generation from scene graphs that enables adapting the model to different scenes and increases the image quality by training on diverse sets of tasks. By sampling the data in a task-driven fashion, we train the generator using meta-learning on tasks that are categorized based on scene attributes. Our...
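The task-driven sampling mentioned above can be pictured as grouping training scenes by a shared attribute and drawing each few-shot task from a single group. The sketch below uses hypothetical attribute tags and only illustrates that idea; it is not the MIGS pipeline.

    import random
    from collections import defaultdict

    # Hypothetical scene annotations; the attribute tag stands in for whatever
    # scene property the tasks are categorized by.
    scenes = [
        {"id": 0, "attribute": "living_room"}, {"id": 1, "attribute": "living_room"},
        {"id": 2, "attribute": "street"},      {"id": 3, "attribute": "street"},
    ]

    tasks = defaultdict(list)                  # one task = all scenes sharing an attribute
    for scene in scenes:
        tasks[scene["attribute"]].append(scene)

    def sample_task(k=2):
        # Draw one few-shot task: k support scenes from a single attribute group.
        attribute = random.choice(list(tasks))
        return attribute, random.sample(tasks[attribute], k=min(k, len(tasks[attribute])))

    for _ in range(3):
        print(sample_task())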
This work addresses the problem of real-time rendering of photorealistic human body avatars learned from multi-view videos. While classical approaches to model and render virtual humans generally use a textured mesh, recent research has developed neural body representations that achieve impressive visual quality. However, these models are difficult to render in real-time and their quality degrades when the character is animated with body poses different than the training observations. We propose an animatable human model based on 3D Gaussian...
3D head animation has seen major quality and runtime improvements over the last few years, particularly empowered by the advances in differentiable rendering and neural radiance fields. Real-time rendering is a highly desirable goal for real-world applications. We propose HeadGaS, the first model to use 3D Gaussian Splats (3DGS) for 3D head reconstruction and animation. In this paper we introduce a hybrid model that extends the explicit representation from 3DGS with a base of learnable latent features, which can be linearly blended...
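One way to read the "base of learnable latent features, which can be linearly blended" is as a per-Gaussian feature basis weighted by per-frame (e.g. expression) coefficients. The sketch below is a guess at that mechanism for illustration only, not the released HeadGaS code.

    import torch

    num_gaussians, num_bases, feat_dim = 1000, 16, 32

    # Learnable latent feature basis attached to every Gaussian (assumed interpretation).
    feature_basis = torch.nn.Parameter(torch.randn(num_gaussians, num_bases, feat_dim))

    def blend(weights):
        # Linearly blend the basis with per-frame weights -> one feature vector per Gaussian.
        return torch.einsum("b,gbf->gf", weights, feature_basis)

    weights = torch.softmax(torch.randn(num_bases), dim=0)
    print(blend(weights).shape)                # torch.Size([1000, 32])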
Novel view synthesis has shown rapid progress recently, with methods capable of producing ever more photo-realistic results. 3D Gaussian Splatting has emerged as a particularly promising method, producing high-quality renderings of static scenes and enabling interactive viewing at real-time frame rates. However, it is currently limited to static scenes only. In this work, we extend the method to reconstruct dynamic scenes. We model the dynamics of a scene using a tunable MLP, which learns the deformation field from a canonical space to the set of Gaussians...
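The "tunable MLP" learning a deformation field can be sketched as a small network that maps a canonical Gaussian centre plus a time code to an offset; the minimal version below is an assumed illustration, not the paper's architecture.

    import torch
    import torch.nn as nn

    class DeformationField(nn.Module):
        # Toy deformation MLP: (canonical xyz, time) -> deformed xyz.
        def __init__(self, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + 1, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3),
            )

        def forward(self, xyz, t):
            t = t.expand(xyz.shape[0], 1)      # broadcast the scalar time to every Gaussian
            return xyz + self.net(torch.cat([xyz, t], dim=-1))

    canonical = torch.randn(2048, 3)           # canonical Gaussian centres
    deformed = DeformationField()(canonical, torch.tensor([[0.5]]))
    print(deformed.shape)                      # torch.Size([2048, 3])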
Graph representation of objects and their relations in a scene, known as a scene graph, provides a precise and discernible interface to manipulate a scene by modifying the nodes or edges of the graph. Although existing works have shown promising results in modifying the placement and pose of objects, the manipulation often leads to losing some visual characteristics like the appearance or identity of objects. In this work, we propose DisPositioNet, a model that learns a disentangled representation for each object for the task of image manipulation using scene graphs in a self-supervised manner. Our...
With the advent of deep learning, estimating depth from a single RGB image has recently received a lot of attention, being capable of empowering many different applications ranging from path planning for robotics to computational cinematography. Nevertheless, while the depth maps are in their entirety fairly reliable, the estimates around object discontinuities are still far from satisfactory. This can be attributed to the fact that the convolutional operator naturally aggregates features across object discontinuities, resulting in smooth...
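The smoothing that the convolutional operator introduces at a depth discontinuity can be seen in a tiny numerical example: averaging across the edge produces intermediate depths that lie on neither surface (a 3-tap mean filter stands in here for the learned convolution).

    import numpy as np

    # 1D depth profile with a sharp object boundary: foreground at 1 m, background at 5 m.
    depth = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])

    kernel = np.ones(3) / 3                    # simple averaging kernel
    smoothed = np.convolve(depth, kernel, mode="same")
    print(smoothed)                            # the edge is smeared into ~2.33 and ~3.67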