Matteo Toso

ORCID: 0000-0002-8990-7156
Research Areas
  • Advanced Vision and Imaging
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Robotics and Sensor-Based Localization
  • 3D Shape Modeling and Analysis
  • 3D Surveying and Cultural Heritage
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Advanced Neural Network Applications
  • Computer Graphics and Visualization Techniques
  • Wood Treatment and Properties
  • Neural dynamics and brain function
  • Video Analysis and Summarization
  • Archaeological Research and Protection
  • Human Motion and Animation
  • Photoreceptor and optogenetics research
  • Domain Adaptation and Few-Shot Learning
  • Computability, Logic, AI Algorithms
  • Bamboo properties and applications
  • Image Retrieval and Classification Techniques
  • Memory and Neural Mechanisms
  • Structural Load-Bearing Analysis
  • Image Processing and 3D Reconstruction
  • Image and Object Detection Techniques

Italian Institute of Technology
2021-2024

University of Surrey
2018-2019

We propose a CNN-based approach for multi-camera markerless motion capture of the human body. Unlike existing methods that first perform pose estimation on individual cameras and generate 3D models as post-processing, our approach makes use of 3D reasoning throughout a multi-stage approach. This novelty allows us to use provisional 3D models of human pose to rethink where the joints should be located in the image and to recover from past mistakes. Our principled refinement of 3D human poses lets us make use of image cues, even from images where we previously misdetected joints, to refine...

10.1109/3dv.2018.00061 article EN 2018 International Conference on 3D Vision (3DV) 2018-09-01

This paper presents a novel generative approach that outputs 3D indoor environments solely from a textual description of the scene. Current methods often treat scene synthesis as a mere layout prediction task, leading to rooms with overlapping objects or overly structured scenes, with limited consideration of the practical usability of the generated environment. Instead, our approach is based on a simple, but effective principle: we condition scene synthesis to generate rooms that are usable by humans. This principle is implemented by synthesizing humans that interact...

10.48550/arxiv.2502.06819 preprint EN arXiv (Cornell University) 2025-02-04

In recent years, timber-concrete composite (TCC) structures have been extensively used in Europe in both new and existing buildings. Generally speaking, a composite structure combines the advantages of both materials employed: the strength and stiffness of the concrete in compression, and the tensile strength, light weight, low embodied energy, and aesthetic appearance of the timber. The concrete slab protects the timber beams from direct contact with water, which is crucial to ensure the durability of the beams, particularly when used for bridges. Different...

10.1016/j.jtte.2018.09.001 article EN cc-by-nc-nd Journal of Traffic and Transportation Engineering (English Edition) 2018-10-04

Robustly estimating camera poses from a set of images is a fundamental task which remains challenging for differentiable methods, especially in the case of small and sparse camera pose graphs. To overcome this challenge, we propose Pose-refined Rotation Averaging Graph Optimization (PRAGO). From a set of objectness detections on unordered images, our method reconstructs the rotational pose and, in turn, the absolute pose in a differentiable manner, benefiting from the optimization of a sequence of geometrical tasks. We show how the pose-refinement module in PRAGO is able to...

10.48550/arxiv.2403.08586 preprint EN arXiv (Cornell University) 2024-03-13

We introduce Contrastive Gaussian Clustering, a novel approach capable of providing segmentation masks from any viewpoint and of enabling 3D segmentation of the scene. Recent works in novel-view synthesis have shown how to model the appearance of a scene via a cloud of 3D Gaussians, and how to generate accurate images from a given viewpoint by projecting on it the Gaussians before $\alpha$ blending their color. Following this example, we train a model to include also a segmentation feature vector for each Gaussian. These can then be used for 3D scene segmentation, by clustering Gaussians according to their feature vectors; and to generate 2D...

10.48550/arxiv.2404.12784 preprint EN arXiv (Cornell University) 2024-04-19

Robustly estimating camera poses from a set of images is a fundamental task which remains challenging for differentiable methods, especially in the case of small and sparse camera pose graphs. To overcome this challenge, we propose Pose-refined Rotation Averaging Graph Optimization (PRAGO). From a set of objectness detections on unordered images, our method reconstructs the rotational pose and, in turn, the absolute pose in a differentiable manner, benefiting from the optimization of a sequence of geometrical tasks. We show how the pose-refinement module in PRAGO is able to...

10.1109/3dv62453.2024.00117 article EN 2024 International Conference on 3D Vision (3DV) 2024-03-18

The creation of digital replicas of physical objects has valuable applications for the preservation and dissemination of tangible cultural heritage. However, existing methods are often slow, expensive, and require expert knowledge. We propose a pipeline to generate a 3D replica of a scene using only RGB images (e.g. photos of a museum) and then extract a model for each item of interest (e.g. pieces in the exhibit). We do this by leveraging the advancements of novel view synthesis and Gaussian Splatting, modified to enable efficient 3D segmentation. This...

10.48550/arxiv.2409.19039 preprint EN arXiv (Cornell University) 2024-09-27

World-wide detailed 2D maps require enormous collective efforts. OpenStreetMap is the result of 11 million registered users manually annotating the GPS location of over 1.75 billion entries, including distinctive landmarks and common urban objects. At the same time, manual annotations can include errors and are slow to update, limiting the map's accuracy. Maps from Motion (MfM) is a step forward to automatize such a time-consuming map making procedure by computing 2D maps of semantic objects directly from a collection of uncalibrated...

10.48550/arxiv.2411.12620 preprint EN arXiv (Cornell University) 2024-11-19

Efficient visual localization is crucial to many applications, such as large-scale deployment of autonomous agents and augmented reality. Traditional visual localization, while achieving remarkable accuracy, relies on extensive 3D models of the scene or large collections of geolocalized images, which are often inefficient to store and to scale to novel environments. In contrast, humans orient themselves using very abstract 2D maps, using the location of clearly identifiable landmarks. Drawing on this and on the success of recent works that...

10.48550/arxiv.2304.06373 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

We propose a CNN-based approach for multi-camera markerless motion capture of the human body. Unlike existing methods that first perform pose estimation on individual cameras and generate 3D models as post-processing, our approach makes use of 3D reasoning throughout a multi-stage approach. This novelty allows us to use provisional 3D models of human pose to rethink where the joints should be located in the image and to recover from past mistakes. Our principled refinement of 3D human poses lets us make use of image cues, even from images where we previously misdetected joints, to refine...

10.48550/arxiv.1808.01525 preprint EN other-oa arXiv (Cornell University) 2018-01-01

The estimation of the camera poses associated with a set of images commonly relies on feature matches between the images. In contrast, we are the first to address this challenge by using objectness regions to guide the pose estimation problem rather than explicit semantic object detections. We propose Pose Refiner Network (PoserNet), a light-weight Graph Neural Network to refine the approximate pair-wise relative camera poses. PoserNet exploits associations between the objectness regions - concisely expressed as bounding boxes - across multiple views to globally refine sparsely...

10.48550/arxiv.2207.09445 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01