Francesc Moreno-Noguer

ORCID: 0000-0002-8640-684X
Research Areas
  • Advanced Vision and Imaging
  • Human Pose and Action Recognition
  • 3D Shape Modeling and Analysis
  • Robotics and Sensor-Based Localization
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Human Motion and Animation
  • Computer Graphics and Visualization Techniques
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Neural Network Applications
  • Optical measurement and interference techniques
  • Video Analysis and Summarization
  • Hand Gesture Recognition Systems
  • Face recognition and analysis
  • Robot Manipulation and Learning
  • Advanced Image Processing Techniques
  • Image Retrieval and Classification Techniques
  • Image and Object Detection Techniques
  • Anomaly Detection Techniques and Applications
  • Medical Image Segmentation Techniques
  • Gait Recognition and Analysis
  • Cell Image Analysis Techniques
  • Image Enhancement Techniques
  • Handwritten Text Recognition Techniques

Universitat Politècnica de Catalunya
2016-2025

Institut de Robòtica i Informàtica Industrial
2016-2025

Consejo Superior de Investigaciones Científicas
2010-2023

Max Planck Institute for Informatics
2023

University of Tübingen
2023

Université de Bordeaux
2018

University of Surrey
2017

Imperial College London
2017

Waseda University
2017

Unidades Centrales Científico-Técnicas
2017

10.1007/s11263-008-0152-6 article EN International Journal of Computer Vision 2008-07-18

Deep learning has revolutionized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g. SIFT. In this paper we use Convolutional Neural Networks (CNNs) to learn discriminant patch representations and in particular train a Siamese network with pairs of (non-)corresponding patches. We deal with the large number of potential pairs with the combination of a stochastic sampling of the training set and an aggressive mining strategy biased towards patches that are...

10.1109/iccv.2015.22 preprint EN 2015-12-01
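
As a rough illustration of the pair-based training described in the abstract above, the following minimal sketch scores descriptor pairs with a margin-based loss and keeps only the highest-loss pairs. The hinge form of the loss, the margin value, and the names `pair_loss` and `mine_hardest` are assumptions made for illustration; the truncated abstract does not specify them, and the descriptors would in practice come from the two branches of the Siamese network.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two descriptors given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_loss(desc_a, desc_b, is_match, margin=4.0):
    """Assumed hinge-style loss: pull matching pairs together,
    push non-matching pairs at least `margin` apart."""
    d = l2_distance(desc_a, desc_b)
    return d if is_match else max(0.0, margin - d)


def mine_hardest(pairs, keep):
    """Return the `keep` pairs with the highest current loss.

    `pairs` is a list of (desc_a, desc_b, is_match) tuples. Treating
    "hard" as "highest loss" is an assumption of this sketch.
    """
    ranked = sorted(pairs, key=lambda p: pair_loss(*p), reverse=True)
    return ranked[:keep]
```

In a training loop, a random batch of pairs would be drawn first (the stochastic sampling step) and only the output of `mine_hardest` would contribute to the gradient.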

Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, Neural Radiance Fields (NeRF) [31] stands out: it trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism on the generated images, NeRF...

10.1109/cvpr46437.2021.01018 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the N body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2N-to-3N regression of the Cartesian joint coordinates. We show that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using NxN distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression. For learning...

10.1109/cvpr.2017.170 article EN 2017-07-01
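
The NxN distance-matrix representation mentioned in the abstract above can be illustrated with a short sketch. It only shows how a pose given as N joint positions is turned into a matrix of pairwise Euclidean distances; the regression from 2D to 3D matrices is not shown, and the function name `distance_matrix` is a hypothetical choice for this example.

```python
import math


def distance_matrix(joints):
    """Return the N x N matrix of pairwise Euclidean distances.

    `joints` is a sequence of N points, each a tuple of 2 coordinates
    (image joints) or 3 coordinates (3D joints).
    """
    return [[math.dist(p, q) for q in joints] for p in joints]


# Example: three 2D joints and the same joints shifted in the image.
pose = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
shifted = [(x + 10.0, y - 2.0) for x, y in pose]
```

Here `distance_matrix(pose)` and `distance_matrix(shifted)` are identical, which shows one property of the representation: it does not depend on where the pose is placed, only on the relative layout of the joints.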

Low textured scenes are well known to be one of the main Achilles heels of geometric computer vision algorithms relying on point correspondences, and in particular for visual SLAM. Yet, there are many environments in which, despite being low textured, one can still reliably estimate line-based geometric primitives, for instance in city and indoor scenes, or in the so-called "Manhattan worlds", where structured edges are predominant. In this paper we propose a solution to handle these situations. Specifically, we build upon ORB-SLAM,...

10.1109/icra.2017.7989522 article EN 2017-05-01

In this paper, we analyze the fashion of clothing of a large social website. Our goal is to learn and predict how fashionable a person looks on a photograph and suggest subtle improvements the user could make to improve her/his appeal. We propose a Conditional Random Field model that jointly reasons about several fashionability factors such as the type of outfit and garments the user is wearing, the type of the user, the photograph's setting (e.g., the scenery behind the user), and the fashionability score. Importantly, our model is able to give rich feedback back to the user, conveying which garments or even scenery she/he...

10.1109/cvpr.2015.7298688 article EN 2015-06-01

We present a novel approach for synthesizing photorealistic images of people in arbitrary poses using generative adversarial learning. Given an input image of a person and a desired pose represented by a 2D skeleton, our model renders the image of the same person under the new pose, synthesizing novel views of the parts visible in the input image and hallucinating those that are not seen. This problem has recently been addressed in a supervised manner [16, 35], i.e., during training the ground truth images under the new poses are given to the network. We go beyond these approaches by proposing a fully unsupervised...

10.1109/cvpr.2018.00899 article EN 2018-06-01

In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent in a unified manner different garment topologies (e.g. from sleeveless tops to hoodies and to open jackets), while controlling other properties like the garment size or tightness/looseness. We show our model to be applicable to a large variety of garments including T-shirts, hoodies,...

10.1109/cvpr46437.2021.01170 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

The rise of deep learning has brought remarkable progress in estimating hand geometry from images where the hands are part of the scene. This paper focuses on a new problem not explored so far, consisting in predicting how a human would grasp one or several objects, given a single RGB image of these objects. This is a problem with enormous potential in e.g. augmented reality, robotics or prosthetic design. In order to predict feasible grasps, we need to understand the semantic content of the image, its geometric structure and all interactions...

10.1109/cvpr42600.2020.00508 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

This paper tackles the problem of human motion prediction, consisting in forecasting future body poses from historically observed sequences. State-of-the-art approaches provide good results; however, they rely on deep learning architectures of arbitrary complexity, such as Recurrent Neural Networks (RNN), Transformers or Graph Convolutional Networks (GCN), typically requiring multiple training stages and more than 2 million parameters. In this paper, we show that, after combining with a series...

10.1109/wacv56688.2023.00479 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

We propose a non-iterative solution to the PnP problem (the estimation of the pose of a calibrated camera from n 3D-to-2D point correspondences) whose computational complexity grows linearly with n. This is in contrast to state-of-the-art methods that are O(n^5) or even O(n^8), without being more accurate. Our method is applicable for all n ≥ 4 and handles...

10.1109/iccv.2007.4409116 article EN 2007-01-01
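
To make the PnP setting in the abstract above concrete, the sketch below evaluates a candidate camera pose against n 3D-to-2D correspondences by measuring the mean reprojection error, which costs one pass over the points. It is not the solver proposed in the paper; the pinhole model with focal length `f` and principal point `(cx, cy)`, and the names `project` and `mean_reprojection_error`, are assumptions introduced for illustration.

```python
import math


def project(point, R, t, f, cx, cy):
    """Project a 3D point with rotation R (3x3 nested lists), translation t,
    and an assumed pinhole camera (focal length f, principal point cx, cy)."""
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * cam[0] / cam[2] + cx, f * cam[1] / cam[2] + cy)


def mean_reprojection_error(points_3d, points_2d, R, t, f, cx, cy):
    """Average pixel distance between observed and reprojected points.

    Runs in time linear in the number of correspondences.
    """
    total = 0.0
    for point, (u, v) in zip(points_3d, points_2d):
        pu, pv = project(point, R, t, f, cx, cy)
        total += math.hypot(pu - u, pv - v)
    return total / len(points_3d)
```

A PnP solver searches for the `R` and `t` that make this error small; with exact correspondences and the true pose, the error is zero.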

We propose a real-time, robust to outliers and accurate solution to the Perspective-n-Point (PnP) problem. The main advantages of our solution are twofold: first, it integrates the outlier rejection within the pose estimation pipeline with a negligible computational overhead, and second, its scalability to arbitrarily large number of correspondences. Given a set of 3D-to-2D matches, we formulate the pose estimation problem as a low-rank homogeneous system where the solution lies on its 1D null space. Outlier correspondences are those rows of the linear system which...

10.1109/cvpr.2014.71 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

We introduce a novel approach to automatically recover 3D human pose from a single image. Most previous work follows a pipelined approach: initially, a set of 2D features such as edges, joints or silhouettes are detected in the image, and then these observations are used to infer the 3D pose. Solving these two problems separately may lead to erroneous 3D poses when the feature detector has performed poorly. In this paper, we address this issue by jointly solving both the 2D detection and the 3D inference problems. For this purpose, we propose a Bayesian...

10.1109/cvpr.2013.466 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

Markerless 3D human pose detection from a single image is a severely underconstrained problem because different 3D poses can have similar image projections. In order to handle this ambiguity, current approaches rely on prior shape models that can only be correctly adjusted if 2D image features are accurately detected. Unfortunately, although current 2D part detector algorithms have shown promising results, they are not yet accurate enough to guarantee a complete disambiguation of the 3D inferred shape. In this paper, we introduce a novel approach...

10.1109/cvpr.2012.6247988 article EN 2012 IEEE Conference on Computer Vision and Pattern Recognition 2012-06-01

We propose a novel approach for the estimation of the pose and focal length of a camera from a set of 3D-to-2D point correspondences. Our method compares favorably to competing approaches in that it is both more accurate than existing closed form solutions, as well as faster and also more accurate than iterative ones. Our approach is inspired by the EPnP algorithm, a recent O(n) solution for the calibrated case. Yet we show that considering the focal length as an additional unknown renders the linearization and relinearization techniques of the original approach no longer valid, especially with large amounts...

10.1109/tpami.2013.36 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2013-08-20

Detecting grasping points is a key problem in cloth manipulation. Most current approaches follow a multiple re-grasp strategy for this purpose, in which clothes are sequentially grasped from different points until one of them yields a desired configuration. In this paper, by contrast, we circumvent the need for multiple re-graspings by building a robust detector that identifies the grasping points, generally in one single step, even when clothes are highly wrinkled. In order to handle the large variability a deformed cloth may have, we build a Bag of Features based detector that combines...

10.1109/icra.2012.6225045 article EN 2012-05-01

The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. Current state-of-the-art methods formulate this problem as a sequence-to-sequence task, in which a history of 3D skeletons feeds a Recurrent Neural Network (RNN) that predicts future movements, typically in the order of 1 to 2 seconds. However, one aspect that has been overlooked so far is the fact that human motion is inherently driven by interactions with objects and/or other humans in the environment. In this paper, we explore...

10.1109/cvpr42600.2020.00702 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Flow-based generative models have highly desirable properties like exact log-likelihood evaluation and exact latent-variable inference; however, they are still in their infancy and have not received as much attention as alternative generative models. In this paper, we introduce C-Flow, a novel conditioning scheme that brings normalizing flows to an entirely new scenario with great possibilities for multimodal data modeling. C-Flow is based on a parallel sequence of invertible mappings in which a source flow guides the target flow at...

10.1109/cvpr42600.2020.00797 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Recent learning approaches that implicitly represent surface geometry using coordinate-based neural representations have shown impressive results in the problem of multi-view 3D reconstruction. The effectiveness of these techniques is, however, subject to the availability of a large number (several tens) of input views of the scene, and computationally demanding optimizations. In this paper, we tackle these limitations for the specific problem of few-shot full 3D head reconstruction, by endowing coordinate-based representations with a probabilistic shape prior that enables...

10.1109/iccv48922.2021.00557 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

This paper proposes a do-it-all neural model of human hands, named LISA. The model can capture accurate hand shape and appearance, generalize to arbitrary hand subjects, provide dense surface correspondences, be reconstructed from images in the wild, and can be easily animated. We train LISA by minimizing the shape and appearance losses on a large set of multi-view RGB image sequences annotated with coarse 3D poses of the hand skeleton. For a 3D point in the local hand coordinates, our model predicts the color and the signed distance with respect to each hand bone independently, and then...

10.1109/cvpr52688.2022.01988 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Human motion prediction aims to forecast future poses given a sequence of past 3D skeletons. While this problem has recently received increasing attention, it has mostly been tackled for single humans in isolation. In this paper, we explore this problem when dealing with humans performing collaborative tasks: we seek to predict the future motion of two interacting persons given two sequences of their past skeletons. We propose a novel cross interaction attention mechanism that exploits historical information of both persons, and learns to predict cross dependencies between the two pose sequences...

10.1109/cvpr52688.2022.01271 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01