Daniel Lichy

ORCID: 0000-0001-9209-0821
Research Areas
  • Advanced Vision and Imaging
  • Computer Graphics and Visualization Techniques
  • Remote Sensing and LiDAR Applications
  • Image Enhancement Techniques
  • Robotics and Sensor-Based Localization
  • Optical Measurement and Interference Techniques
  • Advanced Image Processing Techniques
  • Color Science and Applications
  • Image and Object Detection Techniques
  • Advanced Image and Video Retrieval Techniques
  • Image Processing Techniques and Applications
  • 3D Shape Modeling and Analysis

University of Maryland, College Park
2020-2024

In this paper, we present a technique for estimating the geometry and reflectance of objects using only a camera, a flashlight, and optionally a tripod. We propose a simple data capture setup in which the user goes around the object, illuminating it with the flashlight and capturing only a few images. Our main technical contribution is the introduction of a recursive neural architecture, which can predict geometry and reflectance at 2^k × 2^k resolution given an input image...

10.1109/cvpr46437.2021.00606 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
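The coarse-to-fine recursion described in the abstract above can be sketched as follows. This is a minimal illustration of the idea (double the resolution at each step, then refine using the input image), not the paper's actual architecture; `recursive_predict` and `toy_refine` are hypothetical stand-ins for the learned network:

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbor 2x upsampling of an (H, W) map."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def recursive_predict(input_image, coarse, refine, levels):
    """Coarse-to-fine recursion: each step takes a 2^(k-1) x 2^(k-1)
    estimate and produces a refined 2^k x 2^k estimate."""
    est = coarse
    for _ in range(levels):
        est = refine(upsample2x(est), input_image)
    return est

def toy_refine(upsampled, image):
    """Placeholder for the learned refinement step: blend the upsampled
    estimate toward the image content at the current scale."""
    h, w = upsampled.shape
    target = image[:h, :w]  # crude stand-in for scale matching
    return 0.5 * upsampled + 0.5 * target
```

Because the same refinement module is reused at every level, a network trained at one resolution can, in principle, be applied recursively to reach higher resolutions.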

We propose a fast and generalizable solution to Multi-view Photometric Stereo (MVPS), called MVPSNet. The key to our approach is a feature extraction network that effectively combines images from the same view, captured under multiple lighting conditions, to extract geometric features from shading cues for stereo matching. We demonstrate that these features, termed 'Light Aggregated Feature Maps' (LAFM), are effective for matching even in textureless regions, where traditional multi-view methods often fail. Our method...

10.1109/iccv51070.2023.01151 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
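One natural way to combine per-light features for a single view, as the abstract above describes, is an order-invariant pooling over the lighting dimension, so the result does not depend on how many lights were used or in what order. The function below is an illustrative sketch of that idea, not the paper's implementation:

```python
import numpy as np

def light_aggregated_feature_maps(feats):
    """Aggregate per-light feature maps for one view.

    feats: array of shape (L, C, H, W) -- features for L lighting
    conditions. Max- and mean-pooling over the light axis are both
    permutation-invariant, so the output is independent of light
    count and ordering.
    """
    return np.concatenate([feats.max(axis=0), feats.mean(axis=0)], axis=0)
```

Because the pooled maps carry shading variation across lights, they encode geometric cues even where the surface texture itself is uniform.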

We introduce the first end-to-end learning-based solution to near-field Photometric Stereo (PS), where light sources are close to the object of interest. This setup is especially useful for reconstructing large immobile objects. Our method is fast, producing a mesh from 52 images at 512×384 resolution in about 1 second on a commodity GPU, thus potentially unlocking several AR/VR applications. Existing approaches rely on optimization coupled with a far-field PS network operating on pixels or small patches. Using...

10.1109/cvpr52688.2022.01228 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
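What distinguishes the near-field setting described above from far-field PS is that each surface point sees its own light direction and an inverse-square intensity falloff, rather than a single constant direction for the whole image. A minimal Lambertian point-light model capturing this (an illustrative assumption, not the paper's network) looks like:

```python
import numpy as np

def near_field_shading(points, normals, albedo, light_pos, intensity=1.0):
    """Lambertian shading under a near-field point light.

    points, normals: (N, 3) surface points and unit normals.
    Each point gets its own light direction and 1/d^2 falloff,
    unlike far-field PS where the direction is constant.
    """
    to_light = light_pos - points              # (N, 3) per-point vectors
    d2 = (to_light ** 2).sum(axis=1)           # squared distances
    l = to_light / np.sqrt(d2)[:, None]        # unit light directions
    ndotl = np.clip((normals * l).sum(axis=1), 0.0, None)
    return intensity * albedo * ndotl / d2
```

The per-pixel dependence on distance is what makes near-field PS sensitive to absolute depth, not just surface orientation.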

We present SfSNet, an end-to-end learning framework for producing an accurate decomposition of an unconstrained human face image into shape, reflectance and illuminance. SfSNet is designed to reflect a physical Lambertian rendering model. It learns from a mixture of labeled synthetic and unlabeled real-world images, which allows the network to capture both low-frequency variations and high-frequency details of real images through a photometric reconstruction loss. SfSNet consists of a new architecture with residual blocks that complete...

10.1109/tpami.2020.3046915 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-12-23
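The Lambertian rendering model referenced in the abstract above is commonly written as image = albedo × shading, where shading comes from surface normals and a second-order spherical-harmonics (SH) lighting representation with 9 coefficients per channel. A minimal sketch of that forward model (function names and the exact SH normalization are illustrative, not SfSNet's code):

```python
import numpy as np

def sh_basis(normals):
    """Second-order spherical-harmonics basis (9 terms, unnormalized)
    evaluated at unit normals of shape (N, 3)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        np.ones_like(x), x, y, z,
        x * y, x * z, y * z,
        x ** 2 - y ** 2, 3 * z ** 2 - 1,
    ], axis=1)

def render_lambertian(albedo, normals, light_coeffs):
    """image = albedo * shading, shading = SH(normals) @ light (9,)."""
    shading = sh_basis(normals) @ light_coeffs
    return albedo * shading
```

A differentiable renderer of this form is what makes a photometric reconstruction loss possible on unlabeled real images: the predicted shape, reflectance, and lighting must re-render to the input.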

Wide field-of-view (FoV) cameras efficiently capture large portions of the scene, which makes them attractive in multiple domains, such as automotive and robotics. For these applications, estimating depth from images is a critical task, and therefore a large amount of ground truth (GT) data is available. Unfortunately, most of this GT data is for pinhole cameras, making it impossible to properly train depth estimation models for large-FoV cameras. We propose the first method to train a stereo depth estimation model on the widely available pinhole data, and to generalize it to data captured with...

10.48550/arxiv.2401.13786 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Wide field-of-view (FoV) cameras efficiently capture large portions of the scene, which makes them attractive in multiple domains, such as automotive and robotics. For these applications, estimating depth from images is a critical task, and therefore a large amount of ground truth (GT) data is available. Unfortunately, most of this GT data is for pinhole cameras, making it impossible to properly train depth estimation models for large-FoV cameras. We propose the first method to train a stereo depth estimation model on the widely available pinhole data, and to generalize it to data captured with...

10.1109/3dv62453.2024.00056 article EN 2024 International Conference on 3D Vision (3DV) 2024-03-18

We introduce nvTorchCam, an open-source library under the Apache 2.0 license, designed to make deep learning algorithms camera model-independent. nvTorchCam abstracts critical operations such as projection and unprojection, allowing developers to implement algorithms once and apply them across diverse camera models--including pinhole, fisheye, and 360° equirectangular panoramas--which are commonly used in automotive and real estate capture applications. Built on PyTorch, nvTorchCam is fully differentiable and supports GPU acceleration...

10.48550/arxiv.2410.12074 preprint EN arXiv (Cornell University) 2024-10-15
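The abstraction described above amounts to giving every camera model the same project/unproject interface, so downstream geometry code never branches on the model. The class below is a hypothetical pinhole example in plain NumPy to illustrate the pattern; it is not nvTorchCam's actual API:

```python
import numpy as np

class PinholeCamera:
    """One concrete model behind a generic project/unproject interface.
    A fisheye or equirectangular class with the same two methods could
    be swapped in without changing any calling code."""

    def __init__(self, fx, fy, cx, cy):
        self.fx, self.fy, self.cx, self.cy = fx, fy, cx, cy

    def project(self, pts):
        """(N, 3) camera-space points -> (N, 2) pixel coordinates."""
        x = self.fx * pts[:, 0] / pts[:, 2] + self.cx
        y = self.fy * pts[:, 1] / pts[:, 2] + self.cy
        return np.stack([x, y], axis=1)

    def unproject(self, pix, depth):
        """(N, 2) pixels with (N,) depths -> (N, 3) camera-space points."""
        x = (pix[:, 0] - self.cx) / self.fx
        y = (pix[:, 1] - self.cy) / self.fy
        return np.stack([x * depth, y * depth, depth], axis=1)
```

With `unproject` and `project` as the only contract, operations like plane sweeps or view warping can be written once for all camera models.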

We propose a fast and generalizable solution to Multi-view Photometric Stereo (MVPS), called MVPSNet. The key to our approach is a feature extraction network that effectively combines images from the same view, captured under multiple lighting conditions, to extract geometric features from shading cues for stereo matching. We demonstrate that these features, termed `Light Aggregated Feature Maps' (LAFM), are effective for matching even in textureless regions, where traditional multi-view methods fail. Our method...

10.48550/arxiv.2305.11167 preprint EN other-oa arXiv (Cornell University) 2023-01-01

We introduce the first end-to-end learning-based solution to near-field Photometric Stereo (PS), where light sources are close to the object of interest. This setup is especially useful for reconstructing large immobile objects. Our method is fast, producing a mesh from 52 images at 512×384 resolution in about 1 second on a commodity GPU, thus potentially unlocking several AR/VR applications. Existing approaches rely on optimization coupled with a far-field PS network operating on pixels or small patches....

10.48550/arxiv.2203.16515 preprint EN cc-by arXiv (Cornell University) 2022-01-01

In this paper, we present a technique for estimating the geometry and reflectance of objects using only a camera, a flashlight, and optionally a tripod. We propose a simple data capture setup in which the user goes around the object, illuminating it with the flashlight and capturing only a few images. Our main technical contribution is the introduction of a recursive neural architecture, which can predict geometry and reflectance at 2^k × 2^k resolution given an input image and an estimate from the previous step at 2^(k-1) × 2^(k-1) resolution. This architecture, termed RecNet, is trained at 256×256 resolution but easily...

10.48550/arxiv.2104.06397 preprint EN cc-by arXiv (Cornell University) 2021-01-01