- Advanced Vision and Imaging
- Generative Adversarial Networks and Image Synthesis
- Computer Graphics and Visualization Techniques
- Adversarial Robustness in Machine Learning
- Digital Media Forensic Detection
- 3D Shape Modeling and Analysis
- 3D Surveying and Cultural Heritage
- Fire Detection and Safety Systems
- Video Surveillance and Tracking Methods
- Domain Adaptation and Few-Shot Learning
- Human Pose and Action Recognition
- Robotics and Sensor-Based Localization
- Spectroscopy and Chemometric Analyses
- Optical Measurement and Interference Techniques
- Image Processing and 3D Reconstruction
- Remote Sensing in Agriculture
- Image Enhancement Techniques
- Image Processing Techniques and Applications
- Smart Agriculture and AI
- Advanced Image Processing Techniques
University of California, San Diego
2020-2024
Nvidia (United States)
2022
UC San Diego Health System
2021
National Taiwan University
2013-2018
Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras. To address this task, one typically requires a large amount of labeled data for training an effective Re-ID model, which might not be practical for real-world applications. To alleviate this limitation, we choose to exploit a sufficient amount of pre-existing labeled data from a different (auxiliary) dataset. By jointly considering such an auxiliary dataset and the dataset of interest (but without label information), our proposed adaptation network...
While representation learning aims to derive interpretable features for describing visual data, representation disentanglement further results in such features so that particular image attributes can be identified and manipulated. However, one cannot easily address this task without observing ground truth annotation for the training data. To address this problem, we propose a novel deep learning model of Cross-Domain Representation Disentangler (CDRD). By observing fully annotated source-domain data and unlabeled target-domain data of interest, our model bridges...
We present a novel and unified deep learning framework which is capable of learning domain-invariant representation from data across multiple domains. Realized by adversarial training with the additional ability to exploit domain-specific information, the proposed network is able to perform continuous cross-domain image translation and manipulation, and produces desirable output images accordingly. In addition, the resulting feature representation exhibits superior performance in unsupervised domain adaptation, which also verifies the effectiveness...
Given a portrait image of a person and an environment map of the target lighting, portrait relighting aims to re-illuminate the person in the image as if the person appeared in an environment with the target lighting. To achieve high-quality results, recent methods rely on deep learning. An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desired input-output pairs, captured with a light stage. However, acquiring such data requires an expensive special capture rig and time-consuming efforts, limiting access to only a few resourceful laboratories. To address...
Recovering the 3D shape of transparent objects using a small number of unconstrained natural images is an ill-posed problem. Complex light paths induced by refraction and reflection have prevented both traditional and deep multiview stereo from solving this challenge. We propose a physically-based network to recover the 3D shape of transparent objects using a few images acquired with a mobile phone camera, under a known but arbitrary environment map. Our novel contributions include a normal representation that enables the network to model complex light transport through local...
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics. Our goal is to make the dataset creation process widely accessible, transforming scans into photorealistic datasets with high-quality ground truth for appearance, layout, semantic labels, high-quality spatially-varying BRDF and complex lighting, including direct, indirect and visibility components. This enables important applications in inverse rendering, scene understanding and robotics. We show...
Most indoor 3D scene reconstruction methods focus on recovering 3D geometry and scene layout. In this work, we go beyond this to propose PhotoScene (code: https://github.com/ViLab-UCSD/PhotoScene), a framework that takes input image(s) of a scene along with approximately aligned CAD geometry (either reconstructed automatically or manually specified) and builds a photorealistic digital twin with high-quality materials and similar...
In this article, we address a novel and challenging task of video inference, which aims to infer video sequences from given non-consecutive video frames. Taking such frames as the anchor inputs, our focus is to recover possible video sequence outputs based on the observed anchor frames at the associated time. With the proposed Stochastic Recurrent Conditional GAN (SR-cGAN), we are able to preserve visual content across video frames with the additional ability to handle temporal ambiguity. In the experiments, we show that SR-cGAN not only produces preferable video inference...
We present TextureDreamer, a novel image-guided texture synthesis method to transfer relightable textures from a small number of input images (3 to 5) to target 3D shapes across arbitrary categories. Texture creation is a pivotal challenge in vision and graphics. Industrial companies hire experienced artists to manually craft textures for 3D assets. Classical methods require densely sampled views and accurately aligned geometry, while learning-based methods are confined to category-specific shapes within the dataset. In contrast,...
Most indoor 3D scene reconstruction methods focus on recovering 3D geometry and scene layout. In this work, we go beyond this to propose PhotoScene, a framework that takes input image(s) of a scene along with approximately aligned CAD geometry (either reconstructed automatically or manually specified) and builds a photorealistic digital twin with high-quality materials and similar lighting. We model scene materials using procedural material graphs; such graphs represent photorealistic and resolution-independent materials. We optimize the parameters of these graphs and their texture scale...