- Video Analysis and Summarization
- Machine Learning and Data Classification
- Computer Graphics and Visualization Techniques
- Image Retrieval and Classification Techniques
- Human Motion and Animation
- Human Pose and Action Recognition
- Software Engineering Research
- Advanced Vision and Imaging
- Industrial Vision Systems and Defect Detection
- Medical Image Segmentation Techniques
- 3D Shape Modeling and Analysis
- Advanced Image Fusion Techniques
Hong Kong University of Science and Technology
2023-2025
University of Hong Kong
2023-2025
University of Electronic Science and Technology of China
2022
Although the Large Reconstruction Model (LRM) has demonstrated impressive results, extending its input from a single image to multiple images exposes inefficiencies, subpar geometric and texture quality, and slower convergence than expected. We attribute this to LRM formulating 3D reconstruction as a naive images-to-3D translation problem, ignoring the strong coherence among images. In this paper, we propose the Multi-view Large Reconstruction Model (M-LRM), designed to reconstruct high-quality...
Pan-sharpening, a task involving information fusion, entails merging panchromatic (PAN) images with high spatial resolution and low-resolution multispectral (LRMS) images in order to obtain high-resolution multispectral (HRMS) images. Owing to its excellent regression capabilities, deep learning has recently become the dominant technique for this task. Meanwhile, the development of the transformer, a novel learning architecture from natural language processing, has provided researchers with new insights. In this letter, we seek to extend...
Neural radiance fields (NeRF) have demonstrated a promising research direction for novel view synthesis. However, existing approaches either require per-scene optimization that takes significant computation time or condition on local features, which overlooks the global context of images. To tackle this shortcoming, we propose Conditionally Parameterized Neural Radiance Fields (CP-NeRF), a plug-in module that enables NeRF to leverage contextual information from different scales. Instead of optimizing...
Score Distillation Sampling (SDS) has emerged as a prevalent technique for text-to-3D generation, enabling 3D content creation by distilling view-dependent information from text-to-2D guidance. However, SDS-based methods frequently exhibit shortcomings such as over-saturated color and excess smoothness. In this paper, we conduct a thorough analysis of SDS and refine its formulation, finding that the core design is to model the distribution of rendered images. Following this insight, we introduce a novel strategy called...
Leveraging the visual priors of pre-trained text-to-image diffusion models offers a promising solution for enhancing zero-shot generalization in dense prediction tasks. However, existing methods often uncritically use the original diffusion formulation, which may not be optimal due to fundamental differences between dense prediction and image generation. In this paper, we provide a systemic analysis of the diffusion formulation for dense prediction, focusing on both quality and efficiency. We find that the parameterization type for image generation, which learns to predict...