- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Computer Graphics and Visualization Techniques
- Face Recognition and Analysis
- Image Enhancement Techniques
- Advanced Neural Network Applications
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Advanced Image Processing Techniques
- Generative Adversarial Networks and Image Synthesis
- Domain Adaptation and Few-Shot Learning
- 3D Surveying and Cultural Heritage
- Visual Attention and Saliency Detection
- Human Motion and Animation
- Advanced Numerical Analysis Techniques
- Biometric Identification and Security
- Multimodal Machine Learning Applications
- Video Surveillance and Tracking Methods
- Anomaly Detection Techniques and Applications
- Video Analysis and Summarization
- Robotics and Sensor-Based Localization
- Face and Expression Recognition
- Digital Media Forensic Detection
- Remote Sensing and LiDAR Applications
- Traditional Chinese Medicine Studies
Shanghai Jiao Tong University
2016-2025
East China Normal University
2017-2025
Shanghai Normal University
2025
Shanghai University
2013-2024
Chongqing Normal University
2024
Motion Control (United States)
2021
ETH Zurich
2020
Shanghai University of Traditional Chinese Medicine
2008-2013
National Rehabilitation Center
2013
Zhejiang Ocean University
2013
Single image dehazing is a challenging ill-posed problem due to the severe information degeneration. However, existing deep learning based methods only adopt clear images as positive samples to guide the training of the network, while negative information is left unexploited. Moreover, most of them focus on strengthening the network with an increase in depth and width, leading to a significant requirement for computation and memory. In this paper, we propose a novel contrastive regularization (CR) built upon contrastive learning to exploit both the information of hazy images and clear images as negative and positive samples, respectively. CR...
This paper presents a novel parametric curve-based method for lane detection in RGB images. Unlike state-of-the-art segmentation-based and point detection-based methods that typically require heuristics to either decode predictions or formulate a large sum of anchors, the curve-based methods can learn holistic lane representations naturally. To handle the optimization difficulties of existing polynomial curve methods, we propose to exploit the parametric Bézier curve due to its ease of computation, stability, and high freedom degrees of transformations. In...
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image. This is inherently challenging: it essentially involves estimating the underlying 3D geometry while simultaneously hallucinating unseen textures. To address this challenge, we leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation. Our approach, Make-It-3D, employs a two-stage optimization pipeline: the first stage optimizes a neural radiance field by incorporating...
Detection Transformer (DETR) and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while demonstrating performance as good as previous complex hand-crafted detectors. However, their performance on Video Object Detection (VOD) has not been well explored. In this paper, we present TransVOD, the first end-to-end video object detection system based on simple yet effective spatial-temporal Transformer architectures. The goal of this paper is to streamline the pipeline of current VOD, effectively removing the need for many hand-crafted components for feature...
Unsupervised domain adaptation (UDA) aims to adapt a model of the labeled source domain to an unlabeled target domain. Existing UDA-based semantic segmentation approaches always reduce the domain shifts at the pixel level, feature level, and output level. However, almost all of them largely neglect the contextual dependency, which is generally shared across different domains, leading to less-desired performance. In this paper, we propose a novel Context-Aware Mixup (CAMix) framework for domain adaptive semantic segmentation, which exploits this important clue...
Glass surface detection is challenging as glass normally borrows similar visual appearances from the arbitrary objects/scenes behind it. Although some methods have been proposed to address this problem, they may fail if the reference objects are nonexistent or the additional annotations are missing. This article aims to address this problem by utilizing the intrinsic properties of glass without reference objects and additional annotations. We observe that glass naturally introduces blur. Based on an investigation of the blurriness cue, we propose a novel aggregation module to model...
In this paper we present a process called color transfer, which can borrow one image's color characteristics from another. Recently, Reinhard and his colleagues reported a pioneering work on color transfer. Their technology can produce very believable results, but has to transform pixel values from RGB to lαβ. Inspired by their work, we propose an approach that can directly deal with the color transfer in any 3D space. From the view of statistics, we consider a pixel's value as a three-dimensional stochastic variable and an image as a set of samples, so the correlations between...
The attention mechanism has recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most existing attention based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep learning based attention and relation learning framework for AU detection with only AU labels, which has not been explored...
The Information Bottleneck (IB) provides an information theoretic principle for representation learning, by retaining all information relevant for predicting the label while minimizing the redundancy. Though the IB principle has been applied to a wide range of applications, its optimization remains a challenging problem which heavily relies on the accurate estimation of mutual information. In this paper, we present a new strategy, Variational Self-Distillation (VSD), which provides a scalable, flexible and analytic solution to essentially fitting the mutual information but...
This paper reviews the NTIRE 2020 Challenge on Non-Homogeneous Dehazing of images (restoration of rich details in a hazy image). We focus on the proposed solutions and their results evaluated on NH-Haze, a novel dataset consisting of 55 pairs of real haze-free and nonhomogeneous hazy images recorded outdoors. NH-Haze is the first realistic nonhomogeneous haze dataset that provides ground truth images. The nonhomogeneous haze has been produced using a professional haze generator that imitates the real conditions of hazy scenes. 168 participants registered in the challenge and 27 teams competed in the final testing phase. The proposed solutions gauge...
In this paper, a novel Deep Multi-view Joint Clustering (DMJC) framework is proposed, where multiple deep embedded features, multi-view fusion mechanism, and clustering assignments can be learned...
The rapid development of facial manipulation techniques has aroused public concerns in recent years. Following the success of deep learning, existing methods always formulate DeepFake video detection as a binary classification problem and develop frame-based and video-based solutions. However, little attention has been paid to capturing the spatial-temporal inconsistency in forged videos. To address this issue, we term this task a Spatial-Temporal Inconsistency Learning (STIL) process and instantiate it into a novel STIL...
Although huge progress has been made on scene analysis in recent years, most existing works assume the input images to be in day-time with good lighting conditions. In this work, we aim to address the night-time scene parsing (NTSP) problem, which has two main challenges: 1) labeled night-time data are scarce, and 2) over- and under-exposures may co-occur in the input night-time images and are not explicitly modeled in existing pipelines. To tackle the scarcity of night-time data, we collect a novel labeled dataset, named NightCity, of 4,297 real night-time images with ground truth pixel-level semantic annotations. To our...
With various face presentation attacks arising under unseen scenarios, face anti-spoofing (FAS) based on domain generalization (DG) has drawn growing attention due to its robustness. Most existing methods utilize DG frameworks to align the features to seek a compact and generalized feature space. However, little attention has been paid to the feature extraction process for the FAS task, especially the influence of normalization, which also has a great impact on the learned representation. To address this issue, we propose a novel perspective that...
Recently, DETR and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while demonstrating performance as good as previous complex hand-crafted detectors. However, their performance on Video Object Detection (VOD) has not been well explored. In this paper, we present TransVOD, an end-to-end video object detection model based on a spatial-temporal Transformer architecture. The goal of this paper is to streamline the pipeline of VOD, effectively removing the need for many hand-crafted components for feature aggregation, e.g., optical flow,...
Most existing dehazing methods are not robust to nonhomogeneous haze. Meanwhile, the information of dense haze regions is usually unknown and hard to estimate, leading to blur in the dehazed result for those regions. Focusing on these two issues, we propose a novel coarse-to-fine model, namely Trident Dehazing Network (TDN), to learn the hazy to hazy-free image mapping with automatic haze density recognition. In detail, TDN is composed of three sub-nets: the Encoder-Decoder Net (EDN) is the main net of TDN to reconstruct the coarse hazy-free...
In this article, we propose a multiview self-representation model for nonlinear subspace clustering. By assuming that the heterogeneous features lie within the union of multiple linear subspaces, recent multiview subspace learning methods aim to capture the complementary and consensus information from multiple views to boost the performance. However, in real-world applications, data features usually reside in multiple nonlinear subspaces, leading to undesirable results. To this end, we propose a kernelized version of tensor-based multiview subspace clustering, which is referred to as Kt-SVD-MSC, to jointly learn...
We propose a robust normal estimation method for both point clouds and meshes using a low rank matrix approximation algorithm. First, we compute a local isotropic structure for each point and find its similar, non-local structures that we organize into a matrix. We then show that a low rank matrix approximation algorithm can robustly estimate normals for both point clouds and meshes. Furthermore, we provide a new filtering method for point cloud data to smooth the position data to fit the estimated normals. We show the applications of our method to point cloud filtering, point set upsampling, surface reconstruction, mesh denoising, and geometric texture...
Face anti-spoofing approaches based on domain generalization (DG) have drawn growing attention due to their robustness for unseen scenarios. Previous methods treat each sample from multiple domains indiscriminately during the training process, and endeavor to extract a common feature space to improve the generalization. However, due to the complex and biased data distribution, directly treating them equally will corrupt the generalization ability. To settle the issue, we propose a novel Dual Reweighting Domain Generalization (DRDG)...
To address the huge labeling cost in large-scale point cloud semantic segmentation, we propose a novel hybrid contrastive regularization (HybridCR) framework in the weakly-supervised setting, which obtains competitive performance compared to its fully-supervised counterpart. Specifically, HybridCR is the first framework to leverage both point consistency and employ contrastive regularization with pseudo labeling in an end-to-end manner. Fundamentally, HybridCR explicitly and effectively considers the semantic similarity between local neighboring points and global characteristics of 3D...