- 3D Shape Modeling and Analysis
- Domain Adaptation and Few-Shot Learning
- Human Pose and Action Recognition
- Computer Graphics and Visualization Techniques
- Advanced Vision and Imaging
- Advanced Neural Network Applications
- 3D Surveying and Cultural Heritage
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Robotics and Sensor-Based Localization
- Robot Manipulation and Learning
- Face and Expression Recognition
- Advanced Image Processing Techniques
- Advanced Numerical Analysis Techniques
- COVID-19 Diagnosis Using AI
- Face Recognition and Analysis
- Anomaly Detection Techniques and Applications
- Gait Recognition and Analysis
- Sparse and Compressive Sensing Techniques
- Generative Adversarial Networks and Image Synthesis
- Image and Signal Denoising Methods
- Adversarial Robustness in Machine Learning
- Remote Sensing and LiDAR Applications
- Diabetic Foot Ulcer Assessment and Management
First Affiliated Hospital of GuangXi Medical University (2024-2025)
Guangxi Medical University (2024-2025)
South China University of Technology (2016-2025)
Chinese University of Hong Kong, Shenzhen (2008-2025)
Peng Cheng Laboratory (2020-2023)
University of Macau (2014-2016)
University of Hong Kong (2008-2015)
Advanced Digital Sciences Center (2012-2015)
Shandong Institute of Automation (2014)
Chinese Academy of Sciences (2008-2012)
Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to obtain plausible dehazing solutions. The key to achieving haze removal is to estimate a medium transmission map for an input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet for medium transmission estimation. DehazeNet takes a hazy image as input and outputs its medium transmission map, which is subsequently used to recover the haze-free image via the atmospheric scattering model. DehazeNet adopts a Convolutional Neural Network (CNN) based deep architecture, whose...
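For context on the atmospheric scattering model referenced above, here is a minimal NumPy sketch of recovering the scene radiance J from a hazy image I, a transmission map t, and atmospheric light A. The function name, variable layout, and the clipping threshold are illustrative assumptions, not DehazeNet's exact procedure.

```python
import numpy as np

def recover_haze_free(I, t, A, t_min=0.1):
    """Invert the atmospheric scattering model I = J * t + A * (1 - t).

    I: hazy image, H x W x 3, float in [0, 1]
    t: estimated medium transmission map, H x W, float in (0, 1]
    A: global atmospheric light, length-3 vector
    t_min: lower bound on transmission to avoid amplifying noise (assumed value)
    """
    t = np.clip(t, t_min, 1.0)[..., None]   # broadcast over color channels
    J = (I - A) / t + A                     # scene radiance estimate
    return np.clip(J, 0.0, 1.0)
```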
In this paper, we propose a very simple deep learning network for image classification that is based on basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, PCA is employed to learn multistage filter banks, followed by binary hashing and blockwise histograms for indexing and pooling. The resulting architecture, called PCANet, can thus be designed and learned extremely easily and efficiently. For comparison and to provide a better...
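A minimal sketch of how one stage of PCA filters could be learned from local image patches, in the spirit of the cascaded-PCA component described above. The patch size, filter count, and non-overlapping patch sampling are simplifying assumptions, not the paper's settings.

```python
import numpy as np

def learn_pca_filters(images, patch_size=7, num_filters=8):
    """Learn one stage of PCA filters as the top principal components
    of mean-removed local patches (non-overlapping patches for brevity).

    images: list of H x W grayscale arrays
    returns: num_filters filters of shape patch_size x patch_size
    """
    patches = []
    for img in images:
        H, W = img.shape
        for i in range(0, H - patch_size + 1, patch_size):
            for j in range(0, W - patch_size + 1, patch_size):
                p = img[i:i + patch_size, j:j + patch_size].ravel()
                patches.append(p - p.mean())          # remove patch mean
    X = np.stack(patches, axis=1)                     # d x N matrix of patches
    # Principal components of the patch scatter matrix give the filter bank.
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:num_filters]]
    return top.T.reshape(num_filters, patch_size, patch_size)
```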
Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects. Inspired by the success of convolutional neural networks (CNN) for image classification, recent attempts have been made to learn 3D CNNs for recognizing human actions in videos. However, partly due to the high complexity of training 3D convolution kernels and the need for large quantities of training videos, only limited success has been reported. This triggered us to investigate in this paper a...
In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in an RGB image, our method first generates a sequence of frustums for each region proposal and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features as frustum-level feature vectors and arrays these feature vectors as a feature map for use by its subsequent component of a fully convolutional network (FCN), which spatially fuses frustum-level features and supports end-to-end, continuous estimation...
Unsupervised domain adaptation aims to learn a model or classifier for unlabeled samples on a target domain, given training data of labeled samples on a source domain. Impressive progress has been made recently by learning invariant features via domain-adversarial training of deep networks. In spite of this progress, such methods are still limited in achieving invariance of feature distributions at a finer category level. To this end, we propose in this paper a new method called Domain-Symmetric Networks (SymNets). The proposed SymNet is based on a symmetric...
This paper proposes a joint multi-task learning algorithm to better predict attributes in images using deep convolutional neural networks (CNN). We consider learning binary semantic attributes through a multi-task CNN model, where each task will predict one attribute. The multi-task learning allows CNN models to simultaneously share visual knowledge among different attribute categories. Each CNN generates attribute-specific feature representations, and we then apply multi-task learning on these features to predict their attributes. In our framework, we propose a method to decompose the overall model's parameters...
Unsupervised domain adaptation (UDA) is to make predictions for unlabeled data on a target domain, given labeled data on a source domain whose distribution shifts from the target one. Mainstream UDA methods learn aligned features between the two domains, such that a classifier trained on the source features can be readily applied to the target ones. However, such a transferring strategy has a potential risk of damaging the intrinsic discrimination of target data. To alleviate this risk, we are motivated by the assumption of structural domain similarity, and propose to directly uncover the intrinsic target discrimination via...
We propose a joint intrinsic-extrinsic prior model to estimate both illumination and reflectance from an observed image. The 2D image formed from a 3D object in the scene is affected by the intrinsic properties (shape and texture) and by the extrinsic property (illumination). Based on a novel structure-preserving measure called local variation deviation, a joint prior model is proposed for better representation. Compared with conventional Retinex models, it can preserve structure information through the shape prior, estimate the reflectance with fine details through the texture prior, and capture the luminous...
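For context on the Retinex-style decomposition underlying this work, a minimal sketch that splits an observed image S into illumination L and reflectance R under the multiplicative model S = R * L. The Gaussian-smoothed illumination estimate here is a classical baseline standing in for the paper's joint intrinsic-extrinsic prior; the function name and parameters are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def naive_retinex_decompose(S, sigma=15.0, eps=1e-6):
    """Decompose an observed image under S = R * L (element-wise).

    S: observed grayscale image, H x W, float in (0, 1]
    Returns a smooth illumination estimate L and reflectance R = S / L.
    A Gaussian blur stands in for the illumination prior; the paper's
    joint intrinsic-extrinsic prior is a more principled estimator.
    """
    L = gaussian_filter(S, sigma=sigma)   # smooth illumination estimate
    L = np.maximum(L, eps)                # avoid division by zero
    R = np.clip(S / L, 0.0, 1.0)          # reflectance retains fine details
    return L, R
```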
This paper targets learning a robust image representation for single training sample per person face recognition. Motivated by the success of deep learning in image representation, we propose a supervised autoencoder, which is a new type of building block for deep architectures. Two features distinguish our supervised autoencoder from the standard autoencoder. First, we enforce faces with variants to be mapped to the canonical face of the person, for example, a frontal face with neutral expression and normal illumination; second, we enforce features corresponding to the same person to be similar. As...
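A tiny PyTorch sketch of the two properties stated above: variant faces are decoded toward the canonical face, and features of the same person are pulled together. Layer sizes, the loss weighting, and the module names are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SupervisedAutoencoderSketch(nn.Module):
    """Minimal supervised-autoencoder-style block (illustrative sizes)."""
    def __init__(self, dim=1024, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.Sigmoid())
        self.dec = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        return h, self.dec(h)

def supervised_ae_loss(model, x_variant, x_canonical, lam=0.1):
    """Reconstruction toward the canonical face plus a feature-similarity
    term between variant and canonical inputs of the same person."""
    h_var, recon = model(x_variant)
    h_can, _ = model(x_canonical)
    recon_loss = ((recon - x_canonical) ** 2).mean()   # map variants to canonical
    sim_loss = ((h_var - h_can) ** 2).mean()           # same person, similar features
    return recon_loss + lam * sim_loss
```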
Human actions captured in video sequences are three-dimensional signals characterizing visual appearance and motion dynamics. To learn action patterns, existing methods adopt Convolutional and/or Recurrent Neural Networks (CNNs and RNNs). CNN-based methods are effective in learning spatial appearances but are limited in modeling long-term motion dynamics. RNNs, especially Long Short-Term Memory (LSTM), are able to learn temporal dynamics. However, naively applying RNNs in a convolutional manner implicitly assumes that motions in videos are stationary across...
Reconstructing the 3D mesh of a general object from a single image is now possible thanks to the latest advances in deep learning technologies. However, due to the nontrivial difficulty of generating a feasible mesh structure, state-of-the-art approaches often simplify the problem by learning the displacements of a template mesh that deform it toward the target surface. Though reconstructing a shape with complex topology can be achieved by deforming multiple mesh patches, it remains difficult to stitch the results so as to ensure a high meshing quality. In this paper, we present an...
Automatic 3D content creation has achieved rapid progress recently due to the availability of pre-trained, large language models and image diffusion models, forming the emerging topic of text-to-3D content creation. Existing methods commonly use implicit scene representations, which couple geometry and appearance via volume rendering and are suboptimal in terms of recovering finer geometries and achieving photorealistic rendering; consequently, they are less effective for generating high-quality 3D assets. In this work, we...
Given labeled instances on a source domain and unlabeled ones on a target domain, unsupervised domain adaptation aims to learn a task classifier that can well classify target instances. Recent advances rely on domain-adversarial training of deep networks to learn domain-invariant features. However, due to an issue of mode collapse induced by the separate design of task and domain classifiers, these methods are limited in aligning the joint distributions of feature and category across domains. To overcome it, we propose a novel adversarial learning method termed...
Arbitrary style transfer (AST) and domain generalization (DG) are important yet challenging visual learning tasks, which can be cast as a feature distribution matching problem. Under the assumption of Gaussian feature distributions, conventional methods usually match the mean and standard deviation of features. However, the feature distributions of real-world data are much more complicated than Gaussian and cannot be accurately matched using only the first-order and second-order statistics, while it is computationally prohibitive to use...
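A minimal sketch of the first- and second-order matching the abstract refers to: content features are normalized and re-scaled with the style features' per-channel mean and standard deviation (an AdaIN-style baseline, not this paper's matching method; the function name is illustrative).

```python
import torch

def match_mean_std(content, style, eps=1e-5):
    """Match per-channel mean and std of content features to style features.

    content, style: feature maps of shape (N, C, H, W).
    This is the Gaussian-assumption baseline; matching only these two
    statistics is what the abstract argues is insufficient in general.
    """
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return (content - c_mean) / c_std * s_std + s_mean
```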
3D LiDAR (light detection and ranging) semantic segmentation is important in scene understanding for many applications, such as autonomous driving and robotics. For example, for autonomous cars equipped with RGB cameras and LiDAR, it is crucial to fuse complementary information from the different sensors for robust and accurate segmentation. Existing fusion-based methods, however, may not achieve promising performance due to the vast difference between the two modalities. In this work, we investigate a collaborative fusion scheme...
Gait, as a promising biometric for recognizing human identities, can be nonintrusively captured as a series of acceleration signals using wearable or portable smart devices, and it can be used for access control. Most existing methods for accelerometer-based gait recognition require explicit step-cycle detection and thus suffer from cycle detection failures and intercycle phase misalignment. We propose a novel algorithm that avoids both of the above problems. It makes use of a type of salient points termed signature points (SPs), and has...
Promoting the spatial resolution of off-the-shelf hyperspectral sensors is expected to improve typical computer vision tasks, such as target tracking and image classification. In this paper, we investigate a scenario in which two cameras, one with a conventional RGB sensor and the other with a hyperspectral sensor, capture the same scene, attempting to extract redundant and complementary information. We propose a non-negative sparse promoting framework to integrate the two sources of data into a set of high-resolution hyperspectral data. The formulated problem takes the form of matrix factorization...
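A minimal sketch of the kind of factorization such a formulation can reduce to: a non-negative spectral basis is learned from the low-resolution hyperspectral data, high-resolution coefficients are estimated from the RGB data through the camera's spectral response, and their product gives the fused image. Plain NMF plus non-negative least squares is a generic stand-in here; the sparsity-promoting terms and the paper's actual optimization are omitted, and all names are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF
from scipy.optimize import nnls

def fuse_hsi_rgb(Y_hs, Y_rgb, R, n_basis=10):
    """Fuse a low-res hyperspectral image with a high-res RGB image.

    Y_hs:  (bands, n_low)  low-res hyperspectral pixels (non-negative)
    Y_rgb: (3, n_high)     high-res RGB pixels (non-negative)
    R:     (3, bands)      spectral response of the RGB camera
    Returns an estimated (bands, n_high) high-res hyperspectral image.
    """
    # 1) Learn a non-negative spectral basis W from the hyperspectral data.
    nmf = NMF(n_components=n_basis, init="nndsvda", max_iter=500)
    W = nmf.fit_transform(Y_hs)                    # (bands, n_basis)

    # 2) For each high-res RGB pixel, solve non-negative least squares
    #    against the basis projected through the camera response.
    RW = R @ W                                     # (3, n_basis)
    H = np.stack([nnls(RW, Y_rgb[:, j])[0]
                  for j in range(Y_rgb.shape[1])], axis=1)

    # 3) Reconstruct high-resolution hyperspectral pixels.
    return W @ H
```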
We study in this paper the problem of learning classifiers from ambiguously labeled images. For instance, in a collection of news images, each image contains some samples of interest (e.g., human faces), and its associated caption has labels with the true ones included, while the sample-label association is unknown. The task is to learn classifiers from such images that generalize to new ones. An essential consideration here is how to make use of the information embedded in the relations between samples and labels, both within each image and across the image set. To this end, we propose a novel...
In this paper, we propose a framework for transforming images from a source image space to a target image space, based on learning coupled dictionaries from a training set of paired images. The framework can be used for applications such as image super-resolution and estimation of intrinsic image components (shading and albedo). It is a local parametric regression approach, using sparse feature representations over dictionaries learned across the two spaces. After dictionary learning, the sparse coefficient vectors of training patch pairs are partitioned into easily retrievable...
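A minimal sketch of the coupled-dictionary idea: a source patch is sparsely coded over the source dictionary, and the same coefficients reconstruct the corresponding target patch from the target dictionary. The dictionaries, the OMP solver, and the sparsity level below are generic stand-ins, not the paper's learned model or regression scheme.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp  # sparse coding via OMP

def transform_patch(x_src, D_src, D_tgt, n_nonzero=5):
    """Map a source-space patch to the target space through shared sparse codes.

    x_src: flattened source patch, shape (d_src,)
    D_src: source dictionary, shape (d_src, K)
    D_tgt: target dictionary, shape (d_tgt, K)
    The coefficient vector found on the source side is reused on the target side.
    """
    alpha = orthogonal_mp(D_src, x_src, n_nonzero_coefs=n_nonzero)
    return D_tgt @ alpha
```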
Part-based visual tracking is advantageous due to its robustness against partial occlusion. However, how to effectively exploit the confidence scores of individual parts to construct a robust tracker is still a challenging problem. In this paper, we address this problem by simultaneously matching parts in each of multiple frames, which is realized by a locality-constrained low-rank sparse learning method that establishes multi-frame part correspondences through the optimization of partial permutation matrices. The proposed part matching tracker (PMT) has...
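For intuition on part correspondence as a permutation, a minimal two-frame sketch using the Hungarian solver. This is a generic stand-in for the locality-constrained low-rank sparse optimization over partial permutation matrices described above; names and the cost choice are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_parts(feat_a, feat_b):
    """Match part descriptors between two frames as a permutation.

    feat_a, feat_b: (P, d) arrays of part features in two frames.
    Returns perm such that part i in frame A corresponds to part perm[i] in frame B.
    """
    # Pairwise Euclidean distances between parts form the assignment cost.
    cost = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return cols[np.argsort(rows)]
```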
Most of the previous work on video action recognition uses complex hand-designed local features, such as SIFT, HOG and SURF, but these approaches are sophisticated to implement and difficult to extend to other sensor modalities. Recent studies find that there are no universally best hand-engineered features for all datasets, and that learning features directly from the data may be more advantageous. One such endeavor is Slow Feature Analysis (SFA), proposed by Wiskott and Sejnowski [33]. SFA can learn invariant and slowly varying...
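Since the abstract leans on Slow Feature Analysis, a minimal sketch of linear SFA: find projections of a temporal signal that vary as slowly as possible while keeping unit variance, via a generalized eigenproblem. This is textbook SFA, not the deep learning extension the paper develops; names and sizes are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def linear_sfa(X, n_components=5):
    """Linear Slow Feature Analysis on a temporal signal.

    X: (T, d) sequence of feature vectors over time.
    Returns a (d, n_components) projection matrix, slowest components first.
    """
    X = X - X.mean(axis=0)                 # center the signal
    Xdot = np.diff(X, axis=0)              # temporal derivative
    A = Xdot.T @ Xdot / (len(X) - 1)       # covariance of derivatives (slowness)
    B = X.T @ X / len(X)                   # covariance of the signal (unit variance)
    eigvals, eigvecs = eigh(A, B)          # generalized eigenproblem A w = lambda B w
    return eigvecs[:, :n_components]       # smallest eigenvalues = slowest features
```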