- Domain Adaptation and Few-Shot Learning
- Face and Expression Recognition
- Advanced Neural Network Applications
- Face recognition and analysis
- Advanced Image Fusion Techniques
- Advanced Image and Video Retrieval Techniques
- Advanced Image Processing Techniques
- Remote-Sensing Image Classification
- Multimodal Machine Learning Applications
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Image Enhancement Techniques
- Image Processing Techniques and Applications
- Advanced Vision and Imaging
- Machine Learning and Data Classification
- Image and Signal Denoising Methods
- Generative Adversarial Networks and Image Synthesis
- Medical Image Segmentation Techniques
- Biometric Identification and Security
- Emotion and Mood Recognition
- Image Retrieval and Classification Techniques
- Spectroscopy and Chemometric Analyses
- Sparse and Compressive Sensing Techniques
- Blind Source Separation Techniques
- Image and Video Quality Assessment
University College London
2016-2025
Xiamen University
2024
Guangdong University of Technology
2022
Aalborg University
2021
Hong Kong University of Science and Technology
2021
University of Hong Kong
2021
Chinese University of Hong Kong
2021
Shenzhen Technology University
2021
Beijing University of Posts and Telecommunications
2020-2021
The University of Sydney
2021
Timely monitoring of crop lands is important in order to make agricultural activities more sustainable, as well as ensuring food security. The use of Earth Observation (EO) data allows crop monitoring at a range of spatial scales, but can be hampered by limitations in the data. Crop growth modelling, on the other hand, can be used to simulate the physiological processes that result in crop development. Data assimilation (DA) provides a way of blending the monitoring properties of EO data with the predictive and explanatory abilities of crop growth models. In this paper, we first provide...
In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. The proposed method consists of three components: a StyleGAN inversion module, visual-linguistic similarity learning, and instance-level optimization. The inversion module maps real images to the latent space of a well-trained StyleGAN. The visual-linguistic similarity learns the text-image matching by mapping the image and text into a common embedding space. The instance-level optimization is for identity preservation in manipulation. Our model can...
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model so that the image can be faithfully reconstructed from the inverted code by the generator. As an emerging technique to bridge the real and fake image domains, GAN inversion plays an essential role in enabling pretrained GAN models, such as StyleGAN and BigGAN, for applications of real image editing. Moreover, GAN inversion interprets the GAN's latent space and examines how realistic images can be generated. In this paper, we provide a survey of GAN inversion with a focus on its representative algorithms and its applications in image restoration and image manipulation. We further...
Few-shot learning for fine-grained image classification has gained recent attention in computer vision. Among the approaches for few-shot learning, due to their simplicity and effectiveness, metric-based methods are favorably state-of-the-art on many tasks. Most of the metric-based methods assume a single similarity measure and thus obtain a single feature space. However, if samples can simultaneously be well classified via two distinct similarity measures, the samples within a class can distribute more compactly in a smaller feature space, producing more discriminative feature maps....
In this paper, we propose novel strategies for neutral vector variable decorrelation. Two fundamental invertible transformations, namely, the serial nonlinear transformation and the parallel nonlinear transformation, are proposed to carry out the decorrelation. For a neutral vector variable, which is not multivariate-Gaussian distributed, the conventional principal component analysis cannot yield mutually independent scalar variables. With the two proposed transformations, a highly negatively correlated neutral vector can be transformed to a set of mutually independent scalar variables with the same degrees of freedom. We also...
In finger vein verification, the most important and challenging part is to robustly extract finger vein patterns from low-contrast infrared finger images with limited a priori knowledge. Although recent convolutional neural network (CNN)-based methods for finger vein verification have shown powerful capacity for feature representation and promising perspective in this area, they still have two critical issues to address. First, these CNN-based methods without exception utilize fully connected layers, which restrict the size of the images they can process and increase...
Full-reference image quality assessment algorithms usually perform comparisons of features extracted from square patches. These patches do not have any visual meaning. On the contrary, a superpixel is a set of image pixels that share similar visual characteristics and is thus perceptually meaningful. Features from superpixels may improve the performance of image quality assessment. Inspired by this, we propose a new superpixel-based similarity index by extracting perceptually meaningful features and revising similarity measures. The proposed method evaluates image quality on the basis of three...
Exploiting both RGB (2D appearance) and Depth (3D geometry) information can improve the performance of semantic segmentation. However, due to the inherent difference between the RGB and Depth information, it remains a challenging problem how to integrate RGB-D features effectively. In this letter, to address this issue, we propose a Non-local Aggregation Network (NANet), with a well-designed Multi-modality Non-local Aggregation Module (MNAM), to better exploit the non-local context of RGB-D features at multi-stage. Compared with most existing RGB-D semantic segmentation schemes, which only...
Lightweight models are pivotal in efficient semantic segmentation, but they often suffer from insufficient context information due to limited convolution and a small receptive field. To address this problem, we propose a tailored approach to efficient semantic segmentation by leveraging two complementary distillation schemes for supplementing context information to small networks: 1) a self-attention distillation scheme, which transfers long-range context knowledge adaptively from large teacher networks to small student networks; and 2) a layer-wise context distillation scheme, which transfers structured context from deep layers to shallow...
Graph convolutional networks (GCN) have recently been studied to exploit the graph topology of the human body for skeleton-based action recognition. However, most of these methods unfortunately aggregate messages via an inflexible pattern for various action samples, lacking awareness of intra-class variety and suitableness for skeleton sequences, which often contain redundant or even detrimental connections. In this paper, we propose a novel Deformable Graph Convolutional Network (DeGCN) to adaptively capture the most informative...
Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and...
Metric-based methods are among the most common methods for solving the problem of few-shot image classification. However, traditional metric-based few-shot methods suffer from overfitting and local feature misalignment. The recently proposed reconstruction-based approach, which reconstructs the query image features from the support set features of a given class and compares the distance between the original query features and the reconstructed query features as the classification criterion, effectively solves the feature misalignment problem. However, the issue of overfitting still has not been considered. To this end, we propose...
In this work we present DREAM, an fMRI-to-image method for reconstructing viewed images from brain activities, grounded on fundamental knowledge of the human visual system. We craft reverse pathways that emulate the hierarchical and parallel nature of how humans perceive the visual world. These tailored pathways are specialized to decipher semantics, color, and depth cues from fMRI data, mirroring the forward pathways from visual stimuli to fMRI recordings. To do so, two components mimic the inverse processes within the human visual system: the Reverse Visual Association Cortex...
This paper presents a fully automatic three-dimensional classification of brain tissues for Magnetic Resonance (MR) images. An MR image volume may be composed of a mixture of several tissue types due to partial volume effects. Therefore, we consider that in a brain dataset there are not only the three main types of brain tissue (gray matter, white matter, and cerebrospinal fluid), called pure classes, but also mixtures, called mixclasses. A statistical model of the mixtures is proposed and studied by means of simulations. It is shown that it can be approximated...
Many established classifiers fail to identify the minority class when it is much smaller than the majority class. To tackle this problem, researchers often first rebalance the class sizes in the training dataset, through oversampling the minority class or undersampling the majority class, and then use the rebalanced training data to train the classifiers. This leads to interesting empirical patterns. In particular, using the rebalanced training data can improve the area under the receiver operating characteristic curve (AUC) for the original, unbalanced test data. The AUC is a widely-used quantitative...
In recent years, deep learning-based person re-identification (Re-ID) methods have made significant progress. However, the performance of these methods substantially decreases when dealing with occlusion, which is ubiquitous in realistic scenarios. In this article, we propose a novel semantic-aware occlusion-robust network (SORN) that effectively exploits the intrinsic relationship between the tasks of person Re-ID and semantic segmentation for occluded person Re-ID. Specifically, the SORN is composed of three branches, including a local...
Hyperspectral images (HSIs) have been used in a wide range of fields, such as agriculture, food safety, mineralogy, and environment monitoring, but being corrupted by various kinds of noise limits their efficacy. Low-rank representation (LRR) has proved its effectiveness in the denoising of HSIs. However, it just employs local information for denoising, which results in ineffectiveness when the noise is heavy. In this paper, we propose an approach of group low-rank representation (GLRR) for HSI denoising. In our GLRR, a corrupted HSI is divided into overlapping...
Capturing an all-in-focus image with a single camera is difficult since the depth of field of the camera is usually limited. An alternative method to obtain the all-in-focus image is to fuse several images that are focused at different depths. However, existing multi-focus image fusion methods cannot obtain clear results for areas near the focused/defocused boundary (FDB). In this article, a novel α-matte defocus model is proposed to generate realistic training data with the defocus spread effect precisely modeled, especially for areas near the FDB. Based on this α-matte defocus model and the generated data, a cascaded...
To achieve effective facial expression recognition (FER), it is of great importance to address various disturbing factors, including pose, illumination, identity, and so on. However, a number of FER databases merely provide the labels of expression, but lack the label information for other disturbing factors. As a result, many methods are only able to cope with one or two disturbing factors, ignoring the heavy entanglement between multiple disturbing factors. In this paper, we propose a novel Deep Disturbance-disentangled Learning (DDL) method for FER. DDL is capable...
Elastic weight consolidation (EWC) has been successfully applied for general incremental learning to overcome the catastrophic forgetting issue. It adaptively constrains each parameter of the new model not to deviate much from its counterpart in the old model during fine-tuning on new class data sets, according to its importance weight for old tasks. However, a previous study demonstrates that it still suffers from catastrophic forgetting when directly used in object detection. In this article, we show that EWC is effective for incremental object detection if used with critical adaptations. First,...
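The parameter constraint described in the EWC abstract above can be illustrated with a minimal sketch. It assumes the commonly used quadratic form of the EWC penalty, (λ/2) Σ F_i (θ_i − θ_old,i)², and is not taken from the listed paper, whose adaptations for object detection are elided above. The function name `ewc_penalty`, the argument names, and the toy numbers are all hypothetical.

```python
def ewc_penalty(new_params, old_params, importance, lam=1.0):
    """Quadratic EWC-style penalty (assumed standard form, not the paper's variant).

    Returns (lam / 2) * sum_i importance[i] * (new_params[i] - old_params[i]) ** 2,
    so parameters with a large importance weight are held close to their old values.
    """
    if not (len(new_params) == len(old_params) == len(importance)):
        raise ValueError("all inputs must have the same length")
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(new_params, old_params, importance)
    )


# Toy values (hypothetical): three parameters with very different importance.
old = [1.0, -2.0, 0.5]
fisher = [10.0, 0.1, 0.0]

# Moving the important first parameter by 0.5 is penalised.
drift_important = [1.5, -2.0, 0.5]
# Moving the third parameter, which has zero importance, costs nothing.
drift_unimportant = [1.0, -2.0, 3.0]

penalty_important = ewc_penalty(drift_important, old, fisher)
penalty_unimportant = ewc_penalty(drift_unimportant, old, fisher)
```

In training, a term of this kind would typically be added to the loss on the new task, with `lam` trading off plasticity on new classes against retention of old ones.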