- Face and Expression Recognition
- Sparse and Compressive Sensing Techniques
- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Face Recognition and Analysis
- Domain Adaptation and Few-Shot Learning
- Image Retrieval and Classification Techniques
- Human Pose and Action Recognition
- Remote-Sensing Image Classification
- Advanced Image Processing Techniques
- Image Processing Techniques and Applications
- 3D Shape Modeling and Analysis
- Medical Image Segmentation Techniques
- Robotics and Sensor-Based Localization
- Anomaly Detection Techniques and Applications
- Image and Signal Denoising Methods
- Multimodal Machine Learning Applications
- Remote Sensing and LiDAR Applications
- Neural Networks and Applications
- Machine Learning and Data Classification
- Blind Source Separation Techniques
- Biometric Identification and Security
- Generative Adversarial Networks and Image Synthesis
Nanjing University of Information Science and Technology
2018-2025
Nanjing University of Science and Technology
2016-2025
Nankai University
2022-2025
Macquarie University
2024-2025
Dalian Maritime University
2024
University of Kentucky
2024
Zhujiang Hospital
2024
Southern Medical University
2024
Ningbo University
2024
Beijing Institute of Technology
2011-2024
In this paper, a new technique coined two-dimensional principal component analysis (2DPCA) is developed for image representation. As opposed to PCA, 2DPCA is based on 2D image matrices rather than 1D vectors, so the image matrix does not need to be transformed into a vector prior to feature extraction. Instead, an image covariance matrix is constructed directly using the original image matrices, and its eigenvectors are derived for image feature extraction. To test 2DPCA and evaluate its performance, a series of experiments were performed on three face image databases: the ORL, AR, and Yale databases....
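The construction described in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_2dpca` and `transform_2dpca`, the `(N, h, w)` array layout, and the `1/N` normalization of the image covariance matrix are choices made here for the example.

```python
import numpy as np


def fit_2dpca(images, n_components):
    """Fit 2DPCA on a stack of images with shape (N, h, w).

    Returns the mean image and a (w, n_components) projection matrix whose
    columns are the leading eigenvectors of the image covariance matrix.
    """
    images = np.asarray(images, dtype=float)
    mean_image = images.mean(axis=0)
    centered = images - mean_image
    # Image covariance matrix G = (1/N) * sum_i (A_i - mean)^T (A_i - mean),
    # built directly from the 2D matrices; its shape is (w, w).
    G = np.einsum("nhi,nhj->ij", centered, centered) / len(images)
    # G is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean_image, eigvecs[:, order]


def transform_2dpca(images, projection):
    """Project each (h, w) image onto the axes, giving (h, n_components) features."""
    return np.asarray(images, dtype=float) @ projection
```

Because the covariance matrix is only `w x w`, the eigendecomposition stays small even for large images, which is the practical appeal of working on matrices instead of flattened vectors.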
In standard Convolutional Neural Networks (CNNs), the receptive fields of artificial neurons in each layer are designed to share the same size. It is well known in the neuroscience community that the receptive field size of visual cortical neurons is modulated by the stimulus, which has been rarely considered in constructing CNNs. We propose a dynamic selection mechanism in CNNs that allows each neuron to adaptively adjust its receptive field size based on multiple scales of input information. A building block called the Selective Kernel (SK) unit is designed, in which multiple branches with...
Recently, Convolutional Neural Network (CNN) based models have achieved great success in Single Image Super-Resolution (SISR). Owing to the strength of deep networks, these CNN models learn an effective nonlinear mapping from the low-resolution input image to the high-resolution target image, at the cost of requiring enormous parameters. This paper proposes a very deep CNN model (up to 52 convolutional layers) named Deep Recursive Residual Network (DRRN) that strives for deep yet concise networks. Specifically, residual learning is adopted,...
Recently, very deep convolutional neural networks (CNNs) have been attracting considerable attention in image restoration. However, as the depth grows, the long-term dependency problem is rarely realized for these very deep models, which results in the prior states/layers having little influence on the subsequent ones. Motivated by the fact that human thoughts have persistency, we propose a very deep persistent memory network (MemNet) that introduces a memory block, consisting of a recursive unit and a gate unit, to explicitly mine persistent memory through an adaptive...
Sparse representation has attracted much attention from researchers in the fields of signal processing, image processing, computer vision and pattern recognition. Sparse representation also has a good reputation in both theoretical research and practical applications. Many different algorithms have been proposed for sparse representation. The main purpose of this article is to provide a comprehensive study and an updated review on sparse representation and to supply guidance for researchers. The taxonomy of sparse representation methods can be studied from various viewpoints. For example, in terms of different norm minimizations...
This paper examines the theory of kernel Fisher discriminant analysis (KFD) in a Hilbert space and develops a two-phase KFD framework, i.e., kernel principal component analysis (KPCA) plus Fisher linear discriminant analysis (LDA). This framework provides novel insights into the nature of KFD. Based on this framework, the authors propose a complete kernel Fisher discriminant analysis (CKFD) algorithm. CKFD can be used to carry out discriminant analysis in "double discriminant subspaces." The fact that it can make full use of two kinds of discriminant information, regular and irregular, makes CKFD a more powerful discriminator. The proposed algorithm was tested and evaluated...
One-stage detector basically formulates object detection as dense classification and localization. The classification is usually optimized by Focal Loss and the box location is commonly learned under Dirac delta distribution. A recent trend for one-stage detectors is to introduce an individual prediction branch to estimate the quality of localization, where the predicted quality facilitates the classification to improve detection performance. This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization. Two problems are discovered in existing...
Recently the sparse representation (or coding) based classification (SRC) has been successfully used in face recognition. In SRC, the testing image is represented as a sparse linear combination of the training samples, and the representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Such a sparse coding model actually assumes that the coding residual follows a Gaussian or Laplacian distribution,...
Face Super-Resolution (SR) is a domain-specific super-resolution problem. The facial prior knowledge can be leveraged to better super-resolve face images. We present a novel deep end-to-end trainable Face Super-Resolution Network (FSRNet), which makes use of the geometry prior, i.e., facial landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR) face images without well-aligned requirement. Specifically, we first construct a coarse SR network to recover a coarse high-resolution (HR) image. Then, the coarse HR image is sent to two branches: a fine SR encoder...
This paper develops an unsupervised discriminant projection (UDP) technique for dimensionality reduction of high-dimensional data in small sample size cases. UDP can be seen as a linear approximation of a multimanifolds-based learning framework which takes into account both the local and nonlocal quantities. UDP characterizes the local scatter as well as the nonlocal scatter, seeking to find a projection that simultaneously maximizes the nonlocal scatter and minimizes the local scatter. This characteristic makes UDP more intuitive and more powerful than the most up-to-date method, Locality...
Convolutional neural network (CNN) has demonstrated impressive ability to represent hyperspectral images and to achieve promising results in hyperspectral image classification. However, traditional CNN models can only operate convolution on regular square image regions with fixed size and weights; thus, they cannot universally adapt to the distinct local regions with various object distributions and geometric appearances. Therefore, their classification performances are still to be improved, especially in class boundaries. To alleviate this...
Pedestrian detection has progressed significantly in the last years. However, occluded people are notoriously hard to detect, as their appearance varies substantially depending on a wide range of occlusion patterns. In this paper, we aim to propose a simple and compact method based on the FasterRCNN architecture for occluded pedestrian detection. We start with interpreting CNN channel features of a pedestrian detector, and we find that different channels activate responses for different body parts respectively. These findings motivate us to employ an...
Understanding shadows from a single image consists of two types of task in previous studies, containing shadow detection and shadow removal. In this paper, we present a multi-task perspective, which is not embraced by any existing work, to jointly learn both detection and removal in an end-to-end fashion that aims at enjoying the mutually improved benefits from each other. Our framework is based on a novel STacked Conditional Generative Adversarial Network (ST-CGAN), which is composed of two stacked CGANs, each with a generator and a discriminator....
For human pose estimation in monocular images, joint occlusions and overlapping upon human bodies often result in deviated pose predictions. Under these circumstances, biologically implausible pose predictions may be produced. In contrast, human vision is able to predict poses by exploiting geometric constraints of joint inter-connectivity. To address the problem by incorporating priors about the structure of human bodies, we propose a novel structure-aware convolutional network to implicitly take such priors into account during training of the deep...
Recently, regression analysis has become a popular tool for face recognition. Most existing regression methods use the one-dimensional, pixel-based error model, which characterizes the representation error individually, pixel by pixel, and thus neglects the two-dimensional structure of the error image. We observe that occlusion and illumination changes generally lead, approximately, to a low-rank error image. In order to make use of this low-rank structural information, this paper presents a two-dimensional image-matrix-based error model, namely, nuclear norm based matrix regression (NMR), for face representation and classification. NMR...
Face recognition (FR) is an active yet challenging topic in computer vision applications. As a powerful tool to represent high dimensional data, recently sparse representation based classification (SRC) has been successfully used for FR. This paper discusses the metaface learning (MFL) of face images under the framework of SRC. Although directly using the training samples as dictionary bases can achieve good FR performance, a well learned dictionary matrix can lead to a higher FR rate with fewer dictionary atoms. An SRC oriented...
In this paper, we propose a novel Pattern-Affinitive Propagation (PAP) framework to jointly predict depth, surface normal and semantic segmentation. The motivation behind it comes from the statistical observation that pattern-affinitive pairs recur much more frequently across different tasks as well as within a task. Thus, we can conduct two types of propagations, cross-task propagation and task-specific propagation, to adaptively diffuse those similar patterns. The former integrates cross-task affinity patterns to adapt to each task...
Localization Quality Estimation (LQE) is crucial and popular in the recent advancement of dense object detectors since it can provide accurate ranking scores that benefit the Non-Maximum Suppression processing and improve detection performance. As a common practice, most existing methods predict LQE scores through vanilla convolutional features shared with object classification or bounding box regression. In this paper, we explore a completely novel and different perspective to perform LQE – based on the learned distributions...
This paper first answers the question "why do the two most powerful techniques Dropout and Batch Normalization (BN) often lead to a worse performance when they are combined together in many modern neural networks, but cooperate well sometimes as in Wide ResNet (WRN)?" in both theoretical and empirical aspects. Theoretically, we find that Dropout shifts the variance of a specific neural unit when we transfer the state of that network from training to test. However, BN maintains its statistical variance, which is accumulated from the entire learning...
Recent research on remote sensing object detection has largely focused on improving the representation of oriented bounding boxes but has overlooked the unique prior knowledge presented in remote sensing scenarios. Such prior knowledge can be useful because tiny remote sensing objects may be mistakenly detected without referencing a sufficiently long-range context, which can vary for different objects. This paper considers these priors and proposes the lightweight Large Selective Kernel Network (LSKNet). LSKNet can dynamically adjust its large spatial receptive...
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications. The task of FGIA targets analyzing visual objects from subordinate categories, e.g., species of birds or models of cars. The small inter-class and large intra-class variation inherent to fine-grained image analysis makes it a challenging problem. Capitalizing on advances in deep learning, in recent years we have witnessed remarkable progress in deep learning powered FGIA. In...
Recently the sparse representation based classification (SRC) has been proposed for robust face recognition (FR). In SRC, the testing image is coded as a sparse linear combination of the training samples, and the representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Such a sparse coding model assumes that the coding residual follows a Gaussian or Laplacian distribution, which may not be effective enough to describe the coding residual in practical FR systems. Meanwhile, the sparsity constraint on the coding coefficients makes SRC's computational cost very high. In this paper,...
The task of traffic line detection is a fundamental yet challenging problem. Previous approaches usually conduct traffic line detection via a two-stage way, namely the line segment detection followed by the segment clustering, which is very likely to ignore the global semantic information of an entire line. To address the problem, we propose an end-to-end system called Line-CNN (L-CNN), in which the key component is a novel line proposal unit (LPU). The LPU utilizes line proposals as references to locate accurate traffic curves, which forces the system to learn the global feature representation of the entire traffic lines. We benchmark the proposed L-CNN...