- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Human Pose and Action Recognition
- Image Enhancement Techniques
- Face and Expression Recognition
- Medical Image Segmentation Techniques
- Topic Modeling
- Natural Language Processing Techniques
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Robotics and Sensor-Based Localization
- Advanced Image Processing Techniques
- Visual Attention and Saliency Detection
- Anomaly Detection Techniques and Applications
- Advanced Neural Network Applications
- Image and Object Detection Techniques
- Face recognition and analysis
- Advanced Graph Neural Networks
- Network Security and Intrusion Detection
- Microencapsulation and Drying Processes
- Artificial Immune Systems Applications
- Radiomics and Machine Learning in Medical Imaging
- Impact of Light on Environment and Health
MediaTek (Taiwan)
2025
Beijing Institute of Technology
2015-2024
Northwest University
2022-2024
Science and Technology on Surface Physics and Chemistry Laboratory
2024
Shenzhen MSU-BIT University
2022-2024
Peking University
2024
Nanjing University of Aeronautics and Astronautics
2022-2023
Shenzhen University
2022-2023
Vision Technology (United States)
2023
Baidu (China)
2023
In this paper, we propose a vehicle type classification method using a semisupervised convolutional neural network from vehicle frontal-view images. In order to capture rich and discriminative information of vehicles, we introduce sparse Laplacian filter learning to obtain the filters of the network with large amounts of unlabeled data. Serving as the output layer of the network, the softmax classifier is trained by multitask learning with small amounts of labeled data. For a given vehicle image, the network can provide the probability of each type to which the vehicle belongs. Unlike traditional methods using handcrafted...
The widespread use of surveillance cameras toward smart and safe cities poses the critical but challenging problem of vehicle reidentification (Re-ID). The state-of-the-art research work performed vehicle Re-ID relying on deep metric learning with a triplet network. However, most existing methods basically ignore the impact of intraclass variance-incorporated embedding on the performance of vehicle reidentification, in which robust fine-grained features for large-scale vehicle Re-ID have not been fully studied. In this paper, we propose a method,...
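The triplet-network metric learning that this abstract cites as the prevailing approach (not the method the paper itself proposes) is usually trained with a triplet margin loss. A minimal sketch in plain Python follows; the toy embeddings, the margin of 0.2, and the function name `triplet_loss` are assumptions chosen for illustration only.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on Euclidean distances.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (a hypothetical value here).
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (hypothetical): same vehicle is close, other vehicle is far.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

easy = triplet_loss(anchor, positive, negative)  # constraint already satisfied
hard = triplet_loss(anchor, negative, positive)  # roles swapped: constraint violated
```

With these toy values the first triplet already satisfies the margin and contributes no loss, while the swapped triplet is penalized by the distance gap plus the margin.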
For machine reading comprehension, the capacity of effectively modeling the linguistic knowledge from detail-riddled and lengthy passages and getting rid of the noises is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating syntactic constraints into the attention mechanism for better linguistically motivated word...
The middle reaches of the Yellow River basin (MYRB) are among the regions most severely affected by soil erosion globally. It has always held a pivotal role in soil and water conservation and ecological restoration efforts in China. Nonetheless, in the face of recurrent drought occurrences and growing human intervention, there have been notable alterations in the eco-environmental quality (EEQ) within the MYRB. However, the influences of drought and human intervention on the EEQ in the MYRB remain unclear. In this study, the remote sensing ecological index (RSEI) was applied to...
Multi-choice reading comprehension is a challenging task to select an answer from a set of candidate options when given a passage and a question. Previous approaches usually only calculate the question-aware passage representation and ignore the passage-aware question representation when modeling the relationship between passage and question, which cannot effectively capture the relationship between passage and question. In this work, we propose a dual co-matching network (DCMN) which models the relationship among passage, question, and answer options bidirectionally. Besides, inspired by how humans solve multi-choice questions, we integrate two...
Most existing Visual Question Answering (VQA) models overly rely on language priors between questions and answers. In this paper, we present a novel method of language attention-based VQA that learns decomposed linguistic representations of questions and utilizes the representations to infer answers for overcoming language priors. We introduce a modular language attention mechanism to parse a question into three phrase representations: type representation, object representation, and concept representation. We use the type representation to identify the possible answer set (yes/no or specific...
Quantifying full left ventricular (LV) metrics including cavity area, myocardium area, cavity dimensions and wall thicknesses from cardiac magnetic resonance (MR) images, and then assessing regional and global cardiac function plays a crucial role in clinical practice. However, due to highly variable cardiac structures across different subjects, it is challenging to obtain an accurate estimation of LV metrics. In this paper, we propose a novel deep learning framework, called cascaded segmentation and regression network (CSRNet), to improve...
In this paper, we propose a scene-aware context reasoning method that exploits context information from visual features for unsupervised abnormal event detection in videos, which bridges the semantic gap between visual context and the meaning of abnormal events. In particular, we build a spatio-temporal context graph to model visual context information including appearances of objects, spatio-temporal relationships among objects and scene types. The visual context information is encoded into the nodes and edges of the graph, and their states are iteratively updated by using multiple RNNs with message passing for context reasoning. To infer...
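Deep neural networks have shown excellent performance for stereo matching. Many efforts focus on the feature extraction and similarity measurement of the matching cost computation step, while less attention is paid to cost aggregation, which is crucial for stereo matching. In this paper, we present a learning-based cost aggregation method for stereo matching by a novel sub-architecture in the end-to-end trainable pipeline. We reformulate the cost aggregation as a learning process of the generation and selection of cost aggregation proposals, which indicate the possible cost aggregation results. The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other...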
Hyperbolic graph convolutional networks (GCNs) demonstrate powerful representation ability to model graphs with hierarchical structure. Existing hyperbolic GCNs resort to tangent spaces to realize graph convolution on hyperbolic manifolds, which is inferior because the tangent space is only a local approximation of a manifold. In this paper, we propose a hyperbolic-to-hyperbolic graph convolutional network (H2H-GCN) that directly works on hyperbolic manifolds. Specifically, we developed a manifold-preserving graph convolution that consists of a hyperbolic feature transformation and a hyperbolic neighborhood aggregation....
In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases. In order to reduce the huge cost of stereo matching at the original resolution, our model only runs dense matching at a very low resolution and uses sparse matching at different higher resolutions to recover the disparity of lost details scale-by-scale. After the decomposition of stereo matching, our model iteratively fuses the sparse and dense disparity maps from adjacent scales with an occlusion-aware mask. A refinement network is also applied to improving the fusion result....
In this paper, we aim to construct a deep neural network which embeds high dimensional symmetric positive definite (SPD) matrices into a more discriminative low dimensional SPD manifold. To this end, we develop two types of basic layers: a 2D fully connected layer which reduces the dimensionality of the SPD matrices, and a symmetrically clean layer which achieves non-linear mapping. Specifically, we extend the classical fully connected layer such that it is suitable for SPD matrices. We further show that SPD matrices with symmetric pair elements setting zero operations are still symmetric positive definite. Finally, we complete...
Bilinear pooling has achieved state-of-the-art performance on fusing features in various machine learning tasks, owing to its ability to capture complex associations between features. Despite the success, bilinear pooling suffers from redundancy and burstiness issues, mainly due to the rank-one property of the resulting representation. In this paper, we prove that bilinear pooling is indeed a similarity-based coding-pooling formulation. This establishment then enables us to devise a new feature fusion algorithm, the factorized bilinear coding...
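For readers unfamiliar with the baseline this abstract analyzes, plain bilinear pooling of two feature vectors is their outer product, flattened into one vector. The sketch below shows only that baseline, not the paper's proposed algorithm; the function name `bilinear_pool` and the toy vectors are assumptions for illustration.

```python
def bilinear_pool(x, y):
    """Fuse two feature vectors by their outer product, flattened row by row.

    The output has len(x) * len(y) entries, and the underlying matrix is
    rank one: every row is a scalar multiple of y.
    """
    return [xi * yj for xi in x for yj in y]

# Hypothetical toy features from two modalities.
x = [1.0, 2.0]
y = [3.0, 4.0, 5.0]

z = bilinear_pool(x, y)

# Split the flat vector back into rows to expose the rank-one structure.
rows = [z[i * len(y):(i + 1) * len(y)] for i in range(len(x))]
```

Here the second row of the fused matrix is exactly twice the first, which is the rank-one redundancy the abstract refers to.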
Few-shot learning describes the challenging problem of recognizing samples from unseen classes given very few labeled examples. In many cases, few-shot learning is cast as learning an embedding space that assigns test samples to their corresponding class prototypes. Previous methods assume that data of all few-shot learning tasks comply with a fixed geometrical structure, mostly a Euclidean structure. Questioning this assumption, which is clearly difficult to hold in real-world scenarios and incurs distortions to data, we propose to learn a task-aware curved embedding space by...
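The Euclidean prototype assignment that this abstract attributes to previous methods (the assumption the paper questions, not its proposed curved embedding) can be sketched as follows. The support set, class labels, and function names are hypothetical, and the inputs are treated as already-embedded feature vectors.

```python
import math

def prototype(samples):
    """Class prototype: the coordinate-wise mean of the support embeddings."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical 2-way, 2-shot support set of embedded samples.
support = {
    "cat": [[0.0, 0.0], [0.0, 2.0]],
    "dog": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = {label: prototype(samples) for label, samples in support.items()}

near_cat = classify([1.0, 1.0], prototypes)
near_dog = classify([4.0, 5.0], prototypes)
```

Replacing `math.dist` with a distance defined on a curved space is where a non-Euclidean variant would differ; that part is not sketched here.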
Different from the ground image with uniform haze, the haze in remote sensing (RS) images has characteristics of irregular shape and uneven concentration in hazy weather. It brings a great challenge to the application of RS data in advanced image processing tasks. A novel dehazing network for non-uniform hazy RS images, named KFA-Net, is proposed to solve the aforementioned issues. The designed asymmetric size feature cascade (ASFC), k-means pixel attention (KPA) and FFT channel attention (FCA) in KFA-Net all show excellent effects. Compared...
Stereo matching becomes computationally challenging when dealing with a large disparity range. Prior methods mainly alleviate the computation through a dynamic cost volume by focusing on a local disparity space, but it requires many iterations to get close to the ground truth due to the lack of a global view. We find that the dynamic cost volume approximately encodes the disparity space as a single Gaussian distribution with a fixed and small variance at each iteration, which results in an inadequate global view over the disparity space and a small update step at every iteration. In this paper, we propose...
Riemannian meta-optimization provides a promising approach to solving non-linear constrained optimization problems, which trains neural networks as optimizers to perform optimization on Riemannian manifolds. However, existing Riemannian meta-optimization methods take up huge memory footprints in large-scale optimization settings, as the learned optimizer can only adapt gradients of a fixed size and thus cannot be shared across different Riemannian parameters. In this paper, we propose an efficient Riemannian meta-optimization method that significantly reduces the memory burden for large-scale optimization via a subspace adaptation scheme....
The appearance of an object could be continuously changing during tracking, thereby being not independent identically distributed. A good discriminative tracker often needs a large number of training samples to fit the underlying data distribution, which is impractical for visual tracking. In this paper, we present a new discriminative tracker via landmark-based label propagation (LLP) that is nonparametric and makes no specific assumption about the sample distribution. With an undirected graph representation of samples, the LLP...
Global optimization algorithms have shown impressive performance in data-association based multi-object tracking, but handling online data remains a difficult hurdle to overcome. In this paper, we present a hybrid data association framework with a min-cost multi-commodity network flow for robust online multi-object tracking. We build local target-specific models interleaved with global optimization of the optimal data association over multiple video frames. More specifically, in the min-cost multi-commodity network flow, the target-specific similarities are online learned to enforce the local consistency for reducing the complexity...
Multi-choice reading comprehension is a challenging task that requires a complex reasoning procedure. Given a passage and a question, a correct answer needs to be selected from a set of candidate answers. In this paper, we propose the Dual Co-Matching Network (DCMN), which models the relationship among passage, question and answer bidirectionally. Different from existing approaches which only calculate question-aware or option-aware passage representation, we calculate passage-aware question representation and passage-aware answer representation at the same...