- Face and Expression Recognition
- Advanced Vision and Imaging
- Face recognition and analysis
- Traffic Prediction and Management Techniques
- Advanced Image Processing Techniques
- Video Surveillance and Tracking Methods
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Image and Signal Denoising Methods
- Human Pose and Action Recognition
- Advanced Graph Neural Networks
- Sparse and Compressive Sensing Techniques
- 3D Shape Modeling and Analysis
- Domain Adaptation and Few-Shot Learning
- Anomaly Detection Techniques and Applications
- Video Coding and Compression Technologies
- Image Retrieval and Classification Techniques
- Advanced Neural Network Applications
- Generative Adversarial Networks and Image Synthesis
- Transportation Planning and Optimization
- Computer Graphics and Visualization Techniques
- Advanced Data Compression Techniques
- Text and Document Classification Technologies
- Image Processing Techniques and Applications
- Complex Network Analysis Techniques
Beijing University of Technology
2016-2025
Dalian University of Technology
2016-2025
Peng Cheng Laboratory
2019-2024
Beijing Academy of Artificial Intelligence
2020-2023
Institute of Art
2022-2023
Dalian University
2016-2022
Beijing Advanced Sciences and Innovation Center
2017-2018
Institute of Software
2018
Charles Sturt University
2015
Beijing Technology and Business University
2010
Deep Neural Networks (DNNs) have substantially improved the state-of-the-art in salient object detection. However, training DNNs requires costly pixel-level annotations. In this paper, we leverage the observation that image-level tags provide important cues of foreground objects, and develop a weakly supervised learning method for saliency detection using image-level tags only. The Foreground Inference Network (FIN) is introduced for this challenging task. In the first stage of our method, FIN is jointly trained with a fully...
Deep convolutional neural networks (CNNs) have delivered superior performance in many computer vision tasks. In this paper, we propose a novel deep fully convolutional network model for accurate salient object detection. The key contribution of this work is to learn uncertain convolutional features (UCF), which encourage the robustness and accuracy of saliency detection. We achieve this via introducing a reformulated dropout (R-dropout) after specific convolutional layers to construct an ensemble of internal feature units. In addition, an effective hybrid upsampling...
Traffic prediction plays an essential role in the intelligent transportation system. Accurate traffic prediction can assist route planning, guide vehicle dispatching, and mitigate traffic congestion. This problem is challenging due to the complicated and dynamic spatio-temporal dependencies between different regions in the road network. Recently, a significant amount of research effort has been devoted to this area, especially deep learning methods, greatly advancing traffic prediction abilities. The purpose of this paper is to provide a comprehensive survey on...
Traffic prediction is a core problem in the intelligent transportation system and has broad applications in transportation management and planning. The main challenge of this field is how to efficiently explore the spatial and temporal information of traffic data. Recently, various deep learning methods, such as the convolution neural network (CNN), have shown promising performance in traffic prediction. However, it samples traffic data in regular grids as the input of the CNN, and thus destroys the spatial structure of the road network. In this paper, we introduce a graph network and propose an optimized...
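To make the grid-versus-graph point concrete, here is a minimal sketch of one generic graph-convolution step on a road graph, where each sensor only mixes with its actual road neighbours. It is not the model proposed in the abstract above: the symmetric normalization is the standard textbook form, and the four-sensor graph, the speed values, and the names `ROAD_ADJ`, `SPEEDS`, and `IDENTITY_W` are assumptions made for illustration.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Hypothetical road graph: four sensors along one road, connected 0-1-2-3.
ROAD_ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
SPEEDS = [[60.0], [30.0], [30.0], [60.0]]  # one made-up feature per sensor
IDENTITY_W = [[1.0]]                        # untrained weight, for illustration only

smoothed = gcn_layer(ROAD_ADJ, SPEEDS, IDENTITY_W)
```

Because the adjacency matrix encodes the road topology directly, no resampling onto a regular grid is needed before the layer is applied.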
Low-light images typically suffer from two problems. First, they have low visibility (i.e., small pixel values). Second, noise becomes significant and disrupts the image content, due to the low signal-to-noise ratio. Most existing low-light image enhancement methods, however, learn from noise-negligible datasets. They rely on users having good photographic skills in taking images with low noise. Unfortunately, this is not the case for the majority of low-light images. While concurrently enhancing a low-light image and removing its noise is ill-posed, we observe that...
Grain boundaries (GBs) play an important role in the mechanical behavior of polycrystalline materials. Despite decades of investigation, the atomic-scale dynamic processes of GB deformation remain elusive, particularly for the GBs in polycrystals, which are commonly of the asymmetric and general type. We conducted an in situ atomic-resolution study to reveal how sliding-dominant deformation is accomplished at tilt GBs in platinum bicrystals. We observed either direct sliding along the GB or sliding with atom transfer across the boundary plane. The latter...
Accurate traffic forecasting is important to enable intelligent transportation systems in a smart city. This problem is challenging due to the complicated spatial, short-term temporal and long-term periodical dependencies. Existing approaches have considered these factors in modeling. Most solutions apply CNN, or its extension Graph Convolution Networks (GCN), to model the spatial correlation. However, the convolution operator may not adequately model the non-Euclidean pair-wise correlations. In this paper, we propose...
Traffic forecasting is a challenging problem in the transportation research field because of the complexity and non-stationary changing of traffic data; thus the key to this issue is how to explore the proper spatial and temporal characteristics. Based on this thought, many creative methods have been proposed, among which Graph Convolution Network (GCN) based methods have shown promising performance. However, these methods depend on the graph construction, which mainly uses the prior knowledge of the road network. Recently, some works realized this fact about the network and tried to construct...
Traffic forecasting is attracting considerable interest due to its widespread application in intelligent transportation systems. Given the complex and dynamic traffic data, many methods focus on how to establish a spatial-temporal model to express the non-stationary traffic patterns. Recently, the latest Graph Convolution Network (GCN) has been introduced to learn spatial features while time neural networks are used to learn temporal features. These GCN based methods obtain state-of-the-art performance. However, the current GCN based methods ignore...
Metro passenger flow prediction is a strategically necessary demand in an intelligent transportation system to alleviate traffic pressure, coordinate operation schedules, and plan future constructions. Graph-based neural networks have been widely used in traffic flow prediction problems. Graph Convolutional Neural Networks (GCN) capture spatial features according to established connections but ignore the high-order relationships between stations and the travel patterns of passengers. In this paper, we utilize a novel...
This article presents a novel person reidentification model, named multihead self-attention network (MHSA-Net), to prune unimportant information and capture key local information from person images. MHSA-Net contains two main components: the multihead self-attention branch (MHSAB) and the attention competition mechanism (ACM). The MHSAB adaptively captures key local information and then produces effective diversity embeddings of an image for the person matching. The ACM further helps filter out attention noise and nonkey information. Through extensive ablation studies, we verified that both...
Event-based cameras bring a unique capability to tracking, being able to function in challenging real-world conditions as a direct result of their high temporal resolution and high dynamic range. These imagers capture events asynchronously that encode rich temporal and spatial information. However, effectively extracting this information from events remains an open challenge. In this work, we propose a spiking transformer network, STNet, for single object tracking. STNet dynamically extracts and fuses information from both temporal and spatial domains. In particular, the...
Graph convolutional networks (GCN) have been applied in traffic flow forecasting tasks with the graph capability of describing the irregular topology structures of road networks. However, GCN based traffic flow forecasting methods often fail to simultaneously capture the short-term and long-term temporal relations carried by the traffic flow data, and also suffer from the over-smoothing problem. To overcome these problems, we propose a hierarchical traffic flow forecasting network by merging a newly designed Long-term Temporal Transformer (LTT) and Spatio-temporal Graph Convolution (STGC). Specifically, LTT aims to learn the long-term temporal relations among the traffic flow data, while...
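The over-smoothing problem mentioned above can be illustrated with a small sketch: repeatedly averaging each node with its neighbours drives all node signals toward the same value, so deep stacks of propagation steps lose the differences between road segments. This is a generic demonstration, not the hierarchical network from the abstract; the row-normalized propagation rule, the four-node path graph, and the flow values in `FLOW` are assumptions chosen for illustration.

```python
def row_normalized(adj):
    """Return D^{-1} (A + I): each node averages itself and its neighbours."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in a_hat]


def propagate(p, signal, steps):
    """Apply the propagation matrix `steps` times to a per-node signal."""
    for _ in range(steps):
        signal = [
            sum(p[i][j] * signal[j] for j in range(len(signal)))
            for i in range(len(signal))
        ]
    return signal


def spread(signal):
    """Gap between the largest and smallest node value."""
    return max(signal) - min(signal)


# Hypothetical road graph: four segments connected in a line, 0-1-2-3.
PATH_ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
FLOW = [10.0, 80.0, 20.0, 50.0]  # made-up traffic flow per segment

P = row_normalized(PATH_ADJ)
shallow = propagate(P, FLOW, 1)   # one step: values move closer together
deep = propagate(P, FLOW, 50)     # many steps: values become nearly identical
```

Comparing `spread(FLOW)`, `spread(shallow)`, and `spread(deep)` shows the node values collapsing as the number of propagation steps grows.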
The recently emerged compressive sensing (CS) theory provides a whole new avenue for data gathering in wireless sensor networks with benefits of universal sampling and decentralized encoding. However, existing compressive sensing based data gathering approaches assume the sensed data has a known constant sparsity, ignoring that the sparsity of natural signals varies in the temporal and spatial domain. In this paper, we present an adaptive data gathering scheme by compressive sensing for wireless sensor networks. By introducing an autoregressive (AR) model into the reconstruction of the sensed data, the local correlation is...
Recently, single-image super-resolution has made great progress owing to the development of deep convolutional neural networks (CNNs). The vast majority of CNN-based models use a pre-defined upsampling operator, such as bicubic interpolation, to upscale input low-resolution images to the desired size and learn the non-linear mapping between the interpolated image and the ground truth high-resolution (HR) image. However, interpolation processing can lead to visual artifacts as details are over-smoothed, particularly when...
This paper presents a new model, Semantics-enhanced Generative Adversarial Network (SEGAN), for fine-grained text-to-image generation. We introduce two modules, a Semantic Consistency Module (SCM) and an Attention Competition Module (ACM), to our SEGAN. The SCM incorporates image-level semantic consistency into the training of the Generative Adversarial Network (GAN), and can diversify the generated images and improve their structural coherence. A Siamese network and two types of semantic similarities are designed to map the synthesized image and the groundtruth image to nearby points in...
This paper presents a new framework, Knowledge-Transfer Generative Adversarial Network (KT-GAN), for fine-grained text-to-image generation. We introduce two novel mechanisms: an Alternate Attention-Transfer Mechanism (AATM) and a Semantic Distillation Mechanism (SDM), to help the generator better bridge the cross-domain gap between text and image. The AATM updates word attention weights and the attention weights of image sub-regions alternately, to progressively highlight important word information and enrich the details of synthesized images. The SDM uses...
Inspired by the complementarity between conventional frame-based and bio-inspired event-based cameras, we propose a multi-modal based approach to fuse visual cues from the frame- and event-domain to enhance the single object tracking performance, especially in degraded conditions (e.g., scenes with high dynamic range, low light, and fast-motion objects). The proposed approach can effectively and adaptively combine meaningful information from both domains. Our approach’s effectiveness is enforced by a novel designed cross-domain...
Recently, deep convolutional neural networks (CNNs) have been widely explored in single image super-resolution (SISR) and contribute remarkable progress. However, most of the existing CNNs-based SISR methods do not adequately explore contextual information in the feature extraction stage and pay little attention to the final high-resolution (HR) image reconstruction step, hence hindering the desired SR performance. To address the above two issues, in this paper, we propose a two-stage attentive network (TSAN) for accurate...
Recently, Graph Convolution Network (GCN) and Temporal Convolution Network (TCN) are introduced into traffic prediction and achieve state-of-the-art performance due to their good ability for modeling the spatial and temporal property of traffic data. In spite of having good performance, the current methods generally focus on the traffic measurement of road segments, i.e. the nodes of the traffic flow graph, while the edges of the graph, which represent the correlation of traffic data of different road segments and form the affinity matrix of GCN, are usually constructed according to the structure of the road network, but their properties are not well...
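In this paper, we propose a novel person Re-ID model, Consecutive Batch DropBlock Network (CBDB-Net), to capture the attentive and robust person descriptor for the person Re-ID task. The CBDB-Net contains two novel designs: the Consecutive Batch DropBlock Module (CBDBM) and the Elastic Loss (EL). In the Consecutive Batch DropBlock Module (CBDBM), we firstly conduct uniform partition on the feature maps. And then, we independently and continuously drop each patch from top to bottom on the feature maps, which can output multiple incomplete feature maps. In the training stage, these multiple incomplete features can better encourage the Re-ID model to capture the robust person descriptor. In the Elastic Loss (EL), we design a novel weight control item to help...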
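The partition-and-drop step described above can be sketched as follows. This is only a reading of the abstract under stated assumptions, not the authors' implementation: the feature map is treated as a single 2D grid, the uniform partition is taken to be equal-height horizontal strips, "dropping" is taken to mean zeroing, and the function name `consecutive_drop` and the toy `FEATURE_MAP` are hypothetical.

```python
def consecutive_drop(feature_map, num_patches):
    """Return one copy of the map per strip, each with a different strip zeroed.

    The map is split into `num_patches` equal-height horizontal strips,
    and strips are dropped one at a time from top to bottom.
    """
    height = len(feature_map)
    if num_patches <= 0 or height % num_patches != 0:
        raise ValueError("height must be divisible by a positive num_patches")
    strip = height // num_patches
    outputs = []
    for p in range(num_patches):
        top, bottom = p * strip, (p + 1) * strip
        outputs.append([
            [0.0] * len(row) if top <= r < bottom else list(row)
            for r, row in enumerate(feature_map)
        ])
    return outputs


# Toy 4x3 feature map with distinct non-zero values (illustration only).
FEATURE_MAP = [[float(r * 3 + c + 1) for c in range(3)] for r in range(4)]

# Two strips of two rows each give two incomplete feature maps.
incomplete = consecutive_drop(FEATURE_MAP, 2)
```

Each returned map is missing a different region, which is one way to obtain the multiple incomplete feature maps that the abstract says are used during training.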
Traffic prediction methods on single-source data have achieved excellent results in recent years, especially the Graph Convolutional Networks (GCN) based models with spatio-temporal dependency. In reality, various modes of urban transportation operate simultaneously. They influence and complement each other in common space-time occasions, constituting the urban transportation system dynamically. Thus, traffic data from multiple sources is ostensibly heterogeneous, but internally correlated. The typical single-source data driven methods are,...
Negative emotions may induce dangerous driving behaviors leading to extremely serious traffic accidents. Therefore, it is necessary to establish a system that can automatically recognize driver emotions so that some actions can be taken to avoid traffic accidents. Existing studies on driver emotion recognition have mainly used facial data and physiological data. However, there are fewer studies on multimodal data with contextual characteristics of driving. In addition, fully fusing multimodal data in the feature fusion layer to improve the performance of emotion recognition is still a challenge. To this...
Most existing RGB-based trackers target low frame rate benchmarks of around 30 frames per second. This setting restricts the tracker's functionality in the real world, especially for fast motion. Event-based cameras as bioinspired sensors provide considerable potential for high frame rate tracking due to their high temporal resolution. However, event-based cameras cannot offer fine-grained texture information like conventional cameras. This unique complementarity motivates us to combine conventional frames and events for high frame rate object tracking under various challenging...