- Parallel Computing and Optimization Techniques
- Advanced Neural Network Applications
- Advanced SAR Imaging Techniques
- Synthetic Aperture Radar (SAR) Applications and Techniques
- Advanced Image and Video Retrieval Techniques
- Autonomous Vehicle Technology and Safety
- Traffic Prediction and Management Techniques
- Microwave Imaging and Scattering Analysis
- Embedded Systems Design Techniques
- Face and Expression Recognition
- Time Series Analysis and Forecasting
- Infrastructure Maintenance and Monitoring
- Interconnection Networks and Systems
- Distributed Systems and Fault Tolerance
- Security and Verification in Computing
- Industrial Vision Systems and Defect Detection
National University of Defense Technology
2008-2025
Tsinghua University
2024
Chengdu University of Technology
2024
Jilin University
2023
State Key Laboratory of Automotive Simulation and Control
2023
Institute of Computing Technology
2009
Airborne and satellite-based synthetic aperture radar enables the acquisition of high-resolution SAR oceanographic images in which even the outlines of ships can be identified. The detection of ship targets from SAR images has a wide range of applications. Due to the density of ships in SAR images, the extreme imbalance between foreground and background clutter, and the diversity of target sizes, achieving lightweight and highly accurate multi-scale detection remains a great challenge. To this end, this paper proposes an attention mechanism for receptive fields convolution...
The verification of an execution against memory consistency is known to be NP-hard. This paper proposes a novel fast memory consistency verification method by identifying a new natural partial order: time order. In multiprocessor systems with store atomicity, a time order restriction exists between two operations whose pending periods are disjoint: the former operation in time order must be observed by the latter operation. Based on the time order restriction, memory consistency verification is localized: for any operation, both inferring related orders and checking related cycles need to take into account only...
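The time order restriction described above can be illustrated with a minimal sketch. It assumes each operation carries a pending period given by hypothetical `start` and `end` timestamps; the `Op` type and `time_order_edges` function are illustrative names, not taken from the paper, and the sketch covers only the derivation of time order edges, not the full verification method.

```python
from typing import List, NamedTuple, Tuple


class Op(NamedTuple):
    name: str
    start: int  # hypothetical time at which the operation becomes pending
    end: int    # hypothetical time at which the operation completes


def time_order_edges(ops: List[Op]) -> List[Tuple[str, str]]:
    """Return (former, latter) pairs whose pending periods are disjoint."""
    edges = []
    for a in ops:
        for b in ops:
            # a's pending period ends before b's begins, so a precedes b
            # in time order and must be observed by b.
            if a.end < b.start:
                edges.append((a.name, b.name))
    return edges


ops = [Op("w1", 0, 3), Op("r1", 2, 5), Op("w2", 6, 8)]
edges = time_order_edges(ops)
print(edges)
```

Here `w1` and `r1` overlap in time, so no time order edge is derived between them, while both are ordered before `w2`.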
Convolution kernels are widely seen in deep learning workloads and are often responsible for performance bottlenecks. Recent research has demonstrated that a direct convolution approach can outperform the traditional convolution implementation based on tensor-to-matrix conversions. However, existing approaches for direct convolution still have room for performance improvement. We present nDirect, a new direct convolution approach that targets ARM-based multi-core CPUs commonly found in smartphones and HPC systems. nDirect is designed to be compatible with the data layout formats used by...
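For readers unfamiliar with the term, the following is a minimal sketch of what "direct" convolution means in general: the output is accumulated straight from the input, without first building an intermediate matrix. It is not nDirect itself; the function name `direct_conv2d`, the single-channel 2D setting, unit stride, and the absence of padding are all simplifying assumptions. As is conventional in deep learning, the kernel is applied without flipping (cross-correlation).

```python
from typing import List

Matrix = List[List[float]]


def direct_conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 2D convolution, stride 1, no padding."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            # Accumulate directly over the kernel window; no
            # tensor-to-matrix conversion buffer is created.
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[i][j]
            out[y][x] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
result = direct_conv2d(image, kernel)
print(result)
```

An optimized implementation would tile and vectorize these loops for the target CPU; the sketch only shows the loop structure that such optimizations start from.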
Road detection technology is an important part of the automatic driving environment perception system. With the development of technology, the situations that road detection needs to consider will become broader and more complex. This paper contributes a lightweight convolutional neural network model, incorporating novel convolution and parallel pooling modules, an improved activation function, and comprehensive training and verification with multiple datasets. The proposed model achieves high accuracy in detecting drivable areas...
Microscopic traffic flow data, an important input to virtual test scenarios for autonomous driving, are often difficult to obtain in large quantities to allow for batch testing. In this paper, a neural network for generating microscopic traffic flow scene fragments is proposed, which is improved by adding Gate Recurrent Units (GRU) to the discriminator of the Deep Convolutional Generative Adversarial Network (DCGAN) to enable it to better discriminate continuous data. Subsequently, this paper compares individual sample motion trajectories...
A piecewise uniform sampling local backprojection (PUS-LBP) algorithm is presented for short-constant-aperture SAR image formation. In the PUS-LBP algorithm, subaperture images are formed based on Cartesian coordinates, which dramatically improves computation efficiency compared with the original LBP. Furthermore, the piecewise uniform sampling technique is applied to LBP in Cartesian coordinates. This increases the subimage sampling rate at the edges, but retains a lower sampling rate at the center. As a result, consistent focusing performance of the whole image is achieved in an efficient way....
Synchronization operations like barriers are frequently seen in parallel OpenMP programs, where an inefficient implementation can severely limit the application performance. While synchronization optimization has been heavily studied on traditional x86 architectures, there is no consensus on how synchronization can be best implemented on ARMv8 multi-core CPUs. This paper presents a study of OpenMP synchronization implementation on two representative ARMv8 multi-core CPUs, Phytium 2000+ and ThunderX2, by considering various synchronization mechanisms offered by mainstream compilers, GCC and LLVM. Our...