- Video Surveillance and Tracking Methods
- Advanced Image Fusion Techniques
- Railway Systems and Energy Efficiency
- Image and Signal Denoising Methods
- Advanced Measurement and Detection Methods
- Speech Recognition and Synthesis
- Medical Image Segmentation Techniques
- Sentiment Analysis and Opinion Mining
- Image Enhancement Techniques
- Advanced Text Analysis Techniques
- Remote Sensing and Land Use
- Metaheuristic Optimization Algorithms Research
- Phonetics and Phonology Research
- Image Processing Techniques and Applications
- Advanced Algorithms and Applications
- Simulation and Modeling Applications
- Speech and Audio Processing
- Topic Modeling
- Advanced Vision and Imaging
- Evaluation Methods in Various Fields
- Speech and Dialogue Systems
- Remote-Sensing Image Classification
- Innovative Educational Techniques
- Human Pose and Action Recognition
- Face and Expression Recognition
Lanzhou Jiaotong University
2015-2025
Shenzhen Institutes of Advanced Technology
2024-2025
Chinese Academy of Sciences
2024-2025
Gansu Research Institute of Chemical Industry
2024
Tianjin University
2012-2024
Peng Cheng Laboratory
2023
Japan Advanced Institute of Science and Technology
2008-2021
Hexi University
2012-2021
Nankai University
2021
Technical University of Darmstadt
2021
After passing through the deep network, some pedestrian information is lost, which causes gradients to vanish and leads to inaccurate detection. This paper improves the network structure of the YOLO algorithm and proposes a new network, YOLO-R. First, three Passthrough layers were added to the original network. The Passthrough layer consists of a Route layer and a Reorg layer. Its role is to connect shallow features to deep features and to link high- and low-resolution features. The Route layer passes the characteristic information of a specified layer to the current layer, and the Reorg layer is then used to reorganize the feature...
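As a rough illustration of what a Passthrough layer of this kind does, the sketch below implements a generic space-to-depth Reorg followed by a channel-wise Route (concatenation) on nested lists. It is a minimal sketch under stated assumptions, not the YOLO-R implementation: the function names `reorg` and `route`, the stride of 2, the channel ordering, and the toy feature maps are all hypothetical and are not taken from the paper or from Darknet.

```python
def reorg(feature, stride=2):
    """Space-to-depth: (C, H, W) -> (C * stride**2, H // stride, W // stride).

    Assumption: a generic channel ordering, not Darknet's exact layout.
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    if height % stride or width % stride:
        raise ValueError("height and width must be divisible by stride")
    out = []
    for dy in range(stride):
        for dx in range(stride):
            for c in range(channels):
                # Take every stride-th pixel, starting at offset (dy, dx).
                out.append([
                    [feature[c][y][x] for x in range(dx, width, stride)]
                    for y in range(dy, height, stride)
                ])
    return out


def route(*features):
    """Concatenate feature maps along the channel axis (same H and W required)."""
    height = len(features[0][0])
    width = len(features[0][0][0])
    merged = []
    for feature in features:
        if len(feature[0]) != height or len(feature[0][0]) != width:
            raise ValueError("spatial sizes differ; reorg the larger map first")
        merged.extend(feature)
    return merged


# Hypothetical toy inputs: a shallow 1x4x4 map and a deep 2x2x2 map.
shallow = [[[r * 4 + c for c in range(4)] for r in range(4)]]
deep = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]

# The shallow map becomes 4x2x2, then is stacked with the deep map: 6x2x2.
fused = route(reorg(shallow), deep)
```

Reorganizing before concatenation keeps every shallow-layer value while matching the spatial size of the deeper map, which is what lets high- and low-resolution features be combined.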
The detection of rail surface defects is an important part of daily railway inspection. In line with the real-time and adaptability requirements of modern automatic detection technology, this paper presents a method for rail surface defect detection based on machine vision. According to the basic principle of machine vision, an image acquisition device equipped with an LED auxiliary light source and a shading box has been designed, and a portable testing model has been built to carry out field experiments. In view of the real-time requirement, extracting the target area from the original image is carried out without...
In remote sensing change detection (RSCD) tasks, high-resolution images can provide fine image details and complex texture features. Deep learning based RSCD methods have achieved state-of-the-art (SOTA) performance. However, most of these methods only consider vertical multiscale features when extracting features, while ignoring the problem of contextual loss due to scale changes. Hence, in order to overcome the above issues, a deeper encoding-decoding feature fusion network (DMEDNet) is proposed in this...
Object 6D pose estimation, as a key technology in applications such as augmented reality (AR), virtual reality (VR), robotics, and autonomous driving, requires robustly predicting the 3D position and pose of objects from complex scene images. However, environmental factors such as occlusion, noise, weak texture, and lighting changes may affect the accuracy and robustness of object pose estimation. We propose a robust CoS-PVNet (complex scenarios pixel-wise voting network) pose estimation network for complex scenes. By adding a pixel-weight layer based on...
The existing learning resource recommendation systems suffer from data sparsity and missing labels, leading to insufficient mining of the correlation between users and courses. To address these issues, we propose a learning resource recommendation method based on graph contrastive learning, which uses graph contrastive learning to construct an auxiliary task combined with the main recommendation task, achieving joint recommendation of learning resources. Firstly, the interaction bipartite graph of users and courses is input into a lightweight graph convolutional network, and the embedded representation of each node in the graph is obtained after...
Recently, emotional speech generation and speaker cloning have garnered significant interest in text-to-speech (TTS). With the open-sourcing of codec language TTS models trained on massive datasets with large-scale parameters, adapting these general pre-trained TTS models to generate speech with specific emotional expressions and target speaker characteristics has become a topic of great attention. Common approaches, such as full fine-tuning and adapter-based fine-tuning, often overlook the specific contributions of model parameters to emotion and speaker control. Treating all...
Multimodal Sentiment Analysis (MSA) stands as a critical research frontier, seeking to comprehensively unravel human emotions by amalgamating text, audio, and visual data. Yet, discerning subtle emotional nuances within audio and video expressions poses a formidable challenge, particularly when emotional polarities across various segments appear similar. In this paper, our objective is to spotlight emotion-relevant attributes of the audio and visual modalities to facilitate multimodal fusion in the context of nuanced emotional shifts in visual-audio...
The Bacterial Foraging Optimization Algorithm is a swarm intelligence optimization algorithm. This paper first analyzes the chemotaxis operation, as well as the elimination and dispersal operation, based on the basic Bacterial Foraging Optimization Algorithm. The elimination and dispersal operation makes a bacterium which has found, or nearly found, an optimal position escape away from that position, which greatly affects the convergence speed of the algorithm. In order to avoid this escape, the sphere of action of the elimination and dispersal operation can be altered in accordance with the generations of evolution. Secondly, we put forward an algorithm of adaptive...
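For wireless sensor networks, the localization algorithm based on the Voronoi diagram has been applied. However, the location accuracy of the node position in the network needs to be optimized. Based on an analysis of the literature, a localization algorithm based on the Voronoi diagram and support vector machine is proposed in this article. The basic idea of the algorithm is to first divide the region into several parts using the Voronoi diagram and the anchor nodes in the region. The range of the initial position of the target node is obtained by locating the target node in each region, and the support vector machine is then used to optimize the position of the target node accurately. The performance of the algorithm is analyzed by simulation and real-world experiments. The experimental results show that...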
Extended reality (XR) is a general term for virtual reality (VR), augmented reality (AR), and mixed reality (MR). By converting abstract digital expressions into intelligent feedback through figures, one can effectively compensate for the poor performance of traditional learning in deep cognitive processing and operational skills training. However, extant results are uncertain, and only a limited number of studies have investigated the influence mechanism of the heterogeneity among VR, AR, and MR on procedural knowledge learning, higher-level...
Recently, progress has been made towards improving automatic sarcasm detection in computer science. Among existing models, manually constructing static graphs for texts and then using graph neural networks (GNNs) is one of the most effective approaches for drawing long-range incongruity patterns. However, the constructed graph structure might be prone to errors (e.g., noisy or incomplete) and not optimal for the sarcasm detection task. Errors produced during the graph construction step cannot be remedied and may accrue to the following stages, resulting in poor...
Aiming at the problems of high time overhead, low positioning accuracy, and inability to meet the requirements of indoor positioning applications in WiFi and Pedestrian Dead Reckoning (PDR) technologies, a cross-layer positioning method based on multi-sensor and fingerprint fusion is proposed. Firstly, Multi-dimensional scaling technology (MDS) is used for WiFi fingerprint location, which reduces the time overhead of the offline stage of large-area location and improves the responsiveness of online location. Without limiting the holding mode, threshold detection is used to reduce the error of gait detection in PDR...
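For readers unfamiliar with PDR, the position update it relies on can be sketched as below. This is the textbook dead-reckoning step only, not the fusion method of the paper: the function names, the heading convention (radians, measured clockwise from north), and the sample step lengths are assumptions made for illustration.

```python
import math


def pdr_step(position, step_length, heading_rad):
    """Advance an (east, north) position by one detected step.

    Assumption: heading is in radians, clockwise from north.
    """
    east, north = position
    return (
        east + step_length * math.sin(heading_rad),
        north + step_length * math.cos(heading_rad),
    )


def pdr_track(start, steps):
    """Accumulate a path from (step_length, heading_rad) pairs."""
    track = [start]
    for step_length, heading_rad in steps:
        track.append(pdr_step(track[-1], step_length, heading_rad))
    return track


# Hypothetical walk: one 0.7 m step north, then one 0.7 m step east.
path = pdr_track((0.0, 0.0), [(0.7, 0.0), (0.7, math.pi / 2)])
```

Because each position is built on the previous one, errors in step detection or heading accumulate along the path, which is why PDR is usually combined with an absolute source such as WiFi fingerprinting.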
The nonrigid registration algorithm based on B-spline Free-Form Deformation (FFD) plays a key role and is widely applied in medical image processing due to its good flexibility and robustness. However, it requires a tremendous amount of computing time to obtain more accurate registration results, especially for large amounts of medical image data. To address the issue, a parallel nonrigid registration algorithm is proposed in this paper. First, the Logarithm Squared Difference (LSD) is considered as the similarity metric to improve registration precision. After that, we create a strategy of lookup tables (LUTs)...
Sarcasm is widely utilized on social media platforms such as Twitter and Reddit. Sarcasm detection is required for analyzing people's true feelings since sarcasm is commonly used to portray a reversed emotion opposing the literal meaning. The syntactic structure is the key to making better use of commonsense when detecting sarcasm. However, it is extremely challenging to effectively and explicitly explore the information implied in syntactic structure and commonsense simultaneously. In this paper, we apply the pre-trained COMET model to generate relevant commonsense knowledge, and a novel...
Multi-focus image fusion is a process of fusing multiple images with different focus areas into a total focus image, which has important application value. In view of the defects of current fusion methods in retaining the detail information of the original image, a fusion architecture based on two stages is designed. In the training phase, combined with the polarized self-attention module and the DenseNet network structure, an encoder-decoder structure network is designed for image reconstruction tasks to enhance the feature extraction ability of the model. In the fusion stage, the encoded feature map,...
In order to solve the problem that the mutual information function easily falls into local optimal values because of the many local extrema in the registration method, a multi-resolution medical image registration algorithm based on the firefly algorithm and the Powell algorithm is put forward in this paper. The normalized mutual information is used as the similarity measure in the algorithm, and a multi-resolution strategy based on wavelet transformation is adopted in the process of searching for the best value. At the lower resolution an imprecise registration result is obtained, and at the higher resolution a better result. The experimental results show that the algorithm can effectively overcome local extrema and get...
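The similarity measure named here can be illustrated with a small sketch. It assumes one common definition of normalized mutual information, NMI(A, B) = (H(A) + H(B)) / H(A, B), computed from intensity histograms; the abstract does not state which normalization the paper uses, and the function names and toy pixel lists below are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as a Counter."""
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def normalized_mutual_information(image_a, image_b):
    """NMI of two equally sized images given as flat lists of intensities.

    Assumption: NMI = (H(A) + H(B)) / H(A, B), which ranges from 1
    (independent) to 2 (one image fully determines the other).
    """
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    total = len(image_a)
    h_a = entropy(Counter(image_a), total)
    h_b = entropy(Counter(image_b), total)
    h_ab = entropy(Counter(zip(image_a, image_b)), total)
    if h_ab == 0.0:
        # Both images are constant; returning 1.0 is a convention chosen here.
        return 1.0
    return (h_a + h_b) / h_ab


# Hypothetical 2x2 images flattened to lists.
reference = [0, 0, 1, 1]
aligned = [0, 0, 1, 1]      # identical to the reference
unrelated = [0, 1, 0, 1]    # statistically independent of the reference

nmi_aligned = normalized_mutual_information(reference, aligned)
nmi_unrelated = normalized_mutual_information(reference, unrelated)
```

A registration optimizer would evaluate a measure like this for each candidate transformation and keep the transformation with the highest value; the presence of many local maxima in that search is the problem the abstract describes.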
The Chinese Train Control Systems (CTCS) have five levels from 0 to 4 based on the levels of the European Train Control Systems (ETCS). The complicated and redundant structures ensure the reliability and safety of CTCS. However, the requirements of the next-generation train control system to optimize the structure and reduce the cost of investment and maintenance are hardly met. First, the state of the art of CTCS was summarized in this study. Second, the characteristics of typical projects, namely, NGTC, SHIFT2RAIL, Positive Train Control, Rail Traffic Management System-Regional, Urbalis Fluence,...