- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Image Retrieval and Classification Techniques
- Human Pose and Action Recognition
- Remote-Sensing Image Classification
- Multimodal Machine Learning Applications
- Advanced Neural Network Applications
- Generative Adversarial Networks and Image Synthesis
- Face Recognition and Analysis
- Domain Adaptation and Few-Shot Learning
- Advanced Image Fusion Techniques
- Digital Media Forensic Detection
- Image Enhancement Techniques
- 3D Shape Modeling and Analysis
- Video Analysis and Summarization
- Topic Modeling
- Machine Fault Diagnosis Techniques
- Context-Aware Activity Recognition Systems
- Anomaly Detection Techniques and Applications
- Image and Signal Denoising Methods
- Meteorological Phenomena and Simulations
- Oceanographic and Atmospheric Processes
- Gait Recognition and Analysis
- Advanced Graph Neural Networks
- Gear and Bearing Dynamics Analysis
Ocean University of China
2012-2025
Changsha University of Science and Technology
2019-2024
China National Petroleum Corporation (China)
2023-2024
Tsinghua University
2014-2017
University of Pittsburgh
2010-2015
Jiangxi Science and Technology Normal University
2011
Image captioning is one of the most challenging tasks in AI because it requires an understanding of both complex visuals and natural language. Because image captioning is essentially a sequential prediction task, recent advances have used reinforcement learning (RL) to better explore the dynamics of word-by-word generation. However, existing RL-based methods rely primarily on a single policy network and reward function, an approach that is not well matched to the multi-level (word and sentence) and multi-modal (vision and language) nature...
Fault diagnosis for rolling bearings has been an important engineering problem for decades. To detect the damaged surface, engineers analyze features from the extracted vibration signals of the machine. As artificial intelligence rapidly develops and provides favorable effects in data analytics, using deep learning technology to attack fault diagnosis problems has attracted increasing research interest in recent years. However, most existing methods do not provide satisfactory performance in mining the relationship...
In this paper, we propose the multi-view saliency guided deep neural network (MVSG-DNN) for 3D object retrieval and classification. This method mainly consists of three key modules. First, the module of model projection rendering is employed to capture multiple views of one 3D object. Second, the module of visual context learning applies basic Convolutional Neural Networks for feature extraction of the individual views and then employs an LSTM to adaptively select the representative views based on the visual context. Finally, with this information, representation...
With the development of both hardware and deep neural network technologies, tremendous improvements have been achieved in the performance of automatic emotion recognition (AER) based on video data. However, AER is still a challenging task due to subtle expression, abstract concept, and the representation of multi-modal information. Most proposed approaches focus on feature learning and fusion strategy, which pay more attention to the characteristics of a single video and ignore the correlation among videos. To explore this correlation, in this paper,...
Temporal action localization is currently an active research topic in computer vision and machine learning due to its usage in smart surveillance. It is a challenging problem since the categories of the actions must be classified in untrimmed videos and the start and end of the actions need to be accurately found. Although many temporal action localization methods have been proposed, they require substantial amounts of computational resources for the training and inference processes. To solve these issues, in this work, a novel temporal-aware relation and attention network...
2D image-based 3D model retrieval is a challenging research topic in the field of 3D model retrieval. The huge gap between the two modalities, 2D image and 3D model, extremely constrains the retrieval performance. In order to handle this problem, we propose a novel multi-branch graph convolution network (M-GCN) to address the 3D model retrieval problem. First, we compute the similarity based on visual information to construct one cross-modalities graph, which can provide the original relationship between the 2D image and the 3D model. However, this relationship is not accurate because of the difference between the modalities. Thus, the multi-head attention...
Low-order features based on convolution kernels are easily distorted when encountering dramatic view-angle transformation and atmospheric scattering in remote sensing (RS) images. To address this concern, this article first proposes to operate the semantic segmentation of RS images based on high-order information, which can represent the relative relationship of low-order features and is robust and stable when suffering low-order feature distortion. Besides, decouples have recently been well researched and achieved significant improvement in image...
Motivation: Automated identification of thoracic diseases from chest X-ray (CXR) images is a significant area in computer-aided diagnosis. However, most existing methods have limited ability to extract multi-scale features and accurately capture the spatial location of lesions when dealing with thoracic diseases that exhibit concurrency and large variations in lesion size. Method: Based on the above problems, we propose a multi-level residual feature fusion network (MLRFNet) for classifying thoracic diseases. Our approach can quickly...
The shale gas development process is complex in terms of its flow mechanisms, and the accuracy of production forecasting is influenced by geological parameters and engineering parameters. Therefore, to quantitatively evaluate the relative importance of model parameters on production performance, sensitivity analysis of the parameters is required. The parameters are ranked according to the sensitivity coefficients for the subsequent optimization scheme design. A data-driven global sensitivity analysis (GSA) method using convolutional neural networks (CNN) is proposed to identify the parameters influencing shale gas production. The CNN is trained on a...
As we all know, semantic segmentation of remote sensing (RS) images is to classify the images pixel by pixel to realize the semantic decoupling of the images. Most traditional semantic decoupling methods only decouple and do not perform scale-separation operations, which leads to serious problems. In the semantic decoupling process, if the feature extractor is too large, it will ignore small-scale targets; if the feature extractor is too small, it will lead to the separation of large-scale target objects and reduce the segmentation accuracy. To address this concern, we propose a Scale-separated Semantic Decoupled Transformer (SSDT), which first performs scale-separation in...
Visual question answering (VQA) is a challenging task that requires models to understand both visual and linguistic inputs and produce accurate answers. However, VQA models often exploit biases in datasets to make predictions rather than reasoning based on the inputs. Prior approaches to debiasing have suggested the implementation of a supplementary model, deliberately designed to exhibit bias, which subsequently informs the training of a resilient target model. Nevertheless, such techniques merely quantify the model’s divergence...
Recently, wearable computers have become new members in the family of mobile electronic devices, adding new functions to those provided by smart-phones and tablets. As "always-on" miniature computers in the personal space, they will play increasing roles in the field of healthcare. In this work, we present our development of eButton, a wearable computer designed as a personalized, attractive, and convenient chest pin in a circular shape. It contains a powerful microprocessor, numerous sensors, and wireless communication links. We describe its design...
In order to study the direct shear properties of ultra-high performance concrete (UHPC) structures, 15 Z-shaped monolithic placement specimens (MPSs) and 12 Z-shaped waterjet treated specimens (WJTSs) were tested to study the shear behavior and failure modes. The effects of steel fiber shape, steel fiber volume fraction, and interface treatment on the direct shear properties of UHPC were investigated. The test results demonstrate that the MPSs reinforced with steel fibers underwent ductile failure. The ultimate load of the MPS is about 166.9% of the initial cracking load. However, the WJTSs failed in a typical brittle...
In this work, we aim to investigate the practical task of flexible fashion search with attribute manipulation, where users can retrieve the target fashion items by replacing the unwanted attributes of an available query image with the desired ones (e.g., changing the collar attribute from v-neck to round). Although several pioneer efforts have been dedicated to fulfilling the task, they mainly ignore the potential of generative models in enhancing the visual understanding of the target fashion items. To this end, we propose an end-to-end generative attribute manipulation scheme, which consists of a generator...