Xiangpeng Li

ORCID: 0000-0001-5350-5780
Research Areas
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Rough Sets and Fuzzy Logic
  • Human Pose and Action Recognition
  • Advanced Computational Techniques and Applications
  • Domain Adaptation and Few-Shot Learning
  • Video Analysis and Summarization
  • Disaster Management and Resilience
  • Video Surveillance and Tracking Methods
  • Infrastructure Resilience and Vulnerability Analysis
  • Data Mining Algorithms and Applications
  • Flood Risk Assessment and Management
  • Image Processing and 3D Reconstruction
  • Advanced Decision-Making Techniques
  • Security in Wireless Sensor Networks
  • Anomaly Detection Techniques and Applications
  • Extenics and Innovation Methods
  • Energy Efficient Wireless Sensor Networks
  • Time Series Analysis and Forecasting
  • Fire Detection and Safety Systems
  • Distributed Control Multi-Agent Systems
  • Statistical and Computational Modeling
  • Higher Education and Teaching Methods
  • Advanced Malware Detection Techniques
  • Advanced Measurement and Detection Methods

Texas A&M University
2023-2024

University of Electronic Science and Technology of China
2017-2023

Mitchell Institute
2023

National University of Defense Technology
2022

Chongqing University of Posts and Telecommunications
2021

Nanchang University
2019

Shanghai Jiao Tong University
2012

Wuhan Textile University
2010-2011

Wuhan University of Science and Technology
2005-2009

Wuhan University
2005-2008

Most recent progress on visual question answering is based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often time-consuming and have difficulty modeling long-range dependencies due to the sequential nature of RNNs. We propose a new architecture, Positional Self-Attention with Co-attention (PSAC), which does not require RNNs for video question answering. Specifically, inspired by the success of self-attention in the machine translation task, we calculate the response at each...

10.1609/aaai.v33i01.33018658 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17
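The abstract above is truncated, so the following is only a minimal sketch of generic self-attention over a sequence, not the PSAC model itself. It assumes identity query/key/value projections and borrows sinusoidal position codes from the standard Transformer; the function names (`positional_encoding`, `self_attention`, `attend_with_positions`) are hypothetical and do not come from the paper.

```python
import math


def positional_encoding(length, dim):
    """Sinusoidal position codes: sin on even indices, cos on odd indices."""
    table = []
    for pos in range(length):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections (assumption).

    Every output position is a weighted average of all input positions,
    so no recurrence over the sequence is needed.
    """
    dim = len(seq[0])
    outputs = []
    for query in seq:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in seq
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, seq)) for j in range(dim)]
        )
    return outputs


def attend_with_positions(features):
    """Add position codes to the features, then apply self-attention."""
    codes = positional_encoding(len(features), len(features[0]))
    enriched = [
        [f + p for f, p in zip(feature_row, code_row)]
        for feature_row, code_row in zip(features, codes)
    ]
    return self_attention(enriched)


# Hypothetical input: three frame (or word) feature vectors of dimension 4.
frames = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1], [0.0, 0.5, 0.5, 0.0]]
attended = attend_with_positions(frames)
```

Because each position attends to every other position directly, the distance between two positions does not affect how easily they interact, which is the property the abstract contrasts with the sequential nature of RNNs.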

Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., “gun” and “shooting”) and non-visual words (e.g., “the”, “a”). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning. Furthermore, the hierarchy of LSTMs enables more complex representation of visual data,...

10.1109/tpami.2019.2894139 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-01-01

Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed self-supervised video hashing (SSVH), which is able to capture the temporal nature of videos in an end-to-end learning fashion. We specifically address two central problems: 1) how to design an encoder-decoder...

10.1109/tip.2018.2814344 article EN IEEE Transactions on Image Processing 2018-03-09

Video visual question answering (V-VQA) remains challenging at the intersection of vision and language, as it requires joint comprehension of a video and a natural language question. The image-question co-attention mechanism, which aims at generating a spatial map highlighting image regions relevant to the question and vice versa, has obtained impressive results. Despite this success, simply applying it to videos results in unsatisfactory performance due to the complexity and temporal nature of videos. In this paper, we propose a novel architecture,...

10.1145/3343031.3350971 article EN Proceedings of the 27th ACM International Conference on Multimedia 2019-10-15

Visual question answering (VQA), which involves understanding an image and paired questions, has developed very quickly with the boost of deep learning in relevant research fields such as natural language processing and computer vision. Existing works rely heavily on the knowledge of the data set. However, some questions require more professional cues other than the data set knowledge to answer correctly. To address this issue, we propose a novel framework named knowledge-based augmentation network (KAN) for VQA. We introduce object-related...

10.1109/tnnls.2020.3017530 article EN IEEE Transactions on Neural Networks and Learning Systems 2020-09-17

Fully mining visual cues to aid in content understanding is crucial for video captioning. However, most state-of-the-art video captioning methods are limited to generating captions purely based on straightforward information while ignoring the scenario and context information. To fill the gap, we propose a novel, simple but effective scenario-aware recurrent transformer (SART) model to execute video captioning. Our model contains a “scenario understanding” module to obtain a global perspective across multiple frames, providing specific...

10.1145/3503927 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-03-04

Generating consecutive descriptions for videos, that is, video captioning, requires taking full advantage of visual representation along with the generation process. Existing video captioning methods focus on an exploration of spatial-temporal representations and their relationships to produce inferences. However, such methods only exploit the superficial association contained in a video itself without considering the intrinsic commonsense knowledge that exists in a video dataset, which may hinder their cognitive capabilities to reason accurate...

10.1109/tnnls.2023.3323491 article EN IEEE Transactions on Neural Networks and Learning Systems 2023-01-01

ChatGPT has been emerging as a novel information source, and it is likely that the public might seek information from ChatGPT while taking protective actions when facing climate hazards such as floods and hurricanes. The objective of this study is to evaluate the accuracy and completeness of responses generated by ChatGPT when individuals seek information about aspects of protective actions. The survey analysis results indicated that: (1) the emergency managers considered the responses provided by ChatGPT accurate and complete to a great extent; (2) it was statistically verified in the evaluations that the generated information was accurate, but lacked...

10.2139/ssrn.4408290 preprint EN 2023-01-01

Loitering detection can help recognize vulnerable people needing attention and potential suspects harmful to public security. Existing loitering detection methods used time or target trajectories as assessment criteria and handled only some simple loitering circumstances because of complex tracks. To solve these problems, this paper proposes a loitering detection method based on pedestrian activity area classification. The paper first gives a definition of loitering from a new perspective using the size of the pedestrian activity area. The loitering behaviors are divided into three categories....

10.3390/app9091866 article EN cc-by Applied Sciences 2019-05-07

Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning. Furthermore, the hierarchy of LSTMs enables more complex representation of visual data,...

10.48550/arxiv.1812.11004 preprint EN other-oa arXiv (Cornell University) 2018-01-01

This paper presents a new approach for constructing decision trees based on the variable precision rough set model. The presented approach is aimed at handling uncertain information during the process of inducing decision trees and generalizes the rough set based approach to decision tree construction by allowing some extent of misclassification when classifying objects. In the paper, weighted means are introduced. The algorithm effectively overcomes the influence of noisy data in structuring the decision tree, reduces the complexity of the tree, and strengthens its extensive ability.

10.1109/icnc.2008.88 article EN 2008-01-01
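The idea of "allowing some extent of misclassification" can be illustrated with the β-lower approximation from one common formulation of the variable precision rough set model. This is a minimal sketch under that assumption, not the paper's decision tree algorithm or its weighted mean measure; the function names, the attribute name `"a"`, and the toy data are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Group row indices that agree on every attribute in attrs."""
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(idx)
    return list(groups.values())


def beta_lower_approximation(rows, attrs, target, beta):
    """Union of equivalence classes whose share of members in target is >= beta.

    With beta = 1.0 this reduces to the classical rough set lower
    approximation; a smaller beta tolerates some misclassified objects.
    """
    lower = set()
    for block in equivalence_classes(rows, attrs):
        inclusion = sum(1 for i in block if i in target) / len(block)
        if inclusion >= beta:
            lower.update(block)
    return lower


# Hypothetical toy table: rows 0-4 share a == 0, rows 5-6 share a == 1.
rows = [{"a": 0}] * 5 + [{"a": 1}] * 2
target = {0, 1, 2, 3, 5}  # indices of objects in the concept of interest

tolerant = beta_lower_approximation(rows, ["a"], target, beta=0.75)
strict = beta_lower_approximation(rows, ["a"], target, beta=1.0)
```

In this toy table the first block has four of its five objects in the target, so it is accepted at `beta=0.75` but rejected at `beta=1.0`, which is how a single noisy object stops blocking a rule.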

Wireless sensor networks have been widely applied to various application domains such as environmental monitoring and surveillance. Because of its reliance on open transmission media, a sensor network may suffer from radio jamming attacks, which are easy to launch but difficult to defend against. Attacked by jamming signals, a sensor network may experience corrupted packets and low throughput. A number of defense techniques have been proposed. However, each technique is suitable for only a limited range of conditions. This paper proposes an adaptive approach...

10.1109/icc.2012.6364589 article EN 2012-06-01

It is a common fact that data (features, characteristics, or variables) are collected at different sampling frequencies in some fields such as economics and industry. Existing methods usually either ignore the differences among the sampling frequencies or hardly take notice of the inherent temporal characteristics of mixed frequency data. The authors propose an innovative dual attention-based neural network for mixed frequency data (MID-DualAtt), in order to utilize and select the input data reasonably without losing information. To the authors' knowledge, this is the first study to use...

10.1049/cit2.12013 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2021-03-17

Wireless sensor networks have been widely applied to various application domains such as environmental monitoring and surveillance. Because of its reliance on open transmission media, a sensor network may suffer from radio jamming attacks, which are easy to launch but difficult to defend against. Attacked by jamming signals, a sensor network may experience corrupted packets and low throughput. A number of defense techniques have been proposed. However, each technique is suitable for only a limited range of conditions. This paper proposes an adaptive approach...

10.1155/2012/485345 article EN cc-by International Journal of Distributed Sensor Networks 2012-12-17

Aiming at the problem that the counter propagation network (CPN) cannot make good use of its neurons, a rough CP neural network model based on rough sets is proposed. It changes the winner-takes-all strategy and decides output values according to the rough membership function, which expresses the degree to which an element belongs to a set. Experiments show that the approach can solve some problems found in other neural networks, for example, sensitivity to sample size and quality, which directly influence accuracy. While reducing training time, prediction precision can be greatly improved.

10.1109/icnc.2007.133 article EN 2007-01-01

A new thresholding algorithm and its multi-threshold extension are presented to improve the performance of image segmentation. The probabilistic rough set is used to describe the objects and backgrounds in an image. The optimal threshold is given through a minimum boundary criterion. The membership function is defined in light of the requirements of thresholding.

10.1109/grc.2005.1547253 article EN 2005-01-01

Visual dialog aims to answer several consecutive questions based on the image and the dialog history. Most works resolve all questions with ambiguous references (e.g., "she") by the dialog history, which generates redundant information and yields inaccurate results. Also, they regard this task as a classification task, which ignores the diversity of response answers and results in poor generalization capability. To tackle these problems, we propose a novel Context Gating with Multi-level Ranking Learning (CGMRL). Specifically, the proposed context...

10.1109/icme52920.2022.9859849 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

Based on rough set theory, a counter propagation neural network algorithm for edge detection is presented in this paper. First, a definition of the rough membership function, which is used to modify the weight values of the normal counter propagation network, is proposed after introducing rough sets. Experiments show that the approach achieves good results in improving the accuracy of edge detection. It can also effectively overcome the problem of simple clustering in the network.

10.1109/grc.2007.23 article EN 2007 IEEE International Conference on Granular Computing (GRC 2007) 2007-11-01