Xiaodong Yu

ORCID: 0000-0003-0826-1056
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Video Surveillance and Tracking Methods
  • Video Analysis and Summarization
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Advanced Computational Techniques and Applications
  • Robotics and Sensor-Based Localization
  • Advanced Vision and Imaging
  • Advanced Neural Network Applications
  • Image and Signal Denoising Methods
  • Advanced Image Fusion Techniques
  • Handwritten Text Recognition Techniques
  • Remote-Sensing Image Classification
  • Face recognition and analysis
  • Multimodal Machine Learning Applications
  • Power Transformer Diagnostics and Insulation
  • Music and Audio Processing
  • Domain Adaptation and Few-Shot Learning
  • Image Processing and 3D Reconstruction
  • Rough Sets and Fuzzy Logic
  • Metallurgy and Material Science
  • Advanced Decision-Making Techniques
  • Image Enhancement Techniques
  • Human Motion and Animation

Qingdao University of Science and Technology
2025

Sanda University
2022-2024

Shandong Electric Power Engineering Consulting Institute Corp
2023

China Power Engineering Consulting Group (China)
2023

Changzhou Institute of Technology
2017-2020

Shandong University
2017-2019

Harbin Engineering University
2019

Virginia Tech
2016-2018

Wuhan University
2013

Comcast (United States)
2011-2013

This paper presents an end-to-end learning architecture for video-based person re-identification by integrating convolutional neural networks (CNNs) and bidirectional recurrent neural networks (BRNNs). Given a video with consecutive frames, the features of each frame are extracted by the CNN and then fed into the BRNN to get the final spatio-temporal representation of the video. Specifically, the CNN acts as a spatial feature extractor, while the BRNN is expected to capture the temporal cues of sequential frames in both forward and backward directions,...

10.1109/tcsvt.2017.2718188 article EN IEEE Transactions on Circuits and Systems for Video Technology 2017-06-21

For a Self-Driving Sweeping Bot (SDSB) to sweep efficiently, visual images of the sweeping targets must be collected first so that the required objects can be distinguished from other noise. In this work, three categories of objects, including fallen leaves, speed bumps, and manhole covers, are involved in the training and validation phases. For real-time object detection, this work further investigated the one-stage YOLOv5 learning approach in four versions, v5s, v5m, v5l, and v5x, wherein v5s, in terms of its benefits...

10.1109/icot56925.2022.10008164 article EN 2022-11-10

In this paper, we propose a novel deep neural network based attention model to learn the representative local regions from a video sequence for person re-identification. Specifically, a multi-scale spatial-temporal attention (MSTA) model measures each frame at different scales from the perspective of the whole sequence. Compared with traditional temporal models, MSTA focuses on exploiting the importance of the representation in both the spatial and temporal domains. A new training strategy is designed for the proposed model by incorporating an image-to-image mode with...

10.1109/tip.2019.2959653 article EN IEEE Transactions on Image Processing 2019-12-25

Multi-person pose estimation is an attractive and challenging task. Existing methods are mostly based on two-stage frameworks, which include top-down and bottom-up methods. Two-stage methods either suffer from high computational redundancy because of additional person detectors, or they need to group keypoints heuristically after predicting all the instance-agnostic keypoints. The single-stage paradigm aims to simplify the multi-person pose estimation pipeline and receives a lot of attention. However, recent single-stage methods have the limitation of low...

10.1145/3474085.3475447 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17

This paper presents a novel approach to utilizing high-level knowledge for the problem of scene recognition in an active vision framework, which we call active scene recognition. In traditional approaches, high-level knowledge is used in post-processing to combine the outputs of object detectors and achieve better classification performance. In contrast, the proposed approach employs high-level knowledge actively by implementing an interaction between a reasoning module and a sensory module (Figure 1). Following this paradigm, we implemented an active scene recognizer and evaluated it with a dataset of 20 scenes and 100+...

10.1109/iccv.2011.6126320 article EN International Conference on Computer Vision 2011-11-01

The algebraic reconstruction technique (ART) is an iterative algorithm for computed tomography (CT) image reconstruction. Due to its high computational cost, researchers turn to modern HPC systems with GPUs to accelerate the ART algorithm. However, existing proposals suffer from inefficient designs of the compressed data structure and the kernel on GPUs. In this paper, we identify the patterns in ART as the product of a sparse matrix (and its transpose) with multiple vectors (SpMV and SpMV_T). Because implementations well-tuned...

10.1109/ccgrid.2016.96 article EN 2016-05-01
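
The ART entry above describes the iteration in terms of sparse matrix products. As a minimal sketch, assuming the textbook Kaczmarz form of ART and a CSR layout (the names `art_solve`, `indptr`, `indices`, `data`, and `relax` are illustrative and not taken from the paper), each row update needs one sparse dot product (one row of an SpMV) followed by a scaled scatter of that row into the estimate (the matching column of an SpMV_T). This is plain CPU Python, not the GPU kernel the abstract refers to.

```python
def art_solve(indptr, indices, data, b, n_cols, sweeps=50, relax=1.0):
    """Kaczmarz-style ART for A x = b, with A stored in CSR form.

    indptr, indices, data: standard CSR arrays of the system matrix A.
    b: measured projections, one value per row of A.
    relax: relaxation factor (assumed here; 1.0 is the unrelaxed update).
    """
    x = [0.0] * n_cols
    for _ in range(sweeps):
        for i in range(len(b)):
            start, end = indptr[i], indptr[i + 1]
            row_norm_sq = sum(data[k] * data[k] for k in range(start, end))
            if row_norm_sq == 0.0:
                continue  # an empty row carries no constraint
            # Forward projection of the current estimate along row i (SpMV row).
            dot = sum(data[k] * x[indices[k]] for k in range(start, end))
            # Back-project the residual along the same row (SpMV_T column).
            scale = relax * (b[i] - dot) / row_norm_sq
            for k in range(start, end):
                x[indices[k]] += scale * data[k]
    return x


# Hypothetical 3x3 system used only for illustration:
# A = [[2, 0, 1], [0, 3, 0], [1, 0, 4]], b = A @ [1, 2, 3]
example_x = art_solve(
    indptr=[0, 2, 3, 5],
    indices=[0, 2, 1, 0, 2],
    data=[2.0, 1.0, 3.0, 1.0, 4.0],
    b=[5.0, 6.0, 13.0],
    n_cols=3,
)
```

Storing rows contiguously, as CSR does, keeps the forward projection cheap; the transpose access pattern is the part a tuned implementation would have to treat separately.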

In this paper, we propose a robust moving video object segmentation algorithm using features in the MPEG compressed domain. We first cluster the motion vectors and produce a motion mask. Then, a difference mask at 8 x 8 block size is extracted from the DC image by the background subtraction method. Finally, the two masks are combined conditionally to generate the final mask, based on a set of rules that are application specified and obtained with learning or heuristic methods. The experimental results show that the proposed scheme is more robust than those that use only one of these features.

10.1109/icip.2003.1247399 article EN 2004-06-03

In this demonstration, we present a unified framework for semantic shot classification in sports videos. Unlike previous approaches, which focus on clustering by aggregating shots with similar low-level features, the proposed scheme makes use of the domain knowledge of a specific sport to perform top-down video classification, including the identification of classes for each sport and supervised learning with the given middle-level features extracted from the video. It is observed that we can predefine a small number of classes,...

10.1145/641007.641096 article EN 2002-12-01

In problems where there is not enough data to use deep learning, Bag-of-Visual-Words (BoVW) is still a good alternative for image classification. In the BoVW model, many pooling methods have been proposed to incorporate the spatial information of local features into the representation vector, but none is devoted to making each visual word have its own regions. The practice of designing the same regions for all words restrains the discriminability of the representation, since the distributions of features indexed by different words are not the same. In this paper, we...

10.1371/journal.pone.0234144 article EN cc-by PLoS ONE 2020-06-05

Performance variability can have a significant impact on many applications of computing. The cloud computing, high performance computing, and computer security communities each exert considerable effort managing and analyzing variability throughout the system stack. This work presents and evaluates a methodology for predicting the precise characteristics of computational performance variability for an input/output (I/O) application over varying configurations. Results demonstrate that the presented methodology is capable of precisely modeling variability, which could allow tighten...

10.1109/secon.2018.8478814 article EN SoutheastCon 2018-04-01

The support vector machine (SVM) is a novel learning method of recent years, and the SVM with an RBF kernel is widely used in pattern recognition because of its good properties. If applied to risk assessment, it will give better assessment results. But the performance of the RBF-SVM is influenced greatly by the parameters C and sigma. The principle and essence of the kernel function are introduced in this paper. This paper analyses the influence of the parameters C and sigma on the RBF-SVM, and then presents curves showing how they affect the number of support vectors (SV) and the error rate. The results indicate...

10.1109/wicom.2008.1110 article EN 2008-10-01
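
The RBF-SVM entry above turns on the kernel width sigma. As a minimal sketch, assuming the common parameterization K(x, y) = exp(-||x - y||^2 / (2 sigma^2)) (the abstract does not state its formula, and the function names here are illustrative), the snippet shows how sigma controls how quickly similarity decays with distance. The penalty C belongs to the SVM training objective and does not appear in the kernel itself.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel between two equal-length vectors.

    Assumed form: exp(-||x - y||^2 / (2 * sigma^2)).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, sigma):
    """Kernel (Gram) matrix for a list of points."""
    return [[rbf_kernel(p, q, sigma) for q in points] for p in points]


# Two points one unit apart, compared at a narrow and a wide kernel width.
narrow = rbf_kernel([0.0, 0.0], [1.0, 0.0], sigma=0.5)
wide = rbf_kernel([0.0, 0.0], [1.0, 0.0], sigma=2.0)
```

With a small sigma the two points look almost unrelated, and with a large sigma they look nearly identical, which is one reason the number of support vectors and the error rate can both shift as sigma changes.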

Transformer fault diagnosis based on artificial neural networks (ANN) is widely used, because ANN has an essentially nonlinear character, parallel processing ability, and the ability of self-organizing learning. But problems exist if we use the traditional method alone to diagnose transformer faults: a large input vector dimension and a complex training database cause the computational complexity and space to increase greatly, and lead to long training time, slow convergence, and low judgement accuracy. In this paper, a hybrid combining rough...

10.1109/cmd.2008.4580517 article EN International Conference on Condition Monitoring and Diagnosis 2008-01-01

In this paper, we are interested in detecting action attributes from sports videos for event understanding and video analysis. An action attribute is a middle layer between low-level motion features and high-level action classes, which includes various motion patterns of human limbs and bodies and the interaction with objects. Successfully detecting action attributes provides a richer description that facilitates many other important tasks, such as classification, understanding, automatic transcript, etc. A naive approach to deal with this challenging problem is to train...

10.5244/c.27.79 article EN 2013-01-01

Spatial pyramid is a very popular method for preserving the spatial information of local features, which partitions an image into multiple blocks at different resolution levels. Nevertheless, the partitioning strategy is designed by hand and is the same for all codewords in the codebook. To address this problem, we propose a novel partition method named discriminative spatial tree (ST), and implement it from two viewpoints: class and image. Discriminative ST builds one tree for each codeword, and each node corresponds to a region. For better performance...

10.1109/smartcloud.2017.53 article EN 2017-11-01
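
The spatial-tree entry above starts from the standard spatial pyramid. As a minimal sketch of that hand-designed baseline (not the proposed discriminative spatial tree), assuming each local feature is already quantized to a codeword index and given as a hypothetical `(x, y, word)` tuple, the function below concatenates per-block codeword histograms over grids of 1x1, 2x2, 4x4, and so on. Per-level weights, which some pyramid variants apply, are omitted.

```python
def spatial_pyramid_histogram(features, width, height, codebook_size, levels=2):
    """Concatenate per-block codeword histograms over a fixed grid pyramid.

    features: iterable of (x, y, word) with 0 <= x < width, 0 <= y < height,
              and 0 <= word < codebook_size.
    levels:   finest level index; level l splits the image into 2^l x 2^l blocks.
    """
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level
        level_hist = [0] * (cells * cells * codebook_size)
        for x, y, word in features:
            # Map the pixel position to its block at this resolution level.
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            level_hist[(cy * cells + cx) * codebook_size + word] += 1
        pyramid.extend(level_hist)
    return pyramid


# Two features in opposite corners of a 10x10 image, codebook of size 2.
example_hist = spatial_pyramid_histogram(
    [(1, 1, 0), (9, 9, 1)], width=10, height=10, codebook_size=2, levels=1
)
```

Every codeword shares the same grid here, which is exactly the limitation the abstract points to when it motivates building a separate tree per codeword.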

The previous work reported the importance of visual intelligence for the Self-Driving Sweeping Bot (SDSB) to achieve pedestrian security and sweeping efficiency, in which machine vision for object detection and semantic segmentation of vehicles, pedestrians, and rubbish on the road would be further investigated. Subsequently, this work proposed an optimized HSV encoding framework embedded with morphology operations to examine the campus road. Furthermore, considering complex weather effects caused performed partially...

10.1109/icot56925.2022.10008157 article EN 2022-11-10