Wei Mao

ORCID: 0000-0002-8876-8983
Research Areas
  • Human Motion and Animation
  • Human Pose and Action Recognition
  • Video Analysis and Summarization
  • Advanced Vision and Imaging
  • Optical measurement and interference techniques
  • Anomaly Detection Techniques and Applications
  • Advanced Algorithms and Applications
  • Robotics and Sensor-Based Localization
  • Infrared Target Detection Methodologies
  • Image Processing Techniques and Applications
  • Advanced Sensor and Control Systems
  • Image Retrieval and Classification Techniques
  • Computer Graphics and Visualization Techniques
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Aerosol Filtration and Electrostatic Precipitation
  • Gait Recognition and Analysis
  • Hand Gesture Recognition Systems
  • Advanced Image Processing Techniques
  • Laser Material Processing Techniques
  • Greenhouse Technology and Climate Control
  • Wireless Signal Modulation Classification
  • Diversity and Impact of Dance
  • Advanced Neural Network Applications
  • Rough Sets and Fuzzy Logic

Affiliated Eye Hospital of Wenzhou Medical College
2025

Wenzhou Medical University
2025

Australian National University
2019-2023

Xi'an University of Technology
2021-2023

China Academy of Launch Vehicle Technology
2021-2022

Beijing Institute of Technology
2019

Numerical Method (China)
2019

IBM (United States)
2012

Jiangsu University
2012

Tongji University
2012

Human motion prediction, i.e., forecasting future body poses given an observed pose sequence, has typically been tackled with recurrent neural networks (RNNs). However, as evidenced by prior work, the resulting RNN models suffer from prediction error accumulation, leading to undesired discontinuities in motion prediction. In this paper, we propose a simple feed-forward deep network for motion prediction, which takes into account both temporal smoothness and spatial dependencies among human joints. In this context, we then propose to encode...

10.1109/iccv.2019.00958 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

We propose a cost volume-based neural network for depth inference from multi-view images. We demonstrate that building a cost volume pyramid in a coarse-to-fine manner instead of constructing a cost volume at a fixed resolution leads to a compact, lightweight network and allows us to infer high-resolution depth maps to achieve better reconstruction results. To this end, we first build a cost volume based on uniform sampling of fronto-parallel planes across the entire depth range at the coarsest resolution of an image. Then, given the current depth estimate, we construct new cost volumes iteratively on the pixelwise...

10.1109/cvpr42600.2020.00493 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

10.1007/s11263-021-01483-7 article EN International Journal of Computer Vision 2021-06-16

Recent progress in stochastic motion prediction, i.e., predicting multiple possible future human motions given a single past pose sequence, has led to producing truly diverse future motions and even providing control over the motion of some body parts. However, to achieve this, the state-of-the-art method requires learning several mappings for diversity and a dedicated model for controllable motion prediction. In this paper, we introduce a unified deep generative network for both diverse and controllable motion prediction. To this end, we leverage the intuition that realistic human motions consist of smooth...

10.1109/iccv48922.2021.01306 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We propose a cost volume-based neural network for depth inference from multi-view images. We demonstrate that building a cost volume pyramid in a coarse-to-fine manner instead of constructing a cost volume at a fixed resolution leads to a compact, lightweight network and allows us to infer high-resolution depth maps to achieve better reconstruction results. To this end, we first build a cost volume based on uniform sampling of fronto-parallel planes across the entire depth range at the coarsest resolution of an image. Then, given the current depth estimate, we construct new cost volumes iteratively to perform...

10.1109/tpami.2021.3082562 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-01

We introduce the task of action-driven stochastic human motion prediction, which aims to predict multiple plausible future motions given a sequence of action labels and a short motion history. This differs from existing works, which predict motions that either do not respect any specific action category, or follow a single action label. In particular, addressing this task requires tackling two challenges: the transitions between the different actions must be smooth; the length of the predicted motion depends on the action sequence and varies significantly across samples. As we cannot...

10.1109/cvpr52688.2022.00798 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

While selecting the most suitable infrared thermal imaging detection scheme for online inspection during laser cladding processing, this paper designs the RespathU-net semantic segmentation defect detection network for coating defects in infrared thermal images. The network is based on the U-net framework. It is optimized and improved by redesigning the coding structure, expanding the perceptual field, and connecting the paths of residuals, thus enhancing the detection effect on defective areas of the melt and addressing the problems that the original network cannot realize end-to-end...

10.1088/1361-6501/acc7bd article EN Measurement Science and Technology 2023-03-27

We propose VisFusion, a visibility-aware online 3D scene reconstruction approach from posed monocular videos. In particular, we aim to reconstruct the scene from volumetric features. Unlike previous reconstruction methods which aggregate the features for each voxel from input views without considering its visibility, we aim to improve the feature fusion by explicitly inferring its visibility from a similarity matrix, computed from its projected features in each image pair. Following previous works, our model is a coarse-to-fine pipeline including a volume sparsification process....

10.1109/cvpr52729.2023.01661 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

The SIFT descriptor plays a great role in image mosaicking, image retrieval and target recognition because of its good invariance to translation, rotation and zoom. However, the disadvantages of SIFT are its high dimensionality and complex computation. Besides, SIFT has poor performance when massive similar local features and background exist in the matching image. In this paper, a distinctive and robust weighted local intensity binary SIFT descriptor (WLIB-SIFT) is proposed. A WLIB-SIFT consists of a binary SIFT descriptor (B-SIFT) and a weighted local intensity binary descriptor. The experimental results show that...

10.1109/icicsp48821.2019.8958587 article EN 2019-09-01

The existing methods for thermal barrier coating (TBC) life prediction rely mainly on experience and formula derivation and are inefficient and inaccurate. By introducing deep learning into TBC life analyses, a convolutional neural network (CNN) is used to extract the TBC interface morphology and analyze its life information, which can achieve high-efficiency and accurate judgment of the TBC life. In this thesis, an Adap–Alex algorithm is proposed to overcome the problems related to the large training time, over-fitting, and low accuracy in the CNN...

10.3390/coatings11080890 article EN Coatings 2021-07-26

Human motion prediction, i.e., forecasting future body poses given an observed pose sequence, has typically been tackled with recurrent neural networks (RNNs). However, as evidenced by prior work, the resulting RNN models suffer from prediction error accumulation, leading to undesired discontinuities in motion prediction. In this paper, we propose a simple feed-forward deep network for motion prediction, which takes into account both temporal smoothness and spatial dependencies among human joints. In this context, we then propose to encode...

10.48550/arxiv.1908.05436 preprint EN other-oa arXiv (Cornell University) 2019-01-01

This paper addresses the task of 3D pose estimation for a hand interacting with an object from a single image observation. When modeling hand-object interaction, previous works mainly exploit proximity cues, while overlooking the dynamical nature that the hand must stably grasp the object to counteract gravity and thus prevent the object from slipping or falling. These works fail to leverage dynamical constraints in the estimation and consequently often produce unstable results. Meanwhile, refining unstable configurations with physics-based reasoning remains challenging, both by...

10.48550/arxiv.2310.07206 preprint EN cc-by arXiv (Cornell University) 2023-01-01

NS (Network Simulator) is an object-oriented visual simulator based on large-scale discrete events. It simulates not only the transmission of network data and the topology architecture, but also all kinds of IP network circumstances. This paper describes the architecture and characteristics of NS, and gives the technique and general process of NS. An instance of a streaming media applications-based adaptive congestion control algorithm is implemented and the simulation results are analyzed. The experimental result shows that when the network is congested, the delay,...

10.1109/iccis.2012.282 article EN 2012-08-01

After studying the shortcomings of features extracted from the radar cross section (RCS) using mathematical and statistical methods, and drawing on the idea of extracting abstract features in image recognition and speech recognition by artificial intelligence [2][3], this paper explores the possibility of extracting abstract features from the target's RCS sequence and proposes an abstract feature extraction method for the RCS sequence based on singular value decomposition (SVD). Because of the poor interpretability of abstract features, four different machine learning algorithms are...

10.1088/1757-899x/569/5/052010 article EN IOP Conference Series Materials Science and Engineering 2019-07-01

We propose a cost volume-based neural network for depth inference from multi-view images. We demonstrate that building a cost volume pyramid in a coarse-to-fine manner instead of constructing a cost volume at a fixed resolution leads to a compact, lightweight network and allows us to infer high-resolution depth maps to achieve better reconstruction results. To this end, we first build a cost volume based on uniform sampling of fronto-parallel planes across the entire depth range at the coarsest resolution of an image. Then, given the current depth estimate, we construct new cost volumes iteratively on the pixelwise...

10.48550/arxiv.1912.08329 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Human motion prediction aims to forecast future human poses given a past motion. Whether based on recurrent or feed-forward neural networks, existing methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities. Here, we introduce an attention-based feed-forward network that explicitly leverages this observation. In particular, instead of modeling frame-wise attention via pose similarity, we propose to extract motion attention to capture the similarity between the current motion context...

10.48550/arxiv.2007.11755 preprint EN other-oa arXiv (Cornell University) 2020-01-01