Wenhai Wang

ORCID: 0000-0003-3707-6546
Research Areas
  • Network Security and Intrusion Detection
  • Advanced Malware Detection Techniques
  • Advanced Neural Network Applications
  • Smart Grid Security and Resilience
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image and Video Retrieval Techniques
  • Topic Modeling
  • Fault Detection and Control Systems
  • Anomaly Detection Techniques and Applications
  • Industrial Technology and Control Systems
  • Security and Verification in Computing
  • Natural Language Processing Techniques
  • Advanced Computational Techniques and Applications
  • Handwritten Text Recognition Techniques
  • Advanced Control Systems Optimization
  • Embedded Systems and FPGA Design
  • Vehicle License Plate Recognition
  • Advanced Algorithms and Applications
  • Wireless Signal Modulation Classification
  • Machine Fault Diagnosis Techniques
  • Spine and Intervertebral Disc Pathology
  • Software-Defined Networks and 5G
  • Advanced Image Processing Techniques
  • Reliability and Maintenance Optimization

Yantai Institute of Coastal Zone Research
2025

Zhejiang University of Technology
2011-2024

State Key Laboratory of Industrial Control Technology
2013-2024

Zhejiang University
2009-2024

Guilin University of Technology
2024

Shanghai Artificial Intelligence Laboratory
2022-2024

Beijing Academy of Artificial Intelligence
2023-2024

Beijing University of Civil Engineering and Architecture
2004-2024

Gansu Provincial Hospital
2024

South China University of Technology
2022-2023

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically structured Transformer encoder which outputs multiscale features. It does not need positional encoding, thereby avoiding the interpolation of positional codes, which leads to decreased performance when the testing resolution differs from training. 2) SegFormer avoids complex decoders. The proposed MLP...

10.48550/arxiv.2105.15203 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Scene text detection has witnessed rapid progress, especially with the recent development of convolutional neural networks. However, there still exist two challenges which prevent the algorithm from entering industry applications. On the one hand, most state-of-the-art algorithms require a quadrangle bounding box, which is inaccurate for locating texts with arbitrary shape. On the other hand, two text instances which are close to each other may lead to a false detection which covers both instances. Traditionally, the segmentation-based approach can relieve the first problem but usually fail...

10.1109/cvpr.2019.00956 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Scene text detection, an important step of scene text reading systems, has witnessed rapid development with convolutional neural networks. Nonetheless, two main challenges still exist and hamper its deployment to real-world applications. The first problem is the trade-off between speed and accuracy. The second one is to model the arbitrary-shaped text instance. Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we...

10.1109/iccv.2019.00853 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Modern autonomous driving systems are characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either deploy standalone models for individual tasks, or design a multi-task paradigm with separate heads. However, they might suffer from accumulative errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized...

10.1109/cvpr52729.2023.01712 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

This work investigates a simple yet powerful dense prediction task adapter for Vision Transformer (ViT). Unlike recently advanced variants that incorporate vision-specific inductive biases into their architectures, the plain ViT suffers inferior performance on dense predictions due to weak prior assumptions. To address this issue, we propose the ViT-Adapter, which allows plain ViT to achieve comparable performance to vision-specific transformers. Specifically, the backbone in our framework is a plain ViT that can learn powerful representations from large-scale multi-modal...

10.48550/arxiv.2205.08534 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Audio-visual segmentation (AVS) aims to locate and segment the sounding objects in a given video, which demands audio-driven pixel-level scene understanding. The existing methods cannot fully process the fine-grained correlations between audio and visual cues across various situations dynamically. They also face challenges in adapting to complex scenarios, such as evolving audio, the coexistence of multiple objects, and more. In this paper, we propose AVSegFormer, a novel framework for AVS that leverages...

10.1609/aaai.v38i11.29104 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Finite element analysis. Via finite element analysis: (1) to demonstrate the abnormal forces present at the top of a scoliosis construct, (2) to demonstrate the importance of an intact interspinous and supraspinous ligament (ISL/SSL) complex, and (3) to evaluate the transition rod (a rod that has a short taper to a smaller diameter at one end) as an implant solution to diminish these pathomechanics, regardless of the integrity of the ISL/SSL complex. The pathophysiology of increased nucleus pressure and angular displacement may contribute to proximal junctional kyphosis....

10.1097/brs.0b013e318246d4f2 article EN Spine 2011-12-31

Nonlinear degradation trajectories are encountered frequently, and not all of them evolve homogeneously in practical systems. To take the nonlinearity, the heterogeneity, and the entire historical data into account, we propose a nonlinear heterogeneous Wiener process model with an adaptive drift to characterize the degradation trajectories. A state-space based method is employed to delineate our model. Due to the introduction of the adaptive drift, it is difficult to directly apply Kalman filter methods to update the distribution of the estimated drift. To address...

10.1109/tr.2015.2403433 article EN IEEE Transactions on Reliability 2015-02-27

Embodied AI is a crucial frontier in robotics, capable of planning and executing action sequences for robots to accomplish long-horizon tasks in physical environments. In this work, we introduce EmbodiedGPT, an end-to-end multi-modal foundation model for embodied AI, empowering embodied agents with multi-modal understanding and execution capabilities. To achieve this, we have made the following efforts: (i) We craft a large-scale embodied planning dataset, termed EgoCOT. The dataset consists of carefully selected videos from the Ego4D dataset, along...

10.48550/arxiv.2305.15021 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Incremental few-shot semantic segmentation (IFSS) aims to incrementally expand a model's capacity to segment new classes of images, supervised by only a few samples. However, features learned on old classes could significantly drift, causing catastrophic forgetting. Moreover, few samples for pixel-level segmentation on new classes lead to notorious overfitting issues in each learning session. In this paper, we explicitly represent class-based knowledge for semantic segmentation as a category embedding and a hyper-class embedding, where the former...

10.1145/3503161.3548218 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Despite the remarkable success of foundation models, their task-specific fine-tuning paradigm makes them inconsistent with the goal of general perception modeling. The key to eliminating this inconsistency is to use generalist models for general task modeling. However, existing attempts at generalist models are inadequate in both versatility and performance. In this paper, we propose Uni-Perceiver v2, which is the first generalist model capable of handling major large-scale vision and vision-language tasks with competitive performance. Specifically, images are encoded as general region...

10.1109/cvpr52729.2023.00264 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Specific emitter identification (SEI) is significant in military communication scenarios, cognitive radio, and self-organized networks. However, existing methods only consider the features of signals or the features of signals after signal transformation. In other words, the time-domain correlation of each signal and the relationships between features are seldom taken into account. A novel method is, therefore, proposed, which includes a transformation method to convert specific signals to a graph tensor and a model named time-domain graph tensor attention network (TDGTAN) to encode tensors...

10.1109/tim.2023.3241976 article EN IEEE Transactions on Instrumentation and Measurement 2023-01-01

Zero-shot fault diagnosis can identify unseen faults by predicting fault attributes. However, existing methods ignore the multi-grained characteristics of fault attributes, namely the varying levels of detail in describing fault categories. We recognize the following considerations for the first time: (1) fault attributes show typical multi-grained characteristics, which could be expressed in a coarse-to-fine-grained hierarchical structure; (2) multi-grained attributes play different roles in fault diagnosis, where coarse-grained attributes indicate the rough range of faults, while fine-grained...

10.1109/tfuzz.2024.3363708 article EN IEEE Transactions on Fuzzy Systems 2024-02-08