- Network Security and Intrusion Detection
- Advanced Malware Detection Techniques
- Advanced Neural Network Applications
- Smart Grid Security and Resilience
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Topic Modeling
- Fault Detection and Control Systems
- Anomaly Detection Techniques and Applications
- Industrial Technology and Control Systems
- Security and Verification in Computing
- Natural Language Processing Techniques
- Advanced Computational Techniques and Applications
- Handwritten Text Recognition Techniques
- Advanced Control Systems Optimization
- Embedded Systems and FPGA Design
- Vehicle License Plate Recognition
- Advanced Algorithms and Applications
- Wireless Signal Modulation Classification
- Machine Fault Diagnosis Techniques
- Spine and Intervertebral Disc Pathology
- Software-Defined Networks and 5G
- Advanced Image Processing Techniques
- Reliability and Maintenance Optimization
Yantai Institute of Coastal Zone Research
2025
Zhejiang University of Technology
2011-2024
State Key Laboratory of Industrial Control Technology
2013-2024
Zhejiang University
2009-2024
Guilin University of Technology
2024
Shanghai Artificial Intelligence Laboratory
2022-2024
Beijing Academy of Artificial Intelligence
2023-2024
Beijing University of Civil Engineering and Architecture
2004-2024
Gansu Provincial Hospital
2024
South China University of Technology
2022-2023
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically structured Transformer encoder which outputs multiscale features. It does not need positional encoding, thereby avoiding the interpolation of positional codes, which leads to decreased performance when the testing resolution differs from training. 2) SegFormer avoids complex decoders. The proposed MLP...
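The interpolation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a model with a fixed-length table of learned positional codes that has to be resized when the test resolution changes; the function name `resize_positional_codes` and the 1-D linear scheme are illustrative assumptions, not SegFormer code (SegFormer is described as not needing this step at all).

```python
def resize_positional_codes(codes, new_len):
    """Linearly interpolate a list of positional code vectors to new_len positions.

    Hypothetical helper for illustration: `codes` is a list of equal-length
    float lists, one per training-time position.
    """
    old_len = len(codes)
    if new_len == old_len or old_len == 1:
        # Nothing to interpolate: copy (or repeat the single code).
        return [list(codes[min(i, old_len - 1)]) for i in range(new_len)]
    if new_len == 1:
        return [list(codes[0])]
    resized = []
    for i in range(new_len):
        # Map the new index onto the old index range, endpoints aligned.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        w = pos - lo
        resized.append([(1 - w) * a + w * b for a, b in zip(codes[lo], codes[hi])])
    return resized


# Three training positions stretched to five test positions.
print(resize_positional_codes([[0.0], [2.0], [4.0]], 5))
```

The interpolated codes at the new in-between positions were never seen during training, which is one way to read the abstract's remark that such interpolation degrades performance.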
Scene text detection has witnessed rapid progress especially with the recent development of convolutional neural networks. However, there still exist two challenges which prevent the algorithm from being deployed in industry applications. On the one hand, most state-of-the-art algorithms require a quadrangle bounding box, which is inaccurate for locating texts with arbitrary shape. On the other hand, two text instances which are close to each other may lead to a false detection which covers both instances. Traditionally, the segmentation-based approach can relieve the first problem but usually fails...
Scene text detection, an important step of scene text reading systems, has witnessed rapid development with convolutional neural networks. Nonetheless, two main challenges still exist and hamper its deployment to real-world applications. The first problem is the trade-off between speed and accuracy. The second one is to model the arbitrary-shaped text instance. Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we...
Modern autonomous driving systems are characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either deploy standalone models for individual tasks, or design a multi-task paradigm with separate heads. However, they might suffer from accumulative errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized...
This work investigates a simple yet powerful dense prediction task adapter for Vision Transformer (ViT). Unlike recently advanced variants that incorporate vision-specific inductive biases into their architectures, the plain ViT suffers inferior performance on dense predictions due to weak prior assumptions. To address this issue, we propose the ViT-Adapter, which allows plain ViT to achieve comparable performance to vision-specific transformers. Specifically, the backbone in our framework is a plain ViT that can learn powerful representations from large-scale multi-modal...
Audio-visual segmentation (AVS) aims to locate and segment the sounding objects in a given video, which demands audio-driven pixel-level scene understanding. The existing methods cannot fully process the fine-grained correlations between audio and visual cues across various situations dynamically. They also face challenges in adapting to complex scenarios, such as evolving audio, the coexistence of multiple objects, and more. In this paper, we propose AVSegFormer, a novel framework for AVS that leverages...
Finite element analysis. Via finite element analysis: (1) to demonstrate the abnormal forces present at the top of a scoliosis construct, (2) to demonstrate the importance of an intact interspinous and supraspinous ligament (ISL/SSL) complex, and (3) to evaluate the transition rod (a rod that has a short taper to a smaller diameter at one end) as an implant solution to diminish these pathomechanics, regardless of the integrity of the ISL/SSL complex. The pathophysiology of increased nucleus pressure and angular displacement may contribute to proximal junctional kyphosis....
Nonlinear degradation trajectories are encountered frequently, and not all of them evolve homogeneously in practical systems. To take the nonlinearity, the heterogeneity, and the entire historical degradation data into account, we propose a nonlinear heterogeneous Wiener process model with an adaptive drift to characterize the degradation trajectories. A state-space based method is employed to delineate our model. Due to the introduction of the adaptive drift, it is difficult to directly apply Kalman filter methods to update the distribution of the estimated drift. To address...
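As a rough illustration of the kind of model the abstract above refers to, the sketch below simulates a generic Wiener degradation process with a nonlinear (power-law) mean trend, X(t) = a * t^b + sigma * B(t). This is a textbook-style simplification under stated assumptions: the drift here is a fixed constant rather than the adaptive, unit-specific drift the paper proposes, and the function and parameter names are hypothetical.

```python
import random


def simulate_degradation(drift_rate, power, sigma, dt, n_steps, seed=0):
    """Simulate X(t) = drift_rate * t**power + sigma * B(t) on a regular time grid.

    B(t) is standard Brownian motion, built from independent Gaussian
    increments with variance dt. All names are illustrative assumptions.
    """
    rng = random.Random(seed)
    brownian = 0.0
    path = [0.0]  # degradation level at t = 0
    for k in range(1, n_steps + 1):
        brownian += rng.gauss(0.0, dt ** 0.5)  # increment ~ N(0, dt)
        t = k * dt
        path.append(drift_rate * t ** power + sigma * brownian)
    return path


# With sigma = 0 the path reduces to the deterministic nonlinear trend.
print(simulate_degradation(drift_rate=2.0, power=2.0, sigma=0.0, dt=0.5, n_steps=4))
```

Heterogeneity across units could be mimicked by drawing `drift_rate` from a distribution for each simulated unit; the estimation and updating of that drift is the part the abstract addresses and is not shown here.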
Embodied AI is a crucial frontier in robotics, capable of planning and executing action sequences for robots to accomplish long-horizon tasks in physical environments. In this work, we introduce EmbodiedGPT, an end-to-end multi-modal foundation model for embodied AI, empowering embodied agents with multi-modal understanding and execution capabilities. To achieve this, we have made the following efforts: (i) We craft a large-scale dataset, termed EgoCOT. The dataset consists of carefully selected videos from Ego4D along...
Incremental few-shot semantic segmentation (IFSS) aims to incrementally expand a model's capacity to segment new classes of images supervised by only a few samples. However, features learned on old classes could significantly drift, causing catastrophic forgetting. Moreover, few samples for pixel-level segmentation lead to notorious overfitting issues in each learning session. In this paper, we explicitly represent class-based knowledge as a category embedding and a hyper-class embedding, where the former...
Despite the remarkable success of foundation models, their task-specific fine-tuning paradigm makes them inconsistent with the goal of general perception modeling. The key to eliminating this inconsistency is to use generalist models for general task modeling. However, existing attempts at generalist models are inadequate in both versatility and performance. In this paper, we propose Uni-Perceiver v2, which is the first generalist model capable of handling major large-scale vision and vision-language tasks with competitive performance. Specifically, images are encoded as region...
Specific emitter identification (SEI) is significant in military communication scenarios, cognitive radio, and self-organized networks. However, existing methods only consider the feature of signals or the feature after signal transformation. In other words, the time-domain correlation of each signal and the relationships between features are seldom taken into account. A novel SEI method is, therefore, proposed, which includes a transformation to convert specific signals to a graph tensor and a model named time-domain graph tensor attention network (TDGTAN) to encode tensors...
Zero-shot fault diagnosis can identify unseen faults by predicting fault attributes. However, existing methods ignore the multi-grained characteristics of fault attributes, namely varying levels of detail in describing fault categories. We recognize the following considerations for the first time: (1) fault attributes show typical multi-grained characteristics, which could be expressed in a coarse-to-fine-grained hierarchical structure; (2) multi-grained attributes play different roles in fault diagnosis, where coarse-grained attributes indicate a rough range of faults, while fine-grained...
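The basic idea of identifying unseen faults through predicted attributes can be sketched as follows. This is a minimal, flat nearest-signature matcher, assuming an attribute predictor already exists and outputs one score per attribute; it does not model the coarse-to-fine hierarchy the abstract proposes, and the fault names and attribute signatures are invented for illustration.

```python
def diagnose_by_attributes(predicted, class_attributes):
    """Return the fault class whose attribute signature is closest to `predicted`.

    `predicted` is a list of attribute scores from some upstream model
    (assumed, not shown). `class_attributes` maps each candidate fault name to
    its known binary attribute signature. Closeness is squared Euclidean distance.
    """
    def distance(signature):
        return sum((p - s) ** 2 for p, s in zip(predicted, signature))

    return min(class_attributes, key=lambda name: distance(class_attributes[name]))


# Hypothetical unseen fault classes described only by attribute signatures,
# e.g. [affects_temperature, affects_pressure, is_gradual].
unseen_faults = {
    "unseen_fault_a": [0, 1, 0],
    "unseen_fault_b": [1, 0, 1],
    "unseen_fault_c": [1, 1, 0],
}
print(diagnose_by_attributes([0.9, 0.1, 0.8], unseen_faults))
```

Because the classes are defined by attribute signatures rather than by training samples, a fault with no labeled examples can still be selected, provided its signature is known in advance.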