Shuai Zhao

ORCID: 0000-0003-1320-4283
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Natural Language Processing Techniques
  • Topic Modeling
  • Adversarial Robustness in Machine Learning
  • Anomaly Detection Techniques and Applications
  • Image and Video Quality Assessment
  • Human Pose and Action Recognition
  • Time Series Analysis and Forecasting
  • Vehicle License Plate Recognition
  • Medical Image Segmentation Techniques
  • Speech and Audio Processing
  • Music and Audio Processing
  • Advancements in Battery Materials
  • Stock Market Forecasting Methods
  • Handwritten Text Recognition Techniques
  • Recommender Systems and Techniques
  • Reliability and Maintenance Optimization
  • Radiomics and Machine Learning in Medical Imaging
  • Machine Learning and Data Classification
  • Video Surveillance and Tracking Methods
  • Industrial Vision Systems and Defect Detection
  • Advanced Battery Technologies Research

Jilin Agricultural University
2025

Tianjin University
2022-2024

China Automotive Technology and Research Center
2017-2024

Shenyang Aerospace University
2024

Jiaozuo University
2024

VA Tennessee Valley Healthcare System
2024

Vanderbilt University Medical Center
2024

China Mobile (China)
2024

Nanjing University
2024

University of Technology Sydney
2024

Recently, remarkable progress has been made in learning transferable representations across domains. Previous works on domain adaptation are mainly based on two techniques: domain-adversarial learning and self-training. However, domain-adversarial learning only aligns feature distributions between domains but does not consider whether the target features are discriminative. On the other hand, self-training utilizes model predictions to enhance the discrimination of target features, but it is unable to explicitly align domain distributions. In order to combine...

10.1609/aaai.v34i04.5757 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Recently, large-scale pre-training methods like CLIP have made great progress in multi-modal research such as text-video retrieval. In CLIP, transformers are vital for modeling complex multi-modal relations. However, in the vision transformer of CLIP, the essential visual tokenization process, which produces discrete token sequences, generates many homogeneous tokens due to the redundant nature of consecutive and similar frames in videos. This significantly increases computation costs and hinders the deployment of video retrieval...

10.1145/3477495.3531950 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

The prompt-based learning paradigm, which bridges the gap between pre-training and fine-tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot settings. Despite being widely applied, prompt-based learning is vulnerable to backdoor attacks. Textual backdoor attacks are designed to introduce targeted vulnerabilities into models by poisoning a subset of training samples through trigger injection and label modification. However, they suffer from flaws such as abnormal natural language...

10.18653/v1/2023.emnlp-main.757 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Computer-assisted clinical coding (CAC) based on automated coding algorithms has been expected to improve International Classification of Diseases, tenth version (ICD-10) coding quality and productivity, whereas studies oriented to primary diagnosis auto-coding are limited in the Chinese context. This study aims at developing a machine learning (ML) model for primary diagnosis ICD-10 coding. A total of 71,709 admissions in Fuwai Hospital were included to carry out this study, corresponding to 168 ICD-10 codes. Based on clinical implications, two feature...

10.1016/j.ijmedinf.2021.104543 article EN cc-by-nc-nd International Journal of Medical Informatics 2021-07-27

Devamanyu Hazarika, Yingting Li, Bo Cheng, Shuai Zhao, Roger Zimmermann, Soujanya Poria. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.50 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue that increasing the number of networks (ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width. To prove it, one large network is divided into several small ones regarding its parameters and regularization components. Each of these small networks has a fraction of the original one's parameters. We then train these small networks together and make them...

10.1109/tip.2022.3201602 article EN IEEE Transactions on Image Processing 2022-01-01

In response to the structural changes of tomato seedlings, traditional image techniques struggle to accurately quantify key morphological parameters, such as leaf area, internode length, and mutual occlusion between organs. Therefore, this paper proposes a tomato point cloud stem and leaf segmentation framework based on the Elite Strategy-based Improved Red-billed Blue Magpie Optimization (ES-RBMO) algorithm. The framework uses a four-layer Convolutional Neural Network (CNN) for stem and leaf segmentation by incorporating an improved swarm...

10.3390/agriculture15020180 article EN cc-by Agriculture 2025-01-15

Capacity fade in lithium-ion batteries (LIBs) poses challenges for various industries. Predicting and preventing this fade is crucial, and hybrid methods for estimating remaining useful life (RUL) have become prevalent and achieved significant advancements. In this paper, we introduce a voting ensemble that combines Gradient Boosting, Random Forest, and K-Nearest Neighbors to forecast the fading capacity trend and knee point. We conducted extensive experiments using the CALCE CS2 datasets. The results indicate that our proposed...

10.3390/en18051114 article EN cc-by Energies 2025-02-25

Attributed graph clustering aims to partition the nodes of a graph structure into different groups. Recent works usually use a variational graph autoencoder (VGAE) to make the node representations obey a specific distribution. Although they have shown promising results, how to introduce supervised information to guide representation learning and improve clustering performance is still an open problem. In this article, we propose a Collaborative Decision-Reinforced Self-Supervision (CDRS) method to solve the problem, in which pseudo...

10.1109/tnnls.2022.3171583 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-05-18

Pre-trained vision-language models (VLMs) are the de-facto foundation models for various downstream tasks. However, scene text recognition methods still prefer backbones pretrained on a single modality, namely, the visual modality, despite the potential of VLMs to serve as powerful scene text readers. For example, CLIP can robustly identify regular (horizontal) and irregular (rotated, curved, blurred, or occluded) text in images. With such merits, we transform CLIP into a scene text reader and introduce CLIP4STR, a simple yet effective STR method built upon...

10.1109/tip.2024.3512354 article EN IEEE Transactions on Image Processing 2024-01-01

Video streaming over HTTP is becoming the de facto dominating paradigm for today's video applications. HTTP as an over-the-top (OTT) protocol has been leveraged for quality video traversal over the Internet. High user-received quality-of-experience (QoE) is driven not only by new technology, but also by a wide range of user demands. Given the limitation of a traditional TCP/IP network for supporting video transmission, the typical on-off transfer pattern is inevitable. Dynamic adaptive streaming over HTTP (DASH) establishes a simple architecture and enables...

10.1109/iccnc.2017.7876191 article EN 2017 International Conference on Computing, Networking and Communications (ICNC) 2017-01-01

As an instance-level recognition problem, re-identification (re-ID) requires models to capture diverse features. However, with continuous training, re-ID models pay more and more attention to the salient areas. As a result, the model may only focus on a few small regions with salient representations and ignore other important information. This phenomenon leads to inferior performance, especially when models are evaluated on small inter-identity variation data. In this paper, we propose a novel network, Erasing-Salient Net (ES-Net), to learn comprehensive...

10.1109/tip.2020.3046904 article EN IEEE Transactions on Image Processing 2020-12-31

In the realm of lithium-ion batteries (LIBs), issues like material aging and capacity decline contribute to performance degradation or potential safety hazards. Predicting remaining useful life (RUL) serves as a crucial method of assessing the health of batteries, thereby enhancing reliability and safety. To reduce the complexity and improve the accuracy and applicability of early RUL predictions for LIBs, we proposed a Mamba-based state space model for early RUL prediction. Due to the impacts of abnormal data, we first use the interquartile range (IQR)...

10.3390/en17246326 article EN cc-by Energies 2024-12-16
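The abstract above is cut off right after it mentions the interquartile range, so how the authors apply it is not shown here. As a minimal sketch of the conventional IQR outlier rule only, assuming the common 1.5 multiplier and using made-up capacity readings (the function name `iqr_filter` and the data are hypothetical, not taken from the paper):

```python
from statistics import quantiles


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional choice; it is an assumption here,
    not a setting reported in the abstract.
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical per-cycle capacity readings (Ah) with one abnormal sample.
capacities = [1.10, 1.09, 1.08, 1.07, 0.20, 1.06, 1.05, 1.04]
cleaned = iqr_filter(capacities)
print(cleaned)
```

The abnormal 0.20 reading falls below the lower fence and is dropped, while the slowly decaying readings are kept.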

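Neural network pruning is one of the most popular methods of accelerating the inference of deep convolutional neural networks (CNNs). The dominant pruning methods, filter-level pruning methods, evaluate their performance through the reduction ratio of computations and deem that a higher reduction ratio of computations is equivalent to a higher acceleration ratio in terms of inference time. However, we argue that they are not equivalent if parallel computing is considered. Given that filter-level pruning only prunes filters in layers and computations in a layer usually run in parallel, most computations reduced by filter-level pruning usually run in parallel with the un-reduced ones. Thus, the acceleration ratio of filter-level pruning is limited. To get a higher acceleration ratio, it is better to prune...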

10.48550/arxiv.1912.10178 preprint EN other-oa arXiv (Cornell University) 2019-01-01

At present, the degree of urban traffic congestion is increasing, so it is necessary to detect and predict the road situation. However, current vehicle detection has some problems, such as poor detection effect and inaccurate classification of relatively small vehicles. To solve these problems, an improved YOLOv3 algorithm for vehicle detection is proposed. This algorithm improves the traditional YOLO algorithm. Firstly, it uses the clustering analysis method to cluster the data set, and improves the network structure to increase the number of final output grids and enhance the prediction ability. Secondly,...

10.1109/icitbs49701.2020.00024 article EN 2020-01-01

Using prompts to explore the knowledge contained within pre-trained language models for downstream tasks has now become an active topic. Current prompt tuning methods mostly convert the downstream tasks to masked language modeling problems by adding cloze-style phrases and mapping all labels to verbalizations with fixed length, which has proven effective for tasks with simple label spaces. However, when applied to relation classification exhibiting complex label spaces, vanilla prompt tuning methods may struggle with label verbalizations with arbitrary lengths due to rigid prompt restrictions. Inspired by the text...

10.18653/v1/2022.findings-emnlp.231 article EN cc-by 2022-01-01

Bounding box regression is an important component in object detection. Recent work achieves promising performance by optimizing the Intersection over Union (IoU). However, IoU-based loss has the gradient vanish problem in the case of low overlapping bounding boxes, and the model could easily ignore these simple cases. In this paper, we propose Side Overlap (SO) loss by maximizing the side overlap of two bounding boxes, which puts more penalty for low overlapping bounding box cases. Besides, to speed up the convergence, the Corner Distance (CD) is added into the objective function....

10.1609/aaai.v36i3.20265 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28
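The gradient issue mentioned in the abstract above can be illustrated with plain IoU. The sketch below is not the paper's Side Overlap or Corner Distance loss; it only computes the standard IoU loss for axis-aligned boxes, assuming an `(x1, y1, x2, y2)` box format and hypothetical example coordinates, and shows that the loss does not change when a non-overlapping box moves toward the target:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Standard IoU loss: 1 - IoU."""
    return 1.0 - iou(pred, target)


target = (0.0, 0.0, 2.0, 2.0)
overlapping = (1.0, 1.0, 3.0, 3.0)   # intersection 1, union 7
far = (5.0, 5.0, 7.0, 7.0)           # disjoint from the target
nearer = (4.0, 4.0, 6.0, 6.0)        # closer, but still disjoint

print(iou(overlapping, target))
print(iou_loss(far, target), iou_loss(nearer, target))
```

Both disjoint boxes give the same loss even though one is closer to the target, so the loss provides no signal for moving the box in that regime. Zero overlap is the extreme case of the low-overlap situation the abstract describes.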

Pre-trained vision-language models (VLMs) are the de-facto foundation models for various downstream tasks. However, scene text recognition methods still prefer backbones pre-trained on a single modality, namely, the visual modality, despite the potential of VLMs to serve as powerful scene text readers. For example, CLIP can robustly identify regular (horizontal) and irregular (rotated, curved, blurred, or occluded) text in images. With such merits, we transform CLIP into a scene text reader and introduce CLIP4STR, a simple yet effective STR method...

10.48550/arxiv.2305.14014 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

Background: Avian influenza A H7N9 emerged in 2013, threatening public health and causing acute respiratory distress syndrome, and even death, in the human population. However, the underlying mechanism by which the H7N9 virus causes human infection remains elusive. Methods: Herein, we infected A549 cells with H7N9 for different times and assessed tripartite motif-containing protein 46 (TRIM46) expression. To determine the role of TRIM46 in H7N9 infection, we applied lentivirus-based TRIM46 short hairpin RNA sequences and overexpression plasmids...

10.1186/s12985-022-01907-x article EN cc-by Virology Journal 2022-11-03

Display advertising is the most important revenue source for publishers in the online publishing industry. The ad pricing standards are shifting to a new model in which ads are paid only if they are viewed. Consequently, an important problem for publishers is to predict the probability that an ad at a given page depth will be shown on a user's screen for a certain dwell time. This paper proposes deep learning models based on Long Short-Term Memory (LSTM) to predict the viewability of any page depth for any given dwell time. The main novelty of our best model consists of the combination of bi-directional LSTM networks,...

10.1109/tkde.2018.2839599 article EN publisher-specific-oa IEEE Transactions on Knowledge and Data Engineering 2018-05-22