- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Natural Language Processing Techniques
- Topic Modeling
- Adversarial Robustness in Machine Learning
- Anomaly Detection Techniques and Applications
- Image and Video Quality Assessment
- Human Pose and Action Recognition
- Time Series Analysis and Forecasting
- Vehicle License Plate Recognition
- Medical Image Segmentation Techniques
- Speech and Audio Processing
- Music and Audio Processing
- Advancements in Battery Materials
- Stock Market Forecasting Methods
- Handwritten Text Recognition Techniques
- Recommender Systems and Techniques
- Reliability and Maintenance Optimization
- Radiomics and Machine Learning in Medical Imaging
- Machine Learning and Data Classification
- Video Surveillance and Tracking Methods
- Industrial Vision Systems and Defect Detection
- Advanced Battery Technologies Research
Jilin Agricultural University
2025
Tianjin University
2022-2024
China Automotive Technology and Research Center
2017-2024
Shenyang Aerospace University
2024
Jiaozuo University
2024
VA Tennessee Valley Healthcare System
2024
Vanderbilt University Medical Center
2024
China Mobile (China)
2024
Nanjing University
2024
University of Technology Sydney
2024
Recently, remarkable progress has been made in learning transferable representation across domains. Previous works in domain adaptation are majorly based on two techniques: domain-adversarial learning and self-training. However, domain-adversarial learning only aligns feature distributions between domains but does not consider whether the target features are discriminative. On the other hand, self-training utilizes the model predictions to enhance the discrimination of target features, but it is unable to explicitly align domain distributions. In order to combine...
Recently, large-scale pre-training methods like CLIP have made great progress in multi-modal research such as text-video retrieval. In CLIP, transformers are vital for modeling complex relations. However, in the vision transformer of CLIP, the essential visual tokenization process, which produces discrete token sequences, generates many homogeneous tokens due to the redundancy nature of consecutive and similar frames in videos. This significantly increases computation costs and hinders the deployment of video retrieval...
The prompt-based learning paradigm, which bridges the gap between pre-training and fine-tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot settings. Despite being widely applied, prompt-based learning is vulnerable to backdoor attacks. Textual backdoor attacks are designed to introduce targeted vulnerabilities into models by poisoning a subset of training samples through trigger injection and label modification. However, they suffer from flaws such as abnormal natural language...
Computer-assisted clinical coding (CAC) based on automated algorithms has been expected to improve the International Classification of Disease, tenth version (ICD-10) coding quality and productivity, whereas studies oriented to primary diagnosis auto-coding are limited in the Chinese context. This study aims at developing a machine learning (ML) model for ICD-10 coding. A total of 71,709 admissions in Fuwai hospital were included to carry out this study, corresponding to 168 codes. Based on implications, two feature...
Devamanyu Hazarika, Yingting Li, Bo Cheng, Shuai Zhao, Roger Zimmermann, Soujanya Poria. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue that increasing the number of networks (ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width. To prove it, one large network is divided into several small ones regarding its parameters and regularization components. Each of these small networks has a fraction of the original one's parameters. We then train these small networks together and make them...
In response to the structural changes of tomato seedlings, traditional image techniques have difficulty accurately quantifying key morphological parameters, such as leaf area, internode length, and mutual occlusion between organs. Therefore, this paper proposes a point cloud stem segmentation framework based on the Elite Strategy-based Improved Red-billed Blue Magpie Optimization (ES-RBMO) Algorithm. The framework uses a four-layer Convolutional Neural Network (CNN) for segmentation by incorporating an improved swarm...
Capacity fade in lithium-ion batteries (LIBs) poses challenges for various industries. Predicting and preventing this fade is crucial, and hybrid methods for estimating remaining useful life (RUL) have become prevalent and achieved significant advancements. In this paper, we introduce a voting ensemble that combines Gradient Boosting, Random Forest, and K-Nearest Neighbors to forecast the fading capacity trend and knee point. We conducted extensive experiments using the CALCE CS2 datasets. The results indicate that our proposed...
Attributed graph clustering aims to partition the nodes of a graph structure into different groups. Recent works usually use a variational graph autoencoder (VGAE) to make the node representations obey a specific distribution. Although they have shown promising results, how to introduce supervised information to guide the representation learning and improve performance is still an open problem. In this article, we propose a Collaborative Decision-Reinforced Self-Supervision (CDRS) method to solve the problem, in which a pseudo...
Pre-trained vision-language models (VLMs) are the de-facto foundation for various downstream tasks. However, scene text recognition methods still prefer backbones pre-trained on a single modality, namely, the visual modality, despite the potential of VLMs to serve as powerful scene text readers. For example, CLIP can robustly identify regular (horizontal) and irregular (rotated, curved, blurred, or occluded) text in images. With such merits, we transform CLIP into a scene text reader and introduce CLIP4STR, a simple yet effective STR method built upon...
Video streaming over HTTP is becoming the de facto dominating paradigm for today's video applications. HTTP as an over-the-top (OTT) protocol has been leveraged for quality video traversal over the Internet. High user-received quality-of-experience (QoE) is driven not only by new technology, but also by a wide range of user demands. Given the limitation of a traditional TCP/IP network in supporting video transmission, the typical on-off transfer pattern is inevitable. Dynamic adaptive streaming over HTTP (DASH) establishes a simple architecture and enables...
As an instance-level recognition problem, re-identification (re-ID) requires models to capture diverse features. However, with continuous training, re-ID models pay more and more attention to the salient areas. As a result, the model may only focus on a few small regions with salient representations and ignore other important information. This phenomenon leads to inferior performance, especially when models are evaluated on small inter-identity variation data. In this paper, we propose a novel network, Erasing-Salient Net (ES-Net), to learn comprehensive...
In the realm of lithium-ion batteries (LIBs), issues like material aging and capacity decline contribute to performance degradation or potential safety hazards. Predicting the remaining useful life (RUL) serves as a crucial method of assessing the health of batteries, thereby enhancing reliability and safety. To reduce the complexity and improve the accuracy and applicability of early RUL predictions for LIBs, we proposed a Mamba-based state space model for early RUL prediction. Due to the impacts of abnormal data, we first use the interquartile range (IQR)...
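The abstract above is truncated at the IQR step, so the authors' exact procedure is not visible here. As a minimal sketch of a common IQR-based outlier filter (Tukey's fences), assuming a multiplier of 1.5 and a flat list of capacity readings, both of which are illustrative choices and not taken from the paper:

```python
import statistics


def iqr_filter(values, k=1.5):
    """Drop readings outside [Q1 - k*IQR, Q3 + k*IQR].

    The function name, the multiplier k=1.5 and the "inclusive" quantile
    method are assumptions made for this illustration.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Keep the original order so the result still reads as a time series.
    return [v for v in values if low <= v <= high]


# Hypothetical normalized capacity readings with one abnormal dip.
readings = [1.00, 0.99, 0.98, 0.97, 0.20, 0.96, 0.95]
cleaned = iqr_filter(readings)
```

In this made-up series the reading 0.20 falls below the lower fence and is removed, while the gradual fade from 1.00 to 0.95 is kept.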
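Neural network pruning is one of the most popular methods of accelerating the inference of deep convolutional neural networks (CNNs). The dominant pruning methods, filter-level pruning methods, evaluate their performance through the reduction ratio of computations and deem that a higher reduction ratio of computations is equivalent to a higher acceleration ratio in terms of inference time. However, we argue that they are not equivalent if parallel computing is considered. Given that filter-level pruning only prunes filters in layers and computations in a layer usually run in parallel, most computations reduced by filter-level pruning usually run in parallel with the un-reduced ones. Thus, the acceleration ratio of filter-level pruning is limited. To get a higher acceleration ratio, it is better to prune...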
At present, the degree of urban traffic congestion is increasing, so it is necessary to detect and predict the road situation. However, current vehicle detection has some problems, such as poor detection effect and inaccurate classification of relatively small vehicles. To solve these problems, an improved YOLOv3 algorithm for vehicle detection is proposed. This algorithm improves the traditional YOLO algorithm. Firstly, it uses a clustering analysis method to cluster the data set, and improves the network structure to increase the number of final output grids and enhance the prediction ability. Secondly,...
Using prompts to explore the knowledge contained within pre-trained language models for downstream tasks has now become an active topic. Current prompt tuning methods mostly convert the downstream tasks to masked language modeling problems by adding cloze-style phrases and mapping all labels to verbalizations with fixed length, which has proven effective for tasks with simple label spaces. However, when applied to relation classification exhibiting complex label spaces, vanilla prompt tuning methods may struggle with label verbalizations with arbitrary lengths due to rigid prompt restrictions. Inspired by the text...
Bounding box regression is an important component in object detection. Recent work achieves promising performance by optimizing the Intersection over Union (IoU). However, IoU-based loss has the gradient vanish problem in the case of low overlapping bounding boxes, and the model could easily ignore these simple cases. In this paper, we propose Side Overlap (SO) loss by maximizing the side overlap of two bounding boxes, which puts more penalty for low overlapping bounding box cases. Besides, to speed up the convergence, the Corner Distance (CD) is added into the objective function....
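To illustrate the vanishing-gradient issue the abstract above mentions, here is a minimal sketch of plain IoU and the usual `1 - IoU` loss for axis-aligned boxes. It is not the paper's Side Overlap or Corner Distance formulation; the `(x1, y1, x2, y2)` box layout and the function names are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (it may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss, 1 - IoU."""
    return 1.0 - iou(pred, target)


target = (0.0, 0.0, 1.0, 1.0)
near_miss = (2.0, 2.0, 3.0, 3.0)      # does not overlap, but close
far_miss = (10.0, 10.0, 11.0, 11.0)   # does not overlap, far away

# Both predictions get the same loss, so moving a non-overlapping box
# closer to the target does not change the loss at all.
same_loss = iou_loss(near_miss, target) == iou_loss(far_miss, target)
```

Because the loss is flat whenever the boxes do not intersect, it gives the optimizer no direction in which to move the prediction, which is the kind of case that overlap- and distance-based terms are meant to penalize.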
Pre-trained vision-language models (VLMs) are the de-facto foundation models for various downstream tasks. However, scene text recognition methods still prefer backbones pre-trained on a single modality, namely, the visual modality, despite the potential of VLMs to serve as powerful scene text readers. For example, CLIP can robustly identify regular (horizontal) and irregular (rotated, curved, blurred, or occluded) text in images. With such merits, we transform CLIP into a scene text reader and introduce CLIP4STR, a simple yet effective STR method...
Background: Avian influenza A H7N9 emerged in 2013, threatening public health and causing acute respiratory distress syndrome, and even death, in the human population. However, the underlying mechanism by which the H7N9 virus causes human infection remains elusive. Methods: Herein, we infected A549 cells with H7N9 for different times and assessed tripartite motif-containing protein 46 (TRIM46) expression. To determine the role of TRIM46 in H7N9 infection, we applied lentivirus-based TRIM46 short hairpin RNA sequences and overexpression plasmids...
Display advertising is the most important revenue source for publishers in the online publishing industry. The ad pricing standards are shifting to a new model in which ads are paid only if they are viewed. Consequently, an important problem is to predict the probability that an ad at a given page depth will be shown on a user's screen for a certain dwell time. This paper proposes deep learning models based on Long Short-Term Memory (LSTM) to predict the viewability of any page depth. The main novelty of our best model consists of the combination of bi-directional LSTM networks,...