- Speech Recognition and Synthesis
- Speech and Audio Processing
- Natural Language Processing Techniques
- Topic Modeling
- Domain Adaptation and Few-Shot Learning
- Music and Audio Processing
- Multimodal Machine Learning Applications
- Anomaly Detection Techniques and Applications
- Adversarial Robustness in Machine Learning
- Advanced Computational Techniques and Applications
- Sentiment Analysis and Opinion Mining
- Advanced Text Analysis Techniques
- Neural Networks and Applications
- Non-Invasive Vital Sign Monitoring
- Context-Aware Activity Recognition Systems
- Advanced Neural Network Applications
- Machine Learning and Data Classification
- Genomics and Phylogenetic Studies
- Text and Document Classification Technologies
- Gait Recognition and Analysis
- Face and Expression Recognition
- Metaheuristic Optimization Algorithms Research
- Explainable Artificial Intelligence (XAI)
- Algorithms and Data Compression
- Misinformation and Its Impacts
Tianjin First Center Hospital
2021-2025
Tianjin Medical University
2021-2025
Lanzhou University
2024
Shaanxi Coal Chemical Industry Technology Research Institute
2024
Chengdu University
2024
Zhejiang Normal University
2023
University of Science and Technology of China
2021-2023
Nanjing University of Science and Technology
2022
Tianjin University
2022
Minzu University of China
2021-2022
Optical neural networks have significant advantages in terms of power consumption, parallelism, and high computing speed, which have attracted extensive attention in both the academic and engineering communities. They have been considered one of the powerful tools for promoting the fields of image processing and object recognition. However, existing optical system architectures cannot be reconstructed to realize multi-functional artificial intelligence systems simultaneously. To push the development of this issue, we propose...
Artificial Intelligence (AI) has achieved great success in many domains, and game AI has been widely regarded as its beachhead since the dawn of AI. In recent years, studies on game AI have gradually evolved from relatively simple environments (e.g., perfect-information games such as Go, chess, and shogi, or two-player imperfect-information games such as heads-up Texas hold'em) to more complex ones (e.g., multi-player Texas hold'em and StarCraft II). Mahjong is a popular game worldwide but very challenging for AI research due to its complex playing/scoring rules and rich...
Li Huang, Junjie Li, Weiwei Jiang, Zhiyu Zhang, Minchuan Chen, Shaojun Wang, Jing Xiao. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
The artificial bee colony (ABC) algorithm is one of the most recently proposed swarm intelligence algorithms for global numerical optimization. It performs well in most cases; however, there still exist some problems it cannot solve very well. This paper presents a novel hybrid Hooke Jeeves ABC (HJABC) algorithm with an intensification search based on the Hooke Jeeves pattern search and the ABC. The main purpose is to demonstrate how the standard ABC can be improved by incorporating a hybridization strategy. HJABC is tested on a comprehensive set of 3 6 complex...
BACKGROUND Hepatocellular carcinoma (HCC) ranks as the sixth most common cancer and the third leading cause of cancer-related deaths worldwide. The multidisciplinary tumor board (MDTB) has been recognized for improving outcomes in cancer management, but its role in patients with HCC undergoing liver transplantation (LT) remains underexplored. AIM To evaluate the impact of an MDTB on the survival of patients with HCC undergoing LT. METHODS We retrospectively analyzed 393 patients who underwent LT at our institution from October 2015 to 2021. Patients were...
As a fundamental task in opinion mining, aspect and opinion co-extraction aims to identify the aspect terms and opinion terms in reviews. However, due to the lack of fine-grained annotated resources, it is hard to train a robust model for many domains. To alleviate this issue, unsupervised domain adaptation is proposed to transfer knowledge from a labeled source domain to an unlabeled target domain. In this paper, we propose a new Generative Cross-Domain Data Augmentation framework for unsupervised domain adaptation. The framework is aimed at generating target-domain data with annotation by exploiting...
We develop novel single-GPU parallelizations of the Smith-Waterman algorithm for pairwise sequence alignment. Our algorithms, which are suitable for the alignment of a single pair of very long sequences, can be used to determine the alignment score as well as the actual alignment. Experimental results demonstrate an order of magnitude reduction in run time relative to competing GPU algorithms.
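For orientation, the recurrence that the abstract above parallelizes can be sketched on the CPU as follows. This is a plain sequential reference under stated assumptions, not the single-GPU method of the paper: the function name `smith_waterman`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative choices that do not come from the source.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Sequential Smith-Waterman local alignment (illustrative sketch).

    Scoring values and the linear gap model are assumptions for this example.
    Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    score, top, bottom = smith_waterman("TTACGTT", "GGACGGG")
    print(score)
    print(top)
    print(bottom)
```

The full score matrix used here needs memory proportional to the product of the sequence lengths, which is one reason very long sequences call for more careful algorithms than this sketch.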
The precise detection of falls is essential for promptly providing first aid to individuals who are at risk of accidental injury. Presently, the predominant approach to detecting falls is through inertial measurement unit (IMU) sensors, which can capture the real-time motion of an object. However, current approaches face challenges in attaining the anticipated performance in real-world applications, owing to the diverse nature of human behavior. To tackle this concern, a fall detection method that uses a graph convolutional neural network (GCN)...
Speaker verification systems experience significant performance degradation when tasked with short-duration trial recordings. To address this challenge, a multi-scale feature fusion approach has been proposed to effectively capture speaker characteristics from short utterances. Constrained by the model's size, the robust backbone Enhanced Res2Net (ERes2Net), which combines global and local feature fusion, demonstrates sub-optimal performance in speaker verification. To further improve the feature extraction capability of ERes2Net, we expand the channel...
Speaker extraction seeks to extract the target speech in a multi-talker scenario given an auxiliary reference. Such a reference can be auditory, i.e., pre-recorded speech; visual, i.e., lip movements; or contextual, i.e., a phonetic sequence. References in different modalities provide distinct and complementary information that could be fused to form top-down attention on the target speaker. Previous studies have introduced visual and contextual modalities in a single model. In this paper, we propose a two-stage time-domain visual-contextual speaker...
Text Normalization (TN) is an essential part of conversational systems like text-to-speech synthesis (TTS) and automatic speech recognition (ASR). It is a process of transforming non-standard words (NSW) into a representation of how the words are to be spoken. Existing approaches to TN are mainly rule-based or hybrid systems, which require abundant hand-crafted rules. In this paper, we treat TN as a neural machine translation problem and present a pure data-driven TN system using the Transformer framework. Partial Parameter...
This paper presents a subspace k-means clustering algorithm for high-dimensional data with automatic selection of k. A new penalty term is introduced to the objective function of the fuzzy clustering process to enable several clusters to compete for objects, which leads to the merging of some cluster centres and the identification of the 'true' number of clusters. The algorithm determines the number of clusters in a dataset by adjusting the penalty term factor. A validation index is proposed and employed to verify the results generated by the algorithm. The experimental results from both synthetic and real data have demonstrated that...
Detecting out-of-distribution (OOD) examples is crucial to guarantee the reliability and safety of deep neural networks in real-world settings. In this paper, we offer an innovative perspective on quantifying the disparities between in-distribution (ID) and OOD data: analyzing the uncertainty that arises when models attempt to explain their predictive decisions. This perspective is motivated by our observation that gradient-based attribution methods encounter challenges in assigning feature importance to OOD data, thereby yielding...
Contrastive learning can largely enhance feature discriminability in a self-supervised manner and has achieved remarkable success in various visual tasks. However, it is undesirably observed that the standard contrastive paradigm (features+$\ell_{2}$ normalization) brings only little help for domain adaptation. In this work, we delve into this phenomenon and find that the main reason is the class weights (weights of the final fully connected layer), which are vital for recognition yet ignored in the optimization. To tackle...
The ability of large language models (LLMs) to follow instructions is crucial to real-world applications. Despite recent advances, several studies have highlighted that LLMs struggle when faced with challenging instructions, especially those that include complex constraints, hindering their effectiveness in various tasks. To address this challenge, we introduce Conifer, a novel instruction tuning dataset designed to enhance instruction following under multi-level constraints. Utilizing GPT-4, we curate the dataset by a series...
Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames, and existing large-scale VAD research primarily focuses on road traffic and human activity scenes. In industrial scenes, there are often a variety of unpredictable anomalies, and VAD methods can play a significant role in these scenarios. However, there is a lack of applicable datasets and methods specifically tailored for industrial production scenarios due to concerns regarding privacy and security. To bridge this gap, we propose a new dataset,...