- Topic Modeling
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Food composition and properties
- Microbial Metabolites in Food Biotechnology
- Biomedical Text Mining and Ontologies
- Advanced Graph Neural Networks
- Music and Audio Processing
- Speech and Audio Processing
- Multimodal Machine Learning Applications
- Social and Intergroup Psychology
- Advanced Text Analysis Techniques
- Polysaccharides Composition and Applications
- Cerebral Palsy and Movement Disorders
- Balance, Gait, and Falls Prevention
- Proteins in Food Systems
- Educational Technology and Pedagogy
- Text Readability and Simplification
- Cultural Differences and Values
- Frailty in Older Adults
- Text and Document Classification Technologies
- Nutrition and Health in Aging
- Phytase and its Applications
- Psychology of Moral and Emotional Judgment
- Artificial Intelligence in Healthcare and Education
Liaoning Normal University
2025
Dalian University
2025
Jiangnan University
2024
Nanjing Tech University
2024
Henan University
2023
Huawei Technologies (China)
2022-2023
Sichuan University
2023
Qilu University of Technology
2023
Shandong Academy of Sciences
2023
Fordham University
2023
In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of 10000+ hours of high-quality labeled speech, 2400+ hours of weakly labeled speech, and about 10000 hours of unlabeled speech, with 22400+ hours in total. We collect the data from YouTube and Podcast, which covers a variety of speaking styles, scenarios, domains, topics, and noisy conditions. An optical character recognition (OCR) based method is introduced to generate audio/text segmentation candidates for the YouTube data on its corresponding video captions, while an ASR transcription...
The unified streaming and non-streaming two-pass (U2) end-to-end model for speech recognition has shown great performance in terms of streaming capability, accuracy, real-time factor (RTF), and latency. In this paper, we present U2++, an enhanced version of U2 to further improve the accuracy. The core idea of U2++ is to use the forward and backward information of the labeling sequences at the same time during training to learn richer information, and to combine the forward and backward predictions at decoding to give more accurate results. We also proposed a new data augmentation...
The objective of this study was to determine falls risk profiles, derive a prediction score, and establish a simple and practical clinical screening tool for Chinese community-dwelling elderly individuals. This was a prospective cohort study (n = 619) among adults aged 60 years and older. Falls were ascertained at a 1-year follow-up appointment. Sociodemographic information, medical history, and physical performance data were collected. The mean age was 67.4 years; 57.7% were women. Female sex (odds ratios [ORs] 1.82; 95% confidence...
Automatically extracting relations between chemicals and diseases plays an important role in biomedical text mining. Chemical-disease relation (CDR) extraction aims at extracting complex semantic relationships between entities in documents, which contain both intrasentence and intersentence relations. Most previous methods did not consider dependency syntactic information across sentences, which is very valuable for the task, in particular for extracting intersentence relations accurately. In this paper, we propose a novel end-to-end neural network based on graph...
Daimeng Wei, Zhanglin Wu, Hengchao Shang, Zongyao Li, Minghan Wang, Jiaxin Guo, Xiaoyu Chen, Zhengzhe Yu, Hao Yang. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
Few-shot learning has been an established topic in natural images for years, but little work has attended to histology images, which are of high clinical value since well-labeled datasets and rare abnormal samples are expensive to collect. Here, we facilitate the study of few-shot learning in histology images by setting up three cross-domain tasks that simulate real clinical problems. To enable label-efficient learning and better generalizability, we propose to incorporate contrastive learning (CL) with latent augmentation (LA) to build a few-shot system. CL learns useful...
To determine whether combined performance-based models could exert better predictive values toward discriminating community-dwelling elderly people with a high risk of any-falls or recurrent-falls. This prospective cohort study included a total of 875 participants (mean age: 67.10±5.94 years), 513 females and 362 males, recruited from the Hangu suburb area of Tianjin, China. All participants completed comprehensive assessments. We documented sociodemographic information, behavioral characteristics, and medical...
With strong capabilities of reasoning and a generic understanding of the world, Large Language Models (LLMs) have shown great potential in building versatile embodied decision making agents capable of performing diverse tasks. However, when deployed to unfamiliar environments, we show that LLM agents face challenges in efficiently gathering necessary information, leading to suboptimal performance. On the other hand, in unfamiliar scenarios, human individuals often seek additional information from their peers before taking...
Zhanglin Wu, Zongyao Li, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Xiaoyu Chen, Zhiqiang Rao, Zhengzhe Yu, Jinlong Yang, Shaojun Yuhao Xie, Bin Jiawei Zheng, Ming Zhu, Lizhi Lei, Hao Yanfei Jiang. Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023). 2023.
Medical information carried by electronic medical records has high clinical application value, and named entity recognition is the key task for extracting valuable information from a large number of records. In order to realize intelligent identification of entities in Chinese electronic medical record texts, this paper analyzes the characteristics that affect recognition performance. The feature set is composed of language symbol features, part of speech features, context features, word boundary features, and an identifier feature. The template is designed, and Cascaded Conditional Random Field...
This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En->Multi system and 4.6...
This paper presents a study on strategies to enhance the translation capabilities of large language models (LLMs) in the context of machine translation (MT) tasks. The paper proposes a novel paradigm consisting of three stages: Secondary Pre-training using Extensive Monolingual Data, Continual Pre-training with Interlinear Text Format Documents, and Leveraging Source-Language Consistent Instruction for Supervised Fine-Tuning. Previous research on LLMs focused on various strategies for supervised fine-tuning (SFT), but their effectiveness has been...