- Speech Recognition and Synthesis
- Speech and Audio Processing
- Natural Language Processing Techniques
- Music and Audio Processing
- Topic Modeling
- Air Quality Monitoring and Forecasting
- Industrial Technology and Control Systems
- Robot Manipulation and Learning
- Advanced Sensor Technologies Research
- Air Quality and Health Impacts
- Atmospheric chemistry and aerosols
- Visual Attention and Saliency Detection
- Advanced Power Generation Technologies
- Evolutionary Algorithms and Applications
- Mechanical and Thermal Properties Analysis
- Autoimmune and Inflammatory Disorders Research
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Regional Development and Environment
- Advanced SAR Imaging Techniques
- Time Series Analysis and Forecasting
- Color perception and design
- Reservoir Engineering and Simulation Methods
- Lysosomal Storage Disorders Research
- Robotic Path Planning Algorithms
Sichuan University
2023-2025
Qingdao Huanghai University
2018-2024
University College London
2019-2024
Beijing Jiaotong University
2022-2023
Shanghai Normal University
2020-2023
Army Medical University
2023
Northwestern Polytechnical University
2021
National University of Singapore
2020
China Nonferrous Metal Mining (China)
2012
China National Petroleum Corporation (China)
2003
Unmanned surface vehicles (USVs) have witnessed rapid growth in the recent decade and have been applied in various practical applications in both military and civilian domains. USVs can be deployed either as a single unit or as a multiple-vehicle fleet to conduct ocean missions. Central to the control of USV formations, path planning is the key technology that ensures navigation safety by generating collision-free trajectories. Compared with conventional algorithms, deep reinforcement learning (RL) based algorithms provide...
Code-switching (CS) occurs when a speaker alternates words of two or more languages within a single sentence or across sentences. Automatic speech recognition (ASR) of CS speech has to deal with two or more languages at the same time. In this study, we propose a Transformer-based architecture with symmetric language-specific encoders to capture individual language attributes that improve the acoustic representation of each language. These representations are combined using a multi-head attention mechanism in the decoder module. Each encoder and its...
Weitai Zhang, Zhongyi Ye, Haitao Tang, Xiaoxi Li, Xinyuan Zhou, Jing Yang, Jianwei Cui, Pan Deng, Mohan Shi, Yifan Song, Dan Liu, Junhua Liu, Lirong Dai. Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022). 2022.
Niemann-Pick disease type C (NP-C) is a genetic lysosomal disorder associated with progressive neurodegenerative phenotypes. Its therapeutic options are very limited. Here, we show that lithium treatment improves ataxia and feeding phenotypes, attenuates cerebellar inflammation and degeneration, and extends survival in Npc1 mouse models. In addition, lithium suppresses STING activation, SREBP2 processing to its mature form, and the expression of the target genes in Npc1 mice and in Npc1-deficient fibroblasts. Lithium impedes...
End-to-end approaches for single-channel target speech extraction have attracted widespread attention. However, studies of multi-channel target speech extraction are still relatively limited. In this work, we propose two methods for exploiting spatial information to extract the target speech. The first uses an adaptation layer in a parallel encoder architecture. The second designs a channel decorrelation mechanism that uses inter-channel differential information to enhance the representation. We compare the proposed methods with strong state-of-the-art baselines....
Large language models (LLMs) show promise for natural language tasks but struggle when applied directly to complex domains like finance. LLMs have difficulty reasoning about and integrating all relevant information. We propose a data-centric approach to enable LLMs to better handle financial tasks. Our key insight is that rather than overloading the LLM with everything at once, it is more effective to preprocess and pre-understand the data. We create a financial LLM (FLLM) using multitask prompt-based finetuning to achieve data pre-processing...
The end-to-end paradigm has recently attracted more and more interest and attention for improving speech-to-text translation (ST). Existing end-to-end models mainly attempt to address the problems of modeling burden and data scarcity, but often fail to maintain both cross-modal and cross-lingual mapping well at the same time. In this work, we investigate methods for improving end-to-end ST with pre-trained acoustic-and-textual models. Our acoustic encoder decoder begins processing the source speech sequence as usual. A...
The Transformer has shown impressive performance in automatic speech recognition. It uses an encoder-decoder structure with self-attention to learn the relationship between the high-level representation of the source inputs and the embedding of the target outputs. In this paper, we propose a novel decoder structure that features a self-and-mixed attention decoder (SMAD) with a deep acoustic structure (DAS) to improve Transformer-based LVCSR. Specifically, we introduce a self-attention mechanism to learn a multi-layer deep acoustic structure for multiple levels of acoustic abstraction. We also design a mixed attention mechanism that learns the alignment...
PM2.5 concentration forecasting is important yet challenging. First, complicated local fluctuations in PM2.5 concentrations disturb the modeling of global trends. Second, errors are often accumulated through an autoregressive process. To contend with the two challenges, we propose a Category Guidance based PM2.5 sequence Forecasting training framework...
This paper describes the submissions of the research group USTC-NELSLIP to the 2023 IWSLT Offline Speech Translation competition, which involves translating spoken English into written Chinese. We utilize both cascaded models and end-to-end models for this task. To improve the performance of the cascaded models, we introduce Whisper to reduce errors in the intermediate source language text, achieving a significant improvement in ASR recognition performance. For end-to-end models, we propose Stacked Acoustic-and-Textual Encoding extension (SATE-ex), which feeds...
This study assessed human papillomavirus (HPV) vaccination knowledge, willingness, and status among University of Nottingham Ningbo undergraduate students, utilizing the Theory of Planned Behaviour (TPB) and the Health Belief Model (HBM). Self-administered questionnaires covered demographics, sexual behavior, and factors influencing vaccination intentions. Quantitative and qualitative analyses included t-tests, ANOVA, Pearson correlation, logistic regression, and linear regression. Of the 373 students surveyed, the HPV vaccination rate was notably higher...
Modeling substitutable and complementary item relationships is a fundamental and important topic for recommendation in e-commerce online scenarios. In the real world, item relationships are usually coupled and heterogeneous, and they also have abundant side information and hierarchical data structures. Recently, to take full advantage of both sides topological structure, graph neural networks have been widely explored for relationship modeling. However, existing methods decouple the relationships only crudely. Their model designs lack deep insight...
Domain mismatch is a noteworthy issue in acoustic event detection tasks, as the target domain data is difficult to access in most real applications. In this study, we propose a novel CNN-based discriminative training framework as a domain compensation method to handle this issue. It uses parallel discriminators to learn a pair of high-level intermediate representations. Together with a binary loss, the discriminators are forced to maximally exploit the discrimination of heterogeneous information in each audio clip with events, which results in robust...