- Speech Recognition and Synthesis
- Advanced Text Analysis Techniques
- Natural Language Processing Techniques
- Topic Modeling
- Sentiment Analysis and Opinion Mining
- Underwater Vehicles and Communication Systems
- Advanced Data Compression Techniques
- Speech and Audio Processing
- Text and Document Classification Technologies
- Indoor and Outdoor Localization Technologies
- Sparse and Compressive Sensing Techniques
Nanjing University
2023-2024
Didi Chuxing (China)
2021
Xi'an University of Technology
2021
This paper introduces a new open-source Mandarin speech corpus called DiDiSpeech. It consists of about 800 hours of speech data at a 48 kHz sampling rate from 6000 speakers, together with the corresponding texts. All speech data in the corpus is recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech, and automatic speech recognition. We conduct experiments with multiple speech tasks and evaluate the performance, showing that the corpus is promising for both academic research and practical application....
To address the low recognition rate of aspect-based sentiment analysis in Chinese, which is caused by its many processing steps, this paper proposes an improved BiLSTM-CRF model that combines character vectors with word position features. The model extracts attributes jointly and simultaneously, while also judging the polarity of words. Experiments show that it improves precision by 9.2% and 13.32%, recall by 0.48% and 21.29%, and F-measure by 7.33% and 15.74% compared with Conditional Random Fields (CRF) and Long Short-Term Memory (LSTM)...
In this paper, we describe our speech generation system for the first Audio Deep Synthesis Detection Challenge (ADD 2022). First, we build an any-to-many voice conversion (VC) system to convert source speech with arbitrary language content into the target speaker's fake speech. Then the converted speech generated by the VC system is post-processed in the time domain to improve its deception ability. The experimental results show that our system has adversarial ability against anti-spoofing detectors with a small compromise in audio quality and speaker...
One of the key challenges for wireless sensing systems is how to efficiently enable sensing capabilities for multiple devices while leveraging existing communication resources. In this paper, we propose DEWS, a distributed channel measurement scheme that allows multiple transmitters to perform sensing tasks simultaneously and that addresses three issues in sensing tasks: multi-device resolution, reliability, and accuracy. First, we use a carefully designed distributed Resource Unit (dRU) allocation based on OFDMA to ensure simultaneous sensing with the entire...
This paper describes our submission to ICASSP 2023 MUG Challenge Track 4, Keyphrase Extraction, which aims to extract the keyphrases most relevant to the conference theme from conference materials. We model the challenge as a single-class Named Entity Recognition task and develop techniques for better performance on the challenge. For data preprocessing, we encode the split keyphrases after word segmentation. In addition, we increase the amount of input information that the model can accept at one time by fusing multiple preprocessed sentences into...