Chengyi Wang

ORCID: 0000-0002-6780-9299
Research Areas
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Speech and Audio Processing
  • Remote-Sensing Image Classification
  • Natural Language Processing Techniques
  • Advanced Battery Materials and Technologies
  • Remote Sensing and Land Use
  • Remote Sensing and LiDAR Applications
  • Advancements in Battery Materials
  • Membrane-based Ion Separation Techniques
  • Topic Modeling
  • 3D Surveying and Cultural Heritage
  • Advanced Image Fusion Techniques
  • Membrane Separation Technologies
  • Remote Sensing in Agriculture
  • Advanced Image and Video Retrieval Techniques
  • Advanced Battery Technologies Research
  • Video Surveillance and Tracking Methods
  • Automated Road and Building Extraction
  • Synthetic Aperture Radar (SAR) Applications and Techniques
  • Infrared Target Detection Methodologies
  • Advanced Measurement and Detection Methods
  • Medical Image Segmentation Techniques
  • Robotics and Sensor-Based Localization
  • Air Quality and Health Impacts

Nankai University
2017-2025

Chinese Academy of Sciences
2015-2025

Yancheng Teachers University
2025

Shanghai Ninth People's Hospital
2025

Aerospace Information Research Institute
2020-2025

Shanghai Jiao Tong University
2022-2025

Tianjin Chengjian University
2024

Jiangsu University of Science and Technology
2008-2024

Shandong Agricultural University
2009-2024

State Key Laboratory of Digital Medical Engineering
2024

Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks. As the speech signal contains multi-faceted information including speaker identity, paralinguistics, spoken content, etc., learning universal representations for all speech tasks is challenging. To tackle the problem, we propose a new pre-trained model, WavLM, to solve full-stack downstream speech tasks. WavLM jointly learns masked speech prediction and denoising in pre-training. By this means,...

10.1109/jstsp.2022.3188113 article EN IEEE Journal of Selected Topics in Signal Processing 2022-07-04

We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called Vall-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than existing systems. Vall-E emerges in-context learning capabilities and can be used to synthesize...

10.48550/arxiv.2301.02111 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Continuous speech separation was recently proposed to deal with the overlapped speech in natural conversations. While it was shown to significantly improve the speech recognition performance for multichannel conversation transcription, its effectiveness has yet to be proven for a single-channel recording scenario. This paper examines the use of the Conformer architecture in lieu of recurrent neural networks for the separation model. Conformer allows the separation model to efficiently capture both local and global context information, which is helpful for speech separation. Experimental...

10.1109/icassp39728.2021.9413423 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.393 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Phase engineering to construct In₂S₃ heterophase junctions and abundant active boundaries and surfaces for efficient Pyro-PEC performance in CdS/In₂S₃.

10.1039/d4ta01455c article EN Journal of Materials Chemistry A 2024-01-01

A NiO/CNT composite was prepared by a solvothermal method. The composite was used as the air cathode for Li–CO₂ batteries, and displayed great stability and high catalytic activity.

10.1039/c7ta11015d article EN Journal of Materials Chemistry A 2018-01-01

Li‐CO₂ batteries are promising energy storage systems by utilizing CO₂ at the same time, though there are still some critical barriers before their practical application, such as high charging overpotential and poor cycling stability. In this work, iridium/carbon nanofibers (Ir/CNFs) are prepared via electrospinning and subsequent heat treatment, and are used as cathode catalysts for rechargeable Li‐CO₂ batteries. Benefitting from the unique porous network structure and the high activity of ultrasmall Ir nanoparticles, Ir/CNFs...

10.1002/smll.201800641 article EN Small 2018-06-07

Recently, there has been a strong push to transition from hybrid models to end-to-end (E2E) models for automatic speech recognition. Currently, there are three promising E2E methods: recurrent neural network transducer (RNN-T), RNN attention-based encoder-decoder (AED), and Transformer-AED. In this study, we conduct an empirical comparison of RNN-T, RNN-AED, and Transformer-AED models, in both non-streaming and streaming modes. We use 65 thousand hours of Microsoft anonymized training data to train these models. As more...

10.21437/interspeech.2020-2846 article EN Interspeech 2020 2020-10-25

Rechargeable Li-O₂ batteries have aroused much attention for their high energy density as a promising battery technology; however, the performance of the batteries is still unsatisfactory. Lithium anodes, as one of the most important parts of Li-O₂ batteries, play a vital role in improving the cycle life of the batteries. Now, a very simple method is introduced to produce a protective film on the lithium surface via chemical reactions between lithium metals and 1,4-dioxacyclohexane. The film is mainly composed of ethylene oxide monomers and endows the batteries with enhanced cycling...

10.1002/anie.201807985 article EN Angewandte Chemie International Edition 2018-08-07

Rechargeable Li−CO₂ batteries represent a novel approach towards clean recycling and utilization of CO₂. Nevertheless, there are still many obstacles to be overcome. Herein, we report the first application of an electrolyte redox mediator (LiBr) in rechargeable Li−CO₂ batteries. The electrochemical performances were found to be significantly improved, especially in terms of cyclic stability and rate capability. The proposed mechanism is that LiBr is beneficial for the formation of the desired morphology of discharge products....

10.1002/celc.201700539 article EN ChemElectroChem 2017-06-02

End-to-end speech translation poses a heavy burden on the encoder because it has to transcribe, understand, and learn cross-lingual semantics simultaneously. To obtain a powerful encoder, traditional methods pre-train it on ASR data to capture speech features. However, we argue that pre-training the encoder only through simple speech recognition is not enough, and high-level linguistic knowledge should be considered. Inspired by this, we propose a curriculum pre-training method that includes an elementary course for transcription learning and two advanced...

10.18653/v1/2020.acl-main.344 preprint EN cc-by 2020-01-01

End-to-end speech translation, a hot topic in recent years, aims to translate a segment of audio into a specific language with an end-to-end model. Conventional approaches employ multi-task learning and pre-training methods for this task, but they suffer from the huge gap between pre-training and fine-tuning. To address these issues, we propose a Tandem Connectionist Encoding Network (TCEN) which bridges the gap by reusing all subnets in fine-tuning, keeping the roles of subnets consistent, and pre-training the attention module. Furthermore, two simple...

10.1609/aaai.v34i05.6452 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

The speech representations learned from large-scale unlabeled data have shown better generalizability than those from supervised learning and thus attract a lot of interest to be applied for various downstream tasks. In this paper, we explore the limits of speech representations learned by different self-supervised objectives and datasets for automatic speaker verification (ASV), especially with a well-recognized SOTA ASV model, ECAPA-TDNN [1], as a downstream model. The representations from all hidden layers of the pre-trained model are firstly averaged with learnable weights and then fed...

10.1109/icassp43922.2022.9747814 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
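
The layer-averaging step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the learnable weights are normalized with a softmax before mixing, which is a common choice but is not stated in the truncated abstract; the names `softmax` and `weighted_layer_average` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def softmax(scores):
    """Turn unconstrained per-layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_average(layer_outputs, layer_scores):
    """Mix the hidden states of all layers into one feature sequence.

    layer_outputs: list of layers, each a list of frames, each a list of floats.
    layer_scores:  one learnable scalar per layer (assumed softmax-normalized).
    """
    weights = softmax(layer_scores)
    num_frames = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    mixed = [[0.0] * dim for _ in range(num_frames)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(num_frames):
            for d in range(dim):
                mixed[t][d] += weight * layer[t][d]
    return mixed


# Two layers, one frame, two feature dimensions; equal scores give a plain mean.
layers = [[[1.0, 2.0]], [[3.0, 4.0]]]
features = weighted_layer_average(layers, [0.0, 0.0])
```

In a trained system the per-layer scores would be optimized jointly with the downstream model, so the mix can favor whichever layers carry the most speaker information.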

Self-supervised learning (SSL) is a long-standing goal for speech processing, since it utilizes large-scale unlabeled data and avoids extensive human labeling. Recent years have witnessed great successes in applying self-supervised learning in speech recognition, while limited exploration was attempted in applying SSL for modeling speaker characteristics. In this paper, we aim to improve the existing SSL framework for speaker representation learning. Two methods are introduced for enhancing the unsupervised speaker information extraction. First, we apply...

10.1109/icassp43922.2022.9747077 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

The goal of self-supervised learning (SSL) for automatic speech recognition (ASR) is to learn good speech representations from a large amount of unlabeled speech for the downstream ASR task. However, most SSL frameworks do not consider noise robustness, which is crucial for real-world applications. In this paper we propose wav2vec-Switch, a method to encode noise robustness into contextualized representations of speech via contrastive learning. Specifically, we feed original-noisy speech pairs simultaneously into the wav2vec 2.0 network. In addition to the existing contrastive learning task, we switch the quantized...

10.1109/icassp43922.2022.9746929 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

A satellite image time series (SITS) contains a significant amount of temporal information. By analysing this type of data, the pattern of the changes in the object of concern can be explored. The natural change in the Earth’s surface is relatively slow and exhibits a pronounced pattern. Some natural events (for example, fires, floods, plant diseases, and insect pests) and human activities (for example, deforestation and urbanisation) will disturb this pattern and cause a relatively profound change on the Earth’s surface. These events are usually referred to as disturbances. However, disturbances...

10.3390/rs10030452 article EN cc-by Remote Sensing 2018-03-13

Lithium-air batteries have caught worldwide attention due to their extremely high theoretical energy density and are regarded as powerful competitors to replace traditional lithium ion batteries. However, it is rather critical how to maximize the capacity while keeping good cycling stability, which has impeded the practical applications of Li-air batteries for decades. Although admirable achievements have been made in recent years, there are still many unsolved issues in developing Li-air batteries. In this review, the challenges are pointed out...

10.1063/1.5091444 article EN cc-by APL Materials 2019-04-01