NFDI4DS | UHH-SEMS - Publication Details

Chen Zhang

ORCID: 0000-0002-4485-8434

About

Contact & Profiles

A5100374118

Research Areas

Natural Language Processing Techniques
Topic Modeling
Text Readability and Simplification
Speech Recognition and Synthesis
Speech and dialogue systems
Multimodal Machine Learning Applications
Ferroelectric and Negative Capacitance Devices
Advanced Sensor and Control Systems
Educational Technology and Pedagogy
Target Tracking and Data Fusion in Sensor Networks
Advanced Computational Techniques and Applications
Translation Studies and Practices
Legal Issues in Education
Music and Audio Processing
Time Series Analysis and Forecasting
Advanced Algorithms and Applications
Advanced Decision-Making Techniques
Academic integrity and plagiarism
Neural Networks and Applications
linguistics and terminology studies
Gaussian Processes and Bayesian Inference

Gansu Institute of Political Science and Law
2014-2023

Xidian University
2023

Peking University
2021

Zhejiang University
2020

Microsoft Research (United Kingdom)
2020

Beijing University of Chemical Technology
2010

Michigan State University
2010

Technical University of Darmstadt
2009

SimulSpeech: End-to-End Simultaneous Speech to Text Translation

OPENALEX - Publications

Yi Ren Jinglin Liu Xu Tan Chen Zhang Tao Qin and 2 more

In this work, we develop SimulSpeech, an end-to-end simultaneous speech to text translation system which translates speech in the source language to text in the target language concurrently. SimulSpeech consists of a speech encoder, a speech segmenter and a text decoder, where 1) the segmenter builds upon the encoder and leverages a connectionist temporal classification (CTC) loss to split the input streaming speech in real time, 2) the encoder-decoder attention adopts a wait-k strategy for simultaneous translation. SimulSpeech is more challenging than previous cascaded systems (with simultaneous automatic speech recognition (ASR)...

10.18653/v1/2020.acl-main.350 article EN cc-by 2020-01-01

Exploring Intrinsic Alignments Within Text Corpus

Zi Liang Pinghui Wang Ruofei Zhang Haibo Hu Shuo Zhang and 5 more

Recent years have witnessed rapid advancements in the safety alignments of large language models (LLMs). Methods such as supervised instruction fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) have thus emerged as vital components in constructing LLMs. While these methods achieve robust and fine-grained alignment to values, their practical application is still hindered by high annotation costs and incomplete alignments. Besides, the intrinsic values within training corpora have not been fully...

10.1609/aaai.v39i26.34957 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation

Jinglin Liu Yi Ren Xu Tan Chen Zhang Tao Qin and 2 more

Non-autoregressive translation (NAT) achieves faster inference speed but at the cost of worse accuracy compared with autoregressive translation (AT). Since AT and NAT can share model structure and AT is an easier task than NAT due to the explicit dependency on previous target-side tokens, a natural idea is to gradually shift the model training from the easier AT task to the harder NAT task. To smooth the shift from AT training to NAT training, in this paper, we introduce semi-autoregressive translation (SAT) as intermediate tasks. SAT contains a hyperparameter k, and each k value defines a SAT task with different degrees...

10.24963/ijcai.2020/534 preprint EN 2020-07-01

Neural Quality Estimation Based on Multiple Hypotheses Interaction and Self-Attention for Grammatical Error Correction

Chen Zhang Tongjie Xu Guangli Wu

The English grammatical error correction system is suitable for the learning environment, with the goal of accurately correcting errors in learners' writing. However, false corrections are often generated in practical applications, and many errors cannot be corrected, thus misleading learners. The quality estimation model is beneficial to ensure that learners obtain accurate correction results and avoid misleading sentences caused by false corrections. Grammatical error correction models can generate multiple hypotheses of higher quality, but existing quality estimation models do not...

10.1109/access.2023.3239693 article EN cc-by-nc-nd IEEE Access 2023-01-01

Extract, Integrate, Compete: Towards Verification Style Reading Comprehension

Chen Zhang Yuxuan Lai Yansong Feng Dongyan Zhao

In this paper, we present a new verification style reading comprehension dataset named VGaokao from Chinese Language tests of Gaokao. Different from existing efforts, the new dataset is originally designed for native speakers’ evaluation, thus requiring more advanced language understanding skills. To address the challenges in VGaokao, we propose a novel Extract-Integrate-Compete approach, which iteratively selects complementary evidence with a novel query updating mechanism and adaptively distills supportive evidence,...

10.18653/v1/2021.findings-emnlp.255 preprint EN cc-by 2021-01-01

MC2: Towards Transparent and Culturally-Aware NLP for Minority Languages in China

Chen Zhang Mingxu Tao Quzhe Huang Jiuheng Lin Zhibin Chen and 1 more

10.18653/v1/2024.acl-long.479 article EN 2024-01-01

Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation

Jinglin Liu Yi Ren Xu Tan Chen Zhang Tao Qin and 2 more

10.48550/arxiv.2007.08772 preprint EN other-oa arXiv (Cornell University) 2020-01-01

MC^2: A Multilingual Corpus of Minority Languages in China

Chen Zhang Mingxu Tao Quzhe Huang Jiuheng Lin Zhibin Chen and 1 more

Large-scale corpora play a vital role in the construction of large language models (LLMs). However, existing LLMs exhibit limited abilities in understanding low-resource languages, including the minority languages in China, due to a lack of training data. To improve the accessibility of these languages, we present MC^2, a Multilingual Corpus of Minority Languages in China, which is the largest open-source corpus so far. It encompasses four underrepresented languages, i.e., Tibetan, Uyghur, Kazakh in the Kazakh Arabic script, and Mongolian in the traditional Mongolian script. Notably,...

10.48550/arxiv.2311.08348 preprint EN cc-by arXiv (Cornell University) 2023-01-01

MiniDisc: Minimal Distillation Schedule for Language Model Compression

Chen Zhang Yang Yang Qifan Wang Jiahao Liu Jingang Wang and 2 more

Recent studies have uncovered that language model distillation is less effective when facing a large capacity gap between the teacher and the student, and introduced teacher assistant-based distillation to bridge the gap. As a connection, the scale and the performance of the teacher assistant is of vital importance to bring the knowledge from the teacher to the student. However, existing teacher assistant-based methods require maximally many trials before scheduling an optimal teacher assistant. To this end, we propose a minimal distillation schedule (MiniDisc) for scheduling the optimal teacher assistant in minimally one trial. In particular, motivated by the finding...

10.48550/arxiv.2205.14570 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Chaotic Time Series Prediction Algorithm for Lorenz System

Chen Zhang

Based on the glowworm swarm optimization (GSO) and BP neural network (BPNN), an algorithm for BP neural network optimized by glowworm swarm optimization (GSOBPNN) is proposed. In the algorithm, GSO is used to generate better network initial thresholds and weights so as to compensate for the random defects of BPNN, thus it can make BPNN have faster convergence and greater learning ability. The efficiency of the proposed prediction method is tested by the simulation of the chaotic time series generated by the Lorenz system. The simulation results show that the proposed method has higher forecasting accuracy compared with...

10.4028/www.scientific.net/amm.513-517.2412 article EN Applied Mechanics and Materials 2014-02-06

Towards the Robust Small-Perturbation Stability Region of Natural Language Inference

Chunyi Li Xiaobing Wang Liang Zhao Xinmin Duan Chen Zhang

Natural language inference (NLI) has the intention to infer a hypothesis from the premise, and strictly faithful inference results depend on neural networks with anti-interference ability. To improve the stability of the inference process, we initialize and optimize adversarial examples based on both distance minimization and embedding similarity maximization, where the adversarial examples outside the stability region are usually constructed with small perturbations. In specific, the ideal candidate set of alternative words is obtained by efficient pruning, and the adversarial example is forced to lie close...

10.2139/ssrn.4453305 preprint EN 2023-01-01

Study on the Network Assisted Teaching of the National Outstanding Course Polymer Physics

Chen Zhang Hangquan Li Sizhu Wu

With the construction and promotion of Chinese national outstanding courses, high-quality network curricula are urgently needed. This article discusses the application of the network platform of the course "Polymer Physics" in teaching practice, which includes syllabus design, statistical analysis of the different modules, and improvements in the future.

10.1109/icee.2010.1383 article EN International Conference on E-Business and E-Government 2010-05-01

UWSpeech: Speech to Speech Translation for Unwritten Languages

Chen Zhang Xu Tan Yi Ren Tao Qin Kejun Zhang and 1 more

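Existing speech to speech translation systems heavily rely on the text of target language: they usually translate source language either to target text and then synthesize target speech from text, or directly to target speech with target text for auxiliary training. However, those methods cannot be applied to unwritten target languages, which have no written text or phoneme available. In this paper, we develop a translation system for unwritten languages, named as UWSpeech, which converts target unwritten speech into target discrete tokens with a converter, and then translates source-language speech into target discrete tokens with a translator, and finally synthesizes target speech from target discrete tokens with an inverter. We propose a method called...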

10.48550/arxiv.2006.07926 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Region velocity estimation and visulization with seurat v1

Chen Zhang

Example pipeline of region velocity estimation and visualization with the steady-state model and the dynamical model using the EM algorithm in R, with Seurat to pretreat scRNA-seq data

10.17504/protocols.io.b8kbrusn preprint EN 2022-05-02

Extract, Integrate, Compete: Towards Verification Style Reading Comprehension

Chen Zhang Yuxuan Lai Yansong Feng Dongyan Zhao

In this paper, we present a new verification style reading comprehension dataset named VGaokao from Chinese Language tests of Gaokao. Different from existing efforts, the new dataset is originally designed for native speakers' evaluation, thus requiring more advanced language understanding skills. To address the challenges in VGaokao, we propose a novel Extract-Integrate-Compete approach, which iteratively selects complementary evidence with a novel query updating mechanism and adaptively distills supportive evidence,...

10.48550/arxiv.2109.05149 preprint EN cc-by arXiv (Cornell University) 2021-01-01

ESL Student Perspectives on Problems and Solutions for Academic Integrity

Jim C. Hu Chen Zhang

While technology has made information readily available to university students, many of them have no sound understanding of how to use the sources properly, especially ESL students (Löfström & Kupila, 2013). When they use others’ ideas, text, or work without crediting the sources, they may commit either intentional or involuntary plagiarism (Camara et al, 2017). When they reuse a submitted assignment for another course improperly, they may commit self-plagiarism (APA Style, 2019). However, rather than simply punishing plagiarism,...

10.55016/ojs/cpai.v4i2.74168 article EN Canadian Perspectives on Academic Integrity 2021-12-30
