Zhirui Zhang

ORCID: 0000-0003-1385-3742
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Text Readability and Simplification
  • Data Quality and Management
  • Music and Audio Processing
  • Lightning and Electromagnetic Phenomena
  • Power Systems and Technologies
  • Power System Reliability and Maintenance
  • High-Voltage Power Transmission Systems
  • Heavy metals in environment
  • Electric Power System Optimization
  • Machine Learning and Data Classification
  • Privacy-Preserving Technologies in Data
  • Hate Speech and Cyberbullying Detection
  • Adversarial Robustness in Machine Learning
  • Coastal wetland ecosystem dynamics
  • Smart Grid and Power Systems
  • Color perception and design
  • Intelligent Tutoring Systems and Adaptive Learning
  • Energy Load and Power Forecasting
  • Software Engineering Research
  • Advanced Vision and Imaging

Tongji University
2023-2025

Shanghai Normal University
2025

ShanghaiTech University
2024

Tencent (China)
2022-2024

Changchun University of Science and Technology
2024

Heilongjiang University of Chinese Medicine
2024

University of Science and Technology of China
2018-2023

Xi'an University of Architecture and Technology
2023

Dalian Maritime University
2023

North China University of Science and Technology
2023

Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this paper, we first address the problem of how to define and accurately measure human parity in translation. We then describe Microsoft's machine translation system and measure the quality of its translations on the widely used WMT 2017 news translation task from Chinese to English....

10.48550/arxiv.1803.05567 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare performance with commercial...

10.18653/v1/2023.emnlp-main.1036 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Language style transferring rephrases text with specific stylistic attributes while preserving the original attribute-independent content. One main challenge in learning a style transfer system is the lack of parallel data, where the source sentence is in one style and the target sentence in another style. With this constraint, in this paper, we adapt unsupervised machine translation methods for the task of automatic style transfer. We first take advantage of style-preference information and word embedding similarity to produce pseudo-parallel data with a statistical machine translation (SMT)...

10.48550/arxiv.1808.07894 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Monolingual data have been demonstrated to be helpful in improving the translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for NMT by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial models pre-trained on each...

10.1609/aaai.v32i1.11248 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-25
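The joint training scheme in the abstract above can be illustrated with a toy sketch: each translation direction back-translates monolingual text of its target language into pseudo-parallel pairs for training the opposite direction. The `translate_s2t`/`translate_t2s` functions below are hypothetical stand-ins (simple token reversal), not the paper's NMT models, and the actual method additionally weights pseudo pairs inside an EM loop.

```python
def translate_s2t(sentence):
    # Hypothetical source-to-target "model": a mock token reversal.
    return " ".join(reversed(sentence.split()))

def translate_t2s(sentence):
    # Hypothetical target-to-source "model": a mock token reversal.
    return " ".join(reversed(sentence.split()))

def make_pseudo_parallel(mono_target, t2s):
    # Back-translate monolingual target sentences into pseudo sources.
    return [(t2s(t), t) for t in mono_target]

def joint_round(mono_source, mono_target):
    # One EM-style round: each direction generates training data
    # (pseudo-parallel pairs) for the other direction.
    s2t_corpus = make_pseudo_parallel(mono_target, translate_t2s)
    t2s_corpus = [(translate_s2t(s), s) for s in mono_source]
    return s2t_corpus, t2s_corpus

s2t_data, t2s_data = joint_round(["a b c"], ["x y z"])
```

In the real system each round would re-train both NMT models on the fresh pseudo-parallel corpora before the next round of back-translation.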

Xin Zheng, Zhirui Zhang, Junliang Guo, Shujian Huang, Boxing Chen, Weihua Luo, Jiajun Chen. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2021.

10.18653/v1/2021.acl-short.47 article EN cc-by 2021-01-01

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed as inputs to the model and can be quickly amplified, harming subsequent generation. To address this issue, we propose a novel regularization method for NMT training, which aims to improve the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L)...

10.1609/aaai.v33i01.3301443 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Recent research has proven that syntactic knowledge is effective in improving the performance of neural machine translation (NMT). Most previous work focuses on leveraging either source or target syntax in the recurrent neural network (RNN) based encoder–decoder model. In this paper, we simultaneously use both source and target dependency trees in the NMT model. First, we propose a simple but effective syntax-aware encoder to incorporate the source dependency tree into NMT. The new encoder enriches each source state with dependence relations from the tree. Then, we propose a novel sequence-to-dependence...

10.1109/taslp.2018.2855968 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-07-13

Without a real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for model training. However, due to weak supervision, the pseudo data inevitably contain noises and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularizations...

10.1609/aaai.v33i01.3301241 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in the low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between the transferor (the parent model) and the transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages...

10.1609/aaai.v34i01.5341 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1141 article EN cc-by 2018-01-01

In this paper, we propose to formulate the task-oriented dialogue system as a purely natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify complicated delexicalization preprocessing. However, directly applying this method heavily suffers from the entity inconsistency caused by the removal of delexicalized tokens, as well as the catastrophic forgetting problem of the pre-trained model during fine-tuning, leading to unsatisfactory performance. To alleviate these problems, we design a novel...

10.1145/3477495.3531920 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

While large-scale pre-trained language models such as BERT have achieved great success on various natural language understanding tasks, how to efficiently and effectively incorporate them into sequence-to-sequence models and the corresponding text generation tasks remains a non-trivial problem. In this paper, we propose to address this problem by taking two different BERT models as the encoder and decoder respectively, and fine-tuning them by introducing simple and lightweight adapter modules, which are inserted between BERT layers and tuned on the task-specific dataset....

10.48550/arxiv.2010.06138 preprint EN other-oa arXiv (Cornell University) 2020-01-01
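As a rough illustration of the adapter idea in the abstract above, the sketch below implements a bottleneck adapter in plain Python: a down-projection, a nonlinearity, an up-projection, and a residual connection. The class name, sizes, and initialization are illustrative assumptions, not the paper's implementation (which inserts such modules between BERT layers and trains only the adapter parameters).

```python
import random

def matvec(W, x):
    # Plain-Python matrix-vector product.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    plus a residual connection around the whole module."""
    def __init__(self, hidden, bottleneck, rng):
        self.W_down = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                       for _ in range(bottleneck)]
        self.W_up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)]
                     for _ in range(hidden)]

    def __call__(self, h):
        z = relu(matvec(self.W_down, h))   # hidden -> bottleneck
        out = matvec(self.W_up, z)         # bottleneck -> hidden
        return [h_i + o_i for h_i, o_i in zip(h, out)]  # residual

rng = random.Random(0)
adapter = Adapter(hidden=4, bottleneck=2, rng=rng)
h_new = adapter([1.0, -0.5, 0.3, 0.0])  # same dimensionality as input
```

Because of the residual connection, an adapter whose up-projection is zero passes its input through unchanged, which is why such modules can be inserted into a frozen pre-trained network without disturbing it at initialization.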

To optimize the colors used in cultural and creative products, this paper proposes a color matching design method that considers image and visual aesthetics. First, 99 color samples are identified based on Chinese traditional colors, and user preferences for 30 semantic terms are measured by the semantic differential method. This leads to six factors being extracted through factor analysis. Second, a quantitative analysis of visual aesthetics is applied, and formulas for calculating harmony, balance, and symmetry are derived. On this basis, an interactive...

10.1016/j.heliyon.2022.e10768 article EN cc-by-nc-nd Heliyon 2022-09-01

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare performance with commercial...

10.48550/arxiv.2304.02210 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Large language models have demonstrated a surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed from a few input-output examples. However, prior research has shown that in-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. Therefore, the construction of an appropriate prompt is essential for improving the performance of in-context learning. In this paper, we revisit this problem...

10.48550/arxiv.2303.13217 preprint EN cc-by arXiv (Cornell University) 2023-01-01
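A common heuristic for the prompt-construction problem raised in the abstract above is to retrieve demonstrations similar to the test input. The sketch below does this with a bag-of-words cosine similarity; the function names and the most-similar-last ordering are illustrative assumptions, not the paper's method.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    # Bag-of-words cosine similarity between two strings.
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(pool, query, k=2):
    # Pick the k pool examples most similar to the query and place
    # them most-similar-last, a common ordering heuristic.
    ranked = sorted(pool, key=lambda ex: cosine(ex[0], query))
    demos = ranked[-k:]
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

pool = [("good movie", "pos"), ("bad film", "neg")]
prompt = build_prompt(pool, "great good movie", k=1)
```

In practice dense retrievers replace the bag-of-words similarity, but the overall shape of the prompt (retrieved demonstrations followed by the query) stays the same.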

To protect privacy and meet legal regulations, federated learning (FL) has gained significant attention for training speech-to-text (S2T) systems, including automatic speech recognition (ASR) and speech translation (ST). However, the commonly used FL approach (i.e., FEDAVG) in S2T tasks typically suffers from extensive communication overhead due to multi-round interactions based on the whole model, and performance degradation caused by data heterogeneity among clients. To address these issues, we propose a...

10.1109/icassp48485.2024.10447662 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
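For context on the FEDAVG baseline that the abstract above builds on, its server-side step can be sketched as a data-size-weighted average of client parameters. This is a minimal illustration of standard FedAvg, not the federated S2T method the paper proposes; model parameters are flattened into plain lists for simplicity.

```python
def fedavg(client_weights, client_sizes):
    """Aggregate client parameters weighted by local data size,
    the core server-side step of FEDAVG."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    agg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            agg[i] += (n / total) * w[i]
    return agg

# Two clients holding 1 and 3 examples respectively.
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])  # -> [2.5, 3.5]
```

Each communication round repeats this aggregation over the full model, which is exactly the per-round overhead the paper identifies as a bottleneck for large S2T models.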

Generative Adversarial Network (GAN) has been proposed to tackle the exposure bias problem of Neural Machine Translation (NMT). However, the discriminator typically results in instability of GAN training due to the inadequate training problem: the search space is so huge that sampled translations are not sufficient for discriminator training. To address this issue and stabilize GAN training, in this paper, we propose a novel Bidirectional Generative Adversarial Network for NMT (BGAN-NMT), which aims to introduce a generator model to act as the discriminator, whereby the discriminator naturally considers the entire...

10.18653/v1/k18-1019 article EN cc-by 2018-01-01

Although the sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a performance gain comparable to other state-of-the-art methods, such as stack-LSTM and head selection. In this paper, we propose a stack-based multi-layer attention model for seq2seq learning to better leverage structural linguistics information. In our method, two binary vectors are used...

10.18653/v1/d17-1175 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01