Zhirui Zhang

ORCID: 0000-0003-1385-3742
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Text Readability and Simplification
  • Data Quality and Management
  • Music and Audio Processing
  • Lightning and Electromagnetic Phenomena
  • Power Systems and Technologies
  • Power System Reliability and Maintenance
  • High-Voltage Power Transmission Systems
  • Heavy metals in environment
  • Electric Power System Optimization
  • Machine Learning and Data Classification
  • Privacy-Preserving Technologies in Data
  • Hate Speech and Cyberbullying Detection
  • Adversarial Robustness in Machine Learning
  • Coastal wetland ecosystem dynamics
  • Smart Grid and Power Systems
  • Color perception and design
  • Intelligent Tutoring Systems and Adaptive Learning
  • Energy Load and Power Forecasting
  • Software Engineering Research
  • Advanced Vision and Imaging

Tongji University
2023-2025

Shanghai Normal University
2025

ShanghaiTech University
2024

Tencent (China)
2022-2024

Changchun University of Science and Technology
2024

Heilongjiang University of Chinese Medicine
2024

University of Science and Technology of China
2018-2023

Xi'an University of Architecture and Technology
2023

Dalian Maritime University
2023

North China University of Science and Technology
2023

Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this paper, we first address the problem of how to define and accurately measure human parity in translation. We then describe Microsoft's machine translation system and measure the quality of its translations on the widely used WMT 2017 news translation task from Chinese to English....

10.48550/arxiv.1803.05567 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare performance with commercial...

10.18653/v1/2023.emnlp-main.1036 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Language style transferring rephrases text with specific stylistic attributes while preserving the original attribute-independent content. One main challenge in learning a style transfer system is the lack of parallel data, where the source sentence is in one style and the target sentence in another style. With this constraint, in this paper, we adapt unsupervised machine translation methods for the task of automatic style transfer. We first take advantage of style-preference information and word embedding similarity to produce pseudo-parallel data with a statistical machine translation (SMT)...

10.48550/arxiv.1808.07894 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Monolingual data have been demonstrated to be helpful in improving the translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for NMT by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial models pre-trained on each...

10.1609/aaai.v32i1.11248 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-25
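The joint training scheme in the abstract above can be illustrated with a toy sketch: each translation direction back-translates monolingual text of its target language into pseudo-parallel pairs for training the opposite direction. The `translate_s2t`/`translate_t2s` functions below are hypothetical stand-ins (simple token reversal), not the paper's NMT models, and the actual method additionally weights pseudo pairs inside an EM loop.

```python
def translate_s2t(sentence):
    # Hypothetical source-to-target "model": a mock token reversal.
    return " ".join(reversed(sentence.split()))

def translate_t2s(sentence):
    # Hypothetical target-to-source "model": a mock token reversal.
    return " ".join(reversed(sentence.split()))

def make_pseudo_parallel(mono_target, t2s):
    # Back-translate monolingual target sentences into pseudo sources.
    return [(t2s(t), t) for t in mono_target]

def joint_round(mono_source, mono_target):
    # One EM-style round: each direction generates training data
    # (pseudo-parallel pairs) for the other direction.
    s2t_corpus = make_pseudo_parallel(mono_target, translate_t2s)
    t2s_corpus = [(translate_s2t(s), s) for s in mono_source]
    return s2t_corpus, t2s_corpus

s2t_data, t2s_data = joint_round(["a b c"], ["x y z"])
```

In the real system each round would re-train both NMT models on the fresh pseudo-parallel corpora before the next round of back-translation.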

Xin Zheng, Zhirui Zhang, Junliang Guo, Shujian Huang, Boxing Chen, Weihua Luo, Jiajun Chen. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2021.

10.18653/v1/2021.acl-short.47 article EN cc-by 2021-01-01

Although Neural Machine Translation (NMT) has achieved remarkable progress in the past several years, most NMT systems still suffer from a fundamental shortcoming shared with other sequence generation tasks: errors made early in the generation process are fed as inputs to the model and can be quickly amplified, harming subsequent generation. To address this issue, we propose a novel regularization method for NMT training, which aims to improve the agreement between translations generated by left-to-right (L2R) and right-to-left (R2L)...

10.1609/aaai.v33i01.3301443 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Recent research has proven that syntactic knowledge is effective in improving the performance of neural machine translation (NMT). Most previous work focuses on leveraging either source or target syntax in the recurrent neural network (RNN) based encoder–decoder model. In this paper, we simultaneously use both source and target dependency trees in the NMT model. First, we propose a simple but effective syntax-aware encoder to incorporate the source dependency tree into NMT. The new encoder enriches each source state with dependence relations from the tree. Then, we propose a novel sequence-to-dependence...

10.1109/taslp.2018.2855968 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-07-13

Without a real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for model training. However, due to weak supervision, the pseudo data inevitably contain noises and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularizations...

10.1609/aaai.v33i01.3301241 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in the low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between the transferor (the parent model) and the transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages...

10.1609/aaai.v34i01.5341 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Duyu Tang, Nan Duan, Zhao Yan, Zhirui Zhang, Yibo Sun, Shujie Liu, Yuanhua Lv, Ming Zhou. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1141 article EN cc-by 2018-01-01

In this paper, we propose to formulate the task-oriented dialogue system as a purely natural language generation task, so as to fully leverage large-scale pre-trained models like GPT-2 and simplify complicated delexicalization preprocessing. However, directly applying this method heavily suffers from the entity inconsistency caused by the removal of delexicalized tokens, as well as the catastrophic forgetting problem of the pre-trained model during fine-tuning, leading to unsatisfactory performance. To alleviate these problems, we design a novel...

10.1145/3477495.3531920 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

While large-scale pre-trained language models such as BERT have achieved great success on various natural language understanding tasks, how to efficiently and effectively incorporate them into sequence-to-sequence models and the corresponding text generation tasks remains a non-trivial problem. In this paper, we propose to address this problem by taking two different BERT models as the encoder and decoder respectively, and fine-tuning them by introducing simple and lightweight adapter modules, which are inserted between BERT layers and tuned on the task-specific dataset....

10.48550/arxiv.2010.06138 preprint EN other-oa arXiv (Cornell University) 2020-01-01
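As a rough illustration of the adapter idea in the abstract above, the sketch below implements a bottleneck adapter in plain Python: a down-projection, a nonlinearity, an up-projection, and a residual connection. The class name, sizes, and initialization are illustrative assumptions, not the paper's implementation (which inserts such modules between BERT layers and trains only the adapter parameters).

```python
import random

def matvec(W, x):
    # Plain-Python matrix-vector product.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    plus a residual connection around the whole module."""
    def __init__(self, hidden, bottleneck, rng):
        self.W_down = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                       for _ in range(bottleneck)]
        self.W_up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)]
                     for _ in range(hidden)]

    def __call__(self, h):
        z = relu(matvec(self.W_down, h))   # hidden -> bottleneck
        out = matvec(self.W_up, z)         # bottleneck -> hidden
        return [h_i + o_i for h_i, o_i in zip(h, out)]  # residual

rng = random.Random(0)
adapter = Adapter(hidden=4, bottleneck=2, rng=rng)
h_new = adapter([1.0, -0.5, 0.3, 0.0])  # same dimensionality as input
```

Because of the residual connection, an adapter whose up-projection is zero passes its input through unchanged, which is why such modules can be inserted into a frozen pre-trained network without disturbing it at initialization.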

To optimize the colors used in cultural and creative products, this paper proposes a color matching design method that considers image and visual aesthetics. First, 99 color samples are identified based on Chinese traditional colors, and user preferences for 30 semantic terms are measured by the semantic differential method. This leads to six factors being extracted through factor analysis. Second, a quantitative analysis of visual aesthetics is applied, and formulas for calculating harmony, balance, and symmetry are derived. On this basis, an interactive...

10.1016/j.heliyon.2022.e10768 article EN cc-by-nc-nd Heliyon 2022-09-01

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare performance with commercial...

10.48550/arxiv.2304.02210 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Large language models have demonstrated a surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed from a few input-output examples. However, prior research has shown that in-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. Therefore, the construction of an appropriate prompt is essential for improving the performance of in-context learning. In this paper, we revisit this problem...

10.48550/arxiv.2303.13217 preprint EN cc-by arXiv (Cornell University) 2023-01-01
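A common heuristic for the prompt-construction problem raised in the abstract above is to retrieve demonstrations similar to the test input. The sketch below does this with a bag-of-words cosine similarity; the function names and the most-similar-last ordering are illustrative assumptions, not the paper's method.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    # Bag-of-words cosine similarity between two strings.
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(pool, query, k=2):
    # Pick the k pool examples most similar to the query and place
    # them most-similar-last, a common ordering heuristic.
    ranked = sorted(pool, key=lambda ex: cosine(ex[0], query))
    demos = ranked[-k:]
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

pool = [("good movie", "pos"), ("bad film", "neg")]
prompt = build_prompt(pool, "great good movie", k=1)
```

In practice dense retrievers replace the bag-of-words similarity, but the overall shape of the prompt (retrieved demonstrations followed by the query) stays the same.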

To protect privacy and meet legal regulations, federated learning (FL) has gained significant attention for training speech-to-text (S2T) systems, including automatic speech recognition (ASR) and speech translation (ST). However, the commonly used FL approach (i.e., FEDAVG) in S2T tasks typically suffers from extensive communication overhead due to multi-round interactions based on the whole model, and performance degradation caused by data heterogeneity among clients. To address these issues, we propose a...

10.1109/icassp48485.2024.10447662 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
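For context on the FEDAVG baseline that the abstract above builds on, its server-side step can be sketched as a data-size-weighted average of client parameters. This is a minimal illustration of standard FedAvg, not the federated S2T method the paper proposes; model parameters are flattened into plain lists for simplicity.

```python
def fedavg(client_weights, client_sizes):
    """Aggregate client parameters weighted by local data size,
    the core server-side step of FEDAVG."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    agg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            agg[i] += (n / total) * w[i]
    return agg

# Two clients holding 1 and 3 examples respectively.
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])  # -> [2.5, 3.5]
```

Each communication round repeats this aggregation over the full model, which is exactly the per-round overhead the paper identifies as a bottleneck for large S2T models.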

Generative Adversarial Network (GAN) has been proposed to tackle the exposure bias problem of Neural Machine Translation (NMT). However, the discriminator typically results in instability of GAN training due to the inadequate training problem: the search space is so huge that sampled translations are not sufficient for discriminator training. To address this issue and stabilize GAN training, in this paper, we propose a novel Bidirectional Generative Adversarial Network for NMT (BGAN-NMT), which aims to introduce a generator model to act as the discriminator, whereby the discriminator naturally considers the entire...

10.18653/v1/k18-1019 article EN cc-by 2018-01-01

Although the sequence-to-sequence (seq2seq) network has achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing cannot yield a performance gain comparable to other state-of-the-art methods, such as stack-LSTM and head selection. In this paper, we propose a stack-based multi-layer attention model for seq2seq learning to better leverage structural linguistics information. In our method, two binary vectors are used...

10.18653/v1/d17-1175 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01