NFDI4DS | UHH-SEMS - Publication Details

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

OPENALEX - Publications

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Document-level neural machine translation has yielded attractive improvements. However, the majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, two...

10.18653/v1/2020.emnlp-main.175 article EN cc-by 2020-01-01

An End-to-End Chinese Discourse Parser with Adaptation to Explicit and Non-explicit Relation Recognition

OPENALEX - Publications

Xiaomian Kang, Haoran Li, Long Zhou, Jiajun Zhang, Chengqing Zong

This paper describes our end-to-end discourse parser in the CoNLL-2016 Shared Task on Chinese Shallow Discourse Parsing. To adapt to the characteristics of Chinese, we implement a uniform framework for both explicit and non-explicit relation parsing. In this framework, we are the first to utilize a seed-expansion approach for the argument extraction subtask. In the official evaluation, our system achieves an F1 score of 26.90% in overall performance on the blind test set.

10.18653/v1/k16-2003 article EN cc-by 2016-01-01

Incorporating Multi-Level User Preference into Document-Level Sentiment Classification

OPENALEX - Publications

Junjie Li, Haoran Li, Xiaomian Kang, Haitong Yang, Chengqing Zong

Document-level sentiment classification aims to predict a user’s sentiment polarity in a document about a product. Most existing methods only focus on review contents and ignore the users who post reviews. In fact, when reviewing a product, different users have different word-using habits to express opinions (i.e., word-level user preference), care about different attributes of the product (i.e., aspect-level user preference), and have different characteristics to score the review (i.e., polarity-level user preference). These preferences have great influence on interpreting the sentiment of text. To address this issue, we propose a model...

10.1145/3234512 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2018-11-19

Medical Term and Status Generation From Chinese Clinical Dialogue With Multi-Granularity Transformer

OPENALEX - Publications

Mei Li, Lu Xiang, Xiaomian Kang, Yang Zhao, Zhou Yu, and 1 more

This paper describes a generative model for extracting medical terms and their status from Chinese medical dialogues. Notably, the extracted semantic information plays an essential role in downstream tasks such as automatic medical scribe and automatic diagnosis system. However, how to effectively leverage the dialogue context to generate medical terms and their corresponding status accurately remains less explored. Existing methods treat the dialogue text as a concentrated long text without considering the characteristics of the conversation, e.g., colloquialism, redundancy, interactions, etc....

10.1109/taslp.2021.3122301 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Deep Neural Network-based Machine Translation System Combination

OPENALEX - Publications

Long Zhou, Jiajun Zhang, Xiaomian Kang, Chengqing Zong

Deep neural networks (DNNs) have provably enhanced the state of the art in natural language processing (NLP) with their capability of feature learning and representation. As one of the more challenging NLP tasks, neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results compared to statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy and word coverage. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this article, we propose a deep neural network-based system...

10.1145/3389791 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2020-07-07

A Survey of Discourse Representations for Chinese Discourse Annotation

OPENALEX - Publications

Xiaomian Kang, Chengqing Zong, Nianwen Xue

A key element in computational discourse analysis is the design of a formal representation for the discourse structure of a text. With machine learning being the dominant method, it is important to identify a discourse representation that can be used to perform large-scale annotation. This survey provides a systematic analysis of existing discourse representation theories to evaluate whether they are suitable for the annotation of Chinese text. Specifically, two properties, expressiveness and practicality, are introduced to compare representations based on rhetorical relations and representations based on entity relations. The comparison...

10.1145/3293442 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2019-01-25

Enhancing Lexical Translation Consistency for Document-Level Neural Machine Translation

OPENALEX - Publications

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Document-level neural machine translation (DocNMT) has yielded attractive improvements. In this article, we systematically analyze the discourse phenomena in Chinese-to-English translation, and focus on the most obvious one, namely lexical translation consistency. To alleviate lexical inconsistency, we propose an effective approach that is aware of the words which need to be translated consistently and constrains the model to produce more consistent translations. Specifically, we first introduce a global context extractor to extract...

10.1145/3485469 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2021-12-13

Knowledge Graph Guided Neural Machine Translation with Dynamic Reinforce-selected Triples

OPENALEX - Publications

Yang Zhao, Xiaomian Kang, Yaping Zhang, Jiajun Zhang, Yu Zhou, and 1 more

Previous methods incorporating knowledge graphs (KGs) into neural machine translation (NMT) adopt a static knowledge utilization strategy, which introduces many useless knowledge triples and makes the useful triples difficult for NMT to utilize. To address this problem, we propose a KG guided NMT model with dynamic reinforce-selected triples. The proposed model could dynamically select different useful knowledge triples for different source sentences. Specifically, the proposed model contains two components: 1) a knowledge selector, which dynamically selects useful knowledge triples for a source sentence, and 2) a knowledge guided NMT (KgNMT), which utilizes the selected triples to guide the translation of...

10.1145/3696664 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2024-09-24

Self-Modifying State Modeling for Simultaneous Machine Translation

OPENALEX - Publications

Donglei Yu, Xiaomian Kang, Yuchen Liu, Zhou Yu, Chengqing Zong

Simultaneous Machine Translation (SiMT) generates target outputs while receiving a stream of source inputs and requires a read/write policy to decide whether to wait for the next source token or generate a new target token, whose decisions form a decision path. Existing SiMT methods, which learn the policy by exploring various decision paths in training, face inherent limitations. These methods not only fail to precisely optimize the policy due to the inability to accurately assess the individual impact of each decision on SiMT performance, but also cannot...

10.48550/arxiv.2406.02237 preprint EN arXiv (Cornell University) 2024-06-04

Self-Modifying State Modeling for Simultaneous Machine Translation

OPENALEX - Publications

Donglei Yu, Xiaomian Kang, Yuchen Liu, Zhou Yu, Chengqing Zong

10.18653/v1/2024.acl-long.528 article EN 2024-01-01

Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

OPENALEX - Publications

Xiaomian Kang, Yang Zhao, Jiajun Zhang, Chengqing Zong

Document-level neural machine translation has yielded attractive improvements. However, the majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, two...

10.48550/arxiv.2010.04314 preprint EN other-oa arXiv (Cornell University) 2020-01-01