Xiaomian Kang

ORCID: 0000-0003-3929-8548
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Sentiment Analysis and Opinion Mining
  • Semantic Web and Ontologies

Chinese Academy of Sciences
2016-2024

Shandong Institute of Automation
2020-2024

Institute of Automation
2018-2021

University of Chinese Academy of Sciences
2016-2021

Beijing Academy of Artificial Intelligence
2020

Document-level neural machine translation has yielded attractive improvements. However, the majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, two...

10.18653/v1/2020.emnlp-main.175 article EN cc-by 2020-01-01

This paper describes our end-to-end discourse parser in the CoNLL-2016 Shared Task on Chinese Shallow Discourse Parsing. To adapt to the characteristics of Chinese, we implement a uniform framework for both explicit and non-explicit relation parsing. In this framework, we are the first to utilize a seed-expansion approach for the argument extraction subtask. In the official evaluation, our system achieves an F1 score of 26.90% in overall performance on the blind test set.

10.18653/v1/k16-2003 article EN cc-by 2016-01-01

Document-level sentiment classification aims to predict a user’s sentiment polarity in a document about a product. Most existing methods only focus on review contents and ignore the users who post the reviews. In fact, when reviewing a product, different users have different word-using habits to express opinions (i.e., word-level user preference), care about different attributes of the product (i.e., aspect-level user preference), and have different characteristics to score the review (i.e., polarity-level user preference). These preferences have great influence on interpreting the sentiment of text. To address this issue, we propose a model...

10.1145/3234512 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2018-11-19

This paper describes a generative model for extracting medical terms and their status from Chinese medical dialogues. Notably, the extracted semantic information plays an essential role in downstream tasks such as automatic scribe and automatic diagnosis systems. However, how to effectively leverage the dialogue context to generate the corresponding terms and their status accurately remains less explored. Existing methods treat the dialogue text as a concentrated long text without considering the characteristics of conversation, such as colloquialism, redundancy, interactions, etc....

10.1109/taslp.2021.3122301 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Deep neural networks (DNNs) have provably enhanced the state-of-the-art natural language processing (NLP) with their capability of feature learning and representation. As one of the more challenging NLP tasks, neural machine translation (NMT) becomes a new approach to machine translation and generates much more fluent results compared to statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy and word coverage. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this article, we propose a deep neural network-based system...

10.1145/3389791 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2020-07-07

A key element in computational discourse analysis is the design of a formal representation for the discourse structure of a text. With machine learning being the dominant method, it is important to identify a discourse representation that can be used to perform large-scale annotation. This survey provides a systematic analysis of existing discourse representation theories to evaluate whether they are suitable for the annotation of Chinese text. Specifically, two properties, expressiveness and practicality, are introduced to compare representations based on rhetorical relations and representations based on entity relations. The comparison...

10.1145/3293442 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2019-01-25

Document-level neural machine translation (DocNMT) has yielded attractive improvements. In this article, we systematically analyze the discourse phenomena in Chinese-to-English translation, and focus on the most obvious ones, namely lexical translation consistency. To alleviate the lexical inconsistency, we propose an effective approach that is aware of the words which need to be translated consistently and constrains the model to produce more consistent translations. Specifically, we first introduce a global context extractor to extract...

10.1145/3485469 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2021-12-13

Previous methods incorporating knowledge graphs (KGs) into neural machine translation (NMT) adopt a static knowledge utilization strategy, which introduces many useless knowledge triples and makes the useful triples difficult to be utilized by NMT. To address this problem, we propose a KG guided NMT model with dynamic reinforce-selected triples. The proposed model could dynamically select different useful knowledge triples for different source sentences. Specifically, the proposed model contains two components: 1) a knowledge selector, which dynamically selects useful knowledge triples for a source sentence, and 2) a knowledge guided NMT (KgNMT), which utilizes the selected triples to guide the translation of...

10.1145/3696664 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2024-09-24

Simultaneous Machine Translation (SiMT) generates target outputs while receiving stream source inputs and requires a read/write policy to decide whether to wait for the next source token or generate a new target token, whose decisions form a decision path. Existing SiMT methods, which learn the policy by exploring various decision paths in training, face inherent limitations. These methods not only fail to precisely optimize the policy due to the inability to accurately assess the individual impact of each decision on SiMT performance, but also cannot...

10.48550/arxiv.2406.02237 preprint EN arXiv (Cornell University) 2024-06-04

Document-level neural machine translation has yielded attractive improvements. However, the majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, two...

10.48550/arxiv.2010.04314 preprint EN other-oa arXiv (Cornell University) 2020-01-01