Kehai Chen

ORCID: 0000-0002-4346-7618
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Text Readability and Simplification
  • Semantic Web and Ontologies
  • Biomedical Text Mining and Ontologies
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Advanced Text Analysis Techniques
  • Adversarial Robustness in Machine Learning
  • Advanced Graph Neural Networks
  • Handwritten Text Recognition Techniques
  • Text and Document Classification Technologies
  • Human Pose and Action Recognition
  • Cognitive Computing and Networks
  • Algorithms and Data Compression
  • Data Quality and Management
  • Hand Gesture Recognition Systems
  • Advanced Manufacturing and Logistics Optimization
  • Agronomic Practices and Intercropping Systems
  • Hearing Impairment and Communication
  • Robotic Path Planning Algorithms
  • Selenium in Biological Systems
  • Intelligent Tutoring Systems and Adaptive Learning
  • Web Data Mining and Analysis

Harbin Institute of Technology
2016-2024

Anyang Academy of Agricultural Sciences
2023

Shanghai Municipal Education Commission
2023

Shanghai Jiao Tong University
2023

National Institute of Information and Communications Technology
2017-2021

Tencent (China)
2020

University of Chinese Academy of Sciences
2013

Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, it is challenging to apply it to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German/French tasks show that the proposed methods can substantially improve NMT performance by up to 2.7-6.7 BLEU points, outperforming the existing baselines...

10.18653/v1/d17-1155 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
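
A minimal sketch of the sentence-weighting idea described above, assuming per-sentence domain-similarity weights have already been estimated (the dynamic weight-learning strategy itself is not reproduced); names are illustrative, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def weighted_nmt_loss(logits, targets, sentence_weights, pad_id=0):
    """Cross-entropy NMT loss where each sentence is scaled by an
    instance weight (e.g., an in-domain similarity score in [0, 1])."""
    vocab = logits.size(-1)
    # Token-level negative log-likelihood; padding positions contribute zero.
    nll = F.cross_entropy(
        logits.view(-1, vocab), targets.view(-1),
        ignore_index=pad_id, reduction="none",
    ).view_as(targets)
    mask = (targets != pad_id).float()
    # Average NLL per sentence, then scale by its instance weight.
    sent_nll = (nll * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
    return (sentence_weights * sent_nll).mean()
```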

In document-level relation extraction (DocRE), graph structure is generally used to encode relation information in the input document and classify the relation category between each entity pair, and has greatly advanced the DocRE task over the past several years. However, the learned graph representation universally models relation information between all entity pairs regardless of whether there are relationships between these pairs. Thus, those pairs without relationships disperse the attention of the encoder-classifier from the ones with relationships, which may further hinder the improvement of DocRE. To alleviate this...

10.1609/aaai.v35i16.17667 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Attention mechanism, including global attention and local attention, plays a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction. In comparison, local attention selectively looks at fixed-window source words. However, alignment weights for the current target word often decrease to the left and right by linear distance centering on the aligned source position and neglect syntax constraints. In this paper, we extend the local attention with a syntax-distance constraint, which focuses on syntactically related source words with the predicted target word, learning...

10.1609/aaai.v32i1.11910 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-26
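
A rough sketch of the syntax-distance constraint described above: standard local attention scales alignment weights with a Gaussian over linear distance from the aligned position, and here that distance is swapped for a dependency-tree distance. The tree-distance input and all names are illustrative assumptions, not the paper's implementation:

```python
import torch

def syntax_distance_attention(scores, tree_dist, center, sigma=2.0):
    """Bias one decoding step's attention with a Gaussian over syntax distance.

    scores:    (batch, src_len) raw alignment scores for the current target word
    tree_dist: (batch, src_len, src_len) pairwise dependency-tree distances
    center:    (batch,) index of the predicted aligned source position
    """
    batch = scores.size(0)
    # Distance of every source word from the aligned position, measured
    # along the dependency tree instead of by linear offset.
    dist = tree_dist[torch.arange(batch), center].float()   # (batch, src_len)
    gauss = torch.exp(-dist ** 2 / (2 * sigma ** 2))
    weights = torch.softmax(scores, dim=-1) * gauss
    return weights / weights.sum(dim=-1, keepdim=True)      # renormalize
```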

Attention mechanism, including global attention and local attention, plays a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction. In comparison, local attention selectively looks at fixed-window source words. However, alignment weights for the current target word often decrease to the left and right by linear distance centering on the aligned source position and neglect syntax-directed constraints. In this paper, we extend the local attention with a syntax-distance constraint, to focus on syntactically related source words with the predicted target word, thus...

10.48550/arxiv.1711.04231 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Text encoding is one of the most important steps in Natural Language Processing (NLP). It has been done well by the self-attention mechanism in the current state-of-the-art Transformer encoder, which has brought about significant improvements in the performance of many NLP tasks. Though the Transformer encoder may effectively capture general information in its resulting representations, the backbone information, meaning the gist of the input text, is not specifically focused on. In this paper, we propose explicit and implicit text compression...

10.1109/tpami.2021.3058341 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-01

Representation learning is the foundation of natural language processing (NLP). This work presents new methods to employ visual information as assistant signals for general NLP tasks. For each sentence, we first retrieve a flexible number of images either from a light topic-image lookup table extracted over the existing sentence-image pairs or from a shared cross-modal embedding space that is pre-trained on out-of-shelf text-image pairs. Then, the text and images are encoded by a Transformer encoder and a convolutional neural...

10.1109/tpami.2023.3234170 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-01-01

This survey explores the synergistic potential of Large Language Models (LLMs) and Vector Databases (VecDBs), a burgeoning but rapidly evolving research area. With the proliferation of LLMs comes a host of challenges, including hallucinations, outdated knowledge, prohibitive commercial application costs, and memory issues. VecDBs emerge as a compelling solution to these issues by offering an efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. Through this...

10.48550/arxiv.2402.01763 preprint EN arXiv (Cornell University) 2024-01-30
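
The LLM-plus-VecDB combination surveyed above is, at its core, retrieval-augmented generation: document embeddings live in a vector index, the query embedding pulls back nearest neighbours, and the retrieved text is prepended to the prompt. A library-agnostic sketch, where `embed` and `llm` are placeholder callables rather than any specific API:

```python
import numpy as np

class TinyVectorStore:
    """Toy in-memory stand-in for a vector database (cosine similarity)."""
    def __init__(self):
        self.vectors, self.texts = [], []

    def add(self, vector, text):
        self.vectors.append(np.asarray(vector, dtype=float))
        self.texts.append(text)

    def search(self, query, k=3):
        q = np.asarray(query, dtype=float)
        sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
                for v in self.vectors]
        top = np.argsort(sims)[::-1][:k]
        return [self.texts[i] for i in top]

def answer_with_retrieval(question, store, embed, llm):
    """Retrieve supporting text from the store, then let the LLM answer."""
    context = "\n".join(store.search(embed(question)))
    prompt = f"Answer using the context below.\n{context}\n\nQ: {question}\nA:"
    return llm(prompt)
```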

Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating representations of a source word and its dependency label together. In this paper, we propose a novel NMT with source dependency representation to improve the translation performance of NMT, especially for long sentences. Empirical results on the NIST Chinese-to-English translation task show that our method achieves 1.6 BLEU improvements...

10.18653/v1/d17-1304 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

In statistical machine translation, translation prediction considers not only the aligned source word itself but also its contextual information. Learning context representation is a promising method for improving translation results, particularly through neural networks. Most of the existing methods process the context words sequentially and neglect long-distance dependencies. In this paper, we propose a novel neural approach to dependence-based context representation for translation prediction. The proposed model is capable of encoding long-distance dependencies and capturing functional...

10.1109/taslp.2017.2772846 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2017-11-13

Neural machine translation (NMT) has been prominent in many machine translation tasks. However, in some domain-specific tasks, only the corpora from similar domains can improve the translation performance. If out-of-domain corpora are directly added into the in-domain corpus, the translation performance may even degrade. Therefore, domain adaptation techniques are essential to solve the NMT domain problem. Most existing domain adaptation methods are designed for the conventional phrase-based machine translation. For NMT domain adaptation, there have been a few studies on topics such as fine tuning, domain tags, and domain features. In...

10.1109/taslp.2018.2837223 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-05-16

Neural machine translation (NMT) takes deterministic sequences for source representations. However, either word-level or subword-level segmentations have multiple choices to split a source sequence, with different word segmentors or different subword vocabulary sizes. We hypothesize that the diversity in segmentations may affect NMT performance. To integrate the diverse segmentations into the state-of-the-art NMT model, Transformer, we propose lattice-based encoders to explore an effective source representation in an automatic way during training. We propose two methods: 1) lattice...

10.18653/v1/p19-1298 preprint EN cc-by 2019-01-01

Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using non-parallel monolingual corpora, and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT. That is, the training of UBWE and UNMT are separate. In this paper, we first empirically investigate the relationship between UBWE and UNMT. The empirical...

10.18653/v1/p19-1119 article EN cc-by 2019-01-01

Traditional neural machine translation (NMT) methods use the word-level context to predict target language translation while neglecting the sentence-level context, which has been shown to be beneficial for translation prediction in statistical machine translation. This paper represents the sentence-level context as latent topic representations by using a convolution neural network, and designs a topic attention to integrate the source sentence-level topic context information into both attention-based and Transformer-based NMT. In particular, our method can improve the performance of NMT by modeling source topics...

10.1109/taslp.2019.2937190 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-08-23
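
A coarse sketch of the sentence-level topic mechanism described above: a convolution over the source embeddings is pooled into a small set of latent topic vectors, and the decoder state attends over them alongside the usual word-level attention. Layer sizes and names are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class TopicAttention(nn.Module):
    def __init__(self, d_model=512, n_topics=8):
        super().__init__()
        # Convolution over source embeddings yields latent topic vectors.
        self.conv = nn.Conv1d(d_model, n_topics * d_model, kernel_size=3, padding=1)
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.n_topics, self.d_model = n_topics, d_model

    def forward(self, src_embed, dec_state):
        # src_embed: (batch, src_len, d_model); dec_state: (batch, 1, d_model)
        h = self.conv(src_embed.transpose(1, 2)).max(dim=-1).values
        topics = h.view(-1, self.n_topics, self.d_model)      # (batch, n_topics, d_model)
        topic_ctx, _ = self.attn(dec_state, topics, topics)   # attend over topics
        return topic_ctx  # fused with the word-level context downstream
```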

Rare words are usually replaced with a single <unk> token in the current encoder-decoder style of neural machine translation, challenging the translation modeling by an obscured context. In this article, we propose to build a fuzzy semantic representation (FSR) method for rare words through a hierarchical clustering method that groups rare words together, and integrate it into the encoder-decoder framework. This structure can compensate for the information of rare words on both source and target sides, providing a richer context to capture rare words. The introduced FSR can also alleviate the data...

10.1109/tfuzz.2020.2969399 article EN IEEE Transactions on Fuzzy Systems 2020-02-02
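
A small sketch of the rare-word grouping behind the FSR idea: rare-word embeddings are clustered hierarchically, and each rare token can then be backed by its cluster's centroid instead of a bare <unk>. Standard SciPy agglomerative clustering stands in for the paper's procedure; names and the threshold are illustrative:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_rare_words(rare_words, embeddings, threshold=0.8):
    """Group rare words by hierarchical clustering of their embeddings.

    rare_words: list of word strings
    embeddings: (n_words, dim) array of their embedding vectors
    Returns {cluster_id: (member_words, centroid_vector)}.
    """
    Z = linkage(embeddings, method="average", metric="cosine")
    labels = fcluster(Z, t=threshold, criterion="distance")
    clusters = {}
    for cid in np.unique(labels):
        idx = np.where(labels == cid)[0]
        members = [rare_words[i] for i in idx]
        centroid = embeddings[idx].mean(axis=0)  # fuzzy stand-in for the group
        clusters[int(cid)] = (members, centroid)
    return clusters
```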

Document-level relation extraction (DocRE) models generally use graph networks to implicitly model the reasoning skills (i.e., pattern recognition, logical reasoning, coreference reasoning, etc.) related to the relation between one entity pair in a document. In this paper, we propose a novel discriminative reasoning framework to explicitly model the paths of these reasoning skills between each entity pair in the document. Thus, a discriminative reasoning network is designed to estimate the relation probability distribution of different reasoning paths based on the constructed graph and vectorized document contexts for each entity pair, thereby recognizing their...

10.18653/v1/2021.findings-acl.144 article EN cc-by 2021-01-01

The o1-Like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored. This study examines: (1) how o1-Like LLMs perform in MMT tasks and (2) what factors influence their translation quality. We evaluate multiple o1-Like LLMs and compare them with traditional models like ChatGPT and GPT-4o. Results show that o1-Like LLMs establish new multilingual translation benchmarks, with DeepSeek-R1 surpassing GPT-4o in contextless tasks. They demonstrate strengths in historical and cultural translation but exhibit a tendency for...

10.48550/arxiv.2502.11544 preprint EN arXiv (Cornell University) 2025-02-17

Large language models (LLMs) have succeeded remarkably in multilingual translation tasks. However, the inherent translation mechanisms of LLMs remain poorly understood, largely due to their sophisticated architectures and vast parameter scales. In response to this issue, this study explores the translation mechanism of LLMs from the perspective of computational components (e.g., attention heads and MLPs). Path patching is utilized to explore causal relationships between components, detecting those crucial for translation tasks and subsequently analyzing their...

10.48550/arxiv.2502.11806 preprint EN arXiv (Cornell University) 2025-02-17

Despite being empowered with alignment mechanisms, large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks that can compromise these mechanisms. This vulnerability poses significant risks to real-world applications. Existing work faces challenges in both training efficiency and generalization capabilities (i.e., Reinforcement Learning from Human Feedback and Red-Teaming). Developing effective strategies to enable LLMs to resist continuously evolving jailbreak attempts represents a...

10.1609/aaai.v39i24.34784 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs. However, it can only translate between a single language pair and cannot produce translation results for multiple language pairs at the same time. That is, research on multilingual UNMT has been limited. In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. On the basis of the empirical findings, we propose two knowledge distillation methods to further enhance...

10.18653/v1/2020.acl-main.324 preprint EN cc-by 2020-01-01
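
The knowledge-distillation component mentioned above amounts to adding a KL term that pulls the multilingual student's translation distribution toward a teacher's. A generic sketch of such a distillation loss, not the paper's exact formulation:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      pad_id=0, alpha=0.5, temperature=1.0):
    """Mix translation cross-entropy with KL(teacher || student)."""
    vocab = student_logits.size(-1)
    ce = F.cross_entropy(student_logits.view(-1, vocab), targets.view(-1),
                         ignore_index=pad_id)
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    )
    return (1 - alpha) * ce + alpha * (temperature ** 2) * kl
```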

Unsupervised cross-lingual language representation initialization methods such as unsupervised bilingual word embedding (UBWE) pre-training and cross-lingual masked language model (CMLM) pre-training, together with mechanisms such as denoising and back-translation, have advanced unsupervised neural machine translation (UNMT), which has achieved impressive results on several language pairs, particularly French-English and German-English. Typically, UBWE focuses on initializing the word embedding layer in the encoder and decoder of UNMT, whereas CMLM initializes the entire encoder and decoder of UNMT. However,...

10.1109/taslp.2020.2982282 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01

Source input information plays a very important role in the Transformer-based translation system. In practice, the word embedding and positional embedding of each source word are added together as its input representation. Then self-attention networks are used to encode global dependencies in this representation and generate the source representation. However, this processing of the source input only adopts a single embedding feature and excludes richer and more diverse features such as recurrence features, local features, and syntactic features, which results in a tedious representation and thereby hinders further performance improvement. In this paper, we...

10.1109/taslp.2020.2996077 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01