- Natural Language Processing Techniques
- Topic Modeling
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Semantic Web and Ontologies
- Biomedical Text Mining and Ontologies
- Speech Recognition and Synthesis
- Speech and Dialogue Systems
- Advanced Text Analysis Techniques
- Adversarial Robustness in Machine Learning
- Advanced Graph Neural Networks
- Handwritten Text Recognition Techniques
- Text and Document Classification Technologies
- Human Pose and Action Recognition
- Cognitive Computing and Networks
- Algorithms and Data Compression
- Data Quality and Management
- Hand Gesture Recognition Systems
- Advanced Manufacturing and Logistics Optimization
- Agronomic Practices and Intercropping Systems
- Hearing Impairment and Communication
- Robotic Path Planning Algorithms
- Selenium in Biological Systems
- Intelligent Tutoring Systems and Adaptive Learning
- Web Data Mining and Analysis
- Harbin Institute of Technology (2016-2024)
- Anyang Academy of Agricultural Sciences (2023)
- Shanghai Municipal Education Commission (2023)
- Shanghai Jiao Tong University (2023)
- National Institute of Information and Communications Technology (2017-2021)
- Tencent (China) (2020)
- University of Chinese Academy of Sciences (2013)
Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, it is challenging to apply to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German/French tasks show that the proposed methods can substantially improve NMT performance by up to 2.7-6.7 BLEU points, outperforming existing baselines...
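As a rough sketch of sentence weighting, each training sentence's cross-entropy loss can be scaled by a domain-similarity weight before averaging. The shapes, the `pad_id` convention, and the function name below are illustrative assumptions, not the paper's exact formulation, and the dynamic weight learning strategy is not shown:

```python
import torch
import torch.nn.functional as F

def weighted_nmt_loss(logits, targets, sentence_weights, pad_id=0):
    """Scale each sentence's translation loss by a domain-similarity weight.

    logits: (batch, tgt_len, vocab); targets: (batch, tgt_len);
    sentence_weights: (batch,), e.g., in-domain similarity scores.
    """
    token_loss = F.cross_entropy(
        logits.transpose(1, 2), targets, ignore_index=pad_id, reduction="none"
    )                                              # (batch, tgt_len)
    mask = (targets != pad_id).float()
    sent_loss = (token_loss * mask).sum(1) / mask.sum(1).clamp(min=1.0)
    return (sentence_weights * sent_loss).mean()
```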
In document-level relation extraction (DocRE), a graph structure is generally used to encode relation information in the input document and classify the relation category between each entity pair, and this has greatly advanced the DocRE task over the past several years. However, the learned graph representation universally models relation information between all entity pairs, regardless of whether there are relationships between these pairs. Thus, the pairs without relationships disperse the attention of the encoder-classifier away from the ones with relationships, which may further hinder the improvement of DocRE. To alleviate this...
Attention mechanisms, including global attention and local attention, play a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction. In comparison, local attention selectively looks at a fixed window of source words. However, the alignment weights for the current target word often decrease to the left and right by linear distance, centering on the aligned source position, and neglect syntax constraints. In this paper, we extend local attention with a syntax-distance constraint, which focuses on source words syntactically related to the predicted target word, thus learning...
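A minimal sketch of the idea: bias the alignment weights with a Gaussian over syntactic distance (dependency-tree hops) rather than linear distance. The Gaussian form, `sigma`, and how the hops are computed are assumptions for illustration:

```python
import numpy as np

def syntax_distance_attention(scores, tree_hops, sigma=2.0):
    """Re-weight raw alignment scores for the current target word.

    scores: (src_len,) unnormalized attention scores;
    tree_hops: (src_len,) dependency-tree distance of each source word
    from the aligned source position.
    """
    bias = np.exp(-np.square(tree_hops) / (2.0 * sigma ** 2))
    w = np.exp(scores - scores.max()) * bias   # softmax with syntax bias
    return w / w.sum()
```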
Text encoding is one of the most important steps in Natural Language Processing (NLP). It has been done well by the self-attention mechanism in the current state-of-the-art Transformer encoder, which has brought about significant improvements in the performance of many NLP tasks. Though the Transformer encoder may effectively capture general information in its resulting representations, the backbone information, meaning the gist of the input text, is not specifically focused on. In this paper, we propose explicit and implicit text compression...
Representation learning is the foundation of natural language processing (NLP). This work presents new methods to employ visual information as assistant signals for general NLP tasks. For each sentence, we first retrieve a flexible number of images, either from a light topic-image lookup table extracted over existing sentence-image pairs or from a shared cross-modal embedding space pre-trained on off-the-shelf text-image pairs. Then, the text and images are encoded by a Transformer encoder and a convolutional neural...
This survey explores the synergistic potential of Large Language Models (LLMs) and Vector Databases (VecDBs), a burgeoning but rapidly evolving research area. With the proliferation of LLMs comes a host of challenges, including hallucinations, outdated knowledge, prohibitive commercial application costs, and memory issues. VecDBs emerge as a compelling solution to these issues by offering an efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. Through this...
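To make the LLM+VecDB interplay concrete, a toy in-memory vector store might look as follows; `TinyVecDB` and its methods are invented for illustration, and production VecDBs add persistence and approximate nearest-neighbor indexing on top of this pattern:

```python
import numpy as np

class TinyVecDB:
    """Toy vector store: cosine-similarity search over unit-normalized vectors."""

    def __init__(self, dim):
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.texts = []

    def add(self, vector, text):
        v = np.asarray(vector, dtype=np.float32)
        self.vectors = np.vstack([self.vectors, v / np.linalg.norm(v)])
        self.texts.append(text)

    def search(self, query, k=3):
        q = np.asarray(query, dtype=np.float32)
        scores = self.vectors @ (q / np.linalg.norm(q))  # cosine similarity
        top = np.argsort(-scores)[:k]
        return [(self.texts[i], float(scores[i])) for i in top]
```

In a retrieval-augmented setup, the top-k retrieved texts are prepended to the LLM prompt, which is how VecDBs help mitigate hallucinations and outdated knowledge.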
Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating the representations of a source word and its dependency label together. In this paper, we propose a novel NMT model with source dependency representation to improve the translation performance of NMT, especially on long sentences. Empirical results on the NIST Chinese-to-English task show that our method achieves 1.6 BLEU improvements...
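The concatenation baseline the abstract mentions can be sketched as follows; the module name and dimensions are made up for illustration:

```python
import torch
import torch.nn as nn

class DepAwareEmbedding(nn.Module):
    """Concatenate a word embedding with its dependency-label embedding."""

    def __init__(self, vocab_size, n_dep_labels, d_word=256, d_label=32):
        super().__init__()
        self.word = nn.Embedding(vocab_size, d_word)
        self.label = nn.Embedding(n_dep_labels, d_label)

    def forward(self, word_ids, label_ids):
        # -> (batch, src_len, d_word + d_label)
        return torch.cat([self.word(word_ids), self.label(label_ids)], dim=-1)
```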
In statistical machine translation, translation prediction considers not only the aligned source word itself but also its contextual information. Learning context representation is a promising method for improving translation results, particularly through neural networks. Most of the existing methods process words sequentially and neglect long-distance dependencies. In this paper, we propose a novel approach to dependence-based context representation for translation prediction. The proposed model is capable of encoding long-distance dependencies and capturing functional...
Neural machine translation (NMT) has been prominent in many machine translation tasks. However, in some domain-specific tasks, only corpora from similar domains can improve translation performance. If out-of-domain corpora are directly added into the in-domain corpus, the translation performance may even degrade. Therefore, domain adaptation techniques are essential to solve the NMT domain problem. Most of the existing methods are designed for conventional phrase-based machine translation. For NMT domain adaptation, there have been a few studies on topics such as fine tuning, domain tags, and domain features. In...
Neural machine translation (NMT) takes deterministic sequences for source representations. However, either word-level or subword-level segmentation offers multiple choices to split a source sequence, with different word segmentors or different subword vocabulary sizes. We hypothesize that this diversity in segmentations may affect NMT performance. To integrate the different segmentations with the state-of-the-art NMT model, Transformer, we propose lattice-based encoders to explore an effective source representation in an automatic way during training. We propose two methods: 1) lattice...
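One way to read "lattice-based encoding": merge several segmentations of the same sentence into a token lattice and key each node by its character span, so the start offset can drive positional encoding. This is a simplified sketch; the paper's lattice positional encoding and lattice-aware self-attention involve more machinery:

```python
def lattice_nodes(segmentations):
    """Merge segmentations of one sentence into unique (start, end, token)
    lattice nodes; the start offset can serve as a positional index."""
    nodes = {}
    for seg in segmentations:
        pos = 0
        for token in seg:
            nodes.setdefault((pos, pos + len(token)), token)
            pos += len(token)
    return sorted((s, e, t) for (s, e), t in nodes.items())

# Two subword segmentations of the string "newyork":
print(lattice_nodes([["new", "york"], ["n", "ew", "york"]]))
# [(0, 1, 'n'), (0, 3, 'new'), (1, 3, 'ew'), (3, 7, 'york')]
```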
Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using non-parallel monolingual corpora, and then this pre-trained UBWE is used to initialize the word embeddings of the encoder and decoder of UNMT. That is, the training of UBWE and UNMT are separate. In this paper, we empirically investigate the relationship between the two. The empirical...
Traditional neural machine translation (NMT) methods use the word-level context to predict the target-language translation while neglecting the sentence-level context, which has been shown to be beneficial for translation prediction in statistical machine translation. This paper represents the sentence-level context as latent topic representations by using a convolutional neural network, and designs a topic attention to integrate the source sentence-level topic information into both attention-based and Transformer-based NMT. In particular, our method can improve the performance of NMT by modeling topics...
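A minimal sketch of extracting a sentence-level topic vector with a convolution and max-pooling; the module name and dimensions are assumptions:

```python
import torch
import torch.nn as nn

class SentenceTopicEncoder(nn.Module):
    """Convolution + max-pooling over word embeddings -> latent topic vector."""

    def __init__(self, d_word=256, d_topic=64, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(d_word, d_topic, kernel, padding=kernel // 2)

    def forward(self, word_embs):                   # (batch, src_len, d_word)
        h = torch.relu(self.conv(word_embs.transpose(1, 2)))
        return h.max(dim=2).values                  # (batch, d_topic)
```

The resulting topic vector can then be offered to the decoder as an extra attention context alongside the word-level states.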
Rare words are usually replaced with a single <unk> token in the current encoder-decoder style of neural machine translation, challenging translation modeling with an obscured context. In this article, we propose a fuzzy semantic representation (FSR) method for rare words, built through hierarchical clustering that groups rare words together, and integrate it into the encoder-decoder framework. This structure can compensate for semantic information on both the source and target sides, providing richer context to capture rare words. The introduced FSR can also alleviate data...
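A sketch of the clustering step, assuming SciPy's hierarchical clustering; how the group representation is injected into the encoder-decoder is omitted, and the function name is invented:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def fuzzy_groups(rare_embeddings, n_groups=8):
    """Group rare-word embeddings so each group shares one soft
    representation (its centroid) instead of a single <unk> vector."""
    Z = linkage(rare_embeddings, method="average", metric="cosine")
    labels = fcluster(Z, t=n_groups, criterion="maxclust")
    centroids = {c: rare_embeddings[labels == c].mean(axis=0)
                 for c in np.unique(labels)}
    return labels, centroids
```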
Document-level relation extraction (DocRE) models generally use graph networks to implicitly model the reasoning skills (i.e., pattern recognition, logical reasoning, coreference reasoning, etc.) related to the relation between one entity pair in a document. In this paper, we propose a novel discriminative reasoning framework to explicitly model the paths of these reasoning skills for each entity pair in a document. Thus, a reasoning network is designed to estimate the relation probability distribution of the different reasoning paths, based on the constructed graph and vectorized document contexts for each entity pair, thereby recognizing their...
The o1-like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored. This study examines: (1) how o1-like LLMs perform in MMT tasks and (2) what factors influence their translation quality. We evaluate multiple o1-like LLMs and compare them with traditional models like ChatGPT and GPT-4o. Results show that o1-like LLMs establish new benchmarks, with DeepSeek-R1 surpassing GPT-4o in contextless tasks. They demonstrate strengths in historical and cultural translation, but exhibit a tendency for...
Large language models (LLMs) have succeeded remarkably in multilingual translation tasks. However, the inherent translation mechanisms of LLMs remain poorly understood, largely due to their sophisticated architectures and vast parameter scales. In response to this issue, this study explores the translation mechanism of LLMs from the perspective of computational components (e.g., attention heads and MLPs). Path patching is utilized to explore causal relationships between components, detecting those crucial for translation tasks and subsequently analyzing their...
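Path patching builds on activation patching; a naive single-component version in PyTorch might look like this. The hook-based mechanics are generic, not a specific library's API, and real path patching restricts the effect to particular downstream paths:

```python
import torch

@torch.no_grad()
def patch_component(model, component, clean_inputs, corrupt_inputs, metric):
    """Run on corrupt inputs while substituting one component's clean
    activation; `metric` scores how much clean behaviour is restored.
    `component` is any nn.Module inside `model`, e.g., one attention
    head's output projection (a simplified sketch)."""
    cache = {}

    def save(module, inputs, output):
        cache["clean"] = output

    def patch(module, inputs, output):
        return cache["clean"]            # replace with the cached activation

    handle = component.register_forward_hook(save)
    model(clean_inputs)                  # first pass: cache clean activation
    handle.remove()

    handle = component.register_forward_hook(patch)
    score = metric(model(corrupt_inputs))  # second pass: patched run
    handle.remove()
    return score
```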
Despite being empowered with alignment mechanisms, large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks that can compromise these mechanisms. This vulnerability poses significant risks in real-world applications. Existing work faces challenges in both training efficiency and generalization capabilities (i.e., Reinforcement Learning from Human Feedback and Red-Teaming). Developing effective strategies to enable LLMs to resist continuously evolving jailbreak attempts represents a...
Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs. However, it can only translate between a single language pair and cannot produce translation results for multiple language pairs at the same time. That is, research on multilingual UNMT has been limited. In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. On the basis of the empirical findings, we propose two knowledge distillation methods to further enhance...
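Knowledge distillation at the output level typically minimizes the KL divergence between teacher and student token distributions; the sketch below is generic, and the temperature and the direction-level teacher-student pairing are assumptions rather than the paper's exact losses:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label KD: the student matches the teacher's softened
    distribution. Both logits: (batch, tgt_len, vocab)."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)
```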
Unsupervised cross-lingual language representation initialization methods, such as unsupervised bilingual word embedding (UBWE) pre-training and cross-lingual masked language model (CMLM) pre-training, together with mechanisms such as denoising and back-translation, have advanced unsupervised neural machine translation (UNMT), which has achieved impressive results on several language pairs, particularly French-English and German-English. Typically, UBWE focuses on initializing the word-embedding layer in the encoder and decoder of UNMT, whereas CMLM initializes the entire encoder and decoder of UNMT. However,...
Source input information plays a very important role in the Transformer-based translation system. In practice, the word embedding and positional embedding of each word are added together as its source representation. Then, self-attention networks are used to encode global dependencies in this representation and generate the source encoding. However, this processing adopts only a single feature and excludes richer, more diverse features such as recurrence features and local syntactic features, which results in tedious representations and thereby hinders further performance improvement. In this paper, we...