- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Teaching and Learning Programming
- Speech and Dialogue Systems
- Intelligent Tutoring Systems and Adaptive Learning
- Distributed and Parallel Computing Systems
- Genetics, Bioinformatics, and Biomedical Research
- Advanced Text Analysis Techniques
- Advanced Computational Techniques and Applications
- Online Learning and Analytics
- Service-Oriented Architecture and Web Services
- University of Macau, 2023
- Southern University of Science and Technology, 2022-2023
- South China Normal University, 2023
- Chinese University of Hong Kong, 2021
- University of Hong Kong, 2021
- University of Miami, 2017
- Peking University, 2005
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving the performance of supervised machine translation with BERT. However, it is under-explored whether an MPE can help to facilitate the cross-lingual transferability of an NMT model. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with a parallel dataset of only one language pair and an off-the-shelf MPE, and is then directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT...
Guanhua Chen, Lu Hou, Yun Chen, Wenliang Dai, Lifeng Shang, Xin Jiang, Qun Liu, Jia Pan, Wenping Wang. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
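A minimal sketch of the zero-shot transfer setup described in the abstract above, assuming the off-the-shelf MPE is XLM-R and the Hugging Face transformers library; this illustrates the general recipe, not the released SixT code.

```python
# Sketch: wrap an off-the-shelf multilingual pretrained encoder (XLM-R) inside
# an encoder-decoder NMT model, train on ONE language pair, and reuse the same
# model unchanged on unseen (zero-shot) source languages.
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# Warm-start both sides from XLM-R; the decoder gets cross-attention added
# and is trained from this initialization on the single supervised pair.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "xlm-roberta-base", "xlm-roberta-base"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# After supervised training (e.g. on De-En only), sentences from languages
# never seen in the parallel data are fed to the same model at test time.
src = tokenizer("Ein Beispielsatz.", return_tensors="pt")
generated = model.generate(**src, max_length=40)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```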
Multilingual pretrained language models (mPLMs) have shown their effectiveness in multilingual word alignment induction. However, these methods usually start from mBERT or XLM-R. In this paper, we investigate whether the multilingual sentence Transformer LaBSE is a strong multilingual word aligner. This idea is non-trivial, as LaBSE is trained to learn language-agnostic sentence-level embeddings, while the alignment extraction task requires the more fine-grained word-level embeddings to be language-agnostic. We demonstrate that vanilla LaBSE outperforms other...
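A rough sketch of how word alignments can be extracted from token-level LaBSE embeddings via a similarity matrix and bidirectional argmax; the checkpoint name and the subword-level simplification are assumptions, not the paper's exact procedure.

```python
# Sketch: align subwords of a sentence pair by taking the intersection of
# forward/backward argmax over a cosine-similarity matrix of LaBSE states.
import torch
from transformers import AutoModel, AutoTokenizer

name = "sentence-transformers/LaBSE"  # assumed public checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).eval()

def token_embeddings(sentence: str) -> torch.Tensor:
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    return hidden[1:-1]  # drop [CLS] and [SEP]

e_src = token_embeddings("the cat sleeps")
e_tgt = token_embeddings("die Katze schläft")

# Cosine similarity between every source and target subword.
sim = torch.nn.functional.normalize(e_src, dim=-1) @ \
      torch.nn.functional.normalize(e_tgt, dim=-1).T

# Keep (i, j) only if each token is the other's best match.
fwd = sim.argmax(dim=1)
bwd = sim.argmax(dim=0)
alignments = [(i, j.item()) for i, j in enumerate(fwd) if bwd[j] == i]
print(alignments)
```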
Since deep learning is the dominant paradigm in the multi-turn dialogue generation task, large-scale training data is a key factor affecting model performance. To make full use of the training data, existing work directly applies curriculum learning to dialogue generation in an “easy-to-hard” way. But the design of the current methodology does not consider dialogue-specific features. To close this gap, we propose a Multi-Level Curriculum Learning (MLCL) method for multi-turn dialogue generation by considering the word-level linguistic feature and the utterance-level semantic relation within a dialogue. The...
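A simplified sketch of the generic easy-to-hard curriculum this abstract builds on, using a made-up word-rarity difficulty score and a linear competence schedule; the actual MLCL difficulty measures and schedule are defined in the paper.

```python
# Sketch: score dialogue samples by difficulty, sort them, and sample batches
# from a gradually growing "easy" prefix as training proceeds.
import random
from collections import Counter

def difficulty(dialogue: list[str], word_freq: Counter) -> float:
    # Placeholder score: rarer words make a sample harder.
    words = [w for turn in dialogue for w in turn.split()]
    return sum(1.0 / (word_freq[w] + 1) for w in words) / max(len(words), 1)

def curriculum_batches(dialogues, num_steps, batch_size=2):
    freq = Counter(w for d in dialogues for turn in d for w in turn.split())
    ordered = sorted(dialogues, key=lambda d: difficulty(d, freq))
    for step in range(1, num_steps + 1):
        # Competence grows linearly: early steps only see the easiest samples.
        limit = max(batch_size, int(len(ordered) * step / num_steps))
        pool = ordered[:limit]
        yield random.sample(pool, min(batch_size, len(pool)))

dialogues = [
    ["hi", "hello there"],
    ["how are you", "fine thanks and you"],
    ["what is the eigendecomposition", "it factorizes a matrix"],
]
for batch in curriculum_batches(dialogues, num_steps=3):
    print(batch)
```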
With the evolution of GIS from stand-alone systems with tightly coupled geo-data to an increasingly distributed model based on independently provided, interoperable Web services, much research has focused on service composition. However, few works concern control mechanisms for improving availability and reliability, even though services are in essence loosely coupled and hosted by different providers; as a result, any update might critically affect the overall composition consistency...
Instruction tuning has been demonstrated to significantly improve the zero-shot generalization capability to unseen tasks. By incorporating additional context (e.g., task definition, examples) during the fine-tuning process, Large Language Models (LLMs) achieve much higher performance than before. However, recent work reported that delusive examples can achieve almost the same performance as correct examples, indicating that the input-label correspondence is less important than previously thought...
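For concreteness, a small sketch of how an instruction-tuning instance is typically serialized into a single training prompt from a task definition plus in-context examples; the field names and template are illustrative assumptions, not a specific benchmark's schema.

```python
# Sketch: build one instruction-tuning prompt from a task definition,
# demonstration examples, and the query input. Field names are illustrative.
def build_prompt(definition: str, examples: list[dict], query: str) -> str:
    parts = [f"Definition: {definition}"]
    for ex in examples:
        parts.append(f"Input: {ex['input']}\nOutput: {ex['output']}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

prompt = build_prompt(
    definition="Classify the sentiment of the sentence as positive or negative.",
    examples=[
        {"input": "I loved this movie.", "output": "positive"},
        {"input": "The plot was a mess.", "output": "negative"},
    ],
    query="The soundtrack was wonderful.",
)
print(prompt)
# The model is fine-tuned to produce the gold output given such prompts; the
# cited finding is that corrupting the example outputs changes performance little.
```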
Stylistic headline generation is the task of generating a headline that not only summarizes the content of an article but also reflects a desired style that attracts users. As style-specific article-headline pairs are scarce, previous research focuses on unsupervised approaches with a standard headline generation dataset and mono-style corpora. In this work, we follow this line and propose StyleBART, an unsupervised approach for stylistic headline generation. Our method decorates the pretrained BART model with adapters that are responsible for different styles, allowing the generation of headlines with diverse styles by...
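A minimal sketch of the per-style adapter idea: a small bottleneck module per style applied on top of a frozen pretrained layer output. Sizes, style names, and placement are illustrative assumptions, not StyleBART's exact architecture.

```python
# Sketch: one bottleneck adapter per style; the backbone (e.g. BART) stays
# frozen and only the adapter matching the requested style is applied.
import torch
import torch.nn as nn

class StyleAdapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Residual bottleneck transformation of the layer output.
        return hidden + self.up(torch.relu(self.down(hidden)))

# One adapter per target style (names are placeholders).
adapters = nn.ModuleDict({s: StyleAdapter() for s in ["humor", "romance", "clickbait"]})

hidden_states = torch.randn(1, 16, 768)        # stand-in for a BART layer output
styled = adapters["humor"](hidden_states)      # select the adapter for the desired style
print(styled.shape)
```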
This paper demonstrates that multilingual pretraining and multilingual fine-tuning are both critical for facilitating cross-lingual transfer in zero-shot translation, where the neural machine translation (NMT) model is tested on source languages unseen during supervised training. Following this idea, we present SixT+, a strong many-to-English NMT model that supports 100 source languages but is trained with a parallel dataset of only six source languages. SixT+ initializes the decoder embedding and the full encoder with XLM-R large and then trains the encoder and decoder layers with a simple...
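A rough sketch of the initialization step described above, assuming the Hugging Face XLM-R large checkpoint: the full encoder is taken from XLM-R and the decoder's embedding table is copied from XLM-R's word embeddings; the subsequent staged training is omitted.

```python
# Sketch: initialize an NMT encoder and the decoder embedding from XLM-R large.
# Only the initialization is shown; the staged training schedule is not.
import torch.nn as nn
from transformers import XLMRobertaModel

xlmr = XLMRobertaModel.from_pretrained("xlm-roberta-large")
cfg = xlmr.config

# Encoder: reuse the full pretrained XLM-R stack as the NMT encoder.
encoder = xlmr

# Decoder embedding: copy XLM-R's word-embedding table into a fresh table that
# a randomly initialized Transformer decoder (not shown) would use.
decoder_embed = nn.Embedding(cfg.vocab_size, cfg.hidden_size, padding_idx=cfg.pad_token_id)
decoder_embed.weight.data.copy_(xlmr.embeddings.word_embeddings.weight.data)

print(decoder_embed.weight.shape)  # (vocab_size, hidden_size), e.g. (250002, 1024)
```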