- Natural Language Processing Techniques
- Topic Modeling
- Multimodal Machine Learning Applications
- Handwritten Text Recognition Techniques
- Speech and Dialogue Systems
- Speech Recognition and Synthesis
- Software Engineering Research
- Text Readability and Simplification
- Explainable Artificial Intelligence (XAI)
- Machine Learning and Data Classification
- Semantic Web and Ontologies
- Data Quality and Management
- Advanced Graph Neural Networks
- Biomedical Text Mining and Ontologies
- Neural Networks and Applications
- Advanced Text Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Software Testing and Debugging Techniques
- Machine Learning in Bioinformatics
- Text and Document Classification Technologies
- EEG and Brain-Computer Interfaces
- Adversarial Robustness in Machine Learning
- Big Data and Digital Economy
- Second Language Acquisition and Learning
- Computational and Text Analysis Methods
Tencent (China)
2017-2024
Beijing Jiaotong University
2023
Bellevue Hospital Center
2023
Harbin Institute of Technology
2011-2019
National Institute of Information and Communications Technology
2016-2019
Shanghai Jiao Tong University
2018
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to their reliability in real-world scenarios. In this paper, we survey recent efforts on the detection,...
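The detection side of this survey lends itself to a quick illustration. Below is a minimal self-consistency sketch: sample several answers to the same prompt and flag low agreement as a hallucination signal. The `generate` function is a hypothetical stand-in for any LLM sampling call, and this is only one of many detection strategies a survey like this covers.

```python
# Minimal self-consistency check for hallucination detection: sample several
# answers and flag low agreement. `generate` is a hypothetical stand-in for
# a real LLM sampling call.
from collections import Counter

def generate(prompt: str, seed: int) -> str:
    # Placeholder: replace with a real LLM call (e.g., an API client).
    import random
    random.seed(seed)
    return random.choice(["Paris", "Paris", "Lyon"])

def consistency_score(prompt: str, n_samples: int = 5) -> float:
    answers = [generate(prompt, seed=i) for i in range(n_samples)]
    most_common = Counter(answers).most_common(1)[0][1]
    return most_common / n_samples  # low score -> likely hallucination

print(consistency_score("What is the capital of France?"))
```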
Instance weighting has been widely applied to domain adaptation for phrase-based machine translation. However, it is challenging to apply it to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German/French tasks show that the proposed methods can substantially improve NMT performance by up to 2.7-6.7 BLEU points, outperforming existing baselines...
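As a rough illustration of sentence-level instance weighting, the sketch below scales each sentence's cross-entropy by a per-sentence domain weight; the shapes, the source of the weights, and the averaging scheme are assumptions for the example, not the paper's exact formulation.

```python
# Sketch of sentence-level instance weighting for NMT domain adaptation:
# scale each sentence's token-level cross-entropy by a per-sentence weight.
import torch
import torch.nn.functional as F

def weighted_nmt_loss(logits, targets, sent_weights, pad_id=0):
    # logits: (batch, tgt_len, vocab); targets: (batch, tgt_len)
    # sent_weights: (batch,), e.g. higher for in-domain-like sentences
    token_loss = F.cross_entropy(
        logits.transpose(1, 2), targets, ignore_index=pad_id, reduction="none"
    )  # (batch, tgt_len)
    mask = (targets != pad_id).float()
    sent_loss = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    return (sent_weights * sent_loss).mean()

logits = torch.randn(2, 4, 100)
targets = torch.randint(1, 100, (2, 4))
weights = torch.tensor([1.0, 0.3])  # e.g., from a domain classifier
print(weighted_nmt_loss(logits, targets, weights))
```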
The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between each target word and source words. Unfortunately, it has been proved to be worse than conventional alignment models in alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a supervised attention mechanism which is learned with guidance from conventional alignment models. Experiments on two Chinese-to-English translation tasks show that the supervised attention mechanism yields...
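A minimal sketch of the supervision idea: add a penalty that pulls the model's attention distribution toward alignments produced by a conventional aligner such as GIZA++. The interpolation weight `lam` and the cross-entropy form of the penalty are illustrative assumptions.

```python
# Sketch of supervised attention: a loss term that pulls the model's
# attention weights toward alignments from an external aligner.
import torch

def supervised_attention_loss(attn, gold_align, eps=1e-8):
    # attn: (batch, tgt_len, src_len) model attention (rows sum to 1)
    # gold_align: same shape, row-normalized alignments from an aligner
    # Cross-entropy between the gold alignment distribution and attention.
    return -(gold_align * (attn + eps).log()).sum(dim=-1).mean()

attn = torch.softmax(torch.randn(2, 3, 5), dim=-1)
gold = torch.zeros(2, 3, 5)
gold[:, torch.arange(3), torch.arange(3)] = 1.0  # toy diagonal alignment
translation_nll = 0.0  # the usual NMT loss would go here
lam = 0.5              # hypothetical interpolation weight
loss = translation_nll + lam * supervised_attention_loss(attn, gold)
print(loss)
```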
Lemao Liu, Masao Utiyama, Andrew Finch, Eiichiro Sumita. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.
Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost fail to capture word alignment for some NMT models. This paper thereby proposes two methods to induce word alignment, which are general and agnostic to specific NMT models. Experiments show that both methods induce word alignment much better than attention. The paper further visualizes the word alignment induced by NMT. In particular, it analyzes the effect of alignment errors on translation at the word level; quantitative analysis over many testing examples consistently demonstrates...
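One general way to induce alignment without relying on attention is a prediction-difference style probe. The sketch below, with a toy scorer standing in for a real NMT model, aligns each target word to the source word whose masking most reduces that target word's score; the details differ from the paper's exact methods.

```python
# Sketch of inducing word alignment by prediction difference: mask each
# source word in turn and measure the drop in each target word's log-prob.
import torch

class ToyNMT(torch.nn.Module):
    # Stand-in scorer: per-token log-probs of the target given source ids.
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
    def forward(self, src_ids, tgt_ids):
        src = self.emb(src_ids).mean(dim=0)          # (dim,)
        tgt = self.emb(tgt_ids)                      # (tgt_len, dim)
        return torch.log_softmax(tgt @ src, dim=-1)  # toy per-token scores

@torch.no_grad()
def induce_alignment(model, src_ids, tgt_ids):
    base = model(src_ids, tgt_ids)                 # (tgt_len,)
    drops = []
    for j in range(src_ids.size(0)):
        masked = src_ids.clone()
        masked[j] = 0                              # mask the j-th source word
        drops.append(base - model(masked, tgt_ids))
    drops = torch.stack(drops, dim=1)              # (tgt_len, src_len)
    return drops.argmax(dim=1)                     # best source word per target

model = ToyNMT()
print(induce_alignment(model, torch.randint(1, 100, (5,)),
                       torch.randint(1, 100, (4,))))
```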
Recently, retrieval-augmented text generation has attracted increasing attention in the computational linguistics community. Compared with conventional generation models, it has remarkable advantages and has particularly achieved state-of-the-art performance in many NLP tasks. This paper aims to conduct a survey of retrieval-augmented text generation. It first highlights the generic paradigm of retrieval-augmented generation, then reviews notable approaches according to different tasks, including dialogue response generation, machine translation, and other tasks. Finally, it points out...
Recently, retrieval-augmented text generation has achieved state-of-the-art performance in many NLP tasks and has attracted increasing attention in the IR community; this tutorial thereby aims to present recent advances comprehensively and comparatively. It first highlights the generic paradigm of retrieval-augmented generation, then reviews notable works for different tasks, including dialogue response generation, machine translation, and other tasks, and finally points out some limitations and shortcomings to facilitate future research.
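To make the generic paradigm concrete, here is a minimal retrieve-then-generate sketch that uses token overlap as the retriever and prompt concatenation as the integration step. Real systems use dense retrievers and learned fusion, so treat this purely as an illustration of the pipeline.

```python
# Minimal retrieval-augmented generation sketch: fetch the nearest memory
# entry by token overlap and prepend it to the input before generation.
def retrieve(query: str, memory: list[str]) -> str:
    def overlap(a: str, b: str) -> int:
        return len(set(a.lower().split()) & set(b.lower().split()))
    return max(memory, key=lambda m: overlap(query, m))

def augmented_prompt(query: str, memory: list[str]) -> str:
    hint = retrieve(query, memory)
    return f"Reference: {hint}\nInput: {query}\nOutput:"

memory = ["the cat sat on the mat => le chat ...", "good morning => bonjour"]
print(augmented_prompt("good morning everyone", memory))
```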
Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating the representations of a source word and its dependency label together. In this paper, we propose a novel NMT model with source dependency representation to improve the translation performance of NMT, especially on long sentences. Empirical results on the NIST Chinese-to-English task show that our method achieves 1.6 BLEU improvements...
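The concatenation baseline mentioned above is easy to sketch: embed each source word and its dependency label, concatenate, and project back to the encoder width. All dimensions and the projection choice are assumptions for illustration.

```python
# Sketch of a dependency-aware source embedding: concatenate each word's
# embedding with an embedding of its dependency label before encoding.
import torch
import torch.nn as nn

class DepAwareEmbedding(nn.Module):
    def __init__(self, vocab=1000, n_labels=50, w_dim=64, l_dim=16):
        super().__init__()
        self.word = nn.Embedding(vocab, w_dim)
        self.label = nn.Embedding(n_labels, l_dim)
        self.proj = nn.Linear(w_dim + l_dim, w_dim)  # back to encoder size
    def forward(self, word_ids, label_ids):
        x = torch.cat([self.word(word_ids), self.label(label_ids)], dim=-1)
        return self.proj(x)

emb = DepAwareEmbedding()
out = emb(torch.randint(0, 1000, (2, 7)), torch.randint(0, 50, (2, 7)))
print(out.shape)  # (2, 7, 64), ready for the NMT encoder
```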
In statistical machine translation, translation prediction considers not only the aligned source word itself but also its contextual information. Learning context representation is a promising method for improving translation results, particularly through neural networks. Most of the existing methods process context words sequentially and neglect long-distance dependencies. In this paper, we propose a novel neural approach to dependency-based translation prediction. The proposed model is capable of encoding long-distance dependencies and capturing functional...
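A minimal sketch of the idea, under assumed shapes: represent the aligned source word together with a pooled embedding of its dependency neighbors (head and children), so prediction can see long-distance context that a sequential window would miss. The pooling and scoring layers are illustrative, not the paper's architecture.

```python
# Sketch of dependency-based context for translation prediction: pool the
# embeddings of a word's dependency neighbors and score target candidates.
import torch
import torch.nn as nn

class DepContext(nn.Module):
    def __init__(self, vocab=1000, dim=64, tgt_vocab=1200):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.out = nn.Linear(2 * dim, tgt_vocab)
    def forward(self, word_id, neighbor_ids):
        # neighbor_ids: head + children along the dependency tree
        word = self.emb(word_id)
        ctx = self.emb(neighbor_ids).mean(dim=0)  # pool dependency context
        return self.out(torch.cat([word, ctx], dim=-1))  # target-word scores

m = DepContext()
scores = m(torch.tensor(7), torch.tensor([2, 15, 40]))  # head=2, children=15,40
print(scores.shape)
```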
Neural machine translation (NMT) has been prominent in many machine translation tasks. However, in some domain-specific tasks, only the corpora from similar domains can improve translation performance. If out-of-domain corpora are directly added into the in-domain corpus, the performance may even degrade. Therefore, domain adaptation techniques are essential to solve the NMT domain problem. Most existing methods for domain adaptation are designed for conventional phrase-based machine translation. For NMT domain adaptation, there have been a few studies on topics such as fine tuning, domain tags, and domain features. In...
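Of the adaptation routes mentioned, domain tags are the simplest to show: prepend a pseudo-token marking each sentence's domain so a single model can condition on it. The tag format below is a common convention, assumed here for illustration.

```python
# Sketch of domain-tag adaptation: mark each source sentence with a
# pseudo-token so one NMT model can condition on the domain.
def add_domain_tag(src: str, domain: str) -> str:
    return f"<2{domain}> {src}"  # e.g. "<2medical> the patient ..."

mixed_corpus = [("the patient was discharged", "medical"),
                ("the engine needs oil", "auto")]
tagged = [add_domain_tag(s, d) for s, d in mixed_corpus]
print(tagged[0])
```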
Qiuxiang He, Guoping Huang, Qu Cui, Li Li, Lemao Liu. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
While LLMs have exhibited strong performance on various NLP tasks, it is noteworthy that most of these tasks rely on utilizing the vast amount of knowledge encoded in LLMs' parameters, rather than solving new problems without prior knowledge. In cognitive research, the latter ability is referred to as fluid intelligence, which is considered to be critical for assessing human intelligence. Recent research on fluid intelligence assessments has highlighted significant deficiencies in LLMs' abilities. In this paper, we analyze...
Recurrent neural networks, particularly the long short-term memory (LSTM) networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more...
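A sketch of the agreement idea at rescoring time: combine a left-to-right score with a right-to-left score of the reversed hypothesis, and keep the candidate both directions like. The scorers and the interpolation weight are hypothetical placeholders; the paper integrates agreement more deeply than simple rescoring.

```python
# Sketch of target-bidirectional agreement: rescore left-to-right n-best
# hypotheses with a right-to-left scorer over the reversed sequence.
def agree_rescore(nbest, score_l2r, score_r2l, alpha=0.5):
    # nbest: list of candidate target sentences (token lists)
    def joint(hyp):
        return alpha * score_l2r(hyp) + (1 - alpha) * score_r2l(hyp[::-1])
    return max(nbest, key=joint)

# Toy scorers: one mildly prefers short candidates, the other length 3.
best = agree_rescore(
    [["a", "b", "c"], ["a", "b"]],
    score_l2r=lambda h: -0.1 * len(h),
    score_r2l=lambda h: -abs(len(h) - 3),
)
print(best)
```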
A translation memory (TM) has been proved to be helpful for improving neural machine translation (NMT). Existing approaches either pursue decoding efficiency by merely accessing local information in a TM, or encode the global information in a TM yet sacrifice efficiency due to redundancy. We propose an efficient approach to making use of the global information in a TM. The key idea is to pack a redundant TM into a compact graph and perform additional attention mechanisms over the packed graph for integrating the TM representation into the decoding network. We implement the model by extending the state-of-the-art NMT model, Transformer...
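The packing intuition in a few lines, heavily simplified: collapse repeated tokens across retrieved TM targets into unique nodes, then run one extra attention pass over the packed nodes. The real model builds a graph inside a Transformer decoder; the node construction and vectors here are toy assumptions.

```python
# Sketch of packing a redundant TM into a compact structure and attending
# over it: redundant tokens across TM matches collapse to single nodes.
import torch

def pack_tm(tm_sentences):
    nodes, seen = [], set()
    for sent in tm_sentences:
        for tok in sent:
            if tok not in seen:      # redundant tokens collapse to one node
                seen.add(tok)
                nodes.append(tok)
    return nodes

def attend(query, node_vecs):
    # query: (dim,), node_vecs: (n_nodes, dim) -> TM context vector
    weights = torch.softmax(node_vecs @ query, dim=0)
    return weights @ node_vecs

tm = [["the", "cat", "sat"], ["the", "cat", "slept"]]
nodes = pack_tm(tm)                  # 4 nodes instead of 6 tokens
vecs = torch.randn(len(nodes), 8)
print(nodes, attend(torch.randn(8), vecs).shape)
```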
Our purely neural network-based system represents a paradigm shift away from the techniques based on phrase-based statistical machine translation we have used in the past. The approach exploits the agreement between a pair of target-bidirectional LSTMs, in order to generate balanced targets with both good suffixes and prefixes. The evaluation results show that the method is able to match and even surpass the current state-of-the-art on most language pairs, but also exposes weaknesses on some tasks, motivating further study. The...
Yanling Xiao, Lemao Liu, Guoping Huang, Qu Cui, Shujian Huang, Shuming Shi, Jiajun Chen. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.
Guanlin Li, Lemao Liu, Guoping Huang, Conghui Zhu, Tiejun Zhao. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Recently many efforts have been devoted to interpreting black-box NMT models, but little progress has been made on metrics to evaluate explanation methods. Word Alignment Error Rate can be used as such a metric that matches human understanding; however, it cannot measure explanation methods on those target words that are not aligned to any source word. This paper thereby makes an initial attempt to evaluate explanation methods from an alternative viewpoint. To this end, it proposes a principled metric based on fidelity in regard to the predictive behavior of the NMT model. As the exact...
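The fidelity intuition can be sketched as follows: keep only the source words an explanation marks as relevant, and check whether the model's prediction is preserved. The toy scorer and the keep-and-mask scheme are assumptions; the paper develops a principled, approximate version of this kind of measure.

```python
# Sketch of a fidelity-style evaluation: mask everything except the words an
# explanation marks as relevant and check prediction agreement.
import torch

class ToyScorer(torch.nn.Module):
    # Stand-in model: target-token scores that depend on the source context.
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
    def forward(self, src_ids, tgt_ids):
        ctx = self.emb(src_ids).mean(dim=0)                  # (dim,)
        return self.emb(tgt_ids) @ (self.emb.weight + ctx).T  # (tgt_len, vocab)

@torch.no_grad()
def fidelity(model, src_ids, tgt_ids, relevant_idx, mask_id=0):
    full_pred = model(src_ids, tgt_ids).argmax(dim=-1)
    masked = torch.full_like(src_ids, mask_id)
    masked[relevant_idx] = src_ids[relevant_idx]  # keep only explained words
    masked_pred = model(masked, tgt_ids).argmax(dim=-1)
    return (full_pred == masked_pred).float().mean().item()

m = ToyScorer()
src = torch.randint(1, 100, (6,))
tgt = torch.randint(1, 100, (4,))
print(fidelity(m, src, tgt, relevant_idx=torch.tensor([0, 2])))
```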
Automatic machine translation is highly efficient at producing translations, yet their quality is not guaranteed. This technical report introduces TranSmart, a practical human-machine interactive translation system that is able to trade off translation quality and efficiency. Compared with existing publicly available interactive translation systems, TranSmart supports three key features: word-level autocompletion, sentence-level autocompletion, and translation memory. By word-level and sentence-level autocompletion, TranSmart allows users to interactively translate words in their own manners rather than in the strict manner from left to right. In...
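Word-level autocompletion reduces, at its simplest, to picking the most probable word consistent with the typed prefix. The sketch below assumes a `word_probs` table standing in for the system's actual predictor, which would condition on the source sentence and the partial translation.

```python
# Sketch of word-level autocompletion: suggest the most probable full word
# consistent with the user's typed character prefix.
def autocomplete(prefix: str, word_probs: dict[str, float]) -> str:
    candidates = {w: p for w, p in word_probs.items() if w.startswith(prefix)}
    if not candidates:
        return prefix  # nothing to suggest; keep the user's input
    return max(candidates, key=candidates.get)

probs = {"translation": 0.4, "transmart": 0.1, "memory": 0.5}
print(autocomplete("tra", probs))  # -> "translation"
```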
In this paper we examine the effectiveness of neural network sequence-to-sequence transduction in the task of transliteration generation. In this year's shared evaluation we submitted two systems into all tasks. The primary system was based on the system used for the NEWS 2012 workshop, but was augmented with an additional feature, which was the generation probability from a neural network. The secondary system used the neural network model on its own, together with a simple beam search algorithm. Our results show that adding the neural network score as a feature to the phrase-based statistical machine transliteration system was able to increase...
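The feature-combination idea can be sketched as a log-linear rescoring of the phrase-based system's n-best list, with the network's generation log-probability as one extra feature. The weights below are illustrative, where a shared-task system would tune them.

```python
# Sketch of n-best rescoring with a neural generation probability as an
# extra feature in a log-linear combination.
import math

def rescore(nbest, nn_logprob, w_smt=1.0, w_nn=0.6):
    # nbest: list of (candidate, smt_score); nn_logprob: candidate -> log p
    scored = [(c, w_smt * s + w_nn * nn_logprob(c)) for c, s in nbest]
    return max(scored, key=lambda x: x[1])[0]

nbest = [("tokyo", -1.2), ("tokio", -1.0)]
nn = {"tokyo": math.log(0.7), "tokio": math.log(0.2)}
print(rescore(nbest, lambda c: nn[c]))  # -> "tokyo"
```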
Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.
Lianhui Qin, Lemao Liu, Wei Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, Shuming Shi. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2018.
Lemao Liu, Haisong Zhang, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Dick Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, ZhanHui Kang, Shuming Shi. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations. 2021.