Lemao Liu

ORCID: 0000-0003-3804-5768
About
Contact & Profiles
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Handwritten Text Recognition Techniques
  • Speech and Dialogue Systems
  • Speech Recognition and Synthesis
  • Software Engineering Research
  • Text Readability and Simplification
  • Explainable Artificial Intelligence (XAI)
  • Machine Learning and Data Classification
  • Semantic Web and Ontologies
  • Data Quality and Management
  • Advanced Graph Neural Networks
  • Biomedical Text Mining and Ontologies
  • Neural Networks and Applications
  • Advanced Text Analysis Techniques
  • Sentiment Analysis and Opinion Mining
  • Software Testing and Debugging Techniques
  • Machine Learning in Bioinformatics
  • Text and Document Classification Technologies
  • EEG and Brain-Computer Interfaces
  • Adversarial Robustness in Machine Learning
  • Big Data and Digital Economy
  • Second Language Acquisition and Learning
  • Computational and Text Analysis Methods

Tencent (China)
2017-2024

Beijing Jiaotong University
2023

Bellevue Hospital Center
2023

Harbin Institute of Technology
2011-2019

National Institute of Information and Communications Technology
2016-2019

Shanghai Jiao Tong University
2018

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to their reliability in real-world scenarios. In this paper, we survey recent efforts on the detection,...

10.48550/arxiv.2309.01219 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, it is challenging to apply it to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German/French tasks show that the proposed methods can substantially improve performance by up to 2.7-6.7 BLEU points, outperforming existing baselines...

10.18653/v1/d17-1155 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
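The sentence-level instance weighting above can be sketched with a toy objective: each training sentence's loss is scaled by a weight reflecting how in-domain it looks. The scoring functions here are hypothetical stand-ins (the paper learns weights dynamically rather than from fixed language-model scores).

```python
import math

def sentence_weight(in_domain_logprob, out_domain_logprob, temperature=1.0):
    """Heuristic instance weight: larger when an in-domain LM scores the
    sentence better than an out-of-domain LM (hypothetical scoring inputs)."""
    return math.exp((in_domain_logprob - out_domain_logprob) / temperature)

def weighted_nmt_loss(sentence_nlls, weights):
    """Scale each sentence's negative log-likelihood by its instance weight
    and normalize, so in-domain-like sentences dominate the gradient."""
    total = sum(w * l for w, l in zip(sentence_nlls, weights))
    return total / sum(weights)
```

With equal weights this reduces to the ordinary average loss; upweighting the in-domain-like sentence shifts the objective toward it.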

The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between each target word and source words. Unfortunately, it has been proved to be worse than conventional alignment models in alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a supervised attention mechanism which is learned with guidance from conventional alignment models. Experiments on two Chinese-to-English translation tasks show that the supervised attention mechanism yields...

10.48550/arxiv.1609.04186 preprint EN cc-by arXiv (Cornell University) 2016-01-01
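The supervised attention idea above can be sketched as a joint objective: the usual translation loss plus a cross-entropy term pulling the attention distribution toward a gold alignment distribution (e.g., from a conventional aligner). This is a minimal sketch; the function names and the fixed interpolation weight are assumptions.

```python
import math

def alignment_loss(attention, gold_alignment, eps=1e-9):
    """Per-target-word cross-entropy between the model's attention over
    source words and a gold alignment distribution."""
    loss = 0.0
    for att_row, gold_row in zip(attention, gold_alignment):
        loss -= sum(g * math.log(a + eps) for a, g in zip(att_row, gold_row))
    return loss / len(attention)

def supervised_nmt_loss(translation_nll, attention, gold_alignment, lam=1.0):
    """Joint objective: translation loss plus lam-weighted alignment guidance."""
    return translation_nll + lam * alignment_loss(attention, gold_alignment)
```

When attention already matches the gold alignment, the guidance term vanishes and the objective reduces to the plain translation loss.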

Lemao Liu, Masao Utiyama, Andrew Finch, Eiichiro Sumita. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.

10.18653/v1/n16-1046 article EN cc-by Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may almost fail to capture word alignment for some NMT models. This paper thereby proposes two methods to induce word alignment, which are general and agnostic to specific models. Experiments show that both methods induce much better word alignment than attention. The paper further visualizes the word alignment induced by NMT. In particular, it analyzes the effect of alignment errors on translation at the word level, and quantitative analysis over many testing examples consistently demonstrates...

10.18653/v1/p19-1124 article EN cc-by 2019-01-01
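One model-agnostic way to induce alignment in the spirit of the paper is prediction difference: align each target word to the source word whose removal most reduces the model's score for that target word. This is a sketch under assumptions; `score_fn` is a hypothetical black-box scorer, not the paper's exact formulation.

```python
def induce_alignment(score_fn, source, target):
    """Induce a word alignment from any model exposing
    score_fn(source_words, target_words, j) -> probability of target word j.
    Each target word aligns to the source word whose deletion causes the
    largest drop in that probability."""
    alignment = []
    base = [score_fn(source, target, j) for j in range(len(target))]
    for j in range(len(target)):
        drops = []
        for i in range(len(source)):
            reduced = source[:i] + source[i + 1:]
            drops.append(base[j] - score_fn(reduced, target, j))
        alignment.append(max(range(len(source)), key=lambda i: drops[i]))
    return alignment
```

With a toy scorer that rewards lexical matches, the induced links recover the obvious word correspondences.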

Recently, retrieval-augmented text generation has attracted increasing attention of the computational linguistics community. Compared with conventional generation models, it has remarkable advantages and has particularly achieved state-of-the-art performance in many NLP tasks. This paper aims to conduct a survey about retrieval-augmented text generation. It firstly highlights the generic paradigm of retrieval-augmented generation, then it reviews notable approaches according to different tasks including dialogue response generation, machine translation, and other tasks. Finally, it points out...

10.48550/arxiv.2202.01110 preprint EN cc-by arXiv (Cornell University) 2022-01-01
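The generic retrieve-then-generate paradigm the survey highlights can be sketched in two steps: fetch the most similar stored examples, then condition generation on the query plus the retrieved memory. The Jaccard retriever and the `generator` callable are illustrative assumptions, not any specific system's API.

```python
def retrieve(query_tokens, memory, k=2):
    """Toy lexical retriever: rank (source_tokens, payload) pairs in memory
    by Jaccard token overlap with the query and return the top k."""
    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a | b else 0.0
    ranked = sorted(memory, key=lambda ex: jaccard(query_tokens, ex[0]),
                    reverse=True)
    return ranked[:k]

def generate_with_memory(query_tokens, memory, generator):
    """Retrieval-augmented generation: pass the query together with the
    retrieved examples to a (hypothetical) generator callable."""
    return generator(query_tokens, retrieve(query_tokens, memory))
```

Swapping the lexical retriever for a dense one and the callable for a neural decoder yields the pattern used across the surveyed tasks.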

Recently, retrieval-augmented text generation has achieved state-of-the-art performance in many NLP tasks and attracted increasing attention of the IR community; this tutorial thereby aims to present recent advances comprehensively and comparatively. It firstly highlights the generic paradigm of retrieval-augmented generation, then reviews notable works for different tasks including dialogue response generation, machine translation, and other tasks, and finally points out some limitations and shortcomings to facilitate future research.

10.1145/3477495.3532682 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating the representations of a source word and its dependency label together. In this paper, we propose a novel NMT model with a source dependency representation to improve translation performance, especially on long sentences. Empirical results on the NIST Chinese-to-English task show that our method achieves 1.6 BLEU improvements...

10.18653/v1/d17-1304 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

In statistical machine translation, translation prediction considers not only the aligned source word itself but also its contextual information. Learning a context representation is a promising method for improving translation results, particularly through neural networks. Most of the existing methods process words sequentially and neglect long-distance dependencies. In this paper, we propose a novel approach to dependency-based translation prediction. The proposed model is capable of encoding long-distance dependencies and capturing functional...

10.1109/taslp.2017.2772846 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2017-11-13

Neural machine translation (NMT) has been prominent in many translation tasks. However, in some domain-specific tasks, only corpora from similar domains can improve translation performance. If out-of-domain corpora are directly added into the in-domain corpus, performance may even degrade. Therefore, domain adaptation techniques are essential to solve this problem for NMT. Most existing domain adaptation methods are designed for conventional phrase-based machine translation. For NMT domain adaptation, there have been only a few studies on topics such as fine tuning, domain tags, and domain features. In...

10.1109/taslp.2018.2837223 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-05-16

Qiuxiang He, Guoping Huang, Qu Cui, Li Li, Lemao Liu. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.246 article EN cc-by 2021-01-01

While LLMs have exhibited strong performance on various NLP tasks, it is noteworthy that most of these tasks rely on utilizing the vast amount of knowledge encoded in LLMs' parameters, rather than on solving new problems without prior knowledge. In cognitive research, the latter ability is referred to as fluid intelligence, which is considered critical for assessing human intelligence. Recent research on fluid intelligence assessments has highlighted significant deficiencies in LLMs' abilities. In this paper, we analyze...

10.48550/arxiv.2502.07190 preprint EN arXiv (Cornell University) 2025-02-10

Recurrent neural networks, particularly long short-term memory (LSTM) networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more...

10.1609/aaai.v30i1.10327 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2016-03-05
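The agreement idea above can be sketched at inference time as joint reranking: a candidate must score well under both a left-to-right and a right-to-left model, so neither a good prefix nor a good suffix alone wins. The two scoring callables are hypothetical stand-ins; the paper trains the two directional LSTMs jointly rather than composing fixed scorers.

```python
def agreement_rerank(candidates, l2r_score, r2l_score):
    """Pick the candidate sequence maximizing the sum of a left-to-right
    model score and a right-to-left model score (applied to the reversed
    sequence), balancing prefix and suffix quality."""
    return max(candidates,
               key=lambda y: l2r_score(y) + r2l_score(list(reversed(y))))
```

In the toy test, the candidate preferred by the forward model alone loses to the one both directions agree on.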

A translation memory (TM) is proved to be helpful to improve neural machine translation (NMT). Existing approaches either pursue decoding efficiency by merely accessing local information in a TM, or encode global information yet sacrifice efficiency due to redundancy. We propose an efficient approach to making use of the global information in a TM. The key idea is to pack the redundant TM into a compact graph and perform additional attention mechanisms over the packed graph for integrating the TM representation into the decoding network. We implement the model by extending the state-of-the-art NMT model, Transformer....

10.1609/aaai.v33i01.33017297 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Our purely neural network-based system represents a paradigm shift away from the techniques based on phrase-based statistical machine translation we have used in the past. The approach exploits the agreement between a pair of target-bidirectional LSTMs, in order to generate balanced targets with both good prefixes and good suffixes. The evaluation results show that the method is able to match and even surpass the current state-of-the-art on most language pairs, but also exposes weaknesses on some tasks, motivating further study. The...

10.18653/v1/w16-2711 article EN cc-by 2016-01-01

Yanling Xiao, Lemao Liu, Guoping Huang, Qu Cui, Shujian Huang, Shuming Shi, Jiajun Chen. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.138 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Guanlin Li, Lemao Liu, Guoping Huang, Conghui Zhu, Tiejun Zhao. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1570 article EN cc-by 2019-01-01

Recently, many efforts have been devoted to interpreting black-box NMT models, but little progress has been made on metrics to evaluate explanation methods. Word Alignment Error Rate can be used as such a metric that matches human understanding; however, it cannot measure explanation methods on those target words that are not aligned to any source word. This paper thereby makes an initial attempt from an alternative viewpoint. To this end, it proposes a principled metric based on fidelity in regard to the predictive behavior of the NMT model. As the exact...

10.18653/v1/2020.acl-main.35 article EN cc-by 2020-01-01
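The fidelity intuition above can be sketched for a classifier: an explanation is faithful to the extent that keeping only the words it selects preserves the model's prediction. This is a simplified proxy, not the paper's exact metric; `predict` is a hypothetical model over token lists.

```python
def fidelity(predict, inputs, explanations):
    """Fraction of examples where the model's prediction on only the
    explanation-selected words matches its prediction on the full input."""
    hits = 0
    for x, selected in zip(inputs, explanations):
        reduced = [w for w in x if w in selected]
        hits += predict(reduced) == predict(x)
    return hits / len(inputs)
```

An explanation that picks the truly decisive words scores 1.0; one that picks irrelevant words is penalized.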

Automatic machine translation is super efficient at producing translations, yet their quality is not guaranteed. This technical report introduces TranSmart, a practical human-machine interactive translation system that is able to trade off quality and efficiency. Compared with existing publicly available interactive systems, TranSmart supports three key features: word-level autocompletion, sentence-level autocompletion, and translation memory. By word-level autocompletion, it allows users to interactively translate words in their own manners rather than in the strict manner from left to right. In...

10.48550/arxiv.2105.13072 preprint EN cc-by arXiv (Cornell University) 2021-01-01
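The word-level autocompletion feature can be sketched minimally: given the characters a translator has typed, propose the most likely vocabulary word with that prefix. This toy version ranks by corpus frequency only; TranSmart itself additionally conditions on the source sentence and the surrounding translation context.

```python
def word_autocomplete(typed_prefix, vocabulary, frequency):
    """Complete a partially typed target word: among vocabulary words
    sharing the typed prefix, return the most frequent one (or None)."""
    candidates = [w for w in vocabulary if w.startswith(typed_prefix)]
    if not candidates:
        return None
    return max(candidates, key=lambda w: frequency.get(w, 0))
```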

In this paper we examine the effectiveness of neural network sequence-to-sequence transduction in the task of transliteration generation. In this year's shared evaluation we submitted two systems into all tasks. The primary system was based on the one used for the NEWS 2012 workshop, but augmented with an additional feature: the generation probability from a neural network. The secondary system was the neural network model on its own, together with a simple beam search algorithm. Our results show that adding the neural network score as a feature to the phrase-based statistical machine translation system was able to increase...

10.18653/v1/w15-3909 article EN cc-by 2015-01-01

Xintong Li, Lemao Liu, Zhaopeng Tu, Shuming Shi, Max Meng. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1125 article EN cc-by 2018-01-01

Lianhui Qin, Lemao Liu, Wei Bi, Yan Wang, Xiaojiang Liu, Zhiting Hu, Hai Zhao, Shuming Shi. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2018.

10.18653/v1/p18-2025 article EN cc-by 2018-01-01

Lemao Liu, Haisong Zhang, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Dick Zhu, Xiao Feng, Tao Chen, Yang, Dong Yu, Feng ZhanHui Kang, Shuming Shi. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations. 2021.

10.18653/v1/2021.acl-demo.1 article EN cc-by 2021-01-01