Longyue Wang

ORCID: 0000-0002-9062-6183
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Text Readability and Simplification
  • Speech and dialogue systems
  • Computational Drug Discovery Methods
  • Speech Recognition and Synthesis
  • Video Analysis and Summarization
  • Software Engineering Research
  • Web Data Mining and Analysis
  • Text and Document Classification Technologies
  • Biomedical Text Mining and Ontologies
  • Human Motion and Animation
  • Machine Learning in Materials Science
  • Bioinformatics and Genomic Networks
  • Domain Adaptation and Few-Shot Learning
  • Multi-Agent Systems and Negotiation
  • Semantic Web and Ontologies
  • Image Retrieval and Classification Techniques
  • Handwritten Text Recognition Techniques
  • Advanced Graph Neural Networks
  • Human Pose and Action Recognition
  • Artificial Intelligence in Games
  • Science Education and Pedagogy
  • Gastric Cancer Management and Outcomes

Tencent (China)
2018-2025

Alibaba Group (China)
2025

Zhejiang University
2024

Dublin City University
2015-2024

Beijing Institute of Technology
2024

Xiangtan University
2024

University of Illinois Chicago
2024

Hunan University
2024

Macao Polytechnic University
2024

University of Hong Kong
2020-2023

In translation, considering the document as a whole can help to resolve ambiguities and inconsistencies. In this paper, we propose a cross-sentence context-aware approach and investigate the influence of historical contextual information on the performance of neural machine translation (NMT). First, the history is summarized in a hierarchical way. We then integrate the historical representation into NMT in two strategies: 1) warm-start of encoder and decoder states, and 2) an auxiliary context source for updating decoder states. Experimental results...

10.18653/v1/d17-1301 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
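
The hierarchical summarization idea is easy to sketch. Below is a minimal PyTorch illustration, assuming a word-level GRU followed by a sentence-level GRU; all names (HierarchicalContextSummarizer, warm_start_decoder_state) are hypothetical, not the paper's code:

```python
import torch
import torch.nn as nn

class HierarchicalContextSummarizer(nn.Module):
    """Summarize previous source sentences into one context vector.

    A word-level GRU encodes each history sentence; a sentence-level GRU
    then summarizes the sequence of sentence vectors (hierarchical)."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.sent_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, history):  # history: (batch, n_sents, sent_len) token ids
        b, n, l = history.shape
        words = self.embed(history.view(b * n, l))     # (b*n, l, emb)
        _, h_word = self.word_rnn(words)               # (1, b*n, hid)
        sent_vecs = h_word.squeeze(0).view(b, n, -1)   # (b, n, hid)
        _, h_sent = self.sent_rnn(sent_vecs)           # (1, b, hid)
        return h_sent.squeeze(0)                       # (b, hid) document context

# Strategy 1 from the abstract: warm-start the decoder with the context.
# (Strategy 2 would instead feed `ctx` as an auxiliary input at each step.)
def warm_start_decoder_state(ctx, bridge: nn.Linear):
    return torch.tanh(bridge(ctx)).unsqueeze(0)  # initial GRU hidden state
```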

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to their reliability in real-world scenarios. In this paper, we survey recent efforts on the detection,...

10.48550/arxiv.2309.01219 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

Large language models (LLMs) such as ChatGPT can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks. Taking document-level machine translation (MT) as a testbed, this paper provides an in-depth evaluation of LLMs' ability on discourse modeling. The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the performance of LLMs with commercial...

10.18653/v1/2023.emnlp-main.1036 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01
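
The "Context-Aware Prompts" aspect can be illustrated with a small sketch. The prompt wording and the build_context_aware_prompt helper below are assumptions for illustration, not the prompt set actually evaluated in the paper:

```python
def build_context_aware_prompt(prev_pairs, source_sentence,
                               src_lang="Chinese", tgt_lang="English"):
    """Prepend previously translated sentence pairs so the model can
    resolve discourse phenomena (pronouns, ellipsis, lexical cohesion)."""
    lines = [f"Translate the following {src_lang} text into {tgt_lang}, "
             f"keeping it consistent with the earlier context."]
    for src, tgt in prev_pairs:  # document history as in-context examples
        lines.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source_sentence}\n{tgt_lang}:")
    return "\n\n".join(lines)

# Usage: feed the returned string to any chat-style LLM API.
prompt = build_context_aware_prompt(
    [("他把书放下了。", "He put the book down.")],
    "然后他离开了房间。")
```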

Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1407 preprint EN 2019-01-01

Jie Hao, Xing Wang, Baosong Yang, Longyue Wang, Jinfeng Zhang, Zhaopeng Tu. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1122 preprint EN 2019-01-01

This work evaluates GPT-4V's multimodal capability for medical image analysis, focusing on three representative tasks: radiology report generation, medical visual question answering, and visual grounding. For the evaluation, a set of prompts is designed for each task to induce GPT-4V to produce sufficiently good outputs. Three evaluation ways, including quantitative analysis, human evaluation, and case study, are employed to achieve an in-depth and extensive evaluation. Our evaluation shows that GPT-4V excels in understanding medical images and can generate...

10.1016/j.metrad.2024.100099 article EN cc-by-nc-nd Meta-Radiology 2024-07-01

Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1145 article EN cc-by 2019-01-01

Machine Translation (MT) has greatly advanced over the years due to developments in deep neural networks. However, the emergence of Large Language Models (LLMs) like GPT-4 and ChatGPT is introducing a new phase in the MT domain. In this context, we believe that the future of MT is intricately tied to the capabilities of LLMs. These models not only offer vast linguistic understanding but also bring innovative methodologies, such as prompt-based techniques, that have the potential to further elevate MT. In this paper, we provide an overview...

10.48550/arxiv.2305.01181 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied. In this work, we propose Macaw-LLM, a novel multi-modal LLM that seamlessly integrates visual, audio, and textual information. Macaw-LLM consists of three main components: a modality module for encoding multi-modal data, a cognitive module for harnessing pretrained LLMs, and an alignment module for harmonizing diverse...

10.48550/arxiv.2306.09093 preprint EN cc-by arXiv (Cornell University) 2023-01-01
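
A minimal structural sketch of the three-component layout the abstract names (modality module, alignment module, cognitive module). Module names, dimensions, and the encoder interfaces below are illustrative assumptions, not Macaw-LLM's actual implementation:

```python
import torch
import torch.nn as nn

class MultiModalLLMSketch(nn.Module):
    """Toy three-part layout: encode each modality, align its features
    to the LLM embedding space, then let a (frozen) LLM consume them."""

    def __init__(self, image_encoder, audio_encoder, llm, llm_dim=4096):
        super().__init__()
        self.image_encoder = image_encoder   # e.g. a CLIP-style vision tower
        self.audio_encoder = audio_encoder   # e.g. a Whisper-style encoder
        # Alignment module: project modality features into token embeddings.
        self.align_image = nn.Linear(image_encoder.out_dim, llm_dim)
        self.align_audio = nn.Linear(audio_encoder.out_dim, llm_dim)
        self.llm = llm                       # cognitive module (pretrained LLM)

    def forward(self, image, audio, text_embeds):
        # Each encoder is assumed to return (batch, n_tokens, out_dim).
        img_tokens = self.align_image(self.image_encoder(image))
        aud_tokens = self.align_audio(self.audio_encoder(audio))
        # Prepend aligned modality tokens to the text token embeddings
        # (HuggingFace-style `inputs_embeds` interface assumed).
        inputs = torch.cat([img_tokens, aud_tokens, text_embeds], dim=1)
        return self.llm(inputs_embeds=inputs)
```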

This paper presents a comprehensive evaluation of GPT-4V's capabilities across diverse medical imaging tasks, including Radiology Report Generation, Medical Visual Question Answering (VQA), and Grounding. While prior efforts have explored GPT-4V's performance in medical imaging, to the best of our knowledge, this study represents the first quantitative evaluation on publicly available benchmarks. Our findings highlight GPT-4V's potential in generating descriptive reports for chest X-ray images, particularly when guided by...

10.1101/2023.11.03.23298067 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2023-11-04

Shilin He, Zhaopeng Tu, Xing Wang, Longyue Wang, Michael Lyu, Shuming Shi. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1088 article EN cc-by 2019-01-01

Knowledge distillation (KD) is essential for training non-autoregressive translation (NAT) models, as it reduces the complexity of raw data with an autoregressive teacher model. In this study, we empirically show that, as a side effect of this training, lexical choice errors on low-frequency words are propagated to the NAT model from the teacher. To alleviate this problem, we propose to expose the raw data to NAT models to restore the useful information of low-frequency words, which is missed in the distilled data. To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing...

10.48550/arxiv.2012.14583 preprint EN other-oa arXiv (Cornell University) 2020-01-01
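
A minimal sketch of attaching an auxiliary KL term to a NAT training loss, in the spirit of the abstract. The exact distributions compared in the paper may differ; here the word-level prior estimated from raw (undistilled) data is an assumed input:

```python
import torch
import torch.nn.functional as F

def nat_loss_with_raw_data_prior(logits, targets, raw_prior, weight=0.1):
    """Cross-entropy on distilled targets plus a KL term that pulls the
    model's predictive distribution toward a prior estimated from the
    raw data, to restore low-frequency word information.

    logits:    (batch, seq, vocab) model outputs
    targets:   (batch, seq) distilled-data token ids
    raw_prior: (batch, seq, vocab) distribution estimated from raw data
    """
    ce = F.cross_entropy(logits.transpose(1, 2), targets)
    log_p = F.log_softmax(logits, dim=-1)
    kl = F.kl_div(log_p, raw_prior, reduction="batchmean")  # KL(prior || model)
    return ce + weight * kl
```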

Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios such as machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences between different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with cross-lingual position representations to model...

10.18653/v1/2020.acl-main.153 preprint EN cc-by 2020-01-01
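
One way to realize "augmenting SANs with cross-lingual position representations" is to add a second positional embedding keyed to aligned target-side positions. The sketch below is a plausible reading under that assumption, not necessarily the paper's exact mechanism:

```python
import torch
import torch.nn as nn

class CrossLingualPositionEmbedding(nn.Module):
    """Add a second positional signal: for each source token, embed the
    position its aligned target word is expected to occupy, so the
    encoder sees both monolingual and cross-lingual order information."""

    def __init__(self, max_len=1024, d_model=512):
        super().__init__()
        self.mono_pe = nn.Embedding(max_len, d_model)   # usual source positions
        self.cross_pe = nn.Embedding(max_len, d_model)  # aligned target positions

    def forward(self, token_embeds, aligned_positions):
        # token_embeds:      (batch, src_len, d_model)
        # aligned_positions: (batch, src_len) indices from a word aligner
        src_positions = torch.arange(token_embeds.size(1),
                                     device=token_embeds.device)
        return (token_embeds
                + self.mono_pe(src_positions)           # broadcast over batch
                + self.cross_pe(aligned_positions))
```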

Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.266 article EN cc-by 2021-01-01

Virtual film production requires intricate decision-making processes, including scriptwriting, virtual cinematography, and precise actor positioning and actions. Motivated by recent advances in automated decision-making with language agent-based societies, this paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework for end-to-end film automation in our constructed 3D virtual spaces. FilmAgent simulates various crew roles, including directors, screenwriters, actors, and cinematographers, and covers key stages of...

10.48550/arxiv.2501.12909 preprint EN arXiv (Cornell University) 2025-01-22
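
Role-based multi-agent collaboration of this kind can be sketched in a few lines. The role prompts and the chat_llm stand-in below are hypothetical; FilmAgent's actual agents, stages, and feedback loops are richer than this:

```python
ROLE_PROMPTS = {
    "director": "You are the director. Give staging notes for the scene.",
    "screenwriter": "You are the screenwriter. Draft dialogue for the scene.",
    "cinematographer": "You are the cinematographer. Choose camera setups.",
}

def chat_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for any chat-style LLM call (assumed helper)."""
    return f"<{system_prompt.split('.')[0]} responds to the brief>"

def produce_scene(scene_brief: str) -> dict:
    """Each crew-role agent contributes in turn; later roles see the
    accumulating draft, mimicking iterative multi-agent collaboration."""
    draft = scene_brief
    outputs = {}
    for role, system in ROLE_PROMPTS.items():
        outputs[role] = chat_llm(system, draft)
        draft += f"\n\n[{role}] {outputs[role]}"
    return outputs

print(produce_scene("Two characters argue in a rainy alley at night."))
```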

Recent advancements in Multimodal Large Language Models (MLLMs) underscore the significance of scalable models and data to boost performance, yet this often incurs substantial computational costs. Although the Mixture of Experts (MoE) architecture has been employed to scale large language models or visual-language models efficiently, these efforts typically involve fewer experts and limited modalities. To address this, our work presents a pioneering attempt to develop a unified MLLM with MoE architecture, named Uni-MoE, that...

10.1109/tpami.2025.3532688 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01
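
For readers unfamiliar with MoE, a generic top-k routed feed-forward layer looks roughly like the sketch below; this is textbook sparse MoE, not Uni-MoE's specific unified multimodal design:

```python
import torch
import torch.nn as nn

class TopKMoELayer(nn.Module):
    """Sparse mixture-of-experts feed-forward layer: a router picks the
    top-k experts per token, so compute grows slower than capacity."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                           nn.Linear(d_ff, d_model))
             for _ in range(n_experts)])
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        gate = torch.softmax(self.router(x), dim=-1)  # (tokens, n_experts)
        weights, idx = gate.topk(self.k, dim=-1)      # keep top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):                    # plain loop for clarity
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```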

Despite being pretrained on multilingual corpora, large language models (LLMs) exhibit suboptimal performance on low-resource languages. Recent approaches have leveraged multilingual encoders alongside LLMs by introducing trainable parameters connecting the two models. However, these methods typically focus on the encoder's output, overlooking valuable information from other layers. We propose \aname (\mname), a framework that integrates representations from all encoder layers, coupled with the \attaname mechanism to...

10.48550/arxiv.2502.11405 preprint EN arXiv (Cornell University) 2025-02-16
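
Since the framework's own names did not survive extraction (the \aname/\attaname macros above), here is only a generic sketch of the core idea the abstract does state, integrating representations from all encoder layers, using an assumed softmax-weighted fusion:

```python
import torch
import torch.nn as nn

class LayerwiseEncoderFusion(nn.Module):
    """Combine hidden states from *all* encoder layers (not just the last)
    with learned per-layer weights, then project to the LLM's width."""

    def __init__(self, n_layers=13, enc_dim=768, llm_dim=4096):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        self.proj = nn.Linear(enc_dim, llm_dim)

    def forward(self, all_hidden_states):
        # all_hidden_states: tuple of (batch, seq, enc_dim), one per layer,
        # e.g. from a HuggingFace encoder with output_hidden_states=True.
        stacked = torch.stack(all_hidden_states, dim=0)   # (L, b, s, d)
        w = torch.softmax(self.layer_logits, dim=0)       # per-layer weight
        fused = (w[:, None, None, None] * stacked).sum(0) # (b, s, d)
        return self.proj(fused)                           # feed to the LLM
```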

Multi-aspect controllable text generation aims to control attributes of the generated text from multiple aspects, making it a complex but powerful task in natural language processing. Supervised fine-tuning methods are often employed for this task due to their simplicity and effectiveness. However, they still have some limitations: low-rank adaptation (LoRA) only fine-tunes a few parameters and has suboptimal effects, while full fine-tuning (FFT) requires significant computational resources and is susceptible to overfitting, particularly when...

10.48550/arxiv.2502.13474 preprint EN arXiv (Cornell University) 2025-02-19
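
To ground the LoRA-versus-FFT trade-off the abstract discusses, here is a generic LoRA layer sketch (standard low-rank adaptation, not this paper's proposed method):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Low-rank adaptation: freeze the pretrained weight W and learn a
    rank-r update B @ A, so only r*(d_in + d_out) parameters train."""

    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r                      # standard LoRA scaling

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap a projection of a pretrained model with a trainable adapter.
layer = LoRALinear(nn.Linear(768, 768))
y = layer(torch.randn(2, 10, 768))
```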