Daixuan Cheng

ORCID: 0000-0003-0405-9707
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Speech and dialogue systems
  • Domain Adaptation and Few-Shot Learning
  • Speech Recognition and Synthesis
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Advanced Image and Video Retrieval Techniques
  • Robot Manipulation and Learning
  • 3D Shape Modeling and Analysis
  • Machine Learning and Data Classification
  • Computational and Text Analysis Methods

Affiliations
  • Switch (2023)
  • Microsoft (Finland) (2022-2023)
  • Beijing University of Posts and Telecommunications (2021-2023)

Publications

UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation. Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Weiwei Deng, Qi Zhang. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.

10.18653/v1/2023.emnlp-main.758 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Unsupervised multitask pre-training has been the critical method behind the recent success of language models (LMs). However, supervised multitask learning still holds significant promise, as scaling it in the post-training stage trends towards better generalization. In this paper, we explore supervised multitask pre-training by proposing Instruction Pre-Training, a framework that scalably augments massive raw corpora with instruction-response pairs to pre-train LMs. The instruction-response pairs are generated by an efficient instruction synthesizer built on open-source...

10.48550/arxiv.2406.14491 preprint EN arXiv (Cornell University) 2024-06-20
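
The abstract above sketches a data-side recipe: pair raw text with synthesized instruction-response pairs before pre-training. Below is a minimal, hypothetical Python sketch of that augmentation step; `synthesize_pairs` stands in for the paper's instruction synthesizer and is stubbed out rather than backed by a real model call.

```python
from typing import List, Tuple

def synthesize_pairs(raw_text: str) -> List[Tuple[str, str]]:
    """Stub for the instruction synthesizer: given raw text, return
    (instruction, response) pairs grounded in that text."""
    # A real synthesizer would be an LM call; this stub is illustrative only.
    return [("Summarize the passage.", raw_text[:80] + "...")]

def build_pretraining_example(raw_text: str) -> str:
    """Concatenate raw text with its synthesized pairs into one
    pre-training sequence, mirroring the augmentation idea."""
    parts = [raw_text]
    for instruction, response in synthesize_pairs(raw_text):
        parts.append(f"Instruction: {instruction}\nResponse: {response}")
    return "\n\n".join(parts)

corpus = ["Transformers process tokens in parallel using self-attention."]
augmented = [build_pretraining_example(doc) for doc in corpus]
print(augmented[0])
```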

10.18653/v1/2024.emnlp-main.148 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

10.1016/j.ipm.2021.102605 article EN Information Processing & Management 2021-04-23

We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows the model with domain knowledge, but drastically hurts its prompting ability for question answering. Taking inspiration from human learning via reading comprehension--practice after reading improves the ability to answer questions based on the learned knowledge--we propose a simple method for transforming raw corpora into reading comprehension texts. Each raw text is enriched with a series of tasks related to its content. Our...

10.48550/arxiv.2309.09530 preprint EN other-oa arXiv (Cornell University) 2023-01-01
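
As a rough illustration of the transformation described above, the sketch below appends automatically mined tasks to a raw text. The two mining rules are invented stand-ins for the paper's pattern set, not its actual rules.

```python
import re

def mine_tasks(text: str) -> list:
    tasks = []
    # Assumed rule 1: turn a definition sentence "X is Y" into a QA task.
    m = re.search(r"(\b[A-Z][\w-]*\b) is ([^.]+)\.", text)
    if m:
        tasks.append(f"Question: What is {m.group(1)}?\nAnswer: {m.group(2)}.")
    # Assumed rule 2: always append a summarization task.
    tasks.append("Task: Summarize the text above in one sentence.")
    return tasks

def to_reading_comprehension(text: str) -> str:
    # Raw text followed by its mined tasks, in reading-comprehension style.
    return text + "\n\n" + "\n\n".join(mine_tasks(text))

doc = "Hemoglobin is a protein that carries oxygen in red blood cells."
print(to_reading_comprehension(doc))
```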

Vision-Language Pretraining (VLP) has significantly improved the performance of various vision-language tasks by matching images and texts. In this paper, we propose VL-Match, a framework with Enhanced Token-level and Instance-level Matching. At the token level, a Replaced Token Detection task is designed to boost substantial interaction between text tokens and images, where the text encoder of VLP works as a generator to generate corrupted text, and a multimodal discriminator predicts whether each token in the corrupted text matches the image. At the instance...

10.1109/iccv51070.2023.00244 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
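
To make the token-level Replaced Token Detection setup concrete, here is a toy PyTorch sketch under assumed shapes: per-token fused image-text features go through a binary head that predicts whether each token was replaced by the generator. All module names and dimensions are illustrative, not the paper's code.

```python
import torch
import torch.nn as nn

class RTDHead(nn.Module):
    """Binary per-token classifier: does this (possibly corrupted) text
    token match the paired image?"""
    def __init__(self, dim: int):
        super().__init__()
        self.classifier = nn.Linear(dim, 1)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        # fused: (batch, seq_len, dim) multimodal features -> per-token logits
        return self.classifier(fused).squeeze(-1)

batch, seq_len, dim = 2, 16, 768
fused = torch.randn(batch, seq_len, dim)          # stand-in fused features
replaced = torch.randint(0, 2, (batch, seq_len))  # 1 = replaced by generator

head = RTDHead(dim)
logits = head(fused)
loss = nn.functional.binary_cross_entropy_with_logits(logits, replaced.float())
print(loss.item())
```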

Recent years have witnessed the rapid development of general multimodal large language models (MLLMs). However, adapting MLLMs to specific domains, such as scientific fields and industrial applications, remains less explored. This paper systematically investigates domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation. (1) Data Synthesis: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse...

10.48550/arxiv.2411.19930 preprint EN arXiv (Cornell University) 2024-11-29

Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization. We propose UPRISE (Universal Prompt Retrieval for Improving zero-Shot Evaluation), which tunes a lightweight and versatile retriever that automatically retrieves prompts for a given zero-shot task input. Specifically, we demonstrate universality in a cross-task and cross-model scenario: the retriever is tuned on a diverse set of tasks, but tested on unseen...

10.48550/arxiv.2303.08518 preprint EN other-oa arXiv (Cornell University) 2023-01-01
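
A minimal sketch of the retrieval step described above, assuming a stand-in encoder (a hashed bag-of-words here, purely so the example runs; the actual retriever is a tuned neural encoder): score a pool of candidate prompts against the zero-shot task input and prepend the top matches.

```python
import numpy as np

def encode(text: str, dim: int = 64) -> np.ndarray:
    # Toy stand-in for the tuned retriever encoder: hashed bag-of-words.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve_prompts(task_input: str, pool: list, k: int = 2) -> list:
    # Rank candidate prompts by cosine similarity to the task input.
    q = encode(task_input)
    scores = [(float(encode(p) @ q), p) for p in pool]
    return [p for _, p in sorted(scores, reverse=True)[:k]]

pool = [
    "Decide if the sentence expresses positive or negative sentiment.",
    "Answer the question using common sense.",
    "Translate the sentence into French.",
]
x = "Question: why do birds migrate in winter?"
prompt = "\n".join(retrieve_prompts(x, pool)) + "\n" + x
print(prompt)
```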

Response generation is a fundamental function in conversational systems, where controllability of the response is a key problem. In this paper, we consider how to control response generation by lexical constraints, namely lexically constrained response generation. The stochastic search-based methods have achieved promising performance in satisfying constraints. The idea of these methods is modifying a sentence through actions of insertion, deletion, and replacement, guided by an optimization algorithm. The core of our method is incorporating the constraints while preserving...

10.1109/bigdata52589.2021.9671855 article EN 2021 IEEE International Conference on Big Data (Big Data) 2021-12-15
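
The stochastic-search idea above can be illustrated with a toy hill-climbing loop over insert/delete/replace edits; the scoring function below is a crude stand-in for the LM-based fluency and constraint-satisfaction objective such methods optimize.

```python
import random

random.seed(0)
VOCAB = ["the", "weather", "is", "really", "nice", "today", "outside"]

def score(tokens, constraints):
    coverage = sum(c in tokens for c in constraints)  # satisfied constraints
    brevity = -0.05 * len(tokens)                     # toy fluency proxy
    return coverage + brevity

def propose(tokens):
    # Apply one random insert / delete / replace edit.
    t = tokens[:]
    i = random.randrange(len(t) + 1)
    action = random.choice(["insert", "delete", "replace"])
    if action == "insert":
        t.insert(i, random.choice(VOCAB))
    elif t:
        j = min(i, len(t) - 1)
        if action == "delete":
            del t[j]
        else:
            t[j] = random.choice(VOCAB)
    return t

def search(constraints, steps=300):
    best = ["the"]
    for _ in range(steps):
        cand = propose(best)
        if score(cand, constraints) > score(best, constraints):
            best = cand  # greedy accept; real methods use stochastic accepts
    return " ".join(best)

print(search(["weather", "nice"]))
```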

Transformer-based pretrained language models (PLMs) have achieved great success in modern NLP. An important advantage of PLMs is good out-of-distribution (OOD) robustness. Recently, diffusion models have attracted a lot of work to apply them to PLMs. It remains under-explored how diffusion influences PLMs on OOD data. The core of diffusion models is the forward process, which gradually applies Gaussian noise to inputs, and the reverse denoising process, which removes the noise. The noised input reconstruction is a fundamental ability of diffusion models. We directly analyze OOD robustness by measuring the...

10.48550/arxiv.2307.13949 preprint EN other-oa arXiv (Cornell University) 2023-01-01
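
For reference, the forward process mentioned above is q(x_t | x_0) = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The sketch below applies it to stand-in embeddings under an assumed linear noise schedule and reports the corruption level, the quantity a reconstruction-based robustness probe would build on; schedule and shapes are assumptions for illustration.

```python
import torch

torch.manual_seed(0)
T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # assumed linear schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product

def forward_noise(x0: torch.Tensor, t: int) -> torch.Tensor:
    # Sample x_t ~ q(x_t | x_0) in closed form.
    eps = torch.randn_like(x0)
    return alpha_bar[t].sqrt() * x0 + (1.0 - alpha_bar[t]).sqrt() * eps

x0 = torch.randn(8, 128)  # stand-in continuous embeddings of 8 tokens
for t in (50, 500, 950):
    xt = forward_noise(x0, t)
    # With an ideal denoiser, reconstruction error of x0 from x_t would be
    # measured here; we report the raw corruption level instead.
    print(t, torch.mean((xt - x0) ** 2).item())
```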

Discriminative pre-trained language models, such as ELECTRA, have achieved promising performances in a variety of general tasks. However, these generic models struggle to capture domain-specific knowledge in domain-related tasks. In this work, we propose a novel domain-adaptation method for ELECTRA, which can dynamically select domain-specific tokens and guide the discriminator to emphasize them, without introducing new training parameters. We show that by re-weighting the losses of domain-specific tokens, ELECTRA can be effectively adapted to different...

10.18653/v1/2022.findings-emnlp.163 article EN cc-by Findings of the Association for Computational Linguistics: EMNLP 2022 2022-01-01
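
A hedged sketch of the re-weighting idea above: scale ELECTRA's per-token discriminator loss by a domain-relevance weight, adding no new trainable parameters. The weights below are random placeholders; the paper's actual token-selection criterion is not reproduced here.

```python
import torch
import torch.nn.functional as F

batch, seq_len = 2, 8
logits = torch.randn(batch, seq_len)                     # discriminator logits
labels = torch.randint(0, 2, (batch, seq_len)).float()   # 1 = replaced token
domain_weight = torch.rand(batch, seq_len) + 0.5         # placeholder relevance

# Per-token replaced-token-detection loss, re-weighted before averaging.
per_token = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
loss = (domain_weight * per_token).mean()
print(loss.item())
```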