- Natural Language Processing Techniques
- Topic Modeling
- Speech and Dialogue Systems
- Domain Adaptation and Few-Shot Learning
- Speech Recognition and Synthesis
- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Robot Manipulation and Learning
- 3D Shape Modeling and Analysis
- Machine Learning and Data Classification
- Computational and Text Analysis Methods
- Switch (2023)
- Microsoft (Finland) (2022-2023)
- Beijing University of Posts and Telecommunications (2021-2023)
Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Weiwei Deng, Qi Zhang. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
Unsupervised multitask pre-training has been the critical method behind the recent success of language models (LMs). However, supervised multitask learning still holds significant promise, as scaling it in the post-training stage trends towards better generalization. In this paper, we explore supervised multitask pre-training by proposing Instruction Pre-Training, a framework that scalably augments massive raw corpora with instruction-response pairs to pre-train LMs. The instruction-response pairs are generated by an efficient instruction synthesizer built on open-source...
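To make the data-construction side of this recipe concrete, here is a minimal sketch of augmenting raw documents with synthesized instruction-response pairs before pre-training; `synthesize_pairs` is a hypothetical stand-in for the instruction synthesizer described in the abstract.

```python
# Sketch: turn raw documents into instruction-augmented pre-training texts.
# `synthesize_pairs` is a placeholder; in practice it would prompt an
# open-source instruction-tuned model to produce grounded Q/A pairs.
from typing import Dict, List

def synthesize_pairs(raw_text: str) -> List[Dict[str, str]]:
    """Placeholder: return instruction-response pairs grounded in `raw_text`."""
    return [{"instruction": "Summarize the passage.", "response": raw_text[:200]}]

def augment_corpus(raw_docs: List[str]) -> List[str]:
    """Concatenate each raw document with its synthesized pairs into one training text."""
    augmented = []
    for doc in raw_docs:
        pairs = synthesize_pairs(doc)
        qa_block = "\n".join(f"Q: {p['instruction']}\nA: {p['response']}" for p in pairs)
        augmented.append(f"{doc}\n\n{qa_block}")
    return augmented

corpus = ["Raw web document about photosynthesis ..."]
pretraining_texts = augment_corpus(corpus)
```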
We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows the model with domain knowledge, but drastically hurts its prompting ability for question answering. Taking inspiration from human learning via reading comprehension--practice after reading improves the ability to answer questions based on the learned knowledge--we propose a simple method for transforming raw corpora into reading comprehension texts. Each raw text is enriched with a series of tasks related to its content. Our...
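A toy sketch of the transformation step, assuming comprehension tasks are mined from the raw text itself; the regex pattern and task format below are illustrative, not the paper's actual mining rules.

```python
# Sketch: append simple comprehension tasks derived from a raw domain text.
import re

def to_reading_comprehension(raw_text: str) -> str:
    """Enrich a raw text with question-answer tasks mined from its own sentences."""
    tasks = []
    # Hypothetical pattern: turn "X is Y." statements into "What is X?" questions.
    for match in re.finditer(r"([A-Z][\w\s]{2,40}) is ([^.]{3,80})\.", raw_text):
        subject, definition = match.group(1).strip(), match.group(2).strip()
        tasks.append(f"Question: What is {subject}?\nAnswer: {definition}")
    return raw_text + ("\n\n" + "\n\n".join(tasks) if tasks else "")

doc = "Photosynthesis is the process plants use to convert light into energy."
print(to_reading_comprehension(doc))
```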
Vision-Language Pretraining (VLP) has significantly improved the performance of various vision-language tasks by matching images and texts. In this paper, we propose VL-Match, a framework with Enhanced Token-level and Instance-level Matching. At the token level, a Replaced Token Detection task is designed to boost substantial interaction between text tokens and images, where the text encoder of VLP works as a generator to generate corrupted text, and the multimodal encoder works as a discriminator to predict whether each token in the corrupted text matches the image. At the instance...
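A schematic sketch of the token-level Replaced Token Detection step described above; `generator` and `discriminator` are assumed module interfaces (a masked-LM text encoder and a multimodal encoder with a per-token binary head), not the paper's exact implementation.

```python
# Sketch: replaced-token detection for vision-language pretraining.
import torch
import torch.nn.functional as F

def rtd_step(generator, discriminator, text_ids, attn_mask, image_feats, mask_prob=0.15):
    # 1) Mask a subset of text tokens and let the generator fill them in.
    mask = (torch.rand_like(text_ids, dtype=torch.float) < mask_prob) & attn_mask.bool()
    masked_ids = text_ids.masked_fill(mask, generator.mask_token_id)
    with torch.no_grad():
        logits = generator(masked_ids, attn_mask)   # [B, T, V]
        sampled = logits.argmax(-1)
    corrupted = torch.where(mask, sampled, text_ids)

    # 2) Discriminator predicts, for every token, whether it matches the image.
    token_logits = discriminator(corrupted, attn_mask, image_feats)  # [B, T]
    labels = (corrupted != text_ids).float()  # 1 = replaced (does not match)
    return F.binary_cross_entropy_with_logits(token_logits, labels,
                                              weight=attn_mask.float())
```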
Recent years have witnessed the rapid development of general multimodal large language models (MLLMs). However, adapting MLLMs to specific domains, such as scientific fields and industrial applications, remains less explored. This paper systematically investigates domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation. (1) Data Synthesis: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse...
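A minimal sketch of one plausible post-training data pipeline, assuming domain image-caption pairs are mixed with synthesized visual instruction tasks in a single stage; the function and ratio below are illustrative only.

```python
# Sketch: mix captioning examples with synthesized instruction tasks for post-training.
import random

def build_posttraining_mix(caption_pairs, synthesized_tasks, task_ratio=0.5, seed=0):
    """Interleave image-caption examples with synthesized visual instruction tasks."""
    rng = random.Random(seed)
    n_tasks = int(len(caption_pairs) * task_ratio)
    mix = caption_pairs + rng.sample(synthesized_tasks, min(n_tasks, len(synthesized_tasks)))
    rng.shuffle(mix)
    return mix
```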
Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization. We propose UPRISE (Universal Prompt Retrieval for Improving zero-Shot Evaluation), which tunes a lightweight and versatile retriever that automatically retrieves prompts for a given zero-shot task input. Specifically, we demonstrate universality in a cross-task and cross-model scenario: the retriever is tuned on a diverse set of tasks, but tested on unseen...
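To illustrate the retrieval step, here is a minimal sketch that scores a prompt pool against a zero-shot input and prepends the best match; TF-IDF similarity stands in for the learned retriever, and the pool contents are made up for the example.

```python
# Sketch: retrieve a prompt for a zero-shot input and prepend it to the LLM query.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

prompt_pool = [
    "Classify the sentiment of the following review.",
    "Answer the question based on the passage.",
    "Translate the sentence into French.",
]

def retrieve_prompt(task_input, pool, k=1):
    vec = TfidfVectorizer().fit(pool + [task_input])
    P = vec.transform(pool).toarray()
    q = vec.transform([task_input]).toarray()[0]
    scores = P @ q / (np.linalg.norm(P, axis=1) * np.linalg.norm(q) + 1e-9)
    return [pool[i] for i in np.argsort(-scores)[:k]]

query = "The movie was a delightful surprise. Positive or negative?"
llm_input = f"{retrieve_prompt(query, prompt_pool)[0]}\n\n{query}"
```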
Response generation is a fundamental function in conversational systems, where controllability of the response is a key problem. In this paper, we consider how to control response generation with lexical constraints, namely lexically constrained response generation. Stochastic search-based methods have achieved promising performance in satisfying such constraints. The idea of these methods is to modify a sentence through actions of insertion, deletion, and replacement, guided by an optimization algorithm. The core of our method is incorporating the constraints while preserving...
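A toy sketch of the stochastic-search idea: edit a sentence with insertion, deletion, and replacement actions while rejecting edits that drop a constraint word. The `score` objective below is a trivial placeholder for a learned fluency/relevance model.

```python
# Sketch: stochastic local search under lexical constraints.
import random

def score(tokens):
    # Placeholder objective: mild preference for shorter, non-repetitive sentences.
    return -len(tokens) - sum(tokens[i] == tokens[i - 1] for i in range(1, len(tokens)))

def stochastic_search(seed_tokens, constraints, vocab, steps=200, rng=random.Random(0)):
    current = list(seed_tokens)
    for _ in range(steps):
        action = rng.choice(["insert", "delete", "replace"])
        cand = list(current)
        pos = rng.randrange(len(cand) + (action == "insert"))
        if action == "insert":
            cand.insert(pos, rng.choice(vocab))
        elif action == "delete" and len(cand) > 1:
            del cand[pos]
        elif action == "replace":
            cand[pos] = rng.choice(vocab)
        # Reject edits that violate the lexical constraints.
        if not all(c in cand for c in constraints):
            continue
        if score(cand) >= score(current):
            current = cand
    return current

out = stochastic_search(["i", "like", "coffee"], constraints={"coffee"},
                        vocab=["really", "hot", "i", "like", "coffee"])
```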
Transformer-based pretrained language models (PLMs) have achieved great success in modern NLP. An important advantage of PLMs is good out-of-distribution (OOD) robustness. Recently, diffusion models have attracted a lot of work applying them to PLMs. It remains under-explored how diffusion influences PLMs on OOD data. The core of diffusion models is a forward process, which gradually applies Gaussian noise to inputs, and a reverse denoising process, which removes the noise. Noised input reconstruction is a fundamental ability of diffusion models. We directly analyze robustness by measuring the...
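For reference, a minimal sketch of the forward (noising) process on a continuous input embedding, following the standard closed-form q(x_t | x_0); the linear noise schedule and dimensions are illustrative.

```python
# Sketch: forward diffusion q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).
import numpy as np

def forward_diffuse(x0, t, betas, rng=np.random.default_rng(0)):
    alphas = 1.0 - betas
    alpha_bar_t = np.prod(alphas[: t + 1])
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

betas = np.linspace(1e-4, 0.02, 1000)   # linear noise schedule
x0 = np.random.randn(16)                # e.g., a token embedding
xt = forward_diffuse(x0, t=500, betas=betas)
# A denoising model is trained to recover x0 (or the noise eps) from xt;
# reconstruction quality on OOD inputs is one way to probe robustness.
```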
Discriminative pre-trained language models, such as ELECTRA, have achieved promising performance in a variety of general tasks. However, these generic models struggle to capture domain-specific knowledge in domain-related tasks. In this work, we propose a novel domain-adaptation method for such models, which can dynamically select domain-specific tokens and guide the discriminator to emphasize them, without introducing new training parameters. We show that by re-weighting the losses of these tokens, ELECTRA can be effectively adapted to different...
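A rough sketch of the loss re-weighting idea, assuming domain-specificity is scored by a frequency ratio between a domain corpus and a general corpus; this scoring choice and the weighting formula are illustrative, not necessarily the paper's.

```python
# Sketch: up-weight the discriminator loss on domain-specific tokens.
import torch

def domain_weights(token_ids, domain_freq, general_freq, alpha=1.0):
    """Weight = 1 + alpha * log frequency ratio, clamped so general tokens keep weight 1."""
    ratio = torch.log((domain_freq[token_ids] + 1.0) / (general_freq[token_ids] + 1.0))
    return 1.0 + alpha * ratio.clamp(min=0.0)

def reweighted_rtd_loss(per_token_loss, token_ids, attn_mask, domain_freq, general_freq):
    """Average the per-token ELECTRA losses with domain-aware weights over valid tokens."""
    w = domain_weights(token_ids, domain_freq, general_freq) * attn_mask
    return (per_token_loss * w).sum() / w.sum().clamp(min=1.0)
```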