Xiaozhi Wang

ORCID: 0000-0002-5727-143X
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Semantic Web and Ontologies
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Software Engineering Research
  • Advanced Graph Neural Networks
  • Data Quality and Management
  • Recommender Systems and Techniques
  • Scientific Computing and Data Management
  • Domain Adaptation and Few-Shot Learning
  • Computational and Text Analysis Methods
  • Distributed and Parallel Computing Systems
  • Cloud Computing and Resource Management
  • Speech Recognition and Synthesis
  • Biomedical Text Mining and Ontologies
  • Text and Document Classification Technologies
  • Speech and dialogue systems
  • Advanced Neural Network Applications
  • Advanced Data Storage Technologies
  • Video Analysis and Summarization
  • Reinforcement Learning in Robotics
  • Neuroscience of respiration and sleep
  • Emotion and Mood Recognition

Binzhou University
2024

Binzhou Medical University
2024

Tsinghua University
2018-2024

Zhejiang University
2024

Beijing Academy of Artificial Intelligence
2020-2022

Nanchang Institute of Science & Technology
2016

Hebei GEO University
2009

Southeast University
2004-2005

Abstract Pre-trained language representation models (PLMs) cannot well capture factual knowledge from text. In contrast, knowledge embedding (KE) methods can effectively represent the relational facts in knowledge graphs (KGs) with informative entity embeddings, but conventional KE methods cannot take full advantage of the abundant textual information. In this paper, we propose a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced...

10.1162/tacl_a_00360 article EN cc-by Transactions of the Association for Computational Linguistics 2021-03-01
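The KEPLER entry above couples knowledge embedding with masked language modeling over a shared encoder. Below is a minimal PyTorch sketch of that joint-objective idea, not the authors' code: ToyEncoder, the TransE-style ke_loss, and all tensor shapes are illustrative assumptions.

```python
# Minimal sketch of a joint KE + MLM objective over one shared text encoder.
# ToyEncoder and the data are stand-ins, not the KEPLER implementation.
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for the shared PLM encoder."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings as a crude sentence/entity representation.
        return self.proj(self.emb(token_ids).mean(dim=1))

def ke_loss(h, r, t, margin=1.0):
    """TransE-style margin loss; entity embeddings come from encoded descriptions."""
    score = torch.norm(h + r - t, p=2, dim=-1)
    return torch.relu(score - margin).mean()

encoder = ToyEncoder()
rel_emb = nn.Embedding(10, 64)            # relation embeddings
mlm_head = nn.Linear(64, 1000)            # toy masked-LM head

head_desc = torch.randint(0, 1000, (8, 16))   # tokenized entity descriptions
tail_desc = torch.randint(0, 1000, (8, 16))
rel_ids = torch.randint(0, 10, (8,))
masked_text = torch.randint(0, 1000, (8, 16))
mlm_labels = torch.randint(0, 1000, (8,))

h, t = encoder(head_desc), encoder(tail_desc)
loss_ke = ke_loss(h, rel_emb(rel_ids), t)
loss_mlm = nn.functional.cross_entropy(mlm_head(encoder(masked_text)), mlm_labels)
loss = loss_ke + loss_mlm                 # single objective over the shared encoder
loss.backward()
```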

Abstract With the prevalence of pre-trained language models (PLMs) and the pre-training–fine-tuning paradigm, it has been continuously shown that larger models tend to yield better performance. However, as PLMs scale up, fine-tuning and storing all the parameters is prohibitively costly and eventually becomes practically infeasible. This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, which optimizes a small portion of the model parameters while keeping the rest fixed, drastically cutting down...

10.1038/s42256-023-00626-4 article EN cc-by Nature Machine Intelligence 2023-03-02
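As a rough illustration of the parameter-efficient adaptation described above, the sketch below freezes a backbone and trains only a small residual adapter plus a classifier head; the module sizes and names (backbone, adapter) are assumptions, not the paper's setup.

```python
# Minimal sketch of parameter-efficient (delta) tuning in PyTorch:
# freeze the backbone and update only a small adapter and head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
for p in backbone.parameters():
    p.requires_grad = False              # keep the large model fixed

adapter = nn.Sequential(nn.Linear(128, 8), nn.ReLU(), nn.Linear(8, 128))  # bottleneck
classifier = nn.Linear(128, 2)

trainable = list(adapter.parameters()) + list(classifier.parameters())
optim = torch.optim.AdamW(trainable, lr=1e-3)

x, y = torch.randn(4, 128), torch.randint(0, 2, (4,))
hidden = backbone(x)
logits = classifier(hidden + adapter(hidden))   # residual adapter on frozen features
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
optim.step()

n_frozen = sum(p.numel() for p in backbone.parameters())
n_tuned = sum(p.numel() for p in trainable)
print(f"tuned {n_tuned} of {n_tuned + n_frozen} parameters")
```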

Abstract As pre-trained language models (PLMs) have become the fundamental infrastructure for various NLP tasks and researchers have readily enjoyed themselves in the pretraining-finetuning paradigm, evidence from emerging research has continuously proven that larger models tend to yield better performance. However, despite the welcome outcome, the process of fine-tuning large-scale PLMs brings prohibitive adaptation costs. In fact, fine-tuning all parameters of a colossal model and retaining separate instances for different...

10.21203/rs.3.rs-1553541/v1 preprint EN cc-by Research Square (Research Square) 2022-06-22

Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, Peng Li. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1105 article EN 2019-01-01

Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.

10.18653/v1/2020.emnlp-main.129 article EN cc-by 2020-01-01

Xiaozhi Wang, Ziqi Wang, Xu Han, Zhiyuan Liu, Juanzi Li, Peng Li, Maosong Sun, Jie Zhou, Xiang Ren. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1584 article EN 2019-01-01

Ziqi Wang, Xiaozhi Wang, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li, Jie Zhou. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.491 article EN cc-by 2021-01-01

Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese data. To the best of our knowledge, CPM, with 2.6...

10.1016/j.aiopen.2021.07.001 article EN cc-by-nc-nd AI Open 2021-01-01

Prompt tuning (PT) is a promising parameter-efficient method to utilize extremely large pre-trained language models (PLMs), which can achieve performance comparable to full-parameter fine-tuning by only tuning a few soft prompts. However, PT requires much more training time than fine-tuning. Intuitively, knowledge transfer can help to improve the efficiency. To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work. We...

10.18653/v1/2022.naacl-main.290 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01
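To make the prompt-transfer idea above concrete, here is a hedged sketch: soft prompt vectors are prepended to frozen input embeddings, and a prompt tuned on a source task initializes the target-task prompt. All names (frozen_embed, source_prompt) and sizes are illustrative assumptions.

```python
# Sketch of cross-task prompt transfer: only the soft prompt is trainable,
# and it is initialized from a prompt previously tuned on a source task.
import torch
import torch.nn as nn

dim, prompt_len = 64, 5
frozen_embed = nn.Embedding(1000, dim)
for p in frozen_embed.parameters():
    p.requires_grad = False

# Pretend this prompt was already tuned on a source task.
source_prompt = torch.randn(prompt_len, dim)

# Transfer: initialize the target prompt from the source prompt instead of randomly.
target_prompt = nn.Parameter(source_prompt.clone())

def prepend_prompt(token_ids):
    tok = frozen_embed(token_ids)                      # (batch, seq, dim)
    prompt = target_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
    return torch.cat([prompt, tok], dim=1)             # (batch, prompt_len + seq, dim)

inputs = prepend_prompt(torch.randint(0, 1000, (2, 10)))
print(inputs.shape)  # torch.Size([2, 15, 64]) -- would be fed to the frozen PLM
```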

Pre-trained language representation models (PLMs) cannot well capture factual knowledge from text. In contrast, knowledge embedding (KE) methods can effectively represent the relational facts in knowledge graphs (KGs) with informative entity embeddings, but conventional KE methods cannot take full advantage of the abundant textual information. In this paper, we propose a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced KE with the strong...

10.48550/arxiv.1911.06136 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Recently, pre-trained language models mostly follow the pre-train-then-fine-tune paradigm and have achieved great performance on various downstream tasks. However, since the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always well capture the domain-specific and task-specific patterns. In this paper, we propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained...

10.18653/v1/2020.emnlp-main.566 article EN cc-by 2020-01-01
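A minimal sketch of the selective-masking step described above: instead of masking tokens uniformly at random, the most task-important tokens are masked for task-guided pre-training. The importance scores here stand in for the task-derived scores in the paper; MASK_ID and the mask ratio are assumptions.

```python
# Selective masking sketch: mask the tokens a task-importance scorer ranks highest,
# rather than a uniformly random subset.
import torch

MASK_ID = 0

def selective_mask(token_ids, importance, mask_ratio=0.15):
    """Replace the most task-important tokens with [MASK] for task-guided pre-training."""
    seq_len = token_ids.size(0)
    k = max(1, int(seq_len * mask_ratio))
    top = torch.topk(importance, k).indices        # indices of the most important tokens
    masked = token_ids.clone()
    labels = torch.full_like(token_ids, -100)      # -100 = ignore in the MLM loss
    labels[top] = token_ids[top]
    masked[top] = MASK_ID
    return masked, labels

tokens = torch.randint(1, 1000, (12,))
scores = torch.rand(12)                            # placeholder task-importance scores
masked, labels = selective_mask(tokens, scores)
print(masked, labels)
```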

Despite the success, the process of fine-tuning large-scale PLMs brings prohibitive adaptation costs. In fact, fine-tuning all parameters of a colossal model and retaining separate instances for different tasks are practically infeasible. This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed delta tuning in this paper. In contrast with standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched, largely reducing both the computation and storage costs. Recent studies have...

10.48550/arxiv.2203.06904 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Numerous benchmarks have been established to assess the performance of foundation models on open-ended question answering, which serves as a comprehensive test of a model's ability to understand and generate language in a manner similar to humans. Most of these works focus on proposing new datasets; however, we see two main issues within previous benchmarking pipelines, namely testing leakage and evaluation automation. In this paper, we propose a novel benchmarking framework, Language-Model-as-an-Examiner, where the LM serves as a knowledgeable...

10.48550/arxiv.2306.04181 preprint EN other-oa arXiv (Cornell University) 2023-01-01
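The framework above has one LM pose questions and grade another model's answers. The sketch below shows that examiner loop with placeholder callables standing in for real LM APIs; the prompts and the 1-5 grading scale are illustrative assumptions.

```python
# Sketch of an LM-as-examiner loop: one model drafts questions and grades answers.
from typing import Callable, List

def run_exam(examiner: Callable[[str], str],
             candidate: Callable[[str], str],
             topics: List[str]) -> List[dict]:
    records = []
    for topic in topics:
        question = examiner(f"Write one factual question about: {topic}")
        answer = candidate(question)
        verdict = examiner(
            f"Question: {question}\nAnswer: {answer}\n"
            "Grade this answer from 1 (wrong) to 5 (fully correct) and explain briefly."
        )
        records.append({"topic": topic, "question": question,
                        "answer": answer, "verdict": verdict})
    return records

# Toy stand-ins so the sketch runs without any API access.
echo_examiner = lambda prompt: f"[examiner output for: {prompt[:40]}...]"
echo_candidate = lambda prompt: f"[candidate answer to: {prompt[:40]}...]"
print(run_exam(echo_examiner, echo_candidate, ["the Eiffel Tower"])[0]["verdict"])
```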

The unprecedented performance of large language models (LLMs) necessitates improvements in evaluations. Rather than merely exploring the breadth of LLM abilities, we believe meticulous and thoughtful designs are essential to thorough, unbiased, and applicable evaluations. Given the importance of world knowledge to LLMs, we construct a Knowledge-oriented LLM Assessment benchmark (KoLA), in which we carefully design three crucial factors: (1) For ability modeling, we mimic human cognition to form a four-level taxonomy of knowledge-related...

10.48550/arxiv.2306.09296 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only focus on evaluating the factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for it is hard. Inspired by knowledge representation schemata, we comprehensively evaluate PLMs with three tasks, probing whether they organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For the tasks, we collect...

10.18653/v1/2022.emnlp-main.335 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

Abstract Tokenization is fundamental to pretrained language models (PLMs). Existing tokenization methods for Chinese PLMs typically treat each character as an indivisible token. However, they ignore the unique feature of the Chinese writing system, where additional linguistic information exists below the character level, i.e., at the sub-character level. To utilize such information, we propose sub-character (SubChar for short) tokenization. Specifically, we first encode the input text by converting each Chinese character into a short sequence based on its glyph or...

10.1162/tacl_a_00560 article EN cc-by Transactions of the Association for Computational Linguistics 2023-05-18
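As a toy illustration of the sub-character tokenization described above, the sketch below expands each Chinese character into a pronunciation-based string before an ordinary subword tokenizer would be applied; the char_to_pinyin table is a made-up stand-in for the paper's glyph/pronunciation encodings.

```python
# Sketch of sub-character tokenization: expand each character into a short
# linguistic sequence, then run standard subword tokenization on the result.
char_to_pinyin = {"你": "ni3", "好": "hao3", "语": "yu3", "言": "yan2"}

def subchar_encode(text: str) -> str:
    """Convert characters to toy pronunciation-based sub-character strings."""
    return " ".join(char_to_pinyin.get(ch, ch) for ch in text)

converted = subchar_encode("你好语言")
print(converted)              # "ni3 hao3 yu3 yan2"
# A standard subword tokenizer (BPE/unigram) would now operate on `converted`,
# so characters sharing pronunciation components can share subword units.
```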

Despite the recent emergence of video captioning models, how to generate vivid, fine-grained descriptions based on background knowledge (i.e., long and informative commentary about domain-specific scenes with appropriate reasoning) is still far from being solved, which nevertheless has great applications such as automatic sports narrative. Based on soccer game videos and synchronized commentary data, we present GOAL, a benchmark of over 8.9k clips, 22k sentences, and 42k knowledge triples, for proposing a challenging new task setting...

10.1145/3583780.3615120 article EN cc-by-nc 2023-10-21

Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to broad NLP tasks differing a lot superficially? In this work, we empirically find evidence indicating that the adaptations of PLMs to various few-shot tasks can be reparameterized as optimizing only a few free parameters in a unified low-dimensional intrinsic task subspace, which may help us understand why...

10.1109/taslp.2024.3430545 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2024-01-01
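To illustrate the reparameterization view above, the sketch below trains only a small intrinsic vector and maps it to full soft-prompt parameters through a fixed projection; the random projection is a simplification of the learned decomposition used in the paper, and all dimensions are assumptions.

```python
# Low-dimensional reparameterization sketch: only the intrinsic vector z is
# trainable; the full prompt parameters are a fixed projection of z.
import torch
import torch.nn as nn

prompt_len, dim, intrinsic_dim = 5, 64, 10
projection = torch.randn(prompt_len * dim, intrinsic_dim) / intrinsic_dim ** 0.5
projection.requires_grad = False                     # fixed subspace basis

z = nn.Parameter(torch.zeros(intrinsic_dim))         # the only trained parameters

def make_prompt():
    # Map the intrinsic vector to full soft-prompt parameters.
    return (projection @ z).view(prompt_len, dim)

prompt = make_prompt()
loss = prompt.pow(2).mean()                          # placeholder task loss
loss.backward()
print(z.grad.shape)                                  # torch.Size([10])
```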

While there is abundant research on evaluating ChatGPT on natural language understanding and generation tasks, few studies have investigated how ChatGPT's behavior changes over time. In this paper, we collect a coarse-to-fine temporal dataset called ChatLog, consisting of two parts that update monthly and daily: ChatLog-Monthly is a dataset of 38,730 question-answer pairs collected every month, including questions from both reasoning and classification tasks. ChatLog-Daily, on the other hand, consists...

10.48550/arxiv.2304.14106 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Transformer-based pre-trained language models have demonstrated superior performance on various natural language processing tasks. However, it remains unclear how the skills required to handle these tasks are distributed among model parameters. In this paper, we find that after prompt tuning for specific tasks, the activations of some neurons within pre-trained Transformers are highly predictive of the task labels. We dub these neurons skill neurons and confirm that they encode task-specific skills by finding that: (1) Skill neurons are crucial for handling tasks. Performances of a...

10.18653/v1/2022.emnlp-main.765 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01
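The sketch below illustrates the kind of per-neuron predictivity computation the skill-neuron finding suggests: threshold each neuron's activation at its mean and measure how well that predicts binary task labels. The random data, threshold choice, and top-k selection are illustrative assumptions, not the paper's exact procedure.

```python
# Locate "skill neurons" by how predictive their thresholded activations are
# of binary task labels; data here is random, purely to show the computation.
import torch

acts = torch.randn(200, 768)             # (examples, neurons) activations at a prompt token
labels = torch.randint(0, 2, (200,))     # binary task labels

thresholds = acts.mean(dim=0)                        # one baseline per neuron
preds = (acts > thresholds).long()                   # (examples, neurons) binary predictions
acc = (preds == labels.unsqueeze(1)).float().mean(0) # per-neuron accuracy
predictivity = torch.maximum(acc, 1 - acc)           # neuron may fire for either class
skill_neurons = torch.topk(predictivity, k=10).indices
print(skill_neurons)
```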

Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese data. To the best of our knowledge, CPM, with 2.6...

10.48550/arxiv.2012.00413 preprint EN other-oa arXiv (Cornell University) 2020-01-01