Jiaao Chen

ORCID: 0009-0004-8425-2893
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Sentiment Analysis and Opinion Mining
  • Misinformation and Its Impacts
  • Multimodal Machine Learning Applications
  • Opinion Dynamics and Social Influence
  • Domain Adaptation and Few-Shot Learning
  • Text and Document Classification Technologies
  • Speech and Dialogue Systems
  • Advanced Text Analysis Techniques
  • Artificial Intelligence in Healthcare and Education
  • Surface Roughness and Optical Measurements
  • Intelligent Tutoring Systems and Adaptive Learning
  • Advanced Graph Neural Networks
  • Bioinformatics and Genomic Networks
  • Industrial Vision Systems and Defect Detection
  • Semantic Web and Ontologies
  • Text Readability and Simplification
  • Complex Network Analysis Techniques
  • Multi-Agent Systems and Negotiation
  • Scientometrics and Bibliometrics Research
  • Seismic Imaging and Inversion Techniques
  • Optical Measurement and Interference Techniques
  • Digital Marketing and Social Media
  • Geophysical Methods and Applications

Georgia Institute of Technology
2020-2023

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large amount of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels of unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fined-tuned models and other state-of-the-art methods on...

10.18653/v1/2020.acl-main.194 preprint EN cc-by 2020-01-01
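
A minimal PyTorch sketch of the TMix idea described above, under stated assumptions: hidden states of two examples are interpolated at an intermediate Transformer layer with a Beta-sampled coefficient, and the supervised loss mixes the two labels with the same coefficient. The toy encoder, layer index, and Beta parameters are illustrative stand-ins, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TMixEncoder(nn.Module):
    """Toy Transformer encoder with TMix-style hidden-space interpolation."""

    def __init__(self, vocab_size=1000, dim=64, num_layers=4, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, x_a, x_b=None, lam=1.0, mix_layer=2):
        h = self.embed(x_a)
        h_b = self.embed(x_b) if x_b is not None else None
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if h_b is not None:
                h_b = layer(h_b)
                if i == mix_layer:             # interpolate in hidden space
                    h = lam * h + (1 - lam) * h_b
                    h_b = None                 # continue with the mixed states
        return self.classifier(h.mean(dim=1))  # mean-pool, then classify

# Usage: mix two batches; the loss mixes labels with the same coefficient.
model = TMixEncoder()
x_a, x_b = torch.randint(0, 1000, (8, 16)), torch.randint(0, 1000, (8, 16))
y_a, y_b = torch.randint(0, 2, (8,)), torch.randint(0, 2, (8,))
lam = torch.distributions.Beta(0.75, 0.75).sample().item()  # mixup-style prior
logits = model(x_a, x_b, lam=lam, mix_layer=2)
loss = lam * nn.functional.cross_entropy(logits, y_a) + \
       (1 - lam) * nn.functional.cross_entropy(logits, y_b)
```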

NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, and expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand...

10.1162/tacl_a_00542 article EN cc-by Transactions of the Association for Computational Linguistics 2023-01-01

Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models heavily rely on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one sentence,...

10.18653/v1/2020.emnlp-main.95 article EN cc-by 2020-01-01
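
A hedged sketch of the Intra-LADA variant mentioned above: token hidden states of a sentence are interpolated with the same sentence under a random token permutation, so per-token labels can be mixed position-by-position. The Beta prior and mixing at the representation level are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def intra_lada_mix(hidden, alpha=8.0):
    """Interpolate each token's hidden state with that of the same sentence
    under a random token permutation.

    hidden: (batch, seq_len, dim) token representations from an encoder.
    Returns the mixed states, the coefficient, and the permutation so that
    per-token tag distributions can be mixed the same way.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(hidden.size(1))            # shuffle token positions
    mixed = lam * hidden + (1 - lam) * hidden[:, perm, :]
    return mixed, lam, perm
```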

Conversational human-AI interaction (CHAI) has recently driven mainstream adoption of AI. However, CHAI poses two key challenges for designers and researchers: users frequently have ambiguous goals and an incomplete understanding of AI functionalities, and the interactions are brief and transient, limiting opportunities for sustained engagement with users. AI agents can help address these challenges by suggesting contextually relevant prompts, standing in for users during early design testing, and helping users better articulate their goals....

10.48550/arxiv.2501.18002 preprint EN arXiv (Cornell University) 2025-01-29

We present semi-supervised models with data augmentation (SMDA), a text classification system to classify interactive affective responses. SMDA utilizes recent transformer-based models to encode each sentence and employs back-translation techniques to paraphrase given sentences as augmented data. For labeled sentences, we performed data augmentations to uniform the label distributions and computed supervised loss during the training process. For unlabeled sentences, we explored self-training by regarding low-entropy predictions over...

10.48550/arxiv.2004.10972 preprint EN other-oa arXiv (Cornell University) 2020-01-01
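
The low-entropy self-training step described above can be sketched as a simple confidence filter over model predictions; the entropy threshold below is a hypothetical value, not the paper's.

```python
import torch
import torch.nn.functional as F

def low_entropy_pseudo_labels(logits, threshold=0.4):
    """Keep only confident (low-entropy) predictions as pseudo labels,
    so they can be treated like labeled data in the next training round.
    """
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    keep = entropy < threshold                 # confident predictions only
    return probs.argmax(dim=-1)[keep], keep
```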

Modeling persuasive language has the potential to better facilitate our decision-making processes. Despite its importance, computational modeling of persuasion is still in its infancy, largely due to the lack of benchmark datasets that can provide quantitative labels of persuasion strategies to expedite this line of research. To this end, we introduce a large-scale multi-domain text corpus for modeling persuasion strategies in good-faith text requests. Moreover, we design a hierarchical weakly-supervised latent variable model that can leverage partially labeled data to predict such...

10.1609/aaai.v35i14.17498 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Scholarly peer review is a cornerstone of scientific advancement, but the system is under strain due to increasing manuscript submissions and the labor-intensive nature of the process. Recent advancements in large language models (LLMs) have led to their integration into peer review, with promising results such as substantial overlaps between LLM- and human-generated reviews. However, the unchecked adoption of LLMs poses significant risks to the integrity of the peer review system. In this study, we comprehensively analyze the vulnerabilities...

10.48550/arxiv.2412.01708 preprint EN arXiv (Cornell University) 2024-12-02

Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan...

10.18653/v1/2020.findings-emnlp.116 article EN cc-by 2020-01-01
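
A toy sketch of the disentanglement idea above: a VAE whose latent code is split into a content block and a strategy block, so strategy orderings can be inspected independently of content. The bag-of-words encoder/decoder and all dimensions are illustrative assumptions; the paper operates over token sequences.

```python
import torch
import torch.nn as nn

class ContentStrategyVAE(nn.Module):
    """VAE with a latent code split into (content, strategy) blocks."""

    def __init__(self, vocab=1000, z_content=16, z_strategy=8):
        super().__init__()
        z_dim = z_content + z_strategy
        self.enc = nn.Linear(vocab, 2 * z_dim)   # outputs mu ++ logvar
        self.dec = nn.Linear(z_dim, vocab)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(z)                                   # reconstruct input
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return recon, kl
```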

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large amount of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels of unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fined-tuned models and other state-of-the-art methods on...

10.48550/arxiv.2004.12239 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Fine-tuning large pre-trained models with task-specific data has achieved great success in NLP. However, it has been demonstrated that the majority of the information within the self-attention networks is redundant and not utilized effectively during the fine-tuning stage. This leads to inferior results when generalizing the obtained models to out-of-domain distributions. To this end, we propose a simple yet effective data augmentation technique, HiddenCut, to better regularize the model and encourage it to learn more generalizable features....

10.48550/arxiv.2106.00149 preprint EN cc-by arXiv (Cornell University) 2021-01-01
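
A minimal sketch of the HiddenCut operation as described: zero out one contiguous span of token hidden states during fine-tuning so the model cannot rely on redundant, localized features. Uniform span sampling is assumed here; the paper also explores more informed span selection.

```python
import torch

def hidden_cut(hidden, span_ratio=0.1):
    """Zero out one contiguous span of token hidden states.

    hidden: (batch, seq_len, dim) encoder outputs during fine-tuning.
    """
    seq_len = hidden.size(1)
    span = max(1, int(seq_len * span_ratio))
    start = torch.randint(0, seq_len - span + 1, (1,)).item()
    mask = torch.ones_like(hidden)
    mask[:, start:start + span, :] = 0.0       # cut one hidden span
    return hidden * mask
```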

Named Entity Recognition (NER) systems often demonstrate great performance on in-distribution data, but perform poorly on examples drawn from a shifted distribution. One way to evaluate the generalization ability of NER models is to use adversarial examples, against which the specific variations associated with named entities are rarely considered. To this end, we propose leveraging expert-guided heuristics to change the entity tokens and their surrounding contexts, thereby altering their entity types as adversarial attacks. Using...

10.18653/v1/2022.findings-acl.154 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01
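
A hedged illustration of the entity-substitution attack idea above: swap gold entities for surface forms of a different type while keeping the context fixed, probing whether the model reads context or memorizes entities. The lexicon and helper below are hypothetical; the paper relies on expert-guided heuristics over real corpora.

```python
import random

# A hypothetical, hand-rolled entity lexicon for illustration only.
ENTITY_LEXICON = {
    "PER": ["Alice Johnson", "Bob Lee"],
    "ORG": ["Acme Corp", "Globex"],
    "LOC": ["Springfield", "Riverton"],
}

def entity_swap_attack(tokens, spans, lexicon=ENTITY_LEXICON):
    """Replace each gold entity span with a surface form of a different type.

    spans: list of (start, end, type) token spans; processed right-to-left
    so earlier indices stay valid after replacement.
    """
    out = list(tokens)
    for start, end, etype in sorted(spans, reverse=True):
        new_type = random.choice([t for t in lexicon if t != etype])
        out[start:end] = random.choice(lexicon[new_type]).split()
    return out

print(entity_swap_attack(["Barack", "Obama", "visited", "Paris"],
                         [(0, 2, "PER"), (3, 4, "LOC")]))
```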

Text summarization is one of the most challenging and interesting problems in NLP. Although much attention has been paid to summarizing structured text like news reports or encyclopedia articles, summarizing conversations, an essential part of human-human/machine interaction where important pieces of information are scattered across various utterances of different speakers, remains relatively under-investigated. This work proposes a multi-view sequence-to-sequence model by first extracting conversational...

10.48550/arxiv.2010.01672 preprint EN other-oa arXiv (Cornell University) 2020-01-01

As social networks become further entrenched in modern society, it becomes increasingly important to understand and predict how information (e.g., news coverage of a given event) is propagated across media outlets (i.e., its pathway), which helps the understanding of the impact of real-world information. Thus, in this paper, we propose a novel task, Information Pathway Prediction (IPP), which depicts the propagation paths of a passage as a community tree (rooted at the source) on constructed interaction graphs, where we first aggregate...

10.1145/3539618.3592087 article EN cc-by-nc Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent aspect of individuals' mental processes that is not directly observable. Existing susceptibility studies heavily rely on self-reported beliefs, which can be subject to bias, expensive to collect, and challenging to scale for downstream applications. To address these limitations, in this work, we propose a computational approach to model users' latent susceptibility levels. As shown in previous research, susceptibility is influenced by various factors (e.g.,...

10.48550/arxiv.2311.09630 preprint EN other-oa arXiv (Cornell University) 2023-01-01

10.18653/v1/2024.emnlp-main.846 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs). Compared with previous work which learns from human-curated and static data in random orders, we propose to first automatically generate and organize the training data by mimicking the learning pathways of humans, and then dynamically tailor the training data based on the training dynamics. Specifically, inspired by the learning structures and teaching strategies in the human education system, we first construct a skill graph by decomposing complex skills into sub-skills...

10.48550/arxiv.2412.19361 preprint EN arXiv (Cornell University) 2024-12-26
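
The skill-graph ordering described above can be sketched as a topological sort over prerequisite edges, so training data for sub-skills precedes data for the skills that depend on them. The graph below is a hypothetical example, not from the paper.

```python
from graphlib import TopologicalSorter

# Hypothetical skill graph: each skill maps to its prerequisite sub-skills.
skill_graph = {
    "solve_linear_equations": {"arithmetic", "variables"},
    "variables": {"arithmetic"},
    "arithmetic": set(),
}

# Order training data from prerequisites up to the target skill,
# mimicking a human learning pathway.
for skill in TopologicalSorter(skill_graph).static_order():
    print("train on:", skill)
```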

Abstractive conversation summarization has received much attention recently. However, these generated summaries often suffer from insufficient, redundant, or incorrect content, largely due to the unstructured and complex characteristics of human-human interactions. To this end, we propose to explicitly model the rich structures in conversations for more precise and accurate conversation summarization, by first incorporating discourse relations between utterances and action triples ("who-doing-what") in utterances through structured...

10.48550/arxiv.2104.08400 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models heavily rely on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one sentence,...

10.48550/arxiv.2010.01677 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan...

10.48550/arxiv.2010.04625 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Continual learning has become increasingly important as it enables NLP models to constantly learn and gain knowledge over time. Previous continual learning methods are mainly designed to preserve knowledge from previous tasks, without much emphasis on how to well generalize to new tasks. In this work, we propose an information disentanglement based regularization method for continual learning on text classification. Our proposed method first disentangles text hidden spaces into representations that are generic to all tasks and representations specific to each individual task,...

10.48550/arxiv.2104.05489 preprint EN other-oa arXiv (Cornell University) 2021-01-01
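
A minimal sketch of the disentanglement described above: the encoder output is projected into a task-generic and a task-specific representation, and classification uses their concatenation. The regularization terms that keep the generic space stable across tasks are omitted, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DisentangledTextHead(nn.Module):
    """Split encoder output into generic and task-specific representations."""

    def __init__(self, enc_dim=768, gen_dim=128, spec_dim=128, num_classes=2):
        super().__init__()
        self.generic = nn.Linear(enc_dim, gen_dim)    # shared across tasks
        self.specific = nn.Linear(enc_dim, spec_dim)  # per-task in the full model
        self.classifier = nn.Linear(gen_dim + spec_dim, num_classes)

    def forward(self, enc_out):
        g = torch.tanh(self.generic(enc_out))
        s = torch.tanh(self.specific(enc_out))
        # g and s would also feed the continual-learning regularizers.
        return self.classifier(torch.cat([g, s], dim=-1)), g, s
```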