Jiaao Chen

ORCID: 0009-0004-8425-2893
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Sentiment Analysis and Opinion Mining
  • Misinformation and Its Impacts
  • Multimodal Machine Learning Applications
  • Opinion Dynamics and Social Influence
  • Domain Adaptation and Few-Shot Learning
  • Text and Document Classification Technologies
  • Speech and Dialogue Systems
  • Advanced Text Analysis Techniques
  • Artificial Intelligence in Healthcare and Education
  • Surface Roughness and Optical Measurements
  • Intelligent Tutoring Systems and Adaptive Learning
  • Advanced Graph Neural Networks
  • Bioinformatics and Genomic Networks
  • Industrial Vision Systems and Defect Detection
  • Semantic Web and Ontologies
  • Text Readability and Simplification
  • Complex Network Analysis Techniques
  • Multi-Agent Systems and Negotiation
  • Scientometrics and Bibliometrics Research
  • Seismic Imaging and Inversion Techniques
  • Optical Measurement and Interference Techniques
  • Digital Marketing and Social Media
  • Geophysical Methods and Applications

Georgia Institute of Technology
2020-2023

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large amount of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels of unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fined-tuned models and other state-of-the-art methods on...

10.18653/v1/2020.acl-main.194 preprint EN cc-by 2020-01-01
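
A minimal PyTorch sketch of the TMix idea described above, under stated assumptions: hidden states of two examples are interpolated at an intermediate Transformer layer with a Beta-sampled coefficient, and the supervised loss mixes the two labels with the same coefficient. The toy encoder, layer index, and Beta parameters are illustrative stand-ins, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TMixEncoder(nn.Module):
    """Toy Transformer encoder with TMix-style hidden-space interpolation."""

    def __init__(self, vocab_size=1000, dim=64, num_layers=4, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, x_a, x_b=None, lam=1.0, mix_layer=2):
        h = self.embed(x_a)
        h_b = self.embed(x_b) if x_b is not None else None
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if h_b is not None:
                h_b = layer(h_b)
                if i == mix_layer:             # interpolate in hidden space
                    h = lam * h + (1 - lam) * h_b
                    h_b = None                 # continue with the mixed states
        return self.classifier(h.mean(dim=1))  # mean-pool, then classify

# Usage: mix two batches; the loss mixes labels with the same coefficient.
model = TMixEncoder()
x_a, x_b = torch.randint(0, 1000, (8, 16)), torch.randint(0, 1000, (8, 16))
y_a, y_b = torch.randint(0, 2, (8,)), torch.randint(0, 2, (8,))
lam = torch.distributions.Beta(0.75, 0.75).sample().item()  # mixup-style prior
logits = model(x_a, x_b, lam=lam, mix_layer=2)
loss = lam * nn.functional.cross_entropy(logits, y_a) + \
       (1 - lam) * nn.functional.cross_entropy(logits, y_b)
```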

NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, and expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand...

10.1162/tacl_a_00542 article EN cc-by Transactions of the Association for Computational Linguistics 2023-01-01

Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models heavily rely on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one sentence,...

10.18653/v1/2020.emnlp-main.95 article EN cc-by 2020-01-01
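
A hedged sketch of the Intra-LADA variant mentioned above: token hidden states of a sentence are interpolated with the same sentence under a random token permutation, so per-token labels can be mixed position-by-position. The Beta prior and mixing at the representation level are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def intra_lada_mix(hidden, alpha=8.0):
    """Interpolate each token's hidden state with that of the same sentence
    under a random token permutation.

    hidden: (batch, seq_len, dim) token representations from an encoder.
    Returns the mixed states, the coefficient, and the permutation so that
    per-token tag distributions can be mixed the same way.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(hidden.size(1))            # shuffle token positions
    mixed = lam * hidden + (1 - lam) * hidden[:, perm, :]
    return mixed, lam, perm
```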

Conversational human-AI interaction (CHAI) has recently driven mainstream adoption of AI. However, CHAI poses two key challenges for designers and researchers: users frequently have ambiguous goals and an incomplete understanding of AI functionalities, and the interactions are brief and transient, limiting opportunities for sustained engagement with users. AI agents can help address these challenges by suggesting contextually relevant prompts, standing in for users during early design testing, and helping users better articulate their goals....

10.48550/arxiv.2501.18002 preprint EN arXiv (Cornell University) 2025-01-29

We present semi-supervised models with data augmentation (SMDA), a text classification system to classify interactive affective responses. SMDA utilizes recent transformer-based models to encode each sentence and employs back-translation techniques to paraphrase given sentences as augmented data. For labeled sentences, we performed data augmentations to uniform the label distributions and computed supervised loss during the training process. For unlabeled sentences, we explored self-training by regarding low-entropy predictions over...

10.48550/arxiv.2004.10972 preprint EN other-oa arXiv (Cornell University) 2020-01-01
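
The low-entropy self-training step described above can be sketched as a simple confidence filter over model predictions; the entropy threshold below is a hypothetical value, not the paper's.

```python
import torch
import torch.nn.functional as F

def low_entropy_pseudo_labels(logits, threshold=0.4):
    """Keep only confident (low-entropy) predictions as pseudo labels,
    so they can be treated like labeled data in the next training round.
    """
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    keep = entropy < threshold                 # confident predictions only
    return probs.argmax(dim=-1)[keep], keep
```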

Modeling persuasive language has the potential to better facilitate our decision-making processes. Despite its importance, computational modeling of persuasion is still in its infancy, largely due to the lack of benchmark datasets that can provide quantitative labels of persuasion strategies to expedite this line of research. To this end, we introduce a large-scale multi-domain text corpus for modeling persuasion strategies in good-faith text requests. Moreover, we design a hierarchical weakly-supervised latent variable model that can leverage partially labeled data to predict such...

10.1609/aaai.v35i14.17498 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Scholarly peer review is a cornerstone of scientific advancement, but the system is under strain due to increasing manuscript submissions and the labor-intensive nature of the process. Recent advancements in large language models (LLMs) have led to their integration into peer review, with promising results such as substantial overlaps between LLM- and human-generated reviews. However, the unchecked adoption of LLMs poses significant risks to the integrity of the peer review system. In this study, we comprehensively analyze the vulnerabilities...

10.48550/arxiv.2412.01708 preprint EN arXiv (Cornell University) 2024-12-02

Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan...

10.18653/v1/2020.findings-emnlp.116 article EN cc-by 2020-01-01
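
A toy sketch of the disentanglement idea above: a VAE whose latent code is split into a content block and a strategy block, so strategy orderings can be inspected independently of content. The bag-of-words encoder/decoder and all dimensions are illustrative assumptions; the paper operates over token sequences.

```python
import torch
import torch.nn as nn

class ContentStrategyVAE(nn.Module):
    """VAE with a latent code split into (content, strategy) blocks."""

    def __init__(self, vocab=1000, z_content=16, z_strategy=8):
        super().__init__()
        z_dim = z_content + z_strategy
        self.enc = nn.Linear(vocab, 2 * z_dim)   # outputs mu ++ logvar
        self.dec = nn.Linear(z_dim, vocab)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(z)                                   # reconstruct input
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return recon, kl
```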

This paper presents MixText, a semi-supervised learning method for text classification, which uses our newly designed data augmentation method called TMix. TMix creates a large amount of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels of unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fined-tuned models and other state-of-the-art methods on...

10.48550/arxiv.2004.12239 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Fine-tuning large pre-trained models with task-specific data has achieved great success in NLP. However, it has been demonstrated that the majority of the information within the self-attention networks is redundant and not utilized effectively during the fine-tuning stage. This leads to inferior results when generalizing the obtained models to out-of-domain distributions. To this end, we propose a simple yet effective data augmentation technique, HiddenCut, to better regularize the model and encourage it to learn more generalizable features....

10.48550/arxiv.2106.00149 preprint EN cc-by arXiv (Cornell University) 2021-01-01
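
A minimal sketch of the HiddenCut operation as described: zero out one contiguous span of token hidden states during fine-tuning so the model cannot rely on redundant, localized features. Uniform span sampling is assumed here; the paper also explores more informed span selection.

```python
import torch

def hidden_cut(hidden, span_ratio=0.1):
    """Zero out one contiguous span of token hidden states.

    hidden: (batch, seq_len, dim) encoder outputs during fine-tuning.
    """
    seq_len = hidden.size(1)
    span = max(1, int(seq_len * span_ratio))
    start = torch.randint(0, seq_len - span + 1, (1,)).item()
    mask = torch.ones_like(hidden)
    mask[:, start:start + span, :] = 0.0       # cut one hidden span
    return hidden * mask
```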

Named Entity Recognition (NER) systems often demonstrate great performance on in-distribution data, but perform poorly on examples drawn from a shifted distribution. One way to evaluate the generalization ability of NER models is to use adversarial examples, against which the specific variations associated with named entities are rarely considered. To this end, we propose leveraging expert-guided heuristics to change the entity tokens and their surrounding contexts, thereby altering their entity types as adversarial attacks. Using...

10.18653/v1/2022.findings-acl.154 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01
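
A hedged illustration of the entity-substitution attack idea above: swap gold entities for surface forms of a different type while keeping the context fixed, probing whether the model reads context or memorizes entities. The lexicon and helper below are hypothetical; the paper relies on expert-guided heuristics over real corpora.

```python
import random

# A hypothetical, hand-rolled entity lexicon for illustration only.
ENTITY_LEXICON = {
    "PER": ["Alice Johnson", "Bob Lee"],
    "ORG": ["Acme Corp", "Globex"],
    "LOC": ["Springfield", "Riverton"],
}

def entity_swap_attack(tokens, spans, lexicon=ENTITY_LEXICON):
    """Replace each gold entity span with a surface form of a different type.

    spans: list of (start, end, type) token spans; processed right-to-left
    so earlier indices stay valid after replacement.
    """
    out = list(tokens)
    for start, end, etype in sorted(spans, reverse=True):
        new_type = random.choice([t for t in lexicon if t != etype])
        out[start:end] = random.choice(lexicon[new_type]).split()
    return out

print(entity_swap_attack(["Barack", "Obama", "visited", "Paris"],
                         [(0, 2, "PER"), (3, 4, "LOC")]))
```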

Text summarization is one of the most challenging and interesting problems in NLP. Although much attention has been paid to summarizing structured text like news reports or encyclopedia articles, summarizing conversations, an essential part of human-human/machine interaction where important pieces of information are scattered across various utterances of different speakers, remains relatively under-investigated. This work proposes a multi-view sequence-to-sequence model by first extracting conversational...

10.48550/arxiv.2010.01672 preprint EN other-oa arXiv (Cornell University) 2020-01-01

As social networks become further entrenched in modern society, it becomes increasingly important to understand and predict how information (e.g., news coverage of a given event) is propagated across media outlets (i.e., its pathway), which helps the understanding of the impact of real-world information. Thus, in this paper, we propose a novel task, Information Pathway Prediction (IPP), which depicts the propagation paths of a passage as a community tree (rooted at the source) on constructed interaction graphs, where we first aggregate...

10.1145/3539618.3592087 article EN cc-by-nc Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent aspect of individuals' mental processes that is not directly observable. Existing susceptibility studies heavily rely on self-reported beliefs, which can be subject to bias, expensive to collect, and challenging to scale for downstream applications. To address these limitations, in this work, we propose a computational approach to model users' latent susceptibility levels. As shown in previous research, susceptibility is influenced by various factors (e.g.,...

10.48550/arxiv.2311.09630 preprint EN other-oa arXiv (Cornell University) 2023-01-01

10.18653/v1/2024.emnlp-main.846 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs). Compared with previous work which learns from human-curated and static data in random orders, we propose to first automatically generate and organize the training data by mimicking the learning pathways of humans, and then dynamically tailor the training data based on the training dynamics. Specifically, inspired by the learning structures and teaching strategies in the human education system, we first construct a skill graph by decomposing complex skills into sub-skills...

10.48550/arxiv.2412.19361 preprint EN arXiv (Cornell University) 2024-12-26
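
The skill-graph ordering described above can be sketched as a topological sort over prerequisite edges, so training data for sub-skills precedes data for the skills that depend on them. The graph below is a hypothetical example, not from the paper.

```python
from graphlib import TopologicalSorter

# Hypothetical skill graph: each skill maps to its prerequisite sub-skills.
skill_graph = {
    "solve_linear_equations": {"arithmetic", "variables"},
    "variables": {"arithmetic"},
    "arithmetic": set(),
}

# Order training data from prerequisites up to the target skill,
# mimicking a human learning pathway.
for skill in TopologicalSorter(skill_graph).static_order():
    print("train on:", skill)
```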

Abstractive conversation summarization has received much attention recently. However, these generated summaries often suffer from insufficient, redundant, or incorrect content, largely due to the unstructured and complex characteristics of human-human interactions. To this end, we propose to explicitly model the rich structures in conversations for more precise and accurate conversation summarization, by first incorporating discourse relations between utterances and action triples ("who-doing-what") in utterances through structured...

10.48550/arxiv.2104.08400 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models heavily rely on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens within one sentence,...

10.48550/arxiv.2010.01677 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan...

10.48550/arxiv.2010.04625 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Continual learning has become increasingly important as it enables NLP models to constantly learn and gain knowledge over time. Previous continual learning methods are mainly designed to preserve knowledge from previous tasks, without much emphasis on how to well generalize to new tasks. In this work, we propose an information disentanglement based regularization method for continual learning on text classification. Our proposed method first disentangles text hidden spaces into representations that are generic to all tasks and representations specific to each individual task,...

10.48550/arxiv.2104.05489 preprint EN other-oa arXiv (Cornell University) 2021-01-01
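
A minimal sketch of the disentanglement described above: the encoder output is projected into a task-generic and a task-specific representation, and classification uses their concatenation. The regularization terms that keep the generic space stable across tasks are omitted, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DisentangledTextHead(nn.Module):
    """Split encoder output into generic and task-specific representations."""

    def __init__(self, enc_dim=768, gen_dim=128, spec_dim=128, num_classes=2):
        super().__init__()
        self.generic = nn.Linear(enc_dim, gen_dim)    # shared across tasks
        self.specific = nn.Linear(enc_dim, spec_dim)  # per-task in the full model
        self.classifier = nn.Linear(gen_dim + spec_dim, num_classes)

    def forward(self, enc_out):
        g = torch.tanh(self.generic(enc_out))
        s = torch.tanh(self.specific(enc_out))
        # g and s would also feed the continual-learning regularizers.
        return self.classifier(torch.cat([g, s], dim=-1)), g, s
```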