AugGPT: Leveraging ChatGPT for Text Data Augmentation
Sample (material)
Training set
DOI:
10.48550/arxiv.2302.13007
Publication Date:
2023-01-01
AUTHORS (15)
ABSTRACT
Text data augmentation is an effective strategy for overcoming the challenge of limited sample sizes in many natural language processing (NLP) tasks. This challenge is especially prominent in the few-shot learning scenario, where the data in the target domain is generally much scarcer and of lowered quality. A natural and widely-used strategy to mitigate such challenges is to perform data augmentation to better capture the data invariance and increase the sample size. However, current text data augmentation methods either can't ensure the correct labeling of the generated data (lacking faithfulness) or can't ensure sufficient diversity in the generated data (lacking compactness), or both. Inspired by the recent success of large language models, especially the development of ChatGPT, which demonstrated improved language comprehension abilities, in this work, we propose a text data augmentation approach based on ChatGPT (named AugGPT). AugGPT rephrases each sentence in the training samples into multiple conceptually similar but semantically different samples. The augmented samples can then be used in downstream model training. Experiment results on few-shot learning text classification tasks show the superior performance of the proposed AugGPT approach over state-of-the-art text data augmentation methods in terms of testing accuracy and distribution of the augmented samples.
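The augmentation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `augment_dataset` and `toy_rephrase`, the `(text, label)` sample layout, and the default of three variants per sentence are all assumptions made for this example. The rephrasing step is passed in as a plain callable, so a chat-model call could be substituted for the placeholder; no real model API is invoked here.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (text, label); this layout is an assumption


def augment_dataset(
    samples: List[Sample],
    rephrase: Callable[[str, int], List[str]],
    n_variants: int = 3,  # arbitrary default, not taken from the paper
) -> List[Sample]:
    """Return the original samples plus rephrased copies.

    Each rephrased sentence inherits the label of the sentence it was
    derived from, which is the faithfulness assumption in the abstract.
    """
    augmented: List[Sample] = []
    for text, label in samples:
        augmented.append((text, label))
        seen = {text}
        for variant in rephrase(text, n_variants):
            variant = variant.strip()
            # Skip empty strings and exact duplicates so that the added
            # samples actually contribute new surface forms.
            if variant and variant not in seen:
                seen.add(variant)
                augmented.append((variant, label))
    return augmented


def toy_rephrase(text: str, n: int) -> List[str]:
    """Hypothetical stand-in for a language-model rephraser."""
    templates = [
        "In other words, {}",
        "Put differently, {}",
        "That is to say, {}",
    ]
    return [t.format(text) for t in templates[:n]]


if __name__ == "__main__":
    train = [("the drug reduced fever", "treatment")]
    for text, label in augment_dataset(train, toy_rephrase, n_variants=2):
        print(label, "|", text)
```

The enlarged list can then be passed to whatever downstream classifier is being trained. In practice the quality of the `rephrase` callable determines both the label faithfulness and the diversity of the added samples.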