How to Tame Your Data: Data Augmentation for Dialog State Tracking

Representation
DOI: 10.18653/v1/2020.nlp4convai-1.4 Publication Date: 2020-07-29T14:14:43Z
ABSTRACT
Dialog State Tracking (DST) is a problem space in which the effective vocabulary is practically limitless. For example, the domain of possible movie titles or restaurant names is bound only by the limits of language. As such, DST systems often encounter out-of-vocabulary words at inference time that were never encountered during training. To combat this issue, we present a targeted data augmentation process, by which a practitioner observes the types of errors made on held-out evaluation data, and then modifies the training data with additional corpora to increase the vocabulary size at training time. Using this with a RoBERTa-based Transformer architecture, we achieve state-of-the-art results in comparison to systems that only mask trouble slots with special tokens. Additionally, we present a data-representation scheme for seamlessly retargeting DST architectures to new domains.
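The augmentation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `augment_examples`, the `(text, slots)` example format, and the `extra_values` dictionary are all hypothetical, chosen only to show how slot values observed to cause errors might be swapped for values drawn from an additional corpus so that the model sees a larger vocabulary at training time.

```python
import random


def augment_examples(examples, extra_values, n_copies=2, seed=0):
    """Return copies of each example with slot values swapped for external ones.

    examples:     list of (utterance, {slot_name: slot_value}) pairs (assumed format)
    extra_values: {slot_name: [candidate values from an additional corpus]}
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    augmented = []
    for text, slots in examples:
        for _ in range(n_copies):
            new_text, new_slots = text, dict(slots)
            for slot, value in slots.items():
                candidates = extra_values.get(slot)
                # Skip slots with no external values or whose value is not
                # literally present in the utterance.
                if not candidates or value not in new_text:
                    continue
                replacement = rng.choice(candidates)
                new_text = new_text.replace(value, replacement)
                new_slots[slot] = replacement
            augmented.append((new_text, new_slots))
    return augmented


# Hypothetical usage: "movie_title" is a slot where held-out errors were observed.
train = [("I want to watch Inception tonight", {"movie_title": "Inception"})]
corpus_values = {"movie_title": ["Solaris", "Stalker", "The Third Man"]}
for utterance, labels in augment_examples(train, corpus_values):
    print(utterance, labels)
```

In practice the augmented pairs would be added to the original training set rather than replacing it, and the external values would come from whatever corpus matches the slot type that produced errors on held-out data.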