Reducing Model Churn: Stable Re-training of Conversational Agents

Keywords: Retraining, Training set
DOI: 10.18653/v1/2022.sigdial-1.2
Publication Date: 2023-11-26
ABSTRACT
Retraining modern deep learning systems can lead to variations in model performance even when trained using the same data and hyper-parameters, simply by using different random seeds. This phenomenon is known as model churn or model jitter. The issue is often exacerbated in real-world settings, where noise may be introduced in the data collection process. In this work we tackle the problem of stable retraining with a novel focus on structured prediction for conversational semantic parsing. We first quantify model churn by introducing metrics for agreement between predictions across multiple retrainings. Next, we devise realistic scenarios for noise injection and demonstrate the effectiveness of various churn reduction techniques, such as ensembling and distillation. Lastly, we discuss practical trade-offs between these techniques and show that co-distillation provides a sweet spot in terms of churn reduction with only a modest increase in resource usage.
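The paper's agreement metrics are defined in the full text, not on this page; as a rough illustration of the underlying idea, pairwise exact-match agreement between the predictions of several retrained models could be computed as in the sketch below. All names and the toy predictions are hypothetical, not taken from the paper.

```python
from itertools import combinations

def pairwise_agreement(runs):
    """Mean fraction of examples on which two retrainings agree exactly.

    `runs` is a list of prediction lists, one per retraining, aligned by
    example index. Agreement of 1.0 means zero churn across runs;
    lower values mean more churn.
    """
    per_pair = [
        sum(a == b for a, b in zip(r1, r2)) / len(r1)
        for r1, r2 in combinations(runs, 2)
    ]
    return sum(per_pair) / len(per_pair)

# Three retrainings of the same parser on the same four utterances:
run_a = ["(order pizza)", "(play song)", "(set alarm)", "(call mom)"]
run_b = ["(order pizza)", "(play song)", "(set timer)", "(call mom)"]
run_c = ["(order pizza)", "(queue song)", "(set alarm)", "(call mom)"]

print(pairwise_agreement([run_a, run_b, run_c]))  # ~0.667: every pair disagrees somewhere
```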
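Co-distillation, which the abstract highlights as the practical sweet spot, generally means training two peer models in parallel while each is pulled toward the other's output distribution, so no separate teacher pass is needed. A minimal PyTorch sketch of one peer's loss follows; `alpha` and `temperature` are assumed hyper-parameters for illustration, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def codistillation_loss(logits_a, logits_b, targets, alpha=0.5, temperature=2.0):
    """Loss for one of two peer models trained in parallel (sketch).

    Each peer fits the gold labels while matching the other peer's
    (detached) softened distribution, which damps seed-to-seed variation.
    Model B receives the mirrored loss with the roles swapped.
    """
    task = F.cross_entropy(logits_a, targets)
    # KL from the peer's softened distribution; detach so gradients
    # flow only into model A here.
    peer = F.kl_div(
        F.log_softmax(logits_a / temperature, dim=-1),
        F.softmax(logits_b.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1 - alpha) * task + alpha * peer

# Toy usage: a batch of 8 examples with 5 candidate outputs each.
logits_a = torch.randn(8, 5, requires_grad=True)
logits_b = torch.randn(8, 5)
targets = torch.randint(0, 5, (8,))
loss = codistillation_loss(logits_a, logits_b, targets)
```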