Personalizing a Dialogue System With Transfer Reinforcement Learning
Overfitting
Transfer of learning
DOI: 10.1609/aaai.v32i1.11938
Publication Date: 2022-11-03T06:53:38Z
AUTHORS (5)
ABSTRACT
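It is difficult to train a personalized task-oriented dialogue system because the data collected from each individual is often insufficient. Personalized dialogue systems trained on a small dataset are likely to overfit, which makes it difficult to adapt to different user needs. One way to solve this problem is to consider a collection of multiple users as a source domain and an individual user as a target domain, and to perform transfer learning from the source domain to the target domain. By following this idea, we propose a PErsonalized Task-oriented diALogue (PETAL) system, a transfer reinforcement learning framework based on POMDP, to construct a personalized dialogue system. The PETAL system first learns common dialogue knowledge from the source domain and then adapts this knowledge to the target user. The proposed PETAL system can avoid the negative transfer problem by considering differences between the source and target users in a personalized Q-function. Experimental results on real-world coffee-shopping data and simulation data show that the proposed PETAL system can learn optimal policies for different users, and thus effectively improve the dialogue quality under the personalized setting.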
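The idea of a personalized Q-function that separates shared knowledge from per-user differences can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the tabular (rather than POMDP-based) setting, the additive split into a `general` table and a per-user residual, the class name `PersonalizedQ`, the two-phase training schedule, and the reward values. The sketch trains the shared table on a source user, then adapts to a target user by updating only that user's residual, so the shared knowledge is not overwritten.

```python
import numpy as np


class PersonalizedQ:
    """Toy tabular Q-function: Q_user(s, a) = general(s, a) + residual_user(s, a).

    Hypothetical illustration only; not the PETAL model.
    """

    def __init__(self, n_states, n_actions):
        self.general = np.zeros((n_states, n_actions))  # shared across users
        self.personal = {}  # user id -> residual table

    def _residual(self, user):
        # Create a zero residual the first time a user is seen.
        return self.personal.setdefault(user, np.zeros_like(self.general))

    def q(self, user, s):
        # Action values for one user in state s.
        return self.general[s] + self._residual(user)[s]

    def update(self, user, s, a, r, s_next, done, train_general,
               alpha=0.5, gamma=0.9):
        # Standard Q-learning target, computed on the combined Q-function.
        target = r if done else r + gamma * self.q(user, s_next).max()
        td_error = target - self.q(user, s)[a]
        if train_general:
            self.general[s, a] += alpha * td_error          # source phase
        else:
            self._residual(user)[s, a] += alpha * td_error  # target phase
        return td_error


# One state, two actions (say, two drinks the system could recommend).
model = PersonalizedQ(n_states=1, n_actions=2)

# Source phase: the source user rewards action 0 and not action 1.
for _ in range(5):
    model.update("src", 0, 0, 1.0, 0, True, train_general=True)
    model.update("src", 0, 1, 0.0, 0, True, train_general=True)

general_before = model.general.copy()

# Target phase: the target user prefers action 1; only the residual moves.
for _ in range(3):
    model.update("tgt", 0, 1, 1.0, 0, True, train_general=False)
    model.update("tgt", 0, 0, 0.0, 0, True, train_general=False)

greedy_new_user = int(np.argmax(model.q("new", 0)))
greedy_target = int(np.argmax(model.q("tgt", 0)))
```

In this sketch an unseen user falls back to the shared table, while the target user's residual shifts the greedy action toward that user's own preference. Keeping the differences in a separate residual is one simple way to limit negative transfer in a toy setting; the paper's actual formulation may differ.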
CITATIONS (49)