Comparing Wizard of Oz & Observational Studies for Conversational IR Evaluation
Wizard of Oz
Dialog system
Wizard
DOI:
10.1007/s13222-020-00333-z
Publication Date:
2020-02-10T11:02:51Z
AUTHORS (3)
ABSTRACT
Systematic and repeatable measurement of information systems via test collections, the Cranfield model, has been the mainstay of Information Retrieval since the 1960s. However, this may not be appropriate for newer, more interactive systems, such as Conversational Search agents. Such systems rely on Machine Learning technologies, which are not yet sufficiently advanced to permit true human-like dialogues, and so research can be enabled by simulation using humans. In this work we compare dialogues obtained from two studies with the same context, assistance in the kitchen, but different experimental setups, allowing us to learn about how to evaluate conversational IR systems. We discover that users adapt their behaviour when they think they are interacting with a system, and that the conversations in one of the studies were unpredictable to an extent we did not expect. Our results have implications for the development of new studies in this area and, ultimately, the design of future conversational agents.