Dialog Datasets: Navigating Task-Oriented, Open-Domain, and Knowledge-Grounded Conversations

Authors:
(1) Dominic Petrak, UKP Lab, Department of Computer Science, Technical University of Darmstadt, Germany;
(2) Nafise Sadat Moosavi, Department of Computer Science, The University of Sheffield, United Kingdom;
(3) Ye Tian, Wluper, London, United Kingdom;
(4) Nikolai Rozanov, Wluper, London, United Kingdom;
(5) Iryna Gurevych, UKP Lab, Department of Computer Science, Technical University of Darmstadt, Germany.

Table of Links
Abstract & Introduction
Related Work
Datasets Examined
Manual Error Type Analysis and Taxonomies
Automatic Filtering for Potentially Relevant Dialogs
Statistical Analysis
Evaluation and Experiments
Discussion
Conclusion, Limitation, Acknowledgments, and References
A Integrated Error Taxonomy – Details
B Error-Indicating Sentences And Phrases
C Automatic Filtering – Implementation
D Automatic Filtering – Sentence-Level Analysis
E Task-Oriented Dialogs – Examples
F Effectiveness Of Automatic Filtering – A Detailed Analysis
G Inter-Annotator Agreement – Detailed Analysis
H Annotation Guidelines
I Hyperparameters and Baseline Experiments
J Human-Human Dialogs – Examples

3 Datasets Examined

Table 1 gives an overview of the datasets examined in this work. Overall, we consider six datasets with dialogs of various types, including task-oriented, open-domain, and knowledge-grounded dialogs, as well as human-human and human-bot dialogs.

For task-oriented dialog datasets, we consider MultiWoZ (Budzianowski et al., 2018) (MWoZ), SGD (Rastogi et al., 2020), and BABI (Bordes et al., 2017). They mainly differ in the number of domains included in the dialogs. MWoZ includes seven different domains, SGD 16, and BABI only one (but with dialogs of increasing difficulty). In contrast to MWoZ and SGD, BABI consists of human-bot dialogs.

For open-domain dialogs, we consider PersonaChat (Zhang et al., 2018) (PC) and the human-bot split of the Self-Feeding Chatbot (Hancock et al., 2019) (SFC).
While PC consists of dialogs between two people who are trying to get to know each other, SFC consists of human-bot open-domain dialogs.

For knowledge-grounded dialogs, we focus on Wizards-of-Wikipedia (Dinan et al., 2019) (WoW), which consists of human-human dialogs.

For simplicity, we do not distinguish between human and bot in the following. We always refer to the utterance of the partner as a system utterance.

This paper is available on arXiv under the CC BY-NC-SA 4.0 DEED license.
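The dataset properties stated in Section 3 can be collected into a small lookup structure. The following is a minimal sketch, not part of the original work: the names `DatasetInfo`, `DATASETS`, `by_type`, and `normalize_speaker`, as well as the speaker label `"user"`, are hypothetical and chosen only for illustration. Domain counts are given only where the text states them, and the participant setting for MWoZ and SGD is inferred from the contrast the text draws with BABI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetInfo:
    """Properties of one dataset as described in the text (illustrative only)."""
    dialog_type: str            # "task-oriented", "open-domain", or "knowledge-grounded"
    participants: str           # "human-human" or "human-bot"
    num_domains: Optional[int]  # stated in the text only for task-oriented datasets


# Keys are the abbreviations used in the text.
DATASETS = {
    "MWoZ": DatasetInfo("task-oriented", "human-human", 7),
    "SGD": DatasetInfo("task-oriented", "human-human", 16),
    "BABI": DatasetInfo("task-oriented", "human-bot", 1),
    "PC": DatasetInfo("open-domain", "human-human", None),
    "SFC": DatasetInfo("open-domain", "human-bot", None),
    "WoW": DatasetInfo("knowledge-grounded", "human-human", None),
}


def by_type(dialog_type: str) -> list[str]:
    """Return the sorted abbreviations of all datasets of the given dialog type."""
    return sorted(
        name for name, info in DATASETS.items()
        if info.dialog_type == dialog_type
    )


def normalize_speaker(speaker: str) -> str:
    """Map any partner role (human or bot) to "system".

    Assumption: the initiating side is labeled "user" in the raw data;
    every other label is treated as the partner.
    """
    return "user" if speaker == "user" else "system"
```

Calling `by_type("task-oriented")` lists the three task-oriented datasets, and `normalize_speaker` mirrors the convention of treating every partner utterance as a system utterance regardless of whether a human or a bot produced it.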