Authors:
(1) Jianzhu Yao, The CoAI group, Tsinghua University, Beijing, China Department of Computer Science and Technology, Tsinghua University, Beijing, China Beijing National Research Center for Information Science and Technology;
(2) Ziqi Liu, The CoAI group, Tsinghua University, Beijing, China Department of Computer Science and Technology, Tsinghua University, Beijing, China Beijing National Research Center for Information Science and Technology;
(3) Jian Guan, The CoAI group, Tsinghua University, Beijing, China Department of Computer Science and Technology, Tsinghua University, Beijing, China Beijing National Research Center for Information Science and Technology;
(4) Minlie Huang, The CoAI group, Tsinghua University, Beijing, China Department of Computer Science and Technology, Tsinghua University, Beijing, China Beijing National Research Center for Information Science and Technology.
Masked Dialogue Generation. In this task, the masked positions can be anywhere in the story. To generate one specific masked dialogue turn, the machine must first understand the main storyline and the roles and traits of the different characters, and then infer and generate the most appropriate dialogue turn to move the plot forward in the given situation. Context information also plays an important role, so the task places high demands on dialogue and plot coherence. This task can also readily build on related research such as persona-based chatbots, emotional chatbots, and other story generation tasks.
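As a rough illustration of the task format described above, the following minimal sketch builds one masked-dialogue example from a toy story. The `Segment` structure, the `<MASK>` placeholder, and the `make_masked_example` helper are hypothetical names chosen for this sketch, not the paper's actual data format or preprocessing.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical placeholder token; the actual mask token is an assumption.
MASK_TOKEN = "<MASK>"


@dataclass
class Segment:
    kind: str  # "narration" or "dialogue"
    text: str


def make_masked_example(story: List[Segment], turn_index: int) -> Tuple[str, str]:
    """Mask one dialogue turn and return (masked_story, target_turn).

    turn_index is 0-based and counts dialogue segments only, so the
    masked position can fall anywhere in the story.
    """
    dialogue_positions = [i for i, seg in enumerate(story) if seg.kind == "dialogue"]
    if not 0 <= turn_index < len(dialogue_positions):
        raise IndexError("turn_index does not refer to a dialogue turn")
    pos = dialogue_positions[turn_index]
    # Keep all other segments as context; replace only the chosen turn.
    pieces = [MASK_TOKEN if i == pos else seg.text for i, seg in enumerate(story)]
    return " ".join(pieces), story[pos].text


# Toy story invented for illustration only.
story = [
    Segment("narration", "The innkeeper looked up as the stranger entered."),
    Segment("dialogue", '"Do you have a room for the night?"'),
    Segment("narration", "He wiped his hands on his apron."),
    Segment("dialogue", '"Only the attic is free."'),
]
masked_story, target = make_masked_example(story, 1)
```

A model for this task would receive `masked_story` as input and be expected to produce a turn comparable to `target` that stays coherent with the surrounding plot and characters.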
Dialogue Speaker Recognition. This task focuses on dialogue understanding in a story. A story usually contains many characters speaking to each other in different settings, and the relationship between characters and dialogue is complicated. Sometimes a piece of dialogue is self-talk; sometimes it is a conversation between two parties, or among multiple parties in a complex setting; and sometimes many details in the background information obscure the actual speaker. The speaker of a dialogue turn may appear not only in the immediately surrounding text but also further away, in the preceding or following context. To recognize all the speakers in a story, the model also needs to capture the distinct characteristics of each character. Because speakers must be predicted at several positions simultaneously, the task places even higher demands on the model’s understanding of dialogue and character relationships. This task can also build on other research on speaker recognition in multi-party dialogue and story scenes.
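To make the multi-position setting described above concrete, the sketch below scores a set of speaker predictions against gold labels. Representing the task as a mapping from masked position to speaker name, and scoring it with plain accuracy, are assumptions made for illustration; the `speaker_accuracy` helper and the toy labels are hypothetical.

```python
from typing import Dict


def speaker_accuracy(gold: Dict[int, str], predicted: Dict[int, str]) -> float:
    """Fraction of masked speaker positions whose prediction matches gold.

    Positions missing from `predicted` count as errors.
    """
    if not gold:
        raise ValueError("gold must contain at least one position")
    correct = sum(1 for pos, name in gold.items() if predicted.get(pos) == name)
    return correct / len(gold)


# Toy labels invented for illustration: three dialogue turns in one story,
# each with a masked speaker that must be predicted at the same time.
gold = {0: "stranger", 1: "innkeeper", 2: "stranger"}
predicted = {0: "stranger", 1: "innkeeper", 2: "innkeeper"}
accuracy = speaker_accuracy(gold, predicted)
```

Here two of the three positions are predicted correctly, so `accuracy` is 2/3. Scoring all positions of a story together reflects the requirement that speakers be recognized jointly rather than one turn in isolation.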
This paper is available on arXiv under a CC 4.0 DEED license.