How Dramatron Empowers Co-Creative Scriptwriting with AI Assistance

Authors:
(1) PIOTR MIROWSKI and KORY W. MATHEWSON, DeepMind, United Kingdom (both authors contributed equally to this research);
(2) JAYLEN PITTMAN, Stanford University, USA (work done while at DeepMind);
(3) RICHARD EVANS, DeepMind, United Kingdom.

Table of Links
Abstract and Intro
Storytelling, The Shape of Stories, and Log Lines
The Use of Large Language Models for Creative Text Generation
Evaluating Text Generated by Large Language Models
Participant Interviews
Participant Surveys
Discussion and Future Work
Conclusions, Acknowledgements, and References
A. RELATED WORK ON AUTOMATED STORY GENERATION AND CONTROLLABLE STORY GENERATION
B. ADDITIONAL DISCUSSION FROM PLAYS BY BOTS CREATIVE TEAM
C. DETAILS OF QUANTITATIVE OBSERVATIONS
D. SUPPLEMENTARY FIGURES
E. FULL PROMPT PREFIXES FOR DRAMATRON
F. RAW OUTPUT GENERATED BY DRAMATRON
G. CO-WRITTEN SCRIPTS

8 CONCLUSIONS

We present Dramatron, an interactive co-writing tool that allows writers to generate scripts from a provided log line. Hierarchical story generation with explicit narrative structures and characters helps to generate more coherent text, especially when generating text as long as theatre scripts and screenplays. We conducted a user study with 15 theatre and film industry professionals and distilled their reflections, collected through open-ended qualitative interviews and a short survey. We also present feedback from a creative team that produced scripts co-written with Dramatron in public performances at a theatre festival, alongside two reviews from professional reviewers. In summary, Dramatron can be used as a co-creative writing tool that allows human authors to write screenplays and theatre scripts alongside LLMs. This work invites further questions on the nature of co-creativity and on the ethics surrounding LLMs.

ACKNOWLEDGEMENTS

We would also like to thank anonymous reviewers for their time, energy, and insightful feedback, as well as our colleagues at DeepMind for creative inspiration and critical input on the scientific, ethical and legal aspects of this work, in particular: Tara Thomas, Kevin McKee, Boxi Wu, Antonia Paterson, Murray Shanahan, Robert Dickens, Aliya Ahmad, Danielle Breen, Sanah Choudhry, Joel Moss, Yan Lai, Jon Small, Will Hawkins, Laura Weidinger, Lisa Anne Hendricks, Mia Glaese, Geoffrey Irving, Jack Rae, Natalie Lambert, Raia Hadsell, Shakir Mohamed and Doina Precup. We are immensely grateful to the anonymous participants who took part in this study and who made it possible. Finally, we are indebted to the talented performers and production companies Rapid Fire Theatre in Edmonton, Canada and Transitional Forms in Toronto, Canada, without whom we would never have been able to fully realise the generated scripts. Thank you for providing your artistic voices in this human-machine co-creative dialogue.

This paper is available on arXiv under a CC 4.0 license.
Hoffmann, S. Borgeaud, A. Mensch, E. Buchatskaya, T. Cai, E. Rutherford, D. d. L. Casas, L. A. Hendricks, J. Welbl, A. Clark, et al. Training compute-optimal large language models. arXiv preprint arXiv:2203.15556, 2022. [49] A. Holtzman, J. Buys, L. Du, M. Forbes, and Y. Choi. The curious case of neural text degeneration. In International Conference on Learning Representations, 2019. [50] Z. Hu, H. P. Chan, J. Liu, X. Xiao, H. Wu, and L. Huang. Planet: Dynamic content planning in autoregressive transformers for long-form text generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2288–2305, 2022. [51] C. Hube, B. Fetahu, and U. Gadiraju. Understanding and mitigating worker biases in the crowdsourced collection of subjective judgments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pages 1–12, 2019. [52] Y. Jin, V. Kadam, and D. Wanvarie. Plot writing from pre-trained language models. arXiv preprint arXiv:2206.03021, 2022. [53] B. Johnstone. Discourse analysis and narrative. The handbook of discourse analysis, pages 635–649, 2005. [54] M. Karpinska, N. Akoury, and M. Iyyer. The perils of using mechanical turk to evaluate open-ended text generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1265–1285, 2021. [55] N. S. Keskar, B. McCann, L. R. Varshney, C. Xiong, and R. Socher. Ctrl: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019. URL http://arxiv.org/abs/1909.05858. [56] R. Kozierok, J. Aberdeen, C. Clark, C. Garay, B. Goodman, T. Korves, L. Hirschman, P. L. McDermott, and M. W. Peterson. Assessing open-ended human-computer collaboration systems: Applying a hallmarks approach. Frontiers in artificial intelligence, 4, 2021. [57] M. Kusner, Y. Sun, N. Kolkin, and K. Weinberger. From word embeddings to document distances. 
In International conference on machine learning, pages 957–966. PMLR, 2015. [58] W. Labov and J. Waletzky. Narrative analysis. In Essays on the Verbal and Visual Arts, ed. J. Helm, pages 12–44. Seattle: U. of Washington Press, 1967. [59] J. Lee, T. Le, J. Chen, and D. Lee. Do language models plagiarize? arXiv e-prints, pages arXiv–2203, 2022. [60] M. Lee, P. Liang, and Q. Yang. Coauthor: Designing a human-ai collaborative writing dataset for exploring language model capabilities. In CHI Conference on Human Factors in Computing Systems, pages 1–19, 2022. [61] B. Li, S. Lee-Urban, G. Johnston, and M. O. Riedl. Story generation with crowdsourced plot graphs. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, AAAI’13, pages 598–604. AAAI Press, 2013. URL http://dl.acm.org/citation.cfm?id=2891460.2891543. [62] C.-Y. Lin. Rouge: A package for automatic evaluation of summaries. In Proc ACL Wkshp. Vol 8., 2004. [63] I. Mani. Computational modeling of narrative. Synthesis Lectures on Human Language Technologies, 5(3):1–142, 2012. [64] L. J. Martin, P. Ammanabrolu, X. Wang, W. Hancock, S. Singh, B. Harrison, and M. O. Riedl. Event representations for automated story generation with deep neural nets. arXiv preprint arXiv:1706.01331, 2017. [65] K. Mathewson and P. Mirowski. Improbotics: Exploring the imitation game using machine intelligence in improvised theatre. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, volume 14, 2018. [66] K. W. Mathewson and P. Mirowski. Improvised theatre alongside artificial intelligences. In AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2017. [67] K. W. Mathewson and P. Mirowski. Improvised comedy as a turing test. arXiv preprint arXiv:1711.08819, 2017. [68] K. W. Mathewson, P. S. Castro, C. Cherry, G. F. Foster, and M. G. Bellemare. Shaping the narrative arc: An information-theoretic approach to collaborative dialogue. 
CoRR, abs/1901.11528, 2019. URL http://arxiv.org/abs/1901.11528. [69] R. McKee. Story: Substance, structure, style and the principles of screenwriting. Kent, Great Britain: Methuen, 1997. [70] J. R. Meehan. The metanovel: writing stories by computer. Technical report, Yale Univ, New Haven Conn, Dept of Comp Sci, 1976. [71] J. R. Meehan. Tale-spin, an interactive program that writes stories. In IJCAI, volume 77, pages 91–98, 1977. [72] P. Mirowski and K. W. Mathewson. Human improvised theatre augmented with artificial intelligence. In Proceedings of the 2019 on Creativity and Cognition, pages 527–530. 2019. [73] P. Mirowski, S. Chopra, S. Balakrishnan, and S. Bangalore. Feature-rich continuous language models for speech recognition. In Spoken Language Technology Wkshp, 2010 IEEE, pages 241–246. IEEE, 2010. [74] A. Newitz. Movie written by algorithm turns out to be hilarious and intense, May 2016. URL https://arstechnica.com/gaming/2021/05/an-ai-wrotethis-movie-and-its-strangely-moving/. [75] E. Nichols, L. Gao, and R. Gomez. Collaborative storytelling with large-scale neural language models. In Motion, Interaction and Games, pages 1–10. 2020. [76] OpenAI. Pricing, Nov 2021. URL https://openai.com/api/pricing/. [77] V. Padmakumar and H. He. Machine-in-the-loop rewriting for creative image captioning. In Proceedings of the 20th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022. [78] P. Papalampidi, K. Cao, and T. Kocisky. Towards coherent and consistent use of entities in narrative generation. International Conference on Machine Learning, 2022. [79] K. Patti. The first horror movie written entirely by bots, 2021. URL https://www.youtube.com/watch?v=WZzbxNoMjGM. [80] K. Perlin and A. Goldberg. Improv: A system for scripting interactive actors in virtual worlds. In Proc. Conf. on Computer Graphics and Interactive Techniques, pages 205–216. ACM, 1996. [81] M. Polceanu, J. Porteous, A. Lindsay, and M. 
Cavazza. Narrative plan generation with self-supervised learning. In AAAI, 2021. [82] G. Polti. The thirty-six dramatic situations. Editor Company, 1917. [83] V. I. Propp. Morphology of the Folktale, volume 9. University of Texas Press, 1968. [84] J. W. Rae, S. Borgeaud, T. Cai, K. Millican, J. Hoffmann, H. F. Song, J. Aslanides, S. Henderson, R. Ring, S. Young, E. Rutherford, T. Hennigan, J. Menick, A. Cassirer, R. Powell, G. van den Driessche, L. A. Hendricks, M. Rauh, P. Huang, A. Glaese, J. Welbl, S. Dathathri, S. Huang, J. Uesato, J. Mellor, I. Higgins, A. Creswell, N. McAleese, A. Wu, E. Elsen, S. M. Jayakumar, E. Buchatskaya, D. Budden, E. Sutherland, K. Simonyan, M. Paganini, L. Sifre, L. Martens, X. L. Li, A. Kuncoro, A. Nematzadeh, E. Gribovskaya, D. Donato, A. Lazaridou, A. Mensch, J. Lespiau, M. Tsimpoukelli, N. Grigorev, D. Fritz, T. Sottiaux, M. Pajarskas, T. Pohlen, Z. Gong, D. Toyama, C. de Masson d’Autume, Y. Li, T. Terzi, V. Mikulik, I. Babuschkin, A. Clark, D. de Las Casas, A. Guy, C. Jones, J. Bradbury, M. Johnson, B. A. Hechtman, L. Weidinger, I. Gabriel, W. S. Isaac, E. Lockhart, S. Osindero, L. Rimell, C. Dyer, O. Vinyals, K. Ayoub, J. Stanway, L. Bennett, D. Hassabis, K. Kavukcuoglu, and G. Irving. Scaling language models: Methods, analysis & insights from training gopher. CoRR, abs/2112.11446, 2021. URL https://arxiv.org/abs/2112.11446. [85] H. Rashkin, A. Celikyilmaz, Y. Choi, and J. Gao. Plotmachines: Outline-conditioned generation with dynamic plot state tracking. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4274–4295, 2020. [86] E. Reif, D. Ippolito, A. Yuan, A. Coenen, C. Callison-Burch, and J. Wei. A recipe for arbitrary text style transfer with large language models. arXiv preprint arXiv:2109.03910, 2021. [87] M. O. Riedl and R. M. Young. Narrative planning: Balancing plot and character. Journal of Artificial Intelligence Research, 39:217–268, 2010. [88] M. Roemmele and A. 
Gordon. Linguistic features of helpfulness in automated support for creative writing. In Proceedings of the First Workshop on Storytelling, pages 14–19, 2018. [89] M. Roemmele, A. S. Gordon, and R. Swanson. Evaluating story generation systems using automated linguistic analyses. In SIGKDD 2017 Workshop on Machine Learning for Creativity, pages 13–17, 2017. [90] R. Rosa, O. Dušek, T. Kocmi, D. Mareček, T. Musil, P. Schmidtová, D. Jurko, O. Bojar, D. Hrbek, D. Košt’ák, et al. Theaitre: Artificial intelligence to write a theatre play. arXiv preprint arXiv:2006.14668, 2020. [91] R. Rosa, T. Musil, O. Dušek, D. Jurko, P. Schmidtová, D. Mareček, O. Bojar, T. Kocmi, D. Hrbek, D. Košt’ák, et al. Theaitre 1.0: Interactive generation of theatre play scripts. arXiv preprint arXiv:2102.08892, 2021. [92] R. Rosa, P. Schmidtová, O. Dušek, T. Musil, D. Mareček, S. Obaid, M. Nováková, K. Vosecká, and J. Doležal. Gpt-2-based human-in-the-loop theatre play script generation. In Proceedings of the 4th Workshop of Narrative Understanding (WNU2022), pages 29–37, 2022. [93] D. E. Rumelhart. Notes on a schema for stories. In Representation and understanding, pages 211–236. Elsevier, 1975. [94] D. E. Rumelhart. On evaluating story grammars. 1980. [95] K. Sakaguchi, C. Bhagavatula, R. Le Bras, N. Tandon, P. Clark, and Y. Choi. proscript: Partially ordered scripts generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2138–2149, 2021. [96] P. Schmidtová, D. Javorsky, C. Mikláš, T. Musil, R. Rosa, and O. Dušek. Dialoguescript: Using dialogue agents to produce a script. arXiv preprint arXiv:2206.08425, 2022. [97] O. Schmitt and D. Buschek. Characterchat: Supporting the creation of fictional characters through conversation and progressive manifestation with a chatbot. In Creativity and Cognition, pages 1–10, 2021. [98] I. Scripts. How to write outstanding tv & movie loglines: The ultimate guide, Jun 2019. URL https://industrialscripts.com/loglines-guide/. 
[99] A. See, A. Pappu, R. Saxena, A. Yerukola, and C. D. Manning. Do massively pretrained language models make better storytellers? arXiv preprint arXiv:1909.10705, 2019. [100] T. Sellam, D. Das, and A. P. Parikh. Bleurt: Learning robust metrics for text generation. arXiv preprint arXiv:2004.04696, 2020. [101] G. Shimmin. Logline formula: How to use the killogator formula to write a killer logline, Dec 2021. URL https://graemeshimmin.com/writing-alogline-for-a-novel/. [102] W. M. Si, P. Ammanabrolu, and M. Riedl. Telling stories through multi-user dialogue by modeling character relations. In SIGDIAL, 2021. [103] J. Steiff. The complete idiot’s guide to independent filmmaking. Penguin, 2005. [104] C. Stevenson, I. Smal, M. Baas, R. Grasman, and H. van der Maas. Putting gpt-3’s creativity to the (alternative uses) test. 2022. [105] B. Swanson, K. Mathewson, B. Pietrzak, S. Chen, and M. Dinalescu. Story centaur: Large language model few shot learning as a creative writing tool. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 244–256, Online, Apr. 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.eacl-demos.29. URL https://aclanthology.org/2021.eacl-demos.29. [106] J. Tang, N. Segal, and C. Odimba. Ai at the young vic, 2021. URL https://www.youngvic.org/whats-on/ai. [107] M. Theune, S. Faas, A. Nijholt, and D. Heylen. The virtual storyteller: Story creation by intelligent agents. In Proceedings of the Technologies for Interactive Digital Storytelling and Entertainment (TIDSE) Conference, volume 204215, 2003. [108] P. W. Thorndyke. Cognitive structures in comprehension and memory of narrative discourse. Cognitive psychology, 9(1):77–110, 1977. [109] I. Van Heerden and A. Bas. Ai as author–bridging the gap between machine learning and literary theory. Journal of Artificial Intelligence Research, 71:175–189, 2021. [110] C. Vogler. The writer’s journey. 
Michael Wiese Productions Studio City, CA, 2007. [111] A. Wang, A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman. Glue: A multi-task benchmark and analysis platform for natural language understanding. ArXiv, abs/1804.07461, 2018. [112] T. Wang and X. Wan. T-cvae: Transformer-based conditioned variational autoencoder for story completion. In IJCAI, pages 5233–5239, 2019. [113] N. Wardrip-Fruin. Expressive processing. Cambridge: MIT Press. Weiberg, B.(2002). Beyond Interactive Cinema. Retrieved April, 9:2009, 2009. [114] S. G. Ware and R. M. Young. Cpocl: A narrative planner supporting conflict. In AIIDE, 2011. [115] J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. Chi, Q. Le, and D. Zhou. Chain of thought prompting elicits reasoning in large language models. arXiv preprint arXiv:2201.11903, 2022. [116] L. Weidinger, J. Mellor, M. Rauh, C. Griffin, J. Uesato, P.-S. Huang, M. Cheng, M. Glaese, B. Balle, A. Kasirzadeh, et al. Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359, 2021. [117] J. Wu, L. Ouyang, D. M. Ziegler, N. Stiennon, R. Lowe, J. Leike, and P. Christiano. Recursively summarizing books with human feedback, 2021. [118] T. Wu, M. Terry, and C. J. Cai. Ai chains: Transparent and controllable human-ai interaction by chaining large language model prompts. In CHI Conference on Human Factors in Computing Systems, pages 1–22, 2022. [119] P. Xu, M. Patwary, M. Shoeybi, R. Puri, P. Fung, A. Anandkumar, and B. Catanzaro. Megatron-cntrl: Controllable story generation with external knowledge using large-scale language models. arXiv preprint arXiv:2010.00840, 2020. URL https://arxiv.org/abs/2010.00840. [120] D. Yang, Y. Zhou, Z. Zhang, T. J.-J. Li, and R. LC. Ai as an active writer: Interaction strategies with generated text in human-ai collaborative fiction writing. In Joint Proceedings of the ACM IUI Workshops 2022, volume 10, 2022. [121] L. Yao, N. Peng, R. M. Weischedel, K. Knight, D. Zhao, and R. Yan. 
Plan-and-write: Towards better automatic storytelling. CoRR, abs/1811.05701, 2018. URL http://arxiv.org/abs/1811.05701. [122] A. Yuan, A. Coenen, E. Reif, and D. Ippolito. Wordcraft: Story writing with large language models. In 27th International Conference on Intelligent User Interfaces, pages 841–852, 2022. [123] J. Zhang, Y. Zhao, M. Saleh, and P. J. Liu. PEGASUS: pre-training with extracted gap-sentences for abstractive summarization. CoRR, abs/1912.08777, 2019. URL http://arxiv.org/abs/1912.08777. [124] Y. Zhu, S. Lu, L. Zheng, J. Guo, W. Zhang, J. Wang, and Y. Yu. Texygen: A benchmarking platform for text generation models. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 1097–1100, 2018. This paper is available on arxiv under CC 4.0 license.