Quantitative Evaluation of AI Writing Tools: Insights from Likert Scale Responses

Authors:

(1) PIOTR MIROWSKI and KORY W. MATHEWSON, DeepMind, United Kingdom (both authors contributed equally to this research);

(2) JAYLEN PITTMAN, Stanford University, USA (work done while at DeepMind);

(3) RICHARD EVANS, DeepMind, United Kingdom.

Table of Links

Abstract and Intro
Storytelling, The Shape of Stories, and Log Lines
The Use of Large Language Models for Creative Text Generation
Evaluating Text Generated by Large Language Models
Participant Interviews
Participant Surveys
Discussion and Future Work
Conclusions, Acknowledgements, and References
A. RELATED WORK ON AUTOMATED STORY GENERATION AND CONTROLLABLE STORY GENERATION
B. ADDITIONAL DISCUSSION FROM PLAYS BY BOTS CREATIVE TEAM
C. DETAILS OF QUANTITATIVE OBSERVATIONS
D. SUPPLEMENTARY FIGURES
E. FULL PROMPT PREFIXES FOR DRAMATRON
F. RAW OUTPUT GENERATED BY DRAMATRON
G. CO-WRITTEN SCRIPTS

D SUPPLEMENTARY FIGURES

Figure 7 shows the participants’ responses to the quantitative evaluation on a Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly agree), broken down by two groupings of participants. For the first grouping, we defined a binary indicator variable (Has experience of AI writing tools). For the second grouping, we defined a three-class category for their primary domain of expertise (Improvisation, Scripted Theatre, and Film or TV).

This paper is available on arXiv under CC 4.0 license.
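The breakdown described for Figure 7 can be illustrated with a minimal sketch that averages Likert scores within each participant group. It is not the authors' analysis code: the record fields (`has_ai_experience`, `domain`, `score`), the helper `mean_score_by_group`, and all data values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records. These values are invented for illustration
# and are NOT the responses collected in the study.
responses = [
    {"has_ai_experience": True, "domain": "Improvisation", "score": 4},
    {"has_ai_experience": True, "domain": "Scripted Theatre", "score": 5},
    {"has_ai_experience": False, "domain": "Film or TV", "score": 3},
    {"has_ai_experience": False, "domain": "Improvisation", "score": 2},
    {"has_ai_experience": True, "domain": "Film or TV", "score": 3},
]


def mean_score_by_group(records, key):
    """Average the 1-5 Likert scores of records that share a value of `key`."""
    grouped = defaultdict(list)
    for record in records:
        score = record["score"]
        # Likert-type scale: 1 (strongly disagree) to 5 (strongly agree).
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        grouped[record[key]].append(score)
    return {group: mean(scores) for group, scores in grouped.items()}


# First grouping: binary indicator (has experience of AI writing tools).
by_experience = mean_score_by_group(responses, "has_ai_experience")

# Second grouping: three-class primary domain of expertise.
by_domain = mean_score_by_group(responses, "domain")
```

Using the same helper for both groupings keeps the binary and three-class breakdowns directly comparable. Likert responses are ordinal, so a median or a full response distribution per group may be a more faithful summary than the mean used here.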