Comedy & AI: The Ethics of Community-Based Value Alignment

Written by ethnocomputing | Published 2026/03/10
Tech Story Tags: ai | cultural-value-alignment | community-based-ai-training | ai-creativity | ai-in-comedy | contextual-safety-design | artistic-data-ownership | digital-forgery-in-comedy

TL;DR: Explore how "global" AI safety filters clash with the nuances of comedy. Learn why professional comedians advocate for community-based value alignment and data ownership to prevent creative erasure.

Abstract and 1. Introduction

  1. Methods
  2. Quantitative Results and Creativity Support Index
  3. Qualitative Results from Focus Group Discussions
  4. Discussion
  5. Mitigations, Conclusion, and Acknowledgments
  6. Ethical Guidance References

A. Related Work on Computational Humour, AI and Comedy

B. Participant Questionnaire

C. Focus

5 DISCUSSION

After analysing our study participants’ feedback and inspired by recent discussions on the ethics of generative models, we discuss how comedy and humour can be seen as special cases of value alignment (Sect. 5.1) where context is key (Sect. 5.2) and how data ownership impacts artists (Sect. 5.3).

5.1 Towards community-based cultural value alignment for humour and comedy

The participants’ critical stance towards popular conversational LLMs (ChatGPT and Bard) as a tool for comedy writing suggests that those tools may be currently misaligned with the particular creative goals of the artists.

5.1.1 Complexity of global cultural value alignment of LLMs for creative uses. The participants’ observations might be a special case of a more general problem. We first borrow the definition of cultural value alignment by Masoud et al. [66] as “the process of aligning an AI system with the set of shared beliefs, values, and norms of the group of users that interact with the system”, where such values are “fundamental beliefs an individual or a group holds towards socio-cultural topics” [6]. Gabriel [35] discusses the complexity of attempting to align generalist conversational AI systems—made to be used by diverse users for diverse tasks—with values shared by global communities, particularly given significant variations in norms and conceptions of justice across societies. In “Whose Opinions Do Language Models Reflect?”, Santurkar et al. [90] propose to probe the cultural representation encoded in LLMs. Johnson et al. [57], for instance, identified in some LLMs a lack of pluralistic opinions, with output values culturally more aligned with the US on issues ranging from secularism to gender and sexuality. Gabriel [35] and Kirk et al. [60] warn about the pitfalls of value imposition by one community over another—a problem surfaced by participants in our study.

In addition to the challenge of value alignment for diverse groups of users, Kasirzadeh and Gabriel [59] discuss alignment for diverse tasks, namely factual information retrieval vs. creative storytelling: “creative work aspires to achieve creative freedom and originality”, the latter often “obtained by stretching, even outright violating, the various rules of the game” [59, 95].

5.1.2 The problem with global fine-tuning of Harmless, Helpful and Honest conversational agents. Askell et al. [7] explicitly list honest, helpful, and harmless as stated objectives for general-purpose assistants (the so-called HHH criteria) because those “seem to capture the majority of what users want from an aligned AI”. However, while they discuss inter-agent and intra-agent conflicts between the three HHH criteria, they do not address the question of how the human values that underlie each of those criteria might conflict between different societies, nor do they discuss creative use cases where users may not want “honest” conversational agents, or may have a different definition of “harmless” [44].

Askell et al. [7] further propose to train such HHH assistants directly from user interactions and preference models, for instance through Reinforcement Learning from Human Feedback [27, 116]. During such fine-tuning, LLMs such as OpenAI’s ChatGPT [77, 79], Google Bard [26, 98], Anthropic’s Claude [8], and Meta’s Llama [100] are all fine-tuned on annotators’ feedback that is supposed to represent the values of a global community. However, the wider social or relational context of the producer and audience of the LLM’s interactions does not factor into the model’s training objective. The crowdworkers or users whose feedback fine-tunes LLMs may not sufficiently represent the diversity of opinions [60], and may provide insufficiently defined feedback. And yet the default versions of these LLMs are released under assumptions of indiscriminate, global cultural value alignment.
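
For illustration, the snippet below sketches the pairwise (Bradley-Terry) preference loss commonly used to train reward models in RLHF [27, 116]; this is our own minimal PyTorch sketch, not code from any of the cited systems. The point it makes concrete is that the training signal encodes only the annotator’s binary choice: neither the annotator’s community nor the relational context of the eventual interaction is an input to the objective.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss for reward modelling: push the scalar
    reward of the annotator-preferred response above the rejected one.
    Note that the annotator's identity, community, and the relational
    context of the interaction are nowhere in the training signal."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Example: scalar rewards for a batch of (chosen, rejected) response pairs.
loss = preference_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.7, 0.9]))
```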

The broad HHH criteria embody a Western philosophical approach to alignment [101] and bias [29]. As the participants noted, such global cultural value alignment underlying LLMs might also be directly in conflict with the specificity and local tastes that make comedy funny: “the broader appeal something has, the less great it could be. If you make something that fits everybody, it probably will end up being nobody’s favorite thing” (p10).

5.1.3 Community-based value alignment of LLMs. Thus, to make LLMs effectively understand or generate humour, their value-based alignment should be redirected from global alignment to community-based alignment with specific audiences and comedians. Communities could agree on a set of values for their specific culture and acceptable language norms, before training, fine-tuning or adapting the LLM. More simply, LLMs could be trained only on feedback and data generated by members of each distinct community, data that reflects that community’s actual norms and values. As Gabriel [35] suggests, this kind of value alignment could happen through a democratic process (see more recent work on Collective Constitution AI [4, 92]). The LLM could be designed as a mixture model that accommodates a plurality of viewpoints [101] or could be fine-tuned to allow consensus-based agreement among humans with diverse preferences [9]. The technical infrastructure for community-based value alignment of LLMs is readily available: for example, proprietary LLMs like ChatGPT [77, 79] or Palm 2 [26], as well as open-source models like Llama [100] or Mixtral [55] (via the HuggingFace platform[9]), all allow fine-tuning on user-supplied text or conversations. The key problem for communities is setting up data governance and infrastructure to responsibly collect and curate data [111].
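
To illustrate how low the technical barrier is, the sketch below fine-tunes an open causal language model on a community-curated corpus using the HuggingFace transformers library. The model name, the hypothetical "community_comedy.jsonl" file, and the hyperparameters are illustrative placeholders we chose, not recommendations; as noted above, the hard part is the data governance [111], not the code.

```python
# A minimal sketch, assuming a community-governed corpus of comedy writing
# in a hypothetical "community_comedy.jsonl" file (one {"text": ...} record
# per line). Model and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "mistralai/Mistral-7B-v0.1"  # any open causal LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Only data generated and curated by members of the community itself.
dataset = load_dataset("json", data_files="community_comedy.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="community-llm",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```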

One benefit of global value alignment is the prevention of harm; delegating such a responsibility to a smaller community of users is not without risks. To prevent harm, one could envision mechanisms for community accountability, and assume that comedians share some common values, such as do not harm (and do not lose) the audience[10]: “if we’re using it to create material as artists, the responsibility for what we do with that material is on me” (p4).

5.1.4 Avoiding unnecessary paternalism in LLMs for creative uses. Two alternative formulations of the interaction between a creative (comedy) writer and AI could be adapted from [62] and [34]. In the first, the AI writing tools should “enable the [human] to more effectively carry out tasks that are instrumental to their goals” while being “mediated by the norms and infrastructure of the society in which they live” [62]. In the second, “meaningful human control is achieved if human creators can creatively express themselves through the generative system, leading to an outcome that aligns with their intentions and carries their personal, expressive signature” [34]. Provided that societal norms can be formulated in the subtle and edgy domain of humour, an LLM designed according to principles of beneficent intelligence [62] and with self-expression as a goal [34] could ensure the comedian’s creative freedom when interacting with the tool, while minimising unnecessary paternalism due to LLM censorship.

5.2 Humour and comedy uses of LLMs require incorporating the context

As we hypothesised in Section 1.2, and as observed by study participants, LLM tools did not take into account the relational context when moderating offensive language, and missed the broader situational context of comedy writing.

5.2.1 Recent AI ethics research on relational context for LLMs. There are safety reasons why widely released, public-facing LLMs generally cannot handle offensive language and dark humour. As Kasirzadeh and Gabriel [59] observe in the case of creative storytelling, the potential harms of LLMs are due to their deployment in “domains that are not context-bound”. Weidinger et al. [104] recognise that “context determines whether a given capability may cause harm” and propose to add human interaction (as well as systemic impacts) to safety evaluation, in order to “account for relevant context”: “who uses the AI system, to what end and under which circumstances”. Similarly, Amironesei and Díaz [3] propose to “incorporate social context into the ways [NLP] tasks are conceptualised and operationalised”, in order to distinguish offensive language such as reclaimed slurs from “language which is [intentionally] abusive, toxic or hateful”. Such relational context includes the speaker, the receiver, their social relation, social and cultural norms, and communicative goals. Comedy is often shared with audiences in physical spaces (e.g., a comedy festival) or on age-controlled distribution channels, with implied context and explicit trigger warnings. Context allows comedians to share their lived experience of trauma, which may involve depicting violence or harassment [36]; more generally, comedy can be used as a mechanism for processing trauma [33]. It is beyond the scope of this paper to propose technical solutions for evaluating or incorporating the broader relational context into an LLM, but we suggest that the creative community needs to be actively involved in specifying how the LLM should process that context, and in designing safety mechanisms that avoid indiscriminate censorship.
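
To make the notion concrete at the interface level, the toy sketch below shows a moderation request that carries relational context fields (speaker, receiver, their relation, venue norms, communicative goal) instead of discarding them. The data structure and the gate policy are our illustrative assumptions, not a mechanism proposed in the cited works; a deployed policy would need to be specified together with the creative community.

```python
# A toy sketch, assuming context fields of the kind suggested by [3, 104].
# The gate policy itself is a deliberately simplistic placeholder.
from dataclasses import dataclass, field

@dataclass
class RelationalContext:
    speaker: str                  # e.g. "comedian, in-group member"
    audience: str                 # e.g. "ticketed comedy-festival crowd"
    relation: str                 # e.g. "performer to consenting adults"
    age_controlled_venue: bool = False
    trigger_warnings: list[str] = field(default_factory=list)
    communicative_goal: str = ""  # e.g. "satire", "processing trauma"

def allow_edgy_register(ctx: RelationalContext) -> bool:
    """Toy gate: permit dark or offensive-register material only when the
    audience has meaningfully opted in. A blanket filter would return
    False regardless of context, which is the indiscriminate censorship
    that study participants objected to."""
    return ctx.age_controlled_venue and bool(ctx.trigger_warnings)

ctx = RelationalContext(
    speaker="comedian, in-group member",
    audience="adults at an age-controlled comedy festival",
    relation="performer to consenting audience",
    age_controlled_venue=True,
    trigger_warnings=["violence", "harassment"],
    communicative_goal="processing trauma")
assert allow_edgy_register(ctx)
```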

As a counterpoint to that suggestion, one can argue that comedians can afford to use offensive language because they can take responsibility for it (and are accountable to their audiences) as they can claim to be in-group members—something that AI cannot be. Even if one successfully builds an LLM that can handle context properly and speak the (offensive) language of its users as if it were part of that in-group, “it is not clear if users would find its use of reclaimed terms acceptable, as the model cannot actually be in-group” [86]. To quote a participant: “If I was to try to do that with an AI, I can only imagine how yikes, and offensive, and totally not fit for humanity a pitch like that would be” (p17).

5.2.2 Humour and the human context. We can hypothesize that the missing relational context, as discussed above, affects the quality of comedy material co-written with LLMs. In our study, participants noted that the LLMs were missing the complex human context needed for real humour understanding: “I have an intuitive sense of what’s gonna work [...] for me based on so much lived experience and studying of comedy, but it is very individualized and I don’t know that AI is ever gonna be able to approach that” (p14).

Their observations seem to confirm recent work that examined the use of LLMs to automatically evaluate [40, 41, 45] and detect humour [10]. Despite formidable recent advances, empirical studies found that understanding and generating humour remained challenging for LLMs, which could generate only a limited number of stereotypical jokes [54]. These issues worsen when one ventures beyond the English language: researchers observed that “one mostly unaddressed issue in the field of computational humour (both for generation and detection) is how it is mostly centered on English jokes” [106], with limited work on LLM-based humour in other languages [52, 61].

To cite Winters [106], “Humor’s frame-shifting prerequisite reveals its difficulty for a machine to acquire. [...] This substantial dependency on insight into human thought (e.g., memory recall, linguistic abilities for semantic integration, and world knowledge inferences) often made researchers conclude that humor is an AI-complete problem” [50]. Winters [106] continues: “genuine humor appreciation requires machines to have human-level intelligence, since it needs functionally equivalent cognitive abilities, sensory inputs, anticipation-generators and world views” [50]. Humans know how to write surreal prose “by design, not by accident or failure of expression” [93].

5.3 Data ownership and the impact of AI on artists

As discussed in Sections 4.5 and 4.5.1, many of the participants in our study raised concerns about copyright and data ownership. Their responses reflected keen awareness of recent litigation against technology companies training LLMs on data potentially under copyright (including the lawsuit led by comedian Sarah Silverman [39]), and the AI-related concerns of the Writers Guild of America (WGA) 2023 strike [75]. Several participants believed that “if you copy and paste something directly into a show, that is plagiarism” (p9) and that, just like for visual artists [56, 78], a comedian’s style and voice are personal to them and developed through their lived experience. Among the underlying concerns are economic loss via labor displacement [34, 56], with LLMs that “cannibalise the market for human-authored works” [103], and the devaluation of artists’ work via “digital forgery” [56]. In the US, the WGA 2023 strike resulted in outcomes that some considered favourable for writers[11], with rules for producers that “AI can’t write or rewrite literary material, and AI-generated material will not be considered source material under the [Minimum Basic Agreement], meaning that AI-generated material can’t be used to undermine a writer’s credit or separated rights” [75].

The need to disclose the AI origin of text or images has been discussed in [34, 56, 103]; one participant said that it was important for ethical reasons that “people understand that they’re working with AI live” (p15).

Authors:

(1) Piotr W. Mirowski∗, Google DeepMind London, UK ([email protected]);

(2) Juliette Love∗, Google DeepMind London, UK ([email protected]);

(3) Kory Mathewson, Google DeepMind Montréal, QC, Canada ([email protected]);

(4) Shakir Mohamed, Google DeepMind London, UK ([email protected]).


This paper is available on arxiv under CC BY 4.0 license.

[9] https://huggingface.co/models

[10] While comedy has also been used as a weapon to target and alienate specific social groups and spread hateful stereotypes, we exclude such misuse from our argument, as that usage falls under the definition of hate speech [32, 86].

[11] https://www.vox.com/culture/2023/9/24/23888673/wga-strike-end-sag-aftracontract

