The Center for Investigative Reporting Inc. v. OpenAI Court Filing, retrieved on June 27, 2024, is part of HackerNoon’s Legal PDF Series. This part is 7 of 18.
83. In response to prompts by users, ChatGPT and Copilot provide highly detailed abridgements of copyright-protected news articles, including articles published by Plaintiff.
84. When earlier versions of ChatGPT (up to and including ChatGPT 3.5-turbo) abridge a copyright-protected news article in response to a user prompt, they draw from their training on the article. During training, the patterns of all content, including copyright-protected news articles, are imprinted onto the model. That imprint allows the model to abridge the article.
85. When Copilot and later versions of ChatGPT abridge a copyright-protected news article in response to a user prompt, they find the previously downloaded article inside a database called a search index using a method called synthetic searching. Upon information and belief, they make another copy of the article in the memory of their computing system and use their LLM or other programming to generate an abridgement by applying the model or other programming to the text of the article.
86. Plaintiff’s articles are not merely collections of facts. Rather, they reflect the originality of their authors in selecting, arranging, and presenting facts to tell compelling stories. They also reflect the authors’ analysis and interpretation of events, structuring of materials, marshaling of facts, and the emphasis given to certain aspects.
87. An ordinary observer of a ChatGPT or Copilot abridgement of one of Plaintiff’s articles would conclude that the abridgements were derived from the articles being abridged, at least because ChatGPT and Copilot expressly link to the article they abridge and explain that they are searching Plaintiff’s website in the course of generating a response.
88. In response to prompts seeking an abridgement of an article, ChatGPT and Copilot will typically provide a general abridgement of such an article, on the order of a few paragraphs. In some instances, the initial response will summarize the article in substantial detail. Further, when prompted by the user to provide more information about one or more aspects of that abridgement, ChatGPT or Copilot will provide additional details, often in the format of a bulleted list of main points. If prompted by the user to provide more information on one or more of those points, ChatGPT or Copilot will provide additional details. In some instances, however, ChatGPT or Copilot will provide a bulleted list of main points in response to an initial prompt seeking an abridgement.
89. A ChatGPT or Copilot user is capable of obtaining a substantial abridgement of a copyright-protected news article through such a series of prompts, and in some instances, further prompts designed to elicit further summary are even suggested by Copilot or ChatGPT itself. As a representative sample, a series of abridgements by ChatGPT and Copilot is attached as Exhibit 8.
90. These abridgements lack copyright notice or terms of use information conveyed in connection with the work, and sometimes lack author information.
91. Thus, upon information and belief, abridgements from earlier versions of ChatGPT lack copyright notice, terms of use, and typically author information because Defendants intentionally removed that information from the ChatGPT training sets.
92. Further, the abridgements from Copilot and later versions of ChatGPT lack copyright notice, terms of use, and typically author information. Upon information and belief, this is because Defendants intentionally removed them either when initially storing them in computer memory or when generating the synthetic search results.
93. Defendants’ abridgements, rewritten from copyright-protected news articles, harm the market for those articles by reducing the incentives for users to go to the original source, thus reducing Plaintiff’s subscription, licensing, advertising, and affiliate revenue. This allows Defendants to monetize copyright owners’ content at the expense of copyright owners who created the works ChatGPT has abridged.
94. Defendants’ abridgements do not add anything new to, or further any purpose or character different from, that of Plaintiff’s articles. They simply take the text of the articles and rewrite them into abridgements, including, when prompted, into detailed abridgements of the entire articles. Those abridgements often serve as a substitute for the original articles even when they are not complete, as evidenced by a study showing that only 51% of consumers read the entire text of a typical news article.[14]
95. Defendants’ abridgements of Plaintiff’s articles violate Plaintiff’s copyrights.
About HackerNoon Legal PDF Series: We bring you the most important technical and insightful public domain court case filings.
This court case, retrieved on June 27, 2024, from motherjones.com, is part of the public domain. The court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.
[14] See Sharing on Social Media Makes Us Overconfident in Our Knowledge, UT News (Aug. 30, 2022), https://news.utexas.edu/2022/08/30/sharing-on-social-media-makes-us-overconfident-inourknowledge/#:~:text=Recent%20data%20from%20the%20Reuters,headline%20or%20a%20few%20lines.