NYT Hired Hackers to Game ChatGPT Into Giving Evidence For Its Lawsuits Against OpenAI
by @legalpdf



Too Long; Didn't Read

OpenAI wants portions of the NYT's lawsuit against the company to be dismissed, arguing that the paper presented misleading evidence to the court.

The New York Times Company v. OpenAI Update Court Filing, retrieved on February 26, 2024, is part of HackerNoon’s Legal PDF Series. This part is 1 of 15.

I. INTRODUCTION

The artificial intelligence (AI) tool known as ChatGPT is many things: a revolutionary technology with the potential to augment human capabilities, fostering our own productivity and efficiency;[1] an accelerator for scientific and medical breakthroughs;[2] a mechanism for making existing technologies accessible to more people;[3] an aid to help the visually impaired navigate the world;[4] a creative tool that can write sonnets, limericks, and haikus;[5] and a computational engine that reasonable estimates posit may add trillions of dollars of growth across the global economy.[6]


Contrary to the allegations in the Complaint, however, ChatGPT is not in any way a substitute for a subscription to The New York Times. In the real world, people do not use ChatGPT or any other OpenAI product for that purpose. Nor could they. In the ordinary course, one cannot use ChatGPT to serve up Times articles at will.


The Times has sought to paint a different picture. Its lawsuit alleges that OpenAI has imperiled the very enterprise of journalism, illustrating the point with 100 examples in which some version of OpenAI’s GPT-4 model supposedly generated several paragraphs of Times content as outputs in response to user prompts. See Dkt. 1-68 (Exhibit J).


The allegations in the Times’s Complaint do not meet its famously rigorous journalistic standards.[7] The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint. They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use.[8] And even then, they had to feed the tool portions of the very articles they sought to elicit verbatim passages of, virtually all of which already appear on multiple public websites. Normal people do not use OpenAI’s products in this way.[9]


Journalism has undergone many changes in the digital age, and may well undergo more with the advent of AI. OpenAI has established important partnerships with many leaders in the news industry—from large enterprises like the Associated Press and Axel Springer to the dozens of smaller and local outlets associated with the American Journalism Project—to creatively explore and implement AI solutions that assist investigative reporting, create enhanced reader experiences, and improve business operations. The Times’s suggestion that the contrived attacks of its hired gun show that the Fourth Estate is somehow imperiled by this technology is pure fiction. So too is its implication that the public en masse might mimic its agent’s aberrant activity.


There is a genuinely important issue at the heart of this lawsuit—critical not just to OpenAI, but also to countless start-ups and other companies innovating in this space—that is being litigated both here and in over a dozen other cases around the country (including in this Court): whether it is fair use under copyright law to use publicly accessible content to train generative AI models to learn about language, grammar, and syntax, and to understand the facts that constitute humans’ collective knowledge. OpenAI and the other defendants in these lawsuits will ultimately prevail because no one—not even the New York Times—gets to monopolize facts[10] or the rules of language.[11] For good reason, there is a long history of precedent holding that it is perfectly lawful to use copyrighted content as part of a technological process that (as here) results in the creation of new, different, and innovative products.[12] Established copyright doctrine will dictate that the Times cannot prevent AI models from acquiring knowledge about facts, any more than another news organization can prevent the Times itself from re-reporting stories it had no role in investigating.[13] As Justice Brandeis explained more than 100 years ago: “The general rule of law is, that the noblest of human productions—knowledge, truths ascertained, conceptions, and ideas—become, after voluntary communication to others, free as the air to common use.”[14]


All of that said, even assuming (counterfactually) the truth of what the lawsuit alleges, several of the theories in the Complaint are not viable, even as pleaded. This Motion asks the Court to trim those at the outset to focus the litigation on the core issues that really matter. In short: (1) The direct copyright infringement claim asserts liability in part from conduct that is time-barred because it occurred more than three years ago. (2) The contributory infringement claim would ascribe liability to OpenAI based on generalized knowledge of third-party infringement, rather than actual knowledge of specific infringements, which the law requires. (3) The claim for violations of 17 U.S.C. § 1202 (the “DMCA”) fails for the reasons embraced by every other court to consider indistinguishable claims against generative AI models: the DMCA simply does not address the conduct to which the Times seeks to ascribe liability. And (4) the claim for state common law misappropriation is preempted by the federal Copyright Act.


OpenAI respectfully seeks an order dismissing these legally infirm portions of the Complaint, so that the parties can properly and efficiently litigate the balance.




[1] Louis Hyman, It’s Not the End of Work. It’s the End of Boring Work, N.Y. Times (Apr. 22, 2023), https://www.nytimes.com/2023/04/22/opinion/jobs-ai-chatgpt.html.


[2] Microsoft, The Impact of Large Language Models on Scientific Discovery: a Preliminary Study Using GPT-4 (Dec. 8, 2023), https://arxiv.org/pdf/2311.07361.pdf (deployments in “biology and materials design” and “drug discovery”).


[3] Ran Ronen, How Generative AI Tools Like ChatGPT Can Revolutionize Web Accessibility, VentureBeat (July 8, 2023), https://venturebeat.com/ai/how-generative-ai-tools-like-chatgpt-can-revolutionize-web-accessibility/.


[4] Sheena Vasani, Be My Eyes AI Offers GPT-4-Powered Support for Blind Microsoft Customers, Verge (Nov. 15, 2023), https://www.theverge.com/2023/11/15/23962709/microsoft-blind-users-open-ai-chatgpt-4-be-my-eyes.


[5] Adam Gross, I Asked ChatGPT AI to Write a Sonnet, LinkedIn (Mar. 2023), https://www.linkedin.com/posts/grossadam_i-asked-chatgpt-ai-to-write-a-sonnet-in-iambic-activity-040728019300229120-ZBNa/; Zetolgam, The Chatbot Limerick Writer, AllPoetry (Dec. 2022), https://allpoetry.com/poem/16903411-The-Chatbot-Limerick-Writer-by-Zetolgam; Uday Dandavate, How I Used ChatGPT to Write Haiku, Medium (Sept. 2, 2023), https://uday-dandavate.medium.com/how-i-used-chatgpt-to-write-haiku-5904ee96360d.


[6] McKinsey & Company, The Economic Potential of Generative AI: The Next Productivity Frontier (June 14, 2023), https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier (“generative AI could add the equivalent of $2.6 trillion to $4.4 trillion annually” to the global economy); see also Goldman Sachs, Generative AI Could Raise Global GDP by 7% (Apr. 5, 2023), https://www.goldmansachs.com/intelligence/pages/generative-ai-could-raise-global-gdp-by-7-percent.html.


[7] Cf. N.Y. Times, Ethical Journalism: A Handbook, https://www.nytimes.com/editorial-standards/ethical-journalism.html#introductionAndPurpose (last accessed Feb. 25, 2024) (“we tell our readers the complete, unvarnished truth as best we can learn it”; “Staff members…may not commit illegal acts of any sort.”).


[8] OpenAI, Terms of Use § 2(c), https://openai.com/policies/mar-2023-terms (prohibiting use to “infringe[]” others’ “rights” or “extract data”); Hesse v. Godiva Chocolatier, Inc., 463 F. Supp. 3d 453, 463 (S.D.N.Y. 2020) (courts may take “judicial notice of information publicly announced on a party’s website” if “authenticity is not in dispute”).


[9] Francesca Paris & Larry Buchanan, 35 Ways Real People Are Using A.I. Right Now, N.Y. Times (Apr. 14, 2023), https://www.nytimes.com/interactive/2023/04/14/upshot/up-ai-uses.html (people use ChatGPT to “[w]rite a [] speech,” “[s]kim dozens of academic articles,” “[a]ppeal an insurance denial,” and “[c]reate new proteins in minutes”).


[10] See, e.g., Baker v. Selden, 101 U.S. 99, 103 (1879) (“The very object of publishing a book … is to communicate to the world the useful knowledge which it contains. But this object would be frustrated if the knowledge could not be used without incurring guilt of piracy of the book.”); Hoehling v. Universal City Studios, Inc., 618 F.2d 972, 979 (2d Cir. 1980) (“[F]actual information is in the public domain.”).


[11] See 17 U.S.C. § 102(b); Clanton v. UMG Recordings, Inc., 556 F. Supp. 3d 322, 332 (S.D.N.Y. 2021) (“ordinary building blocks of the English language” not protectable); Med. Educ. Dev. Servs. v. Reed Elsevier Grp., PLC, No. 05-cv-8665, 2008 WL 4449412, at *6 (S.D.N.Y. Sep. 30, 2008) (no copyright for “concepts such as rules of punctuation, analogies, vocabulary or other fundamental elements of English composition”).


[12] See, e.g., Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014) (fair use to create “digital copies of more than ten million [books]” for search tool); Authors Guild v. Google, Inc. (Google Books), 804 F.3d 202 (2d Cir. 2015) (fair use to scan millions of copyrighted books to create novel tool); Google LLC v. Oracle Am., Inc., 141 S. Ct. 1183 (2021) (fair use to replicate copyrighted software programming interfaces to create a new mobile platform).


[13] See Nicholas Lemann, The Panama Papers and the Monster Stories of the Future, New Yorker (Apr. 14, 2016), https://www.newyorker.com/news/news-desk/the-panama-papers-and-the-monster-stories-of-the-future (noting the Times’s refusal to participate in consortium that broke “Panama Papers” story); Michael S. Schmidt & Steven Lee Myers, Panama Law Firm’s Leaked Files Detail Offshore Accounts Tied to World Leaders, N.Y. Times (Apr. 3, 2016), https://www.nytimes.com/2016/04/04/us/politics/leaked-documents-offshore-accounts-putin.html.


[14] Int’l News Serv. v. Associated Press, 248 U.S. 215, 250 (1918) (Brandeis, J., dissenting).




This court case, retrieved on February 26, 2024, from fingfx.thomsonreuters.com, is part of the public domain. Court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.