
The Times v. Microsoft/OpenAI: Unauthorized Retrieval and Dissemination of Current News (13)

by Legal PDF: Tech Court Cases, January 2nd, 2024



The New York Times Company v. Microsoft Corporation Court Filing December 27, 2023 is part of HackerNoon’s Legal PDF Series. You can jump to any part in this filing here. This is part 13 of 27.

IV. FACTUAL ALLEGATIONS

C. Defendants’ Unauthorized Use and Copying of Times Content

4. Unauthorized Retrieval and Dissemination of Current News


108. Synthetic search applications built on the GPT LLMs, including Bing Chat and Browse with Bing for ChatGPT, display extensive excerpts or paraphrases of the contents of search results, including Times content, that may not have been included in the model’s training set. The “grounding” technique employed by these products includes receiving a prompt from a user, copying Times content relating to the prompt from the internet, providing the prompt together with the copied Times content as additional context for the LLM, and having the LLM stitch together paraphrases or quotes from the copied Times content to create natural-language substitutes that serve the same informative purpose as the original. In some cases, Defendants’ models simply spit out several paragraphs of The Times’s articles.
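The four "grounding" steps described in paragraph 108 (receive a prompt, retrieve related content, pass prompt and content to the model together, generate a response from that content) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not drawn from the filing or from Defendants' products: the `CORPUS` list stands in for a web index, `retrieve` uses naive word overlap rather than a real search engine, and `stub_llm` simply echoes its context instead of calling a language model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    url: str
    text: str


# Hypothetical stand-in for a web index; a real product would query a search engine.
CORPUS = [
    Document("https://example.com/pools",
             "Public pools in the city open early and stay busy all summer."),
    Document("https://example.com/trains",
             "Commuter trains were delayed after a signal failure on Monday."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(prompt: str, corpus: list[Document], k: int = 1) -> list[Document]:
    """Step 2: pick the k documents sharing the most words with the prompt."""
    query = tokenize(prompt)
    ranked = sorted(corpus,
                    key=lambda doc: len(query & tokenize(doc.text)),
                    reverse=True)
    return ranked[:k]


def build_grounded_prompt(prompt: str, docs: list[Document]) -> str:
    """Step 3: attach the retrieved text to the user's prompt as context."""
    context = "\n".join(f"[{doc.url}] {doc.text}" for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {prompt}"


def stub_llm(grounded_prompt: str) -> str:
    """Step 4 placeholder: echoes the context; a real system would call a model."""
    context = grounded_prompt.split("\n\nQuestion:")[0][len("Context:\n"):]
    return f"According to the retrieved source: {context}"


def answer(prompt: str) -> str:
    """Step 1 through step 4: take a user prompt and return a grounded response."""
    docs = retrieve(prompt, CORPUS)
    return stub_llm(build_grounded_prompt(prompt, docs))


if __name__ == "__main__":
    print(answer("When do public pools open?"))
```

Because the stub returns its context unchanged, the response in this sketch consists entirely of retrieved text. That mirrors, in simplified form, the concern the paragraph raises: text supplied as context can reappear in the output whether or not it was in the model's training set.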


109. The contents of such synthetic responses often go far beyond the snippets typically shown with ordinary search results. Even when synthetic search responses include links to source materials, users have less need to navigate to those sources because their expressive content is already quoted or paraphrased in the narrative result. Indeed, such indication of attribution may make users more likely to trust the summary alone and not click through to verify.


110. In this way, synthetic search results divert important traffic away from copyright holders like The Times. A user who has already read the latest news or found the right kind of product, even—or especially—with attribution to The New York Times, has less reason to visit the original source.


111. Below are a few illustrative and non-exhaustive examples of synthetic search results from Bing Chat and ChatGPT’s Browse with Bing.


a) Examples of Synthetic Search Results from Bing Chat


112. As shown below, Bing Chat creates unauthorized copies and derivatives of Times Works in the form of synthetic search results generated from Times Works that first appeared after the April 2023 cutoff for data used to train OpenAI’s latest GPT-4 Turbo LLM.[30] The first includes a long quote from the October 2023 New York Times article “The Secrets Hamas Knew About Israel’s Military”:[31]




113. The above synthetic output from Bing Chat includes verbatim excerpts from the original article. The copied article text is highlighted in red below.



114. The synthetic output displays significantly more expressive content from the original article than what would traditionally be displayed in a Bing search result for the same article, as shown below. Unlike a traditional search result, the synthetic output also does not include a prominent hyperlink that sends users to The Times’s website.



115. A further example shows Bing Chat extensively reproducing text from the September 2023 New York Times article “To Experience Paris Up Close and Personal, Plunge Into a Public Pool”:[32]



116. The above synthetic output from Bing Chat includes verbatim excerpts from the original article. The copied article text is highlighted in red below.



117. The synthetic output displays significantly more expressive content from the original article than what would traditionally be displayed in a Bing search result for the same article, as shown below. Unlike a traditional search result, the synthetic output also does not include a prominent hyperlink that sends users to The Times’s website.



b) Synthetic Search Results from ChatGPT Browse with Bing


118. The below examples show that ChatGPT’s Browse with Bing plug-in also outputs unauthorized copies and derivatives of copyrighted works from The Times in the form of synthetic search results generated from Times Works that first appeared after the April 2023 cutoff for data used to train OpenAI’s latest GPT-4 Turbo LLM. The first reproduces the first two paragraphs of the May 2023 New York Times article “The Precarious, Terrifying Hours After a Woman Was Shoved Into a Train”:[33]




119. The above synthetic output from ChatGPT with the Browse with Bing plugin includes verbatim excerpts from the original article. The copied article text is highlighted in red below.



120. The synthetic output displays significantly more expressive content from the original article than what would traditionally be displayed in a Bing search result for the same article, as shown below. Unlike a traditional search result, the synthetic output also does not include a prominent hyperlink that sends users to The Times’s website.



121. This example likewise shows Browse with Bing for ChatGPT reproducing the first two paragraphs of The New York Times article “Are the Hamptons Still Hip?” from May 2023.[34]



122. The above synthetic output from ChatGPT with the Browse with Bing plugin includes verbatim excerpts from the original article. The copied article text is highlighted in red below.



123. Again, the synthetic output displays significantly more expressive content from the original article than what would traditionally be displayed in a Bing search result for the same article, as shown below. Unlike a traditional search result, the synthetic output also does not include a prominent hyperlink that sends users to The Times’s website.







[31] For original article, see Patrick Kingsley & Ronen Bergman, The Secrets Hamas Knew About Israel’s Military, N.Y. TIMES (Oct. 13, 2023), https://www.nytimes.com/2023/10/13/world/middleeast/hamas-israel-attack-gaza.html.


[32] For original article, see Catherine Porter, To Experience Paris Up Close and Personal, Plunge Into a Public Pool, N.Y. TIMES (Sept. 3, 2023), https://www.nytimes.com/2023/09/03/world/europe/paris-france-swimming-pools.html.


[33] For original article, see Hurubie Meko, The Precarious, Terrifying Hours After a Woman Was Shoved Into a Train, N.Y. TIMES (May 25, 2023), https://www.nytimes.com/2023/05/25/nyregion/subway-attack-woman-shoved-manhattan.html.


[34] For original article, see Anna Kodé, Are the Hamptons Still Hip?, N.Y. TIMES (May 26, 2023), https://www.nytimes.com/2023/05/26/realestate/hamptons-summer-housing-costs.html.




About HackerNoon Legal PDF Series: We bring you the most important technical and insightful public domain court case filings.


This court case 1:23-cv-11195 retrieved on December 29, 2023, from nycto-assets.nytimes.com is part of the public domain. The court-created documents are works of the federal government, and under copyright law, are automatically placed in the public domain and may be shared without legal restriction.