DOE v. GITHUB Court Filing, retrieved on January 26, 2023, is part of HackerNoon’s Legal PDF Series. This part is 11 of 21.
IV. ARGUMENT
C. Plaintiffs’ Claims Fail for Reasons Specific to Each Claim.
1. Plaintiffs’ DMCA Claim Should be Dismissed.
Although the complaint is replete with allegations about alleged similarities between Copilot’s output and the code it was trained on, Plaintiffs do not assert a copyright infringement claim.
Instead, they allege that Defendants violated the DMCA by (1) removing or altering CMI from Licensed Materials, (2) distributing copies of Licensed Materials knowing CMI had been removed or altered without authority, and (3) knowingly providing CMI that is false by “asserting and/or implying that Copilot is the author of the Licensed Materials.” (Compl. ¶¶ 158-159.) Plaintiffs’ allegations do not meet DMCA requirements and fail properly to plead a DMCA claim.
a. Plaintiffs Have Not Properly Pled a Claim for Removal of CMI.
To properly plead a claim for removal of CMI, a plaintiff must plausibly allege: (1) the existence of CMI on a work, (2) removal or alteration of that information, (3) that the removal or alteration was done intentionally, and (4) that the removal or alteration was done knowing or having reasonable grounds to know that it would induce, enable, facilitate, or conceal copyright infringement.
17 U.S.C. § 1202(b); Stevens v. CoreLogic, Inc., 899 F.3d 666, 673 (9th Cir. 2018) (discussing the mental state elements); O’Neal v. Sideshow, Inc., 583 F. Supp. 3d 1282, 1286-87 (C.D. Cal. 2022) (discussing other elements).
(i) Failure to Allege Removal from Identical Copies. Plaintiffs’ claim under § 1202(b) arises out of the allegation that CMI was removed from Plaintiffs’ code. But in order to prevent § 1202 from subsuming every copyright dispute, courts have interpreted “removal” in the § 1202 context to require that there was some identical copy of the plaintiff’s work made without the plaintiff’s CMI.
See, e.g., Kelly v. Arriba Soft Corp., 77 F. Supp. 2d 1116, 1122 (C.D. Cal. 1999) (requiring that CMI was removed from “a plaintiff’s product or original work”), aff’d and rev’d in part on other grounds, 336 F.3d 811 (9th Cir. 2003). Where a defendant makes a copy of a plaintiff’s work that is substantially similar, but not identical, to the plaintiff’s work, and omits CMI from that copy, there may be a claim for copyright infringement, but there cannot be a claim under § 1202.
See Frost-Tsuji Architects v. Highway Inn, Inc., No. CIV. 13-00496 SOM, 2015 WL 263556, at *3 (D. Haw. Jan. 21, 2015), aff’d, 700 F. App’x 674 (9th Cir. 2017) (“But the drawing by [the defendant] is not identical to the drawing by [the plaintiff], such that this court can say that [the defendant] removed or altered [the plaintiff’s] copyright management information from [the drawing].”); id. (“basing a drawing on [the plaintiff’s] work is not sufficient to support a claim” under § 1202); Kirk Kara Corp. v. W. Stone & Metal Corp., No. CV 20-1931-DMG (EX), 2020 WL 5991503, at *6 (C.D. Cal. Aug. 14, 2020) (dismissing DMCA claim because “while the works may be substantially similar, Defendant did not make identical copies of Plaintiff’s works and then remove engraved CMI”).
Here, Plaintiffs concede that Copilot does not generate identical copies:
• “[T]he Output is often a near-identical reproduction of code from the training data.” (Compl. ¶ 46 (emphasis added));
• “Codex has reproduced Haverbeke’s Licensed Material almost verbatim, with the only difference being drawn from a different portion of those same Licensed Materials.” (Id. ¶ 60 (emphasis added));
• “Like the other examples above—and most of Copilot’s Output—this output is nearly a verbatim copy of copyrighted code. In this case, it is substantially similar to the “isPrime” function in the book Think JavaScript by Max X. Curinga et al., . . . .” (Id. ¶ 74 (emphasis added).)
Because Plaintiffs affirmatively allege that the output at issue is not identical to the allegedly copied material, they have pled themselves out of court on the § 1202 claim, and it should be dismissed with prejudice.
(ii) Failure to Identify the Works. The § 1202 claim is also subject to dismissal because Plaintiffs have not sufficiently identified any works from which CMI was allegedly removed. See Free Speech Sys., LLC v. Menzel, 390 F. Supp. 3d 1162, 1175 (N.D. Cal. 2019) (dismissing DMCA claim because Menzel “merely alleged that his photographs ‘were altered to remove certain of [his] copyright management information’ without providing any facts to identify which photographs had CMI removed or to describe what the removed or altered CMI was”).
Plaintiffs merely allege generally that Defendants removed CMI from “Licensed Materials,” which they define broadly as “materials made available publicly on GitHub that are subject to various licenses containing conditions for use of those works.” (Compl. ¶¶ 1, 148.) The complaint highlights Plaintiffs’ imprecision, reciting in the DMCA cause of action that Copilot was “trained on millions—possibly billions—of lines of code.” (Id. ¶ 143.)
But the few specific instances the complaint points to are not examples of Plaintiffs’ own code, but snippets from third-party programming textbooks. (Id. ¶¶ 56-61, 71-75.) That code is not the subject of Plaintiffs’ DMCA allegations, as it does not fall within the complaint’s definition of “Licensed Materials.”
Without identifying specific works from which CMI was removed, Plaintiffs fail to state a claim for CMI removal.
(iii) Failure to Adequately Plead Scienter. Plaintiffs also have not pled facts sufficient to meet the “double-scienter” requirement of Section 1202(b)(3), which requires “the defendant who distributed improperly attributed copyrighted material must have actual knowledge that CMI ‘has been removed or altered without authority of the copyright owner or the law,’ as well as actual or constructive knowledge that such distribution ‘will induce, enable, facilitate, or conceal an infringement.’” Mango v. BuzzFeed, Inc., 970 F.3d 167, 171 (2d Cir. 2020).
“[T]he plaintiff must provide evidence from which one can infer that future infringement is likely, albeit not certain, to occur as a result of the removal or alteration of CMI.” CoreLogic, 899 F.3d at 676 (finding CoreLogic not liable for violating § 1202(b) because photographers had “not put forward any evidence that CoreLogic knew its software carried even a substantial risk of inducing, enabling, facilitating, or concealing infringement, let alone a pattern or probability of such connection to infringement”).
Here, Plaintiffs have not alleged facts sufficient to establish a substantial risk that any copyright infringement has occurred or that any future infringement is likely because of the removal of CMI, nor that any of the OpenAI Entities had reason to know of any such likelihood.
They have not alleged, for example, copying of protectible expression: that is, that the allegedly copied code was original, that there was no merger of idea and expression, and that the allegedly copied code did not represent “scènes à faire.”
See, e.g., Oracle Am., Inc. v. Google Inc., 872 F. Supp. 2d 974, 984-997 (N.D. Cal. 2012) (summarizing the various doctrines limiting copyright in computer programs). And they would need to allege that any copying was not fair use—a heavy burden in light of the Supreme Court’s holding in the source-code context that “taking only what was needed to allow users to put their accrued talents to work in a new and transformative program . . . was a fair use of that material as a matter of law.” Google LLC v. Oracle Am., Inc., 141 S. Ct. 1183, 1209 (2021); see also, e.g., Authors Guild v. Google, Inc., 804 F.3d 202, 225 (2d Cir. 2015) (copying of millions of books for the purpose of searching them and providing relevant snippets to users was fair use).
Finally, they would have to identify with specificity which work or works were copied and specify which defendant is alleged to have infringed which particular copyright. Lynwood Invs. CY Ltd. v. Konovalov, No. 20-CV-03778-MMC, 2022 WL 3370795, at *19 (N.D. Cal. Aug. 16, 2022) (dismissing claim). All of these are substantial hurdles to showing that Defendants had reason to know that they would cause or further copyright infringement. The complaint meets none of them.
b. Plaintiffs Have Failed to Plead a Claim for Distributing Copies of Works from Which CMI Has Been Removed.
Plaintiffs’ claim that Defendants have distributed copies of code from which CMI has been removed fails for the same reasons as their claim for removal of CMI. 17 U.S.C. §§ 1202(b)(2), 1202(b)(3). See Kirk Kara, 2020 WL 5991503, at *6 (applying same 1202(b)(1) analysis to distribution claims); Dolls Kill, Inc. v. Zoetop Bus. Co., No. 2:22-cv-01463-RGK-MAA, 2022 WL 16961477, at *3-4 (C.D. Cal. Aug. 25, 2022) (concluding no DMCA violation for complaint that defendants “are distributing knockoff products” where the works were not identical and only had “certain[] similarities”); Mango v. BuzzFeed, Inc., 356 F. Supp. 3d 368, 376 (S.D.N.Y. 2019), aff’d, 970 F.3d 167 (2d Cir. 2020) (in view of few decisions involving DMCA’s distribution prohibitions, looking to CMI removal caselaw for guidance).
Plaintiffs have not specifically identified any such copies; the supposed copies are not identical; and Plaintiffs have not shown the requisite scienter.
c. Plaintiffs Have Failed to Show that OpenAI Has Conveyed Any False CMI in Connection with Copilot Outputs.
Plaintiffs’ claim that Defendants have conveyed false CMI (Compl. ¶¶ 158-159) is also fundamentally flawed. 17 U.S.C. § 1202(a). The DMCA defines CMI as any information identifying the work, its author or copyright owner, and the terms and conditions of use, or “links to such information,” “conveyed in connection with copies . . . of [the] work.” 17 U.S.C. § 1202(c) (emphasis added).
Courts require that the allegedly false CMI’s location suggest an association with plaintiff’s work. See SellPoolSuppliesOnline.com, LLC v. Ugly Pools Ariz., Inc., 804 F. App’x 668, 670-71 (9th Cir. 2020) (affirming grant of summary judgment against plaintiff’s false CMI claim, finding defendant’s copyright notice at the bottom of the webpage was not “conveyed in connection with” plaintiff’s photos); Logan v. Meta Platforms, Inc., No. 22-cv-1847-CRB, 2022 WL 14813836, at *8 (N.D. Cal. Oct. 25, 2022) (finding copyright notice on the bottom of each Facebook user page separated from the rest of the content insufficient to plead that Meta conveyed CMI in connection with plaintiff’s photos).
Once again, Plaintiffs have not pointed to any specific code that Defendants conveyed containing false CMI, as pleading standards require. See § IV.C.1.a, supra.
Moreover, Plaintiffs have not properly alleged that the OpenAI Entities have conveyed any CMI at all. Plaintiffs merely allege that “Defendants have a business practice of asserting and/or implying that Copilot is the author of the Licensed Materials.” (Compl. ¶ 158.) But that is not the same as conveying CMI “in connection with copies” of the work. And in the complaint’s examples of Codex-generated code, there is no CMI presented whatsoever. (See id. ¶¶ 49, 69.)
This court case 4:22-cv-06823-JST retrieved on September 8, 2023, from DocumentCloud.org is part of the public domain. The court-created documents are works of the federal government, and under copyright law, are automatically placed in the public domain and may be shared without legal restriction.