paint-brush
Does Copilot Reproduce the Code of Others Without Attribution?by@legalpdf

Does Copilot Reproduce the Code of Others Without Attribution?

by Legal PDF: Tech Court CasesSeptember 3rd, 2023
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

DOE vs. Github (amended complaint) Court Filing (Redacted), June 8, 2023, is part of HackerNoon’s Legal PDF Series.
featured image - Does Copilot Reproduce the Code of Others Without Attribution?
Legal PDF: Tech Court Cases HackerNoon profile picture

DOE vs. Github (amended complaint) Court Filing (Redacted), June 8, 2023, is part of HackerNoon’s Legal PDF Series. You can jump to any part in this filing here. This is part 20 of 38.

VII. FACTUAL ALLEGATIONS

F. Copilot Reproduces the Code of the Named Plaintiffs Without Attribution

2. Example: Copilot Outputs the Code of Doe 1 in Modified Format


105. The second example demonstrates Copilot suggesting a modified copy of code written by Doe 1. To protect Doe 1’s identity, the paragraphs describing the code will be redacted.


106. (Redacted) subject to the MIT License. (Redacted)


107. When Copilot is prompted with (Redacted)


The first suggestion from Copilot is a modification of Doe 1’s code:

(Redacted)


108. (Redacted) do not appear in any other source file on GitHub. The only way Copilot knows how to make this suggestion is because it ingested Doe 1’s source file as training data. Though the Copilot suggestion is not an exact match for Doe 1’s code, it is necessarily a modification based on a copy of Doe 1’s code.


109. Furthermore, many distinctive expressive features of Doe 1’s code have been preserved in Copilot’s suggestion. For instance, Doe 1’s comments in the code (in green) are reproduced almost verbatim. (Redacted) means the same thing as this Copilot-suggested code: (Redacted)


110. As is apparent from a cursory glance of this example, the variations between Copilot’s emitted output and Doe 1’s source code are cosmetic and the code is functionally equivalent; it follows that Copilot’s output is a copy of Doe 1’s code.


111. That said, Copilot also introduces mistakes into the code. For instance, (Redacted)


112. Still, because Copilot is reproducing Doe 1’s algorithm in modified format, and the obligations in Doe 1’s license (the MIT License) carry with the code even if the underlying code is modified, the Copilot suggestion needs to follow the requirements of Doe 1’s license for that code, including providing attribution. It does not. Copilot also did not reproduce Doe 1’s license.


Continue Reading Here.


About HackerNoon Legal PDF Series: We bring you the most important technical and insightful public domain court case filings.


This court case 4:22-cv-06823-JST retrieved on August 26, 2023, from Storage Courtlistener is part of the public domain. The court-created documents are works of the federal government, and under copyright law, are automatically placed in the public domain and may be shared without legal restriction.