What are the scales of justice for the artist, and for the AI?
If we don't sort out the legal issues surrounding AI-generated content properly, it could have serious implications for the future of thousands of jobs.
As such, I believe it's better to address these issues now rather than later, as the potential damage grows with the increased use of AI across all industries.
Throughout this article, you will see the term "works" frequently used to refer to any content to which copyright can be assigned, including images, artworks, and written text. To avoid confusion, the user of the AI model will be referred to as the "human operator"; this is separate from the "creator" of the AI model itself.
For the purposes of this article, we will primarily be discussing the legal issues surrounding AI art generators, as they are currently at the forefront of this debate. However, the proposed solution discussed here is applicable to all AI models, including text-based ones.
To oversimplify an extremely complicated topic that could fill multiple legal essays on its own, the current legal status of AI-generated content is undecided, with lawyers and members of the public advocating for either side.
One high-profile case (which is not yet finalized) involves the US Copyright Office going back and forth on the copyright status of an AI-generated comic (link here).
At its core, the AI copyright debate revolves around three major questions:
In general, these are the major arguments for AI.
Along with the following three major opposing arguments:
There are many additional arguments for and against AI that have been made by both sides, but they are not included here because they are either not as significant as the key points or they are based on ethical considerations rather than existing legal codes.
For example, one notable argument that was left out is the "for progress" proposition: that AI should be allowed to learn from copyrighted materials to encourage progress in society, and that resisting such progress is harmful to society. This argument is based on ethical and societal considerations, not on existing legal code.
As I am only skimming the details of the debate, if you want a more in-depth read, consider The Verge article "The scary truth about AI copyright is nobody knows what will happen next".
Finally, to state the obvious, I am not your lawyer, and the following is not legal advice. Also, copyright laws differ drastically from country to country (Japan, for example, does not fully recognize fair use).
Now, with all of that out of the way ...
*with various limitations depending on the process used in each step
More specifically, the following sequence is proposed: original works are licensed by their rights holders; datasets are treated as derivative works of those original works; AI models are treated as derivative works of their datasets; and generated outputs are treated as derivative works of the model, the human operator's input, and any input material.
This proposal allows us to divide the copyright problem into distinct steps and rely on existing legal frameworks and case law, particularly the legal framework revolving around "collaborative authorship" of open-source code. It also allows the three questions above to be answered distinctly.
This proposal avoids getting stuck in the debate over whether an AI creates work "like a living being."
It also helps to create a clear path forward, in a fair way, for all parties involved.
One major factor that is often overlooked in the debate is that copyright is not binary: permissive licenses lie between fully restrictive copyright and no copyright at all.
This proposal takes into account how permissive licenses are treated in the process and uses such licenses as a means to navigate the legal debate.
Because permissive licenses have been a hallmark of the open-source movement in software (e.g. the MIT license), they help anchor the proposed approach in existing case law. (The list of open-source licenses can be found here.)
Outside the software industry, Creative Commons is the most prominent variant, with over 2 billion works tracked in circulation.
Broadly speaking, permissive licenses are used within three major categories:
The first category does not permit derivative works at all: BY-ND and BY-NC-ND. It is worth reminding the tech bros that no consent (to make derivatives) means no consent.
The second category permits derivatives only under the same terms (share-alike): BY-SA and BY-NC-SA; for software code, an example would be the GPL/AGPL/LGPL licenses.
The third category requires only attribution: BY, along with its non-commercial variant BY-NC; for software, an example would be the MIT license.
Placing an early prediction: there will eventually be a derivative of the Creative Commons license which specifically prohibits usage in a dataset or for AI training. Maybe BY-NADD (No AI nor Dataset Derivatives).
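For the programmers reading, here is a minimal sketch in Python of the taxonomy above as a machine-readable mapping. The identifier strings are informal shorthand rather than exact SPDX identifiers, and "CC-BY-NADD" is the purely hypothetical variant predicted above.

```python
# A rough machine-readable sketch of the three categories described above.
# The identifier strings are informal shorthand, not exact SPDX identifiers.

LICENSE_CATEGORIES = {
    "no_derivatives": ["CC-BY-ND", "CC-BY-NC-ND"],
    "share_alike": ["CC-BY-SA", "CC-BY-NC-SA", "GPL", "AGPL", "LGPL"],
    "attribution": ["CC-BY", "CC-BY-NC", "MIT"],
}

# Hypothetical future addition, as predicted above (does not exist today):
HYPOTHETICAL = ["CC-BY-NADD"]  # "No AI nor Dataset Derivatives"

def category_of(license_id: str) -> str:
    """Return the broad category a license identifier falls under."""
    for category, licenses in LICENSE_CATEGORIES.items():
        if license_id in licenses:
            return category
    return "unknown"

print(category_of("CC-BY-SA"))    # -> "share_alike"
print(category_of("CC-BY-NADD"))  # -> "unknown" (not a real license today)
```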
Permissive licenses are typically written so that derivative works, any collection that includes the work, and any distribution of it must carry the same limitations (attribution, non-commercial requirements, and so on).
Measures must be taken to ensure the same limitations apply to the dataset, the AI model, and their generated content (similar to open-source licensing).
Failure to do so would break the current proposal, due to potential loopholes that may allow commercial companies to fund the creation of public datasets under "fair use" only to eventually use them commercially.
For example: a dataset might start out in a way that "does not create a competing product," but due to its open-source nature, it can end up powering a tool used in commercial settings to replace specific living artists, identified by their name, who had their images scraped into the dataset without consent.
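To make the required downstream flow concrete, here is a minimal sketch, assuming a simplified model where each work is reduced to the set of clauses it enforces. The clause names and the scenario are illustrative only, not a statement of how any real dataset or model is licensed.

```python
# A minimal sketch of the "same limitations must flow downstream" requirement.
# Each work is reduced to the set of clauses it enforces; clause names are
# illustrative, loosely following Creative Commons terminology.

def violates_downstream_terms(upstream_clauses: set[str],
                              derivative_clauses: set[str]) -> bool:
    """True if the derivative dropped any clause the upstream work enforces."""
    return not upstream_clauses.issubset(derivative_clauses)

# The loophole scenario: a dataset built from non-commercial works,
# later repackaged into a model with the non-commercial clause dropped.
dataset_clauses = {"attribution", "non-commercial"}
model_clauses = {"attribution"}  # non-commercial silently dropped

print(violates_downstream_terms(dataset_clauses, model_clauses))  # -> True
```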
By classifying downstream datasets and AI models as derivative works of their upstream content, copyright can be respected across the chain.
This is similar to how open-source projects work in the software industry, allowing existing laws to work on each layer.
In all cases, attribution needs to be maintained up the chain.
For example, all AI models must attribute their datasets, which must in turn provide attribution for their sources.
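As a rough illustration of this chain, here is a minimal sketch in Python, assuming each work (source artwork, dataset, AI model, generated output) records its license and the upstream works it derives from. All the names and licenses in the example are made up for illustration.

```python
# A minimal sketch of the proposed attribution chain: each work records its
# license and the upstream works it was derived from, so attribution can be
# collected all the way up the chain.

from dataclasses import dataclass, field

@dataclass
class Work:
    name: str
    license: str                      # e.g. "CC-BY", "CC-BY-NC", "All rights reserved"
    derived_from: list["Work"] = field(default_factory=list)

def attribution_chain(work: Work) -> list[str]:
    """Walk up the derivation chain, collecting every upstream attribution."""
    attributions = []
    for upstream in work.derived_from:
        attributions.append(f"{upstream.name} ({upstream.license})")
        attributions.extend(attribution_chain(upstream))
    return attributions

# Example chain: source artwork -> dataset -> AI model -> generated image
artwork = Work("artist_illustration_001", "CC-BY")
dataset = Work("example_public_dataset", "CC-BY", derived_from=[artwork])
model = Work("example_art_model", "CC-BY", derived_from=[dataset])
output = Work("generated_image", "CC-BY", derived_from=[model])

print(attribution_chain(output))
# -> ['example_art_model (CC-BY)', 'example_public_dataset (CC-BY)',
#     'artist_illustration_001 (CC-BY)']
```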
Once this is established, there will be three major types of AI models: those licensed under BY or something similar, those licensed under BY-NC or something similar, and those sub-licensed under custom terms set by the rights holders.
Rights holders (i.e. artists) who do not want any of their works to be used to train an AI model, or to be used by others as derivative works, decide and stay in control via their licensing terms.
Rights holders who do allow derivative works get to decide whether use is restricted to non-commercial (non-competing) purposes.
Finally, rights holders, especially those with very distinctive styles, can sub-license their data or AI model and set their own terms of use, typically as a way to commercialize their unique style.
Using the AI model is more nuanced. Depending on whether the AI is adding IP value or not, there are two major categories that need to be outlined.
The general rule of thumb is that the output would be the derivative of all three participants: the AI model, the human operator (via a prompt typically), and the input material (if any, for example, editing or redrawing an existing image).
This is because the final output would literally be a derivative of all three inputs.
If any of the three enforces a copyright clause (i.e. attribution, no commercial use), so will the derivative output. If the AI model follows the Creative Commons model and the "human operator" has full rights to the input, the "human operator" would have full rights to the output, subject to the attribution clause (and the non-commercial clause, in the case of a non-commercial-use AI).
If it's just the "Human Operator" and the AI, the above structure still applies.
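As a minimal sketch of this rule, assuming each participant's terms are reduced to a set of enforced clauses (the clause names are illustrative), the output simply inherits the union of all three:

```python
# A minimal sketch of the "derivative of all three participants" rule:
# the output inherits every clause enforced by the AI model, the human
# operator's input, and any input material. Clause names are illustrative.

def output_clauses(model_clauses: set[str],
                   operator_clauses: set[str],
                   input_clauses: set[str]) -> set[str]:
    """The output carries the union of all clauses enforced by its inputs."""
    return set(model_clauses) | set(operator_clauses) | set(input_clauses)

# A CC-BY style model, an operator with full rights to their own prompt,
# and a CC-BY-NC input image being redrawn:
print(output_clauses(
    model_clauses={"attribution"},
    operator_clauses=set(),  # operator holds full rights, adds no clause
    input_clauses={"attribution", "non-commercial"},
))
# -> {'attribution', 'non-commercial'}
```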
This is currently limited to AI text models, where the AI is used in a way that does not add IP value, for example in short summarization, labeling, or factual extraction.
In this case, if the output has no additional content added and is simply a reduction of the input, per the user's instructions, and was produced in a way that clearly maps to those instructions, then the AI has arguably not "added any IP or context".
As such, the AI model should be treated as a tool/extension of the human operator, and the output copyright should be as if the "human operator" had performed the said instructions manually. (Need legal feedback on this segment.)
Quite frankly, no. The bar has been raised for being an artist.
If the artist's primary job was to render commissioned artwork in a commonly known style that is out of copyright (e.g. Vincent van Gogh's style) or is very common in the public domain, then this won't stop AI from taking artist jobs.
However, if the artist were to make or remix a newer style that is unique and distinctive to them, then this prevents their new style or likeness from being copied. It allows them to commercialize and be rewarded for their style on their own terms, before it eventually enters the public domain with time.
Ultimately, allowing them to make a name and career for themselves with their unique style.
Into this strange new AI world.
As I am neither a professional artist, lawyer, nor politician, this may just end up being a programmer screaming into the crowd. However, if this approach makes sense to you, especially if you are an artist, do give feedback, forward it to as many legal friends as you can, and start organizing a movement towards this.
For any lawyers who wish to refine this into a legal paper, please do so; reach out to me and provide credit.
The only paper I can find exploring this angle is "AI Derivatives: the Application to the Derivative Work Right to Literary and Artistic Productions of AI Machines"
At the end of the day, I wish that we all steer in a direction where both AI and artists can work together hand in hand, whether with what's proposed here or with any other alternative proposal that respects both sides.
~ Until next time 🖖 live long and prosper
Eugene Cheah @ tech-talk-cto.com
For more on how AI works by dreaming, consider my other article …
Originally posted on the author's substack: https://substack.tech-talk-cto.com/p/how-opensource-style-licensing-can
All images are used with their proper attribution.