
The Copyright Battle Against AI: Closed vs. Open-Source AI

by Futuristic Lawyer, June 12th, 2023

In a recent interview with Vice about AI-generated music, UK-based lawyer Chris Mammen explains that the law moves slowly and evolves by analogy: “Something new comes up, and we figure out what it’s analogous to, and then that gradually becomes settled law”.

The problem we are facing now with generative AI – AI models that can generate creative output such as text, images, music, or videos – is the difficulty in coming up with analogies. In other words, relating generative AI to something we already know and understand. The underlying technology is so complex that understanding how it works on a conceptual level, and how it should be regulated, requires some serious mind expansion.

As with social media and the internet, AI models such as OpenAI’s ChatGPT or their text-to-image model DALL-E 2 are deceptively simple to use. Yet, there is obviously a lot going on under the hood that we don’t understand in the slightest. The gap between the user experience and all the complicated technical machinery underneath is where criminal and unethical stuff can go on unnoticed.

The Black Box Effect in Crypto

We have seen this “black box effect” clearly in the financial world, most recently in the crypto sector. Few crypto supporters had a deep technical understanding of how crypto worked (I certainly didn’t), and we didn’t know how the centralized exchanges were operated. In traditional finance, this is where we would typically rely on governmental vouching and oversight. But in an industry as new and complex as crypto, there was almost none. The relatively wide adoption, technical complexity, lack of oversight, and knowledge gap between developers and users laid out the perfect conditions for crime and exploitation en masse. In 2022, crypto exchanges collapsed in a cascade, over $3 billion was stolen from DeFi platforms, and hundreds of thousands of people were left in financial ruin.

The AI industry is of course very different from the crypto industry, but the same conditions for crime and exploitation are present. AI models are widely adopted, easier to use than crypto, and more technically complex; there is not much oversight, and the knowledge gap between users and developers is arguably even wider than it was with crypto. Luckily, there are many awareness campaigns on the dangers and risks of AI, whereas similar warnings about crypto drowned in the noise.

The use of copyrighted material in generative AI models is one area where existing laws and frameworks are challenged. In my post from last week, I wrote about the EU’s interpretation of foundation models. This week I will focus on the difference between closed-source and open-source AI models and introduce Stable Diffusion, a popular open-source AI image model which was hit with copyright lawsuits earlier this year from two different angles. I plan to publish another post about the lawsuits and their implications for copyright law within the next couple of weeks.

Open-Source vs. Closed-Source

Training foundation models is a costly affair in terms of time, money, and computational resources. In general, only Big Tech companies with deep pockets can afford to lay out the initial investment. By the same token, the companies behind foundation models generally have an interest in keeping AI closed-source. The multi-million-dollar costs of development and training are hard to recoup if competitors can access all the ingredients and use their secret sauce.

One important exception is Meta’s LLaMA, which Mark Zuckerberg and Meta’s AI research team controversially decided to make public. LLaMA is a large language model (LLM) released in different sizes, from 7B to 65B parameters. Even the small-to-medium-sized version, LLaMA-13B, can outperform OpenAI’s GPT-3, despite being 10x smaller. GPT-3 was groundbreaking and market-leading just three years ago.

Meta’s Chief AI Scientist Yann LeCun says that “the platform that will win will be open”. He argues that progress in AI is faster this way and that consumers and governments will refuse to embrace AI unless it is outside the control of companies like Google and Meta.

The counterargument to open-sourcing AI (which means making the source code available) is that bad actors can use the code to build nefarious applications, spread misinformation, commit fraud and cybercrime, and do lots of other bad stuff. Mark Zuckerberg recently received a letter from two US senators who criticized the decision to make LLaMA available to the public. The senators concluded in the letter that Meta’s “lack of thorough, public consideration of the ramifications of its foreseeable widespread dissemination” was ultimately a “disservice to the public.”

The Pull Towards Open-Source

Today, less than three months after its release, a bunch of open-source models stand on the shoulders of LLaMA. Vicuna-13B, for example, is an open-source chatbot that was trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT (a Chrome extension that allows users to share their conversations with ChatGPT). According to evaluations by GPT-4, Vicuna-13B achieves more than 90% of the quality of OpenAI's ChatGPT and Google's Bard with a training cost of around $300!
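
To make “fine-tuning LLaMA on user-shared conversations” a little more concrete, here is a minimal sketch of that kind of recipe using Hugging Face’s transformers library. It is not the actual Vicuna training code: the model identifier, the dataset file, and the hyperparameters are illustrative assumptions on my part.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

# Assumed Hugging Face identifier for the LLaMA-13B weights.
base_model = "huggyllama/llama-13b"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA's tokenizer has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

# Hypothetical JSON file of ShareGPT-style conversations, flattened to one
# "text" field per training example.
dataset = load_dataset("json", data_files="sharegpt_conversations.json")["train"]

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="vicuna-style-13b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    # Plain causal language modeling: the model learns to predict the next
    # token in each conversation, so masked-LM is disabled.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

The point of the sketch is how little is left to do once the weights are public: the expensive pre-training already happened at Meta, and the fine-tuning pass on shared conversations is the part that reportedly cost around $300 in compute.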

Regardless of competition and safety concerns, there is a strong pull toward open-sourcing AI. New and improved models are released every so often. On the Hugging Face Open LLM Leaderboard, the best-performing model right now is Falcon 40B, which recently dethroned Meta’s LLaMA. Falcon 40B was developed by the Technology Innovation Institute in Abu Dhabi with help from Amazon.

The jury is still out on whether open-source development could come to dominate the use of generative AI in the future. In a leaked internal Google document published by SemiAnalysis, a senior Google engineer argued that Google and OpenAI “have no moat” and will eventually be outcompeted by open-source AI. He writes that “Open-source models are faster, more customizable, more private, and pound-for-pound more capable”.

Stability AI and Stable Diffusion

One of the companies on the frontline of open-source AI is Stability AI. The company was founded by former hedge fund manager Emad Mostaque. According to its website, Stability AI has since its launch in 2021 amassed an army of more than 140,000 developers and seven research hubs throughout the world. The research community develops AI models for different purposes, such as imaging, language, code, audio, video, 3D content, design, biotech, and other scientific research.

The product Stability AI is best known for to date is Stable Diffusion, an AI image model that can generate or tweak images from text prompts. It was released in August 2022, not long after OpenAI’s viral internet sensation DALL-E 2 was released privately to 1 million users on the waitlist. Many in the AI community considered Stable Diffusion a revolutionary milestone. Not only did it match, or even exceed, the capabilities of contemporary, large, closed text-to-image models such as DALL-E 2 and Google’s Imagen, but it was also open-source.

According to Stable Diffusion’s license, anyone can use the model to create commercial applications, study its architecture, build on it, and modify its design within the scope of law, ethics, and common sense. Unlike closed-source image models, Stable Diffusion can be downloaded and run locally on an average gaming PC. For casual users without coding skills, Stable Diffusion can also be accessed via the web app DreamStudio or the new open-source web app StableStudio.
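
To see what “run locally on an average gaming PC” means in practice, here is a minimal sketch of generating an image with Stable Diffusion through Hugging Face’s open-source diffusers library. The checkpoint name and GPU assumptions are mine, not the article’s.

import torch
from diffusers import StableDiffusionPipeline

# Assumed Stable Diffusion checkpoint; any compatible weights can be substituted.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision so the model fits on a consumer GPU
)
pipe = pipe.to("cuda")

prompt = "a watercolor painting of a lighthouse at dawn"
image = pipe(prompt).images[0]  # one denoising run turns the text prompt into an image
image.save("lighthouse.png")

Because the weights are public, these few lines run entirely on your own hardware, which is precisely what separates Stable Diffusion from closed models like DALL-E 2.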

As a side story, Stable Diffusion was in fact developed by a team of researchers at Ludwig-Maximilians-Universität in Munich, while Stability AI funded the computing resources to train the model. Stability has been criticized for taking undue credit, as the university in Munich did all of the heavy lifting that resulted in Stable Diffusion. In an article published by Forbes last Sunday, Stability founder Emad Mostaque was portrayed as a pathological exaggerator with a tendency to lie. Prof. Dr. Björn Ommer, head of the research team behind Stable Diffusion, told Forbes that he had hoped to publicize his lab’s work, but his university’s entire press department was on vacation at the time (such things can only happen at public universities).

Stable Diffusion’s openness is a gift for researchers, as well as for governments, competitors, regulators, and bloodthirsty copyright advocates. In the last category, we find Matthew Butterick and his legal team, who represent three independent artists in a class-action lawsuit against Stability AI, Midjourney, and DeviantArt.

In Butterick’s words: “[Stable Diffusion] is a parasite that, if allowed to proliferate, will cause irreparable harm to artists, now and in the future.”

I would argue that Butterick is in some sense correct in his characterization of Stable Diffusion and modern AI image models. They do kind of suck the creativity out of original work, mash it all together on a massive scale, and threaten the livelihood of artists who have unwillingly and unknowingly helped to train these models with micro-contributions.

However, the class action lawsuit is riddled with so many legal and technical inaccuracies, misunderstandings, and shortcomings that I can only wonder if the legal team was high out of their minds when they wrote the first draft of the complaint. Another theory is that Butterick and co. are intentionally trying to misrepresent how the technology works to confuse the public or the judges. Hard to say.

In my next post, we will look further into the frivolous lawsuit and explain why it does not scratch the copyright itch in the right spot.

Also published here.