This is the original thought that led me to post the following on Twitter:
Am I entitled to a refund from Microsoft if GPT-3 learned from me with data provided by Common Crawl?
To be honest, I would have expected an empathetic reply from the President of the European Commission, Ursula von der Leyen, or a friendly reaction from the CEO of Microsoft, Satya Nadella. But I fully understand they are lovely, busy human beings, so I am honoured to share my concerns about AI with the Hacker Noon community today.
The question still stands.
Please let me explain. I first heard about Common Crawl when catching up on the latest amazing news from OpenAI. For those of you who, like me, hadn't heard of it before, Common Crawl is a nonprofit organisation based in Los Angeles, California, that crawls the web every month and freely provides its archives and datasets to the public.
Access to free, open data is definitely a good thing in my opinion. It is great in the sense that anyone can download Common Crawl data for further analysis, research or hobbyist projects.
On the other hand, if this is the first time you're hearing about OpenAI, it is Microsoft's partner in the development of artificial general intelligence (AGI). In its own words, it aims to build safe and beneficial artificial intelligence products and services, which is to say highly autonomous systems that outperform humans at most economically valuable work.
Your job is at risk.
Or, put differently, the OpenAI API is now able to perform almost any kind of English language task, ranging from semantic search and summarization to sentiment analysis, content generation, translation, and more.
The key thing is: machines can learn English precisely because they are fed huge amounts of data written by people, like Common Crawl's dataset.
Let's open a debate!
What if OpenAI learned from your articles, forum contributions, comments, and writing style, gathered across the entire Web since 2011? Should companies then charge for their services and products if they are built on Common Crawl data? Would you be entitled to a refund?
Just to give an example, OpenAI used the Reddit TL;DR dataset to train a language model which is now able to summarize texts. OpenAI might well have learned from me since 2011, and possibly it's learning from you too right now in 2020 if you decide to leave a comment below. So you'd possibly have a right to a refund from Microsoft as well. It seemed obvious to me at the beginning: I'm not alone, and you are not alone. We are not alone.
Who owns the learning?
It took me a significant amount of time to learn how to write in English properly. After years of effort using it as a second language, I suppose today I am fairly decent at it, though only fairly, and depending on the task at hand. Of course my skills are not as good as those of a native speaker, which means I need some extra time to put things across, but I can live with that. And I'm happy.
Can you imagine my surprise? I learned that intelligent language models are now springing up like mushrooms, probably inspired by OpenAI's GPT models. Let's face it. Microsoft has always been a driving force in software development, as Silicon Valley has too, and AI writers are becoming trendy.
More and more companies are selling these products on the promise that they'll make you more productive. Look at this apparently real review of an automatic content creator, whose name is not disclosed:
People pay me $10/hour to write unique content. I just spin it with this amazing product and do nothing for almost one hour. All of my clients are thrilled with the quality. None of them can tell the difference!
Hang on a minute, and think about what this means to you.
Well, it's clear to me you just can't write a quality article in one hour without any previous preparation -- at least, I can't do that. Also I can't live on $10 an hour before taxes.
Somewhere along the line, a lie seems to be spreading.
Automatic writers in general might contribute to flooding the Web with irrelevant content, in other words rubbish, because we need to work, sometimes quickly, to make ends meet. As people write faster for the sake of a few cents, writers' fees might go down, and everyone ends up with a worse search engine experience. This could be a sign the digital dark age is approaching faster than you think.
Here's a dystopian prediction. The future Web will boil down to humans writing more and faster because they are paid less and less for it. Just take Upwork as an example. This freelancing platform charges contractors a 20% fee on bills below $500.
No, this is neither right nor normal in my opinion: $10 means $8 in reality on Upwork, even with the help of AI. Possibly it is high time to speak up. Poor lives matter!
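For anyone who wants to check that $8 figure, here is a minimal sketch of the arithmetic. It assumes a flat 20% fee taken from the whole bill; the function and variable names are made up for illustration, and amounts are kept in whole cents to avoid floating-point rounding.

```python
def net_cents(gross_cents: int, fee_percent: int = 20) -> int:
    """Return what the writer keeps, in cents, after the platform fee."""
    # Assumption: a flat fee of fee_percent on the whole bill, rounded down to a cent.
    fee_cents = gross_cents * fee_percent // 100
    return gross_cents - fee_cents


hourly_rate_cents = 1000  # the $10/hour from the review quoted above
kept = net_cents(hourly_rate_cents)
print(f"${kept / 100:.2f} kept out of ${hourly_rate_cents / 100:.2f}")
# prints "$8.00 kept out of $10.00"
```

And that is still before taxes.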
Professor Hawking was especially worried about the advancement of science and technology. He held it would accelerate the extinction of human beings. At the end of the day it is us, the people, and no one else, who must ask our politicians and big corporations, Elon Musk included, to please slow down and embrace degrowth.
Corporations, politicians, please...
Ursula von der Leyen, Satya Nadella, Upwork too, whoever might be reading this, please slow down and interact with my tweet above right now. Make me earn a few cents. Remember what the Queen of England said in her Christmas speech: you are not alone.
Rather than building things that make humans work faster and harder for less and less money, please invest in higher values such as mental health, climate change mitigation and well-being. Develop new regulations for the global digital economy.
You are not alone, and you really, really deserve a better new normal.