It’s the evening of May 10th, 2019. Ctrl Shift Face, a popular channel, uploads the following video to YouTube:
Over the next few months, the video goes viral, amassing nearly 10 million views by September on YouTube alone. That’s not counting reposts on other platforms, nor the media coverage it received.
The discourse surrounding this video, and the many others like it, reflects a general sentiment that ‘deep faking’, the process in which one face is taken and mapped onto another, is creepy, disturbing, and outright dangerous. But most criticism stops there. Even if every YouTube commenter voiced only these sentiments, they’d make up less than 0.2% of the overall view count; the vast majority are simply spectators.
So what makes this passive consumption dangerous? More than you might think. Let’s break down some of the key reasons.
It’s Open Source, and Anyone Can Access It
Ctrl Shift Face’s video has a note right in the About section, linking to the exact code they used. Here you go, I’ll save you the painstaking detective work needed in tracking it down.
Now, that code’s useless by itself. It comes neatly packaged, but not alongside the personal expertise you need to make it work - just like any other open source product.
The similarities to other successful open source products don’t end there, however. Take a closer look at the edit logs, or open issues. People are working together around the clock, continuously eradicating bugs, adding guides, and publishing manuals. Followers and contributors are growing in number, while the code itself gets consistently more effective. To a tech developer like me, it’s a beautiful thing; a testament to how dedicated coders can combine their expertise and allow for trending technology to evolve.
Yet, it terrifies me. There’s no doubt that this community of coders has mostly good intentions, but that itself is what’s frightening - they’ve been able to achieve this with very little. Imagine what a dedicated department within a large organization could do, with proper funding, recruitment, and resource allocation. Individuals with more malicious intent can also benefit from this hard, open work, not just the enthusiasts involved.
The tenacity and creativity of developers can be awe-inspiring, but it also presents an opportunity for those who wish to abuse open source projects.
The majority of popular deep fake videos are obviously branded as such, using well-known celebrities and scenarios as the basis for their content. Their purpose is entertainment, which lowers our natural defenses and mistrust until we see deep fakes as a comedic tool, or an internet joke. Videos where celebrities gain the faces of other celebrities are pretty harmless, and they’re the most common.
It’s easy for this format to become harmful. Most deep fakes today are openly trying to entertain, not deceive, and rely on obvious likenesses for their entertainment value. Replace the people and scenarios with ones you don’t know, however, and if it’s done properly, it’s hard to tell what is real. Arnold Schwarzenegger’s face on Bill Hader’s body is Arnold Schwarzenegger’s face on Bill Hader’s body, but John Doe’s face on Joe Public’s body is simply a new, artificial person.
Uncanny valley discussions have been around for a long time, but they often rely on something being off in an attempted replication. Deep faking isn’t replication; it’s a seamless combination of sources, so it can circumvent our seemingly innate ability to register the ‘not quite right’.
This ability to slip past our senses only strengthens the more you’re exposed to it.
Mass Media is Diluting Dangers
The easiest way to doom a trend in modern America is for it to be noticed by Jimmy Kimmel. Take a look at this clip from earlier this week:
The exposure of deep fake technology via channels like this accentuates my previous point. Here, we see the technology used in an openly artificial way that adds to the joke. It may seem like this is evidence of the harmless, comedic potential of deep faking, but it merely creates a common perception that this is the norm.
And guess what? It is. The most prevalent use for deep fake technology will always be entertainment, whether it’s remastering old films efficiently, improvising a skit on a talk show, or adapting whatever emerges from internet culture next. This bulk of benign content leaves ample room for more sinister uses, hidden beneath the normality. If someone is exposed to entertaining deep fakes daily, they’re more vulnerable to that annual, targeted smear campaign or fake news video.
It’s Becoming Easier to Utilize
So far, I’ve talked about deep fakes as a general idea or open source technology. This freedom of potential is lost when it’s packaged as a specific product, and becomes completely subject to the team behind it (and their dodgy terms & conditions that were hastily agreed to).
Often, this is in the form of a mobile app. Even the best face-tuning apps seem jovial compared to viral deep fake videos, but that gap will only get smaller. A major reason for the gap is that those videos are often made on a powerful computer, whereas apps are restricted to the processing power of a mobile device. 5G could change this, however, as it brings massively reduced latency, allowing the extensive rendering and computation required to be done server-side, then transmitted to the phone.
This ease of access also improves companies’ ability to perfect the technology. For every face tuned, their machine learning algorithms get marginally smarter. The question is, who does this help, and who has access to this growing pool of information?
So What’s Exciting About All of This?
Fear and excitement are two degrees of the same reaction. When dwelling upon the concepts and realities of deep faking, your heart rate or cortisol levels may not surge, but there’s still an inherent dark side to any positive. I’ve lived in Silicon Valley for a long time, and have seen it work both ways - trends that seemed scary at first, like machine learning, drones and driverless cars, all yielded benefits as they developed.
I think we’re already seeing some of these positives emerge. Entertainment, a common thread throughout this article, has been permanently reshaped by deep fakes, with new opportunities emerging in adapting old footage and expanding new formats. But, in an ironic twist, this benefit seems fake; a piece of potential that is far outweighed by risk, to the point of the U.S. Military taking measures to combat it.
The question of whether these boons in entertainment are worth it is moot. The technology is here, it’s rapidly expanding, and it can’t be stopped; so instead, let’s think about a key benefit that I don’t see being discussed enough: education. Hell, if Bill Hader didn’t even know his impression received the deep fake treatment, how would the average person become savvy about how their data is at risk?
Deep fakes could be the driving force behind widespread, beneficial education - something that the modern tech industry needs. Many ML and AI concepts are quite abstract for non-technical audiences, but now we have a very visual way of showcasing their potential. The penetration into the entertainment sphere, bolstered by elements such as competition, could be the final link we need for people to take their data seriously.
The misguided notion that advanced, technical concepts are beyond the understanding of the common user has been around since the inception of the internet. Deep fakes are the perfect vehicle for a new zeitgeist that challenges it, as they adhere to and warp people’s very personas in a very dynamic way. We can’t change the level of impact deep fakes will have on technology, or society as a whole. We can, however, shape this impact for the better by educating people.