Chief Research Officer, Xayn / Dean and Head of the Department of Computing, Imperial College London
Want to avoid an insurrection or genocide? Disconnect AI from centralized databases now!
As the power of AI grows and the internet plays an ever greater role in our physical realities, we must act decisively to put the protection of user data at the forefront of any new developments in online products and services. The consequences of failing to protect personal privacy online could be another insurrection or genocide.
Artificial Intelligence has a key role to play in securing online privacy, despite all the recent news stories about AI in conjunction with the misuse of private data. An important distinction needs to be made. The problem in these stories was not AI per se but rather the design choice of powerful machine learning models being turned loose on mountainous, centralized databases of highly personal information. The mountain of data makes AI work very well in instances like facial recognition, but control can be fleeting and the risks -- as we have seen -- are extremely high.
Such collections of private data are not necessary for most of the products and services that internet users rely on today. It’s now up to the next generation of programmers and designers to prove that AI can deliver a richer experience without mining private data. It is our duty to put the blinders on AI through exploring decentralized data processing and solutions that are private and transparent by design. There are promising developments in both directions that deserve our full attention and dedication – without further delay.
Online privacy is getting more attention today than at any other time in the short history of the web. The debate correctly focuses on the fact that most options for protecting privacy online involve a tradeoff: inconvenience or cost. “I’ve got nothing to hide, so what do I care,” has been a popular refrain, and it’s fine as long as we stay on the level of the individual. What’s far more important than possible spying on any single person’s data is the danger of mass aggregations of private data. If a stranger knew everything about you, it would be creepy. But if a stranger knew everything about everyone, it would be a cause for outcry and legal investigation. In the words of Oxford professor Carissa Véliz, "It doesn’t matter if you think you don’t need privacy. Your society needs you to have privacy."
Worse yet, experience has shown us that strangers knowing everything about everyone is not only scandalous; it can be deadly. When the AI developed by Big Tech trains on mountains of personal data, the decisions it arrives at are shockingly impersonal. It learns everything about a user in order to maximize engagement, paying little attention to the valence of that engagement and to the destructive power of its emotional triggers. At this point “I have nothing to hide” stops being a protective mantra and starts hurting people.
One of AI’s biggest powers is identifying correlations and causal chains that human intelligence cannot register. When all our data is heaped together and turned inside out, seemingly harmless characteristics of ours can expose us to profiling and triggering operations we never thought possible. A detailed study of the 193 arrested US Capitol insurrectionists revealed that most of them did not fit the usual far-right extremist profile. However, we can presume they were fed a steady diet of algorithm-curated, cross-platform disinformation, and they all got the right triggers to push them towards violent action.
To prevent more bloodshed, the use of AI in conjunction with these huge collections of profile information must not only be strictly regulated; new AI technology should also make such mass aggregations obsolete. This will be much easier said than done. Most machine learning systems, including deep learning, require massive data inputs to function effectively. This is one of the structuring principles of modern AI. It’s not enough to simply design the program; one must also train the program on data collections, making continuous refinements to these programs and their structures until performance is optimized. Quantity and quality both matter when building artificial intelligence, so who is to say where the limits are for what a researcher can feed into a computer to build effective AI?
Luckily, there are ways to train AI effectively without having to ponder over ethical and moral boundaries. We owe that to the great leaps in personal computing that we take for granted today. All it takes is a comparably great leap in our approach to input data for our algorithms, one that focuses on the users as individuals with specific needs, not on how they fit into predefined target profiles geared towards mass engagement and thought control.
When artificial intelligence first emerged as a concept in the mid-20th century, the average computer could fill a spacious room. It was not particularly powerful or quick, and it had to be fed data in physical form: A human operator would literally place stacks of punched cards in the feeder for the computer to read. Computers gradually gained processing power and connectivity, but the idea of collecting data in one giant repository and feeding it to the machine persisted.
Now fast forward to today when many of us are carrying a mini-supercomputer in our pocket. The wide availability of personal computing devices like smartphones, tablets, or laptops has enabled a new kind of machine learning called ‘edge AI.’ For the first time, users’ own devices can run advanced algorithms, without any pressing need for centralized data collection and processing.
Edge computing and edge AI, as the name suggests, describe the computational and data processing that takes place on the end nodes of a network. In practical terms, edge AI means running and training algorithms on users’ devices and letting them operate locally instead of collecting local user data for centralized storage and processing. To borrow a phrase from Francis Bacon:
If the data will not come to the AI, then the AI will go to the data.
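What "the AI going to the data" can look like in practice is a model that lives and learns on the device itself. The following is a minimal, illustrative sketch, not any particular product's implementation: a tiny logistic-regression relevance model trained with stochastic gradient descent on a local interaction log that never leaves the device. All names (`EdgeRelevanceModel`, the synthetic engagement labels) are assumptions made for the example.

```python
import numpy as np

# Hypothetical on-device relevance model: a tiny logistic regression
# trained entirely on data that never leaves the device.
class EdgeRelevanceModel:
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def _sigmoid(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict(self, x):
        # Probability that the user finds this item relevant.
        return self._sigmoid(x @ self.w + self.b)

    def train_step(self, x, label):
        # One SGD step on a single local interaction (features, engaged?).
        err = self.predict(x) - label
        self.w -= self.lr * err * x
        self.b -= self.lr * err

# Synthetic stand-in for a private, local interaction log:
# feature vectors plus whether the user actually engaged.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

model = EdgeRelevanceModel(n_features=4)
for _ in range(5):              # a few passes over the local log
    for x, label in zip(X, y):
        model.train_step(x, label)
```

The point of the sketch is architectural rather than algorithmic: nothing in the training loop ever serializes `X` or `y` for upload, so the personalization stays where the person is.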
Not only is edge AI just as powerful as its centralized predecessor; it also brings a number of advantages that can improve the overall user experience. Since all data is processed locally, privacy becomes much less of a concern. Additionally, communications on such distributed networks are typically encrypted end-to-end, often with multiple layers, which makes them very secure. Moving AI data processing to the edge also resolves many latency issues, since information and commands do not need to travel to the central nodes and back.
Crucially, edge AI offers a fruitful foundation for developing innovative, large-scale distributed systems without investing a ton of resources of one's own, thanks to load balancing. If one follows the classic model of Big Tech, the massive pool of data requires considerable amounts of centralized computing power to process it meaningfully. Working ‘on the edges,’ conversely, distributes the processing among individual nodes, creating a much more resilient and adaptable structure. Taken together, the major characteristics of edge AI make it an ideal candidate for the future of personal computing and app development.
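One common way to realize this distributed division of labor is federated averaging: each device trains on its own private data and shares only model weights, which a coordinator averages. The sketch below is a simplified illustration under assumed conditions (a linear model, synthetic per-device data); the function names are hypothetical, not an existing API.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=3):
    """Train a linear model on one device's private data; return new weights."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # mean-squared-error gradient
        w -= lr * grad
    return w

def federated_average(weight_list):
    """Coordinator step: average the weight updates -- never the raw data."""
    return np.mean(weight_list, axis=0)

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])

# Three devices, each holding its own private samples.
devices = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    devices.append((X, y))

global_w = np.zeros(2)
for _ in range(20):                         # communication rounds
    updates = [local_update(global_w, X, y) for X, y in devices]
    global_w = federated_average(updates)
```

Note the load-balancing property the text describes: the expensive gradient computations run on the edge nodes, while the coordinator does nothing but averaging, and the raw samples never cross the network.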
AI has a bad name right now, and there are legitimate reasons for that: Big Tech is using AI’s enormous potential in its own interests, and it pours enormous resources into convincing you this is the only way. Algorithms bring “relevant” content to users, but this relevance is measured only in terms of business-oriented metrics such as site engagement, number of shares, and “stickiness.” These metrics surface content that maximizes the platform’s bottom line. As a toxic byproduct, they can lead to the alarming outcomes described above. Out of personal data comes impersonal engagement predicated on negative triggers and antisocial profiling.
Once we decouple AI technology from aggregated cross-personal data mining, we can begin to design AI that actually has the users’ interests in mind.
All this takes is a shift of focus from the center to the edges of the platform, network, service, or whatever other data- and AI-powered product we are considering. Edge AI has the power to achieve the convenience currently provided by Big Tech. Crucially, it will also allow users to control their own data and apply their own notions of relevance to their web usage. Putting users in the driver’s seat ensures that personal data is put to personal use, for personal benefit, and on personal devices.
Thanks to edge AI, we have the technology to give everyone in the world their own assistant. All you need is a device and an internet connection, and you could have a personal assistant that helps you navigate the world, see things you couldn’t see, be more productive, and learn about yourself in new ways. If this were truly your private assistant, then obviously, the more you revealed about yourself, the richer an experience you would have. But if all the assistants were talking to each other, or if they were all just copies of the same assistant, that would be a problem. Who would trust their assistant?
Setting strict regulations on the use of private data in AI is a hard ask because centralized data processing is deeply ingrained in our society, both by force of habit and by Big Tech’s selfish design. Mining and monetizing private data is a major component of the business models of many of the world’s biggest companies. The toxic byproducts have been social division, mass delusion, violence, and death. We no longer have the luxury of naiveté about the dangers that mass collections of private data involve. The time has come for web users to demand that personal data be owned and controlled only by the person generating it. The time has also come for developers to provide users with that choice. With internet access and smartphone ownership at record highs virtually everywhere, the conditions for this future are perfect.
Professor Michael Huth (Ph.D.) is Co-Founder and Chief Research Officer of Xayn and teaches at Imperial College London. His research focuses on cybersecurity, cryptography, and mathematical modeling, as well as security and privacy in machine learning. He served as the technical lead of the Harnessing Economic Value theme at the PETRAS IoT Cybersecurity Research Hub in the UK. In 2017, he founded Xayn together with Leif-Nissen Lundbæk and Felix Hahmann. Xayn offers a privacy-protecting search engine that enables users to regain control over the algorithms while providing a smooth user experience. Winner of the first Porsche Innovation Contest, the AI company has already worked successfully with Porsche, Daimler, Deutsche Bahn, and Siemens.
Professor Huth studied Mathematics at TU Darmstadt and obtained his Ph.D. at Tulane University, New Orleans. He worked at TU Darmstadt, Kansas State University and spent a research sabbatical at The University of Oxford. Huth has authored numerous scientific publications and is an experienced speaker on international stages.