Adversarial examples, future of deep learning, security and attacks

In the "Deep Learning bits" series, we will not see how to use deep learning to solve complex problems, as we do in the A.I. Odyssey series. We will rather look at different techniques or concepts related to Deep Learning.

Introduction

In this article, we are going to talk about adversarial examples and discuss their implications for deep learning and security. They must not be confused with adversarial training, which is a framework for training neural networks, as used in Generative Adversarial Networks.

What are Adversarial Examples?

Adversarial examples are handcrafted inputs that cause a neural network to predict a wrong class with high confidence.

Usually, neural network errors occur when the image* is either of poor quality (bad cropping, ambiguous, etc.) or contains multiple classes (car in the background, etc.). This is not the case for adversarial examples, which look like ordinary images.

* In this post, we will focus on images as they provide interesting visual support, but keep in mind that this can be applied to other inputs, such as sound.

While the first two kinds of mistakes are understandable, an adversarial example can look unmistakably like a temple, and we can think that any properly trained neural network definitely should be able to make the correct prediction.

What's going on here, then?

The specificity of adversarial examples is that they do not occur in natural data; they are crafted. An adversarial attack against a neural network is a process in which someone slightly modifies an image so that it fools the network. The goal is to minimize the perturbations to the original image, while obtaining a high confidence for the target class.

[Image: Creation of an adversarial example to target the class "ostrich"]

How is this done?

The generation of adversarial examples is a vast topic, and new techniques are being discovered to create faster, more robust perturbations with minimal image distortion. We will not dwell on how these are generated, to rather focus on their implications. But a general principle and simple method is to take the original image, run it through the neural network, and use the backpropagation algorithm to find out how the pixels of the image should be modified to reach the target class.
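To make this concrete, here is a minimal sketch of that gradient-based idea, assuming a pretrained PyTorch image classifier. The names (`model`, `image`, `target_class`, `epsilon`) are placeholders of my own, not taken from the article or any particular paper, and this single-step version is only one of many ways to craft such perturbations.

```python
# Illustrative sketch: nudge every pixel in the direction that makes the
# network more confident in a chosen target class (e.g. "ostrich").
# Assumes `model` is a PyTorch classifier and `image` is a (1, 3, H, W)
# tensor with values in [0, 1]; both are placeholders.
import torch
import torch.nn.functional as F

def targeted_perturbation(model, image, target_class, epsilon=0.01):
    model.eval()
    adv = image.clone().detach().requires_grad_(True)

    logits = model(adv)                                  # forward pass
    target = torch.tensor([target_class], device=image.device)
    loss = F.cross_entropy(logits, target)               # loss w.r.t. the *target* class
    loss.backward()                                      # backprop down to the pixels

    # Step against the gradient, i.e. decrease the loss for the target class,
    # while changing each pixel by at most epsilon.
    adv = adv - epsilon * adv.grad.sign()
    return adv.clamp(0.0, 1.0).detach()                  # stay a valid image
```

In practice this kind of step is often repeated several times with a small epsilon, which tends to produce perturbations that are invisible to a human while still fooling the network.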
Is this really a big deal?

The first thing that we usually think of when we see adversarial examples is that they are unacceptable. As humans would classify them correctly without breaking a sweat, we intuitively expect any good model to do so. This reveals our intrinsic expectations for a neural network: we want human or super-human performance.

"If a model fails to classify this as a Temple, then it's necessarily bad." Or is it?

Let's step back for a minute and think about what this means. On a given task, e.g. identifying road signs in a self-driving vehicle, we wouldn't replace the human by a computer unless it is at least as good as the human. Something we often forget is that having a model that is better than a human does not imply any requirement on the failure cases. In the end, if the human has an accuracy of 96%, and the neural network of 98%, does it really matter that the examples the machine missed are considered easy?

The answer to this question is yes... aaaand no.

Even though it's frustrating and counter-intuitive to see state-of-the-art models fail on what look like trivial examples, this doesn't represent a fundamental issue. What we care about is how powerful and reliable the model is. We have to accept that our brain and deep learning do not work in the same way, and therefore don't yield the same results.

"Do we care whether the examples the machine missed are never missed by a human?"

What does matter, though, is that adversarial attacks represent a security threat to AI-based systems.

How can we maliciously exploit adversarial examples?

Many kinds of Deep Learning powered systems could severely suffer from adversarial attacks if someone got their hands on the underlying model. Here are some examples:

- Upload images that bypass safety filters
- Create bots that don't get flagged by Google's "I'm not a robot" system

That's for the virtual world. However, implementing such attacks on real-life objects is significantly harder because of all the transformations involved when taking a picture of an object, but it is still possible.

[Image: Robust adversarial example in the wild, by OpenAI. The red bar indicates the most probable class for the image; here the cat is classified as a desktop computer.]

With this in mind, you could imagine:

- Stealing the identity of someone by wearing special glasses
- Misleading a self-driving car by altering traffic signs
- Disguising a weapon to avoid video detection
- Bypassing audio or fingerprint identification

[Image: Impersonating Milla Jovovich with custom glasses]

What can we do against that?

There are a few things we can do to mitigate this issue. First, we can think of keeping the model private. However, this solution has two big flaws.

First, secure systems should ideally be built following the Kerckhoffs-Shannon principle: "one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them". This means we shouldn't rely on the privacy of the model, because one day or another, it will be leaked.*

Then, some papers have been published on universal / model-independent adversarial attacks, which could potentially work for a specific task no matter which model is used.

* Note: That is the reason why all databases are encrypted. When you think about it, there is no "need" to encrypt your database if you think it will never be hacked.

On the good side, techniques like Parseval networks or defensive distillation are being developed to make neural networks more resilient to adversarial attacks. We can also train the model using both normal images and adversarial examples, to help the network disregard the perturbations (see the sketch at the end of this section).

[Image: Extrapolation of OpenAI's remark on transformations and perturbation magnitude]

Also, a team from OpenAI noticed that it becomes increasingly harder to find small perturbations when you want to make the adversarial attack robust to many transformations (rotation, perspective, etc.). We can imagine that some model could reach a point where there is no perturbation that is both resilient to all transformations and undetectable. We could then flag adversarial examples as anomalies and therefore be safe from adversarial attacks for this task, but this might be tricky to implement (e.g. requiring multiple cameras, etc.).
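As promised above, here is a minimal sketch of that "train on both normal and adversarial images" idea, assuming a standard PyTorch training loop. The names (`model`, `train_loader`, `optimizer`, `epsilon`) are placeholders, and the perturbation used here is the same single gradient step sketched earlier, applied untargeted (increasing the loss for the true label); real adversarial training schemes differ in how they craft and weight the adversarial samples.

```python
# Illustrative adversarial-training epoch: for each batch, craft perturbed
# copies of the images and train on clean + perturbed samples together.
import torch
import torch.nn.functional as F

def adversarial_training_epoch(model, train_loader, optimizer, epsilon=0.01):
    model.train()
    for images, labels in train_loader:
        # 1) Craft adversarial copies: move pixels in the direction that
        #    *increases* the loss for the true label.
        crafted = images.clone().detach().requires_grad_(True)
        F.cross_entropy(model(crafted), labels).backward()
        adv_images = (images + epsilon * crafted.grad.sign()).clamp(0, 1).detach()

        # 2) Train on the clean and adversarial versions together, so the
        #    network learns to disregard these small perturbations.
        batch = torch.cat([images, adv_images], dim=0)
        targets = torch.cat([labels, labels], dim=0)

        optimizer.zero_grad()
        F.cross_entropy(model(batch), targets).backward()
        optimizer.step()
```

Mixing clean and perturbed samples in a single batch is only one possible choice; other setups keep separate clean and adversarial loss terms and weight them differently.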
Conclusion

To conclude, adversarial examples are an incredibly interesting area of deep learning research, and progress is being made every day towards secure deep learning systems. It's paramount that research teams play both cop and robber, trying to make and break neural network classifiers.

As of today, adversarial attacks are beginning to represent a threat to deep learning-based systems. However, few systems rely blindly on neural networks for important verifications, and adversarial attacks are not versatile or robust enough to be applied at a large scale other than by research teams. In the years to come, more and more attacks against neural networks will become possible, with increasingly interesting rewards for the attackers. Hopefully, we will have developed strong defense techniques to be safe from such attacks by then.

Thank you for reading this post! Feel free to share it and follow me if you like AI-related stuff!