The rapid advancement of AI has provided powerful tools for malware detection, but it has also introduced new avenues for adversarial attacks. For example, OpenAI recently reported threat actors abusing ChatGPT to perform reconnaissance, debug code, write partial code, or research vulnerabilities. To me, these are examples of AI aiding "basic" steps, but would threat actors invest in and use more advanced applications?
Universal Adversarial Perturbations (UAPs) have gained attention due to their potential to bypass machine learning models in various domains, including malware detection. UAPs can manipulate malware in ways that evade AI-based detection systems without altering the malware's core functionality. However, despite this capability, cybercriminals have not widely adopted AI-driven techniques like UAPs. This blog delves into the complexity and effort required to generate UAPs for malware and explains why it might not be worth the trouble for attackers.
Just to be clear on definitions:
Artificial Intelligence (AI) is a broad field that aims to create machines or software capable of performing tasks that typically require human intelligence, such as understanding language, recognizing images, problem-solving, and decision-making. AI encompasses various techniques and approaches, from rule-based systems to learning algorithms.
Machine Learning (ML) is a subset of AI that focuses on building systems that learn from data. Instead of being explicitly programmed for each task, ML models identify patterns in data to make predictions or decisions, improving over time with more experience.
Universal Adversarial Perturbations (UAPs) are subtle modifications applied to input data (such as malware samples) to mislead AI models. What makes UAPs particularly interesting is that a single perturbation ("one ring to rule them all") can be applied to many different inputs, causing the AI model to misclassify all of them. Think of it as changing just a few pixels in a picture so that a powerful facial recognition system mistakes one person for another. In the example below, a single noise pattern is added to multiple different images, and the classifying model gets the identification completely wrong on each one.
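The "one perturbation, many inputs" idea can be illustrated with a toy sketch. The linear "classifier" below is a hypothetical stand-in (real detectors are far more complex); the point is only that one fixed perturbation vector, reused unchanged across many inputs, flips most of the model's decisions at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained detector: a linear classifier.
w = rng.normal(size=64)
classify = lambda x: int(x @ w > 0)

# Generate inputs and keep those the model labels class 1 ("malicious").
candidates = [rng.normal(size=64) + w / np.linalg.norm(w) for _ in range(100)]
inputs = [x for x in candidates if classify(x) == 1]

# ONE perturbation, reused for every input: push against the model's
# decision direction. This reuse is the "universal" part of a UAP.
delta = -3.0 * w / np.linalg.norm(w)

evaded = sum(classify(x + delta) == 0 for x in inputs)
print(f"{evaded}/{len(inputs)} inputs flipped by a single perturbation")
```

Note that nothing about `delta` is tailored to any individual input; the same vector degrades the classifier across the whole batch.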
In the platypus example, the model gets the identification partially right (it was trained on images with similar beaks), but the added "noise" in the pixels interferes and pushes the classification the wrong way. That is exactly the interesting space when it comes to malware detection and evasion: you want malicious files to be classified wrongly.
In the context of malware detection, UAPs allow attackers to evade detection without having to create entirely new malware variants. While this seems like a low-effort, high-reward strategy, generating effective UAPs is far more challenging than it appears, particularly in the malware domain.
In their paper,
The process requires attackers to perform a series of careful transformations to avoid breaking the executable while still evading detection. This is a far cry from simply adding noise to an image or text dataset. As a result, the time and expertise required to create UAPs that both fool AI/ML malware detection models and preserve malware functionality is significant.
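One concrete example of the kind of functionality-preserving transformation often cited in adversarial-malware research is appending bytes past the end of a PE file (the "overlay"), which the Windows loader ignores at execution time. The sketch below is illustrative only; the file contents and perturbation bytes are made up.

```python
def append_overlay(pe_bytes: bytes, perturbation: bytes) -> bytes:
    """Return a new byte string with the perturbation appended.

    Appended overlay bytes are not mapped by the loader, so the
    program's behavior is unchanged -- but a static detector that
    hashes or featurizes the whole file now sees different input.
    """
    return pe_bytes + perturbation

original = b"MZ" + b"\x00" * 1024   # stand-in for a real PE file
delta = bytes([0x41] * 128)          # the (shared) perturbation bytes
modified = append_overlay(original, delta)

assert modified[:len(original)] == original  # original bytes untouched
print(len(original), "->", len(modified))
```

Even this, the simplest transformation available, only perturbs a region detectors may ignore; changing bytes inside code or header sections without breaking execution is far harder, which is precisely the problem-space constraint described above.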
Given the complexity of generating UAPs, cybercriminals face a dilemma: Should they invest time and resources into crafting these perturbations, or is it easier to create entirely new strains of malware?
Developing a new malware strain might involve reusing code from previous versions, applying known obfuscation techniques, or modifying payloads. This process is often faster, less risky, and more predictable compared to the complex sequence of transformations required to generate UAPs. As a result, many attackers prefer to invest in creating new strains of malware, which are more likely to achieve the desired outcome without the same level of effort and risk.
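To see why the conventional route is so much cheaper, consider a classic low-effort obfuscation step: XOR-encoding strings so static scanners don't see them in cleartext. This is a minimal, illustrative sketch (the key and the string are hypothetical), yet it already changes what a signature-based scanner observes, with none of the fragility of a UAP.

```python
KEY = 0x5A  # hypothetical single-byte key

def xor_encode(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

secret = b"http://c2.example.invalid/beacon"  # made-up indicator string
blob = xor_encode(secret)

assert secret not in blob           # cleartext no longer present
assert xor_encode(blob) == secret   # XOR is its own inverse
```

Techniques like this are well understood, predictable, and cheap to apply at scale, which is exactly the cost-benefit calculus that keeps attackers on this path.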
One of the major hurdles in applying UAPs to malware is the real-world execution environment. Malware operates in dynamic, unpredictable conditions, and UAPs crafted in controlled environments may not perform as expected once deployed. Small changes in the operating system, file structure, or antivirus defenses can render the UAP ineffective. This fragility is a key reason why UAPs remain largely theoretical for malware attacks rather than a widely adopted technique in practice.
Additionally, defenders are not standing still.
Conclusion
The idea of using AI to defeat AI, particularly through Universal Adversarial Perturbations, may seem like a natural progression in the ongoing battle between attackers and defenders. However, the reality is that the complexity and risk associated with developing UAPs for malware make this approach unattractive for most cybercriminals. Instead, attackers tend to rely on more straightforward methods like creating new malware variants, which offer a better return on investment with less risk of failure. If you examine some of the latest ransomware campaigns, none of them highlight the use of AI-based techniques. Instead, as shown in recent coverage of ransomware tactics
As long as the development of UAPs remains fraught with difficulties—such as maintaining functionality and overcoming problem-space constraints—it’s unlikely that we will see widespread adoption of these techniques in the cybercriminal world. Instead, traditional malware development and deployment methods will continue to dominate the landscape, while defenders must remain vigilant and adaptive to the evolving AI threat landscape.