There is no question that the rise of artificial intelligence (AI) in cybersecurity has been meteoric, and – as with other emerging technologies – the rewards and risks of adopting it commercially will only materialize over time. For businesses, navigating the AI landscape is increasingly challenging: they face a deluge of AI-backed products, promises, and dreams at a time when certainty about the technology is paramount.
Let's explore the rewards and risks of using AI in offensive cybersecurity practices, including vulnerability assessments, penetration testing, red teaming, and everything in between. While opportunities certainly exist, a number of potential pitfalls should be considered by offensive practitioners and customers alike.
Not every company maintains offensive security experts as part of its workforce, and it is commonplace to outsource testing to third-party consultancies. This model has a number of advantages, such as access to well-practiced and qualified professionals. However, the rise of AI-assisted tools on the horizon – such as
During the course of an assessment, a tester could – with ease and at speed – send sensitive data to a third-party system outside of the customer's control. While testing, it is not uncommon to discover and parse information pertaining to customers and users – such as PII and credentials – and using these technologies could inadvertently disclose that data, along with details of the vulnerabilities found, to an uncontrolled third party. Leaks of sensitive data to AI services have already been highlighted publicly, much like what occurred to
Leaks like this one force us to ask: What happens when someone performing what should be a trusted service leaks sensitive information? And are organizations able to understand that risk? Can consultancies demonstrate that they understand the risks and are able to provide suitable assurances to their customers?
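One basic safeguard – regardless of which model or service is used – is to scrub obviously sensitive values from tool output before it ever leaves the tester's machine. Below is a minimal Python sketch of that idea; the patterns and the `redact_before_prompt` helper are illustrative assumptions, not a complete solution, and they will not catch every secret.

```python
import re

# Hypothetical helper: redact obvious PII/credential patterns from tool output
# before it is pasted into any third-party AI service. These patterns are
# illustrative only and deliberately simple.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),            # email addresses
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ip-address>"),   # IPv4 addresses
    (re.compile(r"(?i)(password|passwd|pwd)\s*[:=]\s*\S+"), r"\1=<redacted>"),
]

def redact_before_prompt(text: str) -> str:
    """Best-effort scrub of sensitive values from text destined for an LLM."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

if __name__ == "__main__":
    sample = "admin@example.com logged in from 10.0.0.5 with password: Winter2024!"
    print(redact_before_prompt(sample))
```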
This isn’t to say that ignoring AI completely is the correct course of action, either. If the primary concern around AI use comes from a lack of control over data sent to a third-party AI, why not run it locally?
With a selection of base models that are licensed for commercial use and can run on consumer hardware, plus cloud services offering professional hardware such as the Nvidia A100 GPU, a number of options exist to develop tooling that could aid pentesters while reducing the risk of losing sensitive information.
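As a concrete illustration, here is a minimal sketch of what a locally hosted setup might look like using the llama-cpp-python bindings. The model file, generation parameters, and prompt are assumptions for illustration; any commercially licensed model that runs on local hardware could be substituted.

```python
# Minimal sketch of a locally hosted LLM using llama-cpp-python.
# The model path and parameters below are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-13b-chat.Q4_K_M.gguf",  # hypothetical local GGUF file
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload as many layers as the local GPU allows
)

response = llm(
    "Review the following nmap output and summarise the notable services:\n"
    "<tool output here>",
    max_tokens=512,
    temperature=0.2,
)
print(response["choices"][0]["text"])
```

Because the prompt and the response never leave the tester's machine, the data-loss concern described above is largely removed, at the cost of weaker model quality and local hardware requirements.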
AI and large language models (LLMs) are not a panacea. As recently discovered by a
As an example, I set up an LLM on consumer hardware, using Meta's Llama2 13B as the base model, and asked it a relatively simple question:
The response isn't bad and even makes some recommendations as to how the code could be improved. However, let's try something more niche and specialized. Let's take a closer look at some code and an unusual property of `scanf` described
(As an aside, at the time of writing, you can ask ChatGPT (GPT-4) the same question and it is also unable to provide a useful answer; you can view the example chat history
In another example, I tried to use BurpGPT along with the OpenAI API to see if it could spot vulnerabilities in the PortSwigger XSS challenge site, and found the results completely nonsensical.
It was also quite expensive, with a single page load in the browser translating to 59 API requests:
… and a cost of $2.86, at which point I concluded that a full test could rack up significant costs rather quickly.
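To put that in perspective, a rough back-of-the-envelope projection based on those observed figures shows how quickly the cost scales; the page count below is a hypothetical application size, purely for illustration.

```python
# Rough cost projection based on the observed figures above:
# ~59 OpenAI API requests and ~$2.86 for a single page load.
requests_per_page = 59
cost_per_page_usd = 2.86

pages_in_scope = 200  # assumed size of a modest web application
total_requests = requests_per_page * pages_in_scope
total_cost_usd = cost_per_page_usd * pages_in_scope

print(f"{total_requests:,} API requests, ~${total_cost_usd:,.2f} "
      f"to cover {pages_in_scope} pages once")
```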
Today, the technology isn’t ready to provide useful, real-world results for pentesters in every situation and environment. That isn’t to say it won’t be useful in other ways – pair-assisted programming being a prime example – or that usable information will never be forthcoming, but wider questions around how to train an LLM and expose it to pentesters in a secure way will need to be answered first. It also doesn’t remove the requirement for the tester to understand the advice given and know whether it is safe to follow.
As a thought experiment, imagine that a pentester is conducting an infrastructure assessment – again in a Windows Active Directory (AD) environment – and either identifies that they have control over a user who can modify the membership of a high-value built-in group, such as Account Operators, or the AI they feed their tool output into recognizes this state.
Next, they ask the LLM how they can take advantage of this situation, and it describes the next steps, shown in the following image:
If the objective of the task was to compromise the ‘Account Operators’ group, it has been achieved. The user account, by nature of being in the ‘Domain Users’ group, is now also in the ‘Account Operators’ group, as is every other domain user in the customer environment.
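Sanity checks are cheap insurance here. As one illustration – not something the LLM suggested – a tester could enumerate the group's membership before and after any change so that an unexpected addition such as 'Domain Users' stands out immediately. The sketch below assumes the Python ldap3 library; the domain controller, credentials, and distinguished names are placeholders.

```python
# Hedged sketch: list the current members of a high-value AD group before
# acting on any AI-suggested change. Server, credentials, and DNs are placeholders.
from ldap3 import Server, Connection, NTLM, ALL

server = Server("dc01.corp.example", get_info=ALL)
conn = Connection(
    server,
    user="CORP\\assessor",   # placeholder assessment account
    password="<redacted>",
    authentication=NTLM,
    auto_bind=True,
)

group_dn = "CN=Account Operators,CN=Builtin,DC=corp,DC=example"
conn.search(group_dn, "(objectClass=group)", attributes=["member"])

members = []
if conn.entries:
    entry = conn.entries[0]
    if "member" in entry.entry_attributes:
        members = entry.member.values

print(f"{group_dn} currently has {len(members)} member(s):")
for dn in members:
    print(f"  {dn}")
```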
While this example is fictitious and extreme, conditions and weaknesses – such as logical flaws in AD – are introduced by customers into their own environments all the time and found by pentesters every day, and those flaws can be quite complex. The last thing a customer needs is a pentester making the situation orders of magnitude worse by further weakening the security of that environment.
As a caveat, the issue here isn’t even AI itself. It's perfectly possible to get bad or incorrect advice from other sources of information. The problem is that we stand at the cusp of widespread use of a technology that has the potential to give inherently bad advice, and if caution is not taken, lessons will be learned the hard way.
Let's take a step back from viewing the issue through the nightmarish prism of AI not being quite as reliable as users would like. Once the technology is fully understood, there are a number of ways it can be incorporated into the workflow of any tech professional, including pentesters. I’ve already touched on pair-assisted programming as a great example of how AI can improve productivity and speed while programming. Additionally, based on the information I was able to elicit from a local LLM, there is evidently viable use as a 1:1 teacher/trainer at certain skill levels, providing the student understands the teacher is not always completely correct.
Clearly, there are ways to leverage AI that are safe and beneficial to practitioners and clients. As it stands, the key to these benefits lies in understanding the technology's shortcomings and when to lean on your own expertise to best use AI products.