Exploring Cross-Site Scripting (XSS): Risks, Vulnerabilities, and Prevention Measures

by Vladyslav Arzamastsev, May 9th, 2023

Too Long; Didn't Read

Cross-site scripting (XSS) is one of the most severe consequences of poorly implemented frontends. Let’s take an in-depth look at this vulnerability: explore its root cause and the various ways of exploiting it, learn how to properly use the React framework, and understand how to protect any frontend app from the XSS threat.

Cross-Site Scripting (XSS) Explained

When creating an application with a non-CLI interface, it is crucial to prioritize frontend security to ensure the integrity of your system. One of the most critical risks that poorly implemented frontends can lead to is cross-site scripting (XSS). While many articles and courses touch on the topic of security in web application development, they often fall short of providing comprehensive explanations.

Therefore, it is essential to delve into the vulnerability of XSS, understand its root causes, explore potential exploitation methods, learn the proper utilization of the React framework, and ultimately grasp effective measures to safeguard any frontend application against the XSS threat.

What is cross-site scripting?

Cross-Site Scripting (XSS) is a code-injection vulnerability that occurs in applications that process HTML when developers do not sanitize user input well enough before inserting it into an HTML template. It allows an attacker to insert arbitrary JavaScript code into a template and execute it in the user’s context:

For example, suppose the developer fails to sanitize the content of the "last-name" div: users can then include malicious scripts by manipulating their last name.
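
A minimal sketch of that flaw, assuming a hypothetical helper that builds markup by string concatenation (the function name and the markup are illustrative and not taken from any specific application):

```js
// Hypothetical, deliberately vulnerable template helper: user input is
// concatenated into markup without any encoding.
function renderProfileUnsafe(lastName) {
  return '<div id="last-name">' + lastName + '</div>';
}

// A harmless last name produces the markup the developer expected.
const benign = renderProfileUnsafe('Smith');

// A crafted "last name" becomes part of the page structure instead of text.
const crafted = renderProfileUnsafe('<img src=x onerror="alert(1)">');

console.log(benign);
console.log(crafted);
```

Once the second string reaches a browser as HTML, the onerror handler would run in the visitor's context, which is exactly the injection described above.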

Is XSS common?

Despite the fact that numerous frameworks and libraries provide users with all the necessary tools to get rid of XSS, this is still one of the most common vulnerabilities found in web applications. It consistently appears in the OWASP list of the Top Web Application Security Risks and was used in 40% of online cyberattacks against large enterprises in Europe and North America in 2019. According to HackerOne, XSS vulnerabilities are the most common vulnerability type discovered in bug bounty programs.

Where can I spot XSS vulnerability?

There are 2 types of Cross-Site Scripting:

Client-Side XSS. The most common kind of XSS. It happens on the client side (in browsers or desktop apps) and is a consequence of not properly sanitizing user-supplied data before inserting it into the DOM.

Server-Side XSS. This is quite rare but possible. It happens when the server transforms HTML files into other documents (most likely, PDFs) and the library does not whitelist what kind of code it executes. You can read about the Dynamic PDF vulnerability on HackTricks.

Is XSS really that dangerous?

Simply put, it’s a disaster.

Here’s a list of what an attacker can do if they’re able to exploit an XSS vulnerability on the client side:

Remote Code Execution (Browser exploits, CMS exploitation)
Session Hijacking
Bypass CSRF protection
Keylogging
Forced Downloads
Man-in-the-Middle Attacks
All sorts of phishing: Credential Harvesting, Ad-jacking (Ad injection), Clickjacking, Redirecting users to a malicious website, etc.
Stealing data from localStorage, sessionStorage, cookies, IndexedDB, and page source code; taking screenshots
DoS and DDoS
Content Spoofing
Pivoting into hidden internal networks protected by firewalls: JS can be used for host and port discovery, service identification, and interaction (extremely slow)
Stealing geo-location, capturing audio, web camera, or gyroscope data (requires explicit permission)
Crypto mining (hard, because the browser will try to protect the user)

Server-Side XSS is even worse because it allows:

Remote Code Execution
Local File Inclusion
Server-Side Request Forgery
Internal Path Disclosure
Stealing information from the result document
DoS
Crypto mining

The attacker is also able to get information and control the tab in real time.

Misconceptions about XSS

“XSS is not a threat if the website uses HTTP-only cookies for authentication”

I hear that a lot, and that’s just ridiculous. Yes, an attacker can’t steal HTTP-only cookies with JavaScript; however, they don’t even need to. The true danger of XSS comes from the ability to execute arbitrary JS in the current user’s context. If an attacker can’t steal the cookie and attach it to malicious requests made from their machine, they’ll just move the malware into the victim’s browser, and the browser will attach the cookies for them. I admit, the exploitation becomes more complicated; however, it’s A LOT stealthier and safer for the attacker.

“XSS is a non-persistent type of attack #1: I leave the vulnerable page and it’s fine”

Wrong again! Once an attacker can inject arbitrary JS into your browser, they can change the application’s behavior so that you NEVER leave the vulnerable page. By manipulating requests/responses and the HTML DOM, they can make it seem like you left the page by re-rendering new content on the vulnerable page. However, this kind of trickery can be defeated if the user manually types the URL they’re interested in into the address bar.

“XSS is a non-persistent type of attack #2: I just leave the website and it’s fine”

Not true: malicious JavaScript files CAN be persisted in your browser. That’s rare, but still possible via Service Workers. To register a malicious Service Worker, one of the following conditions must be met:

An attacker has write access on the frontend server
There’s an unfiltered JSONP endpoint exposed

If an attacker manages to register a malicious Service Worker, they can maintain persistence in your browser indefinitely. Service Workers can be used to sniff and modify traffic (for instance, to supply new malicious scripts with each response). Check out the ShadowWorkers project for more details.

How many types of XSS are there?

XSS can manifest in several different ways. Security specialists usually single out three:

DOM-Based XSS. DOM XSS stands for Document Object Model-based Cross-site Scripting. A DOM-based XSS attack is possible if the web application writes data to the Document Object Model without proper sanitizing. The attacker can manipulate this data to include malware on the web page. Key points:

Generally, DOM-Based cross-site scripting attacks are client-side attacks. Malicious code might never reach the server.
Malware is executed AFTER the HTML template is rendered; it happens at some point during runtime.
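
As a rough illustration of a DOM-based sink, the sketch below reads a value from the query string and writes it to the page at runtime. The function names and the "name" parameter are assumptions made for this example only:

```js
// Read a value from a query string such as "?name=Alice".
function readNameFromQuery(search) {
  return new URLSearchParams(search).get('name') ?? '';
}

// Safer sink: textContent treats the value as plain text, never as markup.
// `element` is any DOM element, e.g. document.getElementById('greeting').
function renderGreeting(element, search) {
  element.textContent = 'Hello, ' + readNameFromQuery(search);
}

// The vulnerable variant would assign the same string to element.innerHTML,
// which makes the browser parse attacker-controlled markup.
```

In a browser this might be called as renderGreeting(document.getElementById('greeting'), location.search). Note that the payload lives only in the URL and the DOM, so it may never reach the server.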

Reflected XSS. Reflected XSS occurs when the server takes a part of a request and inserts it into the response without proper sanitizing. Key points:

Reflected XSS payload ALWAYS reaches the server; it is part of both request and response.
Unlike DOM-Based XSS, Reflected XSS payload is executed WHILE a browser renders an HTML template since the payload is part of the response and usually is embedded into the template.

Stored XSS. Stored XSS happens when developers blindly trust data that’s being stored in their databases, web-caches, files, etc. Key points:

Stored XSS is saved somewhere (not necessarily the database) for a while
The payload might be executed on multiple pages and usually does not require any user interaction to fire (unlike DOM-Based and Reflected XSS, which are usually spread via malicious links and require user interaction).

If I use React, am I safe then?

Surprise reveal: React is not fully safe from XSS, although it really tries to protect users from it.

There are several ways to inject malicious JS into a React app:

React does not filter what you’re passing to props:
- href (Exploitation via “javascript:” or “data:text/html” URI)
- src (Exploitation via “javascript:” or “data:text/html” URI)
- srcDoc (Exploitation via inserting malicious HTML)
- formAction (Exploitation via “eval(...arbitrary js)”)
- data (Exploitation via “javascript:” or “data:text/html” URI)
React also lets you insert raw HTML into the DOM, bypassing its restrictions and protections, through the dangerouslySetInnerHTML prop. <script> tags inserted this way are not executed, because React assigns the markup through innerHTML and browsers do not run scripts inserted that way. This protection can be easily bypassed by modifying the XSS payload, for example:
- Using <iframe src="javascript:eval(...)"/> or
- <img id='_malware_' src='x' onerror='eval(this.id)' />
Such payloads execute JavaScript without a <script> tag, and that code can then inject <script> tags into the DOM
The last known way to inject malicious code into a React app is by abusing the user-controlled props object. If an attacker has control over the props object’s keys, they might be able to embed an exploit by either abusing href, src, srcDoc, data, formAction attributes, or by poisoning props with dangerouslySetInnerHTML. This is especially dangerous if users have control over the JSX tag that’s being inserted into the React tree.
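
One possible way to limit the props-poisoning risk is to copy only allow-listed keys from any user-controlled object before spreading it into an element. This is a sketch, not a React API: ALLOWED_PROPS and pickAllowedProps are hypothetical names, and the allowed keys are an assumption that would depend on the component.

```js
// Hypothetical allow-list of prop names that user-supplied data may set.
const ALLOWED_PROPS = new Set(['title', 'alt', 'className']);

// Copy only allow-listed keys so that props such as href, srcDoc or
// dangerouslySetInnerHTML can never come from a user-controlled object.
function pickAllowedProps(untrustedProps) {
  const safe = {};
  for (const key of Object.keys(untrustedProps)) {
    if (ALLOWED_PROPS.has(key)) {
      safe[key] = untrustedProps[key];
    }
  }
  return safe;
}
```

The filtered object could then be spread into a fixed tag chosen by the developer, so neither the tag nor the dangerous props are under the user's control.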

How can I prevent XSS?

The first step would be to encode data on output.

According to Portswigger Web Security Academy, encoding should be applied directly before user-controllable data is written to a page because the context you're writing into determines what kind of encoding you need to use. For example, values inside a JavaScript string require a different type of escaping to those in an HTML context.

In an HTML context, you should convert non-whitelisted values into HTML entities:

< converts to: &lt;
> converts to: &gt;
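
A minimal sketch of HTML-context encoding. It assumes that escaping the five characters below is enough for element content and quoted attribute values; escapeHtml is an illustrative helper, not a built-in function:

```js
// Characters with special meaning in HTML and their entity replacements.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Encode a value right before it is written into an HTML context.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}
```

Most template engines and frameworks perform an equivalent step automatically, so a hand-written helper like this is mainly useful where markup is assembled manually.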

In a JavaScript string context, non-alphanumeric values should be Unicode-escaped:

< converts to: \u003c
> converts to: \u003e
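
The JavaScript string rule can be sketched in the same way. The helper below is an assumption for illustration (escapeForJsString is not a standard function); it Unicode-escapes every character that is not an ASCII letter or digit:

```js
// Replace every character that is not an ASCII letter or digit with a
// \uXXXX escape. Without the "u" flag the regex works on UTF-16 code
// units, so each half of a surrogate pair is escaped separately.
function escapeForJsString(value) {
  return String(value).replace(/[^A-Za-z0-9]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0')
  );
}
```

The result is meant to be placed between the quotes of a JavaScript string literal, where the escapes decode back to the original characters without ever terminating the string or the surrounding script block.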

Validate input on arrival

You should validate any user input as strictly as possible. For instance:

If a user submits a URL, parse it with the URL class and verify that it uses a safe protocol (HTTP or HTTPS)
If a user supplies a value that is expected to be numeric, explicitly cast it to a number
Validate that input contains only an expected set of characters
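
The first two checks could look roughly like this. parseSafeUrl and parseInteger are hypothetical helper names, and returning null for rejected input is just one possible convention:

```js
// Allow-list of protocols considered safe for user-supplied links.
const SAFE_PROTOCOLS = new Set(['http:', 'https:']);

// Parse the input with the standard URL class and reject anything that
// is not an absolute http(s) URL.
function parseSafeUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not an absolute, parseable URL
  }
  return SAFE_PROTOCOLS.has(url.protocol) ? url : null;
}

// Accept only an optional minus sign followed by digits, then cast.
function parseInteger(input) {
  if (!/^-?\d+$/.test(String(input))) return null;
  return Number(input);
}
```

Because the URL class normalizes the protocol to lowercase, mixed-case variants of a dangerous scheme are rejected by the same allow-list check.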

Whitelisting and blacklisting

Input validation should generally employ whitelists rather than blacklists. For example, instead of trying to make a list of all harmful protocols (javascript, data, etc.), simply make a list of safe protocols (HTTP, HTTPS) and disallow anything not on the list.

Allowing "safe" HTML

The best option is to use a JavaScript library that performs filtering and encoding in the user's browser, such as DOMPurify. Other libraries allow users to provide content in markdown format and convert the markdown into HTML. Unfortunately, all these libraries have XSS vulnerabilities from time to time, so this is not a perfect solution. If you do use one, you should monitor closely for security updates.

If you’re using a frontend framework, your other option is to parse it into that framework’s elements, like in the JSX tree in React. There are libraries that do that, however, parsing it manually is not that hard, so if you know exactly what kind of HTML you’re supposed to render, it might be safer to do it yourself. That way, you’ll definitely avoid dangerous props, event handlers, and harmful CSS.

Mitigating XSS using Content Security Policy (CSP)

CSP is the last line of defense against cross-site scripting. If your XSS prevention fails, you can use CSP to mitigate XSS by restricting what an attacker can do. CSP lets you control various things, such as whether external scripts can be loaded and whether inline scripts will be executed. To deploy CSP, you need to include an HTTP response header called Content-Security-Policy with a value containing your policy.

An example of CSP is as follows:

default-src 'self'; script-src 'self'; object-src 'none'; frame-src 'none'; base-uri 'none';

This policy specifies that resources such as images and scripts can only be loaded from the same origin as the main page. So even if an attacker can successfully inject an XSS payload, they can only load resources from the current origin.
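
As a sketch of how such a policy might be delivered, the handler below attaches the header to every response using Node's built-in http module. The handler itself and the page body are assumptions for illustration; frameworks and reverse proxies usually offer their own way to set headers.

```js
const http = require('http');

// The example policy shown above.
const CSP_POLICY =
  "default-src 'self'; script-src 'self'; object-src 'none'; frame-src 'none'; base-uri 'none';";

// Request handler that attaches the policy to every response.
function handler(req, res) {
  res.setHeader('Content-Security-Policy', CSP_POLICY);
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<!doctype html><title>CSP demo</title><p>Hello</p>');
}

// Create the server without listening; call server.listen(port) to serve.
const server = http.createServer(handler);
```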

If you require the loading of external resources, ensure you only allow scripts that do not aid an attacker in exploiting your site. For example, if you whitelist certain domains then an attacker can load any script from those domains. Where possible, try to host resources on your own domain.

If that is not possible then you can use a hash- or nonce-based policy to allow scripts on different domains. A nonce is a random string that is added as an attribute of a script or resource, which will only be executed if the random string matches the server-generated one. An attacker is unable to guess the randomized string and therefore cannot invoke a script or resource with a valid nonce and so the resource will not be executed.
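
A nonce-based policy could be sketched as follows. The helper names and the cdn.example.com script URL are hypothetical; the important part is that the nonce is generated per response and appears both in the header and on the script tag:

```js
const crypto = require('crypto');

// Generate a fresh, unpredictable nonce for each response.
function generateNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Build a policy that only allows scripts carrying the given nonce.
function buildNoncePolicy(nonce) {
  return `script-src 'nonce-${nonce}'; object-src 'none'; base-uri 'none';`;
}

// Render a page whose script tag carries the same nonce as the header.
// The script URL is a placeholder for an external resource.
function renderPage(nonce) {
  return `<!doctype html><script nonce="${nonce}" src="https://cdn.example.com/app.js"></script>`;
}
```

A response handler would call generateNonce() once, send buildNoncePolicy(nonce) as the Content-Security-Policy header, and return renderPage(nonce) as the body. An injected script tag without the matching nonce would then be blocked by the browser.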

Server-side protection

Using server-side protection, such as Web-Application Firewalls and Intrusion Prevention Systems, can help you reject XSS payloads sent to your server. For instance, AWS WAF and Snort IPS have sets of rules that detect the most common XSS payloads, such as ‘<script>alert(1)</script>’. Additionally, IPS systems ship with known exploit traffic signatures, for example, Snort is able to detect and disrupt exploitation of CVE-2011-1897 – XSS vulnerability in Microsoft Forefront Unified Access Gateway.

Beware of inserting user input into the "script" tag

If an attacker is able to insert JS where it’s being evaluated, there's not much you can do. There are numerous ways to encode and mutate malicious scripts by using dedicated obfuscators or esoteric JS dialects, like Katakana, JJEncode, or JSFuck.

Consider the following JJEncode snippet:

```js
$=~[];$={___:++$,$$$$:(![]+"")[$],__$:++$,$_$_:(![]+"")[$],_$_:++$,$_$$:({}+"")
[$],$$_$:($[$]+"")[$],_$$:++$,$$$_:(!""+"")[$],$__:++$,$_$:++$,$$__:({}+"")[$],
$$_:++$,$$$:++$,$___:++$,$__$:++$};$.$_=($.$_=$+"")[$.$_$]+($._$=$.
$_[$.__$])+($.$$=($.$+"")[$.__$])+((!$)+"")[$._$$]+($.__=$.$_[$.$$_])+($.
$=(!""+"")[$.__$])+($._=(!""+"")[$._$_])+$.$_[$.$_$]+$.__+$._$+$.$;$.$$=$.$+
(!""+"")[$._$$]+$.__+$._+$.$+$.$$;$.$=($.___)[$.$_][$.$_];$.$($.$($.$$+"\""+$.
$_$_+(![]+"")[$._$_]+$.$$$_+"\\"+$.__$+$.$$_+$._$_+$.__+"("+$.__$+")"+"\"")
())();
```

When inserted in a <script> tag, it evaluates to alert(1). A home-baked regular expression can’t help because there’s not a single suspicious word or a control character in here.