Writing cryptographic software or adding encryption to an app is an undertaking with numerous pitfalls for a first-timer. And for those already experienced in dealing with crypto matters, simple carelessness or self-assurance can lead to catastrophic results.
In this article we’ve compiled a list of the most common or especially dangerous mistakes developers make while implementing cryptography in their software, the things to look out for, and the things to do to avoid them (best cryptographic practices). Some of them belong to the broader risk surface that every app has, not only apps that manipulate cryptographic material and sensitive data.
The most common mistakes (in the opinion we hold at Cossack Labs) listed here are not directly linked to cryptographic processes and encryption, but making them renders cryptography useless at best; at worst it introduces vulnerabilities, opens the system up to exploits, and may lead to malfunctions or DoS. A little attentiveness goes a long way, and the following practices should serve as a checklist of what not to do in your software and where it may lead if you do.
When you try to cram in more data than a container can hold, you’re going to create a mess. Copying an untrusted input without checking its size would be a classic example of a mistake leading to a buffer overflow.
Buffer overflows can often be used to execute arbitrary code, which is usually outside the scope of a program’s implicit security policy, and this in turn can be used to subvert any other security service. Buffer overflows also lead to crashes and open the program up to malicious external actions such as DoS or forcing the program into an infinite loop.
Prevention/Mitigation:
An incorrect length value is another simple mistake that can lead to a buffer overflow and its consequences. This happens when the software uses a sequential operation to read or write a buffer, but uses an incorrect length value that causes it to access memory outside the bounds of the buffer. When the length value exceeds the size of the destination, a buffer overflow can occur.
Prevention/Mitigation:
Use prevention/mitigation rules from classic buffer overflow vulnerability.
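As an illustration of validating a length value before using it, here is a minimal Python sketch. The wire format (a 4-byte big-endian length followed by the payload), the function name `read_record`, and the `max_len` ceiling are all assumptions made for this example. In C the same unchecked length would drive an out-of-bounds `memcpy`; in Python the unchecked version would silently return truncated data, which is a bug of the same origin.

```python
import struct

HEADER_SIZE = 4  # hypothetical format: 4-byte big-endian length, then payload


def read_record(packet: bytes, max_len: int = 4096) -> bytes:
    """Return the payload of a length-prefixed record, or raise ValueError."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet too short for length header")

    (declared,) = struct.unpack_from(">I", packet, 0)
    available = len(packet) - HEADER_SIZE

    # Never trust the declared length: compare it with an application-level
    # ceiling and with the amount of data that actually arrived.
    if declared > max_len:
        raise ValueError("declared length exceeds limit")
    if declared > available:
        raise ValueError("declared length exceeds data received")

    return packet[HEADER_SIZE:HEADER_SIZE + declared]
```

A record that claims 16 bytes but carries only 3 is rejected instead of being processed with a wrong length.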
A programmer’s toolbox is chock-full of digital ‘power tools’ that should be handled with care, including libraries and API functions that make assumptions about how they will be used and offer no guarantees of safety if they are abused. When potentially dangerous functions are not used properly, things can get very messy very quickly. For instance, the following usage patterns are dangerous:
usage of non-random IV with CBC mode of a block cipher like AES,
usage of insufficient entropy / small / same / predictable seed for PRNG,
usage of cryptographically weak PRNG.
Prevention/Mitigation:
Identify the list of prohibited API functions and forbid the developers (or yourself) to use these functions (sometimes you’ll have to come up with safer alternatives). In some cases, automatic code analysis tools or the compiler can be instructed to spot the use of prohibited functions, such as the “banned.h” include file from Microsoft’s SDL.
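For the IV and PRNG items in the list above, a minimal Python sketch of the safer alternative might look like this. It assumes AES-CBC (16-byte IV) and uses the standard library `secrets` module, which draws from the operating system's CSPRNG; the helper names `new_cbc_iv` and `new_key` are made up for this example. The general-purpose `random` module is the kind of function to ban here, since its Mersenne Twister generator is predictable and not intended for security purposes.

```python
import secrets

AES_BLOCK_SIZE = 16  # bytes; the AES block size, and therefore the CBC IV size


def new_cbc_iv() -> bytes:
    """Return a fresh, unpredictable IV; call this once per encrypted message."""
    return secrets.token_bytes(AES_BLOCK_SIZE)


def new_key(length: int = 32) -> bytes:
    """Return a random symmetric key (32 bytes, i.e. 256 bits, by default)."""
    return secrets.token_bytes(length)
```

The IV does not need to be secret and is normally stored or sent alongside the ciphertext, but it must never be a constant, a counter, or a value derived from a weak generator.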
It may be tempting to develop your own encryption scheme in the hopes of making it difficult for the attackers to crack. However, such homegrown cryptography is a “welcome” sign for potential attackers.
Prevention/Mitigation:
Select a well-vetted algorithm approved and recommended by cryptography experts, and select well-tested implementations (the source code should be available for analysis). We might be biased, but we recommend using Themis for encryption, as it is a well-tested, modular, Apache 2 licensed open-source crypto library that currently uses OpenSSL as the source of its crypto primitives.
In languages where memory management is the programmer’s responsibility (such as C), there are many opportunities for making a mistake. If the buffer size is calculated incorrectly, the buffer may be too small to contain the data that the programmer intends to write, even if the input was properly validated. Any number of problems could lead to an incorrect calculation, but in the end you’re going to run head-first into a buffer overflow.
Prevention/Mitigation:
Another common mistake is allowing the product to use untrusted input when calculating or using an array index. When the product doesn’t validate (or validates incorrectly) that the index references a valid position within the array, the result is an out-of-bounds read or write, with consequences ranging from leaked data to memory corruption and crashes.
Prevention/Mitigation:
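One way to sketch such index validation in Python is shown below; the function name `get_slot` and the text-encoded index are assumptions for the example. Python does raise on indices that are too large, but a negative index such as `-1` silently wraps around to the end of the list, so an explicit range check is still needed for untrusted input.

```python
def get_slot(table: list, raw_index: str):
    """Return table[index] for an untrusted, text-encoded index."""
    try:
        index = int(raw_index)
    except ValueError:
        raise ValueError("index is not an integer") from None

    # Reject negative values explicitly: table[-1] would otherwise succeed
    # and return the last element instead of failing.
    if not 0 <= index < len(table):
        raise IndexError("index out of range")

    return table[index]
```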
All successful relationships depend on clear communication — this is also true for software. Format strings are often used for sending/receiving well-formed data. By controlling a format string, the attacker can control the input or output in unexpected ways and sometimes even execute code.
Prevention/Mitigation:
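The same class of bug exists outside C's `printf` family. The following Python sketch, with a hypothetical `Config` class and a made-up secret value, contrasts passing untrusted text as the format string with passing it as an argument to a constant format string.

```python
class Config:
    """Hypothetical application object that happens to hold a secret."""

    def __init__(self):
        self.greeting = "Hello"
        self.api_key = "hypothetical-secret"


def render_unsafe(template: str, cfg: Config) -> str:
    # DANGEROUS: caller-supplied text is interpreted as a format string,
    # so "{cfg.api_key}" lets the caller read attributes of cfg.
    return template.format(cfg=cfg)


def render_safe(user_text: str, cfg: Config) -> str:
    # The format string is a constant; untrusted text is only ever an
    # argument, so any braces inside it are treated as plain characters.
    return "{greeting}, {text}".format(greeting=cfg.greeting, text=user_text)
```

With the input `"{cfg.api_key}"`, `render_unsafe` returns the secret, while `render_safe` returns the braces verbatim.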
Integers are not Chuck Norris, so they have their limits. And machines can’t count to infinity even if it sometimes feels like they take that long to complete an important task. When programmers forget that computers don’t do math like people, bad things happen: anything from faulty price calculations and infinite loops to undersized buffers and crashes.
Prevention/Mitigation:
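Python integers do not overflow, so the sketch below emulates unsigned 32-bit arithmetic to show what a fixed-width language does silently and what a checked alternative looks like. The function names are assumptions for this example; in C the equivalent check is usually written before the multiplication, for instance when computing `count * item_size` for an allocation.

```python
UINT32_MAX = 0xFFFFFFFF


def wrapping_mul_u32(a: int, b: int) -> int:
    """Multiply the way an unsigned 32-bit machine integer does: modulo 2**32."""
    return (a * b) & UINT32_MAX


def checked_mul_u32(a: int, b: int) -> int:
    """Multiply two unsigned 32-bit values, raising instead of wrapping."""
    if not (0 <= a <= UINT32_MAX and 0 <= b <= UINT32_MAX):
        raise OverflowError("operand does not fit in 32 bits")
    result = a * b
    if result > UINT32_MAX:
        raise OverflowError("multiplication overflows 32 bits")
    return result
```

For example, 65536 items of 65536 bytes each wrap to a size of 0 in `wrapping_mul_u32`, which is exactly the kind of result that turns into an undersized buffer; `checked_mul_u32` refuses the same inputs.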
Popular compilers have been in development for decades (e.g. the first release of gcc dates back to March 1987), with the help of hundreds of contributors. As a result, many security problems can be caught by compiler warnings, but those warnings are often ignored.
Prevention/Mitigation:
The previous chapter of this article covered the mistakes unrelated to cryptography that could render any further encryption in your code useless by making your code vulnerable to various attacks. This chapter covers the best general practices for correct and secure implementation of cryptographic tools and approaches in your code.
Using passwords as encryption keys makes them highly vulnerable to keysearch attacks. Most users choose passwords that lack sufficient entropy to resist such attacks.
Solution: Where possible, use a truly random encryption/decryption key, not one deterministically generated from a password/passphrase. If the key has to be derived from a password, we recommend using PBKDF2 (which we included in Themis), which uses iterative hashing (along the lines of H(H(H(….H(password)…)))) to slow down a dictionary search.
Use a sufficient number of iterations to make this process take, say, 100ms to generate the key on the user’s machine.
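A minimal sketch of password-based key derivation with the Python standard library follows. It is not Themis code: it uses `hashlib.pbkdf2_hmac`, and the choice of HMAC-SHA-256, the 16-byte random salt, the 32-byte key length, and the helper names are assumptions for this example. The `calibrate` helper doubles the iteration count until one derivation takes roughly the target time on the current machine.

```python
import hashlib
import secrets
import time


def new_salt() -> bytes:
    """Random per-password salt; store it next to the ciphertext."""
    return secrets.token_bytes(16)


def derive_key(password: str, salt: bytes, iterations: int, length: int = 32) -> bytes:
    """Derive a key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


def calibrate(target_seconds: float = 0.1, start: int = 10_000) -> int:
    """Double the iteration count until one derivation takes target_seconds."""
    iterations = start
    while True:
        began = time.perf_counter()
        derive_key("benchmark", b"\x00" * 16, iterations)
        if time.perf_counter() - began >= target_seconds:
            return iterations
        iterations *= 2
```

The salt and the iteration count are not secret, but both must be stored so that the same key can be derived again for decryption.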
Concatenation leaves the boundary between the two strings ambiguous. For example:
builtin||securely = built||insecurely
car||skill = cars||kill
Put differently, the hash H(S||T) does not uniquely identify the strings S and T. Therefore, the attacker may be able to change the division between the two strings without changing the hash.
For instance, if Alice wanted to send the two strings “builtin” and “securely”, the attacker could change them to strings “built” and “insecurely” without invalidating the hash. Similar problems arise when applying a digital signature or message authentication code to a concatenation of strings.
Rather than using plain concatenation, use encoding that is unambiguously decodable. For instance, instead of computing H(S||T), compute H(length(S)||S||T), where length(S) is a 32-bit value that denotes the length of S in bytes. Another solution would be using H(H(S)||H(T)), or even H(H(S)||T).
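The ambiguity and the length-prefix fix can be checked directly. This sketch assumes SHA-256 as H and a big-endian encoding for the 32-bit length; the function names are made up for the example.

```python
import hashlib
import struct


def naive_hash(s: bytes, t: bytes) -> bytes:
    """H(S || T): the boundary between S and T is not part of the input."""
    return hashlib.sha256(s + t).digest()


def framed_hash(s: bytes, t: bytes) -> bytes:
    """H(length(S) || S || T) with length(S) as a 32-bit big-endian value."""
    prefix = struct.pack(">I", len(s))  # raises struct.error if S is too long
    return hashlib.sha256(prefix + s + t).digest()


# Both pairs concatenate to b"builtinsecurely", so the naive hashes collide.
assert naive_hash(b"builtin", b"securely") == naive_hash(b"built", b"insecurely")
# With the length prefix, moving the boundary changes the hash.
assert framed_hash(b"builtin", b"securely") != framed_hash(b"built", b"insecurely")
```

With more than two fields, each field except the last needs its own length prefix for the encoding to stay unambiguous.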
Using a single key for multiple purposes may open it up to various subtle attacks. Pick a single purpose for each key and use the key for just that one purpose. If you need both signing and encryption, generate two keypairs, one for signing and one for encryption/decryption. Similarly, with symmetric cryptography, you should use one key for encryption and a separate independent key for message authentication. Don’t re-use the same key for both purposes.
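For the symmetric case, key separation can be as simple as generating two independent random keys and keeping them in clearly labelled fields. The sketch below is an assumption-laden illustration: the `KeySet` class and helper names are invented, HMAC-SHA-256 stands in for the message authentication code, and the encryption step itself is left out because the Python standard library has no block cipher.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class KeySet:
    enc_key: bytes  # used only for encryption/decryption
    mac_key: bytes  # used only for message authentication


def new_key_set() -> KeySet:
    """Generate two independent keys, one per purpose."""
    return KeySet(enc_key=secrets.token_bytes(32), mac_key=secrets.token_bytes(32))


def make_tag(keys: KeySet, ciphertext: bytes) -> bytes:
    """Authenticate the ciphertext with the MAC key only."""
    return hmac.new(keys.mac_key, ciphertext, hashlib.sha256).digest()


def verify_tag(keys: KeySet, ciphertext: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(make_tag(keys, ciphertext), tag)
```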
Use the key management principles and guidelines described in our Themis GitHub Wiki.
Extensive audit logging in every component of the distributed architecture is an important part of key management. Every access to the sensitive data must be logged with details about the function, the user (individual or application), the utilised encryption resources, the data accessed, and when the access took place.
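A structured audit record covering the details listed above might be sketched as follows. The field names, the `"audit"` logger name, and the `log_access` helper are assumptions for this example; the record carries identifiers only, never key material or plaintext.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")  # hypothetical dedicated audit logger


def log_access(function: str, user: str, key_id: str, data_ref: str) -> str:
    """Emit one JSON audit record and return it for further handling."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),  # when the access happened
        "function": function,   # which operation was performed
        "user": user,           # individual or application identity
        "key_id": key_id,       # identifier of the key used, not the key itself
        "data_ref": data_ref,   # reference to the data accessed, not the data
    }
    line = json.dumps(entry, sort_keys=True)
    audit_log.info(line)
    return line
```

A call such as `log_access("decrypt", "billing-service", "key-42", "invoice/1001")` produces a single machine-parseable line per access.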
This point is very important even though it leaves the strictly cryptographic plane. We will repeat that strong cryptography does protect against the known theoretical attacks on the algorithms themselves, but it will not guarantee that a complex computer system is secure against all possible real-world threats.
For an additional educational and fun read on non-cryptographic data security rules, also see our Medium article on non-cryptographic security practices.
This is our take on the subject. If you have something to add to expand the list of mistakes or best practices, please reach out to us via @CossackLabs or email.