Managing user data on a platform is always a challenge. It is impossible to be 100% secure against data leaks, and the impact of these incidents can be severe. Among sensitive information, the email address and the password stand out.
Passwords have received the most attention. There are numerous rules on length and character mix (upper and lower case, numbers, and special characters). There are mnemonic techniques to help users remember passwords, and password vaults so that you can use a different password for each account. Finally, we store passwords as salted hashes, and even that isn't enough to keep accounts safe after a leak.
That said, when it comes to emails, why do we continue saving values in plain text, just as we did decades ago? The reason is simple: platforms need to stay in touch with customers, informing them of updates, new products, and even security incidents. Hashes are irreversible and would therefore make it impossible to retrieve an email address once saved. If your platform, for some reason, doesn't need to contact the user via email, keeping the address as a salted hash is a good option. But what about when contact is needed? Is it possible to store an email address securely? Fortunately, yes.
When we talk about cryptography these days, most people think of cryptocurrencies, blockchain, and related terms. But cryptography is much more than that. Since Caesar's time, messages have been encrypted to prevent enemies from acquiring important information.
We currently have many more resources than the distinguished officers of the Roman Empire. There are methods we can use to encrypt an email address while keeping the result reversible and without affecting server performance. Again, don't expect 100% security. But hey! We're talking about adding one layer of security where none exists!
One of my favorite methods is to use an XOR algorithm. In the links below, you can understand more about how it works:
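As a quick illustration of the property this method relies on, here is a minimal sketch showing that applying XOR twice with the same key restores the original data. The function name `xor_bytes` and the sample values are assumptions chosen only for this example.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plain = b"hello"
key = b"k3y"  # illustrative key only

cipher = xor_bytes(plain, key)      # looks nothing like the input
restored = xor_bytes(cipher, key)   # the same key undoes the operation

print(cipher.hex())
print(restored)
```

Because encryption and decryption are the same operation, the scheme is cheap to run, and for the same reason its safety depends entirely on keeping the key secret.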
Now that you know how it works, we can quickly write a Python implementation. Below, we create the two necessary functions:
import binascii


def encrypt(content: str, key: str) -> bytes:
    """XOR content with a repeating key and return the result hex-encoded."""
    xored = ""
    for key_id, c in enumerate(content):
        # Cycle through the key when it is shorter than the content.
        xored += chr(ord(key[key_id % len(key)]) ^ ord(c))
    return binascii.hexlify(xored.encode())


def decrypt(content: bytes, key: str) -> str:
    """Reverse encrypt(): hex-decode, then XOR with the same key."""
    xored = ""
    for key_id, c in enumerate(binascii.unhexlify(content).decode()):
        xored += chr(ord(key[key_id % len(key)]) ^ ord(c))
    return xored
As you can see, these are very simple functions. So what are the best practices for encrypting email addresses? As I said, XOR is far from the most secure form of encryption. But there are some tricks we can use:
Got it?
For example, we will encrypt the email email@example.com. We can split it into "email", "@", and "example.com". The first element consists of five characters. The question now is: how do we define a secure key? Well, we have some values available. Once the password is hashed, we can reuse a portion of this hash as a key.
In our example, we will use an MD5 hash. We will choose a set of five characters from that hash. This value does not need to be saved, as we already have the hash stored. It also means that if the database is leaked, it will not be obvious which part of the hash was used as the key.
Assuming that the user uses the password "badpass", and the salt "365ropmNUjtq08xSZOiMrgRjG9OMMe82Hh8LU1M" is added at the end, we will have the following result: 249ec31cf8b1946371bdeed6603b8341. We only need five characters, so let's use the characters at positions 2, 7, 16, 21, and 6 (counting from zero): "9", "c", "7", "e", "1". 9c7e1 is our first key. Applying the XOR to the first element, we have the following:
encrypt("email", "9c7e1") -> b"5c0e560c5d"
To test, let's do the reverse: decrypt(b"5c0e560c5d", "9c7e1") -> "email"
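The key selection described above can be sketched as follows. The helper names `key_from_hash` and `xor_hex` are hypothetical and exist only for this illustration; the hash string is copied from the text rather than recomputed, and the positions are treated as zero-based.

```python
import binascii


def key_from_hash(password_hash: str, positions: list[int]) -> str:
    """Build a key from the hash characters at the given zero-based positions."""
    return "".join(password_hash[p] for p in positions)


def xor_hex(content: str, key: str) -> bytes:
    """XOR content with a repeating key and return the result hex-encoded."""
    xored = "".join(
        chr(ord(c) ^ ord(key[i % len(key)])) for i, c in enumerate(content)
    )
    return binascii.hexlify(xored.encode())


# Hash value quoted in the article (not recomputed here).
password_hash = "249ec31cf8b1946371bdeed6603b8341"
key = key_from_hash(password_hash, [2, 7, 16, 21, 6])

print(key)                    # 9c7e1
print(xor_hex("email", key))  # b'5c0e560c5d'
```

Only the list of positions has to be kept secret on the application side; the key itself is rebuilt from the stored hash whenever it is needed.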
Nice! Repeat the process with the second element, the part after the "@". Use a different key. You have several values at your disposal: an internal administrative key, or other parts of the password hash. You can even use the result of the first element! However, remember to prefer a key of the same length as the element. Finally, your database will look like this:
{
    email: 5c0e560c5d@5014041d452b0744522c59,
    password: 249ec31cf8b1946371bdeed6603b8341,
    salt: 365ropmNUjtq08xSZOiMrgRjG9OMMe82Hh8LU1M
}
Looks a little bit safer now, doesn't it?
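Putting the steps together, here is a minimal sketch of the whole flow: split the address at the "@", XOR each side with its own key, and join the hex results. The names `encrypt_email` and `decrypt_email` are assumptions for this example, and the domain key is a made-up placeholder, not the key used for the stored value shown above.

```python
import binascii


def _xor(text: str, key: str) -> str:
    """XOR text with a repeating key, character by character."""
    return "".join(chr(ord(c) ^ ord(key[i % len(key)])) for i, c in enumerate(text))


def encrypt_email(address: str, local_key: str, domain_key: str) -> str:
    """Encrypt both sides of the address and keep the '@' as a separator."""
    local, domain = address.rsplit("@", 1)
    parts = (_xor(local, local_key), _xor(domain, domain_key))
    return "@".join(binascii.hexlify(p.encode()).decode() for p in parts)


def decrypt_email(stored: str, local_key: str, domain_key: str) -> str:
    """Reverse encrypt_email() using the same two keys."""
    local_hex, domain_hex = stored.split("@", 1)
    local = _xor(binascii.unhexlify(local_hex).decode(), local_key)
    domain = _xor(binascii.unhexlify(domain_hex).decode(), domain_key)
    return f"{local}@{domain}"


local_key = "9c7e1"         # first key, taken from the password hash
domain_key = "k3yF0rD0ma1"  # placeholder key, same length as "example.com"

stored = encrypt_email("email@example.com", local_key, domain_key)
print(stored)
print(decrypt_email(stored, local_key, domain_key))
```

Keeping the "@" in the stored value is a design choice of this scheme: it preserves the shape of an address, at the cost of revealing the length of each side.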
In order to test the security of our method, here's a challenge: try to break the encryption used in the second element, "example.com".
Also published on https://www.buymeacoffee.com/corvo/safely-saving-user-accounts.