I think it is important to understand why a messaging application uses end-to-end encryption, and why it is imperative, for privacy and security, that it is open source. In this article, I will try to explain plainly why these two concepts matter and how they interact. I aim to go into the technical details of the Signal protocol as little as possible, and to explain the benefits of using an open-source messaging platform with end-to-end encryption.
First, let’s define what open source means for us, taking the Signal messaging application as an example. We can see and download the full source code of the Signal application from here. This means that if you have the technical knowledge to put the pieces together, you can download the source code for the mobile apps and the server from the link I provided, modify them to your needs (change the logo, the menus, how it communicates, etc.), build the source code into apps and executable programs, set up your own server, distribute your apps, and message your friends through your own server.
You don’t need permission from any authority to do that, and as you have the full source code, both the server and the apps are under your control.
What is the advantage of this for people who don’t want to do any of that, or who don’t have the technical knowledge?
The advantage, in this case, is that even if we don’t set up our own server, as long as we can be sure that the app we download from the app store is an unmodified build of the original source code, we know exactly what is running on our device. So if a person or organization wants to add a backdoor, or code that transfers data to their server, they will have to do so publicly, on a platform like GitHub where anyone can see, examine, and object to any code change.
This is a great advantage in terms of privacy and security. Many incidents, including the recent SolarWinds security breach, have shown that when software is developed in a closed ecosystem, reactions to security-related issues may be slower or less complete than in open-source counterparts.
End-to-end encryption means that data is encrypted before it leaves your device and is only decrypted when it reaches the device you are messaging. So while in transit, your messages are encrypted and cannot be read.
The advantage of E2E encryption is that the server sitting between the two devices, transferring messages between them, can’t understand what’s going on. It has no choice but to blindly relay messages that look like utter gibberish, because it can’t decrypt them without the passwords (secret keys).
As a result, even if the company or authority that owns the server chooses to store this data, it can’t display, use, or sell the messages: because of the mathematical algorithms used, decrypting the data within a reasonable time is not possible, even with supercomputers.
So the real benefit of E2E encryption is that instead of trusting statements like “we do not use or store your data”, we can trust powerful mathematical algorithms. If we are sure that E2E encryption is used correctly, it doesn’t matter whether such a statement exists; the server cannot read our messages in any case.
But how can this be possible? At some point, we have to exchange passwords, right? Can’t the server intercept our messages to learn the passwords we send to each other, and simply decrypt the messages?
No, it can’t.
To prevent the server from doing that, we use a very clever mathematical method. I will try to explain it simply.
If the two devices simply agreed on a password at the beginning of a conversation and used it for all communication, the server could save that password at the start and easily decrypt the whole conversation with it. So what is preventing it from doing that?
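To make the risk concrete, here is a toy sketch of why the naive scheme fails. The XOR “cipher” below is deliberately weak and invented purely for illustration; the point is only that a server which relays the password in the clear can keep a copy and read everything.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with the repeating key; the same function
    # both encrypts and decrypts (toy cipher, illustration only).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# The two devices agree on one password at the start...
password = b"hunter2"
ciphertext = xor_cipher(b"meet at noon", password)

# ...but the server relayed that password, so it saved a copy
# and can decrypt every message that follows:
eavesdropped = xor_cipher(ciphertext, password)
print(eavesdropped)  # b'meet at noon'
```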
Public-key cryptography is precisely for preventing this from happening.
Public-key cryptography uses two passwords (“keys”) instead of one, called the “public key” and the “private key”. They are generated as a pair. During communication, your public key is used only for encrypting things; it cannot decrypt messages. So you can distribute this key freely, everywhere. You could literally post it on Facebook, or print it and tape it to your window. This is not a security issue; it is what the public key is for. If two people each encrypt a message with your public key, neither can decrypt what the other sent. Only you can decrypt both messages, using your private key.
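As a minimal illustration of this asymmetry, here is textbook RSA with tiny, insecure numbers (real keys are thousands of bits long; this is only a sketch of the underlying math):

```python
# Textbook RSA with toy numbers (insecure, illustration only).
p, q = 61, 53
n = p * q                  # 3233, part of both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent  -> public key is (e, n)
d = pow(e, -1, phi)        # 2753, private exponent -> private key is (d, n)

message = 65
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
plaintext = pow(ciphertext, d, n)  # only the private-key holder can decrypt
print(ciphertext, plaintext)       # 2790 65
```

Note that knowing `(e, n)` is not enough to undo the encryption; recovering `d` requires factoring `n`, which is easy for 3233 but infeasible for real key sizes.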
This mathematical breakthrough created quite a bit of turmoil when it was first discovered. It was first invented by a British cryptographer in 1970 but classified by the British government due to its military significance. Eight years later, it was published by three researchers in the USA who had discovered it independently, letting the world know about it for the first time.
It took the British government 27 years to declassify the earlier discovery. Meanwhile, in 1991, Phil Zimmermann wrote a computer program called PGP (Pretty Good Privacy) using the algorithm and published it on the newly popular internet.
He would later face serious charges of “unlicensed export of dangerous munitions” for this software. As a clever response, Zimmermann published the entire source code of his PGP program as a book through MIT Press. Distributing a computer program that could perform encryption stronger than 40 bits was considered a crime, but distributing information in print was protected by the First Amendment.
In a simplified way, public-key cryptography works like this:
Public and private key generation on your device.
At first, a pair of passwords (“keys”) is generated on your device, and another pair on the device you are communicating with.
After that, the devices send each other their public keys only; the private keys never leave the devices. The communication process happens like this: when you send a message, your device encrypts it with the recipient’s public key, the server relays the unreadable ciphertext, and the recipient’s device decrypts it with its private key.
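The exchange can be sketched with the same toy textbook RSA idea (tiny primes, illustration only; real protocols use far larger keys and additional steps):

```python
# Sketch of the key exchange with textbook RSA and toy numbers.
def make_keypair(p: int, q: int, e: int = 17):
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)      # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)    # (public key, private key)

def encrypt(message: int, public_key) -> int:
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key) -> int:
    d, n = private_key
    return pow(ciphertext, d, n)

# 1. Each device generates its own key pair; private keys never leave it.
alice_pub, alice_priv = make_keypair(61, 53)
bob_pub, bob_priv = make_keypair(89, 97)

# 2. Only the public keys travel through the server.
# 3. Alice encrypts for Bob with Bob's public key; only Bob can decrypt.
c = encrypt(42, bob_pub)
print(decrypt(c, bob_priv))  # 42 -- the server sees only c
```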
In real-world applications, a lot of other steps usually follow. The process can also differ depending on the algorithm used.
For example, to improve performance, the devices can use the newly established secure channel to exchange new keys and switch to an encryption algorithm that is faster. The algorithm used for the first steps might also differ. These differences are beyond the scope of this article; if you want to know more, you can check this article for the details of the Signal protocol.
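The hybrid idea mentioned above can be sketched as follows. The stream cipher here is a toy built from SHA-256 purely for illustration; real applications use vetted symmetric ciphers such as AES or ChaCha20.

```python
import hashlib
import secrets

def keystream_encrypt(data: bytes, key: bytes, nonce: bytes) -> bytes:
    # Toy stream cipher: XOR the data with a SHA-256-derived keystream.
    # Illustration only; real apps use vetted ciphers (AES, ChaCha20).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big"))
        stream += block.digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# The session key is shared once over the slow public-key channel;
# afterwards both sides use the fast symmetric cipher for bulk messages.
session_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
ciphertext = keystream_encrypt(b"fast bulk encryption", session_key, nonce)
plaintext = keystream_encrypt(ciphertext, session_key, nonce)  # same op decrypts
print(plaintext)  # b'fast bulk encryption'
```

The design point is that public-key operations are expensive, so they are used only to bootstrap a shared symmetric key, after which the cheap symmetric cipher carries the traffic.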
Even though the methods can differ, the key-exchange step we discussed is the first important step in establishing the privacy and security of our communication independently of what the server wants to do, and the main principle is common across many apps and methods.
What if the server tries another trick to be able to see our messages?
Let’s consider the following scenario: instead of forwarding each device’s real public key, the server sends each device a public key of its own. It can then decrypt every message with its private key, read it, and re-encrypt it with the real recipient’s public key before passing it along, a so-called man-in-the-middle attack.
The way to prevent this is to verify that the server transferred the right keys to the devices. In the Signal application, there is a feature for this called the safety number.
The safety number is a number generated from the public keys on your device. You can confirm that it matches on the other device by scanning a QR code, reading it aloud, or simply putting the two phones next to each other and comparing. Once you have done the safety number check with the person you talk to over the app, you can be sure that the server didn’t cheat and transferred your keys correctly.
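The core idea can be sketched like this: hash both public keys in a fixed order so that both devices compute the same fingerprint. Signal’s actual format is different (a 60-digit number derived per its specification); this is only a toy illustration of the principle.

```python
import hashlib

def safety_number(pub_a: bytes, pub_b: bytes) -> str:
    # Toy fingerprint: hash both public keys in a canonical (sorted) order
    # so both devices derive the same value regardless of who computes it.
    digest = hashlib.sha256(b"".join(sorted([pub_a, pub_b]))).hexdigest()
    digits = str(int(digest, 16))[:30]
    # Display in groups of five digits for easier human comparison.
    return " ".join(digits[i:i + 5] for i in range(0, 30, 5))

alice_view = safety_number(b"alice-public-key", b"bob-public-key")
bob_view = safety_number(b"bob-public-key", b"alice-public-key")
print(alice_view == bob_view)  # True: both devices show the same number
```

If a man in the middle swapped in its own keys, the two devices would hold different key material and the displayed numbers would no longer match.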
Safety number verification screen of the Signal app.
In conclusion, thanks to public-key cryptography and the more recent algorithms that followed in its footsteps, it is entirely possible to hold a conversation without the intrusion of an organization or a third party, regardless of their policies or statements, provided there is no backdoor or vulnerability in your device or its operating system.
We have talked about a messaging platform being open-source, and we also talked about it using end-to-end encryption. But what happens when those two come together? How do these two things interact?
I think that interaction is very important. If an application is closed source, it is significantly harder to be sure that it performs end-to-end encryption securely. We can only verify it through reverse engineering, trying to figure out what the app (for example, WhatsApp) actually does, because we have only the app, not the source code.
Since such an undertaking runs against the company’s interests, it has to be conducted by volunteers, at a large cost in time and effort, and it needs to be repeated for each new version of the app.
But the situation is completely different for an open-source app. For example, the specifics of how the Signal communication protocol works are well-documented for everyone to see. This means that anyone with the technical knowledge can not only set up their own version of the app and server, but also understand how it works, check for backdoors and security vulnerabilities, and contribute.
For these reasons, there is a significant difference, in terms of both privacy and security, between a company stating that it uses end-to-end encryption and an open-source messaging platform showing you that it does.
In this article, I tried to show that choosing based on statements declared by companies or organizations is not our only option, since we have much stronger tools at our disposal: strong mathematical algorithms and open-source software.
I believe that in the future, as we become more aware of these tools and methods, we will prefer them more and take control of our data and privacy. That way, we have a chance to slow down wide-scale profiling and analysis of our behavior. This is only possible if we understand the methods and solutions involved.