Bitcoin is a decentralized digital currency built on a system of trust underpinned by cryptography, in particular the properties of elliptic curves. Without delving too much into the mathematics, let’s see how the Bitcoin trust system is built. You don’t need profound knowledge of cryptography to grasp the key ideas in this post, but you are expected to have a basic understanding of private and public keys.
Each time you want to authorize a payment, you’re asked for some sort of verification. The purpose of the verification is that once you’ve authorized a deal under your name, it is irreversible. Of course, there are cases where you might want to add a clause that the payment has to go in the other direction if certain conditions are not met (escrow). But once a deal has been authorized with a signature, it is set in stone. And the only question is: can you verify with complete certainty that it was indeed authorized by the person whose signature it carries? Mechanisms for such verification keep evolving as security loopholes get discovered¹. The problem of identifying someone by their signature isn’t limited to payments. It’s an age-old problem. There was a time when expensive emblems were used for sealing letters as proof that the letter was indeed from a certain authority.
Sealed letters and application
The practice of sealing letters has been replaced by PGP keys. PGP keys are interesting because in some ways they are very similar to Bitcoin keys. Let’s look at PGP for a minute and later see how Bitcoin builds on that.
Sample PGP signed message
There are two ways you might want to use a PGP key:
1. For encrypting and decrypting emails. If anyone intercepts an email, they are not able to decipher what is written in it.
2. For signing messages. If anyone intercepts a message, they are able to verify its authenticity.
PGP signing flow diagram
We are more interested in the latter.
PGP effectively replaces the sealing device. The process works as follows. First, you generate a public key-private key pair. Then you write your email. When you are done, you run your private key and the email through a PGP signing program, and out comes a signature. This signature can only be produced by someone who holds the private key. How does one know for sure? Well, you take the message and the signature and run them through a PGP verification program, providing the public key of the sender. This program will tell you whether the email was indeed written by the person the sender claims to be. The only question that remains to be answered is: how do you get the public key? It’s an interesting problem because the cryptography itself doesn’t provide any mechanism for it. It’s left entirely in the hands of the users. Typically, in the case of PGP, users upload their public keys to some central server² so that anyone can fetch them.
Bitcoin uses a very similar process of public key cryptography. Instead of signing messages, however, the bitcoin program allows you to sign transactions.
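The sign-then-verify flow described above can be sketched with textbook RSA. This is a toy illustration only, not what a PGP program actually runs: the primes are tiny, there is no padding, and the names `sign`, `verify`, and `digest` are made up for this example.

```python
import hashlib

# Toy RSA key pair built from two small Mersenne primes (illustration only).
# Real keys use far larger primes and a padded signature scheme.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, private_exponent: int = D) -> int:
    """Only the holder of the private exponent can compute this value."""
    return pow(digest(message), private_exponent, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


email = b"Meet me at noon."
signature = sign(email)
print(verify(email, signature))                    # the original message
print(verify(b"Meet me at midnight.", signature))  # a tampered message
```

The verifier never needs the private exponent: it only needs the message, the signature, and the sender’s public key.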
Now that you understand how PGP signing works, you can start thinking about Bitcoin signatures. It’s a little tricky, however. The naive approach would be to create a transaction offering to deduct X bitcoins from your account and transfer them to Y, and then simply sign this proposal. But that has a problem. If you are just creating transactions stating that you want to transfer your assets, then who verifies that you do indeed have the money? It would require some central authority for verification. That’s not how Bitcoin is designed to work. The power ultimately rests in the hands of the users. Bitcoin proposes a nifty solution. Here’s the idea: it is only possible for you to have bitcoins if you received bitcoins at some point in the past. Hmm… aha! Make the public key serve the role of identity. This way, the only person who can claim a transaction is the one who holds the matching private key. And so, Bitcoin signatures serve a two-fold purpose:
1. They authorize the transfer of the coins to the destination you specified.
2. They prove that you own the coins you are spending.
The latter is fundamental. We will see in another post how this forms the basis for wallets (collections of accounts). By the way, Ethereum works slightly differently in that it introduces a standalone concept of accounts³. Bitcoin, by contrast, is simply a chain of transactions. So, to establish ownership, you have to scan the chain for transaction outputs addressed to you, and to spend them, you have to unlock them and use them as inputs in a new transaction.
Side Note: You might be wondering whether Ethereum has a better design because it allows for the creation of accounts. It really depends on the use-case. Simplicity in design is what allows the system to remain streamlined, and Bitcoin does achieve its purpose of being a store of value. Ethereum, on the other hand, enables more use-cases, but at the cost of being less streamlined and bulkier.
Each transaction is composed of inputs and outputs. Each input refers to an output of some previous transaction. And each output specifies the destination where the funds are being sent. When you sign a transaction, you sign both inputs and outputs. By signing the inputs, you are effectively saying that you are the true owner of those coins. And by signing the outputs, you are agreeing that you do indeed want to send those coins to the specified address. To keep the money flowing, each output of a transaction is used as an input to another transaction.
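To make the idea of scanning the chain concrete, here is a minimal sketch. The `Output` and `Transaction` classes, the owner strings, and the `balance` function are hypothetical simplifications; real transactions lock outputs with scripts rather than names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Output:
    owner: str   # stand-in for the public key the coins are locked to
    amount: int  # in satoshi


@dataclass(frozen=True)
class Transaction:
    txid: str
    inputs: Tuple[Tuple[str, int], ...]  # (previous txid, output index)
    outputs: Tuple[Output, ...]


def balance(chain: List[Transaction], owner: str) -> int:
    """Walk the chain in order and add up the owner's unspent outputs."""
    unspent: Dict[Tuple[str, int], Output] = {}
    for tx in chain:
        for reference in tx.inputs:
            unspent.pop(reference, None)          # this output is now spent
        for index, output in enumerate(tx.outputs):
            unspent[(tx.txid, index)] = output    # new spendable output
    return sum(o.amount for o in unspent.values() if o.owner == owner)


chain = [
    Transaction("tx0", (), (Output("alice", 50),)),
    Transaction("tx1", (("tx0", 0),), (Output("bob", 20), Output("alice", 30))),
]
print(balance(chain, "alice"), balance(chain, "bob"))
```

There is no stored account balance anywhere in this model: what you own is whatever the chain says has been sent to you and not yet spent.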
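Why sign both the inputs and the outputs? A rough sketch, assuming a made-up JSON encoding instead of Bitcoin’s actual signature hashing rules: the signature is computed over a digest of everything it is meant to protect, so changing a destination changes the digest and invalidates the signature. The function `signing_digest` and the field names are hypothetical.

```python
import hashlib
import json


def signing_digest(inputs, outputs) -> str:
    """Hash a canonical encoding of everything the signature should cover."""
    payload = json.dumps({"inputs": inputs, "outputs": outputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


inputs = [{"txid": "aa" * 32, "index": 0}]  # placeholder reference
outputs = [{"address": "recipient", "amount": 5_000_000}]

original = signing_digest(inputs, outputs)
redirected = signing_digest(
    inputs, [{"address": "attacker", "amount": 5_000_000}]
)
print(original != redirected)  # a redirected payment no longer matches
```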
outputs consumed as inputs
Enough theory. Let’s see a real transaction.
Heh, that’s the actual transaction that is transmitted over the wire. There is no way we can understand anything from it as is. How do we interpret it? One word: Protocol.
A protocol is a standard used to define a method of exchanging data over a computer network
If we look at the transaction protocol rules⁴, we can see how the transaction was constructed.
transaction protocol
transaction input protocol
transaction output protocol
Let’s break down the transaction according to the protocol.
Contains metadata information. For instance, it tells the program that this is a transaction message.
Allows for backward compatibility. If the protocol is updated with a new field, old transactions shouldn’t become unrecognizable.
This is a field that was introduced as part of the segregated witness change⁵. We will look at it some other time.
Transactions contain only references to the particular outputs of previous transactions that are being spent. Each reference is a combination of a transaction hash and an output index number (starting from 0). The idea here is that it’s simpler to just reference a previous transaction. If needed, the program can fetch the full transaction using its hash. And using the index number, it knows which output is to be used as an input.
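To calculate the transaction hash, you take the serialized transaction as raw bytes (that is, the HEX representation decoded back into bytes, not the HEX text itself) and compute the SHA256 of those bytes twice⁶. By convention, the resulting hash is displayed with its bytes in reverse order.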
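A minimal sketch of both ideas, the hash and the reference. The function names are my own, and the sketch assumes a transaction serialized without witness data (for segregated witness transactions, the hash is computed over the serialization that excludes the witness). Inside a transaction, the referenced hash is stored in the reverse of its displayed byte order, followed by the index as a 4-byte little-endian integer.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA256 applied twice, operating on raw bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def txid_from_hex(raw_tx_hex: str) -> str:
    """Hash the decoded bytes and show the result in display byte order."""
    raw_tx = bytes.fromhex(raw_tx_hex)
    return double_sha256(raw_tx)[::-1].hex()


def serialize_outpoint(txid: str, index: int) -> bytes:
    """Reference to a previous output: 32-byte hash + 4-byte index."""
    return bytes.fromhex(txid)[::-1] + struct.pack("<I", index)


# Placeholder bytes, not a real transaction.
example_txid = txid_from_hex("0100000000")
print(example_txid)
print(serialize_outpoint(example_txid, 1).hex())
```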
Outdated. See https://bitcoin.stackexchange.com/a/55113.
Amount (in satoshi) that is being sent.
The block number or timestamp at which this transaction is unlocked.
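How does a program tell a block number from a timestamp? In Bitcoin, a lock time below 500,000,000 is read as a block height, anything at or above it as a Unix timestamp, and 0 means the transaction is not locked at all. A small sketch (the function names are made up for this example):

```python
import struct
from datetime import datetime, timezone

LOCKTIME_THRESHOLD = 500_000_000  # below: block height, otherwise: Unix time


def parse_lock_time(raw: bytes) -> int:
    """The field is a 4-byte little-endian unsigned integer."""
    return struct.unpack("<I", raw)[0]


def describe_lock_time(lock_time: int) -> str:
    if lock_time == 0:
        return "not locked"
    if lock_time < LOCKTIME_THRESHOLD:
        return f"block height {lock_time}"
    when = datetime.fromtimestamp(lock_time, tz=timezone.utc)
    return f"timestamp {when.isoformat()}"


print(describe_lock_time(parse_lock_time(bytes.fromhex("00000000"))))
print(describe_lock_time(500_000))
print(describe_lock_time(1_500_000_000))
```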
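The destination of the transaction. Hidden in this script is the public key hash of the receiver, represented as hexadecimal. To get the corresponding Bitcoin address, prepend a version byte to this hash, append a four-byte checksum (the first four bytes of the double SHA256 of the versioned hash), and encode the result in Base58. This encoding is known as Base58Check.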
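Turning a hash into an address takes a little more than a plain Base58 conversion: the encoding known as Base58Check adds a version byte in front and a four-byte checksum at the end. A minimal sketch follows; the function name is my own, and the version byte 0x00 used here is the one for mainnet public key hash addresses.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    full = data + checksum

    number = int.from_bytes(full, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the character '1'.
    leading_zeros = len(full) - len(full.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# A public key hash of twenty zero bytes, used here only as a test value.
print(base58check(0x00, bytes(20)))
```

The checksum is what lets a wallet reject a mistyped address instead of sending coins to a destination nobody controls.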
Similar to the scriptPubKey, scriptSig contains the signature that authorizes the transaction. Every part except the signature itself is signed.
Now that we know the rules of encoding, let’s decode it: There is one transaction input, which is the output at index 0 of the transaction with hash 2936ee6a0db4e4901988503bb6e966128dd5fa01bcf08451f78a1d5b08dbbd6, and there are two outputs: one of 0.05 BTC addressed to 3SwtkZDEtSFxYjqzgdwTvPquLv64RVkTBzSThUkvXxRE4TH6TsGxnrG and the other of 33.54 BTC addressed to 3Q7MciDryx4D9PefDEQcKUQ2iUG4i4efb8Buho7GFLckepyTrnwkr4h.
You might have noticed that instead of calling them simply public key and signature, we call them public key script and signature script. This is a very powerful concept and we will explore it thoroughly some other time. But I will leave you with this idea: sometimes you want to create a programmable transaction so that it only executes when certain conditions are met. For instance, you might want to return an item you just bought if it turns out to be faulty. Or your manager wants to approve every transaction that you make. Script allows for interesting use-cases, and there is an ongoing effort to improve the script even further.
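The amounts are stored in satoshi, as 8-byte little-endian integers, with 100,000,000 satoshi to one bitcoin. A small sketch of the conversion for the two amounts above (helper names are my own):

```python
import struct
from decimal import Decimal

SATOSHI_PER_BTC = 100_000_000


def btc_to_satoshi(btc: str) -> int:
    """Use Decimal to avoid floating point rounding on money."""
    return int(Decimal(btc) * SATOSHI_PER_BTC)


def satoshi_to_btc(satoshi: int) -> Decimal:
    return Decimal(satoshi) / SATOSHI_PER_BTC


def encode_value(satoshi: int) -> bytes:
    """Amount as it appears in a serialized transaction output."""
    return struct.pack("<q", satoshi)


for amount in ("0.05", "33.54"):
    satoshi = btc_to_satoshi(amount)
    print(amount, "BTC =", satoshi, "satoshi =", encode_value(satoshi).hex())
```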
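To give a taste of what "script" means, here is a toy stack machine in the spirit of Bitcoin Script. It is not the real interpreter: it supports only data pushes and two operations, and the `evaluate` function is made up for this example. The public key script below locks the coins to the hash of a secret, and the signature script unlocks them by revealing the secret.

```python
import hashlib


def evaluate(script_sig, script_pubkey) -> bool:
    """Run the unlocking script, then the locking script, on one stack."""
    stack = []
    for item in list(script_sig) + list(script_pubkey):
        if isinstance(item, bytes):      # data is pushed onto the stack
            stack.append(item)
        elif item == "OP_SHA256":        # replace the top item with its hash
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif item == "OP_EQUAL":         # compare the two top items
            stack.append(b"\x01" if stack.pop() == stack.pop() else b"")
        else:
            raise ValueError(f"unsupported operation: {item}")
    # The spend is valid if a non-empty value is left on top.
    return bool(stack) and stack[-1] != b""


secret = b"open sesame"
lock = ["OP_SHA256", hashlib.sha256(secret).digest(), "OP_EQUAL"]

print(evaluate([secret], lock))
print(evaluate([b"wrong guess"], lock))
```

Real scripts usually combine such conditions with a signature check, so that knowing the condition alone is not enough to spend the coins.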
Transactions are a key component in Bitcoin’s decentralized peer-to-peer electronic cash system. Public key cryptography ensures that no one can take a transaction created by you and change it so that the bitcoins are addressed to a different destination than the one you intended. It also makes sure that nobody except the owner of the private key can access the funds addressed to them.
To recap: transactions are irreversible and indisputable. To consume the output of a previous transaction as an input to a new transaction, you need the private key. This allows you to sign the message. To send assets to someone, you need their public key. Transactions are one of the core features of the Bitcoin system.
But that is not all there is to the trust system. What if I create a transaction, broadcast it, and quickly create another transaction and broadcast that one as well? Who is to invalidate one of them? Generally speaking, who ensures that all transactions in the system are done fairly? And how? Bitcoin wouldn’t be considered decentralized if it needed an entity to check whether everything is fair and square. In the second part of this post, you will see exactly how it is possible to not have a central body and still ensure that everything runs smoothly.
[1] Latest one being Transaction malleability
[2] PGP FAQ: Public key servers
[3] Ethereum account management
[4] Bitcoin Transaction Protocol
[5] BIP: Segregated Witness
[6] How to compute transaction hash
Cryptocurrency still has a long way to go before it truly becomes Internet money. The foundation has been laid, but a lot of challenges lie ahead.