Blockchain technology is built on the foundation of certain cryptographic primitives. By primitives, we mean some of the underlying principles and tech techniques employed in the real-world implementation of blockchain. There are various such cryptographic primitives that are employed in blockchain, let's start with hash functions.
Hash functions take an arbitrarily long text as input and then create an output with a defined size: the arbitrarily sized string as input is also known as themessage, and it is turned into what is known as the fixed-size output, which is also known as the message digest.
Hash functions and their certain ways of building different data structures are used for connecting the so-known blocks in a chain very importantly in a tamper-proof manner and this essentially defines cryptographic hash functions.
A hash function is any cryptographic hash function. However, not all cryptographic hash functions are hash functions.
There are several types of cryptographic hash functions, and they generate outputs of different sizes for the types of applications we have in this blockchain domain. Typically, the size of the output of these hash functions will be 256 bits in length. Because such functions, which take an input message and create its digest, are used frequently as we build a blockchain, it is crucial that the generation of the message digest from the message be computationally efficient.
Other important features of cryptographic hash functions include:
Puzzle friendliness: Given x and y, where x is a message and y is the output of a hash function, we can find a k such that the hash of x concatenated with k returns a y, and this y must have certain properties. It is incredibly popular for solving what is known as a mining puzzle in one of the blockchains, such as bitcoin, and it is built on one essential pillar known as proof of work.
One-way functions: Given x, finding h(x) is fairly simple. As previously mentioned, it should be efficiently computable, which means that h(x) may be derived extremely efficiently from x. However, we cannot determine x given any h(x), hence this is a critical characteristic. This is one of the reasons why these are known as cryptographic hash functions, and it is critical that we always have this condition met when utilizing them in the context of blockchain.
Collisions may be rather simple to discover. The birthday paradox problem, for example, states that if a group of n randomly picked people is placed in a room, the likelihood that some of them share the same birthday becomes one when the number of people exceeds 366 if it's not a leap year or 367 if it is a leap year since at least two of them now have the same birthday.
Of course, this is self-evident, but it can be demonstrated that even with only 70 people, the probability of two people having the same birthday reaches a probability of 0.999, and even with only 23 people, the probability reaches a similar probability. It may appear that the number of people required to reach that probability is quite large, but this is not so: even a smaller number of people could reach that probability very quickly. Of course, the birthday paradox places an upper bound on collision resistance.
We are not discussing the birthday paradox in the sense that there are only 365 or 366 possible values, but rather a very large number of possible values of message digests, which is why we generate digests with very long lengths, such as 256 bits, and for a hash function that produces n bits of output, where n is say 256, it can be mathematically confirmed that an attacker must compute 2n/2 hash operations on a random input to obtain matching outputs with a probability greater than 0.98.
To put things in context, for a 256-bit hash function, such as the one used in blockchains, the attacker typically needs to compute 2128 hash operations, which is extremely time-consuming. Even if each hash can be computed in 1 microsecond, it will take close to 1025 years to find two such messages with the same hash value. So we may infer that if we have sufficiently long message-digest lengths, that is, if we use a hash function with digest lengths of say 256 bits, it is nearly impossible to come up with two separate messages that result in the same digest value.
**
**Another important application of hashes is as a message digest, which means that if h(x) and h(y) are the same, we can safely assume that x equals y, or in simpler terms, we can safely assume that the original messages are also the same and the digests effectively are unique representations of the original messages so we can remember just the hash value rather than the entire message. For a lengthy document with certain unique attributes, we do not need to remember the entire text in order to utilize it later, instead, we may construct a digest that can be used to remember this specific message. Because the hash function's output has a predetermined length, we can simply compare the digests to check if two original lengthy documents are identical.
Blockchain technology is built on the foundations of cryptographic hash functions, which are used to connect the blocks in a chain in a secure way. There are several types of cryptographic hash functions, which we will cover in the subsequent articles. We'll also look at their applications and how they may be used to build blockchains such as bitcoin and ethereum.