Gayan Samarakoon

@samarakoon.gayan

Cryptographic essence of Bitcoin part # 1: What is a Hash function?

What is a Hash?

Cryptographic hash functions are mathematical operations run on digital data. Bitcoin relies on SHA-256 as the underlying cryptographic hash function for most of its operations.

SHA (Secure Hash Algorithm) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA).

To put it in simple terms, a hash function is like a black box: you input any kind of digital information of any size, and the result (output) is a fixed-length string, usually written in hexadecimal (e.g.: 0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, which is the SHA-256 hash of an empty input). In the case of SHA-256, the output is 32 bytes. Such a function has two key characteristics:

1) Unequivocal (one-way): the hash (output) is like the fingerprint of the input data. From a human fingerprint, you can't recreate the human. In the same way, from the hash of a digital input, you can't recreate the original digital input.

2) Collision resistant: nobody should be able to find two different input values that result in the same hash output. Collisions must exist in theory, because there are more possible inputs than possible outputs, but finding one should be computationally infeasible. This makes it possible to use the function to check data integrity: by comparing the computed hash (the output of the algorithm) to a known and expected hash value, a person can determine whether the data is intact. For example, computing the hash of a downloaded file and comparing the result to a previously published hash can show whether the download has been modified or tampered with.
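
The two characteristics above can be tried out with the hashlib module from Python's standard library. This is only a minimal sketch: the helper name matches_published_hash and the sample data are made up for illustration.

```python
import hashlib

# SHA-256 of the empty input: this is the example value quoted earlier.
empty_digest = hashlib.sha256(b"").hexdigest()
print(empty_digest)

# The output is always 32 bytes, whatever the size of the input.
small = hashlib.sha256(b"a").digest()
large = hashlib.sha256(b"a" * 1_000_000).digest()
print(len(small), len(large))

# Changing a single character gives a completely different-looking hash.
print(hashlib.sha256(b"hello").hexdigest())
print(hashlib.sha256(b"hellp").hexdigest())


def matches_published_hash(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 of `data` equals the published hash."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Hypothetical "download": in practice `data` would be the bytes of the file
# and `published` would come from the site that distributes it.
data = b"contents of some downloaded file"
published = hashlib.sha256(data).hexdigest()
print(matches_published_hash(data, published))         # intact copy
print(matches_published_hash(data + b"!", published))  # modified copy
```

The last two lines show the integrity check: the untouched data matches the published hash, while the copy with one extra byte does not.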

“SHA-2 — Wikipedia.” https://en.wikipedia.org/wiki/SHA-2.

https://www.youtube.com/watch?v=b4b8ktEV4Bg

But why are hashes so important in blockchain?

That's because hashing is part of the mining process, which is the miners' responsibility: the miners pick up some transactions and, using them as part of the input, repeatedly compute a hash until they find one that qualifies them to add a new block to the chain.
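
The idea can be sketched as a toy proof-of-work loop. This is a simplified illustration, not Bitcoin's real block format: the transaction string, the way the nonce is appended, and the "leading zeros" rule are assumptions made for the example (Bitcoin compares the hash of a structured block header against a numeric target). The double application of SHA-256 does mirror what Bitcoin does.

```python
import hashlib


def mine(block_data: bytes, difficulty: int) -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = block_data + str(nonce).encode()
        # Hash twice with SHA-256, as Bitcoin does; the layout of
        # `candidate` itself is invented for this toy example.
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


transactions = b"alice->bob:1;bob->carol:2"  # hypothetical transaction data
nonce, digest = mine(transactions, difficulty=3)
print(nonce, digest)
```

Because a hash cannot be predicted from its input, the only way to find a suitable nonce is to keep trying, and each extra required zero makes the search about 16 times longer. Anyone can verify the result with a single hash computation.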

Cryptographic hash algorithms

There are many cryptographic hash algorithms. Listed below are a few algorithms that are referenced relatively often. A more extensive list can be found on the page containing a comparison of cryptographic hash functions.

MD5

MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function, MD4, and was specified in 1992 as RFC 1321. Collisions against MD5 can be calculated within seconds, which makes the algorithm unsuitable for most use cases where a cryptographic hash is required. MD5 produces a digest of 128 bits (16 bytes).

SHA-1

SHA-1 was developed as part of the U.S. Government's Capstone project. The original specification of the algorithm, now commonly called SHA-0, was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by the U.S. government standards agency NIST (National Institute of Standards and Technology). It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. Collisions against the full SHA-1 algorithm can be produced using the SHAttered attack, and the hash function should be considered broken. SHA-1 produces a hash digest of 160 bits (20 bytes).

Documents may refer to SHA-1 as just "SHA", even though this may conflict with the other Secure Hash Algorithms such as SHA-0, SHA-2 and SHA-3.

RIPEMD-160

RIPEMD (RACE Integrity Primitives Evaluation Message Digest) is a family of cryptographic hash functions developed in Leuven, Belgium, by Hans Dobbertin, Antoon Bosselaers and Bart Preneel at the COSIC research group at the Katholieke Universiteit Leuven, and first published in 1996. RIPEMD was based upon the design principles used in MD4, and is similar in performance to the more popular SHA-1. RIPEMD-160, however, has not been broken. As the name implies, RIPEMD-160 produces a hash digest of 160 bits (20 bytes).

Whirlpool

In computer science and cryptography, Whirlpool is a cryptographic hash function. It was designed by Vincent Rijmen and Paulo S. L. M. Barreto, who first described it in 2000. Whirlpool is based on a substantially modified version of the Advanced Encryption Standard (AES). Whirlpool produces a hash digest of 512 bits (64 bytes).

SHA-2

SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA), first published in 2001. They are built using the Merkle–Damgård structure from a one-way compression function, which is itself built using the Davies–Meyer structure from a specialized block cipher.

SHA-2 basically consists of two hash algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different starting values and truncated output. SHA-384 and the lesser known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and is commonly faster than SHA-256 on 64-bit machines such as AMD64.

The output size in bits is given by the extension to the “SHA” name, so SHA-224 has an output size of 224 bits (28 bytes), SHA-256 produces 32 bytes, SHA-384 produces 48 bytes and finally, SHA-512 produces 64 bytes.
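
As a quick check of the sizes listed above, here is a short sketch using Python's standard hashlib module. The variable names and the test message are arbitrary choices for the example.

```python
import hashlib

# digest_size is reported in bytes; multiplied by 8 it gives the number
# of bits that appears in the algorithm name.
sizes = {}
for name in ("sha224", "sha256", "sha384", "sha512"):
    sizes[name] = hashlib.new(name).digest_size
    print(name, sizes[name], "bytes =", sizes[name] * 8, "bits")

# SHA-224 uses different starting values than SHA-256, so its digest is
# not simply the first 28 bytes of the SHA-256 digest of the same input.
msg = b"abc"
sha224_digest = hashlib.sha224(msg).digest()
sha256_digest = hashlib.sha256(msg).digest()
print(sha224_digest == sha256_digest[:28])
```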

SHA-3

SHA-3 (Secure Hash Algorithm 3) was released by NIST on August 5, 2015. SHA-3 is a subset of the broader cryptographic primitive family Keccak. The Keccak algorithm is the work of Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. Keccak is based on a sponge construction, which can also be used to build other cryptographic primitives such as a stream cipher. SHA-3 provides the same output sizes as SHA-2: 224, 256, 384 and 512 bits.

Configurable output sizes can also be obtained using the SHAKE-128 and SHAKE-256 functions. Here the -128 and -256 extensions to the name imply the security strength of the function rather than the output size in bits.
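
The difference between the fixed-size SHA-3 functions and the SHAKE functions can be seen with a small sketch based on Python's standard hashlib module. The message and the chosen lengths are arbitrary.

```python
import hashlib

msg = b"hello"

# Fixed-size SHA-3: the number in the name is the output size in bits.
fixed_size = hashlib.sha3_256(msg).digest_size  # reported in bytes
print(fixed_size)

# SHAKE: the caller picks the output length (in bytes) when reading the digest.
short_out = hashlib.shake_128(msg).digest(16)
long_out = hashlib.shake_128(msg).digest(64)
print(len(short_out), len(long_out))

# A longer SHAKE output simply continues the shorter one.
print(long_out[:16] == short_out)
```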

BLAKE2

An improved version of BLAKE called BLAKE2 was announced on December 21, 2012. It was created by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O'Hearn, and Christian Winnerlein with the goal of replacing the widely used but broken MD5 and SHA-1 algorithms. When run on 64-bit x64 and ARM architectures, BLAKE2b is faster than SHA-3, SHA-2, SHA-1, and MD5. Although neither BLAKE nor BLAKE2 has been standardized the way SHA-3 has, BLAKE2 has been used in many protocols, including the Argon2 password hash, for the high efficiency that it offers on modern CPUs. As BLAKE was a candidate for SHA-3, BLAKE and BLAKE2 both offer the same output sizes as SHA-3, including a configurable output size.
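
BLAKE2's configurable output size can be illustrated with Python's standard hashlib module, which includes blake2b and blake2s. This is a small sketch; the message and the 20-byte length are arbitrary choices.

```python
import hashlib

msg = b"hello"

# Default digest sizes: 64 bytes for BLAKE2b, 32 bytes for BLAKE2s.
default_b = hashlib.blake2b(msg)
default_s = hashlib.blake2s(msg)
print(default_b.digest_size, default_s.digest_size)

# digest_size selects the output length in bytes (1 to 64 for BLAKE2b).
h20 = hashlib.blake2b(msg, digest_size=20)
print(h20.hexdigest())

# The requested size is mixed into the hash parameters, so a 20-byte
# digest is not just a truncated 64-byte digest.
print(h20.digest() == default_b.digest()[:20])
```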
