The graph shouldn’t look like this… or should it?
The nonce ("number used once") in Bitcoin is a 32-bit field in the block header that miners vary in order to produce a valid hash for a block. The nonce is not random in itself: it is simply a value the miner is free to choose. Together with the other header fields, it is used to create a hash that must be less than or equal to the target defined by the network.
The simple formula for calculating the block hash is as follows:
Block hash = SHA-256(SHA-256(block header))
In other words:
Block hash = SHA-256(SHA-256(version + previous block hash + Merkle Root + timestamp + difficulty or bits + nonce))
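To make the formula concrete, here is a minimal sketch in Python that rebuilds the 80-byte header of the genesis block (block 0) and hashes it twice. Note that the "+" in the formula means byte concatenation, not addition, and that the fields are serialized in little-endian order. The function name `block_hash` and its parameter names are our own choices for illustration, not part of any library.

```python
import hashlib
import struct

def block_hash(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Double SHA-256 of an 80-byte header, returned in the usual display order."""
    header = (
        struct.pack("<i", version)                 # 4 bytes, little-endian
        + bytes.fromhex(prev_hash_hex)[::-1]       # 32 bytes, byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]     # 32 bytes, byte-reversed
        + struct.pack("<III", timestamp, bits, nonce)  # 3 x 4 bytes
    )
    assert len(header) == 80
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()  # explorers show the digest byte-reversed

# Header fields of the genesis block (block 0).
genesis = block_hash(
    version=1,
    prev_hash_hex="00" * 32,
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(genesis)
```

Changing any single field, including the nonce, and re-running the function gives a completely different hash.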
The nonce is the value that miners adjust in each attempt to find a hash that meets the network’s target. The possible range for the nonce is from 0 to 4,294,967,295. This process is repeated until the generated hash is valid, allowing the block to be added to the blockchain.
The formula for calculating the block hash is computationally simple: the data is concatenated and the SHA-256 algorithm is applied twice. However, for the resulting hash to be valid, it must meet a specific requirement: read as a number, it must not exceed the target set by the network's difficulty, which in practice means that it starts with a certain number of zeros. If the inputs to the function remain constant, the resulting hash will always be the same. This is where the nonce comes into play: it is a value that is continuously adjusted to modify the hash outcome.
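The validity check can be sketched as follows. The "bits" field of the header stores the target in a compact form (one exponent byte and a three-byte mantissa); expanding it gives the 256-bit number the hash is compared against. The helper names `bits_to_target` and `meets_target` are ours, and the sketch ignores the sign bit and overflow corner cases of the compact encoding.

```python
def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def meets_target(block_hash_hex: str, bits: int) -> bool:
    """A hash is valid when, read as a number, it does not exceed the target."""
    return int(block_hash_hex, 16) <= bits_to_target(bits)

# The genesis block used bits = 0x1d00ffff.
target = bits_to_target(0x1D00FFFF)
print(f"{target:064x}")

genesis_hash = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"
print(meets_target(genesis_hash, 0x1D00FFFF))
```

The leading zeros of a valid hash are a consequence of this numeric comparison: the smaller the target, the more zeros a hash needs at the front to stay below it.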
This process relies on what is known as the "avalanche effect": even the slightest change in the data (such as altering the nonce) completely alters the hash. Miners test different nonce values until they find a hash that meets the difficulty level required by the network. This ensures that the proof of work is challenging and that the final block meets the established security requirements.
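A quick way to see this effect is to hash the same data with two consecutive nonce values and count how many of the 256 output bits differ. The sketch below uses 76 zero bytes as a hypothetical stand-in for the rest of the header; on average, about half of the bits should flip.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")

PREFIX = b"\x00" * 76  # hypothetical stand-in for the first 76 header bytes

def average_flipped_bits(pairs: int = 200) -> float:
    """Average number of differing bits between nonce n and nonce n + 1."""
    total = 0
    for nonce in range(pairs):
        h_a = double_sha256(PREFIX + nonce.to_bytes(4, "little"))
        h_b = double_sha256(PREFIX + (nonce + 1).to_bytes(4, "little"))
        total += differing_bits(h_a, h_b)
    return total / pairs

print(average_flipped_bits())  # expected to be close to 128 out of 256
```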
An important point is that, in addition to the nonce, the timestamp is another dynamic value that also affects the hash. The timestamp records approximately when the block is being mined; the miner sets it and updates it as time passes. If the search takes too long, the timestamp is updated, which changes the header and "forces" the miner to start over, testing nonce values again to find a valid hash. The rest of the block's data, such as the list of transactions and therefore the Merkle root, may also change, though this is less likely over short periods (e.g., every second) unless new transactions are added or network changes occur.
If you roll a die, the probability of landing on a specific number, like 4, is 1/6. Even if you roll the die many times and it always lands on 4, the probability of landing on 4 on the next roll remains 1/6... unless the die is rigged 😉.
Similarly, in Bitcoin mining, miners use brute force to find a nonce that causes the block hash to meet the difficulty target (e.g., the hash must start with 18 zeros). To do this, they test nonce values sequentially: first 0, then 1, then 2, and so on. The valid hash could be found with nonce number 3,245,231, or even with nonce number 3.
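The brute-force search can be sketched in a few lines. This is a toy version under stated assumptions: the header prefix is 76 zero bytes rather than real block data, and `TOY_TARGET` is a made-up, very easy target (roughly one hash in 4,096 qualifies) so that the loop finishes quickly in plain Python.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32 - 1):
    """Try nonces 0, 1, 2, ... and return the first one whose hash is <= target."""
    for nonce in range(max_nonce + 1):
        digest = double_sha256(header_prefix + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest[::-1].hex()
    return None  # nonce space exhausted without a valid hash

PREFIX = b"\x00" * 76   # hypothetical header without the nonce field
TOY_TARGET = 1 << 244   # hypothetical easy target, far above any real one

result = mine(PREFIX, TOY_TARGET, max_nonce=1_000_000)
print(result)
```

Nothing in the loop tells us in advance where the winning nonce will be; with a different prefix, the first valid nonce lands somewhere else entirely.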
In large mining operations, such as farms or pools, the search space is divided among multiple miners so that no two of them repeat the same work. Conceptually, one miner might handle the range of nonces from 0 to 1,000,000, while another works on the range from 1,000,001 to 2,000,000. (In practice, modern hardware can run through the full 32-bit nonce range very quickly, so pools typically also give each miner different block data, for example by varying extra data in the coinbase transaction.) This helps optimize resources and increases the efficiency of finding a valid hash.
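The idea of assigning non-overlapping nonce segments can be sketched as below. The function `split_nonce_space` is hypothetical and only illustrates the partitioning; it is not how any particular pool software is implemented.

```python
def split_nonce_space(workers: int, total: int = 2**32):
    """Split [0, total - 1] into contiguous, non-overlapping inclusive ranges."""
    base, extra = divmod(total, workers)
    ranges, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)  # spread any remainder evenly
        ranges.append((start, start + size - 1))
        start += size
    return ranges

for first, last in split_nonce_space(4):
    print(f"{first:>13,} to {last:>13,}")
```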
Whether it’s individual or collective mining, the search for this number remains random and unpredictable. But... has it always been like this? Will it always be? Or can the die be "rigged"? 😉
As we mentioned earlier with the die example, the probability of getting any given face on a single roll is 16.67% (1/6), regardless of previous rolls. As we accumulate more rolls, the observed frequencies should tend toward this theoretical value.
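This convergence is easy to check with a small simulation. The sketch below uses Python's pseudo-random generator with a fixed seed (an arbitrary choice, only so that the run is repeatable) and reports the largest gap between an observed frequency and 1/6.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls: int, seed: int = 0) -> dict:
    """Simulate n_rolls of a fair die and return the observed frequency per face."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

def worst_deviation(freqs: dict) -> float:
    """Largest absolute gap between an observed frequency and 1/6."""
    return max(abs(f - 1 / 6) for f in freqs.values())

for n in (60, 6_000, 600_000):
    print(n, round(worst_deviation(roll_frequencies(n)), 4))
```

With only 60 rolls the frequencies can be far from 16.67%; with hundreds of thousands they sit very close to it.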
Does the same happen here with the nonce values? To analyze this, we divided the range of nonce values into 16 equal parts. Following the same reasoning as with the dice, the theoretical probability of any nonce falling within one of these ranges is 6.25% (i.e., 1/16).
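The binning can be sketched as follows: each of the 16 ranges covers 2^32 / 16 = 268,435,456 consecutive nonce values. The function names and the short `sample` list are ours and purely illustrative (the list is not real data from the analysis, apart from the genesis block nonce used as a known value).

```python
from collections import Counter

N_RANGES = 16
WIDTH = 2**32 // N_RANGES  # 268,435,456 nonce values per range

def range_index(nonce: int) -> int:
    """Return the 1-based range (01..16) that a nonce falls into."""
    return nonce // WIDTH + 1

def range_shares(nonces) -> dict:
    """Percentage of nonces per range, to compare with the theoretical 6.25%."""
    counts = Counter(range_index(n) for n in nonces)
    return {r: 100 * counts[r] / len(nonces) for r in range(1, N_RANGES + 1)}

# Hypothetical sample; 2083236893 is the nonce of the genesis block.
sample = [3, 3_245_231, 2083236893, 4_294_967_295]
for r, pct in range_shares(sample).items():
    print(f"Range {r:02d}: {pct:5.2f}%")
```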
However, when observing the data, we notice some deviations from this expected probability. To make the analysis easier, cases where the percentage exceeds 6.25% are marked in green, and cases where it is equal to or below this value are marked in red, covering the first block through block 867,366:
In the provided tables, specific patterns can be observed at a glance through the use of color. The observed share of each range should stay close to 6.25% if there are no biases in the generation of nonce values. However, the data shows significant variations, especially in the early ranges, such as Range 01, where the observed frequency consistently exceeded the theoretical value, peaking at 49.62% in 2010. This indicates a clear preference for low nonce values in the early years.
This suggests that, in the beginning, the generation of nonce values was not as random as expected, possibly due to less sophisticated mining methods. As mining technology advanced (both in quantity and quality), this overconcentration in low ranges decreased, aligning more with the theoretical probability of 6.25% in the following years.
Next, these results and their potential implications will be examined.
Analyzing the annual evolution, the following points can be highlighted:
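In this analysis, we are considering only two possible ranges for the nonces: the lower half of the nonce space (0 to 2,147,483,647) and the upper half (2,147,483,648 to 4,294,967,295).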
The theoretical probability for each range is 50%, as both cover half of the possible nonce value space.
It is evident that, in addition to the earlier periods (as analyzed previously), in the last 7 years there has been a persistent trend towards the appearance of the "winning" nonce in the first range, i.e., within the first 2,147,483,648 values. This is particularly notable because the theoretical probability for each range is 50%, yet during this period the lower range has consistently surpassed that mark.
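One simple way to judge whether a deviation from 50% is larger than chance alone would explain is a z-score under the binomial model: with n blocks and k winning nonces in the lower half, z = (k - n/2) / sqrt(n/4). This is our own illustrative check, not the method used to build the tables, and the counts in the example are made up.

```python
import math

HALF = 2**31  # 2,147,483,648: first nonce value of the upper half

def lower_half_stats(nonces):
    """Return (share in the lower half, z-score against a fair 50/50 split)."""
    n = len(nonces)
    k = sum(1 for x in nonces if x < HALF)
    z = (k - n * 0.5) / math.sqrt(n * 0.25)
    return k / n, z

# Hypothetical example: 55 of 100 winning nonces fall in the lower half.
share, z = lower_half_stats([0] * 55 + [HALF] * 45)
print(share, z)
```

As a rule of thumb, values of |z| well above 2 or 3, sustained over thousands of blocks, are hard to attribute to luck.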
Looking at the bigger picture, the following question arises: Are there specific strategies that contribute to this over-concentration in the lower range, despite the process theoretically being completely random and fair for both ranges?
The observed distribution of nonce values provides important strategic insights for miners. The fact that lower ranges have historically shown a higher frequency of use suggests several key points:
The recent data from 2023 and projections for 2024 show a slight stabilization, although significant fluctuations in the lower ranges are still observed. This could indicate:
The analysis of nonce values in Bitcoin reveals that, although the mining process should be random, there are significant patterns, particularly in the early years and in the lower ranges of values. This bias towards lower nonces can be attributed to the lack of sophistication in early mining algorithms, but it also suggests the possibility that miners are using strategies to optimize the search for valid blocks, which may be reducing the expected randomness.
The recent trend shows slight stabilization, although fluctuations persist, especially in the lower ranges. This could result from advances in mining hardware and software, allowing further optimization of the process, but it may also contribute to the centralization of mining power. If these patterns continue, there is a risk that mining could concentrate further in a few actors, potentially compromising the decentralization of the network.
Ultimately, the observed behavior could be interpreted as a natural adaptation by miners to a competitive environment, but it raises the question of whether it is possible to "rig the die" to gain an advantage.
Source: Data obtained from my own Bitcoin validator node.