The Role of Probability in The Safety of Your Cryptocurrencies

Written by elagai | Published 2021/04/11
Tech Story Tags: decentralized-internet | probability | cryptocurrency | bitcoin | safety | seed | freeton | hackernoon-top-story

TLDR The Role of Probability in The Safety of Your Cryptocurrency is based on probability. We will examine the role of probability in security using the example of bitcoin. Bitcoin is the main representative of cryptocurrencies and, one might say, at the moment, the only field for activities, which will be discussed below. One day this information may also become relevant for Free TON. Although there are only 23 people in 23 people are only in a group of people, the probability that at least two people have the same birthday is more than 50%.via the TL;DR App

This article is devoted to security from the point of view of probability. Losing keys from wallets, forgetting learned mnemonic phrases, or giving to a bad boy 12 words of your seed-phrase under a life threat are not the subject of this article.
Cryptocurrencies are based on probability. Does this mean that you install a bitcoin wallet and one day you will find there 100 BTC? Is that possible? Yes, it`s possible, but it is extremely unlikely to happen. Rather, all the stars will blow in the universe than such an event will happen. 
However, this fact does not stop enthusiasts – “treasure hunters” who are purposefully trying to guess the keys to old addresses (and not only to old), which contain the coins mined in the early years of Bitcoin’s existence. We will examine the role of probability in security using the example of bitcoin, as the main representative of cryptocurrencies and, one might say, at the moment, the only field for activities, which will be discussed below. One day this information may also become relevant for Free TON.
So what is this activity based on? Not only on the fact that there is a possibility to find exactly the right key to the address but also on the existence of collisions. Hash collisions are the situations where two different source datasets produce the same hash (hash-value).
The fact that collisions exist can be easily proved, using the example of bitcoin. The initial data for obtaining the public key (which was used as an address for the first time after the launch of bitcoin) is a private key.  The address is obtained from the public key by sequential hashing. In this case, the space of private keys has a size of 2^256 (256-bit), but the address space is already only 2^160 (160-bit). If we divide the first by the second, we get the potential number of collisions 2^96.
However, things are not so bad, because many collisions are not in the private space of the keys at all. But there is some chance of collisions (very negligible). By the way, there is also the possibility that some addresses do not have a private key at all for the same reason (the set of initial data that could have been a private key is not in the valid keyspace).
Therefore, an important property of the hash function (in particular for use in the field of cryptocurrencies) is resistance to collisions, which are divided into two types. (That the hash function must be irreversible for cryptographic purposes is taken for granted in this case and will not even be mentioned further.)
If it is practically impossible to select a data set with the same hash as for a given dataset, then this is called collision resistance of the first kind.
If it is almost impossible to find a couple of different datasets with the same hashes, then this is called collision resistance of the second kind.
It seems like they are the same, but there is a significant difference between collisions of the first and second kind. The complexity of picking a collision of the first kind with a probability of 50% (as a chance when tossing a coin to get, for example, heads) is:
2^(n-1) (n is the number of bits)  attempts when enumerating data, while the complexity of picking a collision of the second kind is much lower, and to achieve the same probability of 50% is 2^(n/2). Based on this, for cryptography, the hash function is considered to be brute-force resistant according to the criterion for collisions of the second kind, that is, with a complexity of 2^(n/2) attempts.
Why is there such a difference in the complexity of finding collisions of the first and second kind? This is a direct consequence of a “birthday attack” method based on the birthday paradox. Briefly about the birthday paradox: in a group of 23 people or more, the probability that at least two people have the same birthday is more than 50%. Although there are only 23 people, and there are 365 days a year (or 366 in a leap year), the number of possible pairs is 23!/2!*(23-2)! (according to the formula of combinations), which is 253. For a larger number of people in the group, the probability of such pairs is even higher.
Now please take a look  here. As you can see, there are a lot of pages with private keys to bitcoin addresses. Treasure hunting is ongoing, for example, right now when you’re reading this article, at a rate of 283 Mkey/s (million keys per second). This is the activity of the Large Bitcoin Collider (LBC). By the way, 283 million keys per second are not enough, it’s just the performance of a couple of top-end Nvidia cards.
For the first time something like these pages I happened to see when at the end of 2013 in the chat of the BTC-E exchange there were screams (if I can put it this way in relation to the chat) “Bitcoin got hacked!” As a result of the panic that arose for a short time, the rate fell. However, pretty soon the turmoil ended, since all these pages were no more than tiny parts  of the large amount of information and practically did not pose a threat to the security of wallets. Although I must admit that some balances were found at those addresses. However, they immediately came to the author of the pages. He used pages to have visitors check balances instead of a parser. Cleverly thought out, but extremely unproductive.
The LBC also boasts some of the trophies they have earned, a list of which can be found here. In addition to LBC, there are many enthusiasts who are engaged in treasure hunting on an individual basis. For this, they have special tools. Among the well-known programs for this, we can first of all mention the very old Vanitygen, which was actually created to generate beautiful addresses (for example, to put your name at the beginning of the address), and not to attack address lists. A more advanced program VanitySearch is now well known, but it is also not created for the purpose of brute-force attacks although can be used for this. For completeness, we can also mention the Plutus program – it’s about the same as VanitySearch.
In general, the scheme of their work looks like this: the list of addresses selected by the target of the attack is loaded into the program, and the program starts searching from a random point. Also, this process generates many keys to empty addresses (empty for that moment, but perhaps in the future, they can be used), which can be stored in databases (or lists), similar to what can be seen at the large bitcoin collider (LBC uses its own program, vanity does not create lists). Of the well-known programs, it is also worth noting BitCrack, which, unlike the above, has a more thorough approach to attacking address arrays, since it was created for this. Its features include setting a starting point, setting a range in the space of keys, brute-force options with arbitrary alternation (after 1, 2, 7, or a million, whatever you like), saving the result of the work for later continuing the brute-force.
By the way, it is recommended to store (for a long time) cryptocurrency not only in cold wallets but in the case of bitcoin also on addresses that do not have outgoing transactions (outputs). For the reason of not disclosing your public key. Some wallets even have warnings that disclosing a public key compromises security. Why? It is believed that a public key can provide a trail (or hint) by which you can roughly determine the range in the keyspace where the private can be located. And assuming the range can significantly narrow the search space for BitCrack in particular.
The above programs actually search for collisions of the first type. And there are also programs for finding collisions of the second kind. As we already know, the probability of finding such collisions is much higher. For example, an interesting program Kangaroo uses the “kangaroo” or lambda-algorithm for searching (the name comes from the fact that the graph of the approximation to the result is similar to the letter lambda). There is also a "baby step giant step" algorithm. A detailed immersion in these algorithms is not the subject of this article, since all of the above programs and algorithms are given here only so that the reader knows that the safety of his funds is always “tested for strength” by someone.
If someone is interested in this topic due to sporting interest or excitement and wants to “try”, some factors should be taken into account. Theoretically, there are two ways to achieve success in this field:
1. Create a quantum computer capable of sorting through huge amounts of data and finding a collision within a reasonable time.
2. Solve one of the seven problems of the millennium, namely the problem of equality of the classes P and NP in the theory of algorithms (while receiving a million dollars as a prize), and solve in the form P = NP, which will immediately entail the collapse of cryptocurrencies in the form they exist.
By the way, all cryptographic systems are built on this statement (P is not equal to NP). In a simple way, when applied to cryptocurrencies, one can interpret this problem like this: is there an algorithm that allows you to find the key to the address relatively quickly (not by brute force)?
Is it funny? Here’s a small historical example:
It is known that the number of primes not exceeding a given x is determined by two approximate formulas: x/ln x (ratio of x to the natural logarithm of x) or Li x (integral logarithm of x from 2 to x), while the first formula underestimates the real value, and the second (albeit more accurate) overestimates. That is, x/ ln x is always less than Li x. This is exactly what Ramanujan (the famous mathematician) believed. However, it has been proven that this is not the case. In the vicinity of some very large number (Skewes number), the value of the first formula exceeds the value of the second. And even if Ramanujan could be wrong, until the problem of equality of the classes P and NP is not solved, it cannot be unambiguously asserted that the classes are not equal (although it is worth noting that some assertions can neither be proved nor disproved at all).
And in addition to the two theoretical paths mentioned above, there remains the only practical method for the gambling key seeker. For this, the power of the equipment does not matter at all, be it at least one old Pentium, at least a bunch of farms on top-end video cards. It is not the video card that matters, but the astral birth chart, or the result of calculations according to numerology, showing whether there is luck in fate, before starting to search for treasures. Here you can laugh, but what else remains?
Having familiarized yourself with the types of “risks” for owners of cryptocurrencies, described in the article above, you can finally touch on a topic that is direct of practical importance for Free TON users, but not just them. And this topic is about seed security. And consider it precisely from the point of view of probability, and how some users, without realizing it, can reduce the security of their wallets, at the same time thinking that on the contrary, they increase it.
So, the seed phrase in our case includes 12 words of a special dictionary, which consists of 2048 words in total. Thus, each of the positions of the phrase can take 2048 values, which corresponds to 2048^12 combinations. If we reduce the base of the degree to powers of two, we get 2^11, then the number of options is 2^132, which means 132-bit space of seed-phrases. This is a very high degree of security, but it seems to some people that it is not reliable enough. And then, for example, such a trick is invented: divide the phrase into two parts and hide the pieces of paper in different places. The simplest inference leads to the fact that the chance of finding one piece of paper out of two is higher than the chance of finding one piece of paper out of one. Most likely, the phrase is divided in half and then the finder knows six words and their order. Moreover, it does not matter the first half of the phrase or the second, since he will check both options and this will delay his search (specifically 2 times), but it is not so critical regarding the entire amount of data for the search. For simplicity, suppose he knows for sure that he has found the first part of the phrase, then all he has to do is go through less than 2048^6 combinations (or 2^66) to know the phrase 100%. Security dropped from 132 bits to 66 bits. Yes, this is a very long time, but on powerful hardware, a threat to the security of the wallet takes place. It will take more time not to brute force itself, but check the balances of generated addresses.
Another common trick is to rearrange words in places in a phrase and assume that this trick will significantly prevent the wallet from emptying, even if someone finds a piece of paper. In fact, no, because when the values of all 12 positions are known, but the order is unknown, then the total number of options is only 12! (number of permutations) namely 479001600 – a ridiculous number for cryptography.
Of course, we will not consider here any fantastic tricks like the words from the seed phrase tattooed on the ass or on the heel that someone saw in the sauna. Since the imagination of people is huge, then nothing should be invented in addition to what is laid down by the creators. You should only follow the recommendations that say that the mnemonic phrase should be stored in a safe place, written on paper, in the sense that the storage should not be connected in any way with electronic media – not on a flash drive, not on a disk and, moreover, not on a computer with access to the network.
Do I keep multiple copies of the phrase? On the one hand, this increases the likelihood that someone will find one of the copies, on the other hand, it seems to protect the owner from loss and information by himself. It would be more correct to say that not to store, since the keys should not be scattered about, even for the purpose of your own safety net. 
Cryptocurrency is a personal responsibility. Even if the owner does not store the phrase anywhere, but has learned it by heart, this does not guarantee absolute safety, since there is a chance that he will one day get hit on the head and lose his memory. In cryptocurrencies, there are no opportunities like a bank account, where you can protest transfers, block a card, restore access, and so on. At the same time, no one can deprive you of your cryptocurrency, unlike a seized bank account. Therefore, along with high security and full power over funds, there is an element of risk of complete loss of access to the wallet, as the flip side of security.
That’s it. Keep your mnemonic phrases in a safe place. And don’t lose them.

Written by elagai | Cryptocurrency | DeFi | Yield farming
Published by HackerNoon on 2021/04/11