Patterns exist everywhere. From art to mathematics, from music to biology, patterns surround almost every aspect of our lives and culture. This is no coincidence, for the human mind excels in the creation and recognition of patterns. Recently, a particular family of patterns grabbed my attention: Cellular Automata. Popularized by Dr. Stephen Wolfram (the creator of Mathematica and Wolfram Alpha), Cellular Automata (CA) are patterns generated by a well-defined set of rules, yet are capable of exhibiting complex, and sometimes even random, behavior.
At first, I saw these patterns simply as interesting images to look at. However, as I read more about them and studied their behavior, I realized that these patterns are capable of more than producing pretty pictures. CAs provide a means to model many types of complex systems, and this fact alone may hold large implications for the future of AI and many other types of technologies including Blockchain.
This article is a potpourri of my musings and experiments regarding CAs and their applications. After reading the following introduction to these patterns, and a bit about Rule 30, feel free to skip around to the sections that appeal to you, although I hope you will find each as fascinating as I do.
Please enjoy.
In its simplest form, a Cellular Automaton is a grid of cells that evolves over time according to a set of pattern-generating rules (http://mathworld.wolfram.com/CellularAutomaton.html). Here is a simple example of a CA created by “Rule 1”:
Cellular Automaton Generated by Rule 1
In the top row we begin with a row of white cells with a single black cell in the middle. This is our initial condition. We could come up with a large variety of starting conditions, but let’s keep it simple for now. To generate subsequent rows, we apply a simple rule: for each cell in the row we are generating, look at the three cells just above it (i.e. the top left, top middle, and top right). If all three are white, color the new cell black; otherwise, color it white. This rule can perhaps be better understood graphically:
Rule 1. (All rule photos by wolframalpha.com)
The top three cells in each square window show one of the possible patterns we might observe in a sub-row of three cells. Wherever we find that pattern, we draw the cell directly below those three accordingly. Try it!
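If you would rather try it in code than on paper, here is a minimal Python sketch of a single application of Rule 1, written as a lookup table from each three-cell neighborhood to the new cell (1 for black, 0 for white). The step_rule1 helper and the choice to pad the edges with white cells are my own, not part of any standard library:

# Rule 1: a cell becomes black (1) only if all three cells above it are white (0)
RULE_1 = {(0, 0, 0): 1}  # every other neighborhood maps to white

def step_rule1(row):
    # pad with white cells so the edge cells also have three neighbors above
    padded = [0] + row + [0]
    return [RULE_1.get((padded[i - 1], padded[i], padded[i + 1]), 0)
            for i in range(1, len(padded) - 1)]

row = [0, 0, 0, 1, 0, 0, 0]  # the initial condition: one black cell
print(step_rule1(row))        # -> [1, 1, 0, 0, 0, 1, 1]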
We can continue to apply this rule for as many subsequent rows as we wish. This CA contains an obvious pattern and isn’t all too interesting, but there are many other rules for us to explore! In fact, since a binary cell depends on the three binary cells above it, there are 2^(2³) = 2⁸ = 256 rules in this system (for a complete list of simple CA rules, look here: http://mathworld.wolfram.com/ElementaryCellularAutomaton.html).
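In fact, since the outputs for the eight possible neighborhoods are exactly the eight binary digits of the rule number, we can decode any of the 256 rules mechanically. Here is a hedged sketch of that idea (make_rule and step are my own helper names, and the white boundary is again an assumption):

def make_rule(rule_number):
    # The binary digits of the rule number give the outputs for
    # neighborhoods 111, 110, ..., 001, 000 (Wolfram's numbering).
    bits = [(rule_number >> i) & 1 for i in range(8)]
    return {tuple(map(int, f"{n:03b}")): bits[n] for n in range(8)}

def step(row, rule):
    padded = [0] + row + [0]  # assume white cells beyond the edges
    return [rule[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]

# demo: print the first ten rows of Rule 90, discussed below
rule_90 = make_rule(90)
row = [0] * 20 + [1] + [0] * 20
for _ in range(10):
    print("".join(".#"[c] for c in row))
    row = step(row, rule_90)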
A slightly more interesting CA is “Rule 50”:
Cellular Automaton Generated by Rule 50
Rule 50
While slightly more complex, Rule 50 still contains a boring, predictable pattern. Let’s take a look at a few more examples to see what else CAs are capable of:
Cellular Automaton Generated by Rule 54
Cellular Automaton Generated by Rule 62
Cellular Automaton Generated by Rule 90
Rule 90
Rule 90 is quite fascinating. Notice its recursive structure: the entire pattern resembles a triangle, but this structure can be broken up into smaller, similar triangles. These smaller triangles can be broken up into still smaller ones, and so on, until we reach a single cell that cannot be divided any further. It amazes me that such a simple generative rule can create such a beautiful pattern. Yet Rule 90 is still predictable, in that we have a good sense of what the pattern would look like if we continued to generate more and more rows.
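Incidentally, Rule 90 has an especially tidy algebraic description: each new cell is simply the XOR of the two cells diagonally above it, which is one way to see where the self-similar triangles come from. We can confirm this against the table built by the make_rule helper sketched earlier:

# check all eight neighborhoods: the center cell turns out not to matter
rule_90 = make_rule(90)
for left in (0, 1):
    for center in (0, 1):
        for right in (0, 1):
            assert rule_90[(left, center, right)] == left ^ right
print("Rule 90 is the XOR of the left and right neighbors")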
But are all CAs predictable? Astonishingly, the answer is no.
Cellular Automaton Generated by Rule 30
Rule 30
Notice the two different behaviors in the CA diagram: while the left side seems rather orderly, the right side appears quite chaotic! It is far from obvious what the next few rows would look like, as there does not seem to be any clear pattern. In fact, mathematicians have shown that this structure meets many rigorous tests of pseudo-randomness (mathematical randomness, as opposed to the true randomness observed in nature) (http://mathworld.wolfram.com/Rule30.html). This pseudo-randomness is so robust that Wolfram’s computing language, Mathematica, has used Rule 30 as its random number generator!
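Concretely, it is the center column of the Rule 30 pattern that gets sampled for random bits. Reusing the make_rule and step helpers sketched earlier, here is a toy illustration of that idea; a real generator needs far more care, and here the row is simply made wide enough that the all-white boundary assumption can never reach the center within n_bits steps:

def rule30_bits(n_bits):
    width = 2 * n_bits + 1
    row = [0] * n_bits + [1] + [0] * n_bits  # single black cell in the center
    rule_30 = make_rule(30)
    bits = []
    for _ in range(n_bits):
        bits.append(row[width // 2])  # sample the center cell each step
        row = step(row, rule_30)
    return bits

print(rule30_bits(16))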
What is truly fascinating is how this chaotic behavior arose out of a simple starting condition and a simple set of rules. This may remind one of the “Butterfly Effect,” a hypothetical phenomenon in which a butterfly flaps its wings, thereby causing a slight disturbance in the air, which causes a disturbance in the breeze, which causes a disturbance in the wind patterns, which causes a hurricane. This event, while rather absurd, is certainly imaginable in the natural world. What is interesting about Rule 30 is that it takes place in the computational world, that is, the world of programs.
Computer programs are in essence a set of clear, unambiguous instructions written for a machine. A programmer writes the instructions and supplies the machine with input data. The machine performs the instructions on the input by manipulating electrical circuits, and returns the result of the computation. Machines, while digital in today’s world, are ultimately physical devices that add numbers together and move those numbers around in memory. As such, we expect machines to perform exactly as we tell them to. If we tell a machine to compute 2+2, we expect the answer to be 4, and if we tell it to delete the number stored at a particular place in memory, then we expect exactly that. Machines would not be very useful if this were not the case! The point is, machines do what we tell them to do and nothing more.
CAs are really programs: they have input (the starting row of cells), a set of rules for manipulating the input, and they have output (the observed patterns). But in the case of CAs, do we really know what we told them? We’ve seen how simple CA programs can exhibit complex, unpredictable behavior. Even though the program performed exactly what we told it to do, we had no idea what the output would be. The program, in some sense, has escaped our understanding of it.
While most CAs exhibit predictable behavior, our intuition breaks down entirely for Rule 30. What might the next 100 rows of the pattern above look like? The only way to know the next part of the pattern is to compute it. This leads us into our next discussion, that of Computational Irreducibility.
Cliche Blockchain image because every post about Blockchain has to have one! This image has nothing to do with this section, I just think it looks cool, don’t read into it too much. (Photo: CoinSutra)
It is difficult to write about technology today without mentioning Blockchain. In my musings about Cellular Automata, I realized that these programs have interesting applications in the Blockchain field.
For those still curious about what this buzzword means, Blockchain is simply a system of trust between computers. This trust has applications ranging from financial transactions to secure voting systems. In order for trust to exist between any two parties, some sort of foundation must first be laid down. Gold backs up financial transactions and currency, emotional stability creates trust between people, and a knowledge of mathematics gives us trust in the correctness of a computer program. In the Blockchain, trust is created by chaining blocks of information together in a secure fashion, hence the name.
Powering this system is a mechanism known as a Proof of Work (POW) algorithm. The core idea is that a non-trivial amount of time and energy must be spent in order to add a block onto the chain, but a block’s validity (or proof of work) must be quickly verifiable. This ensures that maliciously editing a block’s information is difficult, but proving a block’s validity is easy. In other words, a POW problem must be difficult to solve, yet once a solution is found, easy to verify. This may remind computer scientists of NP problems (nondeterministic polynomial time). A common example of a problem in this space is prime factorization. Consider the following problem: factor 1240 into prime numbers. Solving this involves testing various primes to see if they divide 1240. After some work, the answer can be found (2x2x2x5x31). Verifying the answer is much easier: we simply multiply all the numbers together and check that the result equals 1240. Prime factorization is such a difficult problem, compared to the simplicity of its verification, that it is a core component powering many cryptography systems. To be sure, there are some techniques, or shortcuts, developed over the years that speed up the computation, but the problem remains fundamentally difficult.
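As a toy illustration of this asymmetry (a sketch for intuition, not production cryptography), solving means searching for divisors, while verifying is little more than a multiplication:

def factor(n):
    # solving: trial division, which takes real work as n grows
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def verify(n, factors):
    # verifying: just multiply the claimed factors back together
    # (a full check would also confirm each factor is prime)
    product = 1
    for f in factors:
        product *= f
    return product == n

print(factor(1240))                     # [2, 2, 2, 5, 31]
print(verify(1240, [2, 2, 2, 5, 31]))   # True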
A brief discussion of how Quantum Computers could break blockchains
While mostly a theoretical exercise up until the last few years, several companies, such as IBM, are now creating basic quantum computers. This upcoming technology could potentially disrupt prime factorization altogether, thereby causing stress to the world’s cryptographic systems. The inner workings of quantum computers are outside the scope of this article (I want to write about them soon!), but the idea is that they can take advantage of quantum mechanics and various properties of prime numbers to perform certain types of computations, including prime factorization, exponentially faster than classical computers. When these types of machines become commonplace, many modern POW problems could lose their effectiveness. One should note that not every problem is solved more efficiently by quantum computers. So far, mathematicians know of only a handful of problems that can be sped up, thanks to special relationships between quantum physics and number theory that create computational shortcuts. Many problems offer no speedup when run on a quantum machine.
Cellular automata present a potential proof of work problem that might not be as susceptible to the quantum revolution. We previously discussed how some CAs, namely Rule 30, have the quality that there is no shortcut to determine the pattern’s behavior — the only way to determine the next row in a pattern is to actually spend the time to compute it. This property of having no computational shortcuts is aptly called computational irreducibility (http://mathworld.wolfram.com/ComputationalIrreducibility.html). Each row in the CA pattern must be computed sequentially, with the computation of each row dependent on the completion of the computation for the previous row. Thus, CAs entail computations that take a non-trivial amount of time to perform, and this makes them a strong candidate for proof of work problems.
Here is a problem I propose could be used for such a mechanism: Given the nth row from a CA pattern, what rule/initial condition combination could create it?
A possible example of the nth row of some CA
Suppose the illustration above is the 100th row from some CA pattern. The problem then is to backtrack through the unknown computations that created this row. Solving this problem sounds quite difficult! Imagine all the possible rules and initial conditions that one must consider. However, once the problem is solved, and we know the rule and starting condition that generated this particular pattern on the 100th row, verifying the solution is quite simple. We just take the starting condition and apply the rules to generate subsequent rows, and then make sure that the final row is the same one we presented in the problem statement.
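In code, the cheap verification half of this proposed puzzle might look like the sketch below, reusing the make_rule and step helpers from earlier. To be clear, the puzzle format is my own hypothetical, not an existing protocol:

def verify_ca_solution(target_row, rule_number, initial_row, n):
    # replay the claimed rule/initial condition for n steps and
    # compare the result against the published target row
    row = initial_row
    rule = make_rule(rule_number)
    for _ in range(n):
        row = step(row, rule)
    return row == target_row

# a prover who claims (rule_number, initial_row) solved the puzzle is
# checked in a single forward run; finding that pair in the first place
# means searching over rules and initial conditions with no known shortcut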
Hard to compute, easy to verify, and computationally irreducible, computations involving cellular automata may provide the necessary mechanics to ensure a Blockchain’s integrity throughout the quantum era.
We’ve previously discussed the difficulty of predicting patterns in Rule 30, at least on an abstract level. Human intuition fails in predicting unseen row sequences in Rule 30, but how might a computer perform in this regard? To answer this, I conducted a series of experiments involving neural networks. I attempted to create a simple feed-forward network to predict the next row in a CA pattern given the previous row. In other words, the neural network’s goal was to discover the latent rules that govern the creation of several CA patterns.
For those with experience in deep learning, the network architecture I used for all experiments is shown below: four hidden layers of 1000 neurons each with LeakyReLU activations, Adam optimization, and a sigmoid output layer. To discourage over-fitting, I included several dropout layers.
The input is an n-by-m matrix X containing n examples of CA rows with m cells each, and the output is a matrix y whose n rows are the CA rows that come directly after the corresponding rows in X.
# assuming the standard Keras and scikit-learn imports:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, LeakyReLU
from sklearn.model_selection import train_test_split

# data: the observed rows from the automaton pattern (generated elsewhere)
X = np.array(data[0:-2])
y = np.array(data[1:-1])  # offset by 1: each target is the row after its input
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

model = Sequential()
model.add(Dense(1000, activation="linear", input_dim=X_train.shape[1]))
model.add(LeakyReLU())
model.add(Dense(1000, activation="linear"))
model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Dense(1000, activation="linear"))
model.add(LeakyReLU())
model.add(Dense(1000, activation="linear"))
model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Dense(y.shape[1], activation="sigmoid"))

optim = keras.optimizers.Adam(lr=0.0001)
model.compile(loss="mse", optimizer=optim)

history = model.fit(X_train, y_train, validation_split=0.2, epochs=100)
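The ROC curves discussed below can then be computed from the held-out test set. Here is one way to do it with scikit-learn, assuming the per-cell predictions and labels are flattened into single vectors; this evaluation code is my reconstruction, not necessarily the exact code used:

from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt

y_pred = model.predict(X_test)
fpr, tpr, _ = roc_curve(y_test.ravel(), y_pred.ravel())
plt.plot(fpr, tpr, label="AUC = %.3f" % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], linestyle="--")  # chance line for reference
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()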
For the first experiment, I generated rows using the simple Rule 1. Below is the ROC curve for the unseen test data.
So far so good! The model appears to have, in some sense, correctly identified Rule 1’s generation mechanism. That is, given a row of cells, the neural network is able to compute the next row just as Rule 1 does. Let’s try a more sophisticated example, say Rule 54.
Again, the neural network does pretty well here. The blue line is slightly rounded in the top left, indicating imperfect performance, but still pretty good. Now let’s try predicting rows for one of my favorite CAs, Rule 90 (recall that this CA’s pattern contains an intriguing recursive structure).
Ouch. The neural net did a terrible job. Remember, Rule 90 is just a function for generating patterns, and a very simple one at that. From the ROC curves, it appears that the network is making random guesses. A human could look at two sequential rows and with a tad of work infer the pattern used. However, the neural network seems to have failed in this task. For good measure, let’s see what happens when we use Rule 30.
In this case, the network is heavily biased toward making positive predictions. While this ROC curve indicates better predictive power than on Rule 90, that is not to say Rule 30 is easy to predict; rather, the rules governing this CA were more easily determined than in the Rule 90 experiment.
An interesting phenomenon I observed during these experiments was the testing loss curve. In all experiments except Rule 30, the loss converged. In the case of Rule 30, however, the loss began to rise with more training, which generally indicates model over-fitting. I don’t know if there’s a connection between Rule 30’s chaos and this non-convergence, but it is interesting to think about. I’ll leave any further conclusions to the reader.
Here are the testing loss curves for the various experiments.
Since some Cellular Automata are Turing complete, they are capable of computing anything that can be computed and of modeling almost any kind of system (http://mathworld.wolfram.com/UniversalCellularAutomaton.html). Might they one day be used to model intelligence? Past attempts to model intelligence, and one might consider artificial neural networks such an attempt, have encountered the same issue: we don’t really know what intelligence is. However, this might not be necessary. We previously discussed how programs can exhibit behavior that, in some sense, goes beyond the programmer’s intentions; understanding the program’s behavior was not critical to its creation. Additionally, a program’s behavior can change and adapt over time, creating chaos and randomness. Perhaps one day we will create a CA that evolves in such a way that it eventually models intelligence. Granted, this may require much more advanced CAs that don’t operate on simple black and white grids, but it isn’t too far-fetched to suggest that these systems may contain the capacity to take AI to the next level.
You can find all the code to create your own Cellular Automata on my Github account here:
https://github.com/Jdduryea/CellularAutomata/blob/master/Cellular%20Automata.ipynb
Thanks for reading!
J