This post describes a light compilation step to help stack manipulation in Michelson. It’s not critical for the launch, so we aren’t focusing on it in the near term, but I am sure this problem will nerd-snipe a few people, and I’d love to see someone pick it up.
Rubik’s cube — photo by keqs
When writing contracts in Michelson, Tezos’ smart contract language, stack manipulation can often get in the way. To get a sense, check out Milo’s contract examples. Splitting pairs, reforming them, re-ordering elements of the stack is time consuming for the programmer, and reduces code readability.
There are, fortunately, three workarounds:
I have been thinking about 3 for a while. I call this the “JUMBLE” instruction, and it would work like this. Suppose you have a stack with the following form:
(a,b):c:(d,(e,f))
and would like a stack of the form
(a,f):(c,d)
You could write
CAR; DIP { SWAP; DUP; CAR; SWAP; CDR; CDR;}; PAIR; DIP {SWAP; PAIR; PUSH;}
Which isn’t very explict… Or you could write
JUMBLE (a,b):c:(d,(e,f)) => (a,f):(c,d)
Which is a lot clearer! Of course, the JUMBLE operation needs to be compiled — we will get to that in a second — but note that it can also be decompiled easily.
Any code sequence of the form (DUP|DROP|SWAP|CDR|CAR|DIP|PAIR)* can be compressed into a JUMBLE instruction.
Writing down a correct implementation of JUMBLE isn’t trivial, but it’s still relatively straightforward. A simple way to do this is to start by flattening all the pairs in both the initial stack and the target stack. Flattening the pairs of the initial stack is done by repeatedly applying an UNPAIR macro of the form DUP; CDR; SWAP; CAR. The flattened initial stack is then sorted and variables are duplicated to match the flattened target stack. The pairs are then reformed to match the target stack.
This isn’t a very efficient way of doing things. Some pairs may be present in both the initial and target stack, and it may be inefficient to break them and reform them. In addition, sorting elements with only the ability to SWAP nearby entries is necessarily quadratic in the worst case.
It’s possible to start from this correct procedure, and tack on a series of optimization, by finding common patterns and avoiding unnecessary stack manipulation.
Another approach is to directly search for an optimal solution. I have implemented a prototype of that compilation, and recently pushed it on Github: https://github.com/tezos/optimal_stack
It’s an implementation of A* with a few heuristics to cut down on the search space. It hasn’t been tuned for performance by any means, so it won’t be helpful for large stack manipulations, but it will find non-trivial patterns.
I recommend trying out in utop, which can be done like this
$ ocamlc -c *.ml *.mliutop#require "heap";;#require "optimal_stack";;open Optimal_stack;;example ()
Example optimizes the following problem
JUMBLE (a, b):c => (c,a):b
which can be programmed like this
let a = Var "a" and b = Var "b" and c = Var "c" inlet start = Stack ([Pair (a, b); c**], []) and target = Stack ([Pair (c,a); b]**, [])**in**optimize start target
The given solution is
DUP; POP; CAR; SWAP; PAIR; PUSH; CDR; SWAP
You will notice that the stack type is a pair of lists. This lets us represent the DIP operation as PUSH and POP operations onto an auxiliary stack. Unless you’re inside of a DIP, the auxiliary stack should be empty.
My focus is on launching the network, and we don’t need JUMBLE for that. However, I think it’s cool and I’d love to see someone pick up the preliminary work to make it faster and able to tackle larger optimization problems.
My suggestions on how to take it further would be: