Neat optimization trick: reduce the number of jumps in a nested loop by@stag1e

March 6th 2017 7,628 reads

By reordering nested loops you can easily achieve more efficient code, that is, with less code jumps. Credit for the original idea goes to “Code Complete” which was written by Steve McConnell but I am going to expand it a bit further with math…

The example in that book compares these two versions of the same nested for loop:

int a, b;

for (a = 0; a < 10; a++)

for (b = 0; b < 500; b++)

printf("hi");

And

int a, b;

for (b = 0; b < 500; b++)

for (a = 0; a < 10; a++)

printf("hi");

You can easily see that the first example jumps 10 + 10*500 = 5010 times and the second one: 500 + 500*10 = 5500** **but both loop the same number of times. That means that even though they both achieve the same thing, the first nested loop is a tad bit faster. But just how much is it faster? Could we calculate that?

First of all, let’s assume that **X** and **Y** are two natural numbers. For example, we have two loops. The A loop:

int a, b;

for (a = 0; a < X; a++)

for (b = 0; b < Y; b++)

printf("hi");

The B loop:

int a, b;

for (b = 0; b < Y; b++)

for (a = 0; a < X; a++)

printf("hi");

The A loop loops **X** + **X*****Y** times whereas the second loop loops **Y** + **X*****Y** times. The difference between the B loop and the A loop is **Y**-**X**. If **X** > **Y** then the difference is negative, if **X** == **Y** then difference is zero and if **X** < **Y** then the difference is positive. What that means is that if **X** < **Y** then we execute **Y**-**X** less jump instructions in the A loop compared to the B loop.

You can see that this is indeed true by checking the original example. In the original example **X** = 10, **Y** = 500. **X** < **Y** is true and **Y**-**X** is 490. The first loop loops 5010 times whereas the second one — 5500 times. The difference between them is 490 which is **Y**-**X**!

In conclusion, you can say that the bigger the difference between **X** and **Y**, the more you have to gain in terms of efficiency by reordering the loops in such a way that the smaller number **X** is used in a for loop before **Y**.

We could go on with the same reasoning for deeper nested loops but instead let’s frame the problem like this: we need to prove that a sum like **a**+**a*****b**+**a*****b*****c** is minimal where **a**, **b** and **c** are some natural numbers if and only if **a** < **b** < **c**. My proof will be similar to mathematical induction. We already established before that **b** + **b*****c** is lower when **b** < **c**. So we can restructure that sum like this: **a** + **a *** (**b**+ **b*****c**). We already know about the parentheses: for that sum to be minimal, **b** has to be smaller than **c** or, in other words,** b **<** c.** It can obviously be seen that **a** has to be as minimal as possible to minimize the whole sum. And that means that **a** must be lower than **b** and **c**. Thus, we get that when **a** < **b** < **c **then the sum is minimal. It is possible to extend this same logic to **a** + **a*****b** + **a*****b*****c** + **a*****b*****c*****d** and so forth.

The rule is this: when writing nested loops make sure that the variables that change the most are in the most inner loop and those which change the least — in the most outer loop. This significantly reduces the number of jumps if the number of loops is big.