A convenient untruth — Multi Arrays in C/++ Inspired by “A convenient untruth” by Glennan Carnie . When declaring a multi array (multi-dimensional array) in C/++, we end up with is a contiguous chunk of memory. This memory block spans over the size of the product of all the dimensions. For a 2x3 array ,we get a memory block of 6 units. I say units instead of the actual size (in bytes) because the size will be different depending on the data type of the array and the machine architecture. arr[2][3] When we query the array (i.e. access elements), we are performing pointer arithmetic to extract the value from a single 6 unit memory block. int arr[2][3] = {11,22,33,44,55,66};cout << arr[1][1]; // 55 int *ptr = &arr[0][0];cout << *(ptr + 1*3 + 1*1); // 55 data 11 22 33 44 55 66address 100 104 108 112 116 120↑ ↑ptr ptr+4 This made me wonder — can I travel backwards? Negative Indexing Now here’s the fun part. In the 2x3 integer array given above, how can we query the number ? The obvious answer would be . And that’s correct. 33 arr[0][2] In pointers, it becomes which is just . So the next question is: in how many ways can we make equal to ? In other words, for what values of , will the equality hold? *(ptr + 0*3 + 2*1) *(ptr + 2) x * 3 + y * 1 2 x y 3x + y == 2 Here’s a sample of solutions: x 0 1 2 -1 -2 ...y 1 -1 -4 5 8 ... And we can use any of these pairs to query the number : (x, y) 33 int arr[2][3] = {11,22,33,44,55,66}; cout << arr[0][2]; // 33cout << arr[1][-1]; // 33cout << arr[2][-4]; // 33cout << arr[-1][5]; // 33cout << arr[-2][8]; // 33 data 11 22 33 44 55 66address 100 104 108 112 116 120↑ ↑ptr ptr+2 This behavior is true for all multi of dimensions. The only problem is that it becomes rather cumbersome to visualize this principle in 3D and above arrays. Fortunately, I made a tiny utility that does the job for you :) arrays n This tool helps create a multi array up to 5D array and each “level” can hold up to 4 “cells”. After generating an array, you can run a query on it and the tool will highlight the memory cells used to arrive at the value. output What I mean by the words “levels” and “cells” is explained later on. Note: Show and Tell To explain things better, I will use the following memory map of a 2x3x2 array. Have a look at it and convince yourself of the following equality arr[0][2][0] == arr[0][0][4] == arr[1][0][-2] _____________________________________________________________| | L0 | 0 | 1 || |----|-----------------------------------------------|| Memory | L1 | 0 | 1 | 2 | 0 | 1 | 2 || Levels |----|-----------------------------------------------|| | L2 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 ||= ===================================|| Data Level | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12|‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ | Also, I am using functions and that give the product and sum of an array range. PROD() SUM() prod(i, [a..b], arr) = arr_a * arr_(a+1) * ... * arr_(b-1)prod(arr) = arr_0 * arr_1 * ... * arr_(n-1) sum(i, [a..b], arr) = arr_a + arr_(a+1) + ... + arr_(b-1)sum(arr) = arr_0 + arr_1 + ... + arr_(n-1) where n = size of array Cells and Levels Both these terms are made up. These are just notations I use to refer to a specific memory location. A is just a memory location that can be pointed to by a pointer or array index. When we access element , we end up with the data value . Memory Cell arr[0][2][1] 6 But what does represent? I answer this by saying that they are pointers to “memory cells”. One way to read it is: [0][2][1] * access 0th cell in the block* access 2nd cell in the 0th cell* access 1st cell in the 2nd cell* retrieve the value A is a collection of memory cells that reside in the same memory dimension. Memory Level Observations The first observations are the more obvious ones: Dimension Count: **N** = 3 Dimension Sizes: **D** = [2, 3, 2] If we access then Query will be: arr[1][2][1] **Q** = [1, 2, 1] Cell Stride (S) Standing at the memory level, if I move by 1 cell, the size of jump in the data level is the for that memory level. ith Cell Stride Cell strides at levels are: L0: 6 = 1*2*3 L1: 2 = 1*2 L2: 1 = 1 This gives us the relation that for the memory level, the stride length is product of all dimension sizes above that level. ith S_i = 1 * prod(j, [i+1..N], D)orS_i = S_(i+1) * D_(i+1) Query Match (M) When we run a query, we match a memory cell at each level. This forms a chain of jumps , ending in the data level and results in a data value. across the levels This gives us the relation that for the memory level, the matching cell is: ith M_i = (1 / S_i) * sum(j, [1..N], S_j * Q_j)or simplyM_i = (1 / S_i) * sum(S_j * Q_j) Unfortunately, my journey to this relation is pretty unwieldy to explain. So I’ll just skip it. It is this value — the query match — that I use to highlight the memory cells and the final data value that result from the query. Misc : The number of cells in the memory level Cell Count ith **C_i** = prod(j, [0..i], D) = C_(i-1) * D_i : The total number of data values stored Data Count **n** = prod(i, [0..N], D) = prod(D) By comparing relations , and , get the relation because: C_i S_i n n = C_i * S_i n = prod(i, [0..N], D) C_i = prod(j, [0..i], D) S_i = prod(j, [i+1..N], D) The End C/++ allows negative indexes in multi arrays since it is just a contiguous chunk of memory and index lookups is just pointer arithmetic on it.There. That’s a thing you now know. You can find the code or on . here codepen Also, if you’re wondering whether it would be possible in some other language, that answer is “it depends”. Languages that implement multi arrays as true multi arrays — arrays of arrays — cannot have negative indexes. Languages that implement multi arrays like C/++ does support it. might Bonus Say we have an integer array and we access the element , what do we get? Well, we get the 2nd value. That’s not interesting. arr[3] arr[1] What interesting is how the compiler works this out by re-writing the query as . This is simple pointer addition. And since addition is commutative, we can do … is *(arr + 1) arr[1] -> *(arr + 1)*(arr + 1) -> *(1 + arr)*(1 + arr) -> 1[arr] … and conclude that is the same as . This bit is also true for multi arrays. arr[i] i[arr] int main () {int arr[2][3] = {11,22,33,44,55,66}; cout<<arr[0][2]; // goodcout<<2[0[arr]]; // badcout<<2[arr[0]]; // also badcout<<0[arr][2]; // ???} There’s a whole lot more where that came from. Checkout the article . “A convenient untruth” by Glennan Carnie