Performance optimization tip – understand your memory layout

.NET takes care of all memory allocation and de-allocation, and after a while it can seem that programmers no longer need to understand exactly how data is stored in memory. That idea could not be farther from the truth – in fact, memory layout has a great impact on code performance.

Test Case – For Loops

Every program I have ever written has at least one loop, and more often than not I use a for loop to iterate over my collections.

The following code accesses a two-dimensional array of integers:

for (int i = 0; i < width; i++)
{
    for (int j = 0; j < height; j++)
    {
        testArray[i, j] = i * j;
    }
}


And it seems that there is no reason not to write the code this way:



for (int j = 0; j < height; ++j)
{
    for (int i = 0; i < width; ++i)
    {
        testArray[i, j] = i * j;
    }
}


Both code snippets do the same thing – store the product of the indexes in each cell of the array.



 



When I ran both snippets (timing them with a Stopwatch) I discovered an interesting thing – running the 1st code snippet took less than half the time it took to run the 2nd snippet. More precisely, when using an array of 1000x1000, the 1st took 18 seconds while the 2nd took 44 seconds!
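A minimal sketch of the measurement described above – the original post doesn't show the timing code, so the class name, array size, and output format here are my own:

```csharp
// Times the two loop orders from the post using System.Diagnostics.Stopwatch.
// Absolute numbers will vary by machine; the relative gap is what matters.
using System;
using System.Diagnostics;

class LoopOrderBenchmark
{
    static void Main()
    {
        const int width = 1000, height = 1000;
        var testArray = new int[width, height];

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < width; i++)         // outer loop over rows
            for (int j = 0; j < height; j++)    // inner loop walks a row sequentially
                testArray[i, j] = i * j;
        sw.Stop();
        Console.WriteLine($"Row-major order: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int j = 0; j < height; j++)        // outer loop over columns
            for (int i = 0; i < width; i++)     // inner loop jumps between rows
                testArray[i, j] = i * j;
        sw.Stop();
        Console.WriteLine($"Column-major order: {sw.ElapsedMilliseconds} ms");
    }
}
```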



The reason for the time difference lies in how the array is stored in memory.



How arrays are stored in memory



Allocated arrays are stored on the heap as one contiguous chunk of memory. Two-dimensional arrays (like the one in the example above) are stored similarly – each row of the array (indexes i,0 through i,n) is stored together, so to reach cell (i, j) we have to go to offset i * rowLength + j (in our example the row length is height).
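We can check this layout directly: Buffer.BlockCopy copies a primitive-type array byte for byte, so flattening a 2-D array with it exposes the in-memory order. The array contents and sizes below are illustrative.

```csharp
// Verifies that element (i, j) of a [width, height] array sits at
// flat offset i * height + j (row-major order).
using System;

class RowMajorLayout
{
    static void Main()
    {
        const int width = 3, height = 4;
        var a = new int[width, height];
        for (int i = 0; i < width; i++)
            for (int j = 0; j < height; j++)
                a[i, j] = i * 10 + j;

        // Flatten the array exactly as it is laid out in memory.
        var flat = new int[width * height];
        Buffer.BlockCopy(a, 0, flat, 0, sizeof(int) * flat.Length);

        // a[2, 3] should be found at offset 2 * height + 3 = 11.
        Console.WriteLine(flat[2 * height + 3] == a[2, 3]); // True
    }
}
```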






[Taken from C++ Notes - 2-D Array Memory Layout]



 



What this means is that when we traverse the array row by row we run over the memory from start to finish, while when traversing by columns (the 2nd example) each iteration jumps ahead a number of cells equal to the length of a row.



 



When we read from or write to the array, its data is copied into the cache (a limited-size, fast memory). Traveling sequentially makes it easy for the cache to help us, since it brings in adjacent cells of the array along with the one we asked for. The cache, of course, has no knowledge of our array or its dimensions, so when we jump through memory in row-length-sized steps we do not use the cache effectively and the writes take much longer.
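The effect isn't specific to 2-D arrays – it's purely about access pattern. As a sketch (sizes and names my own), the two loops below write the same one-dimensional array the same number of times, but the second one strides through memory the way the column-first loop does:

```csharp
// Same number of writes, different strides: the second loop touches a new
// region of memory on every iteration, defeating the cache.
using System;
using System.Diagnostics;

class StrideDemo
{
    static void Main()
    {
        const int width = 1000, height = 1000;
        var data = new int[width * height];

        var sw = Stopwatch.StartNew();
        for (int k = 0; k < data.Length; k++)   // stride 1: sequential
            data[k] = k;
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int j = 0; j < height; j++)        // stride = row length
            for (int i = 0; i < width; i++)
                data[i * height + j] = i;
        Console.WriteLine($"Strided:    {sw.ElapsedMilliseconds} ms");
    }
}
```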



Conclusion



Although using .NET frees software developers from the need to micro-manage the memory usage of our code, knowing how the data is stored and accessed can make a noticeable difference.












