Which ordering of nested loops for iterating over a 2D array is more efficient [duplicate]

The first method is slightly better, as the cells being assigned to lie next to each other in memory.

First method:

[ ][ ][ ][ ][ ] ....
^1st assignment
   ^2nd assignment
[ ][ ][ ][ ][ ] ....
^101st assignment

Second method:

[ ][ ][ ][ ][ ] ....
^1st assignment
   ^101st assignment
[ ][ ][ ][ ][ ] ....
^2nd assignment

c performance for-loop cpu-cache

- For array[100][100], the two are the same if the L1 cache is larger than 100*100*sizeof(int) == 10000*sizeof(int) == [usually] 40000 bytes. Note that on Sandy Bridge, 100*100 integers should be enough elements to see a difference, since the L1 data cache is only 32 KB.
- Compilers will probably optimize this code to be the same either way.
- Assuming no compiler optimizations, and that the matrix does not fit in the L1 cache, the first code is better due to cache performance [usually]. Every time an element is not found in the cache you get a cache miss and need to go to RAM or the L2 cache [which are much slower]. Moving elements from RAM into the cache [cache fill] is done in blocks [cache lines; 64 bytes is common on current x86 CPUs, but a smaller block keeps the arithmetic here simple], so in the first code you get a miss rate of at most 1/4 [assuming 16-byte cache blocks and 4-byte ints], while in the second code there is no such bound and the miss rate can be as high as 1. In the second code snippet, elements that were already in the cache [brought in by the cache fill for adjacent elements] are evicted before they are used, and you get a redundant cache miss.
- This is closely related to the principle of locality, which is the general assumption used when implementing the cache system. The first code follows this principle while the second doesn't, so the cache performance of the first will be better than that of the second.

Conclusion: For all cache implementations I am aware of, the first will be no worse than the second. They might be the same if there is no cache at all, if the whole array fits in the cache, or because of compiler optimization.


This sort of micro-optimization is platform-dependent, so you'll need to profile the code to be able to draw a reasonable conclusion.
