Is it faster to count down than it is to count up?

Is it really true? and if so does anyone know why?

In ancient days, when computers were still chipped out of fused silica by hand, when 8-bit microcontrollers roamed the Earth, and when your teacher was young (or your teacher's teacher was young), there was a common machine instruction called decrement and skip if zero (DSZ). Hotshot assembly programmers used this instruction to implement loops. Later machines got fancier instructions, but there were still quite a few processors on which it was cheaper to compare something with zero than to compare with anything else. (It's true even on some modern RISC machines, like PPC or SPARC, which reserve a whole register to be always zero.)

So, if you rig your loops to compare with zero instead of N, what might happen?

You might save a register
You might get a compare instruction with a smaller binary encoding
If a previous instruction happens to set a flag (likely only on x86 family machines), you might not even need an explicit compare instruction

Are these differences likely to result in any measurable improvement on real programs on a modern out-of-order processor? Highly unlikely. In fact, I'd be impressed if you could show a measurable improvement even on a microbenchmark.

Summary: I smack your teacher upside the head! You shouldn't be learning obsolete pseudo-facts about how to organize loops. You should be learning that the most important thing about loops is to be sure that they terminate, produce correct answers, and are easy to read. I wish your teacher would focus on the important stuff and not mythology.

c performance loops

Here's what might happen on some hardware depending on what the compiler can deduce about the range of the numbers you're using: with the incrementing loop you have to test i<N each time round the loop. For the decrementing version, the carry flag (set as a side effect of the subtraction) may automatically tell you if i>=0. That saves a test per time round the loop.

In reality, on modern pipelined processor hardware, this stuff is almost certainly irrelevant as there isn't a simple 1-1 mapping from instructions to clock cycles. (Though I could imagine it coming up if you were doing things like generating precisely timed video signals from a microcontroller. But then you'd write in assembly language anyway.)

c performance loops

In the Intel x86 instruction set, building a loop to count down to zero can usually be done with fewer instructions than a loop that counts up to a non-zero exit condition. Specifically, the ECX register is traditionally used as a loop counter in x86 asm, and the Intel instruction set has a special jcxz jump instruction that tests the ECX register for zero and jumps based on the result of the test.

However, the performance difference will be negligible unless your loop is already very sensitive to clock cycle counts. Counting down to zero might shave 4 or 5 clock cycles off each iteration of the loop compared to counting up, so it's really more of a novelty than a useful technique.

Also, a good optimizing compiler these days should be able to convert your count up loop source code into count down to zero machine code (depending on how you use the loop index variable) so there really isn't any reason to write your loops in strange ways just to squeeze a cycle or two here and there.

CodeHunter

Is it faster to count down than it is to count up?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last