Why does a 4 billion-iteration Java loop take only 2 ms?
One of two things is happening here:
The compiler realized that the loop is redundant and does nothing, so it optimized it away.
The JIT (just-in-time compiler) realized that the loop is redundant and does nothing, so it optimized it away.
Modern compilers are very intelligent; they can see when code is useless. Try putting an empty loop (in C or C++, say) into Godbolt (Compiler Explorer) and look at the output, then turn on -O2 optimizations. You will see that the output is something along the lines of:

    main():
        xor eax, eax
        ret
To clarify: in Java, most of the optimizations are done by the JIT at run time. In some other languages (like C and C++), most of the optimizations are done by the first, ahead-of-time compiler.
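To make "redundant" concrete, here is a minimal sketch of the difference between a loop whose result is never read and one whose result is returned. The class and method names (`LoopLiveness`, `deadLoop`, `liveLoop`, `tripCount`) are made up for illustration. It also works out the iteration count for the full `int` range, which is where the "4 billion" comes from.

```java
public class LoopLiveness {
    // The counter is never read after the loop, so the loop has no
    // observable effect and an optimizer is free to drop it entirely.
    static void deadLoop(int from, int to) {
        long result = 0;
        for (int i = from; i < to; i++) {
            ++result;
        }
    }

    // Returning the counter makes the result observable: the value must
    // still be produced, although an optimizer may compute it some other way.
    static long liveLoop(int from, int to) {
        long result = 0;
        for (int i = from; i < to; i++) {
            ++result;
        }
        return result;
    }

    // Closed-form iteration count of "for (i = from; i < to; i++)".
    // Widening to long avoids int overflow; the count is zero if from >= to.
    static long tripCount(int from, int to) {
        return Math.max(0L, (long) to - (long) from);
    }

    public static void main(String[] args) {
        // Iterations in the loop from the question: 2^32 - 1.
        System.out.println(tripCount(Integer.MIN_VALUE, Integer.MAX_VALUE)); // prints 4294967295
        deadLoop(0, 1_000_000);                      // nothing observable happens
        System.out.println(liveLoop(0, 1_000_000));  // prints 1000000
    }
}
```

Whether and when the JVM actually removes `deadLoop` depends on the JVM version and on which JIT tiers are enabled, so treat this as an illustration of the idea rather than a guarantee.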
To state the obvious: this is a JVM optimization, and the loop is simply removed altogether. Here is a small test that shows how much difference the JIT makes when it is fully enabled, enabled only for the C1 compiler, and disabled entirely.
Disclaimer: don't write tests like this. It is only meant to show that the actual loop "removal" happens in the C2 compiler:
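    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Fork;

    // Class name taken from the results (Loop.full, ...). The mode and time unit
    // (avgt, ms/op) are not shown here and were presumably set elsewhere.
    public class Loop {

        // Full int range, all JIT tiers enabled.
        @Benchmark
        @Fork(1)
        public void full() {
            long result = 0;
            for (int i = Integer.MIN_VALUE; i < Integer.MAX_VALUE; i++) {
                ++result;
            }
        }

        // One iteration fewer, all JIT tiers enabled.
        @Benchmark
        @Fork(1)
        public void minusOne() {
            long result = 0;
            for (int i = Integer.MIN_VALUE; i < Integer.MAX_VALUE - 1; i++) {
                ++result;
            }
        }

        // C1 only: tiered compilation stops before C2.
        @Benchmark
        @Fork(value = 1, jvmArgsAppend = { "-XX:TieredStopAtLevel=1" })
        public void withoutC2() {
            long result = 0;
            for (int i = Integer.MIN_VALUE; i < Integer.MAX_VALUE - 1; i++) {
                ++result;
            }
        }

        // Interpreter only: no JIT compilation at all.
        @Benchmark
        @Fork(value = 1, jvmArgsAppend = { "-Xint" })
        public void withoutAll() {
            long result = 0;
            for (int i = Integer.MIN_VALUE; i < Integer.MAX_VALUE - 1; i++) {
                ++result;
            }
        }
    }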
The results show that the method gets faster depending on which part of the JIT is enabled. With C2, the highest tier, it is so much faster that it looks like it's doing "nothing": that is the loop removal, which seems to happen in the C2 compiler:
    Benchmark        Mode  Cnt      Score  Error  Units
    Loop.full        avgt    2     ≈ 10⁻⁷         ms/op
    Loop.minusOne    avgt    2     ≈ 10⁻⁶         ms/op
    Loop.withoutAll  avgt    2  51782.751         ms/op
    Loop.withoutC2   avgt    2   1699.137         ms/op
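For a rough look at the same effect without JMH, a plain timing loop along these lines might do. This is only a sketch: the class and method names (`LoopTiming`, `loop`, `timeOnce`) are made up, and hand-rolled timing like this is far less reliable than a proper benchmark harness. The idea is to run the same class three times: with no extra flags, with `-XX:TieredStopAtLevel=1`, and with `-Xint`, and compare the printed times.

```java
public class LoopTiming {
    // Same shape as the benchmark bodies: the counter is never used.
    static void loop(int from, int to) {
        long result = 0;
        for (int i = from; i < to; i++) {
            ++result;
        }
    }

    // Wall-clock time of a single call to loop, in nanoseconds.
    static long timeOnce(int from, int to) {
        long start = System.nanoTime();
        loop(from, to);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Several runs, so that the JIT (if enabled) gets a chance to compile
        // the method; early runs may be slower than later ones.
        for (int run = 0; run < 3; run++) {
            long nanos = timeOnce(Integer.MIN_VALUE, Integer.MAX_VALUE);
            System.out.printf("run %d: %.3f ms%n", run, nanos / 1_000_000.0);
        }
    }
}
```

Going by the interpreter-only figure in the table, expect each run under `-Xint` to take tens of seconds. The exact numbers will differ by machine and JVM version.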