Is 'switch' faster than 'if'?

c performance switch-statement assembly jump-table

There are several optimizations a compiler can make on a switch. I don't think the oft-mentioned "jump-table" is a very useful one though, as it only works when the input can be bounded some way.

C Pseudocode for a "jump table" would be something like this -- note that the compiler in practice would need to insert some form of if test around the table to ensure that the input was valid in the table. Note also that it only works in the specific case that the input is a run of consecutive numbers.

If the number of branches in a switch is extremely large, a compiler can do things like using binary search on the values of the switch, which (in my mind) would be a much more useful optimization, as it does significantly increase performance in some scenarios, is as general as a switch is, and does not result in greater generated code size. But to see that, your test code would need a LOT more branches to see any difference.

To answer your specific questions:

Clang generates one that looks like this:

test_switch(char):                       # @test_switch(char)        movl    %edi, %eax        cmpl    $19, %edi        jbe     .LBB0_1        retq.LBB0_1:        jmpq    *.LJTI0_0(,%rax,8)        jmp     void call<0u>()         # TAILCALL        jmp     void call<1u>()         # TAILCALL        jmp     void call<2u>()         # TAILCALL        jmp     void call<3u>()         # TAILCALL        jmp     void call<4u>()         # TAILCALL        jmp     void call<5u>()         # TAILCALL        jmp     void call<6u>()         # TAILCALL        jmp     void call<7u>()         # TAILCALL        jmp     void call<8u>()         # TAILCALL        jmp     void call<9u>()         # TAILCALL        jmp     void call<10u>()        # TAILCALL        jmp     void call<11u>()        # TAILCALL        jmp     void call<12u>()        # TAILCALL        jmp     void call<13u>()        # TAILCALL        jmp     void call<14u>()        # TAILCALL        jmp     void call<15u>()        # TAILCALL        jmp     void call<16u>()        # TAILCALL        jmp     void call<17u>()        # TAILCALL        jmp     void call<18u>()        # TAILCALL        jmp     void call<19u>()        # TAILCALL.LJTI0_0:        .quad   .LBB0_2        .quad   .LBB0_3        .quad   .LBB0_4        .quad   .LBB0_5        .quad   .LBB0_6        .quad   .LBB0_7        .quad   .LBB0_8        .quad   .LBB0_9        .quad   .LBB0_10        .quad   .LBB0_11        .quad   .LBB0_12        .quad   .LBB0_13        .quad   .LBB0_14        .quad   .LBB0_15        .quad   .LBB0_16        .quad   .LBB0_17        .quad   .LBB0_18        .quad   .LBB0_19        .quad   .LBB0_20        .quad   .LBB0_21

I can say that it is not using a jump table -- 4 comparison instructions are clearly visible:

13FE81C51 cmp  qword ptr [rsp+30h],1 13FE81C57 je   testSwitch+73h (13FE81C73h) 13FE81C59 cmp  qword ptr [rsp+30h],2 13FE81C5F je   testSwitch+87h (13FE81C87h) 13FE81C61 cmp  qword ptr [rsp+30h],3 13FE81C67 je   testSwitch+9Bh (13FE81C9Bh) 13FE81C69 cmp  qword ptr [rsp+30h],4 13FE81C6F je   testSwitch+0AFh (13FE81CAFh)

A jump table based solution does not use comparison at all.

Either not enough branches to cause the compiler to generate a jump table, or your compiler simply doesn't generate them. I'm not sure which.

EDIT 2014: There has been some discussion elsewhere from people familiar with the LLVM optimizer saying that the jump table optimization can be important in many scenarios; e.g. in cases where there is an enumeration with many values and many cases against values in said enumeration. That said, I stand by what I said above in 2011 -- too often I see people thinking "if I make it a switch, it'll be the same time no matter how many cases I have" -- and that's completely false. Even with a jump table you get the indirect jump cost and you pay for entries in the table for each case; and memory bandwidth is a Big Deal on modern hardware.

Write code for readability. Any compiler worth its salt is going to see an if / else if ladder and transform it into equivalent switch or vice versa if it would be faster to do so.

c performance switch-statement assembly jump-table

To your question:

1.What would a basic jump table look like, in x86 or x64?

Jump table is memory address that holds pointer to the labels in something like array structure. following example will help you understand how jump tables are laid out

00B14538  D8 09 AB 00 D8 09 AB 00 D8 09 AB 00 D8 09 AB 00  Ø.«.Ø.«.Ø.«.Ø.«.00B14548  D8 09 AB 00 D8 09 AB 00 D8 09 AB 00 00 00 00 00  Ø.«.Ø.«.Ø.«.....00B14558  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................00B14568  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

Where 00B14538 is the pointer to the Jump table , and value like D8 09 AB 00 represents label pointer.

2.Is this code using a jump table?No in this case.

3.Why is there no performance difference in this example?

There is no performance difference because instruction for both case looks same, no jump table.

4.Is there any situation in which there is a significant performance difference?

If you have very long sequence of if check, in that case using a jump table improves performance (branching/jmp instructions are expensive if they don't predict near-perfectly) but comes with the cost of memory.

The code for all the compare instructions has some size, too, so especially with 32-bit pointers or offsets, a single jump table lookup might not cost a lot more size in an executable.

Conclusion: Compiler is smart enough handle such case and generate appropriate instructions :)

c performance switch-statement assembly jump-table

The compiler is free to compile the switch statement as a code which is equivalent to if-statement, or to create a jump table. It will likely chose one on the other based on what will execute fastest or generate the smallest code somewhat depending on what you have specified in you compiler options -- so worst case it will be the same speed as if-statements

I would trust the compiler to do the best choice and focus on what makes the code most readable.

If the number of cases becomes very large a jump table will be much faster than a series of if. However if the steps between the values is very large, then the jump table can become large, and the compiler may choose not to generate one.

CodeHunter

Is 'switch' faster than 'if'?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last