
Performance 32 bit vs. 64 bit arithmetic


It depends on the exact CPU and operation. On the 64-bit-capable Pentium 4s, for example, multiplication in 64-bit registers was quite a bit slower. Core 2 and later CPUs were designed for 64-bit operation from the ground up.

Generally, even code written for a 64-bit platform uses 32-bit variables where values will fit in them. This isn't primarily because arithmetic is faster (on modern CPUs, it generally isn't) but because it uses less memory and memory bandwidth.

A structure containing a dozen integers will be half the size if those integers are 32-bit than if they are 64-bit. This means it will take half as many bytes to store, half as much space in the cache, and so on.
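A quick way to see this is to compare the two layouts with sizeof. This is a minimal C sketch added for illustration (the struct names are made up, not from the original answer):

```c
#include <stdint.h>
#include <stdio.h>

/* A dozen integers: 48 bytes with 32-bit fields, 96 bytes with 64-bit ones. */
struct dozen32 { int32_t v[12]; };
struct dozen64 { int64_t v[12]; };

int main(void) {
    printf("32-bit fields: %zu bytes\n", sizeof(struct dozen32)); /* 48 */
    printf("64-bit fields: %zu bytes\n", sizeof(struct dozen64)); /* 96 */
    return 0;
}
```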

64-bit native registers and arithmetic are used where values may not fit into 32 bits. But the main performance benefits come from the extra general-purpose registers available in the x86_64 instruction set. And of course, there are all the benefits that come from 64-bit pointers.

So the real answer is that it doesn't matter. Even if you use x86_64 mode, you can (and generally do) still use 32-bit arithmetic where it will do, and you get the benefits of larger pointers and more general-purpose registers. When you use 64-bit native operations, it's because you need 64-bit operations, and you know they'll be faster than faking it with multiple 32-bit operations -- your only other choice. So the relative performance of 32-bit versus 64-bit registers should never be a deciding factor in any implementation decision.
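To illustrate what "faking it with multiple 32-bit operations" means, here is a hedged C sketch (the names u64_emulated, add_emulated and add_native are made up for this example). The emulated version is roughly what a compiler has to generate when no 64-bit registers are available; the native version is a single 64-bit add on x86_64:

```c
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

/* Illustration only: emulating a 64-bit add from two 32-bit halves. */
typedef struct { uint32_t lo, hi; } u64_emulated;

static u64_emulated add_emulated(u64_emulated a, u64_emulated b) {
    u64_emulated r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* propagate the carry out of the low half */
    return r;
}

static uint64_t add_native(uint64_t a, uint64_t b) {
    return a + b;  /* a single 64-bit ADD on a 64-bit target */
}

int main(void) {
    u64_emulated a = { 0xFFFFFFFFu, 1u }, b = { 1u, 0u };
    u64_emulated r = add_emulated(a, b);
    printf("emulated: hi=%u lo=%u\n", r.hi, r.lo);          /* hi=2 lo=0 */
    printf("native:   %" PRIu64 "\n",
           add_native(0x1FFFFFFFFull, 1ull));               /* 8589934592 */
    return 0;
}
```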


I just stumbled upon this question, but I think one very important aspect is missing here: if you look at the assembly your compiler generates, using the type 'int' for indices will likely slow down the code. This is because 'int' defaults to a 32-bit type on many 64-bit compilers and platforms (Visual Studio, GCC), and address calculations that mix pointers (which are necessarily 64-bit on a 64-bit OS) with 'int' cause the compiler to emit unnecessary conversions between 32-bit and 64-bit registers. I just experienced this in a very performance-critical inner loop of my code. Switching from 'int' to 'long long' as the loop index improved my algorithm's run time by about 10%, which was quite a large gain considering the extensive SSE/AVX2 vectorization I was already using at that point.
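As a rough sketch of the pattern described above (not the poster's actual code), compare a 32-bit and a pointer-sized loop index. Whether the extra widening conversions actually survive optimization depends on the compiler and the surrounding code:

```c
/* With an 'int' index on a 64-bit target, the compiler may need to
 * sign-extend the 32-bit index before each address calculation; a
 * pointer-sized index ('long long' here, 'size_t' also works) avoids
 * that conversion. */
double sum_int_index(const double *a, int n) {
    double s = 0.0;
    for (int i = 0; i < n; ++i)
        s += a[i];              /* index may need widening to 64 bits */
    return s;
}

double sum_wide_index(const double *a, long long n) {
    double s = 0.0;
    for (long long i = 0; i < n; ++i)
        s += a[i];              /* index is already pointer-width */
    return s;
}
```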


In a primarily 32-bit application (meaning only 32-bit arithmetic is used, and 32-bit pointers are sufficient), the real benefits of x86-64 come from the other "updates" AMD made to the architecture:

  • 16 general-purpose registers, up from 8 in x86
  • RIP-relative addressing mode
  • others...

This is evidenced by the new x32 ABI implemented in Linux, which keeps 32-bit pointers while still using the full x86-64 register set and instruction set.
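For reference, a tiny C program makes the ABI difference visible. Built with GCC's -m64 flag it prints 8 and 8; built with -mx32 (where the toolchain and kernel support the x32 ABI) it prints 4 and 4, even though the code can still use all 16 general-purpose registers. The file name sizes.c is just a placeholder:

```c
#include <stdio.h>

/* gcc -m64  sizes.c  -> sizeof(void *) = 8, sizeof(long) = 8
 * gcc -mx32 sizes.c  -> sizeof(void *) = 4, sizeof(long) = 4  (x32 ABI) */
int main(void) {
    printf("sizeof(void *) = %zu, sizeof(long) = %zu\n",
           sizeof(void *), sizeof(long));
    return 0;
}
```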