Is it better to avoid using the mod operator when possible?


My general advice is as follows. Use whichever version you think is easier on the eye, and then profile your entire system. Only optimize those parts of the code that the profiler flags up as bottlenecks. I'll bet my bottom dollar that the modulo operator isn't going to be among them.

As far as the specific example goes, only benchmarking can tell which is faster on your specific architecture using your specific compiler. You are potentially replacing modulo with branching, and it's anything but obvious which would be faster.


A simple measurement:

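    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        if (argc < 4) {
            fprintf(stderr, "usage: %s test divisor iterations\n", argv[0]);
            return 1;
        }
        int test = atoi(argv[1]);        /* 0 = modulo, 1 = comparison */
        int divisor = atoi(argv[2]);
        int iterations = atoi(argv[3]);

        int a = 0;
        if (test == 0) {
            for (int i = 0; i < iterations; i++)
                a = (a + 1) % divisor;
        } else if (test == 1) {
            for (int i = 0; i < iterations; i++)
                a = a + 1 == divisor ? 0 : a + 1;
        }

        /* Print the result so the compiler cannot discard the loops. */
        printf("%d\n", a);
        return 0;
    }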

Compiling with either gcc or clang with -O3, and running time ./a.out 0 42 1000000000 (modulo version) or time ./a.out 1 42 1000000000 (comparison version) results in

  • 6.25 seconds user runtime for the modulo version,
  • 1.03 seconds for the comparison version.

(using gcc 5.2.1 or clang 3.6.2; Intel Core i5-4690K @ 3.50GHz; 64-bit Linux)

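On this machine the comparison version is about six times faster, so it is probably a good idea to use it. Note that the divisor here is only known at run time, so the compiler cannot replace the division with anything cheaper, and that the two versions only agree as long as 0 <= a < divisor.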
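Before swapping one form for the other, it may be worth checking that they really compute the same thing for your inputs. The following is a minimal sketch, not part of the benchmark above; the helper names step_mod, step_cmp and same_sequence are made up for illustration.

```c
/* Hypothetical helpers: the two update rules from the benchmark loop. */
static int step_mod(int a, int divisor)
{
    return (a + 1) % divisor;
}

static int step_cmp(int a, int divisor)
{
    return a + 1 == divisor ? 0 : a + 1;
}

/* Returns 1 if both rules produce the same values for `steps` steps,
   starting from 0, and 0 as soon as they disagree. */
static int same_sequence(int divisor, int steps)
{
    int a = 0, b = 0;
    for (int i = 0; i < steps; i++) {
        a = step_mod(a, divisor);
        b = step_cmp(b, divisor);
        if (a != b)
            return 0;
    }
    return 1;
}
```

Starting from 0 the counter never leaves the range [0, divisor), so the two rules agree. If a can start outside that range (for example a = 50 with divisor 42), the modulo rule wraps it back into range while the comparison rule just keeps counting up.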


Well, have a look at 2 ways to get the next value of a "modulo 3" cyclic counter.

    int next1(int n) {
        return (n + 1) % 3;
    }

    int next2(int n) {
        return n == 2 ? 0 : n + 1;
    }

I've compiled it with gcc -O3 (for the common x86-64 architecture), using -S to get the assembly code.

The code for the first function does some seemingly inexplicable magic (*) to avoid a division, at the cost of a multiplication:

    addl    $1, %edi
    movl    $1431655766, %edx
    movl    %edi, %eax
    imull   %edx
    movl    %edi, %eax
    sarl    $31, %eax
    subl    %eax, %edx
    leal    (%rdx,%rdx,2), %eax
    subl    %eax, %edi
    movl    %edi, %eax
    ret

It is much longer (and, I bet, slower) than the code for the second function:

    leal    1(%rdi), %eax
    cmpl    $2, %edi
    movl    $0, %edx
    cmove   %edx, %eax
    ret

So it is not always true that "the (modern) compiler does a better job than you anyway".

Interestingly, the same experiment with 4 instead of 3 leads to AND-masking for the first function:

    addl    $1, %edi
    movl    %edi, %edx
    sarl    $31, %edx
    shrl    $30, %edx
    leal    (%rdi,%rdx), %eax
    andl    $3, %eax
    subl    %edx, %eax
    ret

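but it is still, by far, inferior to the second version. The extra instructions are there because n is signed: for a negative n + 1, % must give a result that is negative or zero, which a plain mask does not do.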

Being more explicit about what we actually want (a plain mask, which matches % 4 only when n + 1 is non-negative)

    int next3(int n) {
        return (n + 1) & 3;
    }

yields much better results:

    leal    1(%rdi), %eax
    andl    $3, %eax
    ret
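To make the difference between the two forms concrete, here is a small sketch. The function names next_mod4, next_and3 and next_mod4u are made up for this illustration, and the masking of negative values assumes a two's-complement machine (true of x86-64 and practically everything else today).

```c
/* Hypothetical names: the "% 4" counter, the masked counter,
   and an unsigned variant of the "% 4" counter. */
static int next_mod4(int n)
{
    return (n + 1) % 4;      /* result takes the sign of n + 1 */
}

static int next_and3(int n)
{
    return (n + 1) & 3;      /* result is always in 0..3 */
}

static unsigned next_mod4u(unsigned n)
{
    return (n + 1) % 4u;     /* unsigned: % 4 and & 3 always agree */
}
```

For non-negative inputs all three agree. For n = -6, next_mod4 gives -1 while next_and3 gives 3, which is why the compiler cannot turn the signed % 4 into a bare andl. If the counter is never negative, declaring it unsigned is another way to tell the compiler so while still writing % 4.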

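(*) Well, it's not that complicated: multiplication by the reciprocal. Compute the integer constant K = (2^N)/3, rounded up, for some large enough value of N. Now, when you want the value of X/3, instead of a division by 3, compute X*K and shift it N positions to the right. In the listing for next1, N = 32 and K = 1431655766: imull leaves the upper 32 bits of the product in %edx, which takes care of the shift, and the sarl/subl pair adds 1 when X is negative so that the quotient is rounded toward zero. The remainder is then X - 3*(X/3).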
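The same trick can be written out in C, roughly mirroring what the generated assembly does. This is only a sketch: div3 and mod3 are hypothetical names, and the code assumes that right-shifting a negative signed value is an arithmetic shift (implementation-defined in C, but what gcc and clang do on x86-64).

```c
#include <stdint.h>

/* Hypothetical helper: x / 3 without a division instruction.
   K = ceil(2^32 / 3) = 1431655766, N = 32. */
static int32_t div3(int32_t x)
{
    const int64_t K = 1431655766;
    /* Upper 32 bits of the 64-bit product, i.e. (x * K) >> N.
       Assumes an arithmetic shift, which rounds toward minus infinity. */
    int32_t q = (int32_t)((x * K) >> 32);
    /* x >> 31 is -1 for negative x and 0 otherwise, so this adds 1
       for negative x to get C's round-toward-zero quotient. */
    return q - (x >> 31);
}

/* Hypothetical helper: x % 3 as x - 3 * (x / 3). */
static int32_t mod3(int32_t x)
{
    return x - 3 * div3(x);
}
```

With these, next1 from the text corresponds to mod3(n + 1). Nobody should write this by hand in real code, since the compiler already does it whenever the divisor is a compile-time constant; it is only meant to show where the constant 1431655766 comes from.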