
Is memset() more efficient than for loop in C?


Most certainly, memset will be much faster than that loop. Your loop handles one character at a time, whereas functions like memset are optimized to set several bytes at a time, using MMX and SSE instructions when they are available.
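To make "several bytes at a time" concrete, here is a minimal sketch, not how any particular C library implements memset. The names fill_bytes and fill_words are made up for illustration, and the 8-byte word size is an assumption. Real implementations also deal with alignment and use vector instructions, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Naive fill: one byte is stored per iteration. */
void fill_bytes(unsigned char *dst, unsigned char value, size_t n)
{
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] = value;
}

/* Hypothetical word-at-a-time fill: eight bytes are stored per iteration. */
void fill_words(unsigned char *dst, unsigned char value, size_t n)
{
    assert(dst != NULL || n == 0);

    /* Replicate the byte into all eight byte lanes of a 64-bit word. */
    uint64_t pattern = (uint64_t)value * UINT64_C(0x0101010101010101);

    size_t i = 0;
    /* Main loop: store one whole word while at least eight bytes remain.
       memcpy keeps the store valid for any alignment of dst. */
    for (; i + sizeof pattern <= n; i += sizeof pattern)
        memcpy(dst + i, &pattern, sizeof pattern);

    /* Tail: finish the remaining 0..7 bytes one at a time. */
    for (; i < n; ++i)
        dst[i] = value;
}
```

Both functions leave the buffer in the same state; the second one just runs about an eighth as many loop iterations for large n. An optimizing compiler may well turn either of them into a call to memset anyway.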

I think the paradigmatic example of these optimizations, which usually go unnoticed, is the GNU C library's strlen function. One would expect it to examine one byte per step, n steps in all, but it actually examines 4 or 8 bytes per step depending on the architecture (yes, I know, in big-O terms it is still O(n), but the loop runs roughly a quarter or an eighth as many iterations). How? Tricky, but nicely: strlen.
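The core of that trick can be sketched as follows. This is an illustration of the general word-at-a-time idea, not the glibc source. The names has_zero_byte and bounded_len are hypothetical, and the function takes an explicit buffer size (an assumption made here) so that it never reads past the buffer, which a real strlen has to handle differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero exactly when at least one of the eight bytes in v is zero.
   Subtracting 1 from each byte sets its high bit only if the byte was
   zero or already had the high bit set; "& ~v" rules out the latter. */
static int has_zero_byte(uint64_t v)
{
    return ((v - UINT64_C(0x0101010101010101)) & ~v
            & UINT64_C(0x8080808080808080)) != 0;
}

/* Index of the first '\0' in buf, or cap if there is none in the
   first cap bytes. Whole words are tested with a single check each. */
size_t bounded_len(const char *buf, size_t cap)
{
    assert(buf != NULL || cap == 0);

    size_t i = 0;
    /* Skip eight bytes per iteration while the word has no zero byte. */
    for (; i + sizeof(uint64_t) <= cap; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word); /* alignment-safe load */
        if (has_zero_byte(word))
            break;
    }

    /* Locate the exact byte inside the last word, or scan the tail. */
    for (; i < cap && buf[i] != '\0'; ++i) {
    }
    return i;
}
```

For a buffer declared as char buf[32] = "hello, memset", bounded_len(buf, sizeof buf) gives the same result as strlen(buf), but most of the scan happens a word at a time.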


Well, why don't we take a look at the generated assembly code, with full optimization under VS 2010?

    char x[500];
    char y[500];
    int i;

    memset(x, 0, sizeof(x) );
    003A1014  push        1F4h
    003A1019  lea         eax,[ebp-1F8h]
    003A101F  push        0
    003A1021  push        eax
    003A1022  call        memset (3A1844h)

And your loop...

    char x[500];
    char y[500];
    int i;

    for( i = 0; i < 500; ++i )
    {
        x[i] = 0;
    00E81014  push        1F4h
    00E81019  lea         eax,[ebp-1F8h]
    00E8101F  push        0
    00E81021  push        eax
    00E81022  call        memset (0E81844h)
        /* note that this is *replacing* the loop,
           not being called once for each iteration. */
    }

So, under this compiler, the generated code is exactly the same. memset is fast, and the compiler is smart enough to know that you are doing the same thing as calling memset once anyway, so it does it for you.

If the compiler actually left the loop as-is, it would likely be slower, since memset can set more than one byte at a time (at a minimum, you could unroll your loop a bit). You can assume that memset will be at least as fast as a naive implementation such as the loop. Try it under a debug build and you will notice that the loop is not replaced.

That said, it depends on what the compiler does for you. Looking at the disassembly is always a good way to know exactly what is going on.
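If you want to repeat the comparison with your own compiler, a small file along these lines is enough. This is only a sketch: the function names and the BUF_SIZE constant are made up here, and whether the loop gets replaced by a memset call depends on the compiler, its version, and the optimization level. Most compilers can emit an assembly listing, for example `gcc -O2 -S` or the `/FA` option of MSVC.

```c
#include <assert.h>
#include <string.h>

enum { BUF_SIZE = 500 };  /* same size as the arrays in the example above */

/* The hand-written loop from the question. */
void clear_with_loop(char *x)
{
    for (int i = 0; i < BUF_SIZE; ++i)
        x[i] = 0;
}

/* The library call it is being compared against. */
void clear_with_memset(char *x)
{
    memset(x, 0, BUF_SIZE);
}

/* Sanity check: both variants must leave the buffer in the same state. */
void check_equivalent(void)
{
    char a[BUF_SIZE];
    char b[BUF_SIZE];

    memset(a, 1, sizeof a);  /* start from non-zero contents */
    memset(b, 1, sizeof b);

    clear_with_loop(a);
    clear_with_memset(b);

    assert(memcmp(a, b, BUF_SIZE) == 0);
}
```

Compile it once with optimization and once without, then compare the code generated for clear_with_loop and clear_with_memset in the two listings.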


It really depends on the compiler and library. For older or simpler compilers, memset may be implemented as an ordinary library function and may not perform better than a custom loop.

For nearly all compilers that are worth using, memset is an intrinsic function and the compiler will generate optimized, inline code for it.

Others have suggested profiling and comparing, but I wouldn't bother. Just use memset. The code is simple and easy to understand. Don't worry about it until your benchmarks tell you this part of the code is a performance hotspot.