Inline function v. Macro in C -- What's the Overhead (Memory/Speed)?


Calling an inline function may or may not generate a function call, which typically incurs a very small amount of overhead. The exact situations under which an inline function actually gets inlined vary depending on the compiler; most make a good-faith effort to inline small functions (at least when optimization is enabled), but there is no requirement that they do so (C99, §6.7.4):

Making a function an inline function suggests that calls to the function be as fast as possible. The extent to which such suggestions are effective is implementation-defined.

A macro is less likely to incur such overhead, because the preprocessor pastes its body at each use site (though again, little prevents a compiler from emitting a call anyway; the standard doesn't define what machine code programs must expand to, only the observable behavior of a compiled program).

Use whatever is cleaner. Profile. If it matters, do something different.

Also, what fizzer said; calls to pow (and division) are both typically more expensive than function-call overhead. Minimizing those is a good start:

    double ratio = SigmaSquared/RadialDistanceSquared;
    double AttractiveTerm = ratio*ratio*ratio;
    EnergyContribution += 4 * Epsilon * AttractiveTerm * (AttractiveTerm - 1.0);

Is EnergyContribution made up only of terms that look like this? If so, pull the 4 * Epsilon out, and save two multiplies per iteration:

    double ratio = SigmaSquared/RadialDistanceSquared;
    double AttractiveTerm = ratio*ratio*ratio;
    EnergyContribution += AttractiveTerm * (AttractiveTerm - 1.0);

    // later, once you've done all of those terms...
    EnergyContribution *= 4 * Epsilon;
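To make the factoring concrete, here is a minimal sketch of the whole loop. It assumes the squared distances are already in an array; the function name `lj_energy` and the parameter names are hypothetical, chosen only to mirror SigmaSquared and Epsilon above.

```c
#include <stddef.h>

/* Hypothetical helper: sums the pair terms for n squared distances.
   sigma_squared and epsilon mirror SigmaSquared and Epsilon above. */
static double lj_energy(const double *r2, size_t n,
                        double sigma_squared, double epsilon)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double ratio = sigma_squared / r2[i];
        double attractive = ratio * ratio * ratio;
        sum += attractive * (attractive - 1.0);
    }
    /* The constant factor is applied once, outside the loop. */
    return 4.0 * epsilon * sum;
}
```

The loop body is down to one division and a handful of multiplies; whether that is measurable in your program is something only a profile can tell you.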


A macro is not really a function. Whatever you define as a macro gets pasted verbatim into your code by the preprocessor, before the compiler gets to see it. The preprocessor is just a software engineer's tool that enables various abstractions to better structure your code.
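One practical consequence of that textual pasting is that a macro argument is evaluated every time it appears in the expansion. A small sketch of the difference follows; `CUBE_MACRO`, `cube_fn`, `next_value` and `call_count` are made-up names for illustration only.

```c
/* Macro: the argument text is pasted three times into the expansion. */
#define CUBE_MACRO(x) ((x) * (x) * (x))

/* Function: the argument is evaluated once, then passed by value. */
static inline double cube_fn(double x)
{
    return x * x * x;
}

/* Hypothetical helper with a visible side effect: it counts its calls. */
static int call_count = 0;

static double next_value(void)
{
    ++call_count;
    return 2.0;
}
```

With these definitions, `CUBE_MACRO(next_value())` calls `next_value` three times, while `cube_fn(next_value())` calls it once. Both give the same result here only because `next_value` always returns the same number.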

A function, inline or otherwise, is something the compiler does know about, and it can make decisions about what to do with it. A user-supplied inline keyword is just a suggestion, and the compiler may override it. In most cases this overriding results in better code.

Another side effect of the compiler being aware of functions is that you can potentially force the compiler to make certain decisions, for example disabling inlining of your code, which could make it easier to debug or profile. There are probably many other use cases that inline functions enable vs. macros.
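As a rough sketch of what forcing such decisions can look like: GCC and Clang accept the `noinline` and `always_inline` function attributes. This is compiler-specific, not standard C, and the wrapper macros `NEVER_INLINE` and `ALWAYS_INLINE` and both function names are invented here for illustration.

```c
/* Assumption: GCC or Clang. Other compilers spell these differently
   (MSVC, for instance, uses __declspec(noinline) and __forceinline). */
#if defined(__GNUC__)
#  define NEVER_INLINE  __attribute__((noinline))
#  define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#  define NEVER_INLINE
#  define ALWAYS_INLINE inline
#endif

/* Kept as a real call, so it can show up as its own entry
   in a profiler or as a frame in a debugger. */
NEVER_INLINE static double cube_outlined(double x)
{
    return x * x * x;
}

/* Requested to be expanded at each call site, even when
   optimization is turned down. */
static ALWAYS_INLINE double cube_forced(double x)
{
    return x * x * x;
}
```

A macro offers no equivalent switch: by the time the compiler runs, there is no function boundary left for it to keep or remove.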

Macros are extremely powerful though, and to back this up I would cite Google Test and Google Mock. There are many reasons to use macros :D.

Simple mathematical operations that are chained together using functions are often inlined by the compiler, especially if the function is only called once in the translation unit. So I wouldn't be surprised if the compiler makes inlining decisions for you, regardless of whether the keyword is supplied or not.

However, if the compiler doesn't, you can manually flatten out segments of your code. If you do flatten it out, macros may serve as a good abstraction; after all, they present similar semantics to a "real" function.

The Crux

So, do you want the compiler to be aware of certain logical boundaries so it can produce better physical code, or do you want to force decisions on the compiler by flattening the code out manually or by using macros? The industry leans towards the former.

I would lean towards using macros in this case, just because it's quick and dirty, without having to learn much more. However, as macros are a software engineering abstraction, and because you are concerned with the code the compiler generates, if the problem were to become slightly more advanced I would use C++ templates, as they were designed for the concerns you are pondering.


It's the calls to pow() you want to eliminate. This function takes general floating point exponents and is inefficient for raising to integral exponents. Replacing these calls with e.g.

    /* static avoids a missing external definition if the call isn't inlined (C99) */
    static inline double cube(double x)
    {
        return x * x * x;
    }

is the only thing which will make a significant difference to your performance here.
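If you need other small integral exponents as well, the same idea generalizes. Below is a minimal sketch using exponentiation by squaring; `ipow` is a hypothetical helper, not a standard library function, and it only handles non-negative integer exponents.

```c
/* Hypothetical helper: base^exp for a non-negative integer exponent,
   using exponentiation by squaring (about log2(exp) loop iterations). */
static inline double ipow(double base, unsigned int exp)
{
    double result = 1.0;
    while (exp != 0) {
        if (exp & 1u)
            result *= base;   /* this bit of the exponent is set */
        base *= base;         /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```

For a fixed exponent like 3, the hand-written cube above is still the simplest choice. Note that results may differ from pow() in the last bits for values that are not exactly representable, since the rounding happens at different points.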