GCC recommendations and options for fastest code


Without knowing any specifics about your program, it's hard to say. -O3 covers most of the optimisations. The remaining options come at a cost. If you can tolerate small differences in floating-point rounding and your code doesn't depend on strict IEEE floating-point behaviour, you can try -Ofast. It turns on -ffast-math on top of -O3, which disregards strict standards compliance and can give you faster code.
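To make the "disregards standards compliance" part concrete, here is a minimal sketch of code that relies on IEEE behaviour. -ffast-math includes -ffinite-math-only, so GCC is allowed to assume that NaN never occurs. The function names below are made up for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: the classic "a NaN is not equal to itself" test.
// With strict IEEE semantics (for example -O2 or -O3) this returns true
// for NaN. Under -Ofast / -ffast-math the compiler may assume NaN never
// occurs and is allowed to fold the comparison to false.
bool is_nan_by_comparison(double x) {
    return x != x;
}

// Hypothetical helper that produces a quiet NaN.
double make_nan() {
    return std::numeric_limits<double>::quiet_NaN();
}

// Self-check meant to be called once at startup: it passes when the
// build keeps IEEE NaN semantics and may abort when built with -Ofast.
void require_ieee_nan_semantics() {
    assert(is_nan_by_comparison(make_nan()));
}
```

If a check like this matters to your program, benchmark -O3 against -Ofast and verify the results still agree before switching.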

The remaining optimisation flags improve performance only for certain programs and can even be detrimental to others. Look at the available flags in the GCC documentation on optimisation flags and benchmark them.

Another option is to enable C99 (-std=c99) and inline appropriate functions. This is a bit of an art: you shouldn't inline everything, but with a little work you can get your code to run faster (albeit at the cost of a larger executable).

If speed is really an issue, I would suggest either going back to Microsoft's compiler or trying Intel's. I've come to appreciate how slow some gcc-compiled code can be, especially when it involves math.h.

EDIT: Oh wait, you said C++? Then disregard my C99 paragraph, you can inline already :)
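For what it's worth, a rough sketch of the kind of function that is worth marking inline in C++. The names square and mean_of_squares are invented for the example, and whether inlining actually helps is something to measure, not assume.

```cpp
#include <cassert>
#include <vector>

// Small and called in a hot loop: a typical inlining candidate.
// "inline" is only a hint here; GCC makes the final decision.
inline double square(double x) {
    return x * x;
}

// Hypothetical caller. Once square() is inlined, the loop body contains
// just a multiply and an add, with no call overhead.
double mean_of_squares(const std::vector<double>& values) {
    assert(!values.empty());  // precondition: avoid dividing by zero
    double total = 0.0;
    for (double v : values) {
        total += square(v);
    }
    return total / static_cast<double>(values.size());
}
```

Large functions with many call sites are usually poor candidates, since every inlined copy adds to the executable size.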


I would try profile-guided optimization. From the GCC documentation:

-fprofile-generate: Enable options usually used for instrumenting an application to produce a profile useful for later recompilation with profile-feedback-based optimization. You must use -fprofile-generate both when compiling and when linking your program. The following options are enabled: -fprofile-arcs, -fprofile-values, -fvpt.

After running the instrumented program on representative input, recompile with -fprofile-use so that GCC can use the collected profile.
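The whole cycle around -fprofile-generate has three steps: an instrumented build, a run on representative input, and a rebuild with -fprofile-use. Here is a sketch, with the commands in the leading comment. The file name hot.cpp and the function sum_clamped are placeholders, and the example assumes your own main() calls the function with realistic data.

```cpp
// Build and profile cycle (file names are placeholders):
//   g++ -O2 -fprofile-generate hot.cpp -o hot   # instrumented build
//   ./hot                                       # run on representative input; writes .gcda profile data
//   g++ -O2 -fprofile-use hot.cpp -o hot        # rebuild using the collected profile
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical hot function with a skewed branch. If "v > limit" is rare
// in the profiled workload, the profile tells GCC which path is the
// common one, so it can lay out the code for that path.
std::uint64_t sum_clamped(const std::vector<int>& values, int limit) {
    assert(limit >= 0);  // precondition for the unsigned conversions below
    std::uint64_t total = 0;
    for (int v : values) {
        if (v > limit) {
            total += static_cast<std::uint64_t>(limit);  // assumed rare
        } else if (v > 0) {
            total += static_cast<std::uint64_t>(v);      // assumed common
        }
    }
    return total;
}
```

The profile is only as good as the training run, so use input that resembles what the program sees in production.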

You should also give the compiler hints about the architecture on which the program will run. For example, if it will only run on a server and you can compile it on the same machine as the server, you can just use -march=native. Otherwise you need to determine which features all of your users will have and pass the corresponding parameter to GCC.

(Apparently you're targeting 64-bit, so GCC will probably already include more optimizations than for generic x86.)
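One way to see what an architecture flag actually changes is to look at the feature macros GCC predefines. The sketch below reports a few x86 SIMD macros. The function name enabled_simd_features is invented for the example, and the list of macros is deliberately short, not exhaustive.

```cpp
#include <cassert>
#include <string>

// Reports which x86 SIMD extensions the compiler was allowed to use,
// based on macros GCC predefines from -march / -m flags.
// To list the macros without compiling anything, this should work:
//   g++ -march=native -dM -E -x c++ /dev/null | grep -i avx
std::string enabled_simd_features() {
    std::string out;
#ifdef __SSE2__
    out += "sse2 ";
#endif
#ifdef __AVX__
    out += "avx ";
#endif
#ifdef __AVX2__
    out += "avx2 ";
#endif
#ifdef __AVX512F__
    out += "avx512f ";
#endif
    if (out.empty()) {
        out = "none";  // non-x86 target, or none of the macros above is set
    }
    assert(!out.empty());  // postcondition: always reports something
    return out;
}
```

Building the same file with and without -march=native and comparing the output shows which instruction sets the compiler has been allowed to use.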


-Ofast


Please try -Ofast instead of -O3

Also here is a list of flags you might want to selectively enable.

-ffloat-store

-fexcess-precision=style

-ffast-math

-fno-rounding-math

-fno-signaling-nans

-fcx-limited-range

-fno-math-errno

-funsafe-math-optimizations

-fassociative-math

-freciprocal-math

-ffinite-math-only

-fno-signed-zeros

-fno-trapping-math

-frounding-math

-fsingle-precision-constant

-fcx-fortran-rules

A complete list of the flags and their detailed description is available here
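As an illustration of why these flags should be enabled selectively, here is a small sketch of what -fassociative-math (part of -funsafe-math-optimizations) permits. Floating-point addition is not associative, and that flag lets GCC regroup sums as if it were. The function names are hypothetical, and the values in the comments assume strict IEEE double arithmetic, as on x86-64 without any fast-math flags.

```cpp
#include <cassert>

// Evaluates (a + b) + c in the order written.
double left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// Evaluates a + (b + c) in the order written.
double right_to_left(double a, double b, double c) {
    return a + (b + c);
}

// Self-check under strict IEEE semantics. With a = 1e20, b = -1e20 and
// c = 1.0, the small term survives in one grouping and is absorbed by
// rounding in the other. With -fassociative-math GCC may pick either
// grouping, so code that depends on the order can change its result.
void check_grouping_matters() {
    assert(left_to_right(1e20, -1e20, 1.0) == 1.0);  // (0) + 1
    assert(right_to_left(1e20, -1e20, 1.0) == 0.0);  // -1e20 + 1 rounds to -1e20
}
```

Compensated summation and similar numerically careful code depend on exactly this ordering, which is why it is worth benchmarking and validating each flag on its own.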