
Flushing denormalised numbers to zero


You're looking for a platform-defined way to set FTZ and/or DAZ in the MXCSR register (on x86 with SSE or x86-64); see https://stackoverflow.com/a/2487733/567292

Usually this is called something like _controlfp; Microsoft documentation is at http://msdn.microsoft.com/en-us/library/e9b52ceh.aspx

You can also use the _MM_SET_FLUSH_ZERO_MODE macro: http://msdn.microsoft.com/en-us/library/a8b5ts9s(v=vs.71).aspx. This is probably the most portable method across platforms.


For disabling denormals globally I use these 2 macros:

    // Warning: these macros have to be used in the same scope
    #define MXCSR_SET_DAZ_AND_FTZ \
        int oldMXCSR__ = _mm_getcsr();        /* read the old MXCSR setting */ \
        int newMXCSR__ = oldMXCSR__ | 0x8040; /* set DAZ and FZ bits */ \
        _mm_setcsr( newMXCSR__ );             /* write the new MXCSR setting to the MXCSR */

    #define MXCSR_RESET_DAZ_AND_FTZ \
        /* restore old MXCSR settings to turn denormals back on if they were on */ \
        _mm_setcsr( oldMXCSR__ );

I call the first one at the beginning of the process and the second at the end. Unfortunately, this does not seem to work well on Windows.
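Because the two macros share a local variable, they only work when paired in one scope. In C++ the same save/set/restore sequence could be wrapped in a small RAII guard instead. This is only a sketch: `ScopedFlushDenormals` and `FlushDenormalsEnabled` are names made up here, not part of any library, and the code assumes x86 or x86-64 with SSE.

```cpp
#include <immintrin.h>  // _mm_getcsr, _mm_setcsr (x86 / x86-64 with SSE only)

// Hypothetical helper, not part of any library: sets FTZ (bit 15) and
// DAZ (bit 6) on construction and restores the previous MXCSR value
// when the object goes out of scope.
class ScopedFlushDenormals {
public:
    ScopedFlushDenormals() : old_mxcsr_(_mm_getcsr()) {
        _mm_setcsr(old_mxcsr_ | 0x8040u);  // same mask as the macros above
    }
    ~ScopedFlushDenormals() { _mm_setcsr(old_mxcsr_); }

    // Copying would restore the old value twice, so forbid it.
    ScopedFlushDenormals(const ScopedFlushDenormals&) = delete;
    ScopedFlushDenormals& operator=(const ScopedFlushDenormals&) = delete;

private:
    unsigned int old_mxcsr_;
};

// Returns true if both FTZ and DAZ are set for the calling thread.
inline bool FlushDenormalsEnabled() {
    return (_mm_getcsr() & 0x8040u) == 0x8040u;
}
```

Declaring `ScopedFlushDenormals guard;` at the top of a processing function would then keep the flags set until the function returns, including on early returns and exceptions.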

To flush denormals locally I use this:

    const Float32 k_DENORMAL_DC = 1e-25f;

    inline void FlushDenormalToZero(Float32& ioFloat)
    {
        ioFloat += k_DENORMAL_DC;
        ioFloat -= k_DENORMAL_DC;
    }
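To see why the add-and-subtract trick works, here is a small sketch of the same idea on a plain `float`. The function name and the `volatile` temporary are my own additions; `volatile` is there only to discourage the compiler from simplifying `(x + c) - c` back to `x`, which it may otherwise do under aggressive floating-point optimisation settings.

```cpp
// Sketch of the add-and-subtract trick on a plain float.
// Adding 1e-25f to a much smaller value rounds the sum to exactly 1e-25f,
// so subtracting it again leaves zero. Values much larger than 1e-25f are
// unchanged because the constant is lost in rounding instead.
inline float FlushDenormalToZeroSketch(float x) {
    const float kDenormalDC = 1e-25f;
    volatile float tmp = x;   // volatile: keep both operations in the binary
    tmp = tmp + kDenormalDC;
    tmp = tmp - kDenormalDC;
    return tmp;
}
```

For example, `FlushDenormalToZeroSketch(1e-40f)` yields `0.0f`, while `1.0f` passes through unchanged. Note that very small normal values are flushed as well (for instance `1e-35f`), so this is coarser than the hardware FTZ/DAZ flags.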


To do this, use the Intel Intrinsics macros during program startup. For example:

    #include <immintrin.h>

    int main() {
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    }

In my version of MSVC, this emitted the following assembly code:

    stmxcsr DWORD PTR tv805[rsp]
    mov     eax, DWORD PTR tv805[rsp]
    bts     eax, 15
    mov     DWORD PTR tv807[rsp], eax
    ldmxcsr DWORD PTR tv807[rsp]

MXCSR is the SSE control and status register, and this code sets bit 15, which turns flush-to-zero mode on.

One thing to note: this only affects denormals resulting from a computation. If you want to also set denormals to zero if they're used as input, you also need to set the DAZ flag (denormals are zero), using the following command:

    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
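The difference between the two flags can be checked with a pair of small experiments. The following is a sketch, assuming x86-64 (so that `float` arithmetic uses SSE) and a build without fast-math options; the helper names and the constant `kDenormalInput` are invented for illustration.

```cpp
#include <immintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_SET_DENORMALS_ZERO_MODE
#include <limits>

// A denormal constant: smaller than the smallest normal float (~1.18e-38).
constexpr float kDenormalInput = 5.0e-39f;

// Scaling the smallest normal float by 0.1 produces a denormal RESULT.
// With FTZ on, that result is flushed to zero.
// volatile keeps the compiler from doing the arithmetic at compile time.
inline float ShrinkSmallestNormal() {
    volatile float x = std::numeric_limits<float>::min();
    volatile float scale = 0.1f;
    volatile float result = x * scale;
    return result;
}

// Multiplying a denormal INPUT by 4 produces a normal result (~2e-38),
// so FTZ alone does not change it. With DAZ on, the input is read as
// zero and the product becomes zero.
inline float ScaleDenormalInput(float denormal) {
    volatile float x = denormal;
    volatile float four = 4.0f;
    volatile float result = x * four;
    return result;
}
```

With both flags off, both functions return small non-zero values. Turning FTZ on makes `ShrinkSmallestNormal()` return zero but leaves `ScaleDenormalInput(kDenormalInput)` non-zero; only after DAZ is also turned on does the latter return zero.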

See https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-setting-the-ftz-and-daz-flags for more information.

Also note that you need to set MXCSR for each thread, as the values contained are local to each thread.
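The per-thread behaviour can be illustrated with `std::thread`. This is a rough sketch with hypothetical helper names: a worker thread turns FTZ on for itself, and the calling thread's MXCSR is left as it was. Depending on the toolchain, linking may need a threading flag such as `-pthread`.

```cpp
#include <immintrin.h>  // _mm_getcsr, _MM_SET_FLUSH_ZERO_MODE
#include <thread>

// Reads the FTZ bit (bit 15) of the calling thread's MXCSR.
inline bool FlushToZeroEnabled() {
    return (_mm_getcsr() & 0x8000u) != 0;
}

// Hypothetical demo: the worker enables FTZ for itself and reports what it
// sees. The change is confined to the worker; the caller's MXCSR is untouched.
inline bool WorkerEnablesFlushToZero() {
    bool seen_in_worker = false;
    std::thread worker([&seen_in_worker] {
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
        seen_in_worker = FlushToZeroEnabled();
    });
    worker.join();  // join before reading the flag written by the worker
    return seen_in_worker;
}
```

A practical consequence is to set the flags at the top of each thread's entry function rather than relying on whatever state the thread happens to start with.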