Fixed-size floating point types Fixed-size floating point types c c

Fixed-size floating point types


Nothing like this exists in the C or C++ standards at present. In fact, there isn't even a guarantee that float will be a binary floating-point format at all.

Some compilers guarantee that the float type will be the IEEE-754 32 bit binary format. Some do not. In reality, float is in fact the IEEE-754 single type on most non-embedded platforms, though the usual caveats about some compilers evaluating expressions in a wider format apply.

There is a working group discussing adding C language bindings for the 2008 revision of IEEE-754, which could consider recommending that such a typedef be added. If this were added to C, I expect the C++ standard would follow suit... eventually.


If you want to know whether your float is the IEEE 32-bit type, check std::numeric_limits<float>::is_iec559. It's a compile-time constant, not a function.

If you want to be more bulletproof, also check std::numeric_limits<float>::digits to make sure they aren't sneakily using the IEEE standard double-precision for float. It should be 24.

When it comes to long double, it's more important to check digits because there are a couple IEEE formats which it might reasonably be: 128 bits (digits = 113) or 80 bits (digits = 64).

It wouldn't be practical to have float32_t as such because you usually want to use floating-point hardware, if available, and not to fall back on a software implementation.


If you think having typedefs such as float32_t and float64_t are impractical for any reasons, you must be too accustomed to your familiar OS, compiler, that you are unable too look outside your little nest.

There exist hardware which natively runs 32-bit IEEE floating point operations and others that do 64-bit. Sometimes such systems even have to talk to eachother, in which case it is extremely important to know if a double is 32 bit or 64 bit on each platform. If the 32-bit platform were to do excessive calculations on base on the 64-bit values from the other, we may want to cast to the lower precision depending on timing and speed requirements.

I personally feel uncomfortable using floats and doubles unless I know exactly how many bits they are on my platfrom. Even more so if I am to transfer these to another platform over some communications channel.