What is the most efficient way to represent small values in a struct?


For dense packing without much read overhead, I'd recommend a struct with bit-fields. In your example, where you have four values ranging from 0 to 3, you'd define the struct as follows:

    struct Foo {
        unsigned char a:2;
        unsigned char b:2;
        unsigned char c:2;
        unsigned char d:2;
    };

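With typical compilers this has a size of 1 byte, and the fields can be accessed directly, e.g. foo.a, foo.b, etc. (Strictly speaking, standard C only guarantees bit-fields of type int, unsigned int and _Bool; other types such as unsigned char are implementation-defined, though widely supported.)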
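As a rough sketch of how such a struct might be used, the helpers below build a record and sum its fields. The names foo_make and foo_sum are made up for illustration and are not part of the question; the code also assumes a compiler that accepts unsigned char bit-fields.

```c
/* Four 2-bit fields. unsigned char bit-fields are implementation-defined
   in standard C, but GCC, Clang and MSVC accept them. */
struct Foo {
    unsigned char a:2;
    unsigned char b:2;
    unsigned char c:2;
    unsigned char d:2;
};

/* Hypothetical helper: build a record. Each argument is reduced
   modulo 4 when it is stored into its 2-bit unsigned field. */
static struct Foo foo_make(unsigned a, unsigned b, unsigned c, unsigned d)
{
    struct Foo f;
    f.a = a;
    f.b = b;
    f.c = c;
    f.d = d;
    return f;
}

/* Hypothetical helper: the compiler emits the shifts and masks
   needed to read each field. */
static unsigned foo_sum(struct Foo f)
{
    return (unsigned)f.a + f.b + f.c + f.d;
}
```

Calling foo_sum(foo_make(3, 2, 1, 0)) reads all four fields without any hand-written bit manipulation in the source.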

Packing the struct more densely should help with cache efficiency.

Edit:

To summarize the comments:

There's still bit fiddling happening with a bit-field, but it's done by the compiler and will most likely be at least as efficient as what you would write by hand (not to mention that it makes your source code more concise and less prone to bugs). And given the large number of structs you'll be dealing with, the reduction in cache misses gained by using a packed struct such as this will likely make up for the overhead of the bit manipulation it imposes.


Pack them only if space is a consideration, for example an array of 1,000,000 structs. Otherwise, the extra code needed for shifting and masking outweighs the space saved in the data, so you are more likely to have a cache miss in the I-cache than in the D-cache.
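To make the shifting and masking concrete, here is a minimal sketch of hand-written accessors for four 2-bit slots in one byte. The names get2 and set2, and the choice of putting slot 0 in the lowest bits, are assumptions for illustration only.

```c
#include <stdint.h>

/* Read the 2-bit value in slot idx (0..3) of a packed byte. */
static unsigned get2(uint8_t packed, unsigned idx)
{
    return (packed >> (idx * 2u)) & 0x3u;
}

/* Return packed with slot idx (0..3) replaced by the low 2 bits of value. */
static uint8_t set2(uint8_t packed, unsigned idx, unsigned value)
{
    unsigned shift = idx * 2u;
    unsigned cleared = packed & ~(0x3u << shift);  /* zero the target slot */
    return (uint8_t)(cleared | ((value & 0x3u) << shift));
}
```

Every read costs a shift and a mask, and every write costs a read-modify-write of the whole byte; that is the instruction overhead being weighed against the data savings.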


There is no definitive answer, and you haven't given enough information to allow a "right" choice to be made. There are trade-offs.

Your statement that your "primary goal is time efficiency" is insufficient, since you haven't specified whether I/O time (e.g. to read data from a file) is more of a concern than computational efficiency (e.g. how long some set of computations takes after a user hits a "Go" button).

So it might be appropriate to write the data as a single char (to reduce time to read or write) but unpack it into an array of four int (so subsequent calculations go faster).
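One way this split could look is sketched below: the data stays packed on disk and is expanded once after reading. The function name unpack_all and the bit layout (first value in the lowest two bits) are assumptions, not something the question specifies.

```c
#include <stddef.h>

/* Expand n packed bytes (four 2-bit values each, lowest bits first)
   into 4*n ints. The caller must provide room for 4*n elements in out. */
static void unpack_all(const unsigned char *packed, size_t n, int *out)
{
    for (size_t i = 0; i < n; ++i)
        for (unsigned k = 0; k < 4; ++k)
            out[4 * i + k] = (packed[i] >> (2 * k)) & 0x3;
}
```

The unpacking cost is paid once per record at load time, after which the calculations work on plain int values with no masking.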

Also, there is no guarantee that an int is 32 bits (which you have assumed in your statement that the first packing uses 128 bits). An int can be 16 bits.
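If the 128-bit figure matters, it can be checked or enforced rather than assumed. The sketch below uses a hypothetical helper int_bits and a hypothetical struct FourFixed; it relies on int32_t from <stdint.h>, which is an optional type but is available on mainstream platforms.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Storage bits occupied by one int on this platform.
   The C standard only guarantees a range that needs at least 16 bits. */
static size_t int_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Fixed-width alternative: four 32-bit members, 128 bits in total
   on implementations that provide int32_t. */
struct FourFixed {
    int32_t a, b, c, d;
};
```

With this, sizeof(struct FourFixed) * CHAR_BIT gives the struct's size in bits regardless of how wide a plain int happens to be.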