Why must a short be converted to an int before arithmetic operations in C and C++?



If we look at the Rationale for International Standard—Programming Languages—C in section 6.3.1.8 Usual arithmetic conversions it says (emphasis mine going forward):

The rules in the Standard for these conversions are slight modifications of those in K&R: the modifications accommodate the added types and the value preserving rules. Explicit license was added to perform calculations in a “wider” type than absolutely necessary, since this can sometimes produce smaller and faster code, not to mention the correct answer more often. Calculations can also be performed in a “narrower” type by the as if rule so long as the same end result is obtained. Explicit casting can always be used to obtain a value in a desired type

Section 6.3.1.8 of the draft C99 standard covers the usual arithmetic conversions, which are applied to the operands of arithmetic expressions. For example, section 6.5.6, Additive operators, says:

If both operands have arithmetic type, the usual arithmetic conversions are performed on them.

We find similar text in section 6.5.5, Multiplicative operators, as well. In the case of a short operand, the integer promotions from section 6.3.1.1, Boolean, characters, and integers, are applied first. That section says:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

The discussion of integer promotions in section 6.3.1.1 of the same Rationale is actually more interesting. I am going to quote it selectively because it is too long to quote in full:

Implementations fell into two major camps which may be characterized as unsigned preserving and value preserving.

[...]

The unsigned preserving approach calls for promoting the two smaller unsigned types to unsigned int. This is a simple rule, and yields a type which is independent of execution environment.

The value preserving approach calls for promoting those types to signed int if that type can properly represent all the values of the original type, and otherwise for promoting those types to unsigned int. Thus, if the execution environment represents short as something smaller than int, unsigned short becomes int; otherwise it becomes unsigned int.

This can have rather unexpected results in some cases, as the question "Inconsistent behaviour of implicit conversion between unsigned and bigger signed types" demonstrates, and there are plenty more examples like it. In most cases, though, it results in the operations working as expected.


It's not so much a feature of the language as a limitation of the physical processor architectures on which the code runs. The int type in C is usually the size of a standard CPU register. More silicon takes up more space and more power, so in many cases arithmetic can only be done on the "natural size" data types. This is not universally true, but most architectures still have this limitation. In other words, when adding two 8-bit numbers, what actually goes on in the processor is some kind of 32-bit arithmetic followed by either a simple bit mask or another appropriate type conversion.
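At the language level, the "wide arithmetic followed by a mask" description can be sketched as follows. The function names are invented for this example, and it assumes an 8-bit `uint8_t` with an int wider than 8 bits. It shows what the C semantics look like, not what any particular CPU does.

```c
#include <stdint.h>

/* The sum is computed in int after promotion, so nothing is lost. */
int wide_sum(uint8_t a, uint8_t b) {
    return a + b;
}

/* Converting the int result back to 8 bits reduces it modulo 256. */
uint8_t narrow_sum(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* The same reduction written as an explicit bit mask. */
unsigned masked_sum(uint8_t a, uint8_t b) {
    return (unsigned)(a + b) & 0xFFu;
}
```

For example, `wide_sum(200, 100)` is 300, while `narrow_sum(200, 100)` and `masked_sum(200, 100)` are both 44, which is 300 modulo 256.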


The standard treats the short and char types as a sort of "storage types", i.e. sub-ranges that you can use to save some space but that are not going to buy you any speed, because their size is "unnatural" for the CPU.

On certain CPUs this is not true, but good compilers are smart enough to notice that if you, e.g., add a constant to an unsigned char and store the result back in an unsigned char, then there's no need to go through the unsigned char -> int conversion. For example, with g++ the code generated for the inner loop of

void incbuf(unsigned char *buf, int size) {
    for (int i=0; i<size; i++) {
        buf[i] = buf[i] + 1;
    }
}

is just

.L3:
    addb    $1, (%rdi,%rax)
    addq    $1, %rax
    cmpl    %eax, %esi
    jg  .L3
.L1:

where you can see that an unsigned char addition instruction (addb) is used.

The same happens if you're doing your computations between short ints and storing the result in short ints.
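A short analogue of the loop above might look like the sketch below. The name `incbuf_short` is made up for illustration. Whether a given compiler actually emits a 16-bit add for it depends on the compiler, target, and optimization level, so it is worth checking the generated assembly yourself (for instance with `gcc -O2 -S`).

```c
/* Increments every element of a short buffer. Formally, buf[i] is promoted
   to int, 1 is added, and the result is converted back to short; under the
   as-if rule the compiler may do the whole thing in 16 bits. */
void incbuf_short(short *buf, int size) {
    for (int i = 0; i < size; i++) {
        buf[i] = (short)(buf[i] + 1);
    }
}
```

The explicit cast only makes the narrowing conversion visible. It does not change the meaning of the assignment.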