Why does the complement behave differently through printf?


In this statement:

printf("%d",~c);

Here c is converted to int (see note 1 below) before the ~ (bitwise complement) operator is applied. This is because the integer promotions are performed on the operand of ~. In this case an object of type unsigned char is promoted to (signed) int, and the result of ~ is then passed to printf, where it matches the %d format specifier.

Notice that the default argument promotions (applied because printf is a variadic function) play no role here, as the argument already has type int.

On the other hand, in this code:

unsigned char c = 4, d;
d = ~c;
printf("%d", d);

the following steps occur:

  • c is subject to the integer promotions because of ~ (in the same way as described above)
  • ~c is evaluated as a (signed) int value (e.g. -5)
  • d = ~c performs an implicit conversion from int to unsigned char, since that is the type of d. You may think of it as d = (unsigned char) ~c. The conversion reduces the value modulo UCHAR_MAX + 1, so -5 becomes 251 when unsigned char has eight bits. Notice that d cannot be negative (this is a general rule for all unsigned types).
  • printf("%d", d); invokes the default argument promotions, so d is converted to int and its (nonnegative) value is preserved (i.e. the int type can represent all values of unsigned char).

1) This assumes that int can represent all values of unsigned char (see T.C.'s comment below), which is very likely to be the case. More specifically, we assume that INT_MAX >= UCHAR_MAX holds. Typically sizeof(int) > sizeof(unsigned char) holds and a byte consists of eight bits. Otherwise c would be converted to unsigned int (per C11 §6.3.1.1/p2), and the format specifier should be changed to %u accordingly in order to avoid undefined behavior (C11 §7.21.6.1/p9).


In the second snippet, the unsigned char c is promoted to int before the ~ operation is applied in the printf statement. So c, which is

0000 0100 (2's complement)  

in binary is promoted to (assuming 32-bit machine)

0000 0000 0000 0000 0000 0000 0000 0100 // Say it is x  

and its bit-wise complement is equal to the two's complement of the value minus one (~x = −x − 1)

1111 1111 1111 1111 1111 1111 1111 1011  

which is -5 in decimal in 2's complement form.

Note that the integer promotion of c to int is also performed in

d = ~c;

before the complement operation, but the result is converted back to unsigned char because d is of type unsigned char.

C11: 6.5.16.1 Simple assignment (p2):

In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.

and

6.5.16 (p3):

The type of an assignment expression is the type the left operand would have after lvalue conversion.


To understand the behavior of your code, you need to learn the concept called 'integer promotions' (which happen implicitly in your code before the bitwise NOT operation on an unsigned char operand). As stated in the N1570 committee draft:

§ 6.5.3.3 Unary arithmetic operators

  4. The result of the ~ operator is the bitwise complement of its (promoted) operand (that is, each bit in the result is set if and only if the corresponding bit in the converted operand is not set). The integer promotions are performed on the operand, and the result has the promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E.

Because unsigned char has a lower conversion rank than int, and int can represent every unsigned char value, the value of c is implicitly promoted to int before the complement operation ~ is applied. The compiler inserts this conversion on your behalf; the standard requires it because ~ operates on its promoted operand.

§ 6.5 Expressions

  4. Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |, collectively described as bitwise operators) are required to have operands that have integer type. These operators yield values that depend on the internal representations of integers, and have implementation-defined and undefined aspects for signed types.

Compilers analyze expressions, check their semantics, and perform type checking and arithmetic conversions where required. That is why we do not need to write ~(int)c (an explicit type cast) to apply ~ to an unsigned char: the promotion happens implicitly.

Note:

  1. The value of c is promoted to int in the expression ~c, but the type of c itself is still unsigned char; it does not change.

  2. Important: the result of the ~ operation is of type int! Check the code below (I don't have the VS compiler; I am using gcc):

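    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        unsigned char c = 4;

        /* Sizes of the promoted type and of the operand's own type. */
        printf(" sizeof(int) = %zu,\n sizeof(unsigned char) = %zu",
               sizeof(int),
               sizeof(unsigned char));
        /* sizeof reports the type of ~c, which is the promoted type. */
        printf("\n sizeof(~c) = %zu", sizeof(~c));
        printf("\n");
        return EXIT_SUCCESS;
    }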

    compile it, and run:

    $ gcc -std=gnu99 -Wall -pedantic x.c -o x
    $ ./x
     sizeof(int) = 4,
     sizeof(unsigned char) = 1
     sizeof(~c) = 4

    Notice that the size of the result of ~c is the same as the size of int, not the size of unsigned char: the result of the ~ operator in this expression is an int. The same wording about promotion is used for the other operators in 6.5.3.3 Unary arithmetic operators, for example:

    3. The result of the unary - operator is the negative of its (promoted) operand. The integer promotions are performed on the operand, and the result has the promoted type.

Now, as @haccks also explained in his answer, the result of ~c on a 32-bit machine for c = 4 is:

1111 1111 1111 1111 1111 1111 1111 1011

In decimal it is -5, which is the output of your second code!

In your first code, one more line is worth understanding: b = ~c;. Because b is an unsigned char variable and the result of ~c is of type int, the value of ~c is converted to unsigned char on assignment (reduced modulo 256 for an 8-bit unsigned char), which keeps only the low byte, as follows:

      1111 1111 1111 1111 1111 1111 1111 1011  // -5
    & 0000 0000 0000 0000 0000 0000 1111 1111  // 0xFF, one byte
      ---------------------------------------
                                    1111 1011

The decimal equivalent of 1111 1011 is 251. You could get the same effect using:

printf("\n ~c = %d", ~c  & 0xFF); 

or, as suggested by @ouah in his answer, using an explicit cast.