Why does the complement behave differently through printf?
In this statement:
printf("%d", ~c);
c is converted to int (see note 1 below) before the ~ (bitwise complement) operator is applied. This is because the integer promotions are performed on the operand of ~. Here an object of type unsigned char is promoted to (signed) int, and the result of evaluating ~ is then passed to printf, where it matches the %d format specifier.
Note that the default argument promotions (which apply because printf is a variadic function) play no role here, since the argument already has type int.
On the other hand, in this code:
unsigned char c = 4, d;
d = ~c;
printf("%d", d);
the following steps occur:
- c is subject to the integer promotions because of ~ (in the same way as described above).
- The rvalue ~c is evaluated as a (signed) int value (e.g. -5).
- d = ~c performs an implicit conversion from int to unsigned char, because d has that type. You may think of it as d = (unsigned char) ~c. Notice that d cannot be negative (this is a general rule for all unsigned types).
- printf("%d", d); invokes the default argument promotions, so d is converted to int and the (nonnegative) value is preserved (i.e. the int type can represent all values of the unsigned char type).
1) This assumes that int can represent all values of unsigned char (see T.C.'s comment below), which is very likely in practice. More specifically, we assume that INT_MAX >= UCHAR_MAX holds. Typically sizeof(int) > sizeof(unsigned char) holds and a byte consists of eight bits. Otherwise c would be converted to unsigned int (per C11 §6.3.1.1/p2), and the format specifier would have to be changed to %u accordingly in order to avoid undefined behavior (C11 §7.21.6.1/p9).
In the second snippet, the unsigned char c is promoted to int in the printf statement before the ~ operation is applied. So c, which is
0000 0100
in binary, is promoted to (assuming a 32-bit int)
0000 0000 0000 0000 0000 0000 0000 0100 // Say it is x
and its bitwise complement is equal to the two's complement of the value, minus one (~x = −x − 1):
1111 1111 1111 1111 1111 1111 1111 1011
which is -5 in decimal in two's complement form.
Note that the integer promotion of c to int is also performed in
d = ~c;
before the complement operation, but the result is converted back to unsigned char because d is of type unsigned char.
C11: 6.5.16.1 Simple assignment (p2):
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
and
6.5.16 (p3):
The type of an assignment expression is the type the left operand would have after lvalue conversion.
To understand the behavior of your code, you need to learn the concept called 'integer promotions' (which happen implicitly in your code before the bitwise NOT operation on an unsigned char operand). As mentioned in the N1570 committee draft:
§ 6.5.3.3 Unary arithmetic operators
- The result of the ~ operator is the bitwise complement of its (promoted) operand (that is, each bit in the result is set if and only if the corresponding bit in the converted operand is not set). The integer promotions are performed on the operand, and the result has the promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E.
Because unsigned char has a lower conversion rank than int, and int can represent all of its values, the value of the variable c is implicitly promoted to int before the complement operator ~ is applied. This promotion is part of the language semantics (the abstract machine), so the compiler applies it for you. It is required by the standard because ~ always operates on its promoted operand.
§ 6.5 Expressions
- Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |, collectively described as bitwise operators) are required to have operands that have integer type. These operators yield values that depend on the internal representations of integers, and have implementation-defined and undefined aspects for signed types.
Compilers analyze expressions, check their semantics, and perform type checking and arithmetic conversions where required. That is why, to apply ~ to an operand of a char type, we don't need to write ~(int)c ourselves (this is called explicit type casting).
Note:
- The value of c is promoted to int in the expression ~c, but the type of c is still unsigned char; its type does not change. Don't be confused.
- Important: the result of the ~ operation is of type int! Check the code below (I don't have the VS compiler, I am using gcc):
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    unsigned char c = 4;
    printf(" sizeof(int) = %zu,\n sizeof(unsigned char) = %zu",
           sizeof(int), sizeof(unsigned char));
    printf("\n sizeof(~c) = %zu", sizeof(~c));
    printf("\n");
    return EXIT_SUCCESS;
}
Compile it, and run:
$ gcc -std=gnu99 -Wall -pedantic x.c -o x
$ ./x
 sizeof(int) = 4,
 sizeof(unsigned char) = 1
 sizeof(~c) = 4
Notice: the size of the result of ~c is the same as that of int, not that of unsigned char. The result of the ~ operator in this expression is an int! The same rule is stated for unary - in 6.5.3.3 Unary arithmetic operators:
- The result of the unary - operator is the negative of its (promoted) operand. The integer promotions are performed on the operand, and the result has the promoted type.
Now, as @haccks also explained in his answer, the result of ~c with a 32-bit int and c = 4 is:
1111 1111 1111 1111 1111 1111 1111 1011
In decimal it is -5, which is the output of your second code!
In your first code, one more line is interesting to understand: b = ~c;. Because b is an unsigned char variable and the result of ~c is of type int, the value of ~c is truncated to fit into the unsigned char type when it is stored in b, as follows:
  1111 1111 1111 1111 1111 1111 1111 1011   // -5
& 0000 0000 0000 0000 0000 0000 1111 1111   // 0xFF, one byte
-------------------------------------------
                                1111 1011
The decimal equivalent of 1111 1011 is 251. You could get the same effect using:
printf("\n ~c = %d", ~c & 0xFF);
or, as suggested by @ouah in his answer, by using an explicit cast.