Why unused objects in STATIC lib included in final binary when SHARED lib reference them?


Here is a much simplified illustration of the linker behaviour that is puzzling you:

main.c

extern void foo(void);

int main(void)
{
    foo();
    return 0;
}

foo.c

#include <stdio.h>

void foo(void)
{
    puts(__func__);
}

bar.c

#include <stdio.h>

extern void do_bar(void);

void bar(void)
{
    do_bar();
}

do_bar.c

#include <stdio.h>

void do_bar(void)
{
    puts(__func__);
}

Let's compile all those source files to object files:

$ gcc -Wall -c main.c foo.c bar.c do_bar.c

Now we'll try to link a program, like so:

$ gcc -o prog main.o foo.o bar.o
bar.o: In function `bar':
bar.c:(.text+0x5): undefined reference to `do_bar'

The undefined function do_bar is referenced only in the definition of bar, and bar is not referenced in the program at all. Why then the linkage failure?

Quite simply, this linkage failed because we told the linker to link bar.o into the program; so it did; and bar.o contains the definition of bar, which references do_bar, which is not defined in the linkage. bar is not referenced, but do_bar is - by bar, which is linked in the program.

By default, the linker demands that any symbol that is referenced in the linkage of a program is defined in the linkage. If we compel it to link the definition of bar, then it will demand a definition of do_bar, because without a definition of do_bar it hasn't actually got a definition of bar. If it links a definition of bar, it does not first ask whether we need that definition and then tolerate undefined references to do_bar when the answer is no.

The linkage failure is of course fixable with:

$ gcc -o prog main.o foo.o bar.o do_bar.o
$ ./prog
foo

Now in this illustration, linking bar.o in the program is simply gratuitous. We can also link successfully just by not telling the linker to link bar.o.

$ gcc -o prog main.o foo.o
$ ./prog
foo

bar.o and do_bar.o are both superfluous for executing main, but the program can only be linked with both, or with neither.

But suppose foo and bar were defined in the same file?

They might be defined in the same object file, foobar.o:

$ ld -r -o foobar.o foo.o bar.o

And then:

$ gcc -o prog main.o foobar.o
foobar.o: In function `bar':
(.text+0x18): undefined reference to `do_bar'
collect2: error: ld returned 1 exit status

Now, the linker cannot link the definition of foo without also linking the definition of bar. So once again, we have to link a definition of do_bar:

$ gcc -o prog main.o foobar.o do_bar.o
$ ./prog
foo

Linked like this, prog contains definitions of foo, bar and do_bar:

$ nm prog | grep -e foo -e bar
000000000000065d T bar
0000000000000669 T do_bar
000000000000064a T foo

(T = defined function symbol).

Equally, foo and bar might be defined in the same shared library:

$ gcc -Wall -fPIC -c foo.c bar.c
$ gcc -shared -o libfoobar.so foo.o bar.o

and then this linkage:

$ gcc -o prog main.o -L. -lfoobar -Wl,-rpath=$(pwd)
./libfoobar.so: undefined reference to `do_bar'
collect2: error: ld returned 1 exit status

fails just as before, and is fixable in the same way:

$ gcc -o prog main.o do_bar.o -L. -lfoobar -Wl,-rpath=$(pwd)
$ ./prog
foo

When we link the shared library libfoobar.so rather than the object file foobar.o, our prog has a different symbol table:

$ nm prog | grep -e foo -e bar
00000000000007aa T do_bar
                 U foo

This time, prog does not contain definitions of either foo or bar. It contains an undefined reference (U) to foo, because it calls foo, and of course that reference will now be satisfied, at runtime, by the definition in libfoobar.so. There's not even an undefined reference to bar, nor should there be, since the program never calls bar.

But still, prog contains the definition of do_bar, which is now unreferenced from all functions in the symbol table.

This echoes your own SSCCE, but in a less convoluted way. In your case:

  • The object file libsub.a(shared2.o) is linked into the program to provide definitions for func2a and func2b.

  • Those definitions must be found and linked because they are referenced, respectively, in the definitions of Client_func2a and Client_func2b, which are defined in libcshared.so.

  • libcshared.so must be linked to provide a definition of Client_func1a.

  • A definition of Client_func1a must be found and linked because it is referenced from the definition of func1a.

  • And func1a is called by main.

That's why we see:

$ nm main | grep func2
                 U Client_func2a
                 U Client_func2b
00000000004009f7 T func2a
0000000000400a30 T func2b

in the symbol table of your program.

It is not at all unusual for definitions to be linked into a program for functions that it does not call. It usually happens in the way we've seen: the linkage, recursively resolving symbol references starting with main, discovers that it needs a definition of f, which it can only get by linking some object file file.o, and with file.o it also links a definition of function g, which is never called.

What is rather odd is to end up with a program like your main and like my last version of prog, which contains a definition of an uncalled function (e.g. do_bar) that is linked to resolve references from the definition of another uncalled function (e.g. bar) that is not defined in the program. Even if there are redundant function definitions, usually we can chain them back to one or more object files in the linkage where the first redundant definitions are pulled in along with some necessary definitions.

This oddity is caused, in a case like:

gcc -o prog main.o do_bar.o -L. -lfoobar -Wl,-rpath=$(pwd)

because the first redundant function definition that must be linked (bar) is provided by linking a shared library, libfoobar.so, while the definition of do_bar that is demanded by bar is not in that shared library, or any other shared library, but in an object file.

The definition of bar that's provided by libfoobar.so will stay there when the program is linked with that shared library. It won't be physically linked into the program. That's the nature of dynamic linkage. But any object file required by the linkage - whether it's a free-standing object file like do_bar.o or one that the linker extracts from an archive like libsub.a(shared2.o) - can only be linked physically into the program. So the redundant do_bar appears in the symbol table of prog. But the redundant bar, which explains why do_bar is there, isn't there. It is in the symbol table of libfoobar.so.

When you discover dead code in your program, you might like the linker to be smarter. Usually, it can be smarter, at the cost of some extra effort. You need to ask it to garbage-collect sections, and before that, you need to ask the compiler to prepare the way by generating data-sections and function-sections in the object files. See How to remove unused C/C++ symbols with GCC and ld?, and the answer

But this way of pruning dead code will not work in the unusual case where the dead code is linked in the program to satisfy redundant references from a shared library required by the linkage. The linker can only recursively garbage-collect unused sections from the ones that it outputs into the program, and it only outputs sections that are input from object files, not from shared libraries that are to be dynamically linked.

The right way to avoid the dead code in your main and my prog is not to do that peculiar kind of linkage in which a shared library will contain undefined references that the program does not call but that have to be resolved by linking dead object code into your program.

Instead, when you build a shared library, either don't leave any undefined references in it, or else leave only undefined references that shall be satisfied by its own dynamic dependencies.

So, the proper way to build my libfoobar.so is:

$ gcc -shared -o libfoobar.so foo.o bar.o do_bar.o

This gives me a shared library that has an API of:

void foo(void);
void bar(void);

for whoever wants either or both of them, and no undefined references. Then I build my program that is a client just of foo:

$ gcc -o prog main.o -L. -lfoobar -Wl,-rpath=$(pwd)
$ ./prog
foo

And it contains no dead code:

$ nm prog | grep -e foo -e bar
                 U foo

Similarly, if you build your libcshared.so without undefined references, like:

$ gcc -c -fPIC shared2.c shared1.c
$ ar -crs libsub.a  shared1.o shared2.o
$ gcc -shared -o libcshared.so cshared1.o cshared2.o -L. -lsub

and then link your program:

$ gcc -o main main.c libcmain.so  libcshared.so

it too will have no dead code:

$ nm main | grep func
                 U func1a

If you dislike the fact that libsub.a(shared1.o) and libsub.a(shared2.o) become physically linked into libcshared.so by this solution, then take the other orthodox approach to linking a shared library: leave all the func* functions undefined in libcshared.so and make libsub a shared library too, which is then a dynamic dependency of libcshared.so.


If you're just looking to get rid of unused functions, you may not need to use a shared library. For GCC, try this. For XL, replace -fdata-sections -ffunction-sections with -qfuncsect. An important related topic is the use of export/import lists and visibility options. These control whether extra symbols linked into your library are exported outside your library or not. See here for more information.