gcc-10.0.1 Specific Segfault


Summary: This appears to be a bug in gcc, related to string optimization. A self-contained testcase is below. There was initially some doubt as to whether the code is correct, but I think it is.

I have reported the bug as PR 93982. A proposed fix was committed, but it does not cover all cases, which led to the follow-up PR 94015 (godbolt link).

You should be able to work around the bug by compiling with the flag -fno-optimize-strlen.
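If changing the build flags is awkward, GCC's optimize function attribute may offer a per-function equivalent. The following is only a sketch under assumptions: I have not tested it against the affected snapshot, the GCC manual describes this attribute as a debugging aid, and the macro name NO_STRLEN_OPT, the function name foo_guarded, and the stub R_alloc are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* GCC-only attribute; expands to nothing elsewhere (for example on clang).
   Assumption: "no-optimize-strlen" is treated like -fno-optimize-strlen. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_STRLEN_OPT __attribute__((optimize("no-optimize-strlen")))
#else
#define NO_STRLEN_OPT
#endif

struct a {
    const char **target;
};

/* Hypothetical stand-in for R's allocator, which returns char *. */
char *R_alloc(void)
{
    static const char *slots[5];
    return (char *) slots;
}

/* Same stores as the test case, with the strlen pass disabled
   for this one function only (on GCC). */
NO_STRLEN_OPT struct a foo_guarded(void)
{
    struct a res;
    res.target = (const char **) R_alloc();
    res.target[0] = "12345678";
    res.target[1] = "";
    res.target[2] = "";
    res.target[3] = "";
    res.target[4] = "";
    return res;
}
```

The command-line flag remains the workaround I actually tried; treat the attribute as something to experiment with.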


I was able to reduce your test case to the following minimal example (also on godbolt):

    struct a {
        const char ** target;
    };

    char* R_alloc(void);

    struct a foo(void) {
        struct a res;
        res.target = (const char **) R_alloc();
        res.target[0] = "12345678";
        res.target[1] = "";
        res.target[2] = "";
        res.target[3] = "";
        res.target[4] = "";
        return res;
    }

With gcc trunk (gcc version 10.0.1 20200225 (experimental)) and -O2 (all other options turned out to be unnecessary), the generated assembly on amd64 is as follows:

    .LC0:
            .string "12345678"
    .LC1:
            .string ""
    foo:
            subq    $8, %rsp
            call    R_alloc
            movq    $.LC0, (%rax)
            movq    $.LC1, 16(%rax)
            movq    $.LC1, 24(%rax)
            movq    $.LC1, 32(%rax)
            addq    $8, %rsp
            ret

So you are quite right that the compiler is failing to initialize res.target[1] (note the conspicuous absence of movq $.LC1, 8(%rax)).
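To check a particular compiler build without reading assembly, the test case can be wrapped in a small self-check. This is a sketch, not the original reproducer: the stub R_alloc, the poison string, and the helper all_slots_stored are hypothetical names introduced here. The stub pre-fills every slot with a recognizable pointer so that a dropped store shows up as a leftover poison value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct a {
    const char **target;
};

/* Hypothetical stand-in for R's allocator: a static buffer whose slots
   are pre-filled with a recognizable poison pointer. */
static const char poison[] = "POISON";
static const char *slots[5];

char *R_alloc(void)
{
    for (size_t i = 0; i < 5; i++)
        slots[i] = poison;
    return (char *) slots;
}

/* The function from the reduced test case, unchanged. */
struct a foo(void)
{
    struct a res;
    res.target = (const char **) R_alloc();
    res.target[0] = "12345678";
    res.target[1] = "";
    res.target[2] = "";
    res.target[3] = "";
    res.target[4] = "";
    return res;
}

/* Returns 1 if foo() overwrote every slot, 0 if any slot still holds
   the poison pointer (i.e. a store was dropped). */
int all_slots_stored(void)
{
    struct a r = foo();
    for (size_t i = 0; i < 5; i++)
        if (r.target[i] == poison)
            return 0;
    return 1;
}
```

Call all_slots_stored() from main and compile with -O2. To mirror the original setup more closely, it is probably better to put the R_alloc stub in a separate source file so the optimizer cannot see its body.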

It is interesting to play with the code and see what affects the "bug". Perhaps significantly, changing the return type of R_alloc to void * makes it go away, and gives you "correct" assembly output. Maybe less significantly but more amusingly, changing the string "12345678" to be either longer or shorter also makes it go away.


Previous discussion, now resolved - the code is apparently legal.

The question I have is whether your code is actually legal. The fact that you take the char * returned by R_alloc() and cast it to const char **, and then store a const char * seems like it might violate the strict aliasing rule, as char and const char * are not compatible types. There is an exception that allows you to access any object as char (to implement things like memcpy), but this is the other way around, and as best I understand it, that's not allowed. It makes your code produce undefined behavior and so the compiler can legally do whatever the heck it wants.

If this is so, the correct fix would be for R to change their code so that R_alloc() returns void * instead of char *. Then there would be no aliasing problem. Unfortunately, that code is outside your control, and it's not clear to me how you can use this function at all without violating strict aliasing. A workaround might be to interpose a temporary variable, e.g. void *tmp = R_alloc(); res.target = tmp; which solves the problem in the test case, but I'm still not sure if it's legal.
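Written out in full, the temporary-variable workaround mentioned above would look roughly like this. The stub R_alloc and the name foo_with_tmp are assumptions for illustration; only the two lines involving tmp differ from the test case.

```c
#include <assert.h>
#include <string.h>

struct a {
    const char **target;
};

/* Hypothetical stand-in for R's allocator, which returns char *. */
char *R_alloc(void)
{
    static const char *slots[5];
    return (char *) slots;
}

struct a foo_with_tmp(void)
{
    struct a res;
    void *tmp = R_alloc();  /* char * converts implicitly to void * */
    res.target = tmp;       /* void * converts implicitly to const char ** */
    res.target[0] = "12345678";
    res.target[1] = "";
    res.target[2] = "";
    res.target[3] = "";
    res.target[4] = "";
    return res;
}
```

Going through void * avoids the explicit cast, which is why it seemed worth trying, but it is a workaround for the observed symptom rather than a fix for the compiler.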

However, I am not sure of this "strict aliasing" hypothesis, because compiling with -fno-strict-aliasing, which AFAIK is supposed to make gcc allow such constructs, does not make the problem go away!


Update. Trying some different options, I found that either -fno-optimize-strlen or -fno-tree-forwprop will result in "correct" code being generated. Also, using -O1 -foptimize-strlen yields the incorrect code (but -O1 -ftree-forwprop does not).

After a little git bisect exercise, the error seems to have been introduced in commit 34fcf41e30ff56155e996f5e04.


Update 2. I tried digging into the gcc source a little bit, just to see what I could learn. (I don't claim to be any sort of compiler expert!)

It looks like the code in tree-ssa-strlen.c is meant to keep track of strings appearing in the program. As near as I can tell, the bug is that in looking at the statement res.target[0] = "12345678"; the compiler conflates the address of the string literal "12345678" with the string itself. (That seems to be related to this suspicious code which was added in the aforementioned commit, where if it tries to count the bytes of a "string" that is actually an address, it instead looks at what that address points to.)

So it thinks that the statement res.target[0] = "12345678", instead of storing the address of "12345678" at the address res.target, is storing the string itself at that address, as if the statement were strcpy(res.target, "12345678"). Note for what's ahead that this would result in the trailing nul being stored at address res.target+8 (at this stage in the compiler, all offsets are in bytes).

Now when the compiler looks at res.target[1] = "", it likewise treats this as if it were strcpy(res.target+8, ""), the 8 coming from the size of a char *. That is, as if it were simply storing a nul byte at address res.target+8. However, the compiler "knows" that the previous statement already stored a nul byte at that very address! As such, this statement is "redundant" and can be discarded (here).

This explains why the string has to be exactly 8 characters long to trigger the bug. (Though other multiples of 8 can also trigger the bug in other situations.)
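The byte arithmetic behind that coincidence can be illustrated in ordinary C. This is only a model of the reasoning described above, not of gcc's internals, and the helper names are made up. It computes where strcpy would put the terminating nul and compares that with the byte offset of pointer slot 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of the terminating nul after strcpy(buf, s).
   Returns (size_t)-1 if s does not fit in the scratch buffer. */
size_t nul_offset_after_strcpy(const char *s)
{
    char buf[64];
    if (strlen(s) >= sizeof buf)
        return (size_t) -1;
    memset(buf, 'x', sizeof buf);
    strcpy(buf, s);
    return strlen(buf);
}

/* Byte offset of pointer slot i in an array of const char *. */
size_t slot_offset(size_t i)
{
    return i * sizeof(const char *);
}

/* 1 if the nul from the imagined strcpy lands exactly where slot 1
   begins, which is the case the pass mistakes for a redundant store. */
int collides_with_slot1(const char *s)
{
    return nul_offset_after_strcpy(s) == slot_offset(1);
}
```

On amd64, where a pointer is 8 bytes, collides_with_slot1("12345678") is 1 while a 7- or 9-character string gives 0, which matches the observation that changing the string length hides the bug.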