Does ISO C allow aliasing of the argv[] pointers supplied to main()?


By my reading, the answer to the titular question is "yes", since nowhere is it explicitly forbidden and nowhere does the standard urge or require the use of restrict-qualified argv, but the answer might turn on the interpretation of "and retain their last-stored values between program startup and program termination.".

I concur that the standard does not explicitly forbid elements of the argument vector from being aliases of each other. I don't think the modifiability and value-retention provisions contradict that position, but they do suggest to me that the committee did not consider the possibility of aliasing.

The practical import of this question is that if the answer to it is indeed "yes", a portable program that wishes to modify the strings in argv must first perform (the equivalent of) POSIX strdup() on them for safety.

Indeed, that's exactly why I think the committee didn't even consider the possibility. If they had, then surely they would have at least included a footnote to that same effect, or else explicitly specified that the argument strings are all distinct.

I'm inclined to think that this detail escaped the committee's attention because, in practice, implementations do provide distinct strings, and because it is rare for programs to modify their argument strings (though modifying argv itself is somewhat more common). If the committee agreed to issue an official interpretation in this area, I would not be surprised if they came down against the possibility of aliasing.

Until and unless such an interpretation is issued, however, you are right that strict conformance does not permit you to rely a priori on argv elements not being aliased.
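For what it's worth, the defensive copy mentioned above could be sketched roughly as follows. This is only an illustration: copy_args is a hypothetical helper name (nothing in ISO C or POSIX provides it), and it uses malloc/memcpy rather than strdup() so that it stays within ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of ISO C or POSIX: returns a newly
 * allocated, NULL-terminated vector whose strings are private copies
 * of argv[0..argc-1], or NULL if any allocation fails. */
char **copy_args(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);

    char **copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;   /* include the terminator */
        copy[i] = malloc(len);
        if (copy[i] == NULL) {
            while (i-- > 0)                 /* undo earlier allocations */
                free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len);
    }
    copy[argc] = NULL;                      /* mirror argv[argc] == NULL */
    return copy;
}
```

A program would call this once at the top of main() and modify only the copies; writing through one copy then cannot affect another, whether or not the original argv strings were aliased.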


The way it works on common *nix platforms (including Linux and Mac OS, presumably FreeBSD too) is that argv is an array of pointers into a single memory area containing the argument strings one after another (separated only by the null terminator). Using execl() does not change this--even if the caller passes the same pointer multiple times, the source string is copied multiple times, with no special behavior for identical (i.e. aliased) pointers (an uncommon case with no great benefit to optimize).

However, C does not require this implementation. The truly paranoid may want to copy every string before modifying it, perhaps skipping the copies if memory is limited and a loop over argv shows that none of the pointers actually alias (at least among those the program intends to modify). This seems overly paranoid unless you are developing flight software or the like.
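The "loop over argv" check could look something like the sketch below. The function name args_overlap is made up for illustration. It relies on one observation: two null-terminated strings that share any byte must run on to the same terminator, so it is enough to compare the addresses of the terminators. Only pointer equality is used, because relational comparison (<, >) of pointers into different objects is undefined in ISO C.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any two of argv[0..argc-1] share
 * storage (identical pointers, or one string being a tail of another),
 * and 0 otherwise. Overlapping strings necessarily end at the same
 * terminator byte, so comparing terminator addresses is sufficient. */
int args_overlap(int argc, char **argv)
{
    assert(argv != NULL);

    for (int i = 0; i < argc; i++) {
        const char *end_i = argv[i] + strlen(argv[i]);
        for (int j = i + 1; j < argc; j++) {
            const char *end_j = argv[j] + strlen(argv[j]);
            if (end_i == end_j)
                return 1;
        }
    }
    return 0;
}
```

If this returns 0 for the arguments the program intends to modify, the copies can be skipped. The check is quadratic in argc, which should not matter for realistic argument counts.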


As a data point, I have compiled and run the following programs on several systems. (Disclaimer: these programs are intended to provide a data point, but as we'll see, they do not end up answering the question as stated.)

p1.c:

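    #include <stdio.h>
    #include <unistd.h>

    int main()
    {
        char test[] = "test";
        /* the terminating null pointer must be cast in a variadic call */
        execl("./p2", "p2", test, test, (char *)NULL);
        perror("execl");    /* reached only if execl fails */
        return 1;
    }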

p2.c:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int i;
        for(i = 1; i < argc; i++) printf("%s ", argv[i]);
        printf("\n");
        argv[1][0] = 'b';
        for(i = 1; i < argc; i++) printf("%s ", argv[i]);
        printf("\n");
    }

Every place I've tried it (under MacOS and several flavors of Unix and Linux) it has printed

    test test 
    best test 

Since the second line was never "best best", this proves that, on the tested systems, by the time the second program is run, the strings are no longer aliased.

Of course, this test does not prove that strings in argv can never be aliased, under any circumstances, under any system out there. I think all it proves is that, unsurprisingly, each of the tested operating systems recopies the argument list at least once between the time p1 calls execl and the time that p2 is actually invoked. In other words, the argument vector constructed by the invoking program is not used directly in the called program, and in the process of copying it, it is (again not surprisingly) "normalized", meaning that the effects of any aliasing are lost.

(I say this is not surprising because if you think about the way the exec family of system calls actually works, and the way process memory is laid out under Unix-like systems, there's no way that the invoking program's argument list could be used directly; it has to be copied, at least once, into the address space of the new, exec'ed process. Furthermore, any obvious and straightforward method of copying the argument list is always and automatically going to "normalize" it in this way; the kernel would have to do significant, extra, totally unnecessary work in order to detect and preserve any aliasing.)
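To make the "normalizing" effect concrete, here is a user-space sketch of the obvious copying strategy. It is not taken from any kernel; pack_args is an invented name, and it merely assumes the behaviour described above: each string is copied, in order, into one contiguous block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative simulation only (not real kernel code): copies the
 * strings in[0..argc-1] back to back into one malloc'd block and
 * points out[i] at each copy. out must have room for argc + 1
 * pointers. Returns the block (caller frees it) or NULL on failure. */
char *pack_args(int argc, char *const in[], char **out)
{
    assert(argc >= 0 && in != NULL && out != NULL);

    size_t total = 0;
    for (int i = 0; i < argc; i++)
        total += strlen(in[i]) + 1;

    char *block = malloc(total ? total : 1);
    if (block == NULL)
        return NULL;

    char *p = block;
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(in[i]) + 1;
        memcpy(p, in[i], len);  /* each source pointer is copied afresh, */
        out[i] = p;             /* so aliased inputs get separate copies */
        p += len;
    }
    out[argc] = NULL;
    return block;
}
```

Feeding it the vector { "p2", test, test } yields three separate strings laid out one after another, so a write through out[1] leaves out[2] untouched, which matches what the experiment observed.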

Just in case it matters, I modified the first program in this way:

    #include <stdio.h>
    #include <unistd.h>

    int main()
    {
        char test[] = "test";
        char *argv[] = {"p2", test, test, NULL};
        execv("./p2", argv);
    }

The results were unchanged.


With all of this said, I agree that this issue does seem like an oversight or buglet in the standards. I'm not aware of any clause guaranteeing that the strings pointed to by argv are distinct, meaning that a paranoidly-written program probably can't depend on such a guarantee, no matter how likely it is that (as this answer demonstrates) any reasonable implementation will do it that way.