C++ char array null terminator location


In the case of a string literal the compiler is actually reserving an extra char element for the \0 element.

// Create a new char array
char* str2 = (char*) malloc( strlen(str1) );

This is a common mistake among new C programmers. When allocating storage for a C string you need to allocate the number of characters plus 1 more to store the terminating \0. Because the extra byte was not allocated here, the following line is also illegal:

// Null-terminate the second one
str2[strlen(str1)] = '\0';

Here you're actually writing past the end of the memory you allocated. When you allocate X elements, the last legal element you can access is at index X - 1. Writing to index X causes undefined behavior. It will often appear to work, but it is a ticking time bomb.

The proper way to write this is as follows:

size_t size = strlen(str1) + sizeof(char);
char* str2 = (char*) malloc(size);
strncpy( str2, str1, size);

// Output the second one
cout << "Str2: " << str2 << endl;

In this example an explicit str2[size - 1] = '\0' isn't actually needed. When the source string is shorter than the count passed to it, strncpy fills the rest of the destination with null terminators. Here str1 contains only size - 1 characters, so the final element of str2 is filled with \0.
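The padding behavior described above can be checked with a small sketch. The function name `strncpy_pads_with_nul` and the 16-byte buffer are assumptions made for illustration; they are not part of the original code.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into a 16-byte buffer that was pre-filled
// with 'x', then reports whether every byte after the copied characters
// was overwritten with '\0' by strncpy.
bool strncpy_pads_with_nul(const char* src) {
    char buffer[16];
    const std::size_t length = std::strlen(src);

    // If src does not fit with its terminator, strncpy would NOT
    // null-terminate the buffer, so report failure instead.
    if (length >= sizeof buffer) {
        return false;
    }

    std::memset(buffer, 'x', sizeof buffer);    // sentinel fill
    std::strncpy(buffer, src, sizeof buffer);   // copies, then pads with '\0'

    for (std::size_t i = length; i < sizeof buffer; ++i) {
        if (buffer[i] != '\0') {
            return false;
        }
    }
    return true;
}
```

The guard at the top matters: strncpy only pads when the source is shorter than the count, and it leaves the destination unterminated when the source is at least as long as the count.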


Is it actually allocating an array of length 12 instead of 11, with the 12th character being '\0'?

Yes.

But doesn't writing to str2[11] imply that we are writing outside of the allocated memory space of str2, since str2[11] is the 12th byte, but we only allocated 11 bytes?

Yes.

Would it be better to use malloc( strlen(str1) + 1 ) instead of malloc( strlen(str1) )?

Yes, because the second form is not long enough to copy the string into.
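As a minimal sketch of the `strlen(str1) + 1` form, the helper below allocates room for the terminator and copies it along with the characters. The name `copy_c_string` is hypothetical, and using `memcpy` rather than `strncpy` is a choice made for this example only.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a malloc'd copy of src, or nullptr if the
// allocation fails. The caller releases the copy with std::free.
char* copy_c_string(const char* src) {
    // strlen does not count the terminator, so add 1 for it.
    const std::size_t size = std::strlen(src) + 1;

    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr) {
        return nullptr;
    }

    // Copy all size bytes, which includes the trailing '\0'.
    std::memcpy(copy, src, size);
    return copy;
}
```

With `"hello world"` as input this allocates 12 bytes, so index 11 is inside the allocation and holds the terminator.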

Running this code does not seem to cause any compiler warnings or run-time errors.

Detecting this in all but the simplest cases is a very difficult problem, so compilers generally don't attempt it.


This sort of complexity is exactly why you should be using std::string rather than raw C-style strings if you are writing C++. It's as simple as this:

std::string str1 = "hello world";
std::string str2 = str1;
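To illustrate what the std::string version gives you, here is a small sketch. The function `copy_is_independent` is a name invented for this example. It checks that the copy owns its own storage, has the expected length, and still exposes a null-terminated buffer through `c_str()`.

```cpp
#include <string>

// Hypothetical check: copies a std::string, modifies the copy, and confirms
// that the original is untouched and the copy is still null-terminated.
bool copy_is_independent() {
    std::string str1 = "hello world";
    std::string str2 = str1;   // allocation and terminator handled for us

    str2[0] = 'j';             // changes only the copy

    return str1 == "hello world"
        && str2 == "jello world"
        && str2.size() == 11                   // length excludes the terminator
        && str2.c_str()[str2.size()] == '\0';  // c_str() is null-terminated
}
```

No manual size arithmetic appears anywhere, which is the point of the recommendation.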


The literal "hello world" is a char array that looks like:

{ 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0' }

So, yes, the literal is 12 chars in size.
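The difference between the storage size of the literal and its string length can be seen directly with `sizeof` and `strlen`. The two function names below are made up for this sketch.

```cpp
#include <cstddef>
#include <cstring>

// Bytes occupied by the literal, including the trailing '\0'.
// The literal has type const char[12], so sizeof yields 12.
std::size_t literal_storage_size() {
    return sizeof("hello world");
}

// Characters before the terminator, as strlen counts them.
std::size_t literal_length() {
    return std::strlen("hello world");
}
```

Here `literal_storage_size()` returns 12 and `literal_length()` returns 11; the one-byte gap is the terminator that `malloc( strlen(str1) )` fails to account for.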

Also, malloc( strlen(str1) ) allocates 1 byte less than is needed, since strlen returns the length of the string, not including the NUL terminator. Writing to str2[strlen(str1)] writes 1 byte past the end of the memory that you've allocated.

Your compiler won't tell you that, but if you run your program through valgrind or a similar program available on your system it'll tell you if you're accessing memory you shouldn't be.