What is the safest way to pass strings around in C? What is the safest way to pass strings around in C? unix unix

What is the safest way to pass strings around in C?


Well-written C code adheres to the following convention:

  • All functions return a status code of type int, where a return value of 0 indicates success, and a -1 indicates failure. On failure, the function should set errno with an appropriate value (e.g. EINVAL).
  • Values that are "reported" by a function should be reported via the use of "out parameters". In other words, one of the parameters should be a pointer to the destination object.
  • Ownership of pointers should belong to the invoker; consequently, a function should not free any of its parameters, and should only free objects that it, itself, allocates with malloc/calloc.
  • Strings should be passed either as const char* objects or as char* objects, depending on whether the string is to be overwritten. If the string is not to be modified, then const char* should be used.
  • Whenever an array is passed that is not a NUL-terminated string, a parameter should be provided indicating the the number of elements in the array or the capacity of that array.
  • When a modifiable string/buffer (i.e. char*) object is passed into a function, and that function is to overwrite, append, or otherwise modify the string, a parameter indicating the capacity of the string/buffer needs to be provided (so as to allow for dynamic buffer sizes and to avoid bufffer overflow).

I should point out that in your example code, you are returning localstr and not returnstr. Consequently, you are returning an address of an object in the current function's stack frame. The current function's stack frame will disappear once the function has returned. Invoking another function immediately afterwards will likely alter the data in that location, leading to the corruption that you have observed. Returning the address of a local variable leads to "undefined behavior" and is incorrect.

Edit
Based on your updated code (get_fullpath), it is clear that the problem is not in your function get_fullpath, but rather in the function that is calling it. Most likely, the paths variable is being supplied by a function that returns the address of a local variable. Consequently, when you create a local variable within get_fullpath, it is using the same exact location on the stack that paths previously occupied. Since "paths" is aliasing "fullpaths", it is basically overwritten with the address of the buffer that you've malloced, which is blank.

Edit 2
I have created a C Coding Conventions page on my website with more detailed recommendations, explanations, and examples for writing C code, in case you are interested. Also, the statement that localstr is being returned instead of returnstr is no longer true since the question has last been edited.


You can't return a pointer to an array that's allocated locally within the function. As soon as the function returns, that array is going to be clobbered.

Also, when you put

char localstr[MAX_STRLENGTH] = strcpy(localstr, somestr);

what happens is that strcpy() will copy the bytes into the localstr[] array, but then you have an unnecessary assignment thing going on. You could probably get the intended effect as two lines, thus ..

char localstr[MAX_STRLENGTH];strcpy(localstr, somestr);

Also, it's bad form to embed a free() call inside a function like this. Ideally the free() should be visible at the same level of scope where the malloc() occurred. By the same logic it's a little dubious to allocate memory down in a function this way.

If you want a function to modify a string, a common convention goes something like so

// use a prototype like this to use the same buffer for both input and outputint modifyMyString(char buffer[], int bufferSize) {    // .. operate you find in buffer[],    //    leaving the result in buffer[]    //    and be sure not to exceed buffer length    // depending how it went, return EXIT_FAILURE or maybe    return EXIT_SUCCESS;// or separate input and outputsint workOnString(char inBuffer[], int inBufSize, char outBuffer[], int outBufSize) {    // (notice, you could replace inBuffer with const char *)    // leave result int outBuffer[], return pass fail status    return EXIT_SUCCESS;

Not embedding malloc() or free() inside will also help avoid memory leaks.


Is your "update" example complete? I wouldn't think that would compile: it calls for a return value but you never return anything. You never do anything will fullpath, but perhaps that's deliberate, maybe your point is just to say that when you do the malloc, other things break.

Without seeing the caller, it's impossible to say definitively what is happening here. My guess is that paths is a dynamically-allocated block that was free'd before you called this function. Depending on the compiler implementation, a free'd block could still appear to contain valid data until a future malloc takes over the space.

Update: to actually answer the question

String handling is a well-known problem in C. If you create a fixed-size array to hold the string, you have to worry about a long string overflowing the allocated space. This means constantly checking string sizes on copies, using strncpy and strncat instead of the plain strcpy and strcat, or similar techniques. You can skip this and just say, "Well, no one would ever have a name longer than 60 characters" or some such, but there's always the danger then that someone will. Even on something that should have a known size, like a social security number or an ISBN, someone could make a mistake entering it and hit a key twice, or a malicious user could deliberately enter something long. Etc. Of course this is mostly an issue on data entry or reading files. Once you have a string in a field of some known size, then for any copies or other manipulation, you know the size.

The alternative is to use dynamically-allocated buffers where you can make them as big as needed. This sounds like a good solution when you first hear it, but in practice it's a giantic pain in C, because allocating the buffers and freeing them when you no longer need them is a lot of trouble. Another poster here said that the function that allocates a buffer should be the same one that frees it. A good rule of thumb, I generally agree, but ... What if a subroutine wants to return a string? So it allocates the buffer, returns it, and ... how can it free it? It can't because the whole point is that it wants to return it to the caller. The caller can't allocate the buffer because it doesn't know the size. Also, seemingly simple things like:

if (strcmp(getMeSomeString(),stringIWantToCompareItTo)==0) etc

are impossible. If the getMeSomeString function allocates the string, sure, it can return it so we do the compare, but now we've lost the handle and we can never free it. You end up having to write awkward code like

char* someString=getMeSomeString();int f=strcmp(someString,stringIWantToCompareItTo);free(someString);if (f==0)etc

So okay, it works, but readability just plummetted.

In practice, I've found that when strings can reasonably be expected to be of a knowable size, I allocate fixed-length buffers. If an input is bigger than the buffer, I either truncate it or give an error message, depending on the context. I only resort to dynamically-allocated buffers when the size is potentially large and unpredictable.