What can cause exec to fail? What happens next?


The problem with handling exec failure is that exec usually runs in a child process, while you want to do the error handling in the parent process. You can't just exit(errno), because (1) an exit status carries only 8 bits, so you don't know whether every error code fits in it, and (2) the parent can't distinguish a failure to exec from a failure exit code returned by the new program you exec'd.

The best solution I know is using pipes to communicate the success or failure of exec:

  1. Before forking, open a pipe in the parent process.
  2. After forking, the parent closes the writing end of the pipe and reads from the reading end.
  3. The child closes the reading end and sets the close-on-exec flag for the writing end.
  4. The child calls exec.
  5. If exec fails, the child writes the error code (errno) back to the parent using the pipe, then exits (preferably with _exit(), so that stdio buffers inherited from the parent are not flushed a second time).
  6. The parent reads EOF (a zero-length read) if the child successfully performed exec, since close-on-exec made the successful exec close the writing end of the pipe. If exec failed, the parent instead reads the error code and can proceed accordingly. Either way, the parent blocks until the child calls exec.
  7. The parent closes the reading end of the pipe.
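
The steps above can be sketched in C roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name spawn_checked and the choice of execvp() are assumptions made for illustration, and the child's exit status of 127 is just a conventional value.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, then exec argv[0] in the child.
 * Returns the child's pid if exec succeeded. Returns -1 with errno set
 * to the error from pipe(), fork(), or the child's failed exec. */
pid_t spawn_checked(char *const argv[])
{
    int fds[2];
    if (pipe(fds) == -1)                        /* step 1 */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        int saved = errno;
        close(fds[0]);
        close(fds[1]);
        errno = saved;
        return -1;
    }

    if (pid == 0) {                             /* child */
        close(fds[0]);                          /* step 3 */
        fcntl(fds[1], F_SETFD, FD_CLOEXEC);
        execvp(argv[0], argv);                  /* step 4 */
        int err = errno;                        /* step 5: exec failed */
        if (write(fds[1], &err, sizeof err) == -1) {
            /* nothing more the child can do about it */
        }
        _exit(127);
    }

    close(fds[1]);                              /* step 2 */
    int err = 0;
    ssize_t n;
    do {
        n = read(fds[0], &err, sizeof err);     /* step 6 */
    } while (n == -1 && errno == EINTR);
    close(fds[0]);                              /* step 7 */

    if (n != (ssize_t)sizeof err)
        return pid;     /* EOF: exec succeeded. A read error is not
                           expected here; this sketch treats it alike. */

    /* exec failed: reap the child so it does not linger as a zombie */
    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        ;
    errno = err;
    return -1;
}
```

A caller would check for -1 and inspect errno (for example ENOENT for a missing program); on success it still has to waitpid() on the returned pid to collect the new program's own exit status.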


From the exec(3) man page:

The execl(), execle(), execlp(), execvp(), and execvP() functions may fail and set errno for any of the errors specified for the library functions execve(2) and malloc(3).

The execv() function may fail and set errno for any of the errors specified for the library function execve(2).

And then from the execve(2) man page:

ERRORS

Execve() will fail and return to the calling process if:

  • [E2BIG] - The number of bytes in the new process's argument list is larger than the system-imposed limit. This limit is specified by the sysctl(3) MIB variable KERN_ARGMAX.
  • [EACCES] - Search permission is denied for a component of the path prefix.
  • [EACCES] - The new process file is not an ordinary file.
  • [EACCES] - The new process file mode denies execute permission.
  • [EACCES] - The new process file is on a filesystem mounted with execution disabled (MNT_NOEXEC in <sys/mount.h>).
  • [EFAULT] - The new process file is not as long as indicated by the size values in its header.
  • [EFAULT] - Path, argv, or envp point to an illegal address.
  • [EIO] - An I/O error occurred while reading from the file system.
  • [ELOOP] - Too many symbolic links were encountered in translating the pathname. This is taken to be indicative of a looping symbolic link.
  • [ENAMETOOLONG] - A component of a pathname exceeded {NAME_MAX} characters, or an entire path name exceeded {PATH_MAX} characters.
  • [ENOENT] - The new process file does not exist.
  • [ENOEXEC] - The new process file has the appropriate access permission, but has an unrecognized format (e.g., an invalid magic number in its header).
  • [ENOMEM] - The new process requires more virtual memory than is allowed by the imposed maximum (getrlimit(2)).
  • [ENOTDIR] - A component of the path prefix is not a directory.
  • [ETXTBSY] - The new process file is a pure procedure (shared text) file that is currently open for writing or reading by some process.

malloc() is a lot less complicated, and uses only ENOMEM. From the malloc(3) man page:

If successful, calloc(), malloc(), realloc(), reallocf(), and valloc() functions return a pointer to allocated memory. If there is an error, they return a NULL pointer and set errno to ENOMEM.


What you do after the exec() call returns depends on the context - what the program is supposed to do, what the error is, and what you might be able to do to work around the problem.

One source of trouble could be that you specified a simple program name instead of a pathname; maybe you could retry with execvp(), or convert the command into an invocation of sh -c 'what you originally specified'. Whether any of these is reasonable depends on the application. If there are major security issues involved, you probably should not try again.
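
As a rough illustration of the execvp() retry, here is a minimal sketch. The helper name exec_with_path_fallback is hypothetical, and the policy (retry only when the name contains no slash and the first attempt failed with ENOENT) is an assumption, not the only sensible choice.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns only if every attempt failed
 * (return value -1, errno set by the last exec attempt). */
int exec_with_path_fallback(const char *file, char *const argv[])
{
    execv(file, argv);          /* first attempt: treat file as a pathname */

    /* Retry through PATH only for a bare name that was not found. */
    if (errno == ENOENT && strchr(file, '/') == NULL)
        execvp(file, argv);

    return -1;
}
```

The sh -c variant would instead be a call such as execl("/bin/sh", "sh", "-c", command, (char *)NULL), where command is the original command string; that hands the string to a shell for interpretation, which is exactly why it deserves caution when security matters.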

If you specified a pathname and there is a problem with that (ENOTDIR, ENOENT, EACCES), then you may not have any sensible fallback, but you can report the error meaningfully.

In the old days (10+ years ago), some systems did not support the '#!' shebang notation, and if you were not sure whether you were executing an executable or a shell script, you tried it as an executable and then retried it as a shell script. That might or might not work if you were running a Perl script, but in those days, you wrote your Perl scripts to detect that they were being run by a shell and to re-exec themselves with Perl. Fortunately, those days are mostly over.

To the extent possible, make sure the process reports the problem so that it can be traced: write a message to a log file, to stderr, or perhaps to syslog(), so that those who have to work out what went wrong have more to go on than the hapless end user's report "I tried X and it didn't work". It is crucial that, if nothing works, the exit status is not 0, since 0 indicates success. Even that might be ignored - but you did what you could.
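
A minimal sketch of reporting the failure and exiting with a non-zero status might look like this. The helper names exec_failure_status and exec_or_die are hypothetical; the 126/127 values follow the convention shells use (127 for "command not found", 126 for "found but could not be executed") and are an assumption about what your callers expect.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Map an exec errno to a non-zero exit status (shell-style convention). */
int exec_failure_status(int err)
{
    return (err == ENOENT) ? 127 : 126;
}

/* Hypothetical helper: exec argv[0]; on failure, report and exit non-zero. */
void exec_or_die(char *const argv[])
{
    execvp(argv[0], argv);
    int err = errno;            /* save errno before stdio can change it */
    fprintf(stderr, "exec %s: %s\n", argv[0], strerror(err));
    _exit(exec_failure_status(err));
}
```

Because exec_or_die() never returns, it is meant to be called in the child after fork(); the parent then sees the non-zero status through waitpid().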