Need explanations for Linux bash builtin exec command behavior

linux bash exec flow

In this particular case, you have the exec in a pipeline. In order to execute a series of pipeline commands, the shell must initially fork, making a sub-shell. (Specifically it has to create the pipe, then fork, so that everything run "on the left" of the pipe can have its output sent to whatever is "on the right" of the pipe.)

To see that this is in fact what is happening, compare:

{ ls; echo this too; } | cat

with:

{ exec ls; echo this too; } | cat

The former runs ls without leaving the sub-shell, so that this sub-shell is therefore still around to run the echo. The latter runs ls by leaving the sub-shell, which is therefore no longer there to do the echo, and this too is not printed.

(The use of curly-braces { cmd1; cmd2; } normally suppresses the sub-shell fork action that you get with parentheses (cmd1; cmd2), but in the case of a pipe, the fork is "forced", as it were.)

Redirection of the current shell happens only if there is "nothing to run", as it were, after the word exec. Thus, e.g., exec >stdout 4<input 5>>append modifies the current shell, but exec foo >stdout 4<input 5>>append tries to exec command foo. [Note: this is not strictly accurate; see addendum.]

Interestingly, in an interactive shell, after exec foo >output fails because there is no command foo, the shell sticks around, but stdout remains redirected to file output. (You can recover with exec >/dev/tty. In a script, the failure to exec foo terminates the script.)

With a tip of the hat to @Pumbaa80, here's something even more illustrative:

#! /bin/bashshopt -s execfailexec ls | cat -Eecho this goes to stdoutecho this goes to stderr 1>&2

(note: cat -E is simplified down from my usual cat -vET, which is my handy go-to for "let me see non-printing characters in a recognizable way"). When this script is run, the output from ls has cat -E applied (on Linux this makes end-of-line visible as a $ sign), but the output sent to stdout and stderr (on the remaining two lines) is not redirected. Change the | cat -E to > out and, after the script runs, observe the contents of file out: the final two echos are not in there.

Now change the ls to foo (or some other command that will not be found) and run the script again. This time the output is:

$ ./demo.sh./demo.sh: line 3: exec: foo: not foundthis goes to stderr

and the file out now has the contents produced by the first echo line.

This makes what exec "really does" as obvious as possible (but no more obvious, as Albert Einstein did not put it :-) ).

Normally, when the shell goes to execute a "simple command" (see the manual page for the precise definition, but this specifically excludes the commands in a "pipeline"), it prepares any I/O redirection operations specified with <, >, and so on by opening the files needed. Then the shell invokes fork (or some equivalent but more-efficient variant like vfork or clone depending on underlying OS, configuration, etc), and, in the child process, rearranges the open file descriptors (using dup2 calls or equivalent) to achieve the desired final arrangements: > out moves the open descriptor to fd 1—stdout—while 6> out moves the open descriptor to fd 6.

If you specify the exec keyword, though, the shell suppresses the fork step. It does all the file opening and file-descriptor-rearranging as usual, but this time, it affects any and all subsequent commands. Finally, having done all the redirections, the shell attempts to execve() (in the system-call sense) the command, if there is one. If there is no command, or if the execve() call fails and the shell is supposed to continue running (is interactive or you have set execfail), the shell soldiers on. If the execve() succeeds, the shell no longer exists, having been replaced by the new command. If execfail is unset and the shell is not interactive, the shell exits.

(There's also the added complication of the command_not_found_handle shell function: bash's exec seems to suppress running it, based on test results. The exec keyword in general makes the shell not look at its own functions, i.e., if you have a shell function f, running f as a simple command runs the shell function, as does (f) which runs it in a sub-shell, but running (exec f) skips over it.)

As for why ls>out1 ls>out2 creates two files (with or without an exec), this is simple enough: the shell opens each redirection, and then uses dup2 to move the file descriptors. If you have two ordinary > redirects, the shell opens both, moves the first one to fd 1 (stdout), then moves the second one to fd 1 (stdout again), closing the first in the process. Finally, it runs ls ls, because that's what's left after removing the >out1 >out2. As long as there is no file named ls, the ls command complains to stderr, and writes nothing to stdout.

CodeHunter

Need explanations for Linux bash builtin exec command behavior

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last