How does one properly assign temporary Bash variables on a per-command basis?


$ foo=bar
$ foo=qux echo $foo
bar

This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:

foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
             ^-- SC2098: This expansion will not see the mentioned assignment.

The issue is that the first foo=bar sets a bash variable, not an environment variable. The inline foo=qux syntax then sets an environment variable for echo -- however, echo never actually looks at that variable. Instead, the shell itself expands $foo, using its own variable, and replaces it with bar before echo ever runs.
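A quick way to see the difference is to run a child process that reads foo from its own environment. This is only a minimal sketch: bash -c stands in for any command that actually looks at the variable, and the names child_value and parent_value are illustrative.

```shell
foo=bar

# The prefix assignment goes into the child's environment, so the child sees qux.
# The single quotes stop the current shell from expanding $foo itself.
child_value=$(foo=qux bash -c 'echo "$foo"')
echo "$child_value"     # qux

# The current shell's own variable was never touched.
parent_value=$foo
echo "$parent_value"    # bar
```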

So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:

while IFS=, read -a A; do
  IFS=,; echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie

outputs:

bravo,charlie

For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:

while IFS=, read -a A; do
  IFS=:; echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')

outputs:

bravo:charlie
bar:baz
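One caveat worth checking for yourself: the IFS=: inside the loop body is an ordinary assignment, so it stays in effect after the loop finishes. The sketch below saves and restores the previous value around the loop; old_ifs and ifs_after_loop are illustrative names.

```shell
old_ifs=$IFS

while IFS=, read -r -a A; do
  IFS=:; echo "${A[*]:1:2}"     # bravo:charlie
done <<< alpha,bravo,charlie

ifs_after_loop=$IFS     # still ":" here, because the loop did not run in a subshell
IFS=$old_ifs            # put the previous value back
```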


The answer is a bit simpler than the other answers are presenting:

$ foo=bar
$ foo=qux echo $foo
bar

We see "bar" because the shell expands $foo before it processes the foo=qux assignment.

The Bash manual's section on Simple Command Expansion spells this out -- there's a lot to get through here, so bear with me:

When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.

  1. The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
  2. The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
  3. Redirections are performed as described above (see Redirections).
  4. The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.

If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.

If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.

If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.

So:

  • the shell sees foo=qux and saves that for later
  • the shell sees $foo and expands it to "bar"
  • so the command that actually runs is: foo=qux echo bar

Once you really understand the order that bash does things, a lot of the mystery goes away.
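The ordering can be observed in a single command by passing $foo as an argument to a child shell that also prints foo from its own environment. A small sketch, assuming sh is available as the child shell; the variable name result is illustrative.

```shell
foo=bar

# "$foo" in the argument list is expanded first (to bar); only then is
# foo=qux placed in the child's environment. The child prints its first
# argument followed by its own view of foo.
result=$(foo=qux sh -c 'printf "%s %s\n" "$1" "$foo"' _ "$foo")
echo "$result"    # bar qux
```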


Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).

Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:

  • Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.

  • Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).

Given the above, IFS=, read -a A does what you'd expect: it splits its input on ",":

$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'

But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:

$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo

So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with space, it puts the selected elements of A together with spaces between them. So it's equivalent to running IFS=, echo "bravo charlie".

On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, and it comes out equivalent to echo "bravo,charlie". Unfortunately, it also affects everything else from that point on, so you either have to isolate it to a subshell or set it back to normal afterward.
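Another way to keep the change contained, sketched here as an alternative rather than something from the text above, is to wrap the join in a function and declare IFS local to it, so the change disappears when the function returns. The helper name join_by is hypothetical, not a bash builtin.

```shell
# join_by SEP ELEMENT... : print the elements joined by the first character of SEP.
join_by() {
  local IFS=$1          # local: the caller's IFS is untouched after the function returns
  shift
  printf '%s\n' "$*"    # "$*" joins the positional parameters with the first char of IFS
}

A=(alpha bravo charlie)
ifs_before=$IFS
join_by , "${A[@]:1:2}"                          # bravo,charlie
[ "$IFS" = "$ifs_before" ] && echo "IFS unchanged"
```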

Just for completeness, here are a couple of other versions that don't work:

$ IFS=,; echo "${A[@]:1:2}"
bravo charlie

In this case, the [@] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.

So how about this:

$ IFS=,; echo ${A[*]:1:2}
bravo charlie

In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
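Because echo hides where one argument ends and the next begins, it can help to print each argument in angle brackets with printf instead. The sketch below assumes A holds alpha, bravo, charlie as in the earlier examples; show and demo are hypothetical helper names, and IFS is changed with local inside demo so the change does not leak.

```shell
A=(alpha bravo charlie)

# show ARG... : print every argument wrapped in <>, making argument boundaries visible.
show() { printf '<%s>' "$@"; printf '\n'; }

demo() {
  local IFS=,
  show "${A[*]:1:2}"    # <bravo,charlie>   one argument, joined with ","
  show "${A[@]:1:2}"    # <bravo><charlie>  two arguments, IFS not involved
  show ${A[*]:1:2}      # <bravo><charlie>  unquoted, so it ends up split on ","
}
demo
```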

If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:

$ IFS=,
$ while read -a A; do    # Note: IFS change not needed here; it's already changed
> echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'

But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:

IFS=' 	
'        # You can't see it, but there's a literal space and tab after the first '

Some people prefer to use unset IFS, which just forces the shell back to its default behavior; that is pretty much the same as having IFS defined in the normal way.

...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:

saveIFS=$IFS
...
IFS=$saveIFS

...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:

saveIFS=${IFS:-$' \t\n'}

...or for portability, leave off the $' ' and use literal space+tab+newline:

saveIFS=${IFS:- 	
}                # Again, there's an invisible space and tab at the end of the first line
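If the difference between an unset IFS and an empty one matters to the surrounding code, another option is to record whether IFS was set at all and restore exactly that state. This is a sketch under that assumption, separate from the approaches above; the names saved_IFS and IFS_was_set are illustrative.

```shell
# Record the current state: ${IFS+set} expands to "set" whenever IFS is set
# (even if it is empty), and to nothing when IFS is unset.
if [ -n "${IFS+set}" ]; then
  IFS_was_set=1
  saved_IFS=$IFS
else
  IFS_was_set=0
fi

IFS=,
A=(alpha bravo charlie)
joined="${A[*]:1:2}"
echo "$joined"    # bravo,charlie

# Restore exactly what was there before: the old value, or the unset state.
if [ "$IFS_was_set" = 1 ]; then
  IFS=$saved_IFS
else
  unset IFS
fi
```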

All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.