How does one properly assign temporary Bash variables on a per-command basis?
$ foo=bar
$ foo=qux echo $foo
bar
This is a common bash gotcha -- and https://www.shellcheck.net/ catches it:
foo=qux echo $foo
^-- SC2097: This assignment is only seen by the forked process.
             ^-- SC2098: This expansion will not see the mentioned assignment.
The issue is that the first foo=bar is setting a bash variable, not an environment variable. Then, the inline foo=qux syntax is used to set an environment variable for echo -- however echo never actually looks at that variable. Instead $foo gets recognized as a bash variable and replaced with bar.
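To see the difference side by side, here's a small sketch. The names seen_by_echo and seen_by_child are just illustrative, and it assumes sh is available as a child shell:

```shell
foo=bar    # plain shell variable, not exported

# "$foo" is expanded by the current shell before echo starts,
# so echo is handed the argument "bar".
seen_by_echo=$(foo=qux echo "$foo")

# A child shell expands $foo itself, from the environment it was given.
seen_by_child=$(foo=qux sh -c 'echo "$foo"')

echo "echo got:  $seen_by_echo"     # bar
echo "child got: $seen_by_child"    # qux
echo "afterward: $foo"              # bar
```

The prefix assignment does reach the command's environment; it only matters to commands that actually read it.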
So back to your main question, you were basically there with your final attempt using the subshell -- except that you don't actually need the subshell:
while IFS=, read -a A; do
  IFS=,
  echo "${A[*]:1:2}"
done <<< alpha,bravo,charlie
outputs:
bravo,charlie
For completeness, here's a final example that reads in multiple lines and uses a different output separator to demonstrate that the different IFS assignments aren't stomping on each other:
while IFS=, read -a A; do
  IFS=:
  echo "${A[*]:1:2}"
done < <(echo -e 'alpha,bravo,charlie\nfoo,bar,baz')
outputs:
bravo:charlie
bar:baz
The answer is a bit simpler than the other answers are presenting:
$ foo=bar
$ foo=qux echo $foo
bar
We see "bar" because the shell expands $foo before setting foo=qux.
Simple Command Expansion -- there's a lot to get through here, so bear with me...
When a simple command is executed, the shell performs the following expansions, assignments, and redirections, from left to right.
- The words that the parser has marked as variable assignments (those preceding the command name) and redirections are saved for later processing.
- The words that are not variable assignments or redirections are expanded (see Shell Expansions). If any words remain after expansion, the first word is taken to be the name of the command and the remaining words are the arguments.
- Redirections are performed as described above (see Redirections).
- The text after the ‘=’ in each variable assignment undergoes tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal before being assigned to the variable.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
So:
- the shell sees foo=qux and saves that for later
- the shell sees $foo and expands it to "bar"
- then we now have: foo=qux echo bar
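One way to check this ordering for yourself is to defer the expansion until the command is actually running. This is only a sketch: show_foo is a made-up helper, and whether the assignment outlives a function call can vary by shell and mode, which is why both calls are wrapped in command substitutions here:

```shell
foo=bar

# Hypothetical helper: $foo is expanded when the function body runs,
# not when the calling command line is parsed.
show_foo() { echo "$foo"; }

# Expansion happens first, so echo receives the old value.
early=$(foo=qux echo "$foo")

# The assignment is already in place by the time the body expands $foo.
late=$(foo=qux show_foo)

echo "$early"    # bar
echo "$late"     # qux
```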
Once you really understand the order that bash does things, a lot of the mystery goes away.
Short answer: the effects of changing IFS are complex and hard to understand, and best avoided except for a few well-defined idioms (IFS=, read ... is one of the idioms I consider ok).
Long answer: There are a couple of things you need to keep in mind in order to understand the results you're seeing from changes to IFS:
- Using IFS=something as a prefix to a command changes IFS only for that one command's execution. In particular, it does not affect how the shell parses the arguments to be passed to that command; that's controlled by the shell's value of IFS, not the one used for the command's execution.
- Some commands pay attention to the value of IFS they're executed with (e.g. read), but others don't (e.g. echo).
Given the above, IFS=, read -a A does what you'd expect and splits its input on ",":
$ IFS=, read -a A <<<"alpha,bravo,charlie"
$ declare -p A
declare -a A='([0]="alpha" [1]="bravo" [2]="charlie")'
But echo pays no attention; it always puts spaces between the arguments it's passed, so using IFS=something as a prefix to it has no effect at all:
$ echo alpha bravo
alpha bravo
$ IFS=, echo alpha bravo
alpha bravo
So when you use IFS=, echo "${A[*]:1:2}", it's equivalent to just echo "${A[*]:1:2}", and since the shell's definition of IFS starts with space, it puts the elements of A together with spaces between them. So it's equivalent to running IFS=, echo "bravo charlie".
On the other hand, IFS=,; echo "${A[*]:1:2}" changes the shell's definition of IFS, so it does affect how the shell puts the elements together, and it comes out equivalent to echo "bravo,charlie". Unfortunately, it also affects everything else from that point on, so you either have to isolate it to a subshell or set it back to normal afterward.
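As a rough sketch of the subshell option (assuming bash, and rebuilding the array A from the earlier examples), a command substitution keeps the IFS change from leaking. The variable names joined and unjoined are just for illustration:

```shell
IFS=, read -r -a A <<< "alpha,bravo,charlie"

# The assignment happens inside $( ), which runs in a subshell,
# so the parent shell's IFS is never touched.
joined=$(IFS=,; echo "${A[*]:1:2}")

# Back in the parent shell, IFS still has its previous value.
unjoined="${A[*]:1:2}"

echo "$joined"      # bravo,charlie
echo "$unjoined"    # bravo charlie (with the default IFS)
```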
Just for completeness, here are a couple of other versions that don't work:
$ IFS=,; echo "${A[@]:1:2}"
bravo charlie
In this case, the [@] tells the shell to treat each element of the array as a separate argument, so it's left to echo to merge them, and it ignores IFS and always uses spaces.
So how about this:
$ IFS=,; echo ${A[*]:1:2}
bravo charlie
In this case, the [*] tells the shell to mash all elements together with the first character of IFS between them, giving bravo,charlie. But it's not in double-quotes, so the shell immediately re-splits it on ",", splitting it back into separate arguments again (and then echo joins them with spaces as always).
If you want to change the shell's definition of IFS without having to isolate it to a subshell, there are a few options to change it and set it back afterward. In bash, you can set it back to normal like this:
$ IFS=,
$ while read -a A; do    # Note: IFS change not needed here; it's already changed
>   echo "${A[*]:1:2}"
> done <<<alpha,bravo,charlie
bravo,charlie
$ IFS=$' \t\n'
But the $'...' syntax isn't available in all shells; if you need portability it's best to use literal characters:
IFS=' 	
'    # You can't see it, but there's a literal space and tab after the first '
Some people prefer to use unset IFS, which just forces the shell to its default behavior, which is pretty much the same as with IFS defined in the normal way.
...but if IFS had been changed in some larger context, and you don't want to mess that up, you need to save it and then set it back. If it's been changed normally, this'll work:
saveIFS=$IFS
...
IFS=$saveIFS
...but if someone thought it was a good idea to use unset IFS, this will define it as blank, giving weird results. So you can use this approach or the unset approach, but not both. If you want to make this robust against the unset conflict, you can use something like this in bash:
saveIFS=${IFS:-$' \t\n'}
...or for portability, leave off the $'...' syntax and use literal space+tab+newline:
saveIFS=${IFS:- 	
}    # Again, there's an invisible space and tab at the end of the first line
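If you'd rather restore IFS to exactly the state you found it in, unset included, one possible sketch is to record whether it was set at all. The flag name ifs_was_set is made up for this example, and note that set -- replaces the positional parameters:

```shell
# Remember both the value and whether IFS was set at all.
if [ -n "${IFS+set}" ]; then
  saveIFS=$IFS
  ifs_was_set=yes    # hypothetical flag name
else
  ifs_was_set=no
fi

IFS=,
set -- alpha bravo charlie
joined="$*"          # "$*" joins with the first character of IFS

# Put IFS back exactly as it was: same value, or unset again.
if [ "$ifs_was_set" = yes ]; then
  IFS=$saveIFS
else
  unset IFS
fi
rejoined="$*"        # default behavior again: joined with spaces

echo "$joined"       # alpha,bravo,charlie
echo "$rejoined"     # alpha bravo charlie (assuming IFS started out default or unset)
```

This sticks to POSIX syntax, so it should behave the same outside bash, but treat it as a starting point rather than a drop-in.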
All in all, it's a lot of mess full of traps for the unwary. I recommend avoiding it whenever possible.