Unexpected strings escape in process argv


Use the $'...' form to pass escape sequences such as \t, \n or \r in bash. (A NUL byte, \0, cannot be passed this way, because it terminates the argument string.)

$ python -c 'import sys; print(sys.argv)' $SHELL '$SHELL' \t $'\t' $'\\t'
['-c', '/bin/bash', '$SHELL', 't', '\t', '\\t']

As per man bash:

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:

\a     alert (bell)
\b     backspace
\e
\E     an escape character
\f     form feed
\n     new line
\r     carriage return
\t     horizontal tab
\v     vertical tab
\\     backslash
\'     single quote
\"     double quote
\nnn   the eight-bit character whose value is the octal value nnn (one to three digits)
\xHH   the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
\uHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex digits)
\UHHHHHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to eight hex digits)
\cx    a control-x character


In both python and node.js, there is a difference between the way print works with scalar strings and the way it works with collections.

Strings are printed simply as a sequence of characters. The resulting output is generally what the user expects to see, but it cannot be used as the representation of the string in the language. But when a list/array is printed out, what you get is a valid list/array literal, which can be used in a program.

For example, in python:

>>> print("x")
x
>>> print(["x"])
['x']

When printing the string, you just see the characters. But when printing the list containing the string, python adds quote characters, so that the output is a valid list literal. Similarly, it would add backslashes, if necessary:

>>> print("\\")
\
>>> print(["\\"])
['\\']

node.js works in exactly the same way:

$ node -p '"\\"'
\
$ node -p '["\\"]'
[ '\\' ]

When you print the string containing a single backslash, you just get a single backslash. But when you print a list/array containing a string consisting of a single backslash, you get a quoted string in which the backslash is escaped with a backslash, allowing it to be used as a literal in a program.
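In Python, the difference comes from the two string conversions the language provides: print applies str() to its argument, while a list builds its text from the repr() of each element. A minimal sketch of this, where the variable names are only for illustration:

```python
# A string consisting of a single backslash character.
s = "\\"

# str() gives the raw characters; this is what print(s) writes.
as_text = str(s)

# repr() gives a literal that could be pasted back into a program.
as_literal = repr(s)

# The str() of a list is built from the repr() of each element,
# which is why the backslash appears doubled inside the brackets.
list_text = str([s])

print(len(s))        # number of characters actually in the string
print(as_text)
print(as_literal)
print(list_text)
```

So when a list of arguments shows '\\t', the string itself contains only two characters, a backslash and a t.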

As with the printing of strings in node and python, the standard echo shell utility just prints the actual characters in the string. In a standard shell, there is no mechanism similar to node and python printing of arrays. Bash, however, does provide a mechanism for printing out the value of a variable in a format which could be used as part of a bash program:

$ quote=\"
# $quote is a single character:
$ echo "${#quote}"
1
# $quote prints out as one double-quote character, as you would expect
$ echo "$quote"
"
# If you need a representation, use the 'declare' builtin:
$ declare -p quote
declare -- quote="\""
# You can also use the "%q" printf format (a bash extension)
$ printf "%q\n" "$quote"
\"

(References: bash manual on declare and printf. Or type help declare and help printf in a bash session.)
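For comparison, Python's standard library has a rough counterpart to %q in shlex.quote, which returns a version of a string that is safe to use as a single word in a POSIX shell command. The sketch below assumes nothing beyond the standard library. The quoted form it produces is not necessarily the same text that bash's %q would print, only an equivalent one.

```python
import shlex

# One character, like $quote in the bash session above.
quote = '"'

# A form that can be pasted into a shell command as a single word.
quoted = shlex.quote(quote)

print(len(quote))
print(quoted)

# Round trip: splitting the quoted form with shell-like rules
# gives the original string back.
print(shlex.split(quoted) == [quote])
```

Here shlex.quote wraps the character in single quotes rather than escaping it with a backslash. Both spellings denote the same one-character string.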


That's not the full story, though. It is also important to understand how the shell interprets what you type. In other words, when you write

some_utility  \" "\"" '\"'

What does some_utility actually see in the argv array?

In most contexts in a standard shell (including bash), C-style escape sequences like \t are not interpreted as such. (The standard shell utility printf does interpret these sequences when they appear in a format string, and some other standard utilities also interpret the sequences, but the shell itself does not.) The handling of backslash by a standard shell depends on the context:

  • Unquoted strings: the backslash quotes the following character, whatever it is (unless it is a newline, in which case both the backslash and the newline are removed from the input).

  • Double-quoted strings: backslash can be used to escape the characters $, \, ", `; also, a backslash followed by a newline is removed from the input, as in an unquoted string. In bash, if history expansion is enabled (as it is by default in interactive shells), backslash can also be used to avoid history expansion of !, but the backslash is retained in the final string.

  • Single-quoted strings: backslash is treated as a normal character. (As a result, there is no way to include a single quote in a single-quoted string.)
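These three contexts can be tried out without starting a shell by using Python's shlex module, which splits a string using shell-like rules. This is only an approximation, offered as a sketch. shlex performs no expansions, knows nothing about bash extensions, and inside double quotes it treats a backslash as special only before \ and ", not before $ or `. The trailing \t word is added here for illustration.

```python
import shlex

# The command line from the text, written as a raw string so that
# Python itself leaves the backslashes alone.
command_line = r'''some_utility \" "\"" '\"' \t'''

# POSIX-style splitting is the default mode of shlex.split.
argv = shlex.split(command_line)

for index, arg in enumerate(argv):
    # repr() makes every character of each argument visible.
    print(index, repr(arg))
```

The unquoted and double-quoted forms both yield a single " character, the single-quoted form keeps the backslash, and the unquoted \t becomes a plain t.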

Bash adds two more quoting mechanisms:

  • C-style quoting, $'...'. If a single-quoted string is preceded by a dollar sign, then C-style escape sequences inside the string are interpreted in roughly the same way a C compiler would. This includes the standard whitespace characters such as newline (\n), octal, hexadecimal and Unicode escapes (\010, \x0a, \u000A, \U0000000A), plus a few non-C sequences including "control" characters (\cJ) and the ESC character \e or \E (the same as \x1b). Backslashes can also be used to escape \, ' and ". (Note that this is a different list from the list of backslashable characters in double-quoted strings; here, a backslash before a dollar sign or a backtick is not special, while a backslash before a single quote is special; moreover, the backslash-newline sequence is not interpreted.)

  • Locale-specific Translation: $"...". If a double-quoted string is preceded by a dollar sign, backslashes (and variable expansions and command substitutions) are interpreted as in a normal double-quoted string, and then the string is looked up in a message catalog determined by the current locale.
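To see what bash itself does with $'...', one option is to let bash parse a few words and report the resulting arguments byte for byte. The sketch below assumes a bash executable is on the PATH. The helper name argv_from_bash is hypothetical, and it passes its input to bash unchanged, so it should only be given trusted text.

```python
import shutil
import subprocess

def argv_from_bash(words):
    """Let bash parse `words` and return the arguments a command would receive."""
    # printf '%s\0' writes each argument followed by a NUL byte, so the
    # arguments can be separated again unambiguously.
    script = "printf '%s\\0' " + words
    result = subprocess.run(["bash", "-c", script], capture_output=True, check=True)
    # Drop the empty piece after the final NUL.
    return result.stdout.split(b"\0")[:-1]

# Unquoted, single-quoted, and three $'...' forms (raw string: Python
# passes the backslashes through to bash untouched).
sample = r"\t '\t' $'\t' $'\\t' $'\x41'"

if shutil.which("bash"):  # skip quietly when bash is not installed
    for arg in argv_from_bash(sample):
        print(len(arg), repr(arg))
```

Only the $'\t' word arrives as a one-byte tab. The single-quoted and $'\\t' words arrive as a backslash followed by t, and the unquoted word arrives as a bare t.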

(References: POSIX standard, Bash manual.)