Should I escape shell arguments in Perl? Should I escape shell arguments in Perl? shell shell

Should I escape shell arguments in Perl?


If you use system $cmd, @args rather than system "$cmd @args" (an array rather than a string), then you do not have to escape the arguments because no shell is invoked (see system). system {$cmd} $cmd, @args will not invoke a shell either even if $cmd contains metacharacters and @args is empty (this is documented as part of exec). If the args are coming from user input (or other untrusted source), you will still want to untaint them. See -T in the perlrun docs, and the perlsec docs.

If you need to read the output or send input to the command, qx and readpipe have no equivalent. Instead, use open my $output, "-|", $cmd, @args or open my $input, "|-", $cmd, @args although this is not portable as it requires a real fork which means Unix only... I think. Maybe it'll work on Windows with its simulated fork. A better option is something like IPC::Run, which will also handle the case of piping commands to other commands, which neither the multi-arg form of system nor the 4 arg form of open will handle.


On Windows, the situation is a bit nastier. Basically, all Win32 programs receive one long command-line string -- the shell (usually cmd.exe) may do some interpretation first, removing < and > redirections for example, but it does not split it up at word boundaries for the program. Each program must do this parsing themselves (if they wish -- some programs don't bother). In C and C++ programs, routines provided by the runtime libraries supplied with the compiler toolchain will generally perform this parsing step before main() is called.

The problem is, in general, you don't know how a given program will parse its command line. Many programs are compiled with some version of MSVC++, whose quirky parsing rules are described here, but many others are compiled with different compilers that use different conventions.

This is compounded by the fact that cmd.exe has its own quirky parsing rules. The caret (^) is treated as an escape character that quotes the following character, and text inside double quotes is treated as quoted if a list of tricky criteria are met (see cmd /? for the full gory details). If your command contains any strange characters, it's very easy for cmd.exe's idea of which parts of text are "quoted" and which aren't to get out of sync with your target program's, and all hell breaks loose.

So, the safest approach for escaping arguments on Windows is:

  1. Escape arguments in the manner expected by the command-line parsing logic of the program you're calling. (Hopefully you know what that logic is; if not, try a few examples and guess.)
  2. Join the escaped arguments with spaces.
  3. Prefix every single non-alphanumeric character of the resulting string with ^.
  4. Append any redirections or other shell trickery (e.g. joining commands with &&).
  5. Run the command with system() or backticks.


 sub esc_chars {  # will change, for example, a!!a to a\!\!a     @_ =~ s/([;<>\*\|`&\$!#\(\)\[\]\{\}:'"])/\\$1/g;     return @_;  }

http://www.slac.stanford.edu/slac/www/resource/how-to-use/cgi-rexx/cgi-esc.html