How does Unix's strings command work?


strings does not attempt to parse all kinds of files. It scans any file for a long enough sequence of 'printable characters', and when found, shows it. See? No "parsing" involved. (With one exception.)

.. So each byte can mean different things from an ASCII/Unicode character to other metadata.

Only up to a certain point. strings is very straightforward, as it does not attempt to 'parse' for meanings. That is, it does not see the difference between a text string "Hello world" and any random binary sequence that happens to contain the bytes 0x48, 0x65, 0x6C, 0x6C, 0x6F (etc.) in that particular order.
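To make that concrete, here is a minimal sketch of the scanning idea in Python. It is not how GNU strings is actually implemented; the function name find_strings and the choice to treat tab plus the bytes 0x20 through 0x7E as "printable" are my own assumptions for illustration.

```python
import re

def find_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least min_len printable 7-bit characters."""
    # Assumption: "printable" means tab plus space (0x20) through tilde (0x7E).
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# A real text string, a run that is too short, and "random" bytes that
# happen to spell Hello: the scanner cannot tell them apart.
blob = (b"\x00\x01Hello world\x00\xff\x02abc\x00"
        + bytes([0x48, 0x65, 0x6C, 0x6C, 0x6F]) + b"\x07")

print(find_strings(blob))  # ['Hello world', 'Hello']
```

Note that "abc" is dropped only because it is shorter than the minimum length, not because anything decided it was not text.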

The only allowance it makes is that you can tell it to (attempt to) interpret the raw bytes as a different character set:

-e encoding
--encoding=encoding
Select the character encoding of the strings that are to be found. Possible values for encoding are: s = single-7-bit-byte characters (ASCII, ISO 8859, etc., default), S = single-8-bit-byte characters, b = 16-bit bigendian, l = 16-bit littleendian, B = 32-bit bigendian, L = 32-bit littleendian. Useful for finding wide character strings.

(http://unixhelp.ed.ac.uk/CGI/man-cgi?strings)

And again, it merely does what you told it to. When told to look for 7-bit ASCII only, it skips high ASCII characters (even though these may appear in "valid text" inside the binary). When told 8-bit is okay as well, it shows accented characters but also random stuff such as ¿, ¼, ¢ and ².
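A rough sketch of how the choice of encoding changes what counts as a string, again in Python. The function scan, the sample bytes, and the exact byte ranges are assumptions of mine that only mimic the s, S and l values quoted above; the real tool may differ in details.

```python
import re

def scan(data: bytes, encoding: str = "s", min_len: int = 4) -> list[str]:
    """Find printable runs under one of three illustrative encodings."""
    if encoding == "s":    # 7-bit: tab and 0x20-0x7E only
        pattern, codec = rb"[\t\x20-\x7e]{%d,}", "ascii"
    elif encoding == "S":  # 8-bit: additionally accept every byte >= 0x80
        pattern, codec = rb"[\t\x20-\x7e\x80-\xff]{%d,}", "latin-1"
    elif encoding == "l":  # 16-bit little-endian: printable byte, then 0x00
        pattern, codec = rb"(?:[\t\x20-\x7e]\x00){%d,}", "utf-16-le"
    else:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    return [m.group().decode(codec) for m in re.finditer(pattern % min_len, data)]

# Latin-1 text with high bytes, followed by a 16-bit little-endian string.
sample = b"\x01caf\xe9 \xbc\xa2\x00" + "wide".encode("utf-16-le") + b"\x00\x00"

print(scan(sample, "s"))  # []
print(scan(sample, "S"))  # ['café ¼¢']
print(scan(sample, "l"))  # ['wide']
```

The same bytes yield nothing, an accented word with junk attached, or a wide string, depending purely on what you asked for.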


As to parsing, you can infer from the man page there is a single exception:

-a
--all
Do not scan only the initialized and loaded sections of object files; scan the whole files.

where this "object file" is an executable type that your system supports. The reason may be purely pragmatic: executable binary headers are easily recognized and parsed (an example for "ELF" on SO itself), and one is mostly interested in the text stored in the executable/data part of a binary, not in the more random bytes in its headers and relocation tables.


For each file given, GNU strings prints the printable character sequences that are at least 4 characters long (or the number given with the options below) and are followed by an unprintable character. By default, it only prints the strings from the initialized and loaded sections of object files; for other types of files, it prints the strings from the whole file.

strings is mainly useful for determining the contents of non-text files.