How to split a delimited string into an array in awk?
To split a string to an array in awk
we use the function split()
:
awk '{split($0, a, ":")}' # ^^ ^ ^^^ # | | | # string | delimiter # | # array to store the pieces
If no separator is given, it uses the FS
, which defaults to the space:
$ awk '{split($0, a); print a[2]}' <<< "a:b c:d e"c:d
We can give a separator, for example :
:
$ awk '{split($0, a, ":"); print a[2]}' <<< "a:b c:d e"b c
Which is equivalent to setting it through the FS
:
$ awk -F: '{split($0, a); print a[1]}' <<< "a:b c:d e"b c
In gawk you can also provide the separator as a regexp:
$ awk '{split($0, a, ":*"); print a[2]}' <<< "a:::b c::d e" #note multiple :b c
And even see what the delimiter was on every step by using its fourth parameter:
$ awk '{split($0, a, ":*", sep); print a[2]; print sep[1]}' <<< "a:::b c::d e"b c:::
Let's quote the man page of GNU awk:
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in
array[1]
, the second piece inarray[2]
, and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used.split()
returns the number of elements created. seps is agawk
extension, withseps[i]
being the separator string betweenarray[i]
andarray[i+1]
. If fieldsep is a single space, then any leading whitespace goes intoseps[0]
and any trailing whitespace goes intoseps[n]
, where n is the return value ofsplit()
(i.e., the number of elements in array).
Please be more specific! What do you mean by "it doesn't work"?Post the exact output (or error message), your OS and awk version:
% awk -F\| '{ for (i = 0; ++i <= NF;) print i, $i }' <<<'12|23|11'1 122 233 11
Or, using split:
% awk '{ n = split($0, t, "|") for (i = 0; ++i <= n;) print i, t[i] }' <<<'12|23|11'1 122 233 11
Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.