How to make the 'cut' command treat same sequential delimiters as one?
As you comment in your question, awk is really the way to go. Using cut is also possible, together with tr -s to squeeze the spaces, as kev's answer shows.
Let me, however, go through all the possible combinations for future readers. Explanations are in the Tests section.
tr | cut
tr -s ' ' < file | cut -d' ' -f4
awk
awk '{print $4}' file
bash
while read -r _ _ _ myfield _
do
   echo "fourth field: $myfield"
done < file
sed
sed -r 's/^([^ ]*[ ]*){3}([^ ]*).*/\2/' file
Tests
Given this file, let's test the commands:
$ cat a
this   is line 1 more text
this is      line    2     more text
this    is line 3     more text
this is   line 4            more    text
tr | cut
$ cut -d' ' -f4 a
is
                        # it does not show what we want!


$ tr -s ' ' < a | cut -d' ' -f4
1
2                       # this makes it!
3
4
$
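One caveat worth keeping in mind with tr -s: it squeezes a run of leading spaces down to a single space rather than removing it, so cut still sees an empty first field. A minimal sketch of the effect, assuming a made-up sample line (not taken from the file above) and using sed only to strip the leading spaces:

```shell
# Hypothetical sample line that starts with a run of spaces.
line='   this   is line 1'

# tr -s leaves ONE leading space, so cut's field 1 is the empty string.
squeezed_only=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f1)

# Removing leading spaces first makes field 1 the first word.
stripped=$(printf '%s\n' "$line" | sed 's/^ *//' | tr -s ' ' | cut -d' ' -f1)

printf 'tr only:    [%s]\n' "$squeezed_only"
printf 'sed + tr:   [%s]\n' "$stripped"
```

If the input can start with spaces, either strip them first as sketched here or shift the field number passed to cut by one.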
awk
$ awk '{print $4}' a
1
2
3
4
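awk's default splitting only collapses whitespace. For some other repeated delimiter, a field separator longer than one character is treated as a regular expression, so a pattern such as ,+ should collapse runs of commas in the same way. A small sketch, with a made-up sample line used purely for illustration:

```shell
# Hypothetical sample line with runs of commas between the values.
line='alpha,,,beta,,gamma'

# -F',+' : the separator is the regex "one or more commas",
# so consecutive commas count as a single delimiter.
second=$(printf '%s\n' "$line" | awk -F',+' '{print $2}')
third=$(printf '%s\n' "$line" | awk -F',+' '{print $3}')

echo "second: $second"
echo "third:  $third"
```

Note that, unlike the default whitespace mode, a regex separator does not ignore a delimiter at the very start of the line: the first field would then be empty.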
bash
This reads the fields sequentially. By using _ we indicate that this is a throwaway ("junk") variable, used only to skip a field we do not need. This way, $myfield holds the 4th field of each line, no matter how many spaces separate the fields.
$ while read -r _ _ _ myfield _; do echo "4th field: $myfield"; done < a
4th field: 1
4th field: 2
4th field: 3
4th field: 4
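The read approach hard-codes the field position in the number of _ placeholders. If the position has to be a parameter, one alternative is to rely on the shell's own word splitting, which also treats runs of whitespace as a single separator. This is a different technique from the read loop above, sketched here with a hypothetical helper named nth_field (not a standard command):

```shell
# nth_field N LINE: print the Nth whitespace-separated field of LINE.
# nth_field is a hypothetical helper name chosen for this sketch.
nth_field() {
    n=$1
    set -f          # disable globbing so a field like "*" is not expanded
    set -- $2       # unquoted on purpose: split on runs of whitespace
    set +f
    # Fail quietly when the line has fewer than N fields.
    [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || return 1
    shift $((n - 1))
    printf '%s\n' "$1"
}

nth_field 4 'this is      line    2     more text'
```

To process a whole file, the function could be called from a while IFS= read -r line loop, passing "$line" as the second argument.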
sed
This matches three groups of non-space characters followed by spaces with ([^ ]*[ ]*){3}. Then it captures everything up to the next space as the 4th field, which is finally printed with \2.
$ sed -r 's/^([^ ]*[ ]*){3}([^ ]*).*/\2/' a
1
2
3
4
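The -r flag is the GNU spelling for extended regular expressions; -E is accepted by both GNU and BSD sed, so it may be the safer choice in portable scripts. The repetition count is simply the field number minus one, so it can be computed from a variable. A sketch under those assumptions, with n and the sample line being illustrative names and data:

```shell
# Hypothetical sample line and target field number.
line='this is      line    2     more text'
n=4
skip=$((n - 1))   # number of "word + spaces" groups to skip

# Same expression as above, with -E instead of -r and {3} built from $skip.
# Inside double quotes, \\2 reaches sed as the backreference \2.
field=$(printf '%s\n' "$line" | sed -E "s/^([^ ]*[ ]*){${skip}}([^ ]*).*/\\2/")

echo "field $n: $field"
```

As with the original expression, leading spaces on a line are counted as an (empty) first group, so the result shifts by one field in that case.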
shortest/friendliest solution
After becoming frustrated with the many limitations of cut, I wrote my own replacement, which I called cuts, for "cut on steroids".
cuts provides what is likely the most minimalist solution to this and many other related cut/paste problems.
One example, out of many, addressing this particular question:
$ cat text.txt
0 1 2 3
0 1 2 3 4
$ cuts 2 text.txt
2
2
cuts supports:
- auto-detection of most common field-delimiters in files (+ ability to override defaults)
- multi-char, mixed-char, and regex matched delimiters
- extracting columns from multiple files with mixed delimiters
- offsets from end of line (using negative numbers) in addition to start of line
- automatic side-by-side pasting of columns (no need to invoke paste separately)
- support for field reordering
- a config file where users can change their personal preferences
- great emphasis on user friendliness & minimalist required typing
and much more, none of which is provided by standard cut.
See also: https://stackoverflow.com/a/24543231/1296044
Source and documentation (free software): http://arielf.github.io/cuts/