Regular expressions and Shell scripts


I would pipe the output of ls through grep and then awk.

ls | grep '^unchanged_' | awk -F'[_-]' '{print $2}'
  1. ls: lists the file names in the directory
  2. grep: keeps only the matching files (file name filtering)
  3. awk: essentially the same as your original sample (note that the field should be $2)
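To see the three steps working together, here is a minimal sketch that runs the same pipeline in a throwaway directory. The sample file names are made up for illustration; substitute your own directory.

```shell
# Build a throwaway directory with hypothetical sample names.
demo_dir=$(mktemp -d)
touch "$demo_dir/unchanged_101-a.txt" \
      "$demo_dir/unchanged_202-b.txt" \
      "$demo_dir/changed_303-c.txt"

# Same pipeline as above, run inside the sample directory.
# awk splits each name on "_" or "-", so $2 is the number.
numbers=$(cd "$demo_dir" && ls | grep '^unchanged_' | awk -F'[_-]' '{print $2}')
echo "$numbers"

rm -r "$demo_dir"
```

The file starting with changed_ is dropped by grep, so only the two numbers from the unchanged_ files are printed.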


You can use the find command to get a list of filenames and then the cut command to pull out the parts you want. A for loop could then iterate over them, but it has to expand the entire list before the first iteration and splits it on whitespace. A while loop reads the results one line at a time, so it handles an arbitrary number of files.

find /work/test -type f -name 'unchanged*' | \
    cut -d_ -f2 | cut -d- -f1 | \
    while read fname;do echo $fname;done

If all you need is the list of values, you can omit the while loop -- it's just there as a placeholder in case you wanted to do something with each value.

The first argument after the find command is the top-level directory; find will recurse into any subdirectories. "-type f" limits its output to regular files. The -name option limits its output to just files beginning with unchanged.

"cut" is a handy utility for pulling out fields between delimiters. The first cut's "-d_" says to use the underscore as the delimiter, and "-f2" says to grab the second field; this gives us everything after the first underscore, up to the next underscore if there is one. The second cut uses a dash as the delimiter and grabs what comes before the first one; this is our number. We get a stream of these, one per line, which we pass into the while loop. The read command reads one line at a time into the given variable name and lets you do whatever you want with it.
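The two cut steps can be traced on a single line of find output. This is only a sketch; the path below is a made-up example.

```shell
# Hypothetical path, in the form find would print it.
sample='/work/test/unchanged_12345-report.txt'

# Field 2 when splitting on "_": everything after the first underscore.
after_underscore=$(printf '%s\n' "$sample" | cut -d_ -f2)

# Field 1 when splitting on "-": everything before the first dash.
number=$(printf '%s\n' "$after_underscore" | cut -d- -f1)

echo "$after_underscore"
echo "$number"
```

Because cut counts delimiters across the whole line, an underscore or dash anywhere in the directory part of the path shifts the field numbers, so this approach assumes the path itself contains neither.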

The above commands won't deal well with unusual filenames containing newline characters, or extracted terms containing spaces, but it doesn't sound like you have that to deal with here.
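If unusual filenames ever do become a concern, one possible workaround is to let find hand the paths straight to a small shell snippet instead of piping text. The sketch below assumes a POSIX sh; extract_numbers is a hypothetical helper name, not something from the commands above.

```shell
# extract_numbers DIR: hypothetical helper, prints the number from each
# unchanged_* file found under DIR, one per line.
extract_numbers() {
    find "$1" -type f -name 'unchanged_*' -exec sh -c '
        for path in "$@"; do
            name=${path##*/}            # strip the directory part
            number=${name#unchanged_}   # strip the prefix
            number=${number%%-*}        # strip everything from the first dash
            printf "%s\n" "$number"
        done' sh {} +
}

# Example: extract_numbers /work/test
```

Each path arrives as a separate argument, so spaces or newlines in a file name do not split it, and directory names containing underscores or dashes are removed before the number is extracted.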


Step 1) Use a wildcard to select matching files: unchanged_*.

Step 2) Extract the numbers. You could use a regex, but an even easier way using purely shell constructs is to remove the stuff before and after the number.

What this looks like:

cd /work/test/
for file in unchanged_*; do
    number=${file#unchanged_}   # remove "unchanged_"
    number=${number%%-*}        # remove everything after dash
    echo "$number"
done
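The choice of %% rather than % matters when a name contains more than one dash. A small sketch on a made-up file name shows the difference:

```shell
# Hypothetical name with several dashes.
file='unchanged_12345-2020-01-01.txt'

rest=${file#unchanged_}   # remove the prefix
short=${rest%-*}          # %  removes the shortest suffix matching "-*"
number=${rest%%-*}        # %% removes the longest suffix matching "-*"

echo "$short"
echo "$number"
```

Here % only trims the last dash and what follows it, while %% trims from the first dash onward and leaves just the number. Note also that if no file matches unchanged_*, the loop in a default shell runs once with the literal pattern, so a check such as [ -e "$file" ] inside the loop may be worth adding.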