How to sort an array in Bash


You don't really need all that much code:

IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
unset IFS

Supports whitespace in elements (as long as it's not a newline), and works in Bash 3.x.

e.g.:

$ array=("a c" b f "3 5")
$ IFS=$'\n' sorted=($(sort <<<"${array[*]}")); unset IFS
$ printf "[%s]\n" "${sorted[@]}"
[3 5]
[a c]
[b]
[f]

Note: @sorontar has pointed out that care is required if elements contain wildcards such as * or ?:

The sorted=($(...)) part is using the "split and glob" operator. You should turn globbing off with set -f, set -o noglob, or shopt -so noglob; otherwise an element of the array like * will be expanded to a list of files.
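As a minimal sketch of that advice, the one-liner can be bracketed with set -f and set +f so that a literal * survives the round trip. The sample array is made up for illustration, and the closing set +f assumes globbing was enabled beforehand.

```bash
#!/bin/bash
# Sample data (illustrative): one element is a bare glob character.
array=("a c" "*" b "3 5")

set -f                                        # disable pathname expansion
IFS=$'\n' sorted=($(sort <<<"${array[*]}"))   # split on new lines only
unset IFS
set +f                                        # re-enable it (assumes it was on before)

printf '[%s]\n' "${sorted[@]}"
```

Without the set -f line, the * element would be replaced by the names of the files in the current directory (if there are any).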

What's happening:

The result is a culmination of six things that happen in this order:

  1. IFS=$'\n'
  2. "${array[*]}"
  3. <<<
  4. sort
  5. sorted=($(...))
  6. unset IFS

First, the IFS=$'\n'

This is an important part of our operation that affects the outcome of 2 and 5 in the following way:

Given:

  • "${array[*]}" expands to every element delimited by the first character of IFS
  • sorted=() creates elements by splitting on every character of IFS

IFS=$'\n' sets things up so that elements are expanded using a new line as the delimiter, and then later created in a way that each line becomes an element (i.e. splitting on a new line).

Delimiting by a new line is important because that's how sort operates (sorting per line). Splitting by only a new line is less important, but is needed to preserve elements that contain spaces or tabs.

The default value of IFS is a space, a tab, followed by a new line, and would be unfit for our operation.
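To see both effects of IFS in isolation, here is a small sketch. The variable names joined, parts and default_parts are made up for this illustration.

```bash
#!/bin/bash
array=("a c" b)

IFS=$'\n'
joined="${array[*]}"       # elements joined by a new line (first character of IFS)
parts=($joined)            # unquoted expansion: split on new lines only
unset IFS

printf '%s\n' "${#parts[@]}"            # 2 elements: "a c" and "b"

# With the default IFS, the space inside "a c" also causes a split.
default_parts=($joined)
printf '%s\n' "${#default_parts[@]}"    # 3 elements: "a", "c" and "b"
```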

Next, the sort <<<"${array[*]}" part

<<<, called here strings, takes the expansion of "${array[*]}", as explained above, and feeds it into the standard input of sort.

With our example, sort is fed the following string:

a c
b
f
3 5

Since sort sorts, it produces:

3 5
a c
b
f

Next, the sorted=($(...)) part

The $(...) part, called command substitution, causes its content (sort <<<"${array[*]}") to run as a normal command, while taking the resulting standard output as the literal that goes wherever $(...) was.

In our example, this produces something similar to simply writing:

sorted=(3 5
a c
b
f
)

sorted then becomes an array that's created by splitting this literal on every new line.

Finally, the unset IFS

This unsets IFS, which makes the shell fall back to its default splitting behavior (space, tab, new line), and is just good practice.

It's to ensure we don't cause trouble with anything that relies on IFS later in our script. (Otherwise we'd need to remember that we've switched things around--something that might be impractical for complex scripts.)
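One caveat: unset IFS discards whatever value IFS had before, so a script that deliberately uses a custom IFS would lose it. A possible alternative, sketched here with a hypothetical function name (sort_into_sorted), is to make IFS local to a function so the previous value comes back automatically. The glob caveat above still applies.

```bash
#!/bin/bash
# Hypothetical wrapper: sorts its arguments into the global array "sorted".
sort_into_sorted() {
    local IFS=$'\n'              # reverts to the caller's value on return
    sorted=($(sort <<<"$*"))     # "$*" joins the arguments with a new line
}

IFS=:                            # pretend the script relies on a custom IFS
sort_into_sorted "a c" b f "3 5"

printf 'IFS is still: %s\n' "$IFS"
printf '[%s]\n' "${sorted[@]}"
```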


Original response:

array=(a c b "f f" 3 5)
readarray -t sorted < <(for a in "${array[@]}"; do echo "$a"; done | sort)

output:

$ for a in "${sorted[@]}"; do echo "$a"; done
3
5
a
b
c
f f

Note this version copes with values that contain special characters or whitespace (except newlines).

Note readarray is supported in bash 4+.
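A slightly shorter variant, offered as a sketch rather than as the original answer's method, replaces the for loop with printf, which repeats its format for every argument. It also avoids echo misreading elements such as -n or -e as options.

```bash
#!/bin/bash
array=(a c b "f f" 3 5)

# printf prints one element per line; sort orders the lines;
# readarray -t stores each line (minus its new line) as one element.
readarray -t sorted < <(printf '%s\n' "${array[@]}" | sort)

printf '[%s]\n' "${sorted[@]}"
```

Note that with an empty array, printf still prints one empty line, so sorted would end up with a single empty element.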


Edit: Based on the suggestion by @Dimitre, I had updated it to:

readarray -t sorted < <(printf '%s\0' "${array[@]}" | sort -z | xargs -0n1)

which has the benefit of correctly sorting even elements with embedded newline characters. Unfortunately, as correctly signaled by @ruakh, this didn't mean that the result of readarray would be correct, because readarray had no option to use NUL instead of regular newlines as line separators (the -d option was only added in Bash 4.4).
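One way to keep NUL as the separator all the way through is to read the records back with a read -d '' loop instead of readarray. This is a sketch that assumes a sort supporting -z (GNU sort does); the variable name element is illustrative.

```bash
#!/bin/bash
# Sample data (illustrative): the first element contains an embedded new line.
array=($'two\nlines' b a)

sorted=()
# read -d '' stops at each NUL byte; IFS= and -r keep the element verbatim.
while IFS= read -r -d '' element; do
    sorted+=("$element")
done < <(printf '%s\0' "${array[@]}" | sort -z)

declare -p sorted
```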


Here's a pure Bash quicksort implementation:

#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
qsort() {
   local pivot i smaller=() larger=()
   qsort_ret=()
   (($#==0)) && return 0
   pivot=$1
   shift
   for i; do
      # This sorts strings lexicographically.
      if [[ $i < $pivot ]]; then
         smaller+=( "$i" )
      else
         larger+=( "$i" )
      fi
   done
   qsort "${smaller[@]}"
   smaller=( "${qsort_ret[@]}" )
   qsort "${larger[@]}"
   larger=( "${qsort_ret[@]}" )
   qsort_ret=( "${smaller[@]}" "$pivot" "${larger[@]}" )
}

Use as, e.g.,

$ array=(a c b f 3 5)
$ qsort "${array[@]}"
$ declare -p qsort_ret
declare -a qsort_ret='([0]="3" [1]="5" [2]="a" [3]="b" [4]="c" [5]="f")'

This implementation is recursive… so here's an iterative quicksort:

#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
qsort() {
   # Set the result first, so a call with no arguments clears any earlier result
   qsort_ret=("$@")
   (($#==0)) && return 0
   local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
   while ((${#stack[@]})); do
      beg=${stack[0]}
      end=${stack[1]}
      stack=( "${stack[@]:2}" )
      smaller=() larger=()
      pivot=${qsort_ret[beg]}
      for ((i=beg+1;i<=end;++i)); do
         if [[ "${qsort_ret[i]}" < "$pivot" ]]; then
            smaller+=( "${qsort_ret[i]}" )
         else
            larger+=( "${qsort_ret[i]}" )
         fi
      done
      qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
      if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
      if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
   done
}

In both cases, you can change the ordering: I used string comparisons, but you can use arithmetic comparisons, compare by file modification time, and so on; just use the appropriate test. You can even make it more generic and have it take, as its first argument, the name of the test function to use, e.g.,

#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
# First argument is a function name that takes two arguments and compares them
qsort() {
   # Clear the result first, so a call with nothing to sort leaves no stale data
   qsort_ret=()
   (($#<=1)) && return 0
   local compare_fun=$1
   shift
   local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
   qsort_ret=("$@")
   while ((${#stack[@]})); do
      beg=${stack[0]}
      end=${stack[1]}
      stack=( "${stack[@]:2}" )
      smaller=() larger=()
      pivot=${qsort_ret[beg]}
      for ((i=beg+1;i<=end;++i)); do
         if "$compare_fun" "${qsort_ret[i]}" "$pivot"; then
            smaller+=( "${qsort_ret[i]}" )
         else
            larger+=( "${qsort_ret[i]}" )
         fi
      done
      qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
      if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
      if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
   done
}

Then you can have this comparison function:

compare_mtime() { [[ $1 -nt $2 ]]; }

and use:

$ qsort compare_mtime *
$ declare -p qsort_ret

to have the files in the current folder sorted by modification time (newest first).

NOTE. These functions are pure Bash: no external utilities and no subshells. They are safe with respect to any funny symbols you may have (spaces, newline characters, glob characters, etc.).

NOTE2. The test [[ $i < $pivot ]] is correct. It uses lexicographical string comparison. If your array contains only integers and you want to sort numerically, use ((i < pivot)) instead. Please don't edit this answer to change that. It has already been edited (and rolled back) a couple of times. The test I gave here is correct and corresponds to the output given in the example: the example uses both strings and numbers, and the purpose is to sort it in lexicographical order. Using ((i < pivot)) in this case is wrong.
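To make the difference between the two tests concrete, here is a tiny sketch comparing the same pair of values both ways. The variable names str_result and num_result are made up for the illustration.

```bash
#!/bin/bash
# String comparison: "10" sorts before "9" because the character 1 precedes 9.
if [[ 10 < 9 ]]; then str_result=true; else str_result=false; fi

# Arithmetic comparison: the number 10 is not less than 9.
if (( 10 < 9 )); then num_result=true; else num_result=false; fi

printf 'string: %s, numeric: %s\n' "$str_result" "$num_result"
# prints "string: true, numeric: false"
```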