How to return an array in bash without using globals?


With Bash 4.3 and later, you can use a nameref: the caller passes in the name of the array, and the callee declares a nameref to that name and populates the array indirectly.

    #!/usr/bin/env bash

    create_array() {
        local -n arr=$1             # use nameref for indirection
        arr=(one "two three" four)
    }

    use_array() {
        local my_array
        create_array my_array       # call function to populate the array
        echo "inside use_array"
        declare -p my_array         # test the array
    }

    use_array                       # call the main function

Produces the output:

    inside use_array
    declare -a my_array=([0]="one" [1]="two three" [2]="four")

You could make the function update an existing array as well:

    update_array() {
        local -n arr=$1             # use nameref for indirection
        arr+=("two three" four)     # update the array
    }

    use_array() {
        local my_array=(one)
        update_array my_array       # call function to update the array
    }

This approach is more elegant and efficient than capturing output, since no command substitution $() (and hence no subshell) is needed to grab the standard output of the function being called. It also helps when the function has to return more than one output: we can simply use one nameref per output.
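As a minimal sketch of the "one nameref per output" idea, the function below fills two caller-named arrays in a single call. The function name `split_by_length`, the nameref names, and the length threshold are all hypothetical, chosen only for illustration; it assumes Bash 4.3 or later.

```shell
#!/usr/bin/env bash

# Hypothetical helper: sorts its remaining arguments into two arrays
# whose names are passed as the first two arguments.
split_by_length() {
    local -n short_ref=$1 long_ref=$2   # one nameref per output
    shift 2
    short_ref=()
    long_ref=()
    local word
    for word in "$@"; do
        if (( ${#word} <= 3 )); then
            short_ref+=("$word")
        else
            long_ref+=("$word")
        fi
    done
}

main() {
    local short long
    split_by_length short long one "two three" four
    declare -p short long               # show both results
}

main
```

The caller's variables should be named differently from the namerefs inside the function, which is why the sketch uses `short_ref` and `long_ref` rather than `short` and `long`.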


Here is what the Bash Manual says about nameref:

A variable can be assigned the nameref attribute using the -n option to the declare or local builtin commands (see Bash Builtins) to create a nameref, or a reference to another variable. This allows variables to be manipulated indirectly. Whenever the nameref variable is referenced, assigned to, unset, or has its attributes modified (other than using or changing the nameref attribute itself), the operation is actually performed on the variable specified by the nameref variable’s value. A nameref is commonly used within shell functions to refer to a variable whose name is passed as an argument to the function. For instance, if a variable name is passed to a shell function as its first argument, running

    declare -n ref=$1

inside the function creates a nameref variable ref whose value is the variable name passed as the first argument. References and assignments to ref, and changes to its attributes, are treated as references, assignments, and attribute modifications to the variable whose name was passed as $1.
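To make the manual's wording concrete, here is a small sketch in which both a read and an assignment go through the nameref and land on the caller's array. The names `append_count`, `demo`, and `fruits` are hypothetical; Bash 4.3 or later is assumed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads and extends whichever array the caller names.
append_count() {
    declare -n ref=$1              # ref now stands for the variable named in $1
    ref+=("count=${#ref[@]}")      # the read and the assignment both hit the target
}

demo() {
    local fruits=(apple "blood orange")
    append_count fruits
    declare -p fruits              # fruits now has a third element, count=2
}

demo
```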


What's wrong with globals?

Returning arrays is really not practical. There are lots of pitfalls.

That said, here's one technique that works if it's OK for the variable to have the same name in both functions:

    $ f () { local a; a=(abc 'def ghi' jkl); declare -p a; }
    $ g () { local a; eval "$(f)"; declare -p a; }
    $ f; declare -p a; echo; g; declare -p a
    declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
    -bash: declare: a: not found

    declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
    -bash: declare: a: not found

The declare -p commands (except for the one in f()) are used to display the state of the array for demonstration purposes. In f(), declare -p is the mechanism that returns the array.

If you need the array to have a different name, you can do something like this:

    $ g () { local b r; r=$(f); r="declare -a b=${r#*=}"; eval "$r"; declare -p a; declare -p b; }
    $ f; declare -p a; echo; g; declare -p a
    declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
    -bash: declare: a: not found

    -bash: declare: a: not found
    declare -a b='([0]="abc" [1]="def ghi" [2]="jkl")'
    -bash: declare: a: not found
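The same renaming trick can be written as a script rather than an interactive session. This is only a sketch: the function names `produce` and `consume` are hypothetical, and because it relies on eval it should only be used on output you trust.

```shell
#!/usr/bin/env bash

# Hypothetical producer: serializes its local array with declare -p.
produce() {
    local a=(abc 'def ghi' jkl)
    declare -p a
}

# Hypothetical consumer: rebuilds the array under a different name.
consume() {
    local serialized
    serialized=$(produce)
    # Drop everything up to the first "=" and declare the rest as b.
    # declare inside a function makes b local to consume.
    eval "declare -a b=${serialized#*=}"
    printf '%s\n' "${b[@]}"
}

consume
```

Neither `a` nor `b` exists once `consume` returns, so nothing leaks into the global namespace.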


Bash can't pass around data structures as return values. A return value must be a numeric exit status between 0 and 255. However, you can certainly use command or process substitution to pass commands to an eval statement if you're so inclined.
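A rough sketch of that split between exit status and data follows. Note that it swaps eval for the mapfile builtin (Bash 4.0 and later), which is a substitution made here for illustration and not something proposed above; it also assumes that no element contains a newline. The names `get_items` and `show_items` are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical function: data goes to stdout, one element per line;
# the return value carries only a status.
get_items() {
    printf '%s\n' one "two three" four
    return 0                            # exit status only, 0-255
}

show_items() {
    local items
    mapfile -t items < <(get_items)     # process substitution feeds mapfile
    declare -p items
}

show_items
```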

This is rarely worth the trouble, IMHO. If you must pass data structures around in Bash, use a global variable--that's what they're for. If you don't want to do that for some reason, though, think in terms of positional parameters.

Your example could easily be rewritten to use positional parameters instead of global variables:

    use_array () {
        for idx in "$@"; do
            echo "$idx"
        done
    }

    create_array () {
        local array=("a" "b" "c")
        use_array "${array[@]}"
    }

This all creates a certain amount of unnecessary complexity, though. Bash functions generally work best when you treat them more like procedures with side effects, and call them in sequence.

    # Gather values and store them in FOO.
    get_values_for_array () { :; }

    # Do something with the values in FOO.
    process_global_array_variable () { :; }

    # Call your functions.
    get_values_for_array
    process_global_array_variable

If all you're worried about is polluting your global namespace, you can also use the unset builtin to remove a global variable after you're done with it. Using your original example, let my_list be global (by removing the local keyword) and add unset my_list to the end of my_algorithm to clean up after yourself.
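A minimal sketch of that cleanup pattern might look like the following. The names `my_list` and `my_algorithm` come from the question; the helper `create_list` and both function bodies are hypothetical stand-ins, since the original example is not reproduced here.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the function that fills the list.
create_list() {
    my_list=(one "two three" four)   # global: no `local` keyword
}

my_algorithm() {
    create_list
    local item count=0
    for item in "${my_list[@]}"; do
        count=$((count + 1))
    done
    echo "processed $count items"
    unset my_list                    # remove the global once we are done
}

my_algorithm
```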