Parse a nested variable from YAML file in bash


As others have commented, the recommended approach is to use yq (together with jq) if it is available.
In that case, try the following:

value=$(yq -r 'recurse | select(.name? == "CALICO_IPV4POOL_CIDR") | .value' "calico.yaml")
echo "$value"

Output:

192.168.0.0/16


If you're able to install new dependencies and plan to deal with lots of YAML files, consider yq, a wrapper around jq that can handle YAML. It gives you a safe (non-grep) way of accessing nested YAML values.

Usage would look something like MY_VALUE=$(yq -r '.myValue.nested.value' < config-file.yaml). The -r flag makes yq print the raw string instead of a JSON-quoted one.

Alternatively, the question "How can I parse a YAML file from a Linux shell script?" has a bash-only parser that you could use to get your value.


The right way to do this is to use a scripting language and a YAML parsing library to extract the field you're interested in.

Here's an example of how to do it in Python. If you were doing this for real you'd probably split it out into multiple functions and have better error reporting. This is literally just to illustrate some of the difficulties caused by the format of calico.yaml, which is several YAML documents concatenated together, not just one. You also have to loop over some of the lists internal to the document in order to extract the field you're interested in.

#!/usr/bin/env python3
import yaml


def foo():
    with open('/tmp/calico.yaml', 'r') as fil:
        docs = yaml.safe_load_all(fil)
        doc = None
        for candidate in docs:
            if candidate["kind"] == "DaemonSet":
                doc = candidate
                break
        else:
            raise ValueError("no YAML document of kind DaemonSet")
        l1 = doc["spec"]
        l2 = l1["template"]
        l3 = l2["spec"]
        l4 = l3["containers"]
        for containers_item in l4:
            l5 = containers_item["env"]
            env = l5
            for entry in env:
                if entry["name"] == "CALICO_IPV4POOL_CIDR":
                    return entry["value"]
    raise ValueError("no CALICO_IPV4POOL_CIDR entry")


print(foo())

However, sometimes you need a solution right now and shell scripts are very good at that.

If you're hitting an API endpoint, then the YAML will usually be pretty-printed so you can get away with extracting text in ways that won't work on arbitrary YAML.

Something like the following should be fairly robust:

grep -A1 CALICO_IPV4POOL_CIDR /tmp/calico.yaml | grep value: | cut -d: -f2 | tr -d ' "'

It is still worth checking at the end, with a regex, that the extracted value really is valid IPv4 CIDR notation.
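A minimal sketch of such a check, assuming bash (or another shell that supports local) and a grep that understands -E. The function name is_ipv4_cidr is made up for illustration. It only checks the textual form (four octets in 0-255 and a prefix length in 0-32), not whether the address is a sensible network address for that prefix.

```shell
# Hypothetical helper: succeeds if "$1" looks like IPv4 CIDR notation.
is_ipv4_cidr() {
    # One octet, 0-255, without leading zeros.
    local octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'
    # Prefix length, 0-32.
    local prefix='(3[0-2]|[12]?[0-9])'
    # -x requires the whole line to match, -q suppresses output.
    printf '%s\n' "$1" | grep -Eqx "($octet\.){3}$octet/$prefix"
}
```

It could be used as a guard right after the extraction, for example: is_ipv4_cidr "$value" || { echo "unexpected value: $value" >&2; exit 1; }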

The key part of the extraction pipeline above is grep -A1 CALICO_IPV4POOL_CIDR, which prints each matching line plus the one line after it.

The two-element dictionary you mentioned (shown below) will always appear as one chunk since it's a subtree of the YAML document.

    - name: CALICO_IPV4POOL_CIDR
      value: "192.168.0.0/16"

The keys in calico.yaml are not sorted alphabetically in general, but in {"name": <something>, "value": <something else>} constructions, name does consistently appear before value.
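If you would rather not depend on value being on the very next line, the same name-before-value assumption can be expressed in awk. This is only a sketch: env_value is a hypothetical helper name, and it assumes block-style YAML with unquoted names, as in the snippet above. It is still text matching, not real YAML parsing.

```shell
# Hypothetical helper: print the value paired with an env entry name.
# Usage: env_value NAME FILE
env_value() {
    awk -v name="$1" '
        # Remember that the wanted "- name:" line was just seen.
        $1 == "-" && $2 == "name:" && $3 == name { found = 1; next }
        # The first "value:" line after it belongs to the same entry.
        found && $1 == "value:" {
            gsub(/"/, "", $2)   # strip the surrounding double quotes
            print $2
            exit
        }
        # Any other "- name:" line starts a new entry, so stop looking.
        $1 == "-" && $2 == "name:" { found = 0 }
    ' "$2"
}
```

For the file used above this would be called as value=$(env_value CALICO_IPV4POOL_CIDR /tmp/calico.yaml). If the entry has no value line (for example, one that uses valueFrom instead), nothing is printed.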