How to read a python tuple using PyYAML?


I wouldn't call what you've done hacky, given what you are trying to do. The alternative, as I understand it, is to use Python-specific tags in your YAML file so that the data is constructed as the right type when the file is loaded. However, this requires modifying your YAML file, which is tedious if the file is large.

The PyYAML documentation illustrates this further. Essentially, you place a !!python/tuple tag in front of each structure that you want loaded as a tuple. With your sample data, it would look like this:

YAML FILE:

cities:
  1: !!python/tuple [0,0]
  2: !!python/tuple [4,0]
  3: !!python/tuple [0,4]
  4: !!python/tuple [4,4]
  5: !!python/tuple [2,2]
  6: !!python/tuple [6,2]
highways:
  - !!python/tuple [1,2]
  - !!python/tuple [1,3]
  - !!python/tuple [1,5]
  - !!python/tuple [2,4]
  - !!python/tuple [3,4]
  - !!python/tuple [5,4]
start: 1
end: 4

Sample code:

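import yaml

# Recent PyYAML versions require an explicit Loader;
# FullLoader accepts the Python-specific !!python/tuple tag.
with open('y.yaml') as f:
    d = yaml.load(f, Loader=yaml.FullLoader)

print(d)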

Which will output:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]}
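As a small illustration of the trade-off with this tag, here is a minimal sketch (the sample document and variable names are made up for the example). It loads a !!python/tuple entry with yaml.FullLoader and shows that yaml.safe_load refuses the same text, because the tag is Python-specific.

```python
import yaml

# Hypothetical sample document, mirroring the structure used above.
text = """
cities:
  1: !!python/tuple [0, 0]
  2: !!python/tuple [4, 0]
start: 1
"""

# FullLoader understands the Python-specific !!python/tuple tag.
data = yaml.load(text, Loader=yaml.FullLoader)
print(data["cities"][1])

# safe_load only knows the standard YAML tags, so it raises an error here.
try:
    yaml.safe_load(text)
    safe_load_accepted = True
except yaml.YAMLError as exc:
    safe_load_accepted = False
    print("safe_load rejected the document:", type(exc).__name__)
```

So the tag-based approach ties the file to a loader that accepts Python-specific tags, which matters if the YAML comes from an untrusted source.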


Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do this ¹:

import pprint

import ruamel.yaml
from ruamel.yaml.constructor import SafeConstructor


def construct_yaml_tuple(self, node):
    seq = self.construct_sequence(node)
    # only make "leaf sequences" into tuples, you can add dict
    # and other types as necessary
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


# replace the default constructor for YAML sequences
SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    construct_yaml_tuple)

# safe_load() is the older module-level API; recent ruamel.yaml
# releases use a YAML(typ='safe') instance and its load() method instead
with open('input.yaml') as fp:
    data = ruamel.yaml.safe_load(fp)
pprint.pprint(data, width=24)

which prints:

{'cities': {1: (0, 0),
            2: (4, 0),
            3: (0, 4),
            4: (4, 4),
            5: (2, 2),
            6: (6, 2)},
 'end': 4,
 'highways': [(1, 2),
              (1, 3),
              (1, 5),
              (2, 4),
              (3, 4),
              (5, 4)],
 'start': 1}

If you then need to process more material where sequences need to be "normal" lists again, use:

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    SafeConstructor.construct_yaml_seq)

¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. You should be able to do the same with the older PyYAML if you only ever need to support YAML 1.1 and/or cannot upgrade for some reason.
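For readers who stay on PyYAML, the same idea can be sketched roughly as follows. This is an adaptation under assumptions, not the code from the answer above: the loader subclass name TupleSafeLoader, the function name construct_leaf_tuple, and the sample text are invented for the example. Registering the constructor on a subclass of yaml.SafeLoader leaves the stock safe loader untouched, so there is nothing to restore afterwards.

```python
import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass; yaml.SafeLoader itself is not modified."""


def construct_leaf_tuple(loader, node):
    # Build the sequence (and its children) first.
    seq = loader.construct_sequence(node, deep=True)
    # Keep sequences of sequences as lists; only "leaf" sequences become tuples.
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


# Override the standard sequence tag on the subclass only.
TupleSafeLoader.add_constructor('tag:yaml.org,2002:seq', construct_leaf_tuple)

# Made-up sample input in the shape of the question's data.
text = """
cities:
  1: [0, 0]
  2: [4, 0]
highways:
  - [1, 2]
  - [1, 3]
start: 1
end: 4
"""

data = yaml.load(text, Loader=TupleSafeLoader)
print(data)
```

Because the override lives on the subclass, yaml.safe_load(text) continues to return ordinary lists for the same input.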


I ran into the same problem as in the question, and I was not fully satisfied with the two answers. While browsing the PyYAML documentation I found two really interesting functions: yaml.add_constructor and yaml.add_implicit_resolver.

The implicit resolver removes the need to tag every entry with !!python/tuple by matching the scalar strings against a regex. I also wanted to use tuple syntax, that is, to write tuple: (10,120) rather than a list tuple: [10,120] that then gets converted to a tuple, which I personally found very annoying. I also did not want to install an external library. Here is the code:

import yaml
import re

# this is to convert the string written as a tuple into a python tuple
def yml_tuple_constructor(loader, node):
    # this little parse is really just for what I needed, feel free to change it!
    def parse_tup_el(el):
        # try to convert into int or float else keep the string
        if el.isdigit():
            return int(el)
        try:
            return float(el)
        except ValueError:
            return el

    value = loader.construct_scalar(node)
    # remove the ( ) from the string
    tup_elements = value[1:-1].split(',')
    # remove the last element if the tuple was written as (x,b,)
    if tup_elements[-1] == '':
        tup_elements.pop(-1)
    tup = tuple(map(parse_tup_el, tup_elements))
    return tup

# !tuple is my own tag name, I think you could choose anything you want
yaml.add_constructor(u'!tuple', yml_tuple_constructor)
# this is to spot the strings written as tuple in the yaml
yaml.add_implicit_resolver(u'!tuple', re.compile(r"\(([^,\W]{,},){,}[^,\W]*\)"))

Finally by executing this:

>>> yml = yaml.load("""
   ...: cities:
   ...:   1: (0,0)
   ...:   2: (4,0)
   ...:   3: (0,4)
   ...:   4: (4,4)
   ...:   5: (2,2)
   ...:   6: (6,2)
   ...: highways:
   ...:   - (1,2)
   ...:   - (1,3)
   ...:   - (1,5)
   ...:   - (2,4)
   ...:   - (3,4)
   ...:   - (5,4)
   ...: start: 1
   ...: end: 4""", Loader=yaml.FullLoader)
>>> yml['cities']
{1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}
>>> yml['highways']
[(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]

There could be a drawback with safe_load compared to load, which I did not test.
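On the safe_load point: as far as I can tell, yaml.add_constructor and yaml.add_implicit_resolver do not register anything on SafeLoader when called without a Loader argument, so yaml.safe_load would leave (0,0) as a plain string. A minimal sketch of one way around that is below. It is a variation, not the code above: the names TupleSafeLoader, TUPLE_RE, parse_element and construct_paren_tuple are invented for the example, and the regex is a simplified one that requires at least one comma.

```python
import re

import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that also understands (a,b) scalars."""


# Simplified pattern (an assumption, not the regex above): parentheses
# around at least two comma-separated fields, with no nesting.
TUPLE_RE = re.compile(r'^\([^(),]*(,[^(),]*)+\)$')


def parse_element(text):
    # Try int, then float; otherwise keep the stripped string.
    text = text.strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def construct_paren_tuple(loader, node):
    value = loader.construct_scalar(node)
    parts = value[1:-1].split(',')
    # Drop the empty field left by a trailing comma, as in (x,b,).
    if parts and parts[-1].strip() == '':
        parts.pop()
    return tuple(parse_element(part) for part in parts)


# Register on the subclass only; the third argument lists the first
# characters a scalar may start with for this resolver to be tried.
TupleSafeLoader.add_constructor('!tuple', construct_paren_tuple)
TupleSafeLoader.add_implicit_resolver('!tuple', TUPLE_RE, ['('])

doc = "cities:\n  1: (0,0)\n  2: (4,0)\nhighways:\n  - (1,2)\n  - (1.5,abc,)\n"
data = yaml.load(doc, Loader=TupleSafeLoader)
plain = yaml.safe_load(doc)
print(data)
print(plain)
```

Calling the classmethods on the subclass keeps both the stock SafeLoader and the default Dumper unchanged, which is why plain still contains strings such as '(0,0)'.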