How to read a python tuple using PyYAML?
I wouldn't call what you've done hacky for what you are trying to do. As I understand it, the alternative is to use Python-specific tags in your YAML file so that the data is represented appropriately when the file is loaded. However, this requires modifying your YAML file, which is likely to be tedious if the file is large.
The PyYAML documentation illustrates this further. Ultimately you want to place a !!python/tuple in front of each structure that you want to be represented as a tuple. With your sample data, it would look like:
YAML FILE:
cities:
  1: !!python/tuple [0,0]
  2: !!python/tuple [4,0]
  3: !!python/tuple [0,4]
  4: !!python/tuple [4,4]
  5: !!python/tuple [2,2]
  6: !!python/tuple [6,2]
highways:
  - !!python/tuple [1,2]
  - !!python/tuple [1,3]
  - !!python/tuple [1,5]
  - !!python/tuple [2,4]
  - !!python/tuple [3,4]
  - !!python/tuple [5,4]
start: 1
end: 4
Sample code:
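import yaml

# Recent PyYAML versions require an explicit Loader for yaml.load();
# FullLoader understands the !!python/tuple tag.
with open('y.yaml') as f:
    d = yaml.load(f, Loader=yaml.FullLoader)
print(d)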
Which will output:
{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]}
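One thing to keep in mind with the tag approach: !!python/tuple is a Python-specific tag, so yaml.safe_load() will refuse it. Here is a small sketch of the difference, assuming PyYAML 5.1 or later (where yaml.FullLoader exists); the one-line document and the variable names are only for illustration:

```python
import yaml
from yaml.constructor import ConstructorError

doc = "point: !!python/tuple [4, 0]"

# FullLoader knows the python/tuple tag and builds a real tuple.
loaded = yaml.load(doc, Loader=yaml.FullLoader)
print(loaded)

# SafeLoader only handles the standard YAML tags, so the same
# document is rejected with a ConstructorError.
rejected = False
try:
    yaml.safe_load(doc)
except ConstructorError as exc:
    rejected = True
    print("safe_load refused the tag:", exc.problem)
```

So if the file has to go through safe_load(), tagging it this way is not an option and one of the constructor-based approaches is the better fit.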
Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do ¹:
import pprint

import ruamel.yaml
from ruamel.yaml.constructor import SafeConstructor


def construct_yaml_tuple(self, node):
    seq = self.construct_sequence(node)
    # only make "leaf sequences" into tuples, you can add dict
    # and other types as necessary
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    construct_yaml_tuple)

with open('input.yaml') as fp:
    data = ruamel.yaml.safe_load(fp)
pprint.pprint(data, width=24)
which prints:
{'cities': {1: (0, 0),
            2: (4, 0),
            3: (0, 4),
            4: (4, 4),
            5: (2, 2),
            6: (6, 2)},
 'end': 4,
 'highways': [(1, 2),
              (1, 3),
              (1, 5),
              (2, 4),
              (3, 4),
              (5, 4)],
 'start': 1}
If you then need to process more material where sequences need to be "normal" lists again, use:
SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    SafeConstructor.construct_yaml_seq)
¹ This was done using ruamel.yaml, a YAML 1.2 parser of which I am the author. You should be able to do the same with the older PyYAML if you only ever need to support YAML 1.1 and/or cannot upgrade for some reason.
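For the PyYAML route mentioned in the footnote, here is a rough sketch of how the same "leaf sequences become tuples" idea might look. The names TupleSafeLoader and construct_leaf_tuple are made up for this example, and the inline document is a shortened version of the sample data. Registering on a subclass keeps yaml.SafeLoader itself unchanged, so there is nothing to undo afterwards:

```python
import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that turns leaf sequences into tuples."""


def construct_leaf_tuple(loader, node):
    seq = loader.construct_sequence(node)
    # Keep sequences of sequences as lists; only "leaf" sequences become tuples.
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


# Override the standard sequence tag on the subclass only.
TupleSafeLoader.add_constructor('tag:yaml.org,2002:seq', construct_leaf_tuple)

doc = """
cities:
  1: [0, 0]
  2: [4, 0]
highways:
  - [1, 2]
  - [1, 3]
start: 1
end: 4
"""

data = yaml.load(doc, Loader=TupleSafeLoader)
plain = yaml.safe_load(doc)  # the stock safe loader still returns lists
print(data)
```

Because the constructor is attached to the subclass, plain yaml.safe_load() calls elsewhere in the program keep returning ordinary lists.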
I ran into the same problem as in the question and was not too satisfied by the two answers. While browsing the PyYAML documentation I found two really interesting methods: yaml.add_constructor and yaml.add_implicit_resolver.
The implicit resolver solves the problem of having to tag all entries with !!python/tuple by matching the strings with a regex. I also wanted to use the tuple syntax, that is, to write tuple: (10,120) instead of writing a list tuple: [10,120] which then gets converted to a tuple; I personally found that very annoying. I also did not want to install an external library. Here is the code:
import yaml
import re


# this is to convert the string written as a tuple into a python tuple
def yml_tuple_constructor(loader, node):
    # this little parse is really just for what I needed, feel free to change it!
    def parse_tup_el(el):
        # try to convert into int or float else keep the string
        if el.isdigit():
            return int(el)
        try:
            return float(el)
        except ValueError:
            return el

    value = loader.construct_scalar(node)
    # remove the ( ) from the string
    tup_elements = value[1:-1].split(',')
    # remove the last element if the tuple was written as (x,b,)
    if tup_elements[-1] == '':
        tup_elements.pop(-1)
    tup = tuple(map(parse_tup_el, tup_elements))
    return tup


# !tuple is my own tag name, I think you could choose anything you want
yaml.add_constructor(u'!tuple', yml_tuple_constructor)
# this is to spot the strings written as tuple in the yaml
yaml.add_implicit_resolver(u'!tuple', re.compile(r"\(([^,\W]{,},){,}[^,\W]*\)"))
Finally by executing this:
>>> yml = yaml.load("""
   ...: cities:
   ...:   1: (0,0)
   ...:   2: (4,0)
   ...:   3: (0,4)
   ...:   4: (4,4)
   ...:   5: (2,2)
   ...:   6: (6,2)
   ...: highways:
   ...:   - (1,2)
   ...:   - (1,3)
   ...:   - (1,5)
   ...:   - (2,4)
   ...:   - (3,4)
   ...:   - (5,4)
   ...: start: 1
   ...: end: 4""", Loader=yaml.FullLoader)
>>> yml['cities']
{1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}
>>> yml['highways']
[(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]
There could be a potential drawback with safe_load compared to load, which I did not test.
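On the safe_load point: as far as I can tell, the module-level yaml.add_constructor and yaml.add_implicit_resolver do not register anything on SafeLoader unless you pass it explicitly through the Loader argument, so yaml.safe_load() would return strings like '(0,0)' unchanged. Below is a minimal sketch of one way to keep the safe loader, under a few assumptions: the class TupleSafeLoader, the simplified integer-only parsing, and the tightened regex are my own choices for illustration, not part of the original answer.

```python
import re
import yaml

# Simplified, anchored variant of the tuple pattern: "(a,b)", "(a,b,)", "()".
TUPLE_PATTERN = re.compile(r"^\((\w*,)*\w*\)$")


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that also understands (x,y) scalars."""


def construct_tuple(loader, node):
    value = loader.construct_scalar(node)
    # Strip the parentheses and drop empty items (trailing comma, empty tuple).
    items = [item for item in value[1:-1].split(',') if item != '']
    # Integers only in this sketch; anything else stays a string.
    return tuple(int(item) if item.isdigit() else item for item in items)


TupleSafeLoader.add_constructor('!tuple', construct_tuple)
# Only scalars starting with "(" are checked against the pattern.
TupleSafeLoader.add_implicit_resolver('!tuple', TUPLE_PATTERN, ['('])

doc = """
cities:
  1: (0,0)
  2: (4,0)
highways:
  - (1,2)
  - (1,3)
start: 1
end: 4
"""

data = yaml.load(doc, Loader=TupleSafeLoader)
untouched = yaml.safe_load(doc)  # stock safe loader keeps the strings
print(data)
```

Since everything is attached to a subclass, the stock yaml.safe_load() behaves as before, and the tuple syntax only applies where TupleSafeLoader is passed in.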