In Python, how can you load YAML mappings as OrderedDicts?


Python >= 3.6

In Python 3.6+, PyYAML appears to preserve mapping order on load by default, because plain dicts keep insertion order, so no special dictionary type is needed. The default Dumper, on the other hand, sorts dictionaries by key. Starting with PyYAML 5.1, you can turn this off by passing sort_keys=False:

    import yaml

    a = dict(zip("unsorted", "unsorted"))
    s = yaml.safe_dump(a, sort_keys=False)
    b = yaml.safe_load(s)
    assert list(a.keys()) == list(b.keys())  # True

This works because of the new dict implementation, which had already been in use in PyPy for some time. While still considered an implementation detail in CPython 3.6, "the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec" as of 3.7+; see What's New In Python 3.7.

Note that this is still undocumented on the PyYAML side, so you shouldn't rely on it for safety-critical applications.
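To make the dumper behavior concrete, here is a small sketch that assumes PyYAML 5.1 or later and Python 3.7+. The sample keys are made up for illustration.

```python
import yaml

# Keys are deliberately inserted in non-alphabetical order.
data = {"zebra": 1, "apple": 2, "mango": 3}

# Default behavior: the dumper sorts mapping keys.
sorted_text = yaml.safe_dump(data)

# With sort_keys=False the insertion order is written out instead.
ordered_text = yaml.safe_dump(data, sort_keys=False)

# Loading returns plain dicts whose key order follows the YAML text.
print(list(yaml.safe_load(sorted_text)))
print(list(yaml.safe_load(ordered_text)))
```

Only the second dump round-trips the original key order. The first comes back alphabetized because the order was already changed when the text was written.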

Original answer (compatible with all known versions)

I like @James' solution for its simplicity. However, it changes the default global yaml.Loader class, which can lead to troublesome side effects. That is a bad idea especially in library code. It also doesn't work directly with yaml.safe_load().

Fortunately, the solution can be improved without much effort:

    import yaml
    from collections import OrderedDict

    def ordered_load(stream, Loader=yaml.SafeLoader, object_pairs_hook=OrderedDict):
        class OrderedLoader(Loader):
            pass
        def construct_mapping(loader, node):
            loader.flatten_mapping(node)
            return object_pairs_hook(loader.construct_pairs(node))
        OrderedLoader.add_constructor(
            yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
            construct_mapping)
        return yaml.load(stream, OrderedLoader)

    # usage example:
    ordered_load(stream, yaml.SafeLoader)

For serialization, you could use the following function:

    def ordered_dump(data, stream=None, Dumper=yaml.SafeDumper, **kwds):
        class OrderedDumper(Dumper):
            pass
        def _dict_representer(dumper, data):
            return dumper.represent_mapping(
                yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
                data.items())
        OrderedDumper.add_representer(OrderedDict, _dict_representer)
        return yaml.dump(data, stream, OrderedDumper, **kwds)

    # usage:
    ordered_dump(data, Dumper=yaml.SafeDumper)

In each case, you could also make the custom subclasses global, so that they don't have to be recreated on each call.
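As a rough sketch of that module-level variant, the subclasses and their registrations can be created once at import time. The names OrderedLoader, OrderedDumper, MAPPING_TAG and the two helper functions are illustrative choices, and default_flow_style=False is passed explicitly on the assumption that block-style output is wanted regardless of PyYAML version.

```python
from collections import OrderedDict

import yaml

MAPPING_TAG = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG


class OrderedLoader(yaml.SafeLoader):
    """SafeLoader subclass that builds OrderedDicts for mappings."""


class OrderedDumper(yaml.SafeDumper):
    """SafeDumper subclass that writes OrderedDicts in insertion order."""


def _construct_ordered_mapping(loader, node):
    loader.flatten_mapping(node)  # resolve "<<" merge keys first
    return OrderedDict(loader.construct_pairs(node))


def _represent_ordered_mapping(dumper, data):
    # Passing items() rather than the mapping itself keeps the given order.
    return dumper.represent_mapping(MAPPING_TAG, data.items())


# Registered once, on the subclasses only; yaml.SafeLoader and
# yaml.SafeDumper themselves are left unchanged.
OrderedLoader.add_constructor(MAPPING_TAG, _construct_ordered_mapping)
OrderedDumper.add_representer(OrderedDict, _represent_ordered_mapping)

text = "b: 1\na: 2\n"
loaded = yaml.load(text, Loader=OrderedLoader)
dumped = yaml.dump(loaded, Dumper=OrderedDumper, default_flow_style=False)
plain = yaml.safe_load(text)  # still an ordinary dict
```

Because the constructor and representer live on the subclasses, code elsewhere that calls yaml.safe_load() or yaml.safe_dump() keeps its usual behavior.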


The yaml module allows you to specify custom 'representers' to convert Python objects to text and 'constructors' to reverse the process.

    import collections

    import yaml

    _mapping_tag = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

    def dict_representer(dumper, data):
        # items() replaces the Python 2-only iteritems()
        return dumper.represent_dict(data.items())

    def dict_constructor(loader, node):
        return collections.OrderedDict(loader.construct_pairs(node))

    # Note: these register on PyYAML's default Dumper and Loader classes.
    yaml.add_representer(collections.OrderedDict, dict_representer)
    yaml.add_constructor(_mapping_tag, dict_constructor)


2018 option:

oyaml is a drop-in replacement for PyYAML which preserves dict ordering. Both Python 2 and Python 3 are supported. Just pip install oyaml, and import as shown below:

    import oyaml as yaml

You'll no longer be annoyed by screwed-up mappings when dumping/loading.

Note: I'm the author of oyaml.