Creating nested dataclass objects in Python
This request is about as complex as the dataclasses module itself, which means that probably the best way to achieve this "nested fields" capability is to define a new decorator, akin to @dataclass.
Fortunately, if you don't need the signature of the __init__ method to reflect the fields and their defaults, as the classes produced by calling dataclass do, this can be a whole lot simpler: a class decorator that calls the original dataclass and wraps some functionality around its generated __init__ method can do it with a plain "...(*args, **kwargs):" style function.
In other words, all one needs to do is write a wrapper around the generated __init__ method that inspects the parameters passed in "kwargs", checks whether any of them corresponds to a field whose type is a dataclass, and if so, builds the nested object before calling the original __init__. Maybe this is harder to spell out in English than in Python:
    from dataclasses import dataclass, is_dataclass


    def nested_dataclass(*args, **kwargs):
        def wrapper(cls):
            cls = dataclass(cls, **kwargs)
            original_init = cls.__init__

            def __init__(self, *args, **kwargs):
                for name, value in kwargs.items():
                    field_type = cls.__annotations__.get(name, None)
                    # Build the nested object when a dict is passed for a dataclass-typed field
                    if is_dataclass(field_type) and isinstance(value, dict):
                        new_obj = field_type(**value)
                        kwargs[name] = new_obj
                original_init(self, *args, **kwargs)

            cls.__init__ = __init__
            return cls

        return wrapper(args[0]) if args else wrapper
Note that besides not worrying about the __init__ signature, this also ignores passing init=False, since it would be meaningless anyway.
(The if in the return line is what lets this work either when called with named parameters or when applied directly as a decorator, like dataclass itself.)
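To make the two invocation styles concrete, here is a minimal sketch that applies the decorator both bare and with a keyword argument. The decorator is repeated from above so the snippet runs on its own; the classes Point, Segment and FrozenSegment are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, is_dataclass


def nested_dataclass(*args, **kwargs):
    # Same decorator as above, repeated so this snippet is self-contained.
    def wrapper(cls):
        cls = dataclass(cls, **kwargs)
        original_init = cls.__init__

        def __init__(self, *args, **kwargs):
            for name, value in kwargs.items():
                field_type = cls.__annotations__.get(name, None)
                if is_dataclass(field_type) and isinstance(value, dict):
                    kwargs[name] = field_type(**value)
            original_init(self, *args, **kwargs)

        cls.__init__ = __init__
        return cls

    return wrapper(args[0]) if args else wrapper


@dataclass
class Point:  # hypothetical example class
    x: int = 0
    y: int = 0


@nested_dataclass  # bare form: the class itself arrives as args[0]
class Segment:
    start: Point
    end: Point


@nested_dataclass(frozen=True)  # keyword form: options are forwarded to dataclass()
class FrozenSegment:
    start: Point
    end: Point


# Dicts passed by keyword are converted; ready-made instances pass through untouched.
seg = Segment(start={"x": 1, "y": 2}, end={"x": 3, "y": 4})
frozen = FrozenSegment(start={"x": 1}, end=Point(5, 6))
```

Since the wrapper only looks at kwargs, a dict passed positionally would be stored as a plain dict.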
And on the interactive prompt:
    In [85]: @dataclass
        ...: class A:
        ...:     b: int = 0
        ...:     c: str = ""
        ...:

    In [86]: @dataclass
        ...: class A:
        ...:     one: int = 0
        ...:     two: str = ""
        ...:

    In [87]: @nested_dataclass
        ...: class B:
        ...:     three: A
        ...:     four: str
        ...:

    In [88]: @nested_dataclass
        ...: class C:
        ...:     five: B
        ...:     six: str
        ...:

    In [89]: obj = C(five={"three":{"one": 23, "two":"narf"}, "four": "zort"}, six="fnord")

    In [90]: obj.five.three.two
    Out[90]: 'narf'
If you want the signature to be kept, I'd recommend using the private helper functions in the dataclasses module itself to create a new __init__.
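A lighter alternative, which is not the private-helper route recommended above, is to decorate the wrapper with functools.wraps. This is only a sketch under that assumption: it does not generate a new __init__, it merely makes inspect.signature report the signature of the dataclass-generated one by following __wrapped__. The name nested_dataclass_keep_signature is hypothetical, and the variant supports only the bare decorator form for brevity.

```python
import functools
import inspect
from dataclasses import dataclass, is_dataclass


def nested_dataclass_keep_signature(cls):
    """Hypothetical variant of nested_dataclass; bare decorator form only."""
    cls = dataclass(cls)
    original_init = cls.__init__

    @functools.wraps(original_init)  # copies metadata and sets __wrapped__
    def __init__(self, *args, **kwargs):
        for name, value in kwargs.items():
            field_type = cls.__annotations__.get(name)
            if is_dataclass(field_type) and isinstance(value, dict):
                kwargs[name] = field_type(**value)
        original_init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@dataclass
class Inner:
    one: int = 0
    two: str = ""


@nested_dataclass_keep_signature
class Outer:
    three: Inner
    four: str = "zort"


# inspect.signature follows __wrapped__, so it reports the generated parameters.
sig = inspect.signature(Outer)
obj = Outer(three={"one": 23, "two": "narf"})
```

The reported annotation for three is still Inner, even though the wrapper also accepts a dict there, so the signature is accurate about names and defaults but not about the accepted types.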
You can try the dacite module. This package simplifies creation of data classes from dictionaries; it also supports nested structures.
Example:
    from dataclasses import dataclass
    from dacite import from_dict


    @dataclass
    class A:
        x: str
        y: int


    @dataclass
    class B:
        a: A


    data = {
        'a': {
            'x': 'test',
            'y': 1,
        }
    }

    result = from_dict(data_class=B, data=data)

    assert result == B(a=A(x='test', y=1))
To install dacite, simply use pip:
$ pip install dacite
Instead of writing a new decorator, I came up with a function that modifies all fields of type dataclass after the actual dataclass is initialized.
    import dataclasses


    def dicts_to_dataclasses(instance):
        """Convert all fields of type `dataclass` into an instance of the
        specified data class if the current value is of type dict."""
        cls = type(instance)
        for f in dataclasses.fields(cls):
            if not dataclasses.is_dataclass(f.type):
                continue

            value = getattr(instance, f.name)
            if not isinstance(value, dict):
                continue

            new_value = f.type(**value)
            setattr(instance, f.name, new_value)
The function could be called manually or in __post_init__. This way the @dataclass decorator can be used in all its glory.
The example from above with a call to __post_init__:
    @dataclass
    class One:
        f_one: int
        f_two: str


    @dataclass
    class Two:
        def __post_init__(self):
            dicts_to_dataclasses(self)

        f_three: str
        f_four: One


    data = {'f_three': 'three', 'f_four': {'f_one': 1, 'f_two': 'two'}}

    two = Two(**data)
    # Two(f_three='three', f_four=One(f_one=1, f_two='two'))
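For the manual-call option mentioned above, a minimal sketch might look like the following. The function is repeated so the snippet runs on its own, and Two is declared here without __post_init__, so the conversion happens only when dicts_to_dataclasses is called explicitly.

```python
import dataclasses
from dataclasses import dataclass


def dicts_to_dataclasses(instance):
    # Same function as above, repeated so this snippet is self-contained.
    cls = type(instance)
    for f in dataclasses.fields(cls):
        if not dataclasses.is_dataclass(f.type):
            continue
        value = getattr(instance, f.name)
        if not isinstance(value, dict):
            continue
        setattr(instance, f.name, f.type(**value))


@dataclass
class One:
    f_one: int
    f_two: str


@dataclass
class Two:  # no __post_init__ here: conversion is triggered by hand
    f_three: str
    f_four: One


two = Two(f_three="three", f_four={"f_one": 1, "f_two": "two"})
was_dict = isinstance(two.f_four, dict)  # still a plain dict at this point

dicts_to_dataclasses(two)  # now f_four is replaced by a One instance
```

Two caveats follow from how the function is written: it relies on setattr, so it would fail on a frozen dataclass, and it checks f.type directly, so fields whose annotations are stored as strings (for example under `from __future__ import annotations`) would be skipped.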