Making object JSON serializable with regular encoder
As I said in a comment to your question, after looking at the json module's source code, it does not appear to lend itself to doing what you want. However, the goal could be achieved by what is known as monkey-patching (see question What is a monkey patch?). This could be done in your package's __init__.py initialization script and would affect all subsequent json module serialization, since modules are generally only loaded once and the result is cached in sys.modules.
The patch changes the default json encoder's default method, that is, the default default().
Here's an example implemented as a standalone module for simplicity's sake:
Module: make_json_serializable.py
"""Module that monkey-patches the json module when it's imported so
JSONEncoder.default() automatically checks for a special "to_json()"
method and uses it to encode the object if found.
"""
from json import JSONEncoder

def _default(self, obj):
    to_json = getattr(obj.__class__, "to_json", None)
    if to_json is not None:
        return to_json(obj)
    # No special method: fall back to the original default(), which
    # raises TypeError for objects it can't serialize.
    return _default.default(self, obj)

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default  # Replace it.
Using it is trivial since the patch is applied by simply importing the module.
Sample client script:
import json
import make_json_serializable  # apply monkey-patch

class Foo(object):
    def __init__(self, name):
        self.name = name
    def to_json(self):  # New special method.
        """ Convert to JSON format string representation. """
        return '{"name": "%s"}' % self.name

foo = Foo('sazpaz')
print(json.dumps(foo))  # -> "{\"name\": \"sazpaz\"}"
To retain the object type information, the special method can also include it in the string returned:
return ('{"type": "%s", "name": "%s"}' % (self.__class__.__name__, self.name))
Which produces the following JSON that now includes the class name:
"{\"type\": \"Foo\", \"name\": \"sazpaz\"}"
Magick Lies Here
Even better than having the replacement default() look for a specially named method would be for it to serialize most Python objects automatically, including user-defined class instances, without needing to add a special method. After researching a number of alternatives, the following, which uses the pickle module, seemed closest to that ideal to me:
Module: make_json_serializable2.py
"""Module that imports the json module and monkey-patches it so
JSONEncoder.default() automatically pickles any Python objects
encountered that aren't standard JSON data types.
"""
from json import JSONEncoder
import pickle

def _default(self, obj):
    return {'_python_object': pickle.dumps(obj)}

JSONEncoder.default = _default  # Replace with the above.
Of course, not everything can be pickled; extension types are one example. However, the pickle protocol defines ways to handle them by writing special methods, similar to what you suggested and I described earlier, but doing that would likely be necessary in far fewer cases.
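As a rough illustration of those special methods, here is a minimal sketch using __getstate__() and __setstate__(), which are part of the pickle protocol. The Counter class and its attributes are hypothetical, made up for this example; it targets Python 3.

```python
import pickle
import threading

class Counter(object):  # Hypothetical class holding an unpicklable attribute.
    def __init__(self, start=0):
        self.value = start
        self._lock = threading.Lock()  # Lock objects cannot be pickled.

    def __getstate__(self):
        # Copy the instance state and leave out the unpicklable lock.
        state = self.__dict__.copy()
        del state['_lock']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()

original = Counter(5)
restored = pickle.loads(pickle.dumps(original))
```

Without the two methods, pickle.dumps(original) would raise a TypeError because of the lock.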
Deserializing
Regardless, using the pickle protocol also means it would be fairly easy to reconstruct the original Python object by providing a custom object_hook function argument on any json.loads() calls. The hook checks each decoded dictionary for a '_python_object' key and unpickles the value whenever it finds one. Something like:
def as_python_object(dct):
    try:
        return pickle.loads(str(dct['_python_object']))
    except KeyError:
        return dct

pyobj = json.loads(json_str, object_hook=as_python_object)
If this has to be done in many places, it might be worthwhile to define a wrapper function that automatically supplies the extra keyword argument:
import functools

json_pkloads = functools.partial(json.loads, object_hook=as_python_object)
pyobj = json_pkloads(json_str)
Naturally, this could be monkey-patched into the json module as well, making the function the default object_hook (instead of None).
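One way that patch might look is sketched below, assuming it is acceptable to wrap json.loads() itself. The _loads wrapper name is made up for this example, and the latin1 round-trip is only there so the sketch runs on Python 3, where pickle.dumps() returns bytes.

```python
import json
import pickle

def as_python_object(dct):
    # Only unpickle data you trust: unpickling can execute arbitrary code.
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct

_original_loads = json.loads  # Keep a reference to the unpatched function.

def _loads(s, **kwargs):
    # Supply the hook only when the caller did not pass one explicitly.
    kwargs.setdefault('object_hook', as_python_object)
    return _original_loads(s, **kwargs)

json.loads = _loads  # Replace it.

json_str = json.dumps(
    {'_python_object': pickle.dumps({1, 2, 3}).decode('latin1')})
pyobj = json.loads(json_str)  # No object_hook argument needed now.
```

Callers that pass their own object_hook keep their behavior, since setdefault() leaves an existing value alone.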
I got the idea for using pickle from an answer to another JSON serialization question by Raymond Hettinger, whom I consider exceptionally credible as well as an official source (as in Python core developer).
Portability to Python 3
The code above does not work as shown in Python 3 because pickle.dumps() returns a bytes object, which the JSONEncoder can't handle. However, the approach is still valid. A simple way to work around the issue is to latin1 "decode" the value returned from pickle.dumps() and then "encode" it back to latin1 before passing it on to pickle.loads() in the as_python_object() function. This works because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original string again (as pointed out in this answer by Sven Marnach).
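The round-trip property can be checked directly. This is a small sketch for Python 3; the variable names are chosen only for illustration.

```python
import pickle
from decimal import Decimal

all_bytes = bytes(range(256))         # Every possible byte value.
as_text = all_bytes.decode('latin1')  # Each byte maps to exactly one code point.
assert as_text.encode('latin1') == all_bytes  # Encoding restores the bytes.

# The same round trip applied to pickled data leaves it loadable.
pickled = pickle.dumps(Decimal('3.14'))
round_tripped = pickle.loads(pickled.decode('latin1').encode('latin1'))
```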
(Although the following works fine in Python 2, the latin1 decoding and encoding it does is superfluous.)
import json
import pickle
from decimal import Decimal

class PythonObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return {'_python_object': pickle.dumps(obj).decode('latin1')}

def as_python_object(dct):
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct

class Foo(object):  # Some user-defined class.
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        if type(other) is type(self):  # Instances of same class?
            return self.name == other.name
        return NotImplemented
    __hash__ = None

data = [1, 2, 3, set(['knights', 'who', 'say', 'ni']), {'key': 'value'},
        Foo('Bar'), Decimal('3.141592653589793238462643383279502884197169')]
j = json.dumps(data, cls=PythonObjectEncoder, indent=4)
data2 = json.loads(j, object_hook=as_python_object)
assert data == data2  # both should be same
You can extend the dict class like so:
#!/usr/local/bin/python3
import json

class Serializable(dict):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # hack to fix _json.so make_encoder serialize properly
        self.__setitem__('dummy', 1)

    def _myattrs(self):
        return [
            (x, self._repr(getattr(self, x)))
            for x in self.__dir__()
            if x not in Serializable().__dir__()
        ]

    def _repr(self, value):
        if isinstance(value, (str, int, float, list, tuple, dict)):
            return value
        else:
            return repr(value)

    def __repr__(self):
        return '<%s.%s object at %s>' % (
            self.__class__.__module__,
            self.__class__.__name__,
            hex(id(self))
        )

    def keys(self):
        return iter([x[0] for x in self._myattrs()])

    def values(self):
        return iter([x[1] for x in self._myattrs()])

    def items(self):
        return iter(self._myattrs())
Now to make your classes serializable with the regular encoder, extend 'Serializable':
class MySerializableClass(Serializable):

    attr_1 = 'first attribute'
    attr_2 = 23

    def my_function(self):
        print('do something here')

obj = MySerializableClass()
print(obj)
will print something like:
<__main__.MySerializableClass object at 0x1073525e8>
print(json.dumps(obj, indent=4))
will print something like:
{
    "attr_1": "first attribute",
    "attr_2": 23,
    "my_function": "<bound method MySerializableClass.my_function of <__main__.MySerializableClass object at 0x1073525e8>>"
}
I suggest putting the hack into the class definition. This way, once the class is defined, it supports JSON. Example:
import json

class MyClass( object ):

    def _jsonSupport( *args ):
        # Runs once, while the class body is executed, and patches json.
        def default( self, xObject ):
            return { 'type': 'MyClass', 'name': xObject.name() }

        def objectHook( obj ):
            if 'type' not in obj:
                return obj
            if obj[ 'type' ] != 'MyClass':
                return obj
            return MyClass( obj[ 'name' ] )

        json.JSONEncoder.default = default
        json._default_decoder = json.JSONDecoder( object_hook = objectHook )

    _jsonSupport()

    def __init__( self, name ):
        self._name = name

    def name( self ):
        return self._name

    def __repr__( self ):
        return '<MyClass(name=%s)>' % self._name

myObject = MyClass( 'Magneto' )
jsonString = json.dumps( [ myObject, 'some', { 'other': 'objects' } ] )
print( "json representation:", jsonString )

decoded = json.loads( jsonString )
print( "after decoding, our object is the first in the list", decoded[ 0 ] )