How to flatten a pandas dataframe with some columns as json?

python json pandas dataframe flatten

Here's a solution that uses json_normalize() together with two custom functions that convert each column into a format json_normalize() understands.

import ast

# pd.json_normalize is top-level since pandas 1.0;
# on older versions use: from pandas.io.json import json_normalize
import pandas as pd


def only_dict(d):
    '''
    Convert json string representation of dictionary to a python dict
    '''
    return ast.literal_eval(d)


def list_of_dicts(ld):
    '''
    Create a mapping of the tuples formed after
    converting json strings of list to a python list
    '''
    # The second value of each dict becomes the key and the first value the
    # mapped value, so this relies on the order of the keys inside each dict.
    return dict([(list(d.values())[1], list(d.values())[0]) for d in ast.literal_eval(ld)])


A = pd.json_normalize(df['columnA'].apply(only_dict).tolist()).add_prefix('columnA.')
B = pd.json_normalize(df['columnB'].apply(list_of_dicts).tolist()).add_prefix('columnB.pos.')

Finally, join the DataFrames on their common index:

df[['id', 'name']].join([A, B])

EDIT: As per the comment by @MartijnPieters, the recommended way to decode the JSON strings is json.loads(), which is much faster than ast.literal_eval() if you know that the data source is JSON.
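A rough sketch of what the json.loads() variant might look like. The sample rows and the "dist", "time" and "value" keys are assumptions made up for illustration (only the column names and the "pos" key appear in the answers here), and pos_to_value is a hypothetical helper name. It uses the top-level pd.json_normalize available since pandas 1.0.

```python
import json

import pandas as pd

# Hypothetical sample data: the keys "dist", "time" and "value" and all of the
# values are invented for this sketch; adjust them to the real data.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["John", "Mike"],
    "columnA": ['{"dist": "600", "time": "0:12.10"}',
                '{"dist": "500", "time": "0:13.40"}'],
    "columnB": ['[{"pos": "1st", "value": "500"}, {"pos": "2nd", "value": "300"}]',
                '[{"pos": "1st", "value": "400"}, {"pos": "2nd", "value": "200"}]'],
})


def pos_to_value(s):
    """Decode a JSON list of {"pos": ..., "value": ...} objects into one dict."""
    return {d["pos"]: d["value"] for d in json.loads(s)}


# json.loads takes the place of ast.literal_eval for both columns.
A = pd.json_normalize(df["columnA"].apply(json.loads).tolist()).add_prefix("columnA.")
B = pd.json_normalize(df["columnB"].apply(pos_to_value).tolist()).add_prefix("columnB.pos.")

# Join on the shared default index.
flat = df[["id", "name"]].join([A, B])
print(flat)
```

Looking the keys up by name ("pos", "value") instead of by position avoids depending on the key order inside each JSON object.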


The quickest seems to be:

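import json

import pandas as pd

# Round-trip the frame through JSON records so nested objects become plain dicts
json_struct = json.loads(df.to_json(orient="records"))
# pd.json_normalize is top-level since pandas 1.0; older versions use pd.io.json.json_normalize
df_flat = pd.json_normalize(json_struct)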
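Note that this round trip only flattens cells that hold real dicts. If a column stores JSON as text, it comes back as a plain string and stays in one column. A minimal sketch, assuming string-encoded cells that are decoded first (the sample rows and the "dist" and "time" keys are made up for illustration):

```python
import json

import pandas as pd

# Hypothetical sample data; columnA holds JSON text, not dicts.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["John", "Mike"],
    "columnA": ['{"dist": "600", "time": "0:12.10"}',
                '{"dist": "500", "time": "0:13.40"}'],
})

# Decode the JSON text into dicts so to_json() emits nested objects.
parsed = df.assign(columnA=df["columnA"].apply(json.loads))

# Round-trip through JSON records, then let json_normalize flatten the nesting.
json_struct = json.loads(parsed.to_json(orient="records"))
df_flat = pd.json_normalize(json_struct)
print(df_flat.columns.tolist())
```

Cells that hold lists (as columnB appears to) are not expanded by this call and would still need separate handling.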


Create a custom function to flatten columnB, then use pd.concat:

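import pandas as pd


def flatten(js):
    # Turn a list of dicts into a Series indexed by 'pos'
    return pd.DataFrame(js).set_index('pos').squeeze()


# Assumes columnA and columnB already hold parsed objects (dicts / lists of dicts)
pd.concat([df.drop(['columnA', 'columnB'], axis=1),
           df.columnA.apply(pd.Series),
           df.columnB.apply(flatten)], axis=1)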
