Pandas read nested json
You can use json_normalize:
import json

with open('myJson.json') as data_file:
    data = json.load(data_file)

df = pd.json_normalize(data, 'locations', ['date', 'number', 'name'],
                       record_prefix='locations_')
print (df)
  locations_arrTime locations_arrTimeDiffMin locations_depTime  \
0                                                        06:32
1             06:37                        1             06:40
2             08:24                        1

  locations_depTimeDiffMin           locations_name locations_platform  \
0                        0  Spital am Pyhrn Bahnhof                  2
1                        0  Windischgarsten Bahnhof                  2
2                                   Linz/Donau Hbf               1A-B

  locations_stationIdx locations_track number    name        date
0                    0          R 3932         R 3932  01.10.2016
1                    1                         R 3932  01.10.2016
2                    2                         R 3932  01.10.2016
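Since the full contents of myJson.json aren't shown here, this is a minimal runnable sketch of the same call with an inline dict of the assumed shape (the two abbreviated stop entries are illustrative, not the real data):

```python
import pandas as pd

# Hypothetical inline record mirroring the assumed structure of myJson.json
data = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": [
        {"name": "Spital am Pyhrn Bahnhof", "depTime": "06:32", "stationIdx": "0"},
        {"name": "Windischgarsten Bahnhof", "arrTime": "06:37", "stationIdx": "1"},
    ],
}

# Second argument: path to the list of records to flatten (one output row per stop).
# Third argument: top-level metadata fields repeated on every flattened row.
df = pd.json_normalize(data, "locations", ["date", "number", "name"],
                       record_prefix="locations_")
print(df[["locations_name", "date"]])
```

Keys missing from an individual record (here, arrTime in the first stop) simply come out as NaN in the corresponding column.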
EDIT:
You can use read_json, extract name from the locations column with the DataFrame constructor, and finally groupby with apply and join:
df = pd.read_json("myJson.json")
df.locations = pd.DataFrame(df.locations.values.tolist())['name']
df = df.groupby(['date','name','number'])['locations'].apply(','.join).reset_index()
print (df)
         date    name number                                          locations
0  2016-01-10  R 3932         Spital am Pyhrn Bahnhof,Windischgarsten Bahnho...
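The groupby step can be shown in isolation. Below, a hypothetical flat frame stands in for the result of reading myJson.json: one row per stop, with the trip-level fields repeated on each row:

```python
import pandas as pd

# Hypothetical per-stop frame (assumed data, mirroring the example above)
df = pd.DataFrame({
    "date": ["01.10.2016"] * 3,
    "name": ["R 3932"] * 3,
    "number": [""] * 3,
    "locations": ["Spital am Pyhrn Bahnhof",
                  "Windischgarsten Bahnhof",
                  "Linz/Donau Hbf"],
})

# Collapse the per-stop rows back to one row per trip,
# joining the stop names into a single comma-separated string
out = df.groupby(["date", "name", "number"])["locations"].apply(",".join).reset_index()
print(out)
```

apply(",".join) works here because each group's locations column is an iterable of strings, which is exactly what str.join accepts.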
A possible alternative to pandas.json_normalize is to build your own dataframe by extracting only the selected keys and values from the nested dictionary. The main reason for doing this is that json_normalize gets slow for very large json files (and might not always produce the output you want).
So, here is an alternative way to flatten the nested dictionary in pandas using glom. The aim is to extract selected keys and values from the nested dictionary and save them in separate columns of the pandas dataframe.
Here is a step by step guide: https://medium.com/@enrico.alemani/flatten-nested-dictionaries-in-pandas-using-glom-7948345c88f5
import pandas as pd
from glom import glom
from ast import literal_eval

target = {
    "number": "",
    "date": "01.10.2016",
    "name": "R 3932",
    "locations": {
        "depTimeDiffMin": "0",
        "name": "Spital am Pyhrn Bahnhof",
        "arrTime": "",
        "depTime": "06:32",
        "platform": "2",
        "stationIdx": "0",
        "arrTimeDiffMin": "",
        "track": "R 3932"
    }
}

# Import data
df = pd.DataFrame([str(target)], columns=['target'])

# Extract id keys and save value into a separate pandas column
df['id'] = df['target'].apply(lambda row: glom(literal_eval(row), 'locations.name'))
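If you'd rather avoid the extra glom dependency, the same dotted-path extraction can be sketched with a small helper that walks the nested dict directly. Everything below (the records list, the deep_get helper, the column names) is an illustrative assumption, not part of the original answer:

```python
import pandas as pd

# Hypothetical rows shaped like the 'target' dict above
records = [
    {"date": "01.10.2016", "name": "R 3932",
     "locations": {"name": "Spital am Pyhrn Bahnhof", "depTime": "06:32"}},
    {"date": "01.10.2016", "name": "R 3932",
     "locations": {"name": "Windischgarsten Bahnhof", "depTime": "06:40"}},
]

def deep_get(d, path):
    """Follow a dotted key path (e.g. 'locations.name') through nested dicts."""
    for key in path.split("."):
        d = d[key]
    return d

# Build the dataframe from only the selected nested keys
df = pd.DataFrame({
    "id": [deep_get(r, "locations.name") for r in records],
    "depTime": [deep_get(r, "locations.depTime") for r in records],
})
print(df)
```

This mirrors glom's spec syntax for the simple case of plain dotted paths; glom is still the better choice once you need defaults, wildcards, or deeper restructuring.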