How to import JSON into R and convert it to a table?
fromJSON returns a list, so you can use the *apply functions to go through each element. It's fairly straightforward (once you know what to do!) to convert it to a "table" (data frame is the correct R terminology).
    library(rjson)

    # You can pass the filename directly
    my.JSON <- fromJSON(file="test.json")

    df <- lapply(my.JSON, function(play) # Loop through each "play"
    {
      # Convert each group to a data frame.
      # This assumes you have 6 elements each time
      data.frame(matrix(unlist(play), ncol=6, byrow=TRUE))
    })

    # Now you have a list of data frames, connect them together in
    # one single data frame
    df <- do.call(rbind, df)

    # Make column names nicer, remove row names
    colnames(df) <- names(my.JSON[[1]][[1]])
    rownames(df) <- NULL

    df
      wins losses max_killed battles plane_id max_ground_object_destroyed
    1  118     40          7     158     4401                           3
    2  100     58          7     158     2401                           3
    3  120     38          7     158     2403                           3
    4   12    450          7     158     4401                           3
    5  150      8          7     158     2401                           3
    6  120    328          7     158     2403                           3
I find jsonlite to be a little more user friendly for this task. Here is a comparison of three JSON parsing packages (biased in favor of jsonlite).
    library(jsonlite)

    data <- fromJSON('path/to/file.json')
    data
    #> $play1
    #   wins losses max_killed battles plane_id max_ground_object_destroyed
    # 1  118     40          7     158     4401                           3
    # 2  100     58          7     158     2401                           3
    # 3  120     38          7     158     2403                           3
    #
    # $play2
    #   wins losses max_killed battles plane_id max_ground_object_destroyed
    # 1   12    450          7     158     4401                           3
    # 2  150      8          7     158     2401                           3
    # 3  120    328          7     158     2403                           3
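If you don't have the file handy, here is a minimal sketch using an inline JSON string instead. The sample text is my own reconstruction, shaped to match the printed result (two "plays", each an array of records with quoted numbers) and trimmed to two rows and three fields per play; it is not the asker's actual file. fromJSON accepts a JSON string as well as a path.

```r
library(jsonlite)

# Hypothetical sample input (an assumption, not the original file):
# two plays, each an array of records whose numbers are quoted strings.
json_text <- '{
  "play1": [
    {"wins": "118", "losses": "40", "plane_id": "4401"},
    {"wins": "100", "losses": "58", "plane_id": "2401"}
  ],
  "play2": [
    {"wins": "12",  "losses": "450", "plane_id": "4401"},
    {"wins": "150", "losses": "8",   "plane_id": "2401"}
  ]
}'

# Each array of records is simplified to a data frame,
# so the result is a named list of data frames.
plays <- fromJSON(json_text)

names(plays)                 # names of the list elements
class(plays$play1)           # class of one element
sapply(plays$play1, class)   # column classes (quoted numbers stay character)
```

Checking the column classes right after parsing is a cheap way to spot quoted numbers before they cause trouble later.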
If you want to collapse those list names into a new column, I recommend dplyr::bind_rows rather than do.call(rbind, data):
    library(dplyr)

    data <- bind_rows(data, .id = 'play')
    # Source: local data frame [6 x 7]
    #
    #    play  wins losses max_killed battles plane_id max_ground_object_destroyed
    #   (chr) (chr)  (chr)      (chr)   (chr)    (chr)                       (chr)
    # 1 play1   118     40          7     158     4401                           3
    # 2 play1   100     58          7     158     2401                           3
    # 3 play1   120     38          7     158     2403                           3
    # 4 play2    12    450          7     158     4401                           3
    # 5 play2   150      8          7     158     2401                           3
    # 6 play2   120    328          7     158     2403                           3
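For comparison, here is a rough base R sketch of the same idea, in case dplyr is not available. The small list below is a hand-built stand-in for what fromJSON returns (an assumption, trimmed to two columns), and the helper logic is mine rather than anything bind_rows does internally.

```r
# Hypothetical stand-in for the named list of data frames returned by fromJSON().
data <- list(
  play1 = data.frame(wins = c("118", "100"), losses = c("40", "58"),
                     stringsAsFactors = FALSE),
  play2 = data.frame(wins = c("12", "150"), losses = c("450", "8"),
                     stringsAsFactors = FALSE)
)

# Plain rbind stacks the rows, but the list names only survive
# as prefixes of the row names, not as a usable column.
stacked <- do.call(rbind, data)

# Adding the list name as a column before stacking keeps it as data.
with_id <- Map(
  function(d, name) data.frame(play = name, d, stringsAsFactors = FALSE),
  data, names(data)
)
combined <- do.call(rbind, with_id)
rownames(combined) <- NULL

combined
```

bind_rows(data, .id = 'play') does this in one call, which is why I recommend it.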
Beware that the columns may not have the type you expect (notice the columns are all characters since all of the numbers were quoted in the provided JSON data)!
Edit Nov. 2017: One approach to type conversion would be to use mutate_if to guess the intended type of character columns.
    data <- mutate_if(data, is.character, type.convert, as.is = TRUE)
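The same conversion can be sketched in base R by applying type.convert to every column, which avoids the dplyr dependency. The data frame here is a small made-up example (an assumption) with quoted numbers, like the parsed JSON.

```r
# Hypothetical example data: every column is character, as after parsing quoted numbers.
data <- data.frame(
  play   = c("play1", "play2"),
  wins   = c("118", "12"),
  losses = c("40", "450"),
  stringsAsFactors = FALSE
)

sapply(data, class)   # column classes before conversion

# type.convert() guesses a type for each column; as.is = TRUE keeps
# non-numeric text as character instead of turning it into a factor.
# Assigning into data[] preserves the data frame structure.
data[] <- lapply(data, type.convert, as.is = TRUE)

sapply(data, class)   # column classes after conversion
```

Columns that contain anything non-numeric are left as character, so it is still worth checking the classes afterwards.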
I prefer tidyjson over rjson and jsonlite, as it has an easy workflow for converting multilevel nested JSON objects to two-dimensional tables. Your problem can be easily solved using this package from GitHub.
    devtools::install_github("sailthru/tidyjson")
    library(tidyjson)
    library(dplyr)

    json %>% as.tbl_json %>% gather_keys %>% gather_array %>%
      spread_values(
        wins = jstring("wins"),
        losses = jstring("losses"),
        max_killed = jstring("max_killed"),
        battles = jstring("battles"),
        plane_id = jstring("plane_id"),
        max_ground_object_destroyed = jstring("max_ground_object_destroyed")
      )
Output
      document.id   key array.index wins losses max_killed battles plane_id max_ground_object_destroyed
    1           1 play1           1  118     40          7     158     4401                           3
    2           1 play1           2  100     58          7     158     2401                           3
    3           1 play1           3  120     38          7     158     2403                           3
    4           1 play2           1   12    450          7     158     4401                           3
    5           1 play2           2  150      8          7     158     2401                           3
    6           1 play2           3  120    328          7     158     2403                           3