Convert JSON format String into array in HIVE
You can start with this:
select concat(‘{“name”’,data_json) from your_table q1 --re-construct your jsonlateral view explode(split(json_data,’{“name”’)) json_splits as data_json --split json at each {"name" tag into array and then explode
Note: I code is not tested as I don't have access to hive currently. This should definitely give you a good start OR you can always go with Hive SerDe for JSON com.cloudera.hive.serde.JSONSerDe
As suggested by @ruben123, go with Hive SerDe for JSON especially when your json is complex. There are several JSONSerDe available, eg. com.cloudera.hive.serde.JSONSerDe
, org.openx.data.jsonserde.JsonSerDe
link
Make sure json is properly formatted, one line json for one record. So, your json should be:
{"name":"abc", "address":{"street":"str1", "city":"c1"}, "phone":"1234567"}{"name":"def", "address":{"street":"str2", "city":"c1"}, "phone":"7145895"}
Create hive table:
CREATE TABLE sample_json ( name STRING, address STRUCT< street: STRING, city: STRING>, phone INT )ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'LOCATION '/your/hdfs/directory';
To select access field, simply
select name, address.street, address.city, phone from sample_json;abc str1 c1 1234567def str2 c1 7145895
Note: if JSONSerDe is not installed yet, you must run ADD JAR