Hive Array<Struct<>> Insertion shows null
map keys terminated by ','
create external table temp ( regionkey smallint ,name string ,comment string ,nations array<struct<n_nationkey:smallint,n_name:string,n_comment:string>>) row format delimited fields terminated by '|' map keys terminated by ',';
select * from temp;
+-----------+--------+------------------------------------------------+-----------------------------------------------------------------------+| regionkey | name | comment | nations |+-----------+--------+------------------------------------------------+-----------------------------------------------------------------------+| 4 | EUROPE | Low sale Business Region for Training4Exam.com | [{"n_nationkey":7,"n_name":"Bulgaria","n_comment":"Reference "}] || 4 | EUROPE | Low sale Business Region for HadoopExam.com | [{"n_nationkey":19,"n_name":"Belgium","n_comment":"Reference site "}] || 4 | EUROPE | Low sale Business Region for Training4Exam.com | [{"n_nationkey":22,"n_name":"Ryan","n_comment":"Reference site"}] |+-----------+--------+------------------------------------------------+-----------------------------------------------------------------------+
FYI
To be backward-compatible, initialize the first 3 separator to be the given values from the table properties.
The default number of separators is 8;
if only hive.serialization.extend.nesting.levels is set, the number of separators is extended to 24;
if hive.serialization.extend.additional.nesting.levels is set, the number of separators is extended to 154.
@param tableProperties table properties to extract the user provided separators
David's answer is very efficient & I liked it very much but cannot understand why collection items must be replaced by map keys (Seems there is a bug in Hive based on the description as he has suggested, I am not a pro in coding).
However, this is the long version
create table regiontemp(str string);load data inpath '/user/cloudera/MohsenFiles/first_first.csv' into table regiontemp;create external table region (r_regionkey smallint,r_name string,r_comment string,r_nations array<struct<n_nationkey:smallint,n_name:string,n_comment:string>>)row format delimitedfields terminated by '|'collection items terminated by ','insert overwrite table regionselect split(str,'\\|')[0] r_regionkey,split(str,'\\|')[1] r_name,split(str,'\\|')[2] r_comment,array(named_struct("n_nationkey",cast(split(split(str,'\\|')[3],",")[0] as smallint), "n_name",split(split(str,'\\|')[3],",")[1] , "n_comment",split(split(str,'\\|')[3],",")[2] ))from regiontemp ;
now in impalaINVALIDATE METADATA;
or in Hive (Aggregation On Struct columns Hive again based on David's answer for another Q)