
Hive: dynamic partition adding to external table


I have a very similar issue where, after a migration, I have to recreate a table for which I have the data, but not the metadata. The solution seems to be, after recreating the table:

MSCK REPAIR TABLE table_name;

Explained here

This also mentions the "ALTER TABLE X RECOVER PARTITIONS" statement that the OP posted in a comment on his own question. That variant is specific to Amazon EMR; MSCK REPAIR TABLE table_name; works on non-EMR implementations (Cloudera in my case).
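For illustration, a minimal recovery sequence, assuming the data files already sit under a date-partitioned directory tree (the table name, column, and path here are hypothetical):

CREATE EXTERNAL TABLE table_name (col1 STRING)
PARTITIONED BY (d STRING)
LOCATION '/user/hive/warehouse/table_name';

-- scan the directory tree and register any partitions missing from the metastore
MSCK REPAIR TABLE table_name;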


The partitions are a physical segmenting of the data: each partition is maintained by the directory structure, and queries use the metadata to determine where a partition is located. So if you can make the directory structure match the query, Hive should find the data you want. For example:

select count(*) from table_name where (d >= '2011-08-03') and (d <= '2011-08-09');

I do not know of any other date-range operations, though; you'll have to do the math to create the query pattern yourself.
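To make that concrete, here is a sketch of what the directory layout might look like on HDFS (paths are hypothetical); each d=... directory is one partition, so the range predicate above only reads the matching directories:

/user/hive/warehouse/table_name/d=2011-08-03/part-00000
/user/hive/warehouse/table_name/d=2011-08-04/part-00000
...
/user/hive/warehouse/table_name/d=2011-08-09/part-00000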

You can also create external tables and add partitions to them that define the location. This allows you to shred the data as you like and still use the partition scheme to optimize the queries, as sketched below.
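A minimal sketch of that approach, assuming the data for each day already lives in its own directory (table name, column, and paths are hypothetical):

-- external table whose partitions can each point at an arbitrary directory
CREATE EXTERNAL TABLE logs (line STRING)
PARTITIONED BY (d STRING);

-- register an existing directory as the partition for one day
ALTER TABLE logs ADD PARTITION (d='2011-08-03')
LOCATION '/data/logs/2011/08/03';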


I do not believe there is any built-in functionality for this in Hive. You may be able to write a plugin; see Creating custom UDFs.

I probably do not need to mention this, but have you considered a simple bash script that takes your parameters and pipes the commands to hive?
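A rough sketch of what that could look like; the script name, parameter order, and partition column are all hypothetical:

#!/bin/bash
# add_partition.sh <table> <date> <hdfs_path> -- hypothetical helper that
# builds an ALTER TABLE statement from its arguments and hands it to hive
TABLE="$1"
DT="$2"
LOC="$3"
hive -e "ALTER TABLE ${TABLE} ADD PARTITION (d='${DT}') LOCATION '${LOC}';"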

Oozie workflows would be another option, though that might be overkill: Oozie Hive Extension. After some thinking, I don't think Oozie would work for this.