
Sqoop incremental export using hcatalog?


I have not seen any incremental arguments for sqoop export. One workaround is to create a control_table in Hive that logs the table name and the timestamp of each export.

create table if not exists control_table (table_name string, export_date timestamp);

insert into control_table
select 'export_table1' as table_name, from_unixtime(unix_timestamp()) as export_date;

Assuming export_table1 is the table you want to export incrementally, and that you have already executed the two statements above, run the following:

--execute the statements below together

--get the timestamp when the table was last exported
create temporary table control_table_now as
select table_name, max(export_date) as last_export_date
from control_table
group by table_name;

--get incremental rows
create table new_export_table1 as
select field1, field2, field3, .... timestamp1
from export_table1 e, control_table_now c
where c.table_name = 'export_table1'
  and e.timestamp1 >= c.last_export_date;

--append to the control_table for the next run
insert into control_table
select 'export_table1' as table_name, from_unixtime(unix_timestamp()) as export_date;

Now export new_export_table1, which holds only the rows added since the last export, using the sqoop export command.
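As a rough sketch of that last step, the export of new_export_table1 through HCatalog might look like the following. The JDBC URL, user name, target table and HCatalog database are hypothetical placeholders and not from the original answer. The small run helper is only an illustration: it prints the command by default so it can be checked before anything is sent to the database.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/targetdb"
DB_USER="etl_user"
TARGET_TABLE="export_table1"       # table in the relational database
HCAT_DB="default"                  # Hive/HCatalog database (assumed)
HCAT_TABLE="new_export_table1"     # incremental table built above

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s ' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# -P prompts for the database password instead of putting it on the command line.
run sqoop export \
  --connect "$JDBC_URL" \
  --username "$DB_USER" -P \
  --table "$TARGET_TABLE" \
  --hcatalog-database "$HCAT_DB" \
  --hcatalog-table "$HCAT_TABLE"
```

Once the printed command looks right, rerun with DRY_RUN=0 in the environment to perform the export.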


By default, sqoop does not support incremental updates with HCatalog integration. When we try it, it gives the following error:

Append mode for imports is not compatible with HCatalog. Please remove the parameter--append-mode
    at org.apache.sqoop.tool.BaseSqoopTool.validateHCatalogOptions(BaseSqoopTool.java:1561)

You can use the --query option to make it work, as described in this Hortonworks document.
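A minimal sketch of the --query approach, assuming the incremental filter is a timestamp column and that the high-water mark from the previous run is tracked outside of sqoop. All names here (host, database, user, source_table, target_table, columns) are hypothetical, and the linked document may structure the command differently.

```shell
#!/bin/sh
# Hypothetical names throughout: host, database, user, tables and columns.
JDBC_URL="jdbc:mysql://dbhost:3306/sourcedb"
DB_USER="etl_user"
LAST_VALUE="2020-01-01 00:00:00"   # high-water mark saved by the previous run

# Sqoop requires the literal token $CONDITIONS in a free-form query;
# the backslash keeps the shell from expanding it.
build_query() {
  printf "SELECT id, field1, timestamp1 FROM source_table WHERE timestamp1 > '%s' AND \$CONDITIONS" "$1"
}

# Print the command by default (for inspection only, quoting is not preserved);
# set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s ' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# The WHERE clause does the incremental filtering, so --append is not needed.
run sqoop import \
  --connect "$JDBC_URL" \
  --username "$DB_USER" -P \
  --query "$(build_query "$LAST_VALUE")" \
  --split-by id \
  --hcatalog-database default \
  --hcatalog-table target_table
```

The filter lives in the query itself, so the caller is responsible for storing the new high-water mark after each successful run.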