How can we automate the incremental import in Sqoop from DB to HBase using linux script How can we automate the incremental import in Sqoop from DB to HBase using linux script shell shell

How can we automate the incremental import in Sqoop from DB to HBase using linux script


how to store the --lastval and how to pass it to the next time?

define --lastval as an alias or export variable in linux or unix. which can be retried from automation script

once load is finish then change it to recent value, by capturing latest lastval

export lastupdatedvalue=hive -e 'select last_value from table #selection query based on the logic.

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table <your table> --username root -P --incremental append --last-value ${lastupdatedvalue}

:

You can try a sqoop action in Oozie, it should work. Else use shell action in oozie if you prefer scripts. It is possible to schedule it using coordinator function available.

Also have a look at

Importing data incrementally

To import data in increments, you use the --incremental argument with the import command. Sqoop compares the values in a check column against a reference value for the most recent import. These arguments import all rows having an id greater than 100.

--incremental--check-column id--last-value 100

If you run an incremental import from the command line, Sqoop prints the last value in a subsequent incremental import. If you run an incremental import from a saved job, Sqoop retains the last value in the saved job. To import only newer rows than those previously imported, use the --exec option. Sqoop imports only rows having an id greater than the specified row id.