How to import only new data by using Sqoop?

You can use Sqoop's incremental imports:

Sqoop provides an incremental import mode which can be used to retrieve only rows newer than some previously-imported set of rows.

Incremental import arguments:

--check-column (col) Specifies the column to be examined when determining which rows to import.

--incremental (mode) Specifies how Sqoop determines which rows are new. Legal values for mode include append and lastmodified.

--last-value (value) Specifies the maximum value of the check column from the previous import.
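As a rough illustration of how these three arguments fit into a full command line, here is a minimal Python sketch that assembles the argument list for an incremental import. The helper name build_incremental_import, the JDBC URL, the table name, and the target directory are all made up for the example; only the Sqoop flags themselves come from the user guide.

```python
import shlex


def build_incremental_import(connect, table, target_dir,
                             check_column, last_value, mode="append"):
    """Return the argv list for a Sqoop incremental import (hypothetical helper)."""
    if mode not in ("append", "lastmodified"):
        raise ValueError("mode must be 'append' or 'lastmodified'")
    return [
        "sqoop", "import",
        "--connect", connect,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", mode,
        "--check-column", check_column,
        "--last-value", str(last_value),
    ]


# Placeholder connection details, assumed for illustration only.
cmd = build_incremental_import(
    "jdbc:mysql://db.example.com/shop", "orders", "/data/orders", "id", 100
)
# Print a shell-safe version of the command instead of running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list could be handed to subprocess.run on a machine where Sqoop is installed; it is only printed here so the sketch runs anywhere.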

Reference: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports

For an incremental import, you specify a check column and a reference value from the most recent import; Sqoop imports only the rows whose check-column value is beyond that reference. For example, if the --incremental append argument was specified, along with --check-column id and --last-value 100, all rows with id > 100 will be imported.

If an incremental import is run from the command line, the value which should be specified as --last-value in a subsequent incremental import will be printed to the screen for your reference. If an incremental import is run from a saved job, this value will be retained in the saved job. Subsequent runs of sqoop job --exec someIncrementalJob will continue to import only newer rows than those previously imported.
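The row-selection rule in append mode can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not Sqoop's implementation: the function incremental_append and the sample rows are invented for illustration.

```python
def incremental_append(rows, check_column, last_value):
    """Model of append mode: keep rows whose check column exceeds last_value.

    Returns the selected rows and the value to use as last_value next time.
    """
    new_rows = [row for row in rows if row[check_column] > last_value]
    next_last_value = max(
        (row[check_column] for row in new_rows), default=last_value
    )
    return new_rows, next_last_value


# Hypothetical source table contents.
source = [{"id": 99}, {"id": 100}, {"id": 101}, {"id": 105}]

# First run with --last-value 100: only ids above 100 are picked up.
first_batch, last_after_first = incremental_append(source, "id", 100)

# A new row arrives; the next run reuses the value retained from the first run.
source.append({"id": 106})
second_batch, last_after_second = incremental_append(
    source, "id", last_after_first
)

print([row["id"] for row in first_batch], last_after_first)
print([row["id"] for row in second_batch], last_after_second)
```

Carrying last_after_first into the second call mirrors what a saved job does when it retains the last value between runs.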

To import all the tables in one go, use the sqoop-import-all-tables command. It works only if the following conditions are met:

Each table must have a single-column primary key.

You must intend to import all columns of each table.

You must not intend to use a non-default splitting column, nor impose any conditions via a WHERE clause.
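The first condition is the one that can be checked mechanically before running the command. The sketch below assumes you have already collected each table's primary-key columns into a dictionary (how you obtain that metadata is left open); the function name and the sample tables are hypothetical.

```python
def tables_blocking_import_all(primary_keys):
    """Return the tables that lack a single-column primary key.

    primary_keys maps table name -> list of primary-key column names.
    """
    return sorted(
        name for name, columns in primary_keys.items() if len(columns) != 1
    )


# Hypothetical schema metadata.
schema = {
    "orders": ["id"],                         # single-column key: fine
    "order_items": ["order_id", "line_no"],   # composite key
    "audit_log": [],                          # no primary key
}

blocking = tables_blocking_import_all(schema)
print(blocking)
```

An empty result means the first condition holds for every table; the other two conditions concern how you intend to run the import and cannot be derived from the schema.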

Reference: https://hortonworks.com/community/forums/topic/sqoop-incremental-import/
