Sqoop Hive table import, Table dataType doesn't match with database
In addition to above answers we may also have to observe when the error is coming, e.g.
In my case I had two types of data columns that caused error: json and binary
for json column the error came while a Java Class was executing, at the very beginning of the import process :
/04/19 09:37:58 ERROR orm.ClassWriter: Cannot resolve SQL type
for binary column, error was thrown while importing into the hive tables (after data is imported and put into HDFS files)
16/04/19 09:51:22 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column featured_binary
To get rid of these two errors, I had to provide the following options
--map-column-java column1_json=String,column2_json=String,featured_binary=String --map-column-hive column1_json=STRING,column2_json=STRING,featured_binary=STRING
In summary, we may have to provide the
--map-column-java
or
--map-column-hive
depending upon the failure.
You can use the parameter --map-column-hive
to override default mapping. This parameter expects a comma-separated list of key-value pairs separated by = to specify which column should be matched to which type in Hive.
sqoop import \ ... --hive-import \ --map-column-hive id=STRING,price=DECIMAL
A new feature was added with sqoop-2103/sqoop 1.4.5 that lets you call out the decimal precision with the map-column-hive parameter. Example:
--map-column-hive 'TESTDOLLAR_AMT=DECIMAL(20%2C2)'
This syntax would define the field as a DECIMAL(20,2)
. The %2C
is used as a comma and the parameter needs to be in single quotes if submitting from the bash shell.
I tried using Decimal with no modification and I got a Decimal(10,0)
as a default.