Sqoop export from HDFS to Oracle error
Instead of changing your data files in Hadoop, you should use the --map-column-java argument in your sqoop export.
If you have, for example, two DATE columns named DATE_COLUMN_1 and DATE_COLUMN_2 in your Oracle table, then you can add the following argument to your sqoop command:

    --map-column-java DATE_COLUMN_1=java.sql.Date,DATE_COLUMN_2=java.sql.Date
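For context, here is a minimal sketch of a full export command; the connection string, username, table name, directory, and delimiter are placeholders for your own values:

    sqoop export \
        --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
        --username scott \
        -P \
        --table YOUR_TABLE \
        --export-dir /data/your_table \
        --fields-terminated-by ',' \
        --map-column-java DATE_COLUMN_1=java.sql.Date,DATE_COLUMN_2=java.sql.Date

(-P prompts for the password interactively.)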
As mentioned before, the JDBC format has to be used in your Hadoop text file, but in this case yyyy-mm-dd will work.
From http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html#_dates_and_times:
Oracle JDBC represents DATE and TIME SQL types as TIMESTAMP values. Any DATE columns in an Oracle database will be imported as a TIMESTAMP in Sqoop, and Sqoop-generated code will store these values in java.sql.Timestamp fields.
When exporting data back to a database, Sqoop parses text fields as TIMESTAMP types (with the form yyyy-mm-dd HH:MM:SS.ffffffff) even if you expect these fields to be formatted with the JDBC date escape format of yyyy-mm-dd. Dates exported to Oracle should be formatted as full timestamps.
So you would need to format the dates in your files to conform to the format yyyy-mm-dd HH:MM:SS.ffffffff before exporting to Oracle.
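For instance, here is a minimal sketch, assuming comma-delimited files with the date in the third field (the paths, delimiter, and field index are placeholders; adjust them to your layout). Since the files live in HDFS, you can stream them through awk:

    hdfs dfs -cat /data/your_table/part-m-00000 \
        | awk 'BEGIN { FS = OFS = "," } { $3 = $3 " 00:00:00.0"; print }' \
        | hdfs dfs -put - /data/your_table_fixed/part-m-00000

Appending a midnight time such as 00:00:00.0 should be enough to satisfy the full-timestamp form when the original values are plain dates.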
EDIT: Answering the comment:

"There are around 70 files (tables) in HDFS I need to export. So, in all files I need to change the date from yyyy-mm-dd to yyyy-mm-dd HH:MM:SS.ffffffff; is there any simple way to format it?"
Well, you could write an awk script to do that for you (along the lines of the sketch above, wrapped in a shell loop over the 70 files). Or else you can check if the idea below works:
- Create a new temporary table TEMPIMPORT with the same structure as table TW5T0, except changing the column that has the DATE datatype to VARCHAR2.
- Load using Sqoop into the new temporary table TEMPIMPORT, then run the DML below to move the data back into TW5T0 (and commit, of course):

    insert into tw5t0 (select [[all_your_columns_here_except_date_column]], to_date(date_column, 'yyyy-mm-dd') from tempimport);
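For the first step, here is a minimal DDL sketch; the column names and types are placeholders and should mirror TW5T0's real structure:

    -- Placeholder columns: copy TW5T0's actual definition here.
    CREATE TABLE tempimport (
        col1        NUMBER,
        col2        VARCHAR2(100),
        date_column VARCHAR2(10)   -- was DATE in TW5T0; holds the 'yyyy-mm-dd' strings
    );

Once the insert into TW5T0 has been committed, TEMPIMPORT can simply be dropped.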