Sqoop export to SQL Server fails/hangs for more number of columns

Tags: hadoop



We faced the same issue on our end: a sqoop export to a SQL Server table reached 100% and then just hung until the 10-minute timeout was reached, after which the job failed. On investigation, we found that the cause was in fact a composite primary key violation on the SQL Server side, which we had no visibility into from our Hadoop cluster. Once we resolved this PK violation, the sqoop export completed successfully.
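To catch this before exporting, you can scan the delimited export file on the Hadoop side for duplicate composite keys. A minimal sketch (the file path, the pipe delimiter, and the key column positions are illustrative assumptions, not taken from your job):

```python
import csv
from collections import Counter

def find_duplicate_keys(path, key_indices, delimiter="|"):
    """Return composite-key tuples that appear more than once in a delimited file."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            # Build the composite key from the configured column positions
            counts[tuple(row[i] for i in key_indices)] += 1
    return [key for key, n in counts.items() if n > 1]
```

Any key this returns would trip a composite primary key on those columns during the export, so those rows need to be deduplicated or filtered before running sqoop export.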

I would also like to point out that access privileges were not the issue; we verified this by successfully running an insert via sqoop eval, which completed with no problems.

As the next step on your end, I would recommend first testing your write privileges by running sqoop eval. Once you have confirmed that you can insert records into your target table via sqoop eval, list all the constraints your target SQL Server table enforces, and then add the appropriate logic in your data lake to keep records that would violate them from being exported. If the data you export does not violate any constraints on the SQL Server side, your sqoop export issue should be resolved. Let us know if this does not fix the problem you are facing.
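As a sketch of both steps (the connection string, credentials, table, and column names below are placeholders matching the question, not your real values):

```
# 1. Verify write privileges with a throwaway insert (clean it up afterwards)
sqoop eval \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --query "INSERT INTO tbldummy2 (col1, col2, col3) VALUES ('t1', 't2', 't3')"

# 2. List the constraints the target table enforces (PK, unique, FK, check)
sqoop eval \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --query "SELECT name, type_desc FROM sys.objects WHERE parent_object_id = OBJECT_ID('tbldummy2') AND type IN ('PK','UQ','F','C')"
```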


It seems that you have a problem with the privileges of your xxxxxx user in your xxxxxxx database. In an export operation, after the map phase the job tries to execute the insert/update statements, but if the user does not have the necessary permissions it can get stuck. Try assigning the db_datawriter role to your user. Another option, if possible, is to run the operation under the sa account, just to check whether this is the cause.
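If that turns out to be the cause, granting the role would look something like this on the SQL Server side (the user and database names are the placeholders from the question):

```
USE xxxxxxx;
ALTER ROLE db_datawriter ADD MEMBER xxxxxx;  -- SQL Server 2012 and later
-- On older versions: EXEC sp_addrolemember 'db_datawriter', 'xxxxxx';
```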


Your error log does not show much of the stack trace; to understand the error, I would suggest checking the YARN logs for the failed containers.
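For example, the aggregated logs for the failed job can be pulled with the YARN CLI (the application id below is a placeholder; take the real one from the ResourceManager UI or the sqoop console output):

```
yarn logs -applicationId application_XXXXXXXXXXXXX_XXXX
```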

Before you check for issues on the SQL Server side, try the tweaked sqoop job below with the changes I've made; I'm fairly sure it will solve the issue you are facing.

# Changes made:
# - Increase the number of mappers to 8 or 10 for a faster process
# - Columns mapping: list your SQL Server column names in the order that matches your file

sqoop export \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --table tbldummy2 \
  --export-dir /user/hue/Out2 \
  --input-fields-terminated-by '|' \
  -m <increase to a higher number depending on your cluster> \
  --columns "col1,col2,col3" \
  --verbose