Sqoop export to SQL Server fails/hangs for more number of columns

Tags: hadoop



We faced the same issue on our end: a sqoop export to a SQL Server table reached 100% and then just hung until the 10-minute timeout was reached, after which the job failed. On investigation, we found that the cause was in fact a composite primary key violation on the SQL Server side, which we had no visibility into from our Hadoop cluster. Once we resolved this PK violation, the sqoop export completed successfully.
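To catch this before exporting, you can scan the delimited export file on the Hadoop side for duplicate composite keys. A minimal sketch (the file path, the pipe delimiter, and the key column positions are illustrative assumptions, not taken from your job):

```python
import csv
from collections import Counter

def find_duplicate_keys(path, key_indices, delimiter="|"):
    """Return composite-key tuples that appear more than once in a delimited file."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            # Build the composite key from the configured column positions
            counts[tuple(row[i] for i in key_indices)] += 1
    return [key for key, n in counts.items() if n > 1]
```

Any key this returns would trip a composite primary key on those columns during the export, so those rows need to be deduplicated or filtered before running sqoop export.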

I would also like to point out that access privileges were not the issue; we verified this by successfully running an insert via sqoop eval, which completed with no problems.

As the next step on your end, I would recommend first testing your write privileges by running sqoop eval. Once you have confirmed that you can insert records into your target table via sqoop eval, list all the constraints your target SQL Server table enforces, and then add the appropriate logic in your data lake to keep records that would violate them from being exported. If the data you export does not violate any constraints on the SQL Server side, your sqoop export issue should be resolved. Let us know if this does not fix the problem you are facing.
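As a sketch of both steps (the connection string, credentials, table, and column names below are placeholders matching the question, not your real values):

```
# 1. Verify write privileges with a throwaway insert (clean it up afterwards)
sqoop eval \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --query "INSERT INTO tbldummy2 (col1, col2, col3) VALUES ('t1', 't2', 't3')"

# 2. List the constraints the target table enforces (PK, unique, FK, check)
sqoop eval \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --query "SELECT name, type_desc FROM sys.objects WHERE parent_object_id = OBJECT_ID('tbldummy2') AND type IN ('PK','UQ','F','C')"
```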


It seems that you have a problem with the privileges of your xxxxxx user in your xxxxxxx database. In an export operation, after the map phase the job tries to execute the insert/update statements, but if the user does not have the necessary permissions it can get stuck. Try assigning the db_datawriter role to your user. Another option, if possible, is to run the operation under the sa account, just to check whether this is the cause.
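If that turns out to be the cause, granting the role would look something like this on the SQL Server side (the user and database names are the placeholders from the question):

```
USE xxxxxxx;
ALTER ROLE db_datawriter ADD MEMBER xxxxxx;  -- SQL Server 2012 and later
-- On older versions: EXEC sp_addrolemember 'db_datawriter', 'xxxxxx';
```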


Your error log does not show much of the stack trace; to understand the error, I would suggest checking the YARN logs for the failed containers.
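For example, the aggregated logs for the failed job can be pulled with the YARN CLI (the application id below is a placeholder; take the real one from the ResourceManager UI or the sqoop console output):

```
yarn logs -applicationId application_XXXXXXXXXXXXX_XXXX
```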

Before you check for issues on the SQL Server side, try the tweaked sqoop job below with the changes I've made; I'm fairly sure it will solve the issue you are facing.

# Changes made:
# - Increase the number of mappers to 8 or 10 for a faster process
# - Columns mapping: list your SQL Server column names in the order that matches your file

sqoop export \
  --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
  --username xxxxxx --password yyyyyy \
  --table tbldummy2 \
  --export-dir /user/hue/Out2 \
  --input-fields-terminated-by '|' \
  -m <increase to a higher number depending on your cluster> \
  --columns "col1,col2,col3" \
  --verbose