Is there a way to multithread a SqlDataReader?


Set up a producer/consumer queue, with one producer process that pulls from the reader and queues records as fast as it can, but does no "processing". Then have some other number of processes (how many you want depends on your system) dequeue and process each queued record.
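A minimal sketch of that producer/consumer setup, using `BlockingCollection<T>` as the bounded queue. The row counts, consumer count, and the stand-in producer loop are assumptions for illustration; the comments show where the real `SqlDataReader` would plug in:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class ProducerConsumerDemo
{
    // Drain `rows` records through a bounded queue with `consumerCount`
    // consumers; returns how many records the consumers processed.
    public static long Run(int rows, int consumerCount)
    {
        // Bounded so the producer blocks instead of exhausting memory
        // when consumers fall behind.
        var queue = new BlockingCollection<object[]>(boundedCapacity: 1000);

        // Producer: with a real SqlDataReader this loop would be
        //   while (reader.Read()) {
        //       var row = new object[reader.FieldCount];
        //       reader.GetValues(row);
        //       queue.Add(row);
        //   }
        // A counter stands in for the reader so the sketch is runnable.
        var producer = Task.Run(() =>
        {
            for (int i = 0; i < rows; i++)
                queue.Add(new object[] { i });
            queue.CompleteAdding();   // tell consumers there are no more rows
        });

        long processed = 0;
        var consumers = new Task[consumerCount];
        for (int c = 0; c < consumerCount; c++)
            consumers[c] = Task.Run(() =>
            {
                foreach (var row in queue.GetConsumingEnumerable())
                    Interlocked.Increment(ref processed); // real work goes here
            });

        producer.Wait();
        Task.WaitAll(consumers);
        return Interlocked.Read(ref processed);
    }

    static void Main() => Console.WriteLine(Run(10_000, 4)); // prints 10000
}
```

The bounded capacity is the important knob: it keeps the reader from racing arbitrarily far ahead of the workers.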


You shouldn't read that many rows on the client.

That being said, you can partition your query into multiple queries and execute them in parallel: launch multiple SqlCommands in separate threads and have each churn through a partition of the result. The A+ question is how to partition the result, and this depends largely on your data and your query:

  1. You can use a range of keys (e.g. ID between 1 and 10000, ID between 10001 and 20000, etc.)
  2. You can use an attribute (eg. RecordTypeID IN (1,2), RecordTypeID IN (3,4) etc)
  3. You can use a synthetic range (i.e. ROW_NUMBER() BETWEEN 1 AND 1000, etc.), but this is very problematic to pull off right
  4. You can use a hash (e.g. BINARY_CHECKSUM(*) % 10 = 0, BINARY_CHECKSUM(*) % 10 = 1, etc.)

You just have to be very careful that the partition queries do not overlap and block each other during execution (i.e. scan the same records and acquire exclusive (X) locks), thus serializing each other.
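As a sketch of option 1, assuming a reasonably dense integer key (the `Records` table and `ID` column are made-up names), you can compute contiguous, non-overlapping ranges and back one ranged `SqlCommand` with each:

```csharp
using System;

class RangePartition
{
    // Split [min, max] into `parts` contiguous, non-overlapping ranges.
    // The last range absorbs any remainder so every key is covered once.
    public static (long Lo, long Hi)[] Ranges(long min, long max, int parts)
    {
        var result = new (long Lo, long Hi)[parts];
        long span = (max - min + 1) / parts;
        long lo = min;
        for (int i = 0; i < parts; i++)
        {
            long hi = (i == parts - 1) ? max : lo + span - 1;
            result[i] = (lo, hi);
            lo = hi + 1;
        }
        return result;
    }

    static void Main()
    {
        foreach (var (lo, hi) in Ranges(1, 20000, 2))
            Console.WriteLine($"{lo}-{hi}"); // prints 1-10000 then 10001-20000

        // Each range would back one command on its own thread, e.g.:
        //   new SqlCommand(
        //       "SELECT ... FROM Records WHERE ID BETWEEN @lo AND @hi", conn)
        // with @lo/@hi bound per range and one connection per thread.
    }
}
```

Because the ranges are disjoint, each query scans a different slice of the index and the workers avoid stepping on each other's locks.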


Is it a simple ranged query like WHERE Id BETWEEN 1 AND 500000? If so, you can just kick off N queries that each return 1/N of the range. But it helps to know where you are bottlenecked with the single-threaded approach.

If you are doing contiguous reads from one disk spindle to fulfill the query, then you should probably stick with a single thread. If it is partitioned across spindles by some range, then you can intelligently tune your queries to maximize throughput from disk (i.e. read from each disk in parallel with separate queries). If you expect all of the rows to be in memory, then you can parallelize at will.

But if the query is more complex, then you may not be able to easily partition it without incurring a bunch of overhead. Most of the time the above options will not apply well, and the producer/consumer that Joel mentioned will be the only place to parallelize. Depending on how much time you spend processing each row, this may provide only trivial gains.
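The "kick off N queries" part might look like this sketch: one task per sub-range, each with its own connection. The `work` delegate stands in for "open a connection, execute the ranged query, process the rows", so the skeleton runs without a database; the commented lines show where the real ADO.NET calls would go (`connString` is an assumption):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class ParallelRanges
{
    // Run one worker per sub-range of [min, max]; each worker gets the
    // bounds of its slice and returns a per-slice result to aggregate.
    public static long RunPartitioned(long min, long max, int n,
                                      Func<long, long, long> work)
    {
        long step = (max - min + 1) / n;
        var tasks = Enumerable.Range(0, n).Select(i => Task.Run(() =>
        {
            long lo = min + i * step;
            long hi = (i == n - 1) ? max : lo + step - 1;
            // With a real database each task would do something like:
            //   using var conn = new SqlConnection(connString); // assumed
            //   conn.Open();
            //   using var cmd = new SqlCommand(
            //       "SELECT ... WHERE Id BETWEEN @lo AND @hi", conn);
            //   ...ExecuteReader() and process rows...
            return work(lo, hi);
        })).ToArray();

        Task.WaitAll(tasks);
        return tasks.Sum(t => t.Result);
    }

    static void Main()
    {
        // Count "rows" per range as a stand-in for real per-row processing.
        long total = RunPartitioned(1, 500_000, 4, (lo, hi) => hi - lo + 1);
        Console.WriteLine(total); // prints 500000
    }
}
```

One connection per task matters: a SqlConnection (and the single SqlDataReader it can have open) is not safe to share across threads.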