Parallelizing tasks in Node.js


How do I make this be actually parallel?

First, you won't really be running in parallel inside a single node application. The JavaScript in a node application runs on a single thread, and node's event loop processes only one event at a time. Even on a multi-core box you won't get parallel execution of your JavaScript within one node process.

That said, you can get processing parallelism on a multicore machine by forking the code into separate node processes or by spawning child processes. This, in effect, lets you create multiple instances of node itself and communicate with those processes in different ways (e.g. stdout, or the IPC channel that fork sets up). Additionally, you could choose to separate the functions (by responsibility) into their own node apps/servers and call them via RPC.
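As a rough illustration of the message-passing route, here is a minimal sketch that hands a CPU-bound loop to a second node process and gets the answer back over an IPC channel. The helper name `sumInChildProcess` and the message shape `{ upTo }` / `{ sum }` are made up for this example. It uses `spawn` with an `'ipc'` stdio entry only so the child's source can live inline; with the worker in its own file, `child_process.fork('worker.js')` sets up the same kind of channel.

```javascript
const { spawn } = require('child_process');

// Source of the child program, kept inline so the sketch fits in one file.
// The child waits for one message, does the CPU-heavy work, and replies.
const childSource = `
  process.on('message', ({ upTo }) => {
    let sum = 0;
    for (let i = 1; i <= upTo; i++) sum += i;
    process.send({ sum });
  });
`;

// Hypothetical helper: runs the summation in a separate node process.
function sumInChildProcess(upTo) {
  return new Promise((resolve, reject) => {
    // The 'ipc' entry opens a message channel between parent and child.
    const child = spawn(process.execPath, ['-e', childSource], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('message', (msg) => {
      resolve(msg.sum);
      child.disconnect(); // close the channel so the child can exit
    });
    child.once('error', reject);
    child.once('exit', (code) => {
      // No effect if the promise has already been resolved.
      if (code !== 0) reject(new Error(`child exited with code ${code}`));
    });
    child.send({ upTo });
  });
}

// The parent's event loop stays free while the child is busy.
sumInChildProcess(1e6).then((sum) => console.log('sum from child:', sum));
```

The two processes share no memory; everything that crosses the channel is serialized, so this pays off mainly when the computation is expensive relative to the data being passed.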

What is the thing done typically by async code to not block the caller (when working with NodeJS)? Is it starting a child process?

It is not starting a new process. Underneath, async.parallel in node.js relies on deferral mechanisms such as process.nextTick(). nextTick() lets you avoid blocking the caller by deferring work until the current call stack has unwound, so you can break up and interleave CPU-intensive tasks, etc. All of it still runs on the same thread.
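To make the deferral concrete, here is a small sketch of what "not blocking the caller" looks like with process.nextTick(). The function `computeLater` and the `order` array are invented for illustration; the point is only that the caller's code finishes before the callback runs, with no extra process or thread involved.

```javascript
const order = [];

// Hypothetical async-style function: it hands the result to the callback
// after the caller's current stack has finished, not before returning.
function computeLater(x, callback) {
  process.nextTick(() => {
    order.push('callback');
    callback(null, x * 2);
  });
}

order.push('before call');
computeLater(21, (err, result) => {
  console.log('result:', result);  // prints "result: 42"
  console.log(order.join(' -> ')); // prints "before call -> after call -> callback"
});
order.push('after call');
```

One caveat: callbacks queued with process.nextTick() run before the event loop moves on to pending I/O, so for splitting a long computation into chunks setImmediate() is usually the safer choice.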

Long story short

Node doesn't make it easy "out of the box" to achieve multiprocessor concurrency. Instead, node gives you a non-blocking design and an event loop that runs your code on one thread with no shared memory. Since no two threads ever touch the same data, locks aren't needed: node is lock free. One node process uses one thread for your JavaScript, and this makes node both safe and powerful.

When you need to split work up among multiple processes, use some form of message passing (e.g. IPC or RPC) to communicate with the other processes or servers.


For more see:

Awesome answer from SO on What is Node.js... with tons of goodness.

Understanding process.nextTick()


Asynchronous and parallel are not the same thing. Asynchronous means the caller does not wait for an operation to finish before moving on. Parallel means doing multiple things at the same time. Node.js is asynchronous, but your JavaScript only ever runs on 1 thread, so it can only work on 1 thing at once. If you have a long-running computation, you should start another process and then have your node.js process asynchronously wait for the results.

To do this you could use child_process.spawn and then read the data from the child's stdout.

http://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options

    var spawn = require('child_process').spawn;

    // Run the computation in a separate process.
    var process2 = spawn('sh', ['./computationProgram', 'parameter']);

    process2.stderr.on('data', function (data) {
        // handle error output from the child
    });

    process2.stdout.on('data', function (data) {
        // handle data results from the child
    });
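If you would rather await a single result than handle individual 'data' events, the spawn call above can be wrapped in a promise. This is only a sketch: `runAndCollect` is a hypothetical helper, and a second node process doing trivial arithmetic stands in for the real computation program.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: runs a command and resolves with everything it
// wrote to stdout, or rejects if it fails to start or exits non-zero.
function runAndCollect(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let out = '';
    let err = '';
    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.stderr.on('data', (chunk) => { err += chunk; });
    child.on('error', reject); // e.g. the command could not be started
    child.on('close', (code) => {
      if (code === 0) resolve(out);
      else reject(new Error(`exited with code ${code}: ${err}`));
    });
  });
}

// Stand-in for a long computation: another node process doing arithmetic.
runAndCollect(process.execPath, ['-e', 'console.log(6 * 7)'])
  .then((result) => console.log('child printed:', result.trim())); // child printed: 42
```

The parent keeps serving other events while the child runs; the `then` callback fires once the child's output streams have closed.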


Keep in mind I/O is parallelized by Node.js; only your JavaScript callbacks are single threaded.
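A quick way to see this is to start several file reads before any of them has finished. The sketch below assumes nothing beyond node's built-in fs, os and path modules; the helper name `readAllConcurrently` and the file names are made up for the example.

```javascript
const fs = require('fs').promises;
const os = require('os');
const path = require('path');

// Hypothetical helper: writes a few small files, then reads them all back
// with every read in flight at the same time.
async function readAllConcurrently(names) {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'parallel-io-'));
  const files = names.map((name) => path.join(dir, name));
  await Promise.all(files.map((file, i) => fs.writeFile(file, `contents ${i}`)));

  // Each readFile call is issued before any of them has completed.
  // Node performs the reads outside the JavaScript thread; only the
  // resulting callbacks are run one at a time.
  const contents = await Promise.all(files.map((file) => fs.readFile(file, 'utf8')));

  await fs.rm(dir, { recursive: true, force: true });
  return contents;
}

readAllConcurrently(['a.txt', 'b.txt', 'c.txt'])
  .then((contents) => console.log(contents));
```

Promise.all preserves the order of the input array, so the results line up with the file names regardless of which read finishes first.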

Assuming you are writing a server, an alternative to adding the complexity of spawning processes or forking is to simply build stateless node servers and run an instance per core, or better yet run many instances each in their own virtualized micro server. Coordinate incoming requests using a reverse proxy or load balancer.

You could also offload computation to another server, maybe MongoDB (using MapReduce) or Hadoop.

To be truly hardcore, you could write a Node plugin in C++ and have fine-grained control over parallelizing the computation code. The speedup from C++ might remove the need for parallelization anyway.

You can always write the computationally intensive tasks in another language better suited to numeric computation and, for example, expose them through a REST API.

Finally, you could perhaps run the code on the GPU using node-cuda or something similar depending on the type of computation (not all can be optimized for GPU).

Yes, you can fork and spawn other processes, but it seems to me one of the major advantages of node is not having to worry much about parallelization and threading, and therefore bypassing a great amount of complexity altogether.