
Distributed computing vs threads


I've never found them to be very similar. Let me define, for the purposes of this post, a "node" to be one hardware thread running on one machine. So a quad-core machine is four nodes, as is a cluster of four single-processor boxes.

Each node will typically be running some processing, and there will need to be some type of cross-node communication. Usually the first instance of this communication is telling the node what to do. For this communication, I can use shared memory, semaphores, shared files, named pipes, sockets, remote procedure calls, distributed COM, etc. But the easiest ones to use, shared memory and semaphores, are not typically available across a network. Shared files may be available, but performance is typically poor. Over a network, sockets tend to be the most common and most flexible choice; the more sophisticated mechanisms see less use. At that point you have to deal with the details of network architecture, including latency, bandwidth, packet loss, network topology, and more.
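To make the "telling the node what to do" step concrete, here is a minimal sketch (not from the original post) of dispatching a job to a node over a loopback socket with JSON framing. The job format and the use of a thread to stand in for a remote machine are my own illustrative assumptions.

```python
# Hypothetical sketch: sending a job to a "node" over a socket.
# A thread on loopback stands in for a remote machine.
import json
import socket
import threading

def node(server_sock):
    conn, _ = server_sock.accept()
    with conn:
        job = json.loads(conn.recv(4096).decode())
        result = sum(job["values"])          # stand-in for real work
        conn.sendall(json.dumps({"result": result}).encode())

server = socket.socket()
server.bind(("127.0.0.1", 0))                # port 0 = any free port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=node, args=(server,))
t.start()

with socket.create_connection(("127.0.0.1", port)) as c:
    c.sendall(json.dumps({"values": [1, 2, 3]}).encode())
    reply = json.loads(c.recv(4096).decode())
t.join()
server.close()
print(reply["result"])   # 6
```

Even this toy version has to pick a wire format and a framing scheme, which is exactly the kind of detail shared memory never makes you think about.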

If you start with a queue of work, nodes on the same machine can use simple shared memory to get things to do. You can even implement it lock-free and it will work seamlessly. With nodes over a network, where do you put the queue? If you centralize it, that machine may suffer very high bandwidth costs. Try to distribute it and things get very complicated very quickly.
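The same-machine case really is this simple. Here's a sketch of the shared-memory work queue described above; note that Python's `queue.Queue` uses an internal lock rather than being lock-free, but the access pattern is the same.

```python
# Threads on one machine pulling jobs straight out of process memory.
import queue
import threading

jobs = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        job = jobs.get()
        if job is None:              # sentinel: no more work
            break
        with results_lock:
            results.append(job * job)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(8):
    jobs.put(n)
for _ in threads:
    jobs.put(None)                   # one sentinel per worker
for t in threads:
    t.join()

print(sorted(results))   # [0, 1, 4, 9, 16, 25, 36, 49]
```

There is no serialization, no framing, and no decision about where the queue lives: it lives in the one address space every node already shares.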

What I've found, in general, is that the people tackling this type of parallel architecture tend to choose embarrassingly parallel problems to solve. Raytracing comes to mind. There's not much cross-node communication required, apart from job distribution. There are many problems like this, to be sure, but I find it a bit disingenuous to suggest that distributed computing is essentially the same as threading.

Now if you're going to write threaded code that behaves identically to a distributed system, using pure message passing and never assuming any thread is the "main" one and so on, then yes, they're going to be very similar. But what you've done is pretend you have a distributed architecture and implement it in threads. The thing is that threading is a much simpler case of parallelism than true distributed computing. You can abstract the two into a single problem, but only by choosing the harder version and sticking strictly to it. And the results won't be as good as they could be when all of the nodes are local to one machine, because you're not taking advantage of the special case.
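For illustration, here's a hypothetical sketch of that style: threads written as if they were distributed nodes, each owning its state, communicating only through mailbox queues. The node names and message format are made up for the example.

```python
# Threads pretending to be distributed nodes: no shared state,
# every interaction is a (sender, payload) message in a mailbox.
import queue
import threading

def doubler_node(inbox, peers):
    # This "node" doubles any number it receives and mails it back.
    while True:
        sender, msg = inbox.get()
        if msg == "stop":
            break
        peers[sender].put(("doubler", msg * 2))

mailboxes = {"main": queue.Queue(), "doubler": queue.Queue()}
t = threading.Thread(target=doubler_node,
                     args=(mailboxes["doubler"], mailboxes))
t.start()

mailboxes["doubler"].put(("main", 21))
sender, answer = mailboxes["main"].get()
mailboxes["doubler"].put(("main", "stop"))
t.join()
print(answer)   # 42
```

Notice the cost of the abstraction: a value that could have been read directly out of shared memory instead makes two queue round trips.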


Distributed computing is done over multiple independent machines, sometimes with specialized operating systems. It's harder because the interconnectedness of the machines is much lower, and therefore problems which require a lot of quick, random access to the entire dataset are very difficult to solve.

Generally speaking, you need specialized libraries to solve distributed computing problems, libraries that figure out how to assign nodes to problems and cart around the data.

I really wonder if they are coming to different conclusions because they are trying to solve the wrong problems on each platform. Some problems map very nicely onto highly interconnected machines and can benefit from really powerful supercomputers. Other problems can be dealt with on simple distributed models. In general, supercomputers can solve a wider range of problems, but are much, much more specialized and expensive.


The difference seems to come back to: threads share state, processes pass messages.

You need to decide how you want to maintain state in your app before choosing one.

Shared state is easy to get started with: all the data and variables are just there. But once deadlocks/race conditions enter, it's hard to modify/scale.

Message passing (e.g. Erlang) requires a different approach to design: you have to think about opportunities for concurrency from the beginning. But the state of each distributed process is isolated, making locking/race problems easier to deal with.
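An Erlang-flavoured sketch of that isolation, written in Python for illustration: the counter's state lives inside one thread and is mutated only by messages, so there is no lock and no possible race on it. The message protocol here is invented for the example.

```python
# Actor-style counter: state is private to one thread, and the only
# way to touch it is to send a message to the thread's inbox.
import queue
import threading

def counter_actor(inbox):
    count = 0                      # private state; no other thread sees it
    while True:
        msg, reply_to = inbox.get()
        if msg == "incr":
            count += 1
        elif msg == "get":
            reply_to.put(count)
        elif msg == "stop":
            break

inbox = queue.Queue()
t = threading.Thread(target=counter_actor, args=(inbox,))
t.start()

for _ in range(100):
    inbox.put(("incr", None))
reply = queue.Queue()
inbox.put(("get", reply))
total = reply.get()
inbox.put(("stop", None))
t.join()
print(total)                       # 100
```

The trade-off is exactly as described above: you must design the message protocol up front, but once you have, concurrent senders can't corrupt the state no matter how they interleave.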