Multiple threads inside docker container


A container as such has nothing to do with the computation you need to perform. The question you are really asking is whether the work should be done by multiple processes or by multiple threads spawned from a single process.

A container is just a platform for running your application in the environment you want. Period. That means you would be running a process inside a container to execute your business logic. Multiple containers simply means multiple processes, and the usual advice is to prefer multiple threads over multiple processes: spawning a new process (in your case, a new container) consumes more resources, memory in particular, than spawning a thread. So it is better to have just one container whose process spawns multiple threads to do the job for you.

However, it also depends on the configuration of the machine on which the container is started. If the multicore capabilities of the underlying hardware make it worthwhile to spawn multiple containers, each running multiple threads, you should do that as well.
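As a rough illustration of letting the machine's configuration drive that decision, here is a minimal Python sketch. The helper names (available_cpus, plan_workers) and the "split the CPUs evenly" policy are assumptions made for the example, not a recommendation from Docker or from the standard library.

```python
import os


def available_cpus():
    """Best-effort count of the CPUs this process is allowed to run on."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux).
    # It reflects CPU pinning, which os.cpu_count() does not take into account.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def plan_workers(cpus, containers=1):
    """Hypothetical policy: share the CPUs evenly, at least one thread each."""
    if containers < 1:
        raise ValueError("containers must be >= 1")
    return max(1, cpus // containers)


if __name__ == "__main__":
    cpus = available_cpus()
    print(f"CPUs visible to this process: {cpus}")
    print(f"threads per container, 1 container:  {plan_workers(cpus)}")
    print(f"threads per container, 2 containers: {plan_workers(cpus, 2)}")
```

Running this inside the container shows what the process can actually see, which may differ from what the host has if the container was started with CPU restrictions.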


Short answer:

Run your program as a single Docker container. Think of a Docker container as a lightweight isolated environment, akin to a virtual environment, where you can run a program/service. This service can run multiple threads, all launched from the parent program; it is still one service running in a single Docker container.

Explanation:

Let's assume you have a program that spawns threads to do some work. This program might use a thread pool to do some computation on a set of chunks, or it could be a web server like Apache. It could even be some Python code that instantiates a process pool to do the heavy computation. In all these cases, all the threads and processes belong to a master process that can be thought of as a single program or service. This single program is triggered through a single user command, the command that you specify in the Dockerfile ENTRYPOINT.
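To make the "one master process, many threads" idea concrete, here is a minimal Python sketch of a thread pool working through a set of chunks. The function names (process_chunk, run_service) and the summing workload are hypothetical stand-ins for real business logic; the point is only that every worker thread reports the same process ID.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Hypothetical unit of work: sum one chunk and record who ran it."""
    return sum(chunk), os.getpid(), threading.current_thread().name


def run_service(data, chunk_size=4, max_workers=4):
    """Split data into chunks and process them on a pool of threads."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input chunks.
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    for total, pid, thread_name in run_service(list(range(16))):
        print(f"sum={total} pid={pid} thread={thread_name}")
```

The thread names vary from line to line, but the pid is the same on every line: from the outside this is one program, so a single container started by a single command is enough to run it.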

For example, you can run an Apache server container using the official Apache image on Docker Hub:

docker run -dit --name my-apache-app -v "$PWD":/usr/local/apache2/htdocs/ httpd:2.4

This runs the Apache web server as a single container, irrespective of how many threads it executes. The operator can easily refer to that one container by name when it needs to be stopped, restarted, deleted, etc., using the docker commands. This is also more convenient, as we don't need to worry about mounting volumes, opening ports, and linking multitudes of containers so that they can communicate with each other.

So the main point is that you want to spawn one container per service instance. If you wanted to launch duplicate instances of the parent process, for example to run Apache on two machines as part of a load-balanced configuration, then you would run two containers, one on each host.

As an aside, if you had a use case where you needed to run diverse jobs in a batch system, where each job required specific libraries to be installed, then that type of use case would benefit from the environment isolation that one would achieve by running different containers. But this is not what you asked. Your question specifically mentioned a web server spawning threads, and processes using threads to do work on chunks, and for those cases you spawn a single container for the service/program.