Faster forking of large processes on Linux? Faster forking of large processes on Linux? linux linux

Faster forking of large processes on Linux?


On Linux, you can use posix_spawn(2) with the POSIX_SPAWN_USEVFORK flag to avoid the overhead of copying page tables when forking from a large process.

See Minimizing Memory Usage for Creating Application Subprocesses for a good summary of posix_spawn(2), its advantages and some examples.

To take advantage of vfork(2), make sure you #define _GNU_SOURCE before #include <spawn.h> and then simply posix_spawnattr_setflags(&attr, POSIX_SPAWN_USEVFORK)

I can confirm that this works on Debian Lenny, and provides a massive speed-up when forking from a large process.

benchmarking the various spawns over 1000 runs at 100M RSS                            user     system      total        realfspawn (fork/exec):     0.100000  15.460000  40.570000 ( 41.366389)pspawn (posix_spawn):   0.010000   0.010000   0.540000 (  0.970577)


Outcome: I was going to go down the early-spawned helper subprocess route as suggested by other answers here, but then I came across this re using huge page support to improve fork performance.

Having tried it myself using libhugetlbfs to simply make all my app's mallocs allocate huge pages, I'm now getting around 2400 forks/s regardless of the process size (over the range I'm interested in anyway). Amazing.


Did you actually measure how much time forks take? Quoting the page you linked,

Linux never had this problem; because Linux used copy-on-write semantics internally, Linux only copies pages when they changed (actually, there are still some tables that have to be copied; in most circumstances their overhead is not significant)

So the number of forks doesn't really show how big the overhead will be. You should measure the time consumed by forks, and (which is a generic advice) consumed only by the forks you actually perform, not by benchmarking maximum performance.

But if you really figure out that forking a large process is a slow, you may spawn a small ancillary process, pipe master process to its input, and receive commands to exec from it. The small process will fork and exec these commands.

posix_spawn()

This function, as far as I understand, is implemented via fork/exec on desktop systems. However, in embedded systems (particularly, in those without MMU on board), processes are spawned via a syscall, interface to which is posix_spawn or a similar function. Quoting the informative section of POSIX standard describing posix_spawn:

  • Swapping is generally too slow for a realtime environment.

  • Dynamic address translation is not available everywhere that POSIX might be useful.

  • Processes are too useful to simply option out of POSIX whenever it must run without address translation or other MMU services.

Thus, POSIX needs process creation and file execution primitives that can be efficiently implemented without address translation or other MMU services.

I don't think that you will benefit from this function on desktop if your goal is to minimize time consumption.