
About pointers after fork()


The key here is the concept of a virtual address space.

Modern processors (say, anything from the 80386 onward) have a memory management unit (MMU) that maps a per-process virtual address space to physical memory pages, under control of the kernel.

When the kernel sets up a process, it creates a set of page table entries for that process that define the mapping from virtual addresses to physical memory pages, and it is in this virtual address space that the program executes.

Conceptually, when you fork, the kernel copies the existing process's pages to a new set of physical pages and sets up the new process's page tables so that, as far as the new process is concerned, it appears to be running in the same virtual memory layout as the original one had, while actually addressing entirely different physical memory.

The detail is more subtle, as nobody wants to waste time copying hundreds of MB of data unless it is necessary. When the process calls fork(), the kernel sets up a second set of page table entries (for the new process) but points them at the same physical pages as the original process. It then sets a flag in both sets of page table entries to make the MMU treat those pages as read-only.

As soon as either process writes to such a page, the MMU generates a page fault (because the page table entry has the read-only flag set). The page fault handler then allocates a new page of physical memory, copies the data over, updates the faulting process's page table entry to point at the new page, and makes the page writable again. In this way, a page is only actually copied the first time either process tries to change it, and the sleight of hand goes completely unnoticed by either process.

Regards, Dan.


Logically, the fork()ed process gets its own, independent copy of more or less the whole state of the parent process. That couldn't work if pointers in the child referred to memory belonging to the parent.

The details of how a particular UNIX-like kernel makes that work can vary. Linux implements the child process's memory via copy-on-write pages, which makes fork()ing cheap relative to other possible implementations. In that case, the child's pointers really do refer to the same physical memory as the parent's, up until either child or parent tries to modify that memory, at which point the writing process gets its own copy of the affected page. That all relies on the underlying virtual memory system. Other UNIX and UNIX-like systems can do it differently, and some have.


The child modified a pointer that is perfectly legal in its address space, because that address space is a copy of its parent's. There was no effect on the parent because the memory is not logically shared. Each process gets to go its separate way after the fork.

UNIX has a number of ways of creating shared memory (where one process can modify memory and have that modification seen by another process), but fork is not one of them. And it's a good thing because otherwise, synchronization between the parent and child would be almost impossible.