
Understanding concurrent file writes from multiple processes


Atomicity of writes of less than PIPE_BUF bytes applies only to pipes and FIFOs. For file writes, POSIX says:

This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control.

...which means that you're on your own: different UNIX-likes give different guarantees.
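
For illustration, here's a minimal sketch of the sort of concurrency control POSIX has in mind, using advisory fcntl() record locks; the helper name locked_write() is made up for this example:

    #include <fcntl.h>
    #include <unistd.h>

    /* Serialise writers with a POSIX record lock (fcntl F_SETLKW).
     * The lock is advisory: it only works if every cooperating
     * writer takes it before writing. */
    static ssize_t locked_write(int fd, const void *buf, size_t len)
    {
        struct flock lk = { 0 };
        lk.l_type = F_WRLCK;      /* exclusive write lock */
        lk.l_whence = SEEK_SET;
        lk.l_start = 0;
        lk.l_len = 0;             /* 0 = lock the whole file */

        if (fcntl(fd, F_SETLKW, &lk) == -1)   /* block until acquired */
            return -1;

        ssize_t n = write(fd, buf, len);

        lk.l_type = F_UNLCK;
        fcntl(fd, F_SETLK, &lk);              /* release the lock */
        return n;
    }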


Firstly, O_APPEND (or the equivalent FILE_APPEND_DATA on Windows) means that increments of the maximum file extent (the file "length") are atomic under concurrent writers, and by any amount, not just PIPE_BUF. This is guaranteed by POSIX, and Linux, FreeBSD, OS X and Windows all implement it correctly. Samba also implements it correctly; NFS before v5 does not, as it lacks the wire-format capability to append atomically. So if you open your file append-only, concurrent writes will not tear with respect to one another on any major OS unless NFS is involved.
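
For example (a sketch, with a made-up file name), a log-style writer relies on this by opening with O_APPEND and emitting each record in exactly one write() call:

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        /* With O_APPEND the kernel moves to EOF and writes in one
         * atomic step, so concurrent appenders cannot overwrite
         * each other's records. */
        int fd = open("app.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd == -1)
            return 1;

        static const char msg[] = "one whole record per write()\n";
        /* The record must go out in a single write() call; splitting
         * it across calls reintroduces interleaving. */
        write(fd, msg, sizeof msg - 1);
        close(fd);
        return 0;
    }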

This says nothing about whether reads will ever see a torn write though, and on that POSIX says the following about atomicity of read() and write() to regular files:

All of the following functions shall be atomic with respect to each other in the effects specified in POSIX.1-2008 when they operate on regular files or symbolic links ... [many functions] ... read() ... write() ... If two threads each call one of these functions, each call shall either see all of the specified effects of the other call, or none of them. [Source]

and

Writes can be serialized with respect to other reads and writes. If a read() of file data can be proven (by any means) to occur after a write() of the data, it must reflect that write(), even if the calls are made by different processes. [Source]

but conversely:

This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control. [Source]

A safe interpretation of all three of these requirements would suggest that all writes overlapping an extent in the same file must be serialised with respect to one another and to reads such that torn writes never appear to readers.

A less safe, but still allowed, interpretation could be that reads and writes only serialise with each other between threads inside the same process, and that between processes writes are serialised with respect to reads only (i.e. there is sequentially consistent I/O ordering between threads in a process, but between processes I/O is only acquire-release).

Of course, just because the standard requires these semantics doesn't mean implementations comply, though in fact FreeBSD with ZFS behaves perfectly, very recent Windows (10.0.14393) with NTFS behaves perfectly, and recent Linuxes with ext4 behave correctly if O_DIRECT is on. If you would like more detail on how well major OSes and filing systems comply with the standard, see this answer.
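
As a sketch of the Linux/O_DIRECT case (the 512-byte alignment is an assumption; the correct alignment depends on the device's logical block size):

    #define _GNU_SOURCE            /* O_DIRECT is Linux-specific */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd == -1)
            return 1;

        /* O_DIRECT requires the buffer, file offset and length to be
         * aligned, typically to the logical block size. */
        void *buf;
        if (posix_memalign(&buf, 512, 4096) != 0)
            return 1;
        memset(buf, 'x', 4096);

        write(fd, buf, 4096);      /* offset 0, length 4096: aligned */

        free(buf);
        close(fd);
        return 0;
    }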


It's not luck, in the sense that if you dig into the kernel you can probably prove that, in your particular circumstances, one process's write will never be interleaved with another's. I am assuming that:

  • You are not hitting any file size limits
  • You are not filling the filesystem in which you create the test file
  • The file is a regular file (not a socket, pipe, or something else)
  • The filesystem is local
  • The buffer does not span multiple virtual memory mappings (this one is known to be true, because it's malloc()ed, which puts it on the heap, which is contiguous)
  • The processes aren't interrupted, signaled, or traced while write() is busy.
  • There are no disk I/O errors, RAM failures, or any other abnormal conditions.
  • (Maybe others)

You will probably indeed find that, if all those assumptions hold, the kernel of the operating system you happen to be using always accomplishes a single write() system call with a single atomic contiguous write to the file.
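
A sketch of the kind of experiment in question: two forked writers append fixed-size records of a single repeated byte each, so a torn write would show up as a record mixing both bytes (the file name, record size and count are arbitrary):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define RECSZ 4096
    #define NRECS 1000

    static void writer(char c)
    {
        char rec[RECSZ];
        memset(rec, c, RECSZ);
        int fd = open("test.out", O_WRONLY | O_CREAT | O_APPEND, 0644);
        for (int i = 0; i < NRECS; i++)
            write(fd, rec, RECSZ);    /* one record per write() */
        close(fd);
        _exit(0);
    }

    int main(void)
    {
        unlink("test.out");
        if (fork() == 0) writer('A');
        if (fork() == 0) writer('B');
        wait(NULL);
        wait(NULL);
        /* A torn write would leave a RECSZ-aligned record containing
         * a mix of 'A' and 'B' bytes; inspect the file to verify. */
        return 0;
    }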

That doesn't mean you can count on this always being true; it might stop holding when:

  • the program is run on a different operating system
  • the file moves to an NFS filesystem
  • the process gets a signal while the write() is in progress and write() returns a partial result (fewer bytes than requested). I'm not sure POSIX really allows this to happen, but I program defensively! (A retry loop for this case is sketched after this list.)
  • etc...
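
For the partial-write case, a defensive retry loop is a common idiom (a sketch, not from the original answer; note it does not restore atomicity, since the retried tail is a separate write() that can interleave with other writers):

    #include <errno.h>
    #include <unistd.h>

    /* Keep calling write() until the whole buffer is written,
     * retrying short writes and EINTR. */
    static ssize_t write_all(int fd, const char *buf, size_t len)
    {
        size_t done = 0;
        while (done < len) {
            ssize_t n = write(fd, buf + done, len - done);
            if (n == -1) {
                if (errno == EINTR)
                    continue;          /* interrupted: retry */
                return -1;             /* real error */
            }
            done += (size_t)n;
        }
        return (ssize_t)done;
    }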

So your experiment can't prove that you can count on non-interleaved writes.