
How safe are memory-mapped files for reading input files?


It is not really a problem.

Yes, another process may modify the file while you have it mapped, and yes, you may see the modifications. It is even likely: almost all operating systems have a unified virtual memory system, so unless a writer requests unbuffered writes, every write goes through the buffer cache, and anyone holding a mapping of the file sees the change.
That isn't even a bad thing. It would be more disturbing if you couldn't see the changes. Since the file effectively becomes part of your address space when you map it, it makes perfect sense that you see changes to the file.
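
A minimal sketch of that visibility, assuming a POSIX-like system with a unified page cache. The temporary file and the helper name `mapped_view_sees_write` are made up for illustration, and a second file descriptor stands in for "another process":

```python
import mmap
import os
import tempfile


def mapped_view_sees_write():
    """Map a file read-only, overwrite part of it through a second
    descriptor, and return what the mapping shows before and after."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello world")
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as view:
            before = bytes(view[:5])
            # Stand-in for another process: an independent descriptor.
            writer = os.open(path, os.O_WRONLY)
            try:
                os.pwrite(writer, b"HELLO", 0)
            finally:
                os.close(writer)
            # No remapping or re-reading: the same view is inspected again.
            after = bytes(view[:5])
        return before, after
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(mapped_view_sees_write())
```

On such a system the second value should reflect the write, even though the mapping was created before the write happened.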

If you use conventional I/O (such as read), someone can still modify the file while you are reading it. Worded differently, copying file content to a memory buffer is not always safe in the presence of modifications. It is "safe" insofar as read will not crash, but it does not guarantee that your data is consistent.
Unless you use readv, you have no guarantees about atomicity whatsoever (and even with readv you have no guarantee that what you have in memory is consistent with what is on disk or that it doesn't change between two calls to readv). Someone might modify the file between two read operations, or even while you are in the middle of it.
This isn't just something that isn't formally guaranteed but "probably still works" -- on the contrary, e.g. under Linux writes are demonstrably not atomic. Not even by accident.
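
To make the consistency problem concrete, here is a small sketch (POSIX assumed, since it uses `os.pwrite`; the helper name `torn_read` and the file contents are hypothetical). It forces the interleaving deterministically by rewriting the file between two `read` calls, rather than relying on a real race:

```python
import os
import tempfile


def torn_read():
    """Read a file in two chunks while it is rewritten in between."""
    old = b"AAAAAAAA"
    new = b"BBBBBBBB"
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, old)
        reader = os.open(path, os.O_RDONLY)
        try:
            first = os.read(reader, 4)    # still the old content
            os.pwrite(fd, new, 0)         # the "other" writer overwrites the file
            second = os.read(reader, 4)   # already the new content
        finally:
            os.close(reader)
        return first + second, old, new
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    buf, old, new = torn_read()
    print(buf, buf in (old, new))
```

The buffer ends up as a mix of both versions, a state the file never had on disk. Neither call failed, which is exactly the sense in which read is "safe" but not consistent.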

The good news:
Usually, processes don't just open an arbitrary random file and start writing to it. When such a thing happens, it is usually either a well-known file that belongs to the process (e.g. log file), or a file that you explicitly told the process to write to (e.g. saving in a text editor), or the process creates a new file (e.g. compiler creating an object file), or the process merely appends to an existing file (e.g. db journals, and of course, log files). Or, a process might atomically replace a file with another one (or unlink it).

In every case, the whole scary problem boils down to "no issue" because either you are well aware of what will happen (so it's your responsibility), or it works seamlessly without interfering.

If you really don't like the possibility that another process could write to your file while you have it mapped, you can simply omit FILE_SHARE_WRITE under Windows when you create the file handle. POSIX makes it somewhat more complicated, since you need to fcntl the descriptor for a mandatory lock, which isn't necessarily supported or 100% reliable on every system (for example, under Linux).


In theory, you're probably in real trouble if someone does modify the file while you're reading it. In practice: you're reading characters, and nothing else: no pointers, or anything which could get you into trouble. In practice... formally, I think it's still undefined behavior, but it's one which I don't think you have to worry about. Unless the modifications are very minor, you'll get a lot of compiler errors, but that's about the end of it.

The one case which might cause problems is if the file was shortened. I'm not sure what happens then, when you're reading beyond the end.

And finally: the system isn't arbitrarily going to open and modify the file. It's a source file; it will be some idiot programmer who does it, and he deserves what he gets. In no case will your undefined behavior corrupt the system or other people's files.

Note too that most editors work on a private copy; when they write back, they do so by renaming the original and creating a new file. Under Unix, once you've opened the file to mmap it, all that counts is the inode number. And when the editor renames or deletes the file, you still keep your copy. The modified file will get a new inode. The only thing you have to worry about is if someone opens the file for update, and then goes around modifying it. Not many programs do this on text files, except for appending additional data to the end.
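
The inode behaviour can be checked with a short sketch, assuming a Unix-like system. The file names and the helper `mapping_survives_replace` are invented for illustration, and the "editor" is simulated by writing a new file and renaming it over the original:

```python
import mmap
import os
import shutil
import tempfile


def mapping_survives_replace():
    """Map a file, then replace it by writing a new file and
    renaming it over the original name."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "source.txt")
    tmp = os.path.join(directory, "source.txt.tmp")
    try:
        with open(path, "wb") as f:
            f.write(b"old contents")
        with open(path, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            mapped_inode = os.fstat(f.fileno()).st_ino
            with open(tmp, "wb") as g:
                g.write(b"new contents")
            os.replace(tmp, path)  # rename over the original name
            path_inode = os.stat(path).st_ino
            seen_via_mapping = bytes(view[:])
        with open(path, "rb") as f:
            seen_via_path = f.read()
        return seen_via_mapping, seen_via_path, mapped_inode != path_inode
    finally:
        shutil.rmtree(directory)


if __name__ == "__main__":
    print(mapping_survives_replace())
```

The mapping keeps showing the old contents, while opening the path again yields the new file with a different inode number.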

So while formally, there's some risk, I don't think you have to worry about it. (If you're really paranoid, you could turn off write authorisation while you're mmapped. And if there's really an enemy agent out to get you, he can turn it right back on.)