Unit Testing File Modifications Unit Testing File Modifications linux linux

Unit Testing File Modifications


You're talking about testing too much at once. If you start trying to attack a testing problem by saying "Let's verify that it modifies its environment correctly", you're doomed to failure. Environments have dozens, maybe even millions of potential variations.

Instead, look at the pieces ("units") of your program. For example, are you going to have a function that determines where the files are that have to be written? What are the inputs to that function? Perhaps an environment variable, perhaps some values read from a config file? Test that function, and don't actually do anything that modifies the filesystem. Don't pass it "realistic" values, pass it values that are easy to verify against. Make a temporary directory, populate it with files in your test's setUp method.

Then test the code that writes the files. Just make sure it's writing the right contents file contents. Don't even write to a real filesystem! You don't need to make "fake" file objects for this, just use Python's handy StringIO modules; they're "real" implementations of the "file" interface, they're just not the ones that your program is actually going to be writing to.

Ultimately you will have to test the final, everything-is-actually-hooked-up-for-real top-level function that passes the real environment variable and the real config file and puts everything together. But don't worry about that to get started. For one thing, you will start picking up tricks as you write individual tests for smaller functions and creating test mocks, fakes, and stubs will become second nature to you. For another: even if you can't quite figure out how to test that one function call, you will have a very high level of confidence that everything which it is calling works perfectly. Also, you'll notice that test-driven development forces you to make your APIs clearer and more flexible. For example: it's much easier to test something that calls an open() method on an object that came from somewhere abstract, than to test something that calls os.open on a string that you pass it. The open method is flexible; it can be faked, it can be implemented differently, but a string is a string and os.open doesn't give you any leeway to catch what methods are called on it.

You can also build testing tools to make repetitive tasks easy. For example, twisted provides facilities for creating temporary files for testing built right into its testing tool. It's not uncommon for testing tools or larger projects with their own test libraries to have functionality like this.


You have two levels of testing.

  1. Filtering and Modifying content. These are "low-level" operations that don't really require physical file I/O. These are the tests, decision-making, alternatives, etc. The "Logic" of the application.

  2. File system operations. Create, copy, rename, delete, backup. Sorry, but those are proper file system operations that -- well -- require a proper file system for testing.

For this kind of testing, we often use a "Mock" object. You can design a "FileSystemOperations" class that embodies the various file system operations. You test this to be sure it does basic read, write, copy, rename, etc. There's no real logic in this. Just methods that invoke file system operations.

You can then create a MockFileSystem which dummies out the various operations. You can use this Mock object to test your other classes.

In some cases, all of your file system operations are in the os module. If that's the case, you can create a MockOS module with mock version of the operations you actually use.

Put your MockOS module on the PYTHONPATH and you can conceal the real OS module.

For production operations you use your well-tested "Logic" classes plus your FileSystemOperations class (or the real OS module.)


For later readers who just want a way to test that code writing to files is working correctly, here is a "fake_open" that patches the open builtin of a module to use StringIO. fake_open returns a dict of opened files which can be examined in a unit test or doctest, all without needing a real file-system.

def fake_open(module):    """Patch module's `open` builtin so that it returns StringIOs instead of    creating real files, which is useful for testing. Returns a dict that maps    opened file names to StringIO objects."""    from contextlib import closing    from StringIO import StringIO    streams = {}    def fakeopen(filename,mode):        stream = StringIO()        stream.close = lambda: None        streams[filename] = stream        return closing(stream)    module.open = fakeopen    return streams