Redaction in git


First of all, you should change the password on the FTP site. The password has already been made public; you can't guarantee that no one has cloned the repo, or that it isn't sitting in plain text in a backup somewhere, or something of the sort. If the password is at all valuable, I would consider it compromised by now.

Now, for your question about how to edit history. The git filter-branch command is intended for this purpose; it will walk through each commit in your repository's history, apply a command to modify it, and then create a new commit.

In particular, you want git filter-branch --tree-filter. This allows you to edit the contents of the tree (the actual files and directories) for each commit. It will run a command in a directory containing the entire tree; your command may edit files, add new files, delete files, move them, and so on. Git will then create a new commit object with all of the same metadata (commit message, date, and so on) as the previous one, but with the tree as modified by your command, treating new files as adds, missing files as deletes, etc. (so your command does not need to do git add or git rm; it just needs to modify the tree).

For your purposes, something like the following should work, with the appropriate regular expression and file name depending on your exact situation:

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

Remember to do this to a copy of your repository, so if something goes wrong, you will still have the original and can start over again. filter-branch will also save references to your original branches, as refs/original/refs/heads/master and so on, so you should be able to recover even if you forget to make a copy; when doing some global modification to my source code history, I like to make sure I have multiple fallbacks in case something goes wrong.
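
To make the two fallbacks concrete, here is a minimal sketch run against a throwaway repository. The scratch paths, the file contents, and the user name are all made up for illustration, the -i usage assumes GNU sed, and the FILTER_BRANCH_SQUELCH_WARNING variable only matters on newer Git versions that print a warning before filter-branch starts.

```shell
# Scratch demo; every path and name here is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.com"
printf 'password = "SekrtPassWrd"\n' > myscript.py
git add myscript.py
git commit -q -m "Add script"
branch=$(git symbolic-ref --short HEAD)  # master or main, depending on configuration
old=$(git rev-parse HEAD)

# Fallback 1: a full copy of the repository, made before rewriting anything.
cp -R "$work/repo" "$work/repo.backup"

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

# Fallback 2: filter-branch keeps the pre-rewrite branch tips under refs/original/.
saved=$(git rev-parse "refs/original/refs/heads/$branch")

# To undo the rewrite, point the branch back at the saved commit.
git reset -q --hard "refs/original/refs/heads/$branch"
restored=$(git rev-parse HEAD)
```

After the reset, the branch is back at the commit it had before the rewrite, and the untouched copy in repo.backup is still there if even that goes wrong.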

To explain how this works in more detail:

sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py

This will replace SekrtPassWrd in your myscript.py file with REDACTED; the -i option to sed tells it to edit the file in place, with no backup file (as that backup would be picked up by Git as a new file).

If you need to do something more complicated than a single substitution, you can write a script and invoke that as your command; just be sure to call it with an absolute pathname, as git filter-branch calls your command from within a temporary directory.
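
As a rough sketch of what such a script might contain (the redact function name and the scratch-directory check are my own illustration, and the -i usage assumes GNU sed): it skips commits in which the file doesn't exist, since a failing command makes filter-branch stop, and it uses the g flag so that more than one occurrence per line is replaced.

```shell
#!/bin/sh
# Hypothetical helper; the function name is illustrative.
redact() {
    file=$1
    # Skip commits in which the file does not exist; a failing command
    # would make filter-branch abort.
    [ -f "$file" ] || return 0
    # The trailing g replaces every occurrence on a line, not just the first.
    sed -i -e 's/SekrtPassWrd/REDACTED/g' "$file"
}

# Quick check in a scratch directory, outside of any repository.
scratch=$(mktemp -d)
printf 'login("user", "SekrtPassWrd")  # SekrtPassWrd\n' > "$scratch/myscript.py"
redact "$scratch/myscript.py"
redact "$scratch/missing.py"   # no such file: silently skipped
cat "$scratch/myscript.py"
```

Saved as an executable script that ends with a call such as redact myscript.py, it would be handed to filter-branch by absolute path, for example git filter-branch --tree-filter /home/me/redact.sh -- --all (that path is made up).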

git filter-branch --tree-filter <command> -- --all

This tells git to run a tree filter, as described above, over every branch in your repository. The -- --all part tells Git to apply this to all branches; without it, it would only edit the history of the current branch, leaving all of the other branches unchanged (which probably isn't what you want).
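
One way to check the result afterwards, sketched on a throwaway repository with a second branch (the branch name feature, the file contents, and the user name are invented for the example; the -i usage assumes GNU sed): search the rewritten branches for any commit that still introduces the old string. I use --branches for the check because --all would also follow the backup refs under refs/original/, which still contain the old text on purpose.

```shell
# Scratch demo; every path and name here is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
printf 'password = "SekrtPassWrd"\n' > myscript.py
git add myscript.py
git commit -q -m "Add script"

# A second branch with its own commit on top of the first one.
git checkout -q -b feature
printf 'retries = 3\n' >> myscript.py
git commit -q -a -m "Add retries"
git checkout -q -

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

# Count commits that add or remove the string, first on branches only,
# then on every ref (which includes the refs/original/ backups).
leaks=$(git log --branches --oneline -S'SekrtPassWrd' | wc -l)
backups=$(git log --all --oneline -S'SekrtPassWrd' | wc -l)
echo "on branches: $leaks, including backups: $backups"
```

A count of zero on the branches means no rewritten commit introduces the string any more; the nonzero count with --all is just the saved originals.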

See the documentation on GitHub on Removing Sensitive Data (as originally pointed out by MBO) for some more information about dealing with the copies of the information that have been pushed to GitHub. Note that they reiterate my advice to change your password, and provide some tips for dealing with cached copies that GitHub may still have.


Maybe just easier to change your password on the FTP site? Unless you're embarrassed by the code...


I believe you should be able to change all of your commits using the filter-branch command. See the section in the ProGit book for details.

However, as @MBO's link notes

force-pushing does not erase commits on the remote repo, it simply introduces new ones and moves the branch pointer to point to them

So you'll need to remove the repository completely from GitHub to remove those commits (i.e., even if they're no longer in your commit history, they're still floating around in the repository).
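
The same is true of your local clone after a filter-branch run: the old commits stay in the object store until nothing refers to them and they are pruned. A minimal sketch of clearing them out locally, on a throwaway repository (paths, file contents, and user name are made up; the -i usage assumes GNU sed). It does nothing about copies on GitHub or in other people's clones.

```shell
# Scratch demo; every path and name here is illustrative.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
printf 'password = "SekrtPassWrd"\n' > myscript.py
git add myscript.py
git commit -q -m "Add script"
old=$(git rev-parse HEAD)

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

# The old commit is still stored, just no longer on a branch.
git cat-file -e "$old" && echo "old commit still present"

# Drop the refs/original/ backups, expire the reflogs, and prune.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$old" 2>/dev/null; then status=present; else status=gone; fi
echo "old commit is now $status"
```

Only do this once you are sure the rewrite is what you wanted, since it throws away the refs/original/ fallback.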