
Remove folder and its contents from git/GitHub's history


If you are here to copy-paste code:

This is an example which removes node_modules from history

git filter-branch --tree-filter "rm -rf node_modules" --prune-empty HEAD
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
echo node_modules/ >> .gitignore
git add .gitignore
git commit -m 'Removing node_modules from git history'
git gc
git push origin master --force

What git actually does:

The first line rewrites every commit reachable from HEAD (your current branch). For each commit, --tree-filter checks out that commit's tree and runs the command rm -rf node_modules in it. This command deletes the node_modules folder (-r is required, since rm won't delete folders without it), with no prompt given to the user (-f). The added --prune-empty drops commits that no longer change anything once the folder is gone.

The second line deletes the backup references that filter-branch saved under refs/original/, which would otherwise keep the old commits alive.

The rest of the commands are relatively straightforward.
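
To see the effect of those two lines without touching a real project, here is a minimal sketch that builds a throwaway repository and runs them on it. The file names, commit messages and the demo identity are made up for illustration, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation notice that recent git versions print before filter-branch starts.

```shell
set -eu
# Skip the deprecation notice (and its delay) that newer git versions show.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway sandbox repository (hypothetical contents, for illustration only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: a real source file.
echo 'console.log("hi")' > index.js
git add index.js
git commit -q -m "Add index.js"

# Commit 2: touches only node_modules, so it is empty after filtering.
mkdir -p node_modules/pkg
echo '{}' > node_modules/pkg/package.json
git add node_modules
git commit -q -m "Add node_modules"

# Commit 3: another real change.
echo '// more' >> index.js
git commit -q -am "Edit index.js"

commits_before=$(git rev-list --count HEAD)

# Line 1 of the answer: rewrite every commit reachable from HEAD.
git filter-branch --tree-filter "rm -rf node_modules" --prune-empty HEAD >/dev/null 2>&1

# filter-branch keeps a backup of each rewritten ref under refs/original/.
backups=$(git for-each-ref --format="%(refname)" refs/original/ | wc -l | tr -d ' ')

# Line 2 of the answer: delete those backup refs.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

commits_after=$(git rev-list --count HEAD)
touching=$(git log --all --oneline -- node_modules | wc -l | tr -d ' ')

echo "commits before: $commits_before, after: $commits_after"
echo "backup refs found: $backups, commits still touching node_modules: $touching"
```

The commit that only added node_modules disappears entirely because of --prune-empty, while the other two commits are kept with new hashes.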


I find that the --tree-filter option used in other answers can be very slow, especially on larger repositories with lots of commits.

Here is the method I use to completely remove a directory from the git history using the --index-filter option, which runs much quicker because it rewrites the index directly instead of checking out every commit:

# Make a fresh clone of YOUR_REPO
git clone YOUR_REPO
cd YOUR_REPO

# Create tracking branches of all branches
for remote in `git branch -r | grep -v /HEAD`; do git checkout --track $remote ; done

# Remove DIRECTORY_NAME from all commits, then remove the refs to the old commits
# (repeat these two commands for as many directories that you want to remove)
git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch DIRECTORY_NAME/' --prune-empty --tag-name-filter cat -- --all
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

# Ensure all old refs are fully removed
rm -Rf .git/logs .git/refs/original

# Perform a garbage collection to remove commits with no refs
git gc --prune=all --aggressive

# Force push all branches to overwrite their history
# (use with caution!)
git push origin --all --force
git push origin --tags --force
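
The part of this script that differs most from the first answer is `-- --all` together with `--tag-name-filter cat`, which extends the rewrite to every branch and tag. A minimal sandbox sketch of just that step follows; the directory name build/, the branch name feature and the tag v1 are invented for the demo, and nothing is pushed anywhere.

```shell
set -eu
# Skip the deprecation notice (and its delay) that newer git versions show.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway sandbox repository (hypothetical contents, for illustration only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# First commit on the default branch, tagged v1, includes build/.
echo "readme" > README
mkdir build
echo "artifact" > build/out.bin
git add README build
git commit -q -m "Initial commit with build/"
git tag v1

# A second branch that also touches build/.
git checkout -q -b feature
echo "more" > build/extra.bin
echo "feature" > feature.txt
git add build feature.txt
git commit -q -m "Feature work, also touches build/"

# Rewrite all refs; --tag-name-filter cat moves tags to the rewritten commits.
git filter-branch --index-filter \
  'git rm -rf --cached --ignore-unmatch build/' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# Drop the backup refs that filter-branch left behind.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

remaining=$(git log --all --oneline -- build | wc -l | tr -d ' ')
backups=$(git for-each-ref refs/original/ | wc -l | tr -d ' ')
tag_tree=$(git ls-tree -r --name-only v1)

echo "commits still touching build/: $remaining"
echo "backup refs left: $backups"
echo "files in tag v1: $tag_tree"
```

Without `--tag-name-filter cat`, the tag would keep pointing at the old commit and the directory would remain reachable through it.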

You can check the size of the repository before and after the gc with:

git count-objects -vH
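
If you want a single number to compare rather than reading the report by eye, a small helper along these lines may be handy. The function name repo_size_kib is made up here; it relies on `git count-objects -v` (without -H) reporting the `size` and `size-pack` fields in KiB, and adds the two so that loose and packed objects are both counted.

```shell
# Hypothetical helper: print the on-disk size of a repository's objects in KiB.
# Usage: repo_size_kib [path-to-repo]   (defaults to the current directory)
repo_size_kib() {
  git -C "${1:-.}" count-objects -v |
    awk '/^size(-pack)?:/ { sum += $2 } END { print sum + 0 }'
}

# Example, to be run inside a repository:
#   before=$(repo_size_kib)
#   git gc --prune=all --aggressive
#   after=$(repo_size_kib)
#   echo "objects: ${before} KiB -> ${after} KiB"
```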


The up-to-date answer appears to be not to use filter-branch directly (git itself no longer recommends it) and to defer that work to an external tool. In particular, git-filter-repo is currently recommended. The author of that tool provides arguments on why using filter-branch directly can lead to issues.

Most of the multi-line scripts above that remove dir from the history could be rewritten as:

git filter-repo --path dir --invert-paths

The tool is more powerful than just that. You can apply filters by author, email, refname and more (see its manpage for the full list). It is also fast, and installation is easy: it is distributed in a variety of formats.
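
For anyone who wants to try the one-liner safely first, here is a sketch that runs it in a throwaway repository. It assumes git-filter-repo may or may not be installed and simply skips the demo if it is missing; the file names are invented. The --force flag is there only because git-filter-repo refuses by default to rewrite anything that does not look like a fresh clone, which this sandbox is not.

```shell
set -eu
result="skipped"

# Only run the demo if git-filter-repo is available on this machine.
if git filter-repo -h >/dev/null 2>&1; then
  # Throwaway sandbox repository (hypothetical contents, for illustration only).
  demo_dir=$(mktemp -d)
  cd "$demo_dir"
  git init -q .
  git config user.email "demo@example.com"
  git config user.name "Demo"

  mkdir dir
  echo "unwanted" > dir/file.txt
  echo "wanted" > keep.txt
  git add dir keep.txt
  git commit -q -m "Add dir and keep.txt"

  # Keep everything except dir; --force is needed outside a fresh clone.
  git filter-repo --path dir --invert-paths --force >/dev/null 2>&1

  result=$(git log --all --oneline -- dir | wc -l | tr -d ' ')
fi

echo "commits still touching dir: $result"
```

On a real repository, work in a fresh clone rather than passing --force. Note also that git-filter-repo removes the origin remote after rewriting, so it has to be added back before force-pushing.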