What is the advantage of git lfs? What is the advantage of git lfs? git git

What is the advantage of git lfs?


One specificity of Git (and other distributed systems) compared to centralized systems is that each repository contains the whole history of the project. Suppose you create a 100 MB file, modify it 100 times in a way that doesn't compress well. You'll end up with a 10 GB repository. This means that each clone will download 10 GB of data, eat 10 GB of disk space on each machine on which you're making a clone. What's even more frustrating: you'd still have to download these 10 GB of data even if you git rm the big files.

Putting big files in a separate system like git-lfs allow you to store only pointers to each version of the file in the repository, hence each clone will only download a tiny piece of data for each revision. The checkout will download only the version you are using, i.e. 100 MB in the example above. As a result, you would be using disk space on the server, but saving a lot of bandwidth and disk space on the client.

In addition to this, the algorithm used by git gc (internally, git repack) does not always work well with big files. Recent versions of Git made progress in this area and it should work reasonably well, but using a big repository with big files in it may eventually get you in trouble (like not having enough RAM to repack your repository).