How large does a "large file" have to be to benefit from Git LFS?

There is no exact threshold that defines a "large file"; it is up to the user. To decide whether you should store some files using Git LFS, you need to understand how Git works.

The most fundamental difference between Git and other source control tools (Perforce, SVN) is that Git stores a full snapshot of the repository on every commit. Each snapshot therefore contains a compressed version of every large file (or a pointer to the existing blob if the file wasn't changed). The snapshots are stored as an object graph under the .git folder, so if a "large" file changes often, the repository size grows rapidly.
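A minimal way to see this for yourself (the file name and size are hypothetical) is to commit a binary file, change it, commit again, and ask Git to report the size of its object database:

    # create a 20 MB binary file (random bytes, so it won't compress)
    dd if=/dev/urandom of=assets.bin bs=1M count=20
    git add assets.bin
    git commit -m "add binary asset"
    git count-objects -vH     # note the reported size

    # change the file and commit again: Git stores a second full blob
    dd if=/dev/urandom of=assets.bin bs=1M count=20
    git commit -am "edit binary asset"
    git count-objects -vH     # size roughly doubles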

There are multiple criteria to determine whether to store a file using Git LFS.

  • The size of the file. IMO, if a file is larger than 10 MB, you should consider storing it in Git LFS.

  • How often the file is modified. A large file (based on the user's intuition of what is large) that changes very often should be stored using Git LFS.

  • The type of the file. A non-text file that cannot be merged is eligible for Git LFS storage.
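Once you have identified such files, a minimal sketch of handing them to Git LFS looks like this (the *.psd pattern and file name are just examples):

    git lfs install                # set up the LFS hooks (once per user)
    git lfs track "*.psd"          # writes the pattern to .gitattributes
    git add .gitattributes         # version the tracking rule itself
    git add design.psd             # now stored as an LFS pointer
    git commit -m "track PSD files with Git LFS"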

Will I benefit from Git LFS with "large files" as small as 50 MB? 20 MB? 5 MB? 1 MB? Less than 1 MB?

Depending on how often the file changes, you can benefit at any of the sizes mentioned. Consider the case where you make 100 commits, editing the file every time. For a 20 MB file that compresses to, say, 15 MB, the repository size would increase by approximately 100 × 15 MB = 1.5 GB if the file is not stored using Git LFS.
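You can reproduce this growth locally with a sketch like the following (names and counts are illustrative; random data does not compress at all, so here the growth is the full 20 MB per commit):

    git init lfs-demo && cd lfs-demo
    for i in $(seq 1 100); do
        dd if=/dev/urandom of=big.bin bs=1M count=20 2>/dev/null
        git add big.bin
        git commit -qm "edit $i"
    done
    git count-objects -vH    # expect roughly 100 x 20 MB of objects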


LFS is a tool for maintaining a project's resources. Suppose you have a project with *.psd files used on the front end. These files are usually large, and versioning them against previous versions is not meaningful (Git stores a history of changes for text files in commits, but this approach does not work for binary files: a diff of two .cpp files is meaningful, while a diff of two raw photos is not). So if you put such resources in the repository, its size and cloning time grow unpleasantly, and maintenance becomes hard.

How can we overcome this issue? First, one good idea is to split the store of large files from the code on the server side. Another is to let clients pull only the files they currently want to use on their local machine (i.e., not every previous version of every file).
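Git LFS supports exactly this second idea: you can skip downloading LFS content at clone time and fetch only the paths you need later (the URL and path below are hypothetical):

    # clone without downloading any LFS objects up front
    GIT_LFS_SKIP_SMUDGE=1 git clone https://example.com/project.git
    cd project

    # later, fetch only the asset you actually need
    git lfs pull --include="assets/header.psd"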

What does LFS do? It hashes its tracked files and stores them in the repository as pointers to the original files, while the original files go to a separate store on the server side. Local repositories keep all of the pointers in their history, but when you check out a specific commit, only that commit's contents are pulled. The size of the local repository and the clone time decrease impressively this way.
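For reference, what actually gets committed in place of a tracked file is a small text pointer in the Git LFS pointer format (the oid and size below are made up):

    version https://git-lfs.github.com/spec/v1
    oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
    size 12345

You can see this stub with git show HEAD:design.psd (a hypothetical path), which prints the stored pointer rather than the binary content.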

PS: The way LFS transfers files differs from Git. So I think it uses techniques such as splitting large files, sending them over separate parallel connections, and merging them again, which can improve performance for large files. What is important, though, is that this can increase clone/pull time when there are hundreds or thousands of small files.

Also note that Git has problems with files larger than 4 GB on Windows.