
Speeding up the initial git-svn fetch


At work I use git-svn against an SVN repo with roughly 170,000 revisions. I used git-svn init followed by git-svn fetch -r... to limit my initial fetch to a reasonable number of revisions. Be careful to choose a starting revision that actually exists in the branch you want. Everything is fully functional with truncated history except git-blame, which attributes every line older than your starting revision to that first revision.
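A minimal sketch of that init-then-limited-fetch sequence follows. The helper names (run, shallow_svn_clone), the URL, the directory and the starting revision are hypothetical, and --stdlayout assumes the repo uses the conventional trunk/branches/tags layout; adjust or drop it if yours does not.

```shell
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Hypothetical helper: initialise a git-svn working copy in DIR and fetch
# only the revisions from START_REV up to the current SVN HEAD.
shallow_svn_clone() {
    url=$1
    start_rev=$2
    dir=$3
    # --stdlayout is an assumption about the repository layout.
    run git svn init --stdlayout "$url" "$dir" || return 1
    # git -C runs the fetch inside the newly created directory.
    run git -C "$dir" svn fetch "-r$start_rev:HEAD"
}
```

For example, shallow_svn_clone https://svn.example.com/repo 165000 work would fetch only r165000 onward. Setting DRY_RUN=1 beforehand prints the two commands without running them, which is a cheap way to check the revision range first.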

You can further speed this up with ignore-paths to prune out subtrees that you don't want.
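As an illustration, git-svn's --ignore-paths option takes a Perl regular expression that is matched against paths in the SVN repo. The helper below (ignore_regex is a made-up name, and the subtree paths in the usage note are hypothetical) just joins a list of unwanted subtrees into one anchored pattern. It does no escaping, so it assumes the paths contain no regex metacharacters.

```shell
# Hypothetical helper: build an anchored regex matching any of the given
# subtree paths. The body runs in a subshell so the IFS change stays local.
ignore_regex() (
    IFS='|'
    # "$*" joins the arguments using the first character of IFS.
    printf '^(%s)/' "$*"
)
```

It could then be used as git svn fetch --ignore-paths="$(ignore_regex trunk/docs trunk/vendor)" -r... with whatever revision range you chose for the initial fetch. Here the helper expands to ^(trunk/docs|trunk/vendor)/.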

You can add more revisions later, but it will be painful. You will have to reset the rev-map (embarrassingly, I wrote git-svn reset myself and still can't say offhand whether it will remove all revisions, so you may have to do it by hand). Then git-svn fetch the additional revisions and use git-filter-branch to reparent your old root onto the new tree. That rewrites every commit, but it does not affect the source blobs themselves. You have to do similar surgery when people undertake big reorgs of the svn repo.
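The reparenting step might look roughly like the sketch below, which follows the --parent-filter pattern from the git-filter-branch manual. The function name reparent_root and its arguments are hypothetical. It assumes you pass full commit hashes (GIT_COMMIT is always a full hash) and that you have already worked out which commit in the newly fetched history should become the parent of your old root. It does not cover resetting the rev-map.

```shell
# Hypothetical helper: give OLD_ROOT (the first commit of the truncated
# history) NEW_PARENT as its parent, rewriting every commit on BRANCH.
reparent_root() {
    old_root=$1
    new_parent=$2
    branch=$3
    # The filter receives the existing "-p <parent>" list on stdin and must
    # print the new one; every commit except OLD_ROOT passes through cat.
    # FILTER_BRANCH_SQUELCH_WARNING skips the advisory pause in newer Git.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --parent-filter \
        "if test \"\$GIT_COMMIT\" = \"$old_root\"; then echo \"-p $new_parent\"; else cat; fi" \
        -- "$branch"
}
```

filter-branch keeps the pre-rewrite refs under refs/original, so you can inspect the result before discarding the old history. Trying this on a throwaway clone first is probably wise.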

If you actually need all of the revisions (for example for a migration) then you should be looking at some flavor of svn-fast-export + git-fast-import. There may be one that adds rev tags to match git-svn, in which case you could fast-import and then just graft in the svn remote. Even if the existing svn-fast-export options don't have that feature, you can probably add it before your original clone completes!


Apparently there is no good answer. Some work is being done on git-fast-import but it isn't ready for prime time yet. They are still trying to figure out how to detect and represent 'svn cp' actions. The one bright spot is that someone on the list came up with an optimization for git-svn that seems to have made a big impact.

http://permalink.gmane.org/gmane.comp.version-control.git/168718


In a repository with 20k commits I had similar problems. In my case it turned out that a few strange tags in subversion were the cause: they had been created by copying / instead of /trunk. That sent git svn fetch into an infinite loop. I fixed it by converting in chunks.

git svn fetch -r0:1000
git svn fetch -r0:2000
git svn fetch -r0:3000

Watch the output; if you don't see a new r... line once in a while, something is wrong. Use git log --all to see how far the conversion got. Let's say you got to 1565. Then continue the fetch like this.

git svn fetch -r1567:2000

It was very tedious but it got the job done.
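The chunking can be scripted to take some of the tedium out. This is only a sketch: chunk_ranges and fetch_in_chunks are made-up names, and unlike the commands above it uses non-overlapping ranges rather than restarting each range at 0. It cannot detect a hung fetch, so you still need to watch the output and restart by hand past the point where it stalls.

```shell
# Print "start:end" revision ranges covering FIRST..LAST in steps of SIZE.
chunk_ranges() {
    first=$1
    last=$2
    size=$3
    start=$first
    while [ "$start" -le "$last" ]; do
        end=$((start + size - 1))
        # Clamp the final chunk so it does not run past LAST.
        if [ "$end" -gt "$last" ]; then
            end=$last
        fi
        echo "$start:$end"
        start=$((end + 1))
    done
}

# Fetch each range in turn, stopping at the first failure.
# Run this inside an already initialised git-svn working copy.
fetch_in_chunks() {
    for range in $(chunk_ranges "$1" "$2" "$3"); do
        echo "fetching r$range"
        git svn fetch "-r$range" || return 1
    done
}
```

Calling fetch_in_chunks 0 3000 1000 would run the fetch for 0:999, 1000:1999, 2000:2999 and 3000:3000 in turn.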