What makes merging in DVCS easy?


There's nothing in particular in DVCSs that makes merging easier. It's simply cultural: a DVCS wouldn't work at all if merging were hard, so DVCS developers invest a lot of time and effort into making merging easy. CVCS users OTOH are used to crappy merging, so there's no incentive for the developers to make it work. (Why make something good when your users pay you equally well for something crap?)

Linus Torvalds said in one of his Git talks that when he was using CVS at Transmeta, they set aside an entire week during a development cycle for merging. And everybody just accepted this as the normal state of affairs. Nowadays, during a merge window, Linus does hundreds of merges within just a few hours.

CVCSs could have just as good merging capabilities as DVCSs, if CVCS users simply went to their vendors and said that this crap is unacceptable. But they are caught in the Blub paradox: they simply don't know that it is unacceptable, because they have never seen a working merge system. They don't know that there is something better out there.

And when they do try out a DVCS, they magically attribute all the goodness to the "D" part.

Theoretically, due to its centralized nature, a CVCS should have better merge capabilities, because it has a global view of the entire history, unlike a DVCS, where every repository only has a tiny fragment.

To recap: the whole point of a DVCS is to have many decentralized repositories and constantly merge changes back and forth. Without good merging, a DVCS simply is useless. A CVCS, however, can still survive with crappy merging, especially if the vendor can condition its users to avoid branching.

So, just like with everything else in software engineering, it's a matter of effort.


In Git and other DVCSs, merges are easy not because of some mystical series-of-changesets view that Joel rambles about (unless you are using Darcs, with its theory of patches, or some Darcs-inspired DVCS; they are a minority, though), but because of merge tracking and the more fundamental fact that each revision knows its parents. For that you need (I think) whole-tree / full-repository commits, which unfortunately limits the ability to do partial checkouts and to make a commit covering only a subset of files.

When each revision (each commit), including merge commits, knows its parents (for merge commits that means remembering more than one parent, i.e. merge tracking), you can reconstruct the diagram (a DAG, Directed Acyclic Graph) of revision history. If you know the graph of revisions, you can find the common ancestor of the commits you want to merge. And when your DVCS knows how to find the common ancestor itself, you don't need to provide it as an argument, as you do, for example, in CVS.
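To make the "each commit knows its parents" point concrete, here is a minimal Python sketch of finding common ancestors in a revision graph. The commit names and the `PARENTS` table are made up for illustration, and this is not how Git implements it internally; it only shows that the parent links alone are enough information.

```python
from collections import deque

# Hypothetical toy history: each commit id maps to the list of its parents.
# A merge commit is simply a commit with more than one parent.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],        # work on one branch
    "D": ["B"],        # work on another branch
    "E": ["C", "D"],   # merge commit: remembers both parents
    "F": ["E"],
    "G": ["D"],
}

def ancestors(commit, parents):
    """Return every commit reachable from `commit`, including itself."""
    seen = set()
    queue = deque([commit])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents[current])
    return seen

def merge_bases(a, b, parents):
    """Return the common ancestors of a and b that are not themselves
    ancestors of another common ancestor (the 'best' ones)."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents)
                   for other in common)
    }

print(merge_bases("F", "G", PARENTS))
```

Because `E` recorded `D` as a parent, merging `F` with `G` starts from `D` rather than from the original branch point `B`. Without merge tracking, the tool would have no way of knowing that `D` had already been merged.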

Note that there might be more than one common ancestor of two (or more) commits. Git uses the so-called "recursive" merge strategy, which merges the merge bases (common ancestors) until you are left with one virtual / effective common ancestor (in some simplification), and can then do a simple 3-way merge.
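As a rough illustration of what the 3-way merge buys you once a common ancestor is known, here is a simplified Python sketch. It assumes a snapshot is just a dict mapping paths to file contents and works at whole-file granularity; `three_way_merge` is a hypothetical helper, and a real tool such as Git also merges line by line within a file.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (path -> content) against their common ancestor.

    Returns (merged, conflicts). A path missing from a dict means the
    file does not exist in that snapshot.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (same change, or no change)
            result = o
        elif o == b:      # only "theirs" touched the file
            result = t
        elif t == b:      # only "ours" touched the file
            result = o
        else:             # both sides changed it differently
            conflicts.append(path)
            continue
        if result is not None:   # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base   = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours   = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "4", "d.txt": "new"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # a.txt from ours, b.txt and d.txt from theirs
print(conflicts)  # ['c.txt']
```

The base is what lets the merge tell "only one side changed this" apart from "both sides changed this". With only the two tips to compare, every difference would look like a potential conflict.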

Git's rename detection was created to be able to deal with merges involving file renames. (This supports Jörg W Mittag's argument that DVCSs have better merge support because they had to have it, as merges are much more common than in a CVCS, where the merge is hidden in the 'update' command in the update-then-commit workflow; cf. Understanding Version Control (WIP) by Eric S. Raymond.)


Part of the reason is, of course, the technical argument that DVCSs store more information than SVN does (DAG, copies) and also have a simpler internal model, which is why they are able to perform more accurate merges, as mentioned in the other responses.

However, probably an even more important difference is that because you have a local repository, you can make frequent, small commits, and also frequently pull and merge incoming changes. This is caused more by the ‘human factor’, the differences in the way a human works with a centralised VCS versus a DVCS.

With SVN, if you update and there are conflicts, SVN will merge what it can and insert markers in your code where it can’t. The big problem with this is that your code will no longer be in a workable state until you resolve all the conflicts.

This distracts you from the work you are trying to achieve, so typically SVN users do not merge while they are working on a task. Combine this with the fact that SVN users also tend to let changes accumulate in a single large commit for fear of breaking other people’s working copies, and there will be long periods of time between the branch and the merge.

With Mercurial, you can merge incoming changes much more frequently, in between your smaller incremental commits. This tends to result in fewer merge conflicts, because you will be working on a more up-to-date codebase.

And if there turns out to be a conflict, you can decide to postpone the merge and do it at your own leisure. This in particular makes the merging so much less annoying.