Can git be made to mostly auto-merge XML order-insensitive files?


This gets pretty hard in general.

Some have attempted to use Git's union merge (which is more accessible now than it was in the early days; as in that question, you just add merge=union in a .gitattributes file), but this does not work in general. It might work sometimes. Boiled down a lot: it works if your XML is always structured so that a naive line-oriented union merge produces valid XML (basically, each whole XML sub-element is kept on a single line), and you are only ever adding whole new XML sub-elements.
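To see the favorable case in action without touching a repository, here is a small sketch that shells out to `git merge-file --union`, which applies the same line-oriented union strategy to three files. It assumes `git` is on the PATH; the helper names and sample documents are made up for illustration. Every `<item>` sits on its own line and both sides only add items, which is exactly the situation described above.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET


def union_merge(base: str, ours: str, theirs: str) -> str:
    """Run Git's line-oriented union merge on three in-memory texts."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, text in (("ours", ours), ("base", base), ("theirs", theirs)):
            paths[name] = Path(tmp) / f"{name}.xml"
            paths[name].write_text(text)
        # -p prints the merge result instead of overwriting the first file.
        proc = subprocess.run(
            ["git", "merge-file", "-p", "--union",
             str(paths["ours"]), str(paths["base"]), str(paths["theirs"])],
            capture_output=True, text=True,
        )
        return proc.stdout


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML at all (no schema check)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sub-element per line; each side adds one whole new sub-element.
BASE = '<items>\n  <item id="a"/>\n</items>\n'
OURS = '<items>\n  <item id="a"/>\n  <item id="b"/>\n</items>\n'
THEIRS = '<items>\n  <item id="a"/>\n  <item id="c"/>\n</items>\n'

if shutil.which("git"):  # skip quietly when git is not installed
    merged = union_merge(BASE, OURS, THEIRS)
    print(merged)
    print("well-formed:", is_well_formed(merged))
```

The well-formedness check is the important part: union merge knows nothing about XML, so whether its output parses depends entirely on how the input lines happen to be laid out.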

It is possible, in Git, to write a custom merge driver. Writing a useful one for XML is hard.
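The plumbing side of a merge driver is the easy part, and a minimal skeleton might look like the following. Git substitutes `%O`, `%A` and `%B` with temporary files holding the base, current and other versions, expects the result to be written back to the `%A` file, and treats a non-zero exit status as "conflicts remain". The driver name `xmlmerge`, the script name, and the `merge_xml` placeholder are all hypothetical; the placeholder only resolves the trivial cases and stands in for the genuinely hard XML-aware logic.

```python
#!/usr/bin/env python3
"""Skeleton of a custom Git merge driver (hypothetical name: xml_merge_driver.py).

Assumed setup, with "xmlmerge" as a made-up driver name:
    .gitattributes:  *.xml merge=xmlmerge
    git config merge.xmlmerge.driver "python3 xml_merge_driver.py %O %A %B"
"""
import sys
from pathlib import Path


def merge_xml(base: str, ours: str, theirs: str):
    """Placeholder for the hard part. Returns (merged_text, conflicted).

    Only the trivial three-way cases are resolved here; a real driver
    would parse the XML and merge the trees.
    """
    if ours == theirs or theirs == base:
        return ours, False      # only our side changed, or both sides agree
    if ours == base:
        return theirs, False    # only their side changed
    return ours, True           # both changed differently: give up


def main(args):
    base_path, ours_path, theirs_path = (Path(a) for a in args)
    merged, conflicted = merge_xml(
        base_path.read_text(), ours_path.read_text(), theirs_path.read_text()
    )
    # Git expects the merge result to be left in the %A file.
    ours_path.write_text(merged)
    # Exit status 0 means a clean merge; non-zero tells Git there are conflicts.
    return 1 if conflicted else 0


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv[1:]))
```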

First we need an XML diff engine, such as Sylvain Thénault's xmldiff, to construct two string-to-string (or tree-to-tree) edit scripts from the three XML files (the merge base, the local or --ours file, and the other or --theirs file): one for base-vs-local and one for base-vs-other. This particular one looks like it works similarly to Python's difflib. (However, judging by the papers it references, it looks like it produces tree move / nesting-level operations as well as simple insert and delete. This is a natural and reasonable thing for a tree-to-tree edit algorithm to do, and probably actually desirable here.)
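To make the "two diffs from three files" step concrete, here is a deliberately crude stand-in that does not use xmldiff at all. It assumes the only order-insensitive part is the list of direct children of the root, and it treats each child as an opaque canonical string, so a modified child shows up as one removal plus one addition. The names `Edit`, `child_set` and `diff_children` are invented for this sketch, and `ET.canonicalize` needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Edit(NamedTuple):
    added: frozenset    # children present on this side but not in the base
    removed: frozenset  # children present in the base but not on this side


def child_set(xml_text: str) -> frozenset:
    """Canonical serialization of each direct child of the root element."""
    root = ET.fromstring(xml_text)
    children = set()
    for child in root:
        child.tail = None  # drop whitespace between siblings
        raw = ET.tostring(child, encoding="unicode")
        # C14N normalizes attribute order; strip_text drops indentation
        # (and, as a side effect, leading/trailing whitespace in text).
        children.add(ET.canonicalize(raw, strip_text=True))
    return frozenset(children)


def diff_children(base_xml: str, side_xml: str) -> Edit:
    """Order-insensitive diff of the root's children, base versus one side."""
    base, side = child_set(base_xml), child_set(side_xml)
    return Edit(added=side - base, removed=base - side)


BASE = '<items><item id="a"/><item id="b"/></items>'
LOCAL = '<items><item id="b"/><item id="a"/><item id="c"/></items>'  # reordered, added c
OTHER = '<items><item id="b"/></items>'                              # removed a

base_vs_local = diff_children(BASE, LOCAL)
base_vs_other = diff_children(BASE, OTHER)
print(sorted(base_vs_local.added), sorted(base_vs_local.removed))
print(sorted(base_vs_other.added), sorted(base_vs_other.removed))
```

Because children are compared as a set, the reordering in `LOCAL` produces no edit at all, which is the property a line-oriented diff cannot give you. A real tree-to-tree engine would go further and report moves and nested changes rather than whole-child replacements.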

Then, given two such diffs, we need code to combine them. The union method is to ignore all deletions: simply add all additions to the base version (or, equivalently, add the "other" additions to the "local", or the "local" additions to the "other"). We could also combine tree insert/delete operations a la "real" (non-union-style) merges, and perhaps even declare conflicts. (And it might be nice to allow different handling of tree nesting-level-changes, driven by something vaguely like a DTD.)
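The difference between the two combining strategies is easy to state once the edits are reduced to sets. The sketch below assumes each element has been boiled down to a hypothetical `(key, value)` pair (for instance an id attribute plus the serialized element), and that each side's diff against the base is given as sets of added and removed pairs. All names here are invented for illustration; tree moves and nesting-level changes are not modeled.

```python
from typing import NamedTuple


class Edit(NamedTuple):
    added: frozenset    # (key, value) pairs on this side but not in the base
    removed: frozenset  # (key, value) pairs in the base but not on this side


def union_combine(base, local, other):
    """Union style: apply every addition, ignore every deletion."""
    return set(base) | local.added | other.added


def three_way_combine(base, local, other):
    """Non-union style: apply both sides' deletions, then both sides' additions."""
    return (set(base) - local.removed - other.removed) | local.added | other.added


def find_conflicts(local, other):
    """Keys that the two sides changed in incompatible ways."""
    l_new, o_new = dict(local.added), dict(other.added)
    l_gone = {k for k, _ in local.removed}
    o_gone = {k for k, _ in other.removed}
    # Both sides set the same key to different values.
    conflicts = {k for k in l_new.keys() & o_new.keys() if l_new[k] != o_new[k]}
    # One side modified a key that the other side deleted outright.
    conflicts |= (l_gone & l_new.keys()) & (o_gone - o_new.keys())
    conflicts |= (o_gone & o_new.keys()) & (l_gone - l_new.keys())
    return conflicts


base = {("a", "1"), ("b", "2")}
local = Edit(added=frozenset({("c", "3")}), removed=frozenset({("a", "1")}))
other = Edit(added=frozenset({("d", "4")}), removed=frozenset())

print(sorted(union_combine(base, local, other)))      # keeps "a" despite the deletion
print(sorted(three_way_combine(base, local, other)))  # honors the deletion of "a"
print(find_conflicts(local, other))                   # no overlapping changes here
```

A modification appears as a removal plus an addition under the same key, which is what lets `find_conflicts` notice a modify/delete clash. What to do with a reported conflict (pick a side, emit markers, abort) is a policy decision this sketch leaves open.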

These last parts are not, as far as I know, done anywhere. Besides that, the Python xmldiff I linked here is a fairly big chunk of code. (I have not read it anywhere near closely, nor attempted to install it; I just downloaded it and skimmed. It implements both a Myers-like algorithm and the fancier "fast match / edit script" algorithm from the Stanford paper.)