Git (or Hg) plugin for dealing with Microsoft Word and/or OpenOffice files
How about:
- Save your Word docs in XML.
- Commit your XML Word files.
- Diff using an external XML diff tool. For example:
$ git difftool -t xmldiff c3d293 498571
Transforming the XML files so that each element sits on its own line should keep the check-in process efficient and also let the external XML diff tool run quickly.
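The one-element-per-line transform could be done with a small script before each commit. Here is a minimal sketch in Python, assuming the document was saved as a single well-formed XML file. The function name one_element_per_line is hypothetical, and whether Word reopens the reformatted file unchanged is something to verify on your own documents:

```python
import xml.dom.minidom


def one_element_per_line(xml_text: str) -> str:
    """Re-serialize XML so that each element starts on its own line."""
    dom = xml.dom.minidom.parseString(xml_text)
    pretty = dom.toprettyxml(indent="  ")
    # Drop whitespace-only lines so they do not show up as noise in diffs.
    kept = [line for line in pretty.splitlines() if line.strip()]
    return "\n".join(kept) + "\n"
```

Elements that contain only text stay on one line together with their text, so a changed sentence shows up as a single changed line in the diff.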
Here is a nice trick I came up with that also works on OpenOffice files, PPTs, etc.:
http://xcafebabe.blogspot.hu/2012/09/sexy-comparison-of-word-documents-with.html
If you are on MS Windows, use TortoiseGit. I just had to go through this painful experience, and TGit, although inelegant, takes some of the pain out of it. A couple of other points:
- Surprisingly, git diff and gitk both do a reasonably good job of at least visualizing diffs between .docx files (not sure about .doc, but I would assume it's the same). This is good for a quick scan of diffs when doing commits.
- You are completely out of luck as far as fast-forward and automerging are concerned. Unfortunately, I have not found a tool that can handle this (although I like the XML idea above), so you will have to do all merges manually.
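If plain git diff only reports that the binary files differ on your setup, a textconv filter is one way to get readable .docx diffs. The sketch below is an assumption about how that could be done, not necessarily what TortoiseGit does internally. It relies on a .docx being a zip archive whose main text lives in word/document.xml; the function name docx_paragraphs is hypothetical:

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in .docx files.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Return the text of each paragraph in a .docx file, one string per paragraph."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    # Join the text runs (w:t) found inside each paragraph (w:p).
    return ["".join(t.text or "" for t in p.iter(W + "t")) for p in root.iter(W + "p")]


if __name__ == "__main__" and len(sys.argv) == 2:
    # A textconv filter receives the file to convert as its single argument.
    print("\n".join(docx_paragraphs(sys.argv[1])))
```

To try it, a .gitattributes entry such as `*.docx diff=docx` together with `git config diff.docx.textconv "python docx_text.py"` (script name hypothetical) should make git diff show paragraph-level changes. This only helps with viewing diffs; it does nothing for merging.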
Microsoft Word (MS Word) has a decent, if flawed, merge tool. AFAIK, it can only do 2-way merges (i.e. X0 + dX = X1), not 3-way or 2-parent merges, which are more common in version control (i.e. X0 + dX1 + dX2 = X1). You could solve merge conflicts using this tool, but there would be some legwork: checking out each branch, exporting HEAD as an untracked version, etc., to end up with X0 = *.BASE.docx, X0 + dX1 = *.LOCAL.docx and X0 + dX2 = *.REMOTE.docx. Luckily, this is exactly what TGit (and TSVN too) do. Unfortunately, I would avoid rebase, since replaying several changes in a row can be very tiring, but merge for short documents is fine, just not great.
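To make the 2-way versus 3-way distinction concrete, here is a minimal sketch of the whole-file decision a 3-way merge can make from the BASE, LOCAL and REMOTE versions without understanding the file format. The function name three_way_pick is hypothetical, and this is an illustration of the idea, not Word's or TortoiseGit's actual logic:

```python
def three_way_pick(base: bytes, local: bytes, remote: bytes):
    """Decide a whole-file 3-way merge; return the merged content, or None on conflict."""
    if local == remote:
        return local   # both sides made the same change (or no change at all)
    if local == base:
        return remote  # only the remote side changed: the result is X0 + dX2
    if remote == base:
        return local   # only the local side changed: the result is X0 + dX1
    return None        # both sides changed differently: a manual merge is needed
```

Only the last case needs Word's merge tool, which is where the three exported .docx files come in.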