Python difflib: highlighting differences inline?


For your simple example:

    import difflib

    def show_diff(seqm):
        """Unify operations between two compared strings.

        seqm is a difflib.SequenceMatcher instance whose a & b are strings.
        """
        output = []
        for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
            if opcode == 'equal':
                output.append(seqm.a[a0:a1])
            elif opcode == 'insert':
                output.append("<ins>" + seqm.b[b0:b1] + "</ins>")
            elif opcode == 'delete':
                output.append("<del>" + seqm.a[a0:a1] + "</del>")
            elif opcode == 'replace':
                raise NotImplementedError("what to do with 'replace' opcode?")
            else:
                raise RuntimeError("unexpected opcode")
        return ''.join(output)

    >>> sm = difflib.SequenceMatcher(None, "lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet")
    >>> show_diff(sm)
    'lorem<ins> foo</ins> ipsum dolor <del>sit </del>amet'

This works with strings. You still need to decide how to handle "replace" opcodes, which show_diff leaves unimplemented.
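One possible way to fill in the "replace" branch is to treat a replacement as a deletion followed by an insertion. This is only a sketch of one choice, not something the answer above prescribes, and the name show_diff_with_replace is made up for illustration:

```python
import difflib

def show_diff_with_replace(a, b):
    """Mark up the differences between strings a and b with <ins>/<del> tags.

    Hypothetical variant of show_diff: a 'replace' opcode is emitted as
    the deleted text followed by the inserted text.
    """
    seqm = difflib.SequenceMatcher(None, a, b)
    output = []
    for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
        if opcode == 'equal':
            output.append(a[a0:a1])
        elif opcode == 'insert':
            output.append("<ins>" + b[b0:b1] + "</ins>")
        elif opcode == 'delete':
            output.append("<del>" + a[a0:a1] + "</del>")
        elif opcode == 'replace':
            # Assumption: show the old text as deleted, then the new text as inserted.
            output.append("<del>" + a[a0:a1] + "</del>")
            output.append("<ins>" + b[b0:b1] + "</ins>")
        else:
            raise RuntimeError("unexpected opcode: %r" % opcode)
    return ''.join(output)

print(show_diff_with_replace("abc", "xyz"))
```

If the text can contain "<" or "&", you would probably want to escape each slice (for example with html.escape) before wrapping it in tags.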


Here's an inline differ inspired by @tzot's answer above (also Python 3 compatible):

    import difflib

    def inline_diff(a, b):
        matcher = difflib.SequenceMatcher(None, a, b)

        def process_tag(tag, i1, i2, j1, j2):
            if tag == 'replace':
                return '{' + matcher.a[i1:i2] + ' -> ' + matcher.b[j1:j2] + '}'
            if tag == 'delete':
                return '{- ' + matcher.a[i1:i2] + '}'
            if tag == 'equal':
                return matcher.a[i1:i2]
            if tag == 'insert':
                return '{+ ' + matcher.b[j1:j2] + '}'
            assert False, "Unknown tag %r" % tag

        return ''.join(process_tag(*t) for t in matcher.get_opcodes())

It's not perfect: for example, it would be nice to expand 'replace' opcodes to cover the full word that was replaced rather than just the few letters that differ. But it's a good place to start.

Sample output:

    >>> a = 'Lorem ipsum dolor sit amet consectetur adipiscing'
    >>> b = 'Lorem bananas ipsum cabbage sit amet adipiscing'
    >>> print(inline_diff(a, b))
    Lorem{+  bananas} ipsum {dolor -> cabbage} sit amet{-  consectetur} adipiscing
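The whole-word behaviour wished for above can be approximated by comparing lists of tokens instead of characters, since SequenceMatcher accepts any sequences of hashable items. The following is a rough sketch under that assumption. The function name inline_word_diff and the whitespace-based tokenization are my own choices, not part of the answer above:

```python
import difflib
import re

def inline_word_diff(a, b):
    """Inline diff of two strings, compared token by token instead of char by char."""
    # Split into alternating runs of whitespace and non-whitespace so that
    # joining the tokens reproduces the original string exactly.
    a_tokens = re.findall(r'\s+|\S+', a)
    b_tokens = re.findall(r'\s+|\S+', b)
    # autojunk=False turns off the "popular element" heuristic, which could
    # otherwise treat the frequently repeated space token as junk on long inputs.
    matcher = difflib.SequenceMatcher(None, a_tokens, b_tokens, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old = ''.join(a_tokens[i1:i2])
        new = ''.join(b_tokens[j1:j2])
        if tag == 'equal':
            parts.append(old)
        elif tag == 'delete':
            parts.append('{- ' + old + '}')
        elif tag == 'insert':
            parts.append('{+ ' + new + '}')
        elif tag == 'replace':
            parts.append('{' + old + ' -> ' + new + '}')
    return ''.join(parts)

print(inline_word_diff('hello world', 'hello there'))
# hello {world -> there}
```

Because the unit of comparison is a whole token, a changed word is always reported in full, at the cost of no longer showing which letters inside the word changed.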


difflib.SequenceMatcher will operate on single lines. Its get_opcodes() method returns the "opcodes" that describe how to change the first line to make it the second line.
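To make the idea of opcodes concrete, here is a small sketch that walks the opcodes and rebuilds the second line from the first. The helper name apply_opcodes and the shape of the returned steps are assumptions made for illustration:

```python
import difflib

def apply_opcodes(a, b):
    """Follow the opcodes that turn a into b.

    Returns (steps, result), where steps is a list of
    (tag, text_from_a, text_from_b) tuples and result is the rebuilt string.
    """
    matcher = difflib.SequenceMatcher(None, a, b)
    steps = []
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        steps.append((tag, a[i1:i2], b[j1:j2]))
        if tag == 'equal':
            # Unchanged text is copied from the first line.
            result.append(a[i1:i2])
        elif tag in ('insert', 'replace'):
            # New text comes from the second line.
            result.append(b[j1:j2])
        # 'delete' contributes nothing to the result.
    return steps, ''.join(result)

steps, rebuilt = apply_opcodes("lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet")
for step in steps:
    print(step)
print(rebuilt == "lorem foo ipsum dolor amet")  # True
```

Each opcode is a 5-tuple (tag, i1, i2, j1, j2): the tag says what to do, a[i1:i2] is the affected slice of the first line, and b[j1:j2] is the corresponding slice of the second.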