How to compare a list of lists/sets in python?


So you want the difference between two lists of items.

first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

First I'd turn each list of lists into a list of tuples. Tuples are hashable (lists are not), so a list of tuples can be converted into a set of tuples:

first_tuple_list = [tuple(lst) for lst in first_list]
secnd_tuple_list = [tuple(lst) for lst in secnd_list]

Then you can make sets:

first_set = set(first_tuple_list)
secnd_set = set(secnd_tuple_list)

EDIT (suggested by sdolan): You could have done the last two steps for each list in a one-liner:

first_set = set(map(tuple, first_list))
secnd_set = set(map(tuple, secnd_list))

Note: map is a built-in function that applies the function given as its first argument (here, tuple) to each item of its second argument (here, a list of lists).
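As a small illustration of how map behaves here, the following is a sketch that assumes Python 3, where map returns a lazy iterator rather than the list it returned in Python 2. The `rows` data is made up for the example.

```python
# Two rows in the same shape as the question's data (illustrative values).
rows = [['Test.doc', '1a1a1a', 1111], ['Test2.doc', '2b2b2b', 2222]]

# map applies tuple() to each inner list; in Python 3 the result is a lazy iterator.
mapped = map(tuple, rows)

# set() consumes the iterator directly, so no intermediate list is needed.
row_set = set(mapped)

# An equivalent set comprehension produces the same set.
assert row_set == {tuple(row) for row in rows}
```

Because the iterator is consumed by `set()`, `mapped` cannot be reused afterwards; call `map` again if the tuples are needed a second time.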

Finally, find the symmetric difference between the sets:

>>> first_set.symmetric_difference(secnd_set)
set([('Test3.doc', '3c3c3c', 3333),
     ('Test3.doc', '8p8p8p', 9999),
     ('Test4.doc', '4d4d4d', 4444)])

Note that first_set ^ secnd_set is equivalent to first_set.symmetric_difference(secnd_set).
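Putting the steps above together, a minimal sketch might look like the following. The function name `symmetric_row_difference` is hypothetical and not part of any library; the data is the same as in the question.

```python
def symmetric_row_difference(first, second):
    """Return the rows found in exactly one of the two lists, as a set of tuples."""
    first_set = set(map(tuple, first))    # lists are unhashable, tuples are not
    second_set = set(map(tuple, second))
    return first_set ^ second_set         # same as first_set.symmetric_difference(second_set)


first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

diff = symmetric_row_difference(first_list, secnd_list)

# Sets are unordered, so sort before printing to get a stable display.
for row in sorted(diff):
    print(row)
```

Two caveats with this approach: duplicate rows collapse into one, and the original ordering is lost. Every item inside a row must also be hashable for the tuple to go into a set.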

Also, if you don't want to use sets (e.g., because you are on Python 2.2), it's quite straightforward to do without them. E.g., with list comprehensions:

>>> [x for x in first_list if x not in secnd_list] + [x for x in secnd_list if x not in first_list]
[['Test3.doc', '3c3c3c', 3333], ['Test3.doc', '8p8p8p', 9999], ['Test4.doc', '4d4d4d', 4444]]

or with the built-in filter function and lambda functions (you have to test both directions and combine the results). In Python 3, filter returns an iterator, so wrap each call in list() before concatenating:

>>> list(filter(lambda x: x not in secnd_list, first_list)) + list(filter(lambda x: x not in first_list, secnd_list))
[['Test3.doc', '3c3c3c', 3333], ['Test3.doc', '8p8p8p', 9999], ['Test4.doc', '4d4d4d', 4444]]
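The list-based versions above keep the original order, but each `x not in some_list` test scans the whole list. If sets are available, one possible middle ground is to keep the list comprehensions and use sets only for the membership test. This is a sketch; `ordered_difference` is a hypothetical helper name.

```python
def ordered_difference(first, second):
    """Rows found in only one list, in their original order (first list's rows first)."""
    # Build lookup sets of tuples once, so each membership test is a hash lookup.
    first_seen = {tuple(row) for row in first}
    second_seen = {tuple(row) for row in second}
    only_first = [row for row in first if tuple(row) not in second_seen]
    only_second = [row for row in second if tuple(row) not in first_seen]
    return only_first + only_second


first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

result = ordered_difference(first_list, secnd_list)
print(result)
```

Unlike the pure set approach, the result here is a list of the original inner lists, and rows that are duplicated within one input list stay duplicated in the output.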


Not sure if there is a nice function for this, but the "manual" way to do it isn't difficult:

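# Collect the items of firstList that are missing from secondList.
differences = []
for item in firstList:  # "item" rather than "list", which would shadow the built-in
    if item not in secondList:
        differences.append(item)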


>>> First_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333']]
>>> Secnd_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333'], ['Test4.doc', '4d4d4d', '4444']]
>>> z = [tuple(y) for y in First_list]
>>> z
[('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333')]
>>> x = [tuple(y) for y in Secnd_list]
>>> x
[('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333'), ('Test4.doc', '4d4d4d', '4444')]
>>> set(x) - set(z)
set([('Test4.doc', '4d4d4d', '4444')])
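The `-` operator in the session above is one-directional: it only reports rows of the second list that are missing from the first. If it matters which side a row came from, a sketch along these lines reports each direction separately. The name `compare_rows` and the dictionary keys are made up for this example.

```python
def compare_rows(first, second):
    """Split the rows of two lists of lists into three sets of tuples."""
    first_set = set(map(tuple, first))
    second_set = set(map(tuple, second))
    return {
        'only_in_first': first_set - second_set,   # rows missing from second
        'only_in_second': second_set - first_set,  # rows missing from first
        'in_both': first_set & second_set,         # rows common to both lists
    }


First_list = [['Test.doc', '1a1a1a', '1111'],
              ['Test2.doc', '2b2b2b', '2222'],
              ['Test3.doc', '3c3c3c', '3333']]
Secnd_list = [['Test.doc', '1a1a1a', '1111'],
              ['Test2.doc', '2b2b2b', '2222'],
              ['Test3.doc', '3c3c3c', '3333'],
              ['Test4.doc', '4d4d4d', '4444']]

report = compare_rows(First_list, Secnd_list)
print(report['only_in_first'])
print(report['only_in_second'])
```

The union of `only_in_first` and `only_in_second` is the symmetric difference, so nothing is lost compared with the `^` operator; the rows are just labelled by origin.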