Get difference between two lists

python performance list set set-difference

To get the elements that are in temp1 but not in temp2:

In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']

Beware that set subtraction is asymmetric:

In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1])

where you might expect/want it to equal set([1, 3]). If you do want set([1, 3]) as your answer, you can use set([1, 2]).symmetric_difference(set([2, 3])).
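As a minimal sketch of the two operations side by side (the names a and b are illustrative only and do not come from the answer above), using Python 3 set literals:

```python
a = {1, 2}
b = {2, 3}

only_in_a = a - b          # elements of a that are not in b
only_in_b = b - a          # the reverse direction gives a different result
either_not_both = a ^ b    # same as a.symmetric_difference(b)

print(only_in_a)        # {1}
print(only_in_b)        # {3}
print(either_not_both)  # {1, 3}
```

The `-` operator answers "what is in the left set only", so swapping the operands changes the result, while `^` is the same either way round.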


The existing solutions all offer either one or the other of:

Faster than O(n*m) performance.
Preserve order of input list.

But so far no solution has both. If you want both, try this:

s = set(temp2)
temp3 = [x for x in temp1 if x not in s]
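The same idea can be wrapped in a small helper. This is only a sketch: the function name ordered_difference and the sample lists are assumptions for illustration, and it requires the elements of the second list to be hashable.

```python
def ordered_difference(first, second):
    """Return the items of `first` that are not in `second`, keeping order."""
    excluded = set(second)  # built once, so each membership test is O(1) on average
    return [x for x in first if x not in excluded]


# Hypothetical sample data, not taken from the original question.
temp1 = ['One', 'Two', 'Three', 'Four', 'Three']
temp2 = ['One', 'Two']

temp3 = ordered_difference(temp1, temp2)
print(temp3)  # ['Three', 'Four', 'Three']
```

Note that, unlike set subtraction, this keeps duplicates that appear in the first list.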

Performance test

import timeit

init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print(timeit.timeit('list(set(temp1) - set(temp2))', init, number=100000))
print(timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number=100000))
print(timeit.timeit('[item for item in temp1 if item not in temp2]', init, number=100000))

Results:

4.34620224079 # ars' answer
4.2770634955  # This answer
30.7715615392 # matt b's answer

The method I presented, as well as preserving order, is also (slightly) faster than the set subtraction because it doesn't require constructing an unnecessary set. The performance difference would be more noticeable if the first list is considerably longer than the second and if hashing is expensive. Here's a second test demonstrating this:

init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

Results:

11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer
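To re-run this second comparison yourself on Python 3, something like the following sketch should work. The repeat count for this test is not given above, so REPEATS = 5 is an assumption chosen to keep the run short, and the labels are mine. Absolute timings will differ from the figures above depending on machine and Python version.

```python
import timeit

# Setup taken from the second test: a long list of strings and a short one.
setup = (
    "temp1 = [str(i) for i in range(100000)]\n"
    "temp2 = [str(i * 2) for i in range(50)]"
)

# The three approaches being compared (labels are illustrative).
statements = {
    "set subtraction": "list(set(temp1) - set(temp2))",
    "set lookup in comprehension": "s = set(temp2); [x for x in temp1 if x not in s]",
    "list lookup in comprehension": "[item for item in temp1 if item not in temp2]",
}

REPEATS = 5  # assumed value, not from the original test

timings = {
    label: timeit.timeit(stmt, setup, number=REPEATS)
    for label, stmt in statements.items()
}

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```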


This can be done using Python's XOR operator (^) on sets.

This removes the duplicates within each list.
The result contains the elements of temp1 that are not in temp2 together with the elements of temp2 that are not in temp1.

set(temp1) ^ set(temp2)
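A short sketch of what that produces, using made-up sample lists (the values are assumptions for illustration). The result is sorted only to get a stable display order, since sets are unordered.

```python
# Hypothetical sample data with a duplicate in the first list.
temp1 = ['One', 'Two', 'Three', 'Three']
temp2 = ['Two', 'Five']

diff = set(temp1) ^ set(temp2)  # symmetric difference: in exactly one of the two

print(sorted(diff))  # ['Five', 'One', 'Three']
```

'Two' is dropped because it is in both lists, 'Three' appears once because converting to a set removes duplicates, and 'Five' is included even though it only occurs in temp2.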
