Diff and intersection reporting between two text files


sort | uniq is good, but comm might be even better. Note that comm expects both input files to be sorted already. See "man comm" for more information.

From the manual page:

EXAMPLES
       comm -12 file1 file2
              Print only lines present in both file1 and file2.

       comm -3 file1 file2
              Print lines in file1 not in file2, and vice versa.
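To see why comm wants sorted input, here is a rough Python sketch of the kind of merge it performs. This is only an illustration, not comm's actual implementation: the function name `comm` and the sample lists are made up for the example, and Python compares strings by code point, which may differ from the locale-aware ordering the real tool uses.

```python
def comm(lines_a, lines_b):
    """Split two SORTED line lists into (only_a, only_b, both),
    mirroring comm's three output columns. Illustrative sketch only."""
    only_a, only_b, both = [], [], []
    i = j = 0
    # Walk both lists in step; this only works because they are sorted.
    while i < len(lines_a) and j < len(lines_b):
        if lines_a[i] == lines_b[j]:
            both.append(lines_a[i])
            i += 1
            j += 1
        elif lines_a[i] < lines_b[j]:
            only_a.append(lines_a[i])
            i += 1
        else:
            only_b.append(lines_b[j])
            j += 1
    # Whatever is left over exists in one input only.
    only_a.extend(lines_a[i:])
    only_b.extend(lines_b[j:])
    return only_a, only_b, both


# Hypothetical sample data standing in for file1 and file2.
a = sorted(["apple", "cherry", "banana"])
b = sorted(["banana", "date", "cherry"])
only_a, only_b, both = comm(a, b)
print(both)              # the lines "comm -12" would keep
print(only_a, only_b)    # the two columns "comm -3" would keep
```

Because the merge makes a single pass over each input, unsorted files would give misleading results, which is why sorting first matters.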

You can also use the Python set type, but comm is easier.


Unix shell solution:

# duplicate lines
sort text1.txt text2.txt | uniq -d

# unique lines
sort text1.txt text2.txt | uniq -u
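One caveat with the sort | uniq approach: it counts occurrences across the combined input, so a line repeated inside a single file is reported as a "duplicate" even if the other file never contains it. The Python sketch below mimics that counting to make the effect visible. The function name `uniq_report` and the sample data are hypothetical, chosen just for this illustration.

```python
from collections import Counter


def uniq_report(lines_a, lines_b):
    """Mimic `sort a b | uniq -d` and `sort a b | uniq -u` on two line lists.
    Returns (repeated, single). Illustrative sketch only."""
    counts = Counter(lines_a + lines_b)
    repeated = sorted(line for line, n in counts.items() if n > 1)   # like uniq -d
    single = sorted(line for line, n in counts.items() if n == 1)    # like uniq -u
    return repeated, single


# Hypothetical contents: "beta" appears twice, but only in the first file.
text1 = ["alpha", "beta", "beta"]
text2 = ["alpha", "gamma"]

repeated, single = uniq_report(text1, text2)
print(repeated)   # includes "beta" although text2 does not contain it
print(single)

# De-duplicating each input first avoids the false positive.
repeated, single = uniq_report(sorted(set(text1)), sorted(set(text2)))
print(repeated)
print(single)
```

On the shell side, one possible workaround is to de-duplicate each file first (for example with sort -u) before combining them.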


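# Python solution. Note: this compares whitespace-separated words, not whole lines.
with open("some1.txt") as f1, open("some2.txt") as f2:
    words1 = set(f1.read().split())
    words2 = set(f2.read().split())

duplicates = words1.intersection(words2)          # words present in both files
uniques = words1.symmetric_difference(words2)     # words present in exactly one file

print("Duplicates(%d):%s" % (len(duplicates), duplicates))
print("\nUniques(%d):%s" % (len(uniques), uniques))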

Something like that, at least.