
diff files comparing only first n characters of each line


Easy starter, comparing just the first space-delimited field of each line:

diff <(cut -d' ' -f1 md5s1.txt) <(cut -d' ' -f1 md5s2.txt)
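If your lines have no convenient delimiter, the same process-substitution idea works character-wise with cut -c. A minimal self-contained sketch (the sums1.txt/sums2.txt names and 4-character "hashes" are placeholders; use cut -c -32 for real MD5 output):

```shell
# Two toy checksum lists: hash, space, filename.
printf '1111 a.txt\n2222 b.txt\n' > sums1.txt
printf '1111 a.txt\n3333 b.txt\n' > sums2.txt

# Compare only the first 4 characters of each line.
# `|| true` because diff exits non-zero when it finds differences.
diff <(cut -c -4 sums1.txt) <(cut -c -4 sums2.txt) || true
```

Here diff reports line 2 as changed, since only the hash columns differ.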

Also, if you are comparing whole directory trees, consider just

diff -EwburqN folder1/ folder2/

(-E, -b and -w ignore various whitespace differences, -u gives unified output, -r recurses into subdirectories, -q reports only which files differ, and -N treats files present on only one side as empty.)


Compare only the md5 column by running diff on <(cut -c -32 md5sums.sort.XXX), and tell diff to print just the line numbers of added or removed lines, using --old-line-format/--new-line-format='%dn'$'\n'. Pipe that into ed md5sums.sort.XXX, which will then print only those lines from the md5sums.sort.XXX file.

diff \
    --new-line-format='%dn'$'\n' \
    --old-line-format='' \
    --unchanged-line-format='' \
    <(cut -c -32 md5sums.sort.old) \
    <(cut -c -32 md5sums.sort.new) \
    | ed md5sums.sort.new \
    > files-added

diff \
    --new-line-format='' \
    --old-line-format='%dn'$'\n' \
    --unchanged-line-format='' \
    <(cut -c -32 md5sums.sort.old) \
    <(cut -c -32 md5sums.sort.new) \
    | ed md5sums.sort.old \
    > files-removed
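To see the moving parts on a small scale, here is a self-contained sketch of the "files added" half, using toy 4-character hashes instead of full MD5 (the file names are placeholders; ed's -s flag suppresses the byte count it otherwise prints when loading the file):

```shell
# Toy checksum lists, sorted by hash.
printf '1111 a.txt\n2222 b.txt\n' > md5sums.sort.old
printf '1111 a.txt\n3333 c.txt\n' > md5sums.sort.new

# diff emits only the line numbers of lines added in the new list;
# ed then prints those lines from the full (uncut) new file.
diff \
    --new-line-format='%dn'$'\n' \
    --old-line-format='' \
    --unchanged-line-format='' \
    <(cut -c -4 md5sums.sort.old) \
    <(cut -c -4 md5sums.sort.new) \
    | ed -s md5sums.sort.new
# prints "3333 c.txt"
```

Swapping the --new-line-format and --old-line-format values, and feeding ed the old file instead, gives the removed lines in the same way.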

The problem with ed is that it will load the entire file into memory, which can be a problem if you have a lot of checksums. Instead of piping the output of diff into ed, pipe it into the following command, which will use much less memory.

diff … | (
    lnum=0
    while read lprint; do
        while [ $lnum -lt $lprint ]; do read line <&3; ((lnum++)); done
        echo "$line"
    done
) 3<md5sums.sort.XXX
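The loop reads line numbers on stdin and streams the file on descriptor 3, advancing one line at a time, so only a single line is held in memory. A runnable sketch with made-up data (data.txt is a placeholder; lnum=$((lnum + 1)) does the same counting as ((lnum++)) above):

```shell
# A five-line file; we will extract lines 2 and 4 by number.
printf 'one\ntwo\nthree\nfour\nfive\n' > data.txt

# Line numbers arrive on stdin (as diff would emit them);
# the file itself is streamed on fd 3.
printf '2\n4\n' | (
    lnum=0
    while read lprint; do
        # Advance through fd 3 until the wanted line is current.
        while [ "$lnum" -lt "$lprint" ]; do
            read line <&3
            lnum=$((lnum + 1))
        done
        echo "$line"
    done
) 3<data.txt
# prints "two" then "four"
```

Note this assumes the incoming line numbers are increasing, which holds for diff's output since both inputs are sorted.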


If you are looking for duplicate files, fdupes can do this for you (it needs at least one directory argument):

$ fdupes --recurse .

On Ubuntu you can install it with

$ sudo apt-get install fdupes