How to detect only the different files in my bash shell script?


Here is your script corrected:

while IFS= read -r filename; do
    # Inspecting the digest of each file individually shows that many
    # files are identical, and so are their digests.
    # It also prints "MD5 (full file path) = md5_signature"!
    md5 "old/$filename"    # please use double quotes
    md5 "new/$filename"

    # Using -q eliminates all output from md5 except the signature.
    # Your script now works correctly.
    [[ $(md5 -q "old/$filename") == $(md5 -q "new/$filename") ]] || echo differs
done < files.txt

Problems:

  1. You had a typo: new/$fullfile rather than new/$filename.
  2. You should put double quotes around the file name expansions, i.e. write "new/$filename".
  3. Use md5 -q when comparing the output of md5 for different files. By default, md5 prints the input file path as well, in the form MD5 (full_path/base_name) = 2504fcc0c0a57d14aa6b4193b5efaf94. Because the two files live in different directories, their paths always differ, so the string comparison fails even when the digests are identical.
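To illustrate point 2, here is a small sketch of what happens to an unquoted expansion when the file name contains a space. The helper count_args and the sample file name are made up for the demonstration; they are not part of your script.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

filename='annual report.pdf'     # sample name containing a space

count_args old/$filename         # unquoted: word splitting applies, prints 2
count_args "old/$filename"       # quoted: one argument, prints 1
```

With the unquoted form, md5 would be handed two non-existent paths instead of one real one.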

The comments above assume you are using md5 on BSD or, likely, on macOS.

Here is an alternative that works both on Linux with md5sum and on BSD with md5. Feed the content of the file to the stdin of either program, so that no file path appears in the output:

$ md5 <new/file.pdf
2504fcc0c0a57d14aa6b4193b5efaf94

By contrast, if you pass the file name as an argument, the path is printed along with the MD5 signature:

$ md5 new/file.pdf
MD5 (new/file.pdf) = 2504fcc0c0a57d14aa6b4193b5efaf94

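The same approach works with md5sum on Linux (GNU core utilities), with one difference: when reading from stdin, md5sum prints the digest followed by "  -" in place of a file name. That suffix is identical for both files, so the string comparison still works.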
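Putting the stdin idea together, a minimal sketch could look like the following. The function names digest_of and report_differences are my own, and I am assuming the old/ and new/ layout and a list file as in the question; treat it as a starting point rather than a finished tool.

```shell
#!/usr/bin/env bash
# digest_of and report_differences are hypothetical helper names for this sketch.

# Print only the MD5 digest of the file named by $1, reading it via stdin.
digest_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  -" for stdin, so keep the first field only.
        md5sum < "$1" | cut -d ' ' -f 1
    elif command -v md5 >/dev/null 2>&1; then
        # BSD/macOS md5 prints just the digest when reading stdin.
        md5 < "$1"
    else
        echo "neither md5sum nor md5 found" >&2
        return 1
    fi
}

# Read file names from the list in $1 and report those whose digests differ.
report_differences() {
    while IFS= read -r filename; do
        [ "$(digest_of "old/$filename")" = "$(digest_of "new/$filename")" ] ||
            echo "$filename differs"
    done < "$1"
}
```

You would call it as report_differences files.txt from the directory that contains old/ and new/. Note that a missing file produces an empty digest here, so it is reported as differing unless both copies are missing.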


Instead of computing MD5 checksums, you could use the diff command, which compares file contents. Its primary use is to process files line by line and report their differences (and generate patches), but it can just as easily be used for this purpose. It returns an exit status of 0 if there are no differences between the two files, 1 if there are differences, and a value greater than 1 if an error occurred (for example, a missing file).

while IFS= read -r filename; do
  if ! diff "old/$filename" "new/$filename" > /dev/null; then
    echo "“${filename}” differs"
  fi
done < files-to-compare.txt

If you’re using GNU diff, you could simply use its -q, --brief option which reports only that the files differ (instead of detailing how they differ):

while IFS= read -r filename; do
  diff -q "old/$filename" "new/$filename"
done < files-to-compare.txt
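Because diff signals an error (such as a missing file) with an exit status greater than 1, you may want to tell that case apart from a real difference. Here is one possible sketch; compare_pair is a name I made up, and the wording of the messages is only a suggestion.

```shell
#!/usr/bin/env bash
# compare_pair is a hypothetical helper: $1 is a file name relative to old/ and new/.
compare_pair() {
    diff "old/$1" "new/$1" >/dev/null 2>&1
    case $? in
        0) ;;                            # identical: print nothing
        1) echo "$1 differs" ;;          # contents differ
        *) echo "$1: could not compare (missing or unreadable file?)" >&2 ;;
    esac
}
```

Inside the while read loop you would then call compare_pair "$filename" instead of invoking diff directly.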


On my Ubuntu Linux system there is the md5sum command, which prints the digest followed by the file name:

md5sum myFile
215e0f7b4ea9fd9ea5f31106155839fe  myFile

So you need to extract only the digest from that output:

md5sum myFile | sed 's/^\([^[:blank:]]*\).*$/\1/g'
215e0f7b4ea9fd9ea5f31106155839fe

Then use this last command line in the test:

...
[[ $(md5sum old/"${filename}" | sed 's/^\([^[:blank:]]*\).*$/\1/g') = $(md5sum new/"${filename}" | sed 's/^\([^[:blank:]]*\).*$/\1/g') ]] || echo differs;
...
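The sed expression simply keeps everything up to the first blank. If you find it hard to read, the same extraction can be sketched with awk, wrapped in a small function so the test line stays short. The names first_field and digests_match are mine, not standard commands.

```shell
#!/usr/bin/env bash
# first_field is a hypothetical helper: it prints the first whitespace-separated
# field of the first input line, which for md5sum output is the digest.
first_field() {
    awk '{ print $1; exit }'
}

# digests_match is also hypothetical: it succeeds when old/$1 and new/$1
# have the same md5sum digest.
digests_match() {
    [ "$(md5sum "old/$1" | first_field)" = "$(md5sum "new/$1" | first_field)" ]
}
```

In the loop this becomes digests_match "$filename" || echo differs.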