
Checksum Exception when reading from or copying to HDFS in Apache Hadoop


You are probably hitting the bug described in HADOOP-7199. When you download a file with copyToLocal, Hadoop also writes a hidden checksum file, named .<filename>.crc, into the same local directory. If you then modify the file and run copyFromLocal, Hadoop computes the checksum of the modified file, compares it against the stale local .crc file, and fails with a non-descriptive error message.

To fix it, check whether this .crc file exists next to your local file. If it does, remove it and try the copy again.
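
If it helps, here is a minimal Python sketch of that cleanup. It assumes the checksum file sits next to the local file and follows the .<filename>.crc naming; the function names crc_sidecar and remove_stale_crc are my own, made up for illustration, and are not part of Hadoop.

```python
from pathlib import Path


def crc_sidecar(local_file):
    """Return the path where the hidden checksum file is expected.

    Assumption: the checksum file lives in the same directory as the
    data file and is named ".<filename>.crc".
    """
    p = Path(local_file)
    return p.with_name("." + p.name + ".crc")


def remove_stale_crc(local_file):
    """Delete the checksum file next to local_file if it exists.

    Returns True if a file was removed, False if there was nothing to remove.
    The data file itself is never touched.
    """
    sidecar = crc_sidecar(local_file)
    if sidecar.is_file():
        sidecar.unlink()
        return True
    return False
```

For example, calling remove_stale_crc("attr.txt") in the directory that holds the modified file would delete .attr.txt.crc if it is there, after which you can retry copyFromLocal.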


I faced the same problem and solved it by removing the .crc files.


Ok so I managed to solve this issue and I'm writing the answer here just in case someone else encounters the same problem.

What I did was simply create a new file and copy all the contents of the problematic file into it.

My guess is that a crc file had been created for and tied to that particular file, so using a different file means the checksum check is carried out fresh, without the old crc file. Another possible reason is that I had named the file attr.txt, which might conflict with the name of some other resource. Maybe someone can expand on my answer, since I am not 100% sure of the technical details and these are just my observations.