Nutch in Windows: Failed to set permissions of path


It took me a while to get this working, but here's the solution, which works on Nutch 1.7.

  1. Download Hadoop Core 0.20.2 from the Maven repository.
  2. Replace $NUTCH_HOME/lib/hadoop-core-1.2.0.jar with the downloaded file, renaming it so it keeps the same name (hadoop-core-1.2.0.jar).

That should be it.
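
If you'd rather script the swap than do it by hand, here is a minimal Python sketch of step 2. It assumes you have already downloaded the 0.20.2 jar yourself. The function name `replace_hadoop_core` and the `.bak` backup are my own additions for illustration, not part of Nutch or Hadoop.

```python
import shutil
from pathlib import Path


def replace_hadoop_core(nutch_home, downloaded_jar,
                        bundled_name="hadoop-core-1.2.0.jar"):
    """Overwrite the bundled Hadoop Core jar with the downloaded one.

    The downloaded jar is copied over $NUTCH_HOME/lib/<bundled_name>, so it
    ends up under the same file name Nutch expects. The original jar is kept
    as <bundled_name>.bak (hypothetical convention) so the change can be undone.
    """
    lib_dir = Path(nutch_home) / "lib"
    target = lib_dir / bundled_name
    backup = lib_dir / (bundled_name + ".bak")

    if not target.is_file():
        raise FileNotFoundError(f"bundled jar not found: {target}")

    # Back up only once, so re-running never overwrites the real original.
    if not backup.exists():
        shutil.copy2(target, backup)

    # Copy the downloaded jar into place under the bundled jar's name.
    shutil.copy2(downloaded_jar, target)
    return target


# Example call (paths are placeholders, adjust to your setup):
# import os
# replace_hadoop_core(os.environ["NUTCH_HOME"], r"C:\Downloads\hadoop-core-0.20.2.jar")
```

To roll back, copy the `.bak` file over the jar again.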

Explanation

This issue is caused by Hadoop: it assumes you're running on Unix and tries to apply Unix file permission rules, which fails on Windows. The issue was actually resolved in 2011, but Nutch didn't update the Hadoop version it uses. The relevant fixes are here and here


We are using Nutch too, but running it on Windows is not supported. On Cygwin, our 1.4 version had problems similar to yours, and something with MapReduce too.

We solved it by using a VM (VirtualBox) with Ubuntu and a shared directory between Windows and Linux, so we can develop and build on Windows and run Nutch (crawling) on Linux.


I have Nutch running on Windows, no custom build. It's been a long time since I last used it, though. One thing that took me a while to catch is that you need to run Cygwin as a Windows admin to get the necessary rights.