
Accessing stream output from hdfs of MRjob


As long as your path contains hdfs:/ (with a single slash), it will not work, because that prefix is never going to be valid.

In the comments you mentioned that you tried adding hdfs:// manually, which may be a nice hack, but I don't see anything in your code that cleans up the wrong hdfs:/ prefix. So even if you add the right prefix, the wrong one still follows it, and the code has no chance of succeeding.

So, please clean it up.
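As a rough illustration of what that cleanup could look like, here is a minimal sketch in Python. The function name normalize_hdfs_path is hypothetical (it is not part of MRJob), and rewriting to hdfs:/// (empty authority, meaning the default namenode) is an assumption about your cluster setup; adjust it if you need an explicit host and port.

```python
import re

# Matches "hdfs:/" only when it is NOT followed by a second slash,
# i.e. the malformed single-slash prefix.
_BAD_PREFIX = re.compile(r"^hdfs:/(?!/)")


def normalize_hdfs_path(path):
    """Hypothetical helper: rewrite a leading 'hdfs:/' to 'hdfs:///'.

    Paths that already start with 'hdfs://' (with or without a host)
    and non-HDFS paths are returned unchanged.
    """
    return _BAD_PREFIX.sub("hdfs:///", path)


if __name__ == "__main__":
    for p in ["hdfs:/user/me/output", "hdfs://namenode:8020/user/me/output"]:
        print(p, "->", normalize_hdfs_path(p))
```

The point is to replace the bad prefix rather than prepend the good one, so that no stray hdfs:/ is left in the path you pass on.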


Practical note: This question is from a while ago. If the problem was in the software itself, it has likely been resolved by now. If the problem persists, it is probably caused by something odd in the code you are trying to use. Perhaps start with a trivial example from a reliable source to rule this out.