Hadoop dfs error : INFO ipc.Client: Retrying connect to server: localhost
Apparently, someone had added the older Hadoop (1.0.3) bin directory to the PATH variable before I added the new Hadoop (1.0.4) bin directory. So whenever I ran "hadoop" from the CLI, it executed the binaries of the older version rather than the new one.
Solution:
Remove the older Hadoop's bin path from PATH entirely
Shut down the cluster; exit the terminal
Log in to a new terminal session
Start up the node
Tried:
hadoop dfs -ls /
-> Works fine! Good lesson learned.
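The steps above come down to PATH ordering: the first matching entry wins. A self-contained sketch of the problem and the fix, using dummy wrapper scripts under /tmp to stand in for the real 1.0.3 and 1.0.4 installs:

```shell
# Create two fake installs, each with its own "hadoop" binary.
# (The /tmp/hadoop-1.0.x paths are illustrative dummies, not real installs.)
mkdir -p /tmp/hadoop-1.0.3/bin /tmp/hadoop-1.0.4/bin
printf '#!/bin/sh\necho 1.0.3\n' > /tmp/hadoop-1.0.3/bin/hadoop
printf '#!/bin/sh\necho 1.0.4\n' > /tmp/hadoop-1.0.4/bin/hadoop
chmod +x /tmp/hadoop-1.0.3/bin/hadoop /tmp/hadoop-1.0.4/bin/hadoop

# The old bin dir is listed first, so "hadoop" resolves to the old binary.
PATH=/tmp/hadoop-1.0.3/bin:/tmp/hadoop-1.0.4/bin:$PATH
command -v hadoop     # -> /tmp/hadoop-1.0.3/bin/hadoop

# The fix: strip the old entry from PATH and clear the shell's command
# hash table (a fresh terminal session achieves the same thing).
PATH=$(printf '%s' "$PATH" | sed 's|/tmp/hadoop-1.0.3/bin:||')
hash -r
command -v hadoop     # -> /tmp/hadoop-1.0.4/bin/hadoop
hadoop                # prints: 1.0.4
```

`command -v hadoop` (or `which hadoop`) is the quickest way to confirm which binary your shell will actually run.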
It looks like many people have run into this problem.
You might not need to change /etc/hosts at all; just make sure the master and slave can reach each other, and that core-site.xml is the same on every node, pointing to the right master hostname and port number.
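For reference, a minimal core-site.xml for Hadoop 1.x looks like the following (the hostname "master" and port 9000 are illustrative; use whatever your cluster actually resolves to). Every node must carry the same value here:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```

A quick `diff` of this file between the master and each slave catches most copy-paste drift.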
Then run $HADOOP/bin/stop-all.sh and $HADOOP/bin/start-all.sh on the master node ONLY (running them on a slave can lead to problems). Use jps to check whether all the services are running, as follows.
On the master node:
4353 DataNode
4640 JobTracker
4498 SecondaryNameNode
4788 TaskTracker
4989 Jps
4216 NameNode

On the slave node:
3143 Jps
2827 DataNode
2960 TaskTracker
In addition, check your firewall rules between the namenode and the datanodes.
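A quick way to rule the firewall in or out is a plain TCP probe of the namenode's RPC port from a datanode. A sketch (the port 9000 mentioned above would be the real target; the 127.0.0.1 listener on port 19000 below is just a local stand-in so the probe has something to hit):

```shell
# Probe host:port over TCP; succeeds only if something accepts the connection.
# Uses bash's /dev/tcp redirection; `timeout` keeps a silently dropped
# (firewalled) connection from hanging the check.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null && echo open || echo blocked
}

# Demo: start a throwaway local listener standing in for the namenode.
python3 -m http.server 19000 --bind 127.0.0.1 >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1

check_port 127.0.0.1 19000   # open
check_port 127.0.0.1 19001   # blocked (nothing listening there)

kill $SERVER_PID
```

If the probe reports blocked against the real namenode port, inspect the rules on both machines (e.g. `iptables -L` on Linux) before digging further into Hadoop itself.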