HDFS error: could only be replicated to 0 nodes, instead of 1



WARNING: The following will destroy ALL data on HDFS. Do not execute the steps in this answer unless you do not care about destroying existing data!!

You should do this:

  1. Stop all Hadoop services
  2. Delete the dfs/name and dfs/data directories
  3. Run hdfs namenode -format and answer the prompt with a capital Y
  4. Start the Hadoop services
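
A rough command sketch of those steps, assuming a single-node installation whose dfs.name.dir and dfs.data.dir point at /usr/local/hadoop/dfs/name and /usr/local/hadoop/dfs/data (the paths and start/stop scripts are examples; adjust them to your configuration and Hadoop version):

    # WARNING: this wipes all data on HDFS
    stop-all.sh                       # or stop-dfs.sh on newer releases
    rm -rf /usr/local/hadoop/dfs/name /usr/local/hadoop/dfs/data
    hdfs namenode -format             # answer the prompt with a capital Y
    start-all.sh                      # or start-dfs.sh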

Also, check the disk space on your system and make sure the logs are not warning you about it.


This is your issue: the client can't communicate with the DataNode, because the IP address the client received for the DataNode is an internal IP and not the public IP. Take a look at this:

http://www.hadoopinrealworld.com/could-only-be-replicated-to-0-nodes/

Look at the source code of DFSClient$DFSOutputStream (Hadoop 1.2.1):

    //
    // Connect to first DataNode in the list.
    //
    success = createBlockOutputStream(nodes, clientName, false);
    if (!success) {
      LOG.info("Abandoning " + block);
      namenode.abandonBlock(block, src, clientName);
      if (errorIndex < nodes.length) {
        LOG.info("Excluding datanode " + nodes[errorIndex]);
        excludedNodes.add(nodes[errorIndex]);
      }
      // Connection failed. Let's wait a little bit and retry
      retry = true;
    }

The key thing to understand here is that the NameNode only provides the list of DataNodes on which to store the blocks; it does not write the data to the DataNodes itself. It is the job of the client to write the data to the DataNodes using DFSOutputStream. Before any write can begin, the code above makes sure the client can communicate with the DataNode(s); if communication with a DataNode fails, that DataNode is added to excludedNodes.
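
As an illustration of that client-side write path, here is a minimal, hypothetical sketch of a write that goes through DFSOutputStream. The NameNode URI and file path are placeholders, and dfs.client.use.datanode.hostname (available in Hadoop 2.x and later) is one way to make a client outside the cluster network reach DataNodes by hostname rather than by an unreachable internal IP:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder NameNode URI; replace with your cluster's address.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
            // Ask for DataNode hostnames instead of internal IPs (Hadoop 2.x+);
            // useful when the client sits outside the cluster network.
            conf.setBoolean("dfs.client.use.datanode.hostname", true);

            FileSystem fs = FileSystem.get(conf);
            // This write goes through DFSOutputStream: the NameNode only returns a
            // list of DataNodes, and this client must reach one of them directly.
            try (FSDataOutputStream out = fs.create(new Path("/tmp/replication-test.txt"))) {
                out.writeUTF("hello hdfs");
            }
            fs.close();
        }
    }

If the client still cannot open a connection to any DataNode in the returned list, you will see the same "could only be replicated to 0 nodes" error, regardless of how healthy the cluster looks from the NameNode's side.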


Look at the following:

This exception (could only be replicated to 0 nodes, instead of 1) means that no DataNode is available to the NameNode.

These are the cases in which a DataNode may not be available to the NameNode:

  1. The DataNode's disk is full

  2. The DataNode is busy with block reports and block scanning

  3. The block size is a negative value (dfs.block.size in hdfs-site.xml)

  4. The primary DataNode goes down while a write is in progress (any network fluctuation between the NameNode and DataNode machines)

  5. Whenever we append a partial chunk and call sync, on subsequent partial-chunk appends the client should keep the previously written data in its buffer.

For example, after appending "a" I call sync; when I then try to append "b", the client-side buffer should contain "ab".

On the server side, when the chunk is not a multiple of 512 bytes, it tries to do a CRC comparison between the data present in the block file and the CRC present in the meta file. While constructing the CRC for the data in the block, it always compares up to the initial offset. For more analysis, please check the DataNode logs.
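
A minimal sketch of that append-and-sync pattern, assuming the target path is writable and append is enabled on your cluster (dfs.support.append on older releases); hflush() is the Hadoop 2.x name for what older APIs called sync():

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PartialChunkAppend {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/tmp/append-test.txt");  // example path

            // First partial chunk: write "a" and flush it to the DataNodes.
            try (FSDataOutputStream out = fs.create(file)) {
                out.writeBytes("a");
                out.hflush();  // older releases used sync()
            }

            // Subsequent partial-chunk append: the client must carry the earlier
            // partial chunk in its buffer so the block effectively holds "ab".
            try (FSDataOutputStream out = fs.append(file)) {
                out.writeBytes("b");
                out.hflush();
            }
            fs.close();
        }
    }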

Reference: http://www.mail-archive.com/hdfs-user@hadoop.apache.org/msg01374.html