Why is RAID not recommended for Hadoop HDFS setups? hadoop

Why is RAID not recommended for Hadoop HDFS setups?


RAID is used for two purposes. Depending on the RAID configuration you can get:

  1. Better performance: reading a file can be spread ("striped") over multiple disks, or different disks can be used transparently to read multiple files from the same file system.
  2. Fault-tolerance: data is replicated or stored with parity bits across multiple disks. If a disk fails, its data can be recovered from another replica or recomputed from the parity bits.

HDFS provides similar mechanisms in software. HDFS splits files into chunks (so-called file blocks) which are replicated across multiple datanodes and stored on their local filesystems. Usually, datanodes have multiple disks which are individually mounted (JBOD), and a datanode should distribute its file blocks across all of its disks / local filesystems (a minimal config sketch follows the list below).

This ensures:

  1. Fault-tolerance: If a disk or node goes down, other replicas are available on different data nodes and disks.
  2. High sequential read/write performance: By splitting a file into multiple chunks and storing them on different nodes (and different disks), a file can be read in parallel by accessing multiple disks (on different nodes) concurrently. Each disk can read data at its full bandwidth, and its read operations do not interfere with other disks. If the cluster is well utilized, all disks will be spinning at full speed, delivering the maximum sequential read performance.
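
To make the JBOD point concrete, here is a minimal hdfs-site.xml sketch for a datanode with three individually mounted disks; the mount paths and values are placeholders, not a recommendation for any particular cluster:

    <!-- hdfs-site.xml (sketch): one entry per JBOD mount, no RAID underneath -->
    <configuration>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data,/mnt/disk3/hdfs/data</value>
      </property>
      <property>
        <!-- HDFS-level fault tolerance: every block is stored on 3 different datanodes -->
        <name>dfs.replication</name>
        <value>3</value>
      </property>
    </configuration>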

Since HDFS takes care of fault-tolerance and "striped" reading, there is no need to use RAID underneath HDFS. Using RAID will only be more expensive, offer less usable storage, and, depending on the concrete RAID configuration, also be slower.
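
To put rough numbers on the cost (purely illustrative figures): a datanode with 12 × 4 TB disks offers 48 TB of raw capacity, which at the default HDFS replication factor of 3 contributes about 16 TB of usable space. Putting RAID 1 underneath first halves the raw capacity to 24 TB, and HDFS still replicates blocks on top of that, leaving only about 8 TB, so you would be paying for redundancy twice.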

Since the namenode is a single-point-of-failure in HDFS, it requires a more reliable hardware setup. Therefore, the use of RAID is recommended on namenodes.
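
As a sketch of how that fits together (paths are placeholders): the namenode writes its metadata to every directory listed in dfs.namenode.name.dir, so a common pattern is to point one entry at a RAID-protected local volume and, optionally, a second entry at an off-host location such as an NFS mount:

    <!-- hdfs-site.xml on the namenode (sketch): metadata written to both directories -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/raid1/hdfs/namenode,/mnt/nfs/hdfs/namenode</value>
    </property>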


RAID 0 on an enterprise server is a huge mistake. I sure would like to meet the person who designed this, because it makes no sense to an IT operations manager. If you configure any of your local server disks as RAID 0, you risk a long and painful recovery: when a single disk in a RAID 0 array fails, the whole array is destroyed, and it does not magically recover when the disk is replaced. Someone has to log on to the server, delete the old RAID partition, and create a new one.

That creates a lot of overhead at a time when man-hours and work cycles are already stretched thin. An IT operations manager is either going to delay doing it because of higher-priority work, or refuse to do it because they cannot spare the people. Then it gets pushed off to another team, the politics begin, and, wham, it gets pushed back to the server owner/customer. If you made a RAID 1 or SAN volume available instead, you could avoid that entire scenario.