What is the ideal NameNode memory size when HDFS holds a large number of files?


You can provision 256 GB of physical memory on your NameNode. If your data grows to huge volumes, consider HDFS federation. I assume the NameNode host already has multiple cores (with or without hyper-threading). The link below should address your GC concerns: https://community.hortonworks.com/articles/14170/namenode-garbage-collection-configuration-best-pra.html


The ideal NameNode memory size is roughly the total space used by the filesystem metadata, plus the OS and the daemons themselves, plus 20-30% headroom for processing-related data.
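The sizing rule above can be sketched as a quick back-of-the-envelope calculation. This is only an illustration, assuming the commonly cited rule of thumb of roughly 150 bytes of NameNode heap per filesystem object (file, directory, or block); the 20-30% headroom figure is taken from the answer above, and all input numbers are hypothetical:

```python
# Rough NameNode heap estimate -- a sketch, not official guidance.
# Assumes ~150 bytes of heap per filesystem object (file, directory,
# or block), a widely quoted rule of thumb for HDFS.

def estimate_namenode_heap_gb(num_files, num_dirs, avg_blocks_per_file,
                              headroom=0.25):
    # Every file, directory, and block is an in-memory object on the NameNode.
    objects = num_files + num_dirs + num_files * avg_blocks_per_file
    bytes_needed = objects * 150           # ~150 bytes per object
    # Add 20-30% headroom for processing-related data.
    return bytes_needed * (1 + headroom) / 1024**3

# Hypothetical cluster: 100 million files, 10 million directories,
# averaging 1.5 blocks per file.
print(round(estimate_namenode_heap_gb(100_000_000, 10_000_000, 1.5), 1))
```

Note that the estimate scales with the number of objects, not the number of bytes stored, which is why many small files hurt the NameNode far more than a few large ones.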

You should also consider the rate at which data arrives in your cluster. If data comes in at 1 TB/day, you must plan for a larger memory footprint, or you will soon run out.

It is always advisable to keep at least 20% of memory free at any point in time. This helps prevent the NameNode from going into a full garbage collection. As Marco specified earlier, you may refer to NameNode Garbage Collection Configuration: Best Practices and Rationale for GC configuration.
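For reference, GC settings for the NameNode are typically applied through `HADOOP_NAMENODE_OPTS` in `hadoop-env.sh`. The fragment below is only an illustration of the general approach (CMS collector with an occupancy threshold that lines up with the "keep ~20-30% free" advice); the specific heap size and flag values are example assumptions to tune against your own heap size and GC logs, not prescriptions:

```shell
# Illustrative hadoop-env.sh fragment (example values, not prescriptions).
# -Xms equal to -Xmx avoids heap-resizing pauses; the 70% CMS occupancy
# threshold matches the advice to keep ~20-30% of the heap free.
export HADOOP_NAMENODE_OPTS="-Xms48g -Xmx48g \
  -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  ${HADOOP_NAMENODE_OPTS}"
```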

In your case, 256 GB looks good if you aren't going to ingest a lot of data or run lots of operations on the existing data.

Refer: How to Plan Capacity for Hadoop Cluster?

Also refer: Select the Right Hardware for Your New Hadoop Cluster