The way to check a HDFS directory's size?
hadoop fs -du -s -h /path/to/dir
displays a directory's size in readable form.
Extending to Matt D and others answers, the command can be till Apache Hadoop 3.0.0
hadoop fs -du [-s] [-h] [-v] [-x] URI [URI ...]
It displays sizes of files and directories contained in the given directory or the length of a file in case it's just a file.
Options:
- The -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files. Without the -s option, the calculation is done by going 1-level deep from the given path.
- The -h option will format file sizes in a human-readable fashion (e.g 64.0m instead of 67108864)
- The -v option will display the names of columns as a header line.
- The -x option will exclude snapshots from the result calculation. Without the -x option (default), the result is always calculated from all INodes, including all snapshots under the given path.
du
returns three columns with the following format:
+-------------------------------------------------------------------+ | size | disk_space_consumed_with_all_replicas | full_path_name | +-------------------------------------------------------------------+
Example command:
hadoop fs -du /user/hadoop/dir1 \ /user/hadoop/file1 \ hdfs://nn.example.com/user/hadoop/dir1
Exit Code: Returns 0 on success and -1 on error.