Why is there no 'hadoop fs -head' shell command?


I would say it has more to do with efficiency: head is easily replicated by piping the output of hadoop fs -cat through the Linux head command.

hadoop fs -cat /path/to/file | head

This is efficient because head exits once it has printed the requested number of lines, which closes the pipe and stops hadoop fs -cat from reading the rest of the file.
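The early-exit behaviour described above can be seen without a cluster. This is only a minimal sketch: `yes` is a local stand-in for `hadoop fs -cat` on a very large file (an assumption for illustration, not part of Hadoop), and the variable name `first_lines` is made up for the example. Because `yes` never stops on its own, the pipeline can only finish if the exit of head shuts the producer down.

```shell
# Stand-in for `hadoop fs -cat /path/to/file`: an endless producer.
# head exits after 3 lines; the producer then fails on its next write
# to the closed pipe and terminates, so the pipeline finishes.
# `|| true` only guards against shells running with `set -e -o pipefail`,
# where the producer's SIGPIPE exit status would otherwise abort the script.
first_lines=$(yes "line from a very large file" | head -n 3) || true

printf '%s\n' "$first_lines"
```

The same mechanism should apply to `hadoop fs -cat /path/to/file | head -n 3`, although some data beyond the first lines may already have been buffered by the time the pipe closes.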

Using tail in this manner would be considerably less efficient, as you would have to stream the entire file (all HDFS blocks) to find the final x lines.

hadoop fs -cat /path/to/file | tail

The hadoop fs -tail command, as you note, works on the last kilobyte: Hadoop can efficiently find the last block, skip to the position of the final kilobyte, and then stream the output. Piping through tail can't easily do this.
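As a rough local analogy for the "last kilobyte" behaviour, the sketch below builds a temporary file and reads only its final 1024 bytes. The temporary file and the variable names are hypothetical, and a local `tail -c` on a regular file is only a stand-in for what hadoop fs -tail does against HDFS. Note that a byte-based cut can begin in the middle of a line.

```shell
# Build a local test file of 5000 numbered lines (about 23 KB).
tmp_file=$(mktemp)
seq 1 5000 > "$tmp_file"

# Read only the final kilobyte. On a regular file, tail can typically
# seek near the end instead of reading everything before it.
last_kb_bytes=$(tail -c 1024 "$tmp_file" | wc -c | tr -d ' ')
last_line=$(tail -c 1024 "$tmp_file" | tail -n 1)

echo "bytes read from the end: $last_kb_bytes"
echo "final line: $last_line"

rm -f "$tmp_file"
```

When tail reads from a pipe instead, as in `hadoop fs -cat /path/to/file | tail`, there is nothing to seek on, so every byte has to pass through the pipe first.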


Starting with version 3.1.0 we now have it:

Usage: hadoop fs -head URI

Displays first kilobyte of the file to stdout.

See the Hadoop FileSystem Shell documentation.


hdfs dfs -cat /path | head

is a good way to solve the problem.