
Get the last updated file in HDFS


This one worked for me:

hadoop fs -ls -R /tmp/app | awk -F" " '{print $6" "$7" "$8}' | sort -r | head -1 | cut -d" " -f3

The output is the entire file path.


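Here is the command. Because head is used without a count, it lists the ten most recently modified entries, newest first:

hadoop fs -ls -R /user | awk -F" " '{print $6" "$7" "$8}' | sort -r | head | cut -d" " -f3-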

Your script itself is good enough. Hadoop prints modification times in YYYY-MM-DD HH:MM (24-hour) format, so you can simply sort them alphabetically: a plain reverse sort puts the newest entry first.
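For what it's worth, here is a rough sketch of the same idea wrapped in a function, so it can be tried on sample listing text without a cluster. The function name latest_from_listing is made up for illustration. It assumes the usual eight-column `hadoop fs -ls` layout (permissions, replication, owner, group, size, date, time, path), keeps only regular files (lines starting with "-"), and rejoins paths that contain spaces, though runs of several spaces in a path would collapse to one.

```shell
# Hypothetical helper (not part of Hadoop): reads `hadoop fs -ls -R` style
# lines on stdin and prints the path of the most recently modified file.
latest_from_listing() {
  # Keep regular files only; this drops directories and "Found N items".
  grep '^-' |
    # Emit "date time path", rejoining path pieces split on spaces.
    awk '{ path = $8; for (i = 9; i <= NF; i++) path = path " " $i
           print $6 " " $7 " " path }' |
    # The timestamp sorts correctly as plain text; reverse puts newest first.
    LC_ALL=C sort -r |
    head -n 1 |
    # Strip the date and time, leaving the full path.
    cut -d' ' -f3-
}

# Usage against a real cluster (the path is the one from the answer above):
# hadoop fs -ls -R /tmp/app | latest_from_listing
```

If two files share the same minute, the one whose path sorts last is returned, since the listing has no finer time resolution.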