
Change a HDFS file create date from UNIX


If you would rather not write the Java code that @SCouto suggested, there is a simple workaround: set the timestamp on a local copy of the file, then preserve it when copying the file to HDFS. The steps are explained below.

    # Change the file timestamp to 201708210100 in the local Unix file system
    [root@quickstart TestFolder]# touch -t 201708210100 SomeTestFile.txt
    [root@quickstart TestFolder]# ls -lh
    total 0
    -rw-r--r-- 1 root root 0 Aug 21 01:00 SomeTestFile.txt
    # When copying the file to HDFS, use the -p option, which preserves the file timestamp
    [root@quickstart TestFolder]# hdfs dfs -copyFromLocal -p SomeTestFile.txt /Temp
    # After the copy, the timestamp shown in HDFS is the same as the local one
    [root@quickstart TestFolder]# hdfs dfs -ls /Temp/SomeTestFile.txt
    -rw-r--r--   1 root root          0 2017-08-21 01:00 /Temp/SomeTestFile.txt

P.S. Change the timestamp on the local file system first, then copy the file to HDFS with -p, which preserves the timestamp so that HDFS reflects the same time.

If you are concerned about creating a new file every time you update the timestamp, you can do something like the following, using -f to overwrite the existing file in HDFS:

    # The HDFS file SomeTestFile.txt
    hdfs dfs -ls /Temp/SomeTestFile.txt
    # To change the timestamp of SomeTestFile.txt:
    # Get it to the local file system
    hdfs dfs -get /Temp/SomeTestFile.txt /SomeFolderInLinux/
    # Change the time on the local copy with touch
    touch -t 201701010100 /SomeFolderInLinux/SomeTestFile.txt
    # The main part: preserve the time (-p) and overwrite the file in HDFS (-f)
    hdfs dfs -copyFromLocal -p -f /SomeFolderInLinux/SomeTestFile.txt /Temp/


As far as I know, there is no shell command to do that.

But it can be done through the Java API:

public void setTimes(Path p, long mtime, long atime) throws IOException

Set access time of a file.

Parameters:
p - The path
mtime - Set the modification time of this file. The number of milliseconds since Jan 1, 1970. A value of -1 means that this call should not set modification time.
atime - Set the access time of this file. The number of milliseconds since Jan 1, 1970. A value of -1 means that this call should not set access time.
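
Since setTimes expects milliseconds since Jan 1, 1970 rather than a touch-style timestamp, here is a minimal sketch of the conversion using only the JDK. The class name HdfsTimestamp, the helper toEpochMillis, and the choice of UTC are assumptions made for illustration; the commented-out call at the end assumes hadoop-client is on the classpath and reuses the /Temp/SomeTestFile.txt path from the shell example above.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class HdfsTimestamp {
    // Same layout that `touch -t` takes in the shell answer: yyyyMMddHHmm.
    private static final DateTimeFormatter TOUCH_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    /** Converts a touch-style timestamp to milliseconds since Jan 1, 1970, read in the given zone. */
    public static long toEpochMillis(String touchTimestamp, ZoneId zone) {
        LocalDateTime local = LocalDateTime.parse(touchTimestamp, TOUCH_FORMAT);
        return local.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        // UTC is an assumption; use the zone the timestamp was written in.
        long mtime = toEpochMillis("201708210100", ZoneId.of("UTC"));
        System.out.println(mtime);

        // With hadoop-client available, the value would then be passed on roughly like this:
        //   FileSystem fs = FileSystem.get(new Configuration());
        //   fs.setTimes(new Path("/Temp/SomeTestFile.txt"), mtime, -1); // -1 leaves atime unchanged
    }
}
```

Passing -1 for atime follows the parameter description above: only the modification time is changed.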


Hadoop does provide this facility, through the -touch command of hadoop fs.

The general command line syntax is: command [genericOptions] [commandOptions]

Usage: hadoop fs [generic options] -touch [-a] [-m] [-t TIMESTAMP ] [-c] ...
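
To make the usage line concrete, the following sketch builds the argument list for a -touch call that sets only the modification time (-m) to a given timestamp (-t). The class name TouchCommand and the helper build are hypothetical. The yyyyMMdd:HHmmss layout for TIMESTAMP is an assumption that is not stated in the usage line above, so check it against `hadoop fs -help touch` on your own installation before relying on it. List.of needs Java 9 or later.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

public class TouchCommand {
    // Assumed layout for the -t argument; verify with `hadoop fs -help touch`.
    private static final DateTimeFormatter HADOOP_TOUCH_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMdd:HHmmss");

    /** Builds the arguments for: hadoop fs -touch -m -t TIMESTAMP PATH */
    public static List<String> build(LocalDateTime when, String hdfsPath) {
        return List.of("hadoop", "fs", "-touch", "-m",
                "-t", when.format(HADOOP_TOUCH_FORMAT), hdfsPath);
    }

    public static void main(String[] args) {
        List<String> cmd = build(LocalDateTime.of(2017, 8, 21, 1, 0), "/Temp/SomeTestFile.txt");
        System.out.println(String.join(" ", cmd));

        // On a machine with a Hadoop client installed, the command could be run with:
        //   new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

If this works on your Hadoop version, it avoids the round trip through the local file system used in the workaround above.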