HDFS Delegation token expired even after adding principal to command line
Issue solved! Adding the following config to the spark-submit command line when starting the job fixed it:
--conf spark.hadoop.fs.hdfs.impl.disable.cache=true
Alternatively, you can set this at the YARN configuration level so it applies globally.
I tested it and the job has been running fine for 3 days.
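For reference, a full invocation would look roughly like this (the class name, jar, and master settings are placeholders for your own job):

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.hadoop.fs.hdfs.impl.disable.cache=true \
  --class com.example.MyStreamingJob \
  my-streaming-job.jar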
Thanks
This is a couple of years late, but just in case anybody stumbles on this:
Disabling the FS cache (fs.hdfs.impl.disable.cache=true) means FileSystem#get will create a new FileSystem every time it is called. Instead, it looks like the application master can refresh the delegation token itself if you pass --keytab to spark-submit.
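As a sketch, a long-running job could be submitted like this; the keytab path, principal, class, and jar are illustrative, and to my understanding --keytab is normally paired with --principal:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal user@EXAMPLE.COM \
  --keytab /path/to/user.keytab \
  --class com.example.MyStreamingJob \
  my-streaming-job.jar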
Even after setting this configuration, the Spark job can still fail. We were facing the same issue.
A delegation token is valid only for 24 hours. YARN renews the token every 24 hours automatically until it reaches the max lifetime (which is 7 days); after that the token cannot be renewed anymore and needs to be reissued, which is why the application fails.
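For reference, those intervals correspond to the standard HDFS delegation token settings; you can check your cluster's actual values (shown here with the usual defaults, in milliseconds) before relying on the 24-hour/7-day figures:

hdfs getconf -confKey dfs.namenode.delegation.token.renew-interval   # default 86400000 ms (24 hours)
hdfs getconf -confKey dfs.namenode.delegation.token.max-lifetime     # default 604800000 ms (7 days)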
This might help to solve the problem: https://community.cloudera.com/t5/Support-Questions/Long-running-Spark-streaming-job-that-crashes-on-HDP-2-3-4-7/td-p/181658