Spark: what options can be passed with DataFrame.saveAsTable or DataFrameWriter.options?


The reason you don't see options documented anywhere is that they are format-specific and developers can keep creating custom write formats with a new set of options.
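To illustrate what "format-specific" means, here is a minimal sketch in PySpark style. The helper name `write_with_format_options` and the `FORMAT_OPTIONS` table are made up for this example; the option names shown (`header` and `sep` for CSV, `compression` for Parquet) are ones the built-in writers accept, but the set is by no means complete.

```python
# Hypothetical lookup table: each format understands its own option keys.
FORMAT_OPTIONS = {
    "csv": {"header": "true", "sep": "|"},
    "parquet": {"compression": "snappy"},
}


def write_with_format_options(df, fmt, path):
    """Write `df` to `path` using the options registered for `fmt`.

    `df` is expected to be a Spark DataFrame (anything exposing `.write`).
    """
    (
        df.write
        .format(fmt)
        .mode("overwrite")
        .options(**FORMAT_OPTIONS[fmt])  # keyword form of DataFrameWriter.options
        .save(path)
    )
```

With a real DataFrame this would be called as `write_with_format_options(df, "csv", "/tmp/out")`, where the output path is just a placeholder.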

However, for a few supported formats I have listed the options as they appear in the Spark source code itself:


Take a look at the class "DeltaOptions" in https://github.com/delta-io/delta/blob/master/src/main/scala/org/apache/spark/sql/delta/DeltaOptions.scala

The currently supported options are:

  • replaceWhere
  • mergeSchema
  • overwriteSchema
  • maxFilesPerTrigger
  • excludeRegex
  • ignoreFileDeletion
  • ignoreChanges
  • ignoreDeletes
  • optimizeWrite
  • dataChange
  • queryName
  • checkpointLocation
  • path
  • timestampAsOf
  • versionAsOf
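As a rough sketch of how a couple of the options listed above might be passed on a write, assuming the Delta Lake package is available on the cluster: the helper name `write_delta`, the `date` column in the predicate, and the default option values are all hypothetical, and whether a particular combination of options is accepted can depend on the Delta version.

```python
# Illustrative values only; "date" is an assumed column name.
DEFAULT_DELTA_OPTIONS = {
    "replaceWhere": "date >= '2024-01-01'",  # overwrite only rows matching this predicate
    "mergeSchema": "true",                   # allow new columns to be merged into the table schema
}


def write_delta(df, path, mode="overwrite", options=None):
    """Write `df` as a Delta table at `path`, applying each option one by one.

    `df` is expected to be a Spark DataFrame (anything exposing `.write`).
    """
    opts = DEFAULT_DELTA_OPTIONS if options is None else options
    writer = df.write.format("delta").mode(mode)
    for key, value in opts.items():
        writer = writer.option(key, value)
    writer.save(path)
```

Options such as timestampAsOf and versionAsOf from the same list apply to reads rather than writes, so they would go on `spark.read.format("delta").option(...)` instead.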


According to the source code, you can specify the path option, which indicates where to store the Hive external table's data in HDFS (it is translated to 'location' in Hive DDL). I'm not sure whether there are other options associated with saveAsTable, but I'll be searching for more.
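A minimal sketch of the path option with saveAsTable, in PySpark style. The helper name `save_as_external_table`, the table name, and the HDFS location in the usage line are placeholders, not anything taken from the Spark source.

```python
def save_as_external_table(df, table_name, location, fmt="parquet"):
    """Save `df` as a table whose data files live at `location`.

    Supplying "path" makes Spark store the data at that location
    instead of under the default warehouse directory.
    `df` is expected to be a Spark DataFrame (anything exposing `.write`).
    """
    (
        df.write
        .format(fmt)
        .mode("overwrite")
        .option("path", location)
        .saveAsTable(table_name)
    )
```

A call might look like `save_as_external_table(df, "mydb.events", "hdfs:///data/events")`.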