What is the Scala type mapping for all Spark SQL DataTypes?
Directly from the Spark SQL and DataFrame Guide:
Data type     | Value type in Scala
------------------------------------------------
ByteType      | Byte
ShortType     | Short
IntegerType   | Int
LongType      | Long
FloatType     | Float
DoubleType    | Double
DecimalType   | java.math.BigDecimal
StringType    | String
BinaryType    | Array[Byte]
BooleanType   | Boolean
TimestampType | java.sql.Timestamp
DateType      | java.sql.Date
ArrayType     | scala.collection.Seq
MapType       | scala.collection.Map
StructType    | org.apache.spark.sql.Row
For those looking for the Java types: they are now also listed at the link from zero323's answer. The current revision is reproduced here:
Data type     | Value type in Java              | API to access or create a data type
-------------------------------------------------------------------------------------------
ByteType      | byte or Byte                    | DataTypes.ByteType
ShortType     | short or Short                  | DataTypes.ShortType
IntegerType   | int or Integer                  | DataTypes.IntegerType
LongType      | long or Long                    | DataTypes.LongType
FloatType     | float or Float                  | DataTypes.FloatType
DoubleType    | double or Double                | DataTypes.DoubleType
DecimalType   | java.math.BigDecimal            | DataTypes.createDecimalType() or DataTypes.createDecimalType(precision, scale)
StringType    | String                          | DataTypes.StringType
BinaryType    | byte[]                          | DataTypes.BinaryType
BooleanType   | boolean or Boolean              | DataTypes.BooleanType
TimestampType | java.sql.Timestamp              | DataTypes.TimestampType
DateType      | java.sql.Date                   | DataTypes.DateType
ArrayType     | java.util.List                  | DataTypes.createArrayType(elementType) or DataTypes.createArrayType(elementType, containsNull)
MapType       | java.util.Map                   | DataTypes.createMapType(keyType, valueType) or DataTypes.createMapType(keyType, valueType, valueContainsNull)
StructType    | org.apache.spark.sql.Row        | DataTypes.createStructType(fields)
StructField   | The value type in Java of the   | DataTypes.createStructField(name, dataType, nullable)
              | data type of this field (for    |
              | example, int for a StructField  |
              | with the data type IntegerType) |
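If it helps, the Java table above can be restated as a small lookup for checking that a value has the expected Java class before it goes into a Row. This is only a sketch in plain Java: the class SparkJavaTypes and its methods javaClassFor and accepts are hypothetical names, not part of Spark, and the type names are plain strings so that the example needs no Spark dependency. StructType (org.apache.spark.sql.Row) is left out for the same reason.

```java
import java.math.BigDecimal;
import java.sql.Date;
import java.sql.Timestamp;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Spark API): the table above as a lookup.
final class SparkJavaTypes {
    private static final Map<String, Class<?>> VALUE_CLASSES = new LinkedHashMap<>();

    static {
        // Boxed classes are used because values inside a Row are objects.
        VALUE_CLASSES.put("ByteType", Byte.class);
        VALUE_CLASSES.put("ShortType", Short.class);
        VALUE_CLASSES.put("IntegerType", Integer.class);
        VALUE_CLASSES.put("LongType", Long.class);
        VALUE_CLASSES.put("FloatType", Float.class);
        VALUE_CLASSES.put("DoubleType", Double.class);
        VALUE_CLASSES.put("DecimalType", BigDecimal.class);
        VALUE_CLASSES.put("StringType", String.class);
        VALUE_CLASSES.put("BinaryType", byte[].class);
        VALUE_CLASSES.put("BooleanType", Boolean.class);
        VALUE_CLASSES.put("TimestampType", Timestamp.class);
        VALUE_CLASSES.put("DateType", Date.class);
        VALUE_CLASSES.put("ArrayType", List.class);
        VALUE_CLASSES.put("MapType", Map.class);
        // StructType -> org.apache.spark.sql.Row is omitted to avoid a Spark dependency.
    }

    // Returns the Java value class listed in the table for a Spark type name.
    static Class<?> javaClassFor(String sparkTypeName) {
        Class<?> valueClass = VALUE_CLASSES.get(sparkTypeName);
        if (valueClass == null) {
            throw new IllegalArgumentException("No mapping for " + sparkTypeName);
        }
        return valueClass;
    }

    // True if the value is an instance of the class listed for that type.
    // A null value yields false here; nullable fields need separate handling.
    static boolean accepts(String sparkTypeName, Object value) {
        return javaClassFor(sparkTypeName).isInstance(value);
    }

    public static void main(String[] args) {
        System.out.println(accepts("IntegerType", 42));             // true
        System.out.println(accepts("IntegerType", 42L));            // false: a Long, not an Integer
        System.out.println(javaClassFor("DecimalType").getName());  // java.math.BigDecimal
    }
}
```

The second call shows the kind of mismatch this catches: a long literal is not a valid value for an IntegerType field, even though it looks like an integer.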
One thing to note when working with StructTypes in particular: if you want to declare an empty StructType inside another one as a placeholder value, it appears you must use new StructType() rather than the suggested DataTypes.createStructType((StructField)null) to avoid a NullPointerException. Remember to populate the nested StructType with StructFields before using it.