Get list of data types from schema in Apache Spark

Here's a suggestion:

    df = sqlContext.createDataFrame([('a', 1)])
    types = [f.dataType for f in df.schema.fields]
    types
    > [StringType, LongType]


Since the question title is not Python-specific, here is a Scala version:

    val types = df.schema.fields.map(f => f.dataType)

It will result in an array of org.apache.spark.sql.types.DataType.

Use df.dtypes:

    scala> val df = Seq(("ABC",10,20.4)).toDF("a","b","c")
    df: org.apache.spark.sql.DataFrame = [a: string, b: int ... 1 more field]

    scala> df.printSchema
    root
     |-- a: string (nullable = true)
     |-- b: integer (nullable = false)
     |-- c: double (nullable = false)

    scala> df.dtypes
    res2: Array[(String, String)] = Array((a,StringType), (b,IntegerType), (c,DoubleType))

    scala> df.dtypes.map(_._2).toSet
    res3: scala.collection.immutable.Set[String] = Set(StringType, IntegerType, DoubleType)