
Get list of data types from schema in Apache Spark


Here's a suggestion:

df = sqlContext.createDataFrame([('a', 1)])
types = [f.dataType for f in df.schema.fields]
types
> [StringType, LongType]
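
If you also want the column names alongside the types, the same idea extends naturally; here's a minimal sketch, assuming a Spark 2.x+ SparkSession rather than the older sqlContext:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([('a', 1)])

# schema.fields carries both the name and the DataType of each column
name_to_type = {f.name: f.dataType for f in df.schema.fields}
print(name_to_type)  # e.g. {'_1': StringType(), '_2': LongType()}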



Since the question title is not Python-specific, I'll add the Scala version here:

val types = df.schema.fields.map(f => f.dataType)

It will result in an array of org.apache.spark.sql.types.DataType.
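
A common follow-on is picking out just the columns of one type. A minimal Python sketch, assuming the df from the first answer:

from pyspark.sql.types import StringType

# Keep only the names of the string-typed columns
string_cols = [f.name for f in df.schema.fields if isinstance(f.dataType, StringType)]
print(string_cols)  # ['_1'] for the two-column example above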


Use df.dtypes (dtypes is a method on the DataFrame, not on the schema):

scala> val df = Seq(("ABC",10,20.4)).toDF("a","b","c")
df: org.apache.spark.sql.DataFrame = [a: string, b: int ... 1 more field]

scala> df.printSchema
root
 |-- a: string (nullable = true)
 |-- b: integer (nullable = false)
 |-- c: double (nullable = false)

scala> df.dtypes
res2: Array[(String, String)] = Array((a,StringType), (b,IntegerType), (c,DoubleType))

scala> df.dtypes.map(_._2).toSet
res3: scala.collection.immutable.Set[String] = Set(StringType, IntegerType, DoubleType)
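
The same dtypes accessor exists in PySpark, where it returns plain type strings rather than DataType names. A minimal sketch, assuming a SparkSession named spark:

df = spark.createDataFrame([("ABC", 10, 20.4)], ["a", "b", "c"])
print(df.dtypes)                  # [('a', 'string'), ('b', 'bigint'), ('c', 'double')]
print({t for _, t in df.dtypes})  # {'string', 'bigint', 'double'}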