Spark DataFrame: pivot and group based on columns
You can use collect_list if you can live with an empty list in cells that have no value:
df.groupBy("id").pivot("app").agg(collect_list("customer")).show

+---+--------+----+--------+
| id|      bc|  fe|      fw|
+---+--------+----+--------+
|id3|[TR, WM]|  []|      []|
|id1|      []|[WM]|[CS, WM]|
|id2|      []|  []|    [CS]|
+---+--------+----+--------+
Using concat_ws we can join each array into a single comma-separated string, which removes the square brackets:
df.groupBy("id").pivot("app").agg(concat_ws(",",collect_list("customer")))
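For completeness, here is a minimal self-contained sketch. It assumes a local SparkSession and sample data reconstructed from the output table above; the session setup and the `PivotDemo` name are illustrative, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{collect_list, concat_ws}

// Local session for demonstration only; in a real job a session usually already exists.
val spark = SparkSession.builder().master("local[*]").appName("PivotDemo").getOrCreate()
import spark.implicits._

// Sample rows reconstructed from the pivoted output shown above.
val df = Seq(
  ("id1", "fe", "WM"),
  ("id1", "fw", "CS"),
  ("id1", "fw", "WM"),
  ("id2", "fw", "CS"),
  ("id3", "bc", "TR"),
  ("id3", "bc", "WM")
).toDF("id", "app", "customer")

// Pivot on "app"; each cell becomes the group's customers joined by commas.
// Cells with no matching rows come out as an empty string rather than null.
val pivoted = df.groupBy("id")
  .pivot("app")
  .agg(concat_ws(",", collect_list("customer")))

pivoted.show()
```

Note that collect_list gives no ordering guarantee across partitions, so if the order of customers within a cell matters you should sort before aggregating.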