Which function in Spark is used to combine two RDDs by keys? [python]


I would union the two RDDs and do a reduceByKey to merge the values.

(rdd1 union rdd2).reduceByKey(_ ++ _)
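Since the question is tagged python, the PySpark equivalent would be `rdd1.union(rdd2).reduceByKey(lambda ls, rs: ls + rs)` (assuming the values are lists, so `+` concatenates them). A plain-Python sketch of what union + reduceByKey computes, without a Spark runtime; `union` and `reduce_by_key` here are hypothetical stand-ins for the RDD operations:

```python
def union(rdd1, rdd2):
    # union simply concatenates the two datasets, duplicates and all
    return rdd1 + rdd2

def reduce_by_key(pairs, func):
    # group values by key, then fold each key's values together with func
    acc = {}
    for k, v in pairs:
        acc[k] = func(acc[k], v) if k in acc else v
    return sorted(acc.items())

rdd1 = [("a", [1]), ("b", [2])]
rdd2 = [("a", [3]), ("c", [4])]
merged = reduce_by_key(union(rdd1, rdd2), lambda ls, rs: ls + rs)
# merged == [("a", [1, 3]), ("b", [2]), ("c", [4])]
```

Note that keys appearing in only one of the two RDDs ("b" and "c" above) survive this approach, since union keeps every pair.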


Just use join and then map over the resulting RDD.

rdd1.join(rdd2).map { case (k, (ls, rs)) => (k, ls ++ rs) }
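In PySpark this would be `rdd1.join(rdd2).map(lambda kv: (kv[0], kv[1][0] + kv[1][1]))`. Unlike union + reduceByKey, join is an inner join: keys present in only one RDD are dropped. A plain-Python sketch of the join semantics, without Spark; `join` here is a hypothetical stand-in for the RDD operation:

```python
def join(rdd1, rdd2):
    # inner join: emit (k, (v1, v2)) for every pairing of values that share a key
    right = {}
    for k, v in rdd2:
        right.setdefault(k, []).append(v)
    return [(k, (v1, v2)) for k, v1 in rdd1 for v2 in right.get(k, [])]

rdd1 = [("a", [1]), ("b", [2])]
rdd2 = [("a", [3]), ("c", [4])]
joined = [(k, ls + rs) for k, (ls, rs) in join(rdd1, rdd2)]
# joined == [("a", [1, 3])] -- "b" and "c" are dropped by the inner join
```

So pick join when you only want keys common to both RDDs, and union + reduceByKey when every key should appear in the result.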