Spark & HCatalog?


You can use Spark SQL to read directly from a Hive table instead of going through HCatalog.

https://spark.apache.org/sql/
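
As a rough illustration of reading a Hive table through Spark SQL, here is a minimal PySpark sketch. It assumes pyspark is installed and that Spark can reach your Hive metastore (hive-site.xml on the classpath). The helper names (`hive_session`, `select_query`, `read_hive_table`) and the database and table names in the usage comment are made up for the example.

```python
def hive_session(app_name="hive-read-sketch"):
    """Create a SparkSession that can see the Hive metastore.

    Assumes pyspark is installed and Hive is configured for Spark.
    """
    # Lazy import: pyspark is only needed when a session is actually built.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # lets spark.sql() resolve Hive tables
        .getOrCreate()
    )


def select_query(database, table, columns=("*",), limit=None):
    """Build a simple SELECT statement for a Hive table (hypothetical helper)."""
    sql = f"SELECT {', '.join(columns)} FROM {database}.{table}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


def read_hive_table(spark, database, table, columns=("*",), limit=None):
    """Run the query through Spark SQL and return the result as a DataFrame."""
    return spark.sql(select_query(database, table, columns, limit))


# Usage (hypothetical database and table names):
#   spark = hive_session()
#   read_hive_table(spark, "default", "web_logs", limit=10).show()
```

If you only need the whole table, `spark.table("default.web_logs")` gives you the same DataFrame without writing any SQL.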

You can apply the same kinds of transformations you would write in Pig, such as filter, join, and group by, using Spark's Java, Scala, or Python API.
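
To make the Pig comparison concrete, here is a small PySpark sketch of a filter, a join, and a group by on DataFrames. The function name, the two input DataFrames, and their columns (`status`, `user_id`, `country`) are assumptions for the example, not anything from a real schema.

```python
def successful_hits_per_country(logs, users):
    """Count successful requests per country.

    Both arguments are Spark DataFrames with hypothetical schemas:
      logs(user_id, status, ...) and users(user_id, country, ...)
    """
    # Roughly Pig's FILTER: keep only rows with HTTP status 200.
    ok = logs.filter("status = 200")

    # Roughly Pig's JOIN ... BY user_id: attach user attributes to each hit.
    joined = ok.join(users, on="user_id", how="inner")

    # Roughly Pig's GROUP ... BY country followed by COUNT.
    return joined.groupBy("country").count()
```

The result is another DataFrame, so you can keep chaining transformations or call `.show()` on it to inspect a few rows.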


For reference, the following gist shows how to use an HCatalog InputFormat wrapper with Spark. It was written before Spark SQL existed:
https://gist.github.com/granturing/7201912


Our systems have both installed, so we can use either. Spark takes on the traits of the language you use with it (Scala, Python, and so on). For example, when you use Spark with Python, you can use many Python libraries inside your Spark jobs.
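
As a small sketch of what using Python libraries within Spark can look like, the function below relies on the standard library's `urllib.parse` and is then mapped over an RDD of URL strings. The names `extract_host` and `distinct_hosts` are made up for the example, and the RDD is assumed to hold one URL string per element.

```python
from urllib.parse import urlparse


def extract_host(url):
    """Return the hostname of a URL, or None if it has no network location."""
    return urlparse(url).hostname


def distinct_hosts(urls_rdd):
    """Apply the plain Python function to every element of a Spark RDD."""
    return (
        urls_rdd
        .map(extract_host)                   # ordinary Python code runs on the executors
        .filter(lambda host: host is not None)
        .distinct()
    )
```

Any library you call this way has to be available on the worker nodes as well as on the driver.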