Is there an open-source, Pregel-like framework for distributed processing of large graphs?
- Apache Giraph http://giraph.apache.org
- Phoebus https://github.com/xslogic/phoebus
- Bagel https://github.com/mesos/spark/pull/48
- Hama http://hama.apache.org/
- Signal-Collect http://code.google.com/p/signal-collect/
- HipG http://www.cs.vu.nl/~ekr/hipg/
The main Hadoop project for distributed graph processing is the Hama project. It's still in incubation, though.
The project has split its work into two areas: a matrix package and a graph package.
Update:
A better option would be the Apache Giraph project, which is based on Google Pregel.
Yes: a new project called Golden Orb is an open-source Pregel implementation written in Java that runs on both HBase and Cassandra.
It has been submitted to the Apache Incubator for approval, and Ravel, the company behind Golden Orb, said they are releasing it this month (http://www.raveldata.com/goldenorb/).
See http://www.quora.com/Graph-Databases/What-open-source-graph-databases-support-horizontal-scaling
UPDATE: GraphX is GraphLab2 on Spark, implemented by Joey Gonzalez, the creator of GraphLab2.
Spark's unique primitives make GraphX-Pregel the fastest JVM-based Pregel implementation. Spark is written in Scala, but it also has Java and Python APIs.
See:
- GraphX: A Resilient Distributed Graph System on Spark (PDF)
- Introduction to GraphX, by Joseph Gonzalez, Reynold Xin - UC Berkeley AmpLab 2013 (YouTube)
- My Hacker News comment/overview on Spark.
P.S. There is also Bagel, which was the first cut at Pregel on Spark. It works; however, GraphX will be the way forward.
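For anyone unsure what "Pregel-like" means in the projects above, here is a rough single-machine sketch of the vertex-centric superstep model, using the maximum-value propagation example. It is not the API of GraphX, Giraph, Bagel, or any other framework mentioned here; the function name `pregel_max` and its parameters are made up for illustration, and a real implementation would partition vertices across workers.

```python
from collections import defaultdict


def pregel_max(values, edges, max_supersteps=50):
    """Propagate the maximum vertex value through a directed graph.

    values: dict mapping vertex -> initial value (hypothetical input format)
    edges:  dict mapping vertex -> list of out-neighbours; every neighbour
            is assumed to also appear as a key in `values`
    """
    values = dict(values)  # work on a copy so the caller's dict is untouched

    # Superstep 0: every vertex is active and sends its value to its neighbours.
    inbox = defaultdict(list)
    for vertex, value in values.items():
        for dst in edges.get(vertex, []):
            inbox[dst].append(value)

    for _ in range(max_supersteps):
        if not inbox:
            # No messages in flight: every vertex has voted to halt.
            break
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # The vertex learned a larger value: update and notify neighbours.
                values[vertex] = best
                for dst in edges.get(vertex, []):
                    outbox[dst].append(best)
            # Otherwise the vertex stays silent, i.e. votes to halt.
        inbox = outbox  # messages are delivered at the next superstep
    return values


graph_values = {"a": 3, "b": 6, "c": 2, "d": 1}
graph_edges = {"a": ["b"], "b": ["a", "d"], "c": ["b", "d"], "d": ["c"]}
result = pregel_max(graph_values, graph_edges)
```

Each pass of the loop stands in for one superstep: vertices only see messages sent in the previous superstep, and the computation ends once no vertex sends anything.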