What would be a good application for an enhanced version of MapReduce that shares information between Mappers?


The enhancement I'm building allows some data to be shared between the mappers while they are computing.

Apache Giraph is based on Google Pregel, which in turn is based on BSP (Bulk Synchronous Parallel), and is used for graph processing. In BSP, the processes share data with each other during the communication phase.

Giraph is implemented on top of Hadoop and runs as a map-only job. In plain MapReduce there is generally no communication between the mappers, but in Giraph the mappers communicate with each other during the communication phase of BSP.
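To make the BSP idea concrete, here is a minimal single-process sketch of the classic "propagate the maximum value" example often used to explain Pregel-style processing. It is plain Java and does not use the Giraph or Hadoop APIs; the class name `BspMaxSketch`, the method `propagateMax`, and the adjacency-array graph layout are assumptions made up for illustration. Each pass of the loop stands for one superstep: local computation, then message sending, then a barrier after which the messages become visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: simulates BSP supersteps in one thread.
class BspMaxSketch {

    // adjacency[v] lists the neighbours of vertex v; initial[v] is its start value.
    static int[] propagateMax(int[][] adjacency, int[] initial) {
        int n = initial.length;
        int[] value = Arrays.copyOf(initial, n);

        // Superstep 0: every vertex sends its own value to its neighbours.
        List<List<Integer>> inbox = emptyInboxes(n);
        for (int v = 0; v < n; v++) {
            for (int neighbour : adjacency[v]) {
                inbox.get(neighbour).add(value[v]);
            }
        }

        boolean messagesPending = true;
        while (messagesPending) {
            // Messages sent now are only readable in the next superstep.
            List<List<Integer>> nextInbox = emptyInboxes(n);
            messagesPending = false;

            for (int v = 0; v < n; v++) {
                // Local computation: largest value seen so far.
                int best = value[v];
                for (int received : inbox.get(v)) {
                    best = Math.max(best, received);
                }
                // Communication: only vertices whose value changed send messages.
                if (best > value[v]) {
                    value[v] = best;
                    for (int neighbour : adjacency[v]) {
                        nextInbox.get(neighbour).add(best);
                        messagesPending = true;
                    }
                }
            }
            // Barrier: the next superstep starts with the freshly sent messages.
            inbox = nextInbox;
        }
        return value;
    }

    private static List<List<Integer>> emptyInboxes(int n) {
        List<List<Integer>> boxes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boxes.add(new ArrayList<>());
        }
        return boxes;
    }

    public static void main(String[] args) {
        int[][] chain = {{1}, {0, 2}, {1, 3}, {2}}; // path graph 0-1-2-3
        int[] result = propagateMax(chain, new int[] {3, 6, 2, 1});
        System.out.println(Arrays.toString(result)); // prints [6, 6, 6, 6]
    }
}
```

In a real BSP system the vertices would be partitioned across workers (the mappers, in Giraph's case) and the buffer swap would be a distributed barrier, but the compute, communicate, synchronize rhythm is the same.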

You might also be interested in Apache Hama, which implements BSP and can be used for more than graph processing.

There are probably reasons why mappers don't communicate in MapReduce; for example, independent map tasks can simply be re-executed if one of them fails. Have you considered these factors in your enhancement?

What are some other good real-world applications of this kind of enhancement?

Graph processing, similar to Giraph, is one thing I can think of. Check out the different use cases for BSP; some might be applicable to this kind of enhancement. I am also very interested in what others have to say on this.