Matrix Multiplication in hadoop


The idea is that you can break matrix multiplication into subproblems, with something like the Strassen algorithm, and then send those subproblems to a bunch of different computers. Once the subproblems are finished, summing their results into the final matrix can also be handled with MapReduce. The key to using MapReduce is that the subproblems can all be computed in parallel, which is what MapReduce is for.
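
To make the map and reduce steps concrete, here is a minimal single-machine sketch in Python. It is not Hadoop code and it does not use Strassen: it assumes the simplest decomposition, where each subproblem is one partial product `A[i][j] * B[j][k]`, and the names `map_phase`, `reduce_phase`, and `multiply` are made up for illustration.

```python
from collections import defaultdict


def map_phase(A, B):
    """Map step: emit ((i, k), A[i][j] * B[j][k]) for every matching pair.

    Each emitted pair is independent of the others, so on a cluster these
    could be computed on different machines.
    """
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            for k, b in enumerate(B[j]):
                yield (i, k), a * b


def reduce_phase(pairs):
    """Reduce step: sum all partial products that share an output cell (i, k)."""
    sums = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
    return dict(sums)


def multiply(A, B):
    """Run the map and reduce steps locally and assemble the result matrix."""
    cells = reduce_phase(map_phase(A, B))
    n_rows, n_cols = len(A), len(B[0])
    return [[cells[(i, k)] for k in range(n_cols)] for i in range(n_rows)]


if __name__ == "__main__":
    print(multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In a real job the framework would group the emitted pairs by key between the two steps; here a dictionary stands in for that shuffle.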


A couple of frameworks, such as Apache Hama, include an implementation of PageRank. Apache Giraph also supports PageRank.

MapReduce is not well suited to iterative graph algorithms such as PageRank, so Google published the Pregel paper on large-scale graph computing.
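
A rough way to see the mismatch is to write PageRank as a loop of map/reduce passes. The sketch below is plain Python, not Hadoop, Hama, Giraph, or Pregel code. It assumes a small in-memory graph with no dangling pages and a damping factor of 0.85, and the function names are hypothetical. Each call to `pagerank_pass` corresponds to what would be one full MapReduce job, so many iterations mean many jobs.

```python
from collections import defaultdict


def pagerank_pass(graph, ranks, damping=0.85):
    """One iteration, written as a map step followed by a reduce step.

    graph maps each page to the list of pages it links to. Every page is
    assumed to have at least one outgoing link (no dangling pages).
    """
    n = len(graph)
    contributions = defaultdict(float)
    # Map: each page sends an equal share of its rank to every page it links to.
    for page, links in graph.items():
        share = ranks[page] / len(links)
        for target in links:
            contributions[target] += share
    # Reduce: combine the shares received by each page into its new rank.
    return {
        page: (1 - damping) / n + damping * contributions[page]
        for page in graph
    }


def pagerank(graph, iterations=20):
    """Repeat the pass; on a cluster each repetition would be a separate job."""
    ranks = {page: 1 / len(graph) for page in graph}
    for _ in range(iterations):
        ranks = pagerank_pass(graph, ranks)
    return ranks


if __name__ == "__main__":
    print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```

The graph itself never changes between passes, only the ranks do, which is the kind of repeated work that vertex-centric systems in the Pregel style are designed around.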


The link you provided explains this as clearly as anything (and it includes source code). If you're still struggling with the concepts behind the operations, you should probably start by reading up on matrix and linear algebra so that you understand the underlying mathematics.