Doubts about page rank [hadoop]



I suspect you are running too few iterations. Why 10? Why 100? Or 100000? Rather than fixing the count in advance, you should measure the mean or maximum change between the last two iterations, and use that to estimate the remaining error.
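A minimal sketch of such a convergence-driven loop (all names such as `links`, `damping`, and `tol` are illustrative, not from the original question): iterate until the largest per-page change drops below a tolerance instead of stopping after a fixed number of rounds.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=1000):
    """links: dict mapping page -> list of pages it links to.
    Returns a dict of page -> PageRank, iterating until the maximum
    change between successive iterations falls below `tol`."""
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        new_ranks = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * ranks[page] / len(outgoing)
                for target in outgoing:
                    new_ranks[target] += share
            else:
                # dangling node: spread its rank evenly over all pages
                for target in pages:
                    new_ranks[target] += damping * ranks[page] / n
        # convergence test: maximum change since the previous iteration
        max_change = max(abs(new_ranks[p] - ranks[p]) for p in pages)
        ranks = new_ranks
        if max_change < tol:
            break
    return ranks
```

Because the loop stops on the measured change, the accuracy is controlled directly, whatever the graph size.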

And PageRank is a probability: the sum over all pages should be 1. The statement "the sum of all the pageranks is equal to the total number of pages" is wrong.
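A quick sanity check you can run after the computation, assuming `ranks` is a dict of page to PR (illustrative names):

```python
def is_distribution(ranks, tol=1e-6):
    """Verify the PageRank vector is a probability distribution:
    non-negative entries whose sum is 1, within tolerance."""
    total = sum(ranks.values())
    return all(r >= 0 for r in ranks.values()) and abs(total - 1.0) < tol
```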

As for the other formula, it belongs to a different model and gives a different PR. Of course, you can use that one too, or both. But you cannot use one to check the other.


It depends on what base you choose (the default is 1). After each iteration you have to calculate

delta = (base - sum_of_ranks) / N

And then add delta to each rank (when the sum has leaked below the base, delta is positive). Only this way will you keep your ranks from draining away by the last iteration.
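That correction step can be sketched as follows, assuming base = 1 and a plain dict of ranks (names are illustrative):

```python
def renormalize(ranks, base=1.0):
    """Restore the total rank mass after an iteration that leaked some
    of it (e.g. through dangling nodes). Adding
    delta = (base - sum_of_ranks) / N to every rank brings the sum
    back to `base` exactly."""
    n = len(ranks)
    delta = (base - sum(ranks.values())) / n
    return {page: rank + delta for page, rank in ranks.items()}
```

Call this once per iteration, e.g. `ranks = renormalize(ranks)`, so the leakage never accumulates across iterations.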