FloatingPointError from PyMC in sampling from a Dirichlet distribution FloatingPointError from PyMC in sampling from a Dirichlet distribution python python

FloatingPointError from PyMC in sampling from a Dirichlet distribution


I found this nugget in their documentation:

The stochastic variable cutoff cannot be smaller than the largest element of D, otherwise D’s density would be zero. The standard Metropolis step method can handle this case without problems; it will propose illegal values occasionally, but these will be rejected.

This would lead me to believe that the dtype=np.float (which is essential has the same range as float), may not be the method you want to go about. The documentation says it needs to be a numpy dtype, but it just needs to be a dtype that converts to a numpy dtype object and in Python2 (correct me if I'm wrong) number dtypes were fixed size types meaning they're the same. Maybe utilizing the Decimal module would be an option. This way you can set the level of precision to encapsulate expected value ranges, and pass it to your extended stochastic method where it would be converted.

from decimal import Decimal, getcontextgetcontext().prec = 15dtype=Decimal

I don't know this wouldn't still be truncated once the numpy library got a hold of it, or if it would respect the inherited level of precision. I have no accurate method of testing this, but give it a try and let me know how that works for you.

Edit: I tested the notion of precision inheritance and it would seem to hold:

>>> from decimal import Decimal, getcontext>>> getcontext().prec = 10>>> Decimal(1) / Decimal(7)Decimal('0.1428571429')>>> np.float(Decimal(1) / Decimal(7))0.1428571429>>> getcontext().prec = 15>>> np.float(Decimal(1) / Decimal(7))0.142857142857143>>> 


If you do get small numbers, it might simply be too small for a float. This is typically also what the logarithms are there for to avoid. What if you use dtype=np.float64?


As you have suggested at the end of your question, the issue is with too small numbers that are float-casted to 0. One solution could be to tweak a little the source code and replace the division with for example np.divide and in the "where" condition to add some explicit casting for to small values to a given threshold.