How to handle log(0) when using cross entropy
If you don't mind the dependency on SciPy, you can use scipy.special.xlogy. You would replace the expression
np.multiply(np.log(predY), Y) + np.multiply((1 - Y), np.log(1 - predY))
with
xlogy(Y, predY) + xlogy(1 - Y, 1 - predY)
If you expect predY to contain very small values, you might get better numerical results using scipy.special.xlog1py in the second term:
xlogy(Y, predY) + xlog1py(1 - Y, -predY)
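As a minimal sketch of the replacement above, assuming Y and predY are NumPy arrays of the same shape and that the cost is the mean over m samples (the function name cross_entropy_xlogy and the sample data are illustrative, not from the original question):

```python
import numpy as np
from scipy.special import xlogy, xlog1py

def cross_entropy_xlogy(Y, predY):
    """Mean binary cross entropy where 0 * log(0) is treated as 0."""
    m = Y.size
    # xlogy(x, y) computes x * log(y) and returns 0 when x == 0.
    # xlog1py(x, y) computes x * log1p(y) with the same convention,
    # so xlog1py(1 - Y, -predY) equals (1 - Y) * log(1 - predY).
    return -np.sum(xlogy(Y, predY) + xlog1py(1 - Y, -predY)) / m

# Hypothetical data: the first two predictions are exactly 1 and 0.
Y = np.array([1.0, 0.0, 1.0, 0.0])
predY = np.array([1.0, 0.0, 0.5, 0.5])
cost = cross_entropy_xlogy(Y, predY)
```

Note that this only removes the 0 * log(0) case. A confidently wrong prediction (for example Y = 1 with predY = 0) still yields an infinite cost.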
Alternatively, knowing that the values in Y are either 0 or 1, you can compute the cost in an entirely different way:
Yis1 = Y == 1
cost = -(np.log(predY[Yis1]).sum() + np.log(1 - predY[~Yis1]).sum())/m
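To see why the masked version helps, here is a small comparison against the original expression. It is only a sketch: the function names and sample data are made up for illustration, and m is assumed to be the number of samples.

```python
import numpy as np

def cross_entropy_naive(Y, predY):
    """Original formula: 0 * log(0) evaluates to nan."""
    m = Y.size
    # Suppress the divide/invalid warnings that log(0) and 0 * inf raise.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.multiply(np.log(predY), Y) + np.multiply(1 - Y, np.log(1 - predY))
    return -np.sum(terms) / m

def cross_entropy_masked(Y, predY):
    """Only take log(predY) where Y == 1 and log(1 - predY) where Y == 0."""
    m = Y.size
    Yis1 = Y == 1
    return -(np.log(predY[Yis1]).sum() + np.log(1 - predY[~Yis1]).sum()) / m

# Hypothetical data with perfectly confident, correct predictions.
Y = np.array([1, 0, 1, 0])
predY = np.array([1.0, 0.0, 0.5, 0.5])

naive_cost = cross_entropy_naive(Y, predY)    # nan
masked_cost = cross_entropy_masked(Y, predY)  # finite
```

The masked version never evaluates the log term that would be multiplied by zero, so the nan does not appear.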
How do you usually handle this issue?
Add a small number (something like 1e-15) to predY. This barely changes the predictions, and it solves the log(0) issue.
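A rough sketch of this idea follows. It assumes the constant is added inside each log argument, so that log(1 - predY) is protected as well when predY is exactly 1; the function name and the eps parameter are illustrative choices, not part of the original answer.

```python
import numpy as np

def cross_entropy_eps(Y, predY, eps=1e-15):
    """Mean binary cross entropy with a small constant inside each log."""
    m = Y.size
    # Assumption: eps is added to both log arguments, not only to predY.
    terms = Y * np.log(predY + eps) + (1 - Y) * np.log(1 - predY + eps)
    return -np.sum(terms) / m

# Hypothetical data: predictions are exactly 1 and 0, and both are correct.
Y = np.array([1.0, 0.0])
predY = np.array([1.0, 0.0])
cost = cross_entropy_eps(Y, predY)
```

The result is finite and very close to 0, which is what you would expect for perfect predictions.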
BTW, if your algorithm outputs exact zeros and ones, it might be useful to check the histogram of the returned probabilities. When an algorithm is that sure about its predictions, it can be a sign of overfitting.
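One possible way to do that check with NumPy is shown below. The sample predictions and the choice of 10 bins are arbitrary assumptions made for illustration.

```python
import numpy as np

# Hypothetical model outputs; four of the five are exactly 0 or 1.
predY = np.array([0.0, 0.0, 1.0, 1.0, 0.5])

# Count predictions in 10 equal-width bins over [0, 1].
counts, edges = np.histogram(predY, bins=10, range=(0.0, 1.0))

# Share of predictions that fall in the two extreme bins.
extreme_fraction = (counts[0] + counts[-1]) / counts.sum()
```

If most of the mass sits in the first and last bins, the model is producing very confident outputs, which may be worth investigating.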
One common way to deal with log(x) and y / x, where x is always non-negative but can become 0, is to add a small constant (as written by Jakub).
You can also clip the value (e.g. with tf.clip_by_value or np.clip).
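A minimal sketch of the clipping approach with np.clip, assuming predictions are clipped to the interval [eps, 1 - eps] before taking logs (the function name and the eps default are illustrative):

```python
import numpy as np

def cross_entropy_clipped(Y, predY, eps=1e-15):
    """Mean binary cross entropy with predictions clipped away from 0 and 1."""
    m = Y.size
    # Keep predictions inside [eps, 1 - eps] so both logs stay finite.
    p = np.clip(predY, eps, 1 - eps)
    return -np.sum(Y * np.log(p) + (1 - Y) * np.log(1 - p)) / m

# Hypothetical data: the last prediction is confidently wrong.
Y = np.array([1.0, 0.0, 1.0])
predY = np.array([1.0, 0.0, 0.0])
cost = cross_entropy_clipped(Y, predY)
```

Unlike the unclipped formula, a confidently wrong prediction here produces a large but finite cost, bounded by -log(eps) per sample.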