Alternative to SciPy's mode function in NumPy?
The scipy.stats.mode function is defined with this code, which relies only on NumPy:
    import numpy as np

    def mode(a, axis=0):
        scores = np.unique(np.ravel(a))  # get ALL unique values
        testshape = list(a.shape)
        testshape[axis] = 1
        oldmostfreq = np.zeros(testshape)
        oldcounts = np.zeros(testshape)
        for score in scores:
            template = (a == score)
            counts = np.expand_dims(np.sum(template, axis), axis)
            mostfrequent = np.where(counts > oldcounts, score, oldmostfreq)
            oldcounts = np.maximum(counts, oldcounts)
            oldmostfreq = mostfrequent
        return mostfrequent, oldcounts
Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609
If you know there are not many different values (relative to the size of the input "itemArray"), something like this could be efficient:
    import numpy as np

    uniqueValues = np.unique(itemArray).tolist()
    uniqueCounts = [len(np.nonzero(itemArray == uv)[0]) for uv in uniqueValues]
    modeIdx = uniqueCounts.index(max(uniqueCounts))
    # modeIdx indexes the unique values, not the original array
    mode = uniqueValues[modeIdx]

    # All counts as a map
    valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
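As a possible shortcut, np.unique can return the counts itself through its return_counts argument, which avoids the per-value comparison loop. The sketch below assumes you want the mode of the flattened input. The helper name mode_1d is hypothetical and not part of NumPy or SciPy. On ties it returns the smallest of the tied values, because np.unique sorts its output and np.argmax picks the first maximum.

```python
import numpy as np

def mode_1d(values):
    """Return (mode, count) of the flattened input (hypothetical helper)."""
    flat = np.ravel(np.asarray(values))
    # Sorted unique values and how often each one occurs
    unique_values, unique_counts = np.unique(flat, return_counts=True)
    # argmax returns the first maximum, i.e. the smallest value among ties
    idx = np.argmax(unique_counts)
    return unique_values[idx], unique_counts[idx]

item_array = np.array([3, 1, 2, 3, 2, 3])
mode_value, mode_count = mode_1d(item_array)
print(mode_value, mode_count)
```

This only covers the flattened case. For a per-axis mode, the loop-based function quoted above is still the one to use.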