Alternative to Scipy mode function in Numpy?


At the time of writing, the scipy.stats.mode function is defined with the following code, which relies only on numpy:

import numpy as np

def mode(a, axis=0):
    scores = np.unique(np.ravel(a))       # get ALL unique values
    testshape = list(a.shape)
    testshape[axis] = 1                   # result keeps a length-1 dimension along axis
    oldmostfreq = np.zeros(testshape)
    oldcounts = np.zeros(testshape)
    for score in scores:
        template = (a == score)
        # count occurrences of this score along the chosen axis
        counts = np.expand_dims(np.sum(template, axis), axis)
        # keep the score wherever it beats the best count seen so far
        mostfrequent = np.where(counts > oldcounts, score, oldmostfreq)
        oldcounts = np.maximum(counts, oldcounts)
        oldmostfreq = mostfrequent
    return mostfrequent, oldcounts

Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609


If you know there are not many different values (relative to the size of the input "itemArray"), something like this could be efficient:

uniqueValues = np.unique(itemArray).tolist()
uniqueCounts = [len(np.nonzero(itemArray == uv)[0])
                for uv in uniqueValues]
modeIdx = uniqueCounts.index(max(uniqueCounts))
mode = uniqueValues[modeIdx]  # modeIdx indexes uniqueValues
# All counts as a map
valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
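
For a 1-D input, a shorter sketch of the same idea is to let np.unique do the counting through its return_counts=True argument (available in NumPy 1.9 and later). The helper name mode_1d and the sample array are hypothetical, chosen only for illustration. Ties go to the smallest value, because np.unique returns sorted values and np.argmax picks the first maximum.

```python
import numpy as np

def mode_1d(values):
    """Return (most frequent value, its count) for a flattened array-like.

    Hypothetical helper, not part of NumPy or SciPy.
    """
    values = np.asarray(values).ravel()
    if values.size == 0:
        raise ValueError("mode_1d requires at least one element")
    # unique_values is sorted; counts[i] is how often unique_values[i] occurs
    unique_values, counts = np.unique(values, return_counts=True)
    idx = np.argmax(counts)  # first index of the largest count
    return unique_values[idx], counts[idx]

item_array = np.array([3, 1, 2, 3, 2, 3, 1])
mode_value, mode_count = mode_1d(item_array)
print(mode_value, mode_count)  # prints: 3 3
```

This avoids the Python-level loop over unique values, so it may also be reasonable when the number of distinct values is large.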