Pandas DENSE RANK
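The fastest solution is factorize, which numbers values in order of first appearance (so it matches a dense rank when Year is already sorted, as in the example):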
df['Rank'] = pd.factorize(df.Year)[0] + 1
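A minimal sketch of how this behaves, assuming the four-row frame shown in the example output of this thread. The `shuffled` frame is a hypothetical reordering added only for illustration. Because factorize assigns codes in order of first appearance, passing `sort=True` is one way to keep the result equal to a dense rank when Year is not already sorted:

```python
import pandas as pd

# Sample frame assumed from the example output in this thread.
df = pd.DataFrame({"Year": [2012, 2013, 2013, 2014],
                   "Value": [10, 20, 25, 30]})

# Sorted input: first-appearance order equals value order,
# so the codes (plus one) are a dense rank.
df["Rank"] = pd.factorize(df["Year"])[0] + 1

# Hypothetical unsorted input: rows reordered for illustration.
shuffled = df.iloc[[3, 0, 2, 1]].reset_index(drop=True)

# Without sort=True the codes follow first appearance, not value order.
shuffled["Appearance"] = pd.factorize(shuffled["Year"])[0] + 1

# With sort=True the unique values are sorted first,
# so the codes follow value order again.
shuffled["Rank"] = pd.factorize(shuffled["Year"], sort=True)[0] + 1
```

If the column is already sorted, the extra `sort=True` should not change the result, though it may cost some of the speed advantage.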
Timings:
# len(df) = 40k
df = pd.concat([df]*10000).reset_index(drop=True)

In [13]: %timeit df['Rank'] = df.Year.rank(method='dense').astype(int)
1000 loops, best of 3: 1.55 ms per loop

In [14]: %timeit df['Rank1'] = df.Year.astype('category').cat.codes + 1
1000 loops, best of 3: 1.22 ms per loop

In [15]: %timeit df['Rank2'] = pd.factorize(df.Year)[0] + 1
1000 loops, best of 3: 737 µs per loop
You can convert the Year column to a categorical and then take its codes, adding one because the codes are zero-indexed and your example starts at one.
df['Rank'] = df.Year.astype('category').cat.codes + 1

>>> df
   Year  Value  Rank
0  2012     10     1
1  2013     20     2
2  2013     25     2
3  2014     30     3
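As a quick cross-check, the three approaches discussed in this thread can be run side by side. This is only a sketch on the same four-row sample frame (assumed from the output above), and the variable names are illustrative:

```python
import pandas as pd

# Sample frame assumed from the example output in this thread.
df = pd.DataFrame({"Year": [2012, 2013, 2013, 2014],
                   "Value": [10, 20, 25, 30]})

# 1. Built-in dense ranking; the result is float, so cast to int.
by_rank = df["Year"].rank(method="dense").astype(int)

# 2. Categorical codes; categories are sorted, codes start at zero.
by_codes = df["Year"].astype("category").cat.codes + 1

# 3. factorize; sort=True keeps the codes in value order
#    even if the column is not sorted.
by_factorize = pd.factorize(df["Year"], sort=True)[0] + 1

# All three should agree on this sample.
same = by_rank.tolist() == by_codes.tolist() == list(by_factorize)
```

One difference to keep in mind: with missing values, the category codes and factorize both use -1 for NaN (which becomes 0 after adding one), while rank leaves NaN in place and the cast to int then fails.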