Pandas date_range to generate monthly data at beginning of the month


You can do this by changing the freq argument from 'M' to 'MS':

import pandas

d = pandas.date_range(start='1/1/1980', end='11/1/1990', freq='MS')
print(d)

This should now print:

DatetimeIndex(['1980-01-01', '1980-02-01', '1980-03-01', '1980-04-01',
               '1980-05-01', '1980-06-01', '1980-07-01', '1980-08-01',
               '1980-09-01', '1980-10-01',
               ...
               '1990-02-01', '1990-03-01', '1990-04-01', '1990-05-01',
               '1990-06-01', '1990-07-01', '1990-08-01', '1990-09-01',
               '1990-10-01', '1990-11-01'],
              dtype='datetime64[ns]', length=131, freq='MS', tz=None)

See the offset aliases section of the pandas documentation. It states that 'M' is the month end frequency, while 'MS' is the month start frequency. Note that pandas 2.2 and later deprecate 'M' in favor of 'ME' for month end; 'MS' is unchanged.
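To see the difference between the two frequencies side by side, here is a small sketch. It passes the offset objects pd.offsets.MonthEnd() and pd.offsets.MonthBegin() instead of the string aliases, on the assumption that this avoids the alias differences between pandas versions; the short date window is chosen only for illustration.

```python
import pandas as pd

# Month end frequency: each entry is the last calendar day of a month.
month_end = pd.date_range(
    start="1980-01-01", end="1980-04-01", freq=pd.offsets.MonthEnd()
)

# Month start frequency: each entry is the first calendar day of a month.
month_start = pd.date_range(
    start="1980-01-01", end="1980-04-01", freq=pd.offsets.MonthBegin()
)

print(month_end)    # 1980-01-31, 1980-02-29, 1980-03-31
print(month_start)  # 1980-01-01, 1980-02-01, 1980-03-01, 1980-04-01
```

With the same start and end, the month end range stops at March 31 because April 30 falls after the end date, while the month start range includes April 1.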


It is worth noting that with the 'MS' option of pandas.date_range() suggested by Dimitris, a start date that does not fall on the first of a month is rolled forward, so the range begins at the start of the next month, which may not be what you expect:

import pandas as pd

start = "2020-03-08"
end = "2021-03-08"
pd.date_range(start, end, freq='MS')

results in

DatetimeIndex(['2020-04-01', '2020-05-01', '2020-06-01', '2020-07-01',
               '2020-08-01', '2020-09-01', '2020-10-01', '2020-11-01',
               '2020-12-01', '2021-01-01', '2021-02-01', '2021-03-01'],
              dtype='datetime64[ns]', freq='MS')

A workaround is to pass only the year and month of the start date, which pandas parses as the first day of that month:

pd.date_range(start[:7], end, freq='MS')

will then give

DatetimeIndex(['2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01',
               '2020-07-01', '2020-08-01', '2020-09-01', '2020-10-01',
               '2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01',
               '2021-03-01'],
              dtype='datetime64[ns]', freq='MS')
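The string slice start[:7] only works when the start date is an ISO-formatted string. As a sketch of an alternative that should also accept Timestamp or datetime inputs, the start can be snapped to the first of its month before building the range. The helper name month_starts is hypothetical and used here only for illustration.

```python
import pandas as pd

def month_starts(start, end):
    """Month start dates from the month containing `start` up to `end`.

    `month_starts` is an illustrative helper, not part of pandas.
    """
    # Move the start back to the first day of its month and drop any time part.
    first = pd.Timestamp(start).replace(day=1).normalize()
    return pd.date_range(first, end, freq="MS")

print(month_starts("2020-03-08", "2021-03-08"))
```

For the dates used above, this gives the same 13 entries, from 2020-03-01 to 2021-03-01.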

If you want to keep the original day of the month for each entry, you can then add an offset with pd.DateOffset() (here 7 days, because the start date falls on the 8th):

pd.date_range(start[:7], end, freq='MS') + pd.DateOffset(days=7)

will give

DatetimeIndex(['2020-03-08', '2020-04-08', '2020-05-08', '2020-06-08',
               '2020-07-08', '2020-08-08', '2020-09-08', '2020-10-08',
               '2020-11-08', '2020-12-08', '2021-01-08', '2021-02-08',
               '2021-03-08'],
              dtype='datetime64[ns]', freq=None)
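The 7 in the snippet above is hard-coded for a start date on the 8th. A possible generalization, sketched below, derives the number of days from the start date itself. The helper name same_day_each_month is hypothetical, and the final filter against the end date is an added assumption about the desired behavior.

```python
import pandas as pd

def same_day_each_month(start, end):
    """One date per month, on the same day of the month as `start`.

    `same_day_each_month` is an illustrative helper, not part of pandas.
    """
    start_ts = pd.Timestamp(start).normalize()
    end_ts = pd.Timestamp(end)
    # Month starts, beginning with the month that contains `start`.
    firsts = pd.date_range(start_ts.replace(day=1), end_ts, freq="MS")
    # Shift each month start forward to the day of the month of `start`.
    shifted = firsts + pd.DateOffset(days=start_ts.day - 1)
    # Shifting can push the last entry past `end`, so drop anything later.
    return shifted[shifted <= end_ts]

print(same_day_each_month("2020-03-08", "2021-03-08"))
```

This approach assumes the day exists in every month. For start days from the 29th to the 31st, the shift can spill into the following month (for example, February 1 plus 30 days), so the result should be checked in that case.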