How to resample a dataframe with different functions applied to each column?


With pandas 0.18 the resample API changed (see the docs). So for pandas >= 0.18 the answer is:

    In [31]: frame.resample('1H').agg({'radiation': np.sum, 'tamb': np.mean})
    Out[31]:
                             tamb   radiation
    2012-04-05 08:00:00  5.161235  279.507182
    2012-04-05 09:00:00  4.968145  290.941073
    2012-04-05 10:00:00  4.478531  317.678285
    2012-04-05 11:00:00  4.706206  335.258633
    2012-04-05 12:00:00  2.457873    8.655838
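For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The sample frame is made up (one reading per minute, constant radiation, steadily rising tamb), and it assumes a reasonably recent pandas. It passes the aggregation names as strings and spells the hourly rule as "60min", since the '1H' alias is deprecated in recent releases.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one reading per minute from 08:00 to 12:00.
index = pd.date_range("2012-04-05 08:00", "2012-04-05 12:00", freq="min")
frame = pd.DataFrame(
    {
        "radiation": np.ones(len(index)),            # constant, so hourly sums are easy to check
        "tamb": np.arange(len(index), dtype=float),  # steadily rising values
    },
    index=index,
)

# One aggregation per column: sum the radiation, average the temperature.
hourly = frame.resample("60min").agg({"radiation": "sum", "tamb": "mean"})
print(hourly)
```

With this made-up data the last bin (12:00) contains a single row, which mirrors the short final hour in the output above.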

Old Answer:

I am answering my own question to reflect the time-series-related changes in pandas >= 0.8 (all other answers are outdated).

Using pandas >= 0.8 the answer is:

    In [30]: frame.resample('1H', how={'radiation': np.sum, 'tamb': np.mean})
    Out[30]:
                             tamb   radiation
    2012-04-05 08:00:00  5.161235  279.507182
    2012-04-05 09:00:00  4.968145  290.941073
    2012-04-05 10:00:00  4.478531  317.678285
    2012-04-05 11:00:00  4.706206  335.258633
    2012-04-05 12:00:00  2.457873    8.655838


You can also downsample using the asof method of pandas.DateRange objects.

    In [21]: hourly = pd.DateRange(datetime.datetime(2012, 4, 5, 8, 0),
    ...                          datetime.datetime(2012, 4, 5, 12, 0),
    ...                          offset=pd.datetools.Hour())

    In [22]: frame.groupby(hourly.asof).size()
    Out[22]:
    key_0
    2012-04-05 08:00:00    60
    2012-04-05 09:00:00    60
    2012-04-05 10:00:00    60
    2012-04-05 11:00:00    60
    2012-04-05 12:00:00     1

    In [23]: frame.groupby(hourly.asof).agg({'radiation': np.sum, 'tamb': np.mean})
    Out[23]:
                         radiation  tamb
    key_0
    2012-04-05 08:00:00  271.54     4.491
    2012-04-05 09:00:00  266.18     5.253
    2012-04-05 10:00:00  292.35     4.959
    2012-04-05 11:00:00  283.00     5.489
    2012-04-05 12:00:00  0.5414     9.532
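pd.DateRange and pd.datetools no longer exist in current pandas, so the session above will not run there as written. A rough sketch of the same group-by-hour idea on a recent version is below. Note that it swaps in a different technique: instead of asof, it floors each timestamp to the start of its hour and groups on that key. The sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one reading per minute from 08:00 to 12:00.
index = pd.date_range("2012-04-05 08:00", "2012-04-05 12:00", freq="min")
frame = pd.DataFrame(
    {"radiation": np.ones(len(index)), "tamb": np.arange(len(index), dtype=float)},
    index=index,
)

# Round every timestamp down to the start of its hour and group on that key.
hour_start = frame.index.floor("60min")

counts = frame.groupby(hour_start).size()
by_hour = frame.groupby(hour_start).agg({"radiation": "sum", "tamb": "mean"})

print(counts)
print(by_hour)
```

As in the session above, size() is a quick way to check how many rows fall into each hour before trusting the aggregated values.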


To tantalize you, in pandas 0.8.0 (under heavy development in the timeseries branch on GitHub), you'll be able to do:

    In [5]: frame.convert('1h', how='mean')
    Out[5]:
                         radiation      tamb
    2012-04-05 08:00:00   7.840989  8.446109
    2012-04-05 09:00:00   4.898935  5.459221
    2012-04-05 10:00:00   5.227741  4.660849
    2012-04-05 11:00:00   4.689270  5.321398
    2012-04-05 12:00:00   4.956994  5.093980

The above-mentioned methods are the right strategy with the current production version of pandas.
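The convert call above is a preview from a development branch; the other answers in this thread use resample for the same job. As a hedged sketch of what "one function for every column" looks like on a recent pandas, assuming a made-up per-minute frame:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one reading per minute from 08:00 to 12:00.
index = pd.date_range("2012-04-05 08:00", "2012-04-05 12:00", freq="min")
frame = pd.DataFrame(
    {"radiation": np.ones(len(index)), "tamb": np.arange(len(index), dtype=float)},
    index=index,
)

# Apply the same aggregation (the mean) to every column in each hourly bin.
hourly_mean = frame.resample("60min").mean()
print(hourly_mean)
```

Use the dict form of agg only when the columns need different functions; for a single function the direct method call is shorter.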