
# Difference between map, applymap and apply methods in Pandas

Straight from Wes McKinney's Python for Data Analysis book, p. 132 (I highly recommend this book):

Another frequent operation is applying a function on 1D arrays to each column or row. DataFrame’s apply method does exactly this:

```
In : frame = DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In : frame
Out:
               b         d         e
Utah   -0.029638  1.081563  1.280300
Ohio    0.647747  0.831136 -1.549481
Texas   0.513416 -0.884417  0.195343
Oregon -0.485454 -0.477388 -0.309548

In : f = lambda x: x.max() - x.min()

In : frame.apply(f)
Out:
b    1.133201
d    1.965980
e    2.829781
dtype: float64
```

Many of the most common array statistics (like sum and mean) are DataFrame methods, so using apply is not necessary.
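To illustrate that remark, here is a minimal sketch that rebuilds `frame` from the values printed above (hard-coded so the numbers are reproducible) and computes the same per-column range with built-in methods. The names `via_apply` and `via_methods` are only illustrative.

```python
import pandas as pd

# Values copied from the book's printed output, so the result is reproducible.
frame = pd.DataFrame(
    [
        [-0.029638, 1.081563, 1.280300],
        [0.647747, 0.831136, -1.549481],
        [0.513416, -0.884417, 0.195343],
        [-0.485454, -0.477388, -0.309548],
    ],
    columns=list("bde"),
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

# apply calls the lambda once per column, passing that column as a Series.
via_apply = frame.apply(lambda x: x.max() - x.min())

# The same per-column range using built-in reductions, with no apply call.
via_methods = frame.max() - frame.min()

print(via_methods)
```

Both results are a Series indexed by column name; the built-in version avoids calling a Python function for every column.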

Element-wise Python functions can be used, too. Suppose you wanted to compute a formatted string from each floating point value in frame. You can do this with applymap:

```
In : format = lambda x: '%.2f' % x

In : frame.applymap(format)
Out:
            b      d      e
Utah    -0.03   1.08   1.28
Ohio     0.65   0.83  -1.55
Texas    0.51  -0.88   0.20
Oregon  -0.49  -0.48  -0.31
```

The reason for the name applymap is that Series has a map method for applying an element-wise function:

```
In : frame['e'].map(format)
Out:
Utah       1.28
Ohio      -1.55
Texas      0.20
Oregon    -0.31
Name: e, dtype: object
```

Summing up, `apply` works on a row / column basis of a DataFrame, `applymap` works element-wise on a DataFrame, and `map` works element-wise on a Series.
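As a compact, runnable version of that summary, the sketch below applies all three methods to a tiny made-up frame. One assumption to flag: `DataFrame.applymap` was deprecated in pandas 2.1 in favour of `DataFrame.map`, so the code picks whichever the installed version provides; the `elementwise` name is just a local alias, not a pandas API.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

# apply: the function receives a whole column (default) or a whole row (axis=1).
col_range = df.apply(lambda s: s.max() - s.min())
row_range = df.apply(lambda s: s.max() - s.min(), axis=1)

# Element-wise on a DataFrame. Use DataFrame.map where it exists (pandas >= 2.1)
# and fall back to applymap on older versions.
elementwise = pd.DataFrame.map if hasattr(pd.DataFrame, "map") else pd.DataFrame.applymap
squared = elementwise(df, lambda v: v * v)

# map: element-wise on a single Series.
doubled = df["a"].map(lambda v: v * 2)

print(col_range, row_range, squared, doubled, sep="\n\n")
```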

# Comparing `map`, `applymap` and `apply`: Context Matters

First major difference: DEFINITION

• `map` is defined on Series ONLY
• `applymap` is defined on DataFrames ONLY
• `apply` is defined on BOTH

Second major difference: INPUT ARGUMENT

• `map` accepts `dict`s, `Series`, or callable
• `applymap` and `apply` accept callables only
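A short sketch of the three kinds of argument `Series.map` accepts. The data and variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="A")

# 1. A dict: each element is looked up as a key.
from_dict = s.map({1: "a", 2: "b", 3: "c"})

# 2. A Series: it acts like a dict whose keys are its index labels.
lookup = pd.Series(["a", "b", "c"], index=[1, 2, 3])
from_series = s.map(lookup)

# 3. A callable: it is called once per element.
from_callable = s.map(lambda v: "abc"[v - 1])

print(from_dict.tolist(), from_series.tolist(), from_callable.tolist())
```

All three calls produce the same values here; the dict and Series forms are lookups, while the callable form runs Python code for each element.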

Third major difference: BEHAVIOR

• `map` is elementwise for Series
• `applymap` is elementwise for DataFrames
• `apply` also works elementwise on a Series (and row- or column-wise on a DataFrame), but is suited to more complex operations and aggregation. The behaviour and return value depend on the function.
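To make the last point concrete, here is a small sketch (with made-up data) of how the shape of `apply`'s result follows from what the function returns.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# A reducing function returns one scalar per column, so the result is a Series.
totals = df.apply(lambda col: col.sum())

# A function that returns a Series of the same length yields a DataFrame.
centered = df.apply(lambda col: col - col.mean())

# On a Series, an ordinary Python function is called once per element.
squares = df["a"].apply(lambda v: v ** 2)

print(type(totals).__name__, type(centered).__name__, type(squares).__name__)
```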

Fourth major difference (the most important one): USE CASE

• `map` is meant for mapping values from one domain to another, so is optimised for performance (e.g., `df['A'].map({1:'a', 2:'b', 3:'c'})`)
• `applymap` is good for elementwise transformations across multiple rows/columns (e.g., `df[['A', 'B', 'C']].applymap(str.strip)`)
• `apply` is for applying any function that cannot be vectorised (e.g., `df['sentences'].apply(nltk.sent_tokenize)`)
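The three use cases can be sketched on one toy frame. Assumptions: the data is invented, `split_sentences` is a naive hypothetical stand-in for `nltk.sent_tokenize` so the example runs without NLTK, and `elementwise` is a local alias that uses `DataFrame.map` where available because `applymap` was deprecated in pandas 2.1.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [" x ", "y ", " z"],
    "C": ["p ", " q", " r "],
    "sentences": ["One. Two.", "Three.", "Four. Five. Six."],
})

# map: translate values from one domain to another.
codes = df["A"].map({1: "a", 2: "b", 3: "c"})

# Element-wise cleanup across several columns (applymap on older pandas).
elementwise = pd.DataFrame.map if hasattr(pd.DataFrame, "map") else pd.DataFrame.applymap
stripped = elementwise(df[["B", "C"]], str.strip)


def split_sentences(text):
    """Naive, hypothetical sentence splitter: cut on full stops."""
    return [part.strip() + "." for part in text.split(".") if part.strip()]


# apply: run an arbitrary Python function that has no vectorised equivalent.
tokens = df["sentences"].apply(split_sentences)

print(codes.tolist(), stripped, tokens.tolist(), sep="\n")
```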

# Summarising Footnotes

1. `map`, when passed a dictionary/Series, looks up each element among the keys of that dictionary (or the index of that Series). Elements with no matching key become NaN in the output.
2. `applymap` in more recent versions has been optimised for some operations. You will find `applymap` slightly faster than `apply` in some cases. My suggestion is to test them both and use whatever works better.

3. `map` is optimised for elementwise mappings and transformation. Operations that involve dictionaries or Series will enable pandas to use faster code paths for better performance.

4. `Series.apply` returns a scalar for aggregating operations, Series otherwise. Similarly for `DataFrame.apply`. Note that `apply` also has fastpaths when called with certain NumPy functions such as `mean`, `sum`, etc.
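Footnote 1 is easy to check with a small sketch. The data is made up, and filling the gaps with `fillna` is just one possible follow-up, not something the answer prescribes.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# 4 has no entry in the dictionary, so its result is NaN.
mapped = s.map({1: "a", 2: "b", 3: "c"})

# One possible follow-up: replace the gaps with a default value.
filled = mapped.fillna("unknown")

print(mapped.isna().tolist())
print(filled.tolist())
```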

## Quick Summary

• `DataFrame.apply` operates on entire rows or columns at a time.

• `DataFrame.applymap`, `Series.apply`, and `Series.map` operate on one element at a time.

`Series.apply` and `Series.map` are similar and often interchangeable. Some of their slight differences are discussed in osa's answer below.
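One way to see this for yourself is to pass a function that reports what kind of argument it receives. This is a minimal sketch with made-up data; `is_series` and `elementwise` are illustrative names, and `elementwise` falls back to `applymap` only on pandas versions that predate `DataFrame.map` (added in 2.1).

```python
import pandas as pd

df = pd.DataFrame({"a": [1.5, 2.5], "b": [3.5, 4.5]})


def is_series(x):
    """Report whether the argument is a whole Series or a single value."""
    return isinstance(x, pd.Series)


# DataFrame.apply hands the function one whole column at a time.
col_flags = df.apply(is_series)

# The element-wise methods hand it one value at a time.
elementwise = pd.DataFrame.map if hasattr(pd.DataFrame, "map") else pd.DataFrame.applymap
cell_flags = elementwise(df, is_series)
via_apply = df["a"].apply(is_series)
via_map = df["a"].map(is_series)

print(col_flags.tolist(), via_apply.tolist(), via_map.tolist())
```

With a plain callable like this, `Series.apply` and `Series.map` give the same result.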