Python pandas insert list into a cell
`set_value` has been deprecated since version 0.21.0, so you should now use `at`. It can insert a list into a cell without raising a `ValueError` as `loc` does. I think this is because `at` always refers to a single value, while `loc` can refer to values as well as rows and columns.
    import pandas as pd

    df = pd.DataFrame(data={'A': [1, 2, 3], 'B': ['x', 'y', 'z']})
    df.at[1, 'B'] = ['m', 'n']

    df
       A       B
    0  1       x
    1  2  [m, n]
    2  3       z
You also need to make sure the column you are inserting into has `dtype=object`. For example:

    >>> df = pd.DataFrame(data={'A': [1, 2, 3], 'B': [1, 2, 3]})
    >>> df.dtypes
    A    int64
    B    int64
    dtype: object

    >>> df.at[1, 'B'] = [1, 2, 3]
    ValueError: setting an array element with a sequence

    >>> df['B'] = df['B'].astype('object')
    >>> df.at[1, 'B'] = [1, 2, 3]
    >>> df
       A          B
    0  1          1
    1  2  [1, 2, 3]
    2  3          3
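The two steps above (cast to `object`, then assign with `at`) can be wrapped in a small helper. This is only a minimal sketch: the name `set_list_cell` is hypothetical and not part of pandas, and it assumes you are fine with the column being converted to `object` dtype in place.

```python
import pandas as pd


def set_list_cell(df, row_label, column, values):
    """Store `values` (e.g. a list) in one cell of `df`.

    Hypothetical helper, not a pandas API. The column is cast to
    object dtype first so that it can hold arbitrary Python objects.
    """
    if df[column].dtype != object:
        df[column] = df[column].astype(object)
    # .at addresses exactly one cell, so the list is stored as a single value
    df.at[row_label, column] = values
    return df


df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, 3]})
set_list_cell(df, 1, 'B', [1, 2, 3])
```

After the call, `df.at[1, 'B']` holds the list and column `B` has `object` dtype, while column `A` is left untouched.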
Pandas >= 0.21
`set_value` has been deprecated. You can now use `DataFrame.at` to set by label, and `DataFrame.iat` to set by integer position.
Setting Cell Values with at/iat
    # Setup
    df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})
    df
        A       B
    0  12  [a, b]
    1  23  [c, d]

    df.dtypes
    A     int64
    B    object
    dtype: object
If you want to set the value in the second row of column "B" to some new list, use `DataFrame.at`:
    df.at[1, 'B'] = ['m', 'n']
    df
        A       B
    0  12  [a, b]
    1  23  [m, n]
You can also set by integer position using `DataFrame.iat`:

    df.iat[1, df.columns.get_loc('B')] = ['m', 'n']
    df
        A       B
    0  12  [a, b]
    1  23  [m, n]
What if I get ValueError: setting an array element with a sequence?
I'll try to reproduce this with:
    df
        A   B
    0  12 NaN
    1  23 NaN

    df.dtypes
    A      int64
    B    float64
    dtype: object

    df.at[1, 'B'] = ['m', 'n']
    # ValueError: setting an array element with a sequence.
This is because your column is of `float64` dtype, whereas lists are `object`s, so there's a mismatch. What you have to do in this situation is convert the column to `object` first.
    df['B'] = df['B'].astype(object)
    df.dtypes
    A     int64
    B    object
    dtype: object
Then, it works:
    df.at[1, 'B'] = ['m', 'n']
    df
        A       B
    0  12     NaN
    1  23  [m, n]
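If you control how the column is created, one way to sidestep the conversion is to create it with `object` dtype from the start. The following is a rough sketch under that assumption; using `None` as the placeholder value is my own choice for illustration, not something the example above requires.

```python
import pandas as pd

df = pd.DataFrame({'A': [12, 23]})

# Create the column as object dtype up front instead of letting it
# default to float64 (which is what a column of NaN would get).
df['B'] = pd.Series([None] * len(df), index=df.index, dtype=object)

# The column can already hold arbitrary Python objects, so no astype is needed
df.at[1, 'B'] = ['m', 'n']
```

Cells you have not filled yet stay as `None` here rather than `NaN`.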
Possible, But Hacky
Even more wacky, I've found you can hack through `DataFrame.loc` to achieve something similar if you pass nested lists.
    df.loc[1, 'B'] = [['m'], ['n'], ['o'], ['p']]
    df
        A             B
    0  12        [a, b]
    1  23  [m, n, o, p]
You can read more about why this works here.