String concatenation of two pandas columns

    df['bar'] = df.bar.map(str) + " is " + df.foo

This question has already been answered, but I believe it would be good to throw some useful methods not previously discussed into the mix, and compare all methods proposed thus far in terms of performance.

Here are some useful solutions to this problem, in increasing order of performance.

`DataFrame.agg`

This is a simple str.format-based approach.

    df['baz'] = df.agg('{0[bar]} is {0[foo]}'.format, axis=1)
    df

      foo  bar     baz
    0   a    1  1 is a
    1   b    2  2 is b
    2   c    3  3 is c

You can also use f-string formatting here:

    df['baz'] = df.agg(lambda x: f"{x['bar']} is {x['foo']}", axis=1)
    df

      foo  bar     baz
    0   a    1  1 is a
    1   b    2  2 is b
    2   c    3  3 is c
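To see why passing the bound method of a format string works: with `axis=1`, each row is handed to the callable, and the replacement field `{0[bar]}` looks up the key `bar` on the first positional argument. Here is a minimal sketch, assuming the three-row frame shown in the output above. The names `template`, `first_row`, `by_hand`, and `via_agg` are only for illustration.

```python
import pandas as pd

# Sample frame assumed from the printed output above.
df = pd.DataFrame({'foo': ['a', 'b', 'c'], 'bar': [1, 2, 3]})

template = '{0[bar]} is {0[foo]}'

# '{0[bar]}' means: take positional argument 0 and index it with the key 'bar'.
# Any object that supports that lookup works, including a plain dict.
from_dict = template.format({'bar': 1, 'foo': 'a'})

# A DataFrame row is a Series, which supports the same lookup.
first_row = df.iloc[0]
from_row = template.format(first_row)

# Formatting row by row gives the same values as df.agg(..., axis=1).
by_hand = [template.format(row) for _, row in df.iterrows()]
via_agg = df.agg(template.format, axis=1).tolist()
```

The lambda with an f-string does the same lookups, just spelled out explicitly.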

`char.array`-based Concatenation

Convert the columns to concatenate as chararrays, then add them together.

    a = np.char.array(df['bar'].values)
    b = np.char.array(df['foo'].values)

    df['baz'] = (a + b' is ' + b).astype(str)
    df

      foo  bar     baz
    0   a    1  1 is a
    1   b    2  2 is b
    2   c    3  3 is c
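The `b' is '` separator is a bytes literal, not a typo. For the integer and object arrays used here, `np.char.array` produces a bytes (`S` dtype) array, so the separator must be bytes too, and the final `.astype(str)` converts the result back to text. Here is a small sketch on plain NumPy arrays, with data assumed from the output above and the names `nums`, `words`, and `combined` chosen for illustration. Because the text passes through bytes, this approach is probably best suited to ASCII data.

```python
import numpy as np

# Integer input and object (Python str) input, mirroring the two columns.
nums = np.char.array(np.array([1, 2, 3]))
words = np.char.array(np.array(['a', 'b', 'c'], dtype=object))

# Both come back as bytes arrays (dtype kind 'S'), not unicode.
nums_kind = nums.dtype.kind
words_kind = words.dtype.kind

# Element-wise concatenation in bytes, then conversion back to unicode.
combined = (nums + b' is ' + words).astype(str)
combined_list = combined.tolist()
```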

List Comprehension with `zip`

I cannot overstate how underrated list comprehensions are in pandas.

    df['baz'] = [str(x) + ' is ' + y for x, y in zip(df['bar'], df['foo'])]

Alternatively, using str.join to concat (will also scale better):

    df['baz'] = [
        ' '.join([str(x), 'is', y]) for x, y in zip(df['bar'], df['foo'])]

    df

      foo  bar     baz
    0   a    1  1 is a
    1   b    2  2 is b
    2   c    3  3 is c

List comprehensions excel in string manipulation, because string operations are inherently hard to vectorize, and most pandas "vectorised" functions are basically wrappers around loops. I have written extensively about this topic in "For loops with pandas - When should I care?". In general, if you don't have to worry about index alignment, use a list comprehension when dealing with string and regex operations.
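The index-alignment caveat matters because `zip` pairs values purely by position, whereas Series arithmetic matches rows by index label. Here is a minimal sketch with made-up data. The frame, `other`, and its shuffled index are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['a', 'b', 'c']}, index=[10, 20, 30])

# Same labels as df, but in a different order.
other = pd.Series(['x', 'y', 'z'], index=[30, 20, 10])

# Series addition aligns on index labels: label 10 pairs 'a' with 'z'.
aligned = df['foo'] + other

# zip ignores the index and pairs by position: 'a' with 'x'.
positional = [x + y for x, y in zip(df['foo'], other)]
```

If the two inputs come from the same DataFrame, as in the examples in this answer, position and label agree and the list comprehension is safe.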

The list comp above does not handle NaNs by default. However, you could always wrap the concatenation in a function with a try-except if you need to handle them.

    def try_concat(x, y):
        try:
            return str(x) + ' is ' + y
        except (ValueError, TypeError):
            return np.nan

    df['baz'] = [try_concat(x, y) for x, y in zip(df['bar'], df['foo'])]
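As a quick check of what this wrapper does and does not catch, here is a sketch on a frame with a missing value in `foo`. The data is made up for illustration. A NaN on the right-hand side raises `TypeError` and is turned into NaN, but a NaN passed as `x` goes through `str()` and shows up as the text `nan`.

```python
import numpy as np
import pandas as pd

def try_concat(x, y):
    try:
        return str(x) + ' is ' + y
    except (ValueError, TypeError):
        return np.nan

# Hypothetical frame with one missing value in 'foo'.
df = pd.DataFrame({'foo': ['a', np.nan, 'c'], 'bar': [1, 2, 3]})

result = [try_concat(x, y) for x, y in zip(df['bar'], df['foo'])]

# A NaN in the first argument is stringified rather than propagated.
left_nan = try_concat(np.nan, 'a')
```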

`perfplot` Performance Measurements

Graph generated using perfplot. Here's the complete code listing.

Functions

    import numpy as np  # used by cs2

    def brenbarn(df):
        return df.assign(baz=df.bar.map(str) + " is " + df.foo)

    def danielvelkov(df):
        return df.assign(baz=df.apply(
            lambda x:'%s is %s' % (x['bar'],x['foo']),axis=1))

    def chrimuelle(df):
        return df.assign(
            baz=df['bar'].astype(str).str.cat(df['foo'].values, sep=' is '))

    def vladimiryashin(df):
        return df.assign(baz=df.astype(str).apply(lambda x: ' is '.join(x), axis=1))

    def erickfis(df):
        return df.assign(
            baz=df.apply(lambda x: f"{x['bar']} is {x['foo']}", axis=1))

    def cs1_format(df):
        return df.assign(baz=df.agg('{0[bar]} is {0[foo]}'.format, axis=1))

    def cs1_fstrings(df):
        return df.assign(baz=df.agg(lambda x: f"{x['bar']} is {x['foo']}", axis=1))

    def cs2(df):
        a = np.char.array(df['bar'].values)
        b = np.char.array(df['foo'].values)

        return df.assign(baz=(a + b' is ' + b).astype(str))

    def cs3(df):
        return df.assign(
            baz=[str(x) + ' is ' + y for x, y in zip(df['bar'], df['foo'])])
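The `perfplot` call itself is not part of this listing. As a rough stand-in, the sketch below times two of the functions above with the standard library's `timeit` on frames of increasing size. The `make_df` helper, the sizes, and the repeat count are assumptions for illustration, not the setup behind the original graph.

```python
import timeit

import pandas as pd

def make_df(n):
    # Hypothetical setup: repeat the three sample rows n times.
    return pd.DataFrame({'foo': ['a', 'b', 'c'] * n, 'bar': [1, 2, 3] * n})

def brenbarn(df):
    return df.assign(baz=df.bar.map(str) + " is " + df.foo)

def cs3(df):
    return df.assign(
        baz=[str(x) + ' is ' + y for x, y in zip(df['bar'], df['foo'])])

kernels = [brenbarn, cs3]
timings = {}

for n in (1, 10, 100):
    df = make_df(n)
    expected = kernels[0](df)['baz'].tolist()
    for kernel in kernels:
        # Make sure every kernel produces the same column before timing it.
        assert kernel(df)['baz'].tolist() == expected
        # Total seconds for 5 calls at this frame size.
        timings[(kernel.__name__, n)] = timeit.timeit(lambda: kernel(df), number=5)

for (name, n), seconds in sorted(timings.items()):
    print(f"{name:10s} n={n:<5d} {seconds:.6f}s")
```

Absolute timings depend on the machine and library versions, so the numbers are only meaningful relative to each other.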


The problem in your code is that you want to apply the operation on every row. The way you've written it though takes the whole 'bar' and 'foo' columns, converts them to strings and gives you back one big string. You can write it like:

    df.apply(lambda x: '%s is %s' % (x['bar'], x['foo']), axis=1)

It's longer than the other answer but is more generic (can be used with values that are not strings).
