Replacing punctuation in a data frame based on punctuation list [duplicate]

Using str.replace with the right regex would be easier:

In [41]:
import pandas as pd
pd.set_option('display.notebook_repr_html', False)
df = pd.DataFrame({'text':['test','%hgh&12','abc123!!!','porkyfries']})
df

Out[41]:
         text
0        test
1     %hgh&12
2   abc123!!!
3  porkyfries

[4 rows x 1 columns]

Use a regex whose pattern matches any character that is not alphanumeric or whitespace:

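In [49]:
# regex=True is needed in pandas 2.0+, where str.replace treats the pattern literally by default
df['text'] = df['text'].str.replace(r'[^\w\s]', '', regex=True)
df

Out[49]:
         text
0        test
1       hgh12
2      abc123
3  porkyfries

[4 rows x 1 columns]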
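To see exactly what the `[^\w\s]` class removes, here is a minimal sketch using only the standard `re` module. The `strip_non_word` helper and the extra `snake_case!` sample are made up for illustration and are not part of pandas. Note that `\w` includes the underscore, so underscores survive.

```python
import re

# Matches any single character that is neither a word character
# (letter, digit, underscore) nor whitespace.
NON_WORD = re.compile(r'[^\w\s]')

def strip_non_word(value):
    """Hypothetical helper: delete every character matched by NON_WORD."""
    return NON_WORD.sub('', value)

# The four strings from the example above, plus one made-up entry
# that shows the underscore being kept.
samples = ['test', '%hgh&12', 'abc123!!!', 'porkyfries', 'snake_case!']
cleaned = [strip_non_word(s) for s in samples]
print(cleaned)
```

If underscores should be removed as well, a pattern such as `[^\w\s]|_` is one option.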


To remove punctuation from a text column in your dataframe:

In:

import re
import string
rem = string.punctuation
pattern = r"[{}]".format(rem)
pattern

Out:

'[!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~]'

In:

import pandas as pd
df = pd.DataFrame({'text':['book...regh', 'book...', 'boo,', 'book. ', 'ball, ', 'ballnroll"', '"rope"', 'rick % ']})
df

Out:

          text
0  book...regh
1      book...
2         boo,
3       book. 
4       ball, 
5   ballnroll"
6       "rope"
7      rick % 

In:

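# regex=True is needed in pandas 2.0+, where str.replace treats the pattern literally by default
df['text'] = df['text'].str.replace(pattern, '', regex=True)
df

Instead of deleting the matches, you can replace them with a character of your choice, e.g. replace(pattern, '$', regex=True).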

Out:

        text
0   bookregh
1       book
2        boo
3      book 
4      ball 
5  ballnroll
6       rope
7     rick  
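One caveat with the pattern built above: `string.punctuation` is dropped into the character class unescaped, so the `\]` in the middle is read as an escaped `]` and the backslash itself is never matched. A hedged sketch of an alternative is to pass the characters through `re.escape` first. The `sample` string below is made up for illustration.

```python
import re
import string

# Pattern built as in the answer above: punctuation placed in a class unescaped.
raw_pattern = r"[{}]".format(string.punctuation)

# Assumed alternative: escape every special character before building the class.
escaped_pattern = "[{}]".format(re.escape(string.punctuation))

sample = 'path\\to, "rope"!'  # contains one literal backslash

raw_result = re.sub(raw_pattern, '', sample)
escaped_result = re.sub(escaped_pattern, '', sample)

print(raw_result)      # the backslash is left in place
print(escaped_result)  # the backslash is removed too
```

The escaped pattern can be passed to `df['text'].str.replace(escaped_pattern, '', regex=True)` in the same way.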


str.translate is often considered the cleanest and fastest way to remove punctuation (source):

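import string
# Python 3 form: the table deletes all punctuation except the double quote
table = str.maketrans('', '', string.punctuation.replace('"', ''))
text = text.translate(table)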

You may find that it works better to remove punctuation in 'a' before loading it into pandas.
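As a rough illustration of cleaning the data before it reaches pandas, the sketch below applies a translation table to a plain list of strings. The `raw_rows` values are borrowed from the sample data in the earlier answer, and keeping the double quote is an assumption carried over from the translate snippet above.

```python
import string

# Delete all ASCII punctuation except the double quote.
keep = '"'
table = str.maketrans('', '', string.punctuation.replace(keep, ''))

# Sample rows (assumed input, not yet loaded into a DataFrame).
raw_rows = ['book...regh', 'boo,', '"rope"', 'rick % ']
clean_rows = [row.translate(table) for row in raw_rows]
print(clean_rows)
```

The cleaned list can then be loaded with `pd.DataFrame({'text': clean_rows})`. For a column that is already in pandas, `df['text'].str.translate(table)` applies the same table.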
