Remove ends of string entries in pandas DataFrame column
I think you can use str.replace with the regex \.txt$ ($ matches the end of the string, and the escaped dot matches a literal period):
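import pandas as pd

df = pd.DataFrame({'A': {0: 2, 1: 1}, 'C': {0: 5, 1: 1}, 'B': {0: 4, 1: 2}, 'filename': {0: "txt.txt", 1: "x.txt"}}, columns=['filename','A','B', 'C'])
print(df)
  filename  A  B  C
0  txt.txt  2  4  5
1    x.txt  1  2  1

# Option 1: regex replace anchored at the end of the string
df['filename'] = df['filename'].str.replace(r'\.txt$', '', regex=True)
print(df)
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1

# Option 2 (alternative, starting again from the original df): drop the last 4 characters with map
df['filename'] = df['filename'].map(lambda x: str(x)[:-4])
print(df)
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1

# Option 3 (alternative, starting again from the original df): vectorized string slicing
df['filename'] = df['filename'].str[:-4]
print(df)
  filename  A  B  C
0      txt  2  4  5
1        x  1  2  1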
EDIT:
rstrip can remove more characters than intended if the end of the string contains any of the characters of the stripped string (in this case ., t and x).
Example:
print(df)
  filename  A  B  C
0  txt.txt  2  4  5
1    x.txt  1  2  1

df['filename'] = df['filename'].str.rstrip('.txt')
print(df)
  filename  A  B  C
0           2  4  5
1           1  2  1
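To make the rstrip behaviour above concrete, here is a small plain-Python sketch. The helper name strip_suffix and the sample name "test.txt" are my own, not from the question; the point is only that rstrip treats its argument as a set of characters, while a suffix check removes exactly one trailing ".txt".

```python
def strip_suffix(name: str, suffix: str = ".txt") -> str:
    """Return name without one trailing suffix; leave it unchanged otherwise.

    Hypothetical helper for illustration, not part of pandas.
    """
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


names = ["txt.txt", "x.txt", "test.txt"]

# rstrip removes any trailing run of the characters '.', 't' and 'x'
stripped = [n.rstrip(".txt") for n in names]

# the suffix check removes the literal ".txt" once and nothing more
trimmed = [strip_suffix(n) for n in names]

print(stripped)
print(trimmed)
```

If this fits your data, the same helper could probably be applied to the column with df['filename'].map(strip_suffix).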
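You can use str.rstrip to remove the endings:
df['filename'] = df['filename'].str.rstrip('.txt')
This should work as long as the rest of the name does not itself end in ., t or x, because rstrip strips a set of characters rather than a fixed suffix.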
You may want:
df['filename'] = df.apply(lambda x: x['filename'][:-4], axis = 1)
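One caveat with slicing off the last four characters is that it also shortens entries that do not end in .txt. A minimal sketch of a guarded version, assuming a made-up extra row "notes" that has no extension (the data here is illustrative, not from the question):

```python
import pandas as pd

# Illustrative data; the "notes" row is an assumption added to show the guard
df = pd.DataFrame({"filename": ["txt.txt", "x.txt", "notes"], "A": [2, 1, 3]})

# Boolean mask: True where the entry really ends with ".txt"
has_ext = df["filename"].str.endswith(".txt")

# Slice off the last four characters only on rows that have the extension
df.loc[has_ext, "filename"] = df.loc[has_ext, "filename"].str[:-4]

print(df["filename"].tolist())
```

This stays vectorized, so it avoids the row-by-row apply with axis=1. On pandas 1.4 or later, df['filename'].str.removesuffix('.txt') should give the same result in one call.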