Replace values in multiple untitled columns to 0, 1, 2 depending on column
Here is one way to do it:
- Define a function to replace the x:
    import re

    def replaceX(col):
        # True for every cell that is NOT "x" or "X"; where() keeps those cells
        cond = ~((col == "x") | (col == "X"))
        # Titled column (its name does not look like "Unnamed: N"): replace x with 0
        if not re.match(r'Unnamed: \d+', col.name):
            return col.where(cond, 0)
        else:
            # Untitled column: the value in the first row decides the replacement
            if col.iloc[0] == "Commented":
                return col.where(cond, 1)
            elif col.iloc[0] == "No comment":
                return col.where(cond, 2)
        return col
Or, if the first row of your titled columns never contains "Commented" or "No comment", you can use a solution without regex:
    def replaceX(col):
        cond = ~((col == "x") | (col == "X"))
        # Check what is the value of the first row
        if col.iloc[0] == "Commented":
            return col.where(cond, 1)
        elif col.iloc[0] == "No comment":
            return col.where(cond, 2)
        return col.where(cond, 0)
- Apply this function on the DataFrame:
    # Apply the function to every column (axis is not specified, so it defaults to 0)
    df.apply(lambda col: replaceX(col))
Output:
      title Unnamed: 2  Unnamed: 3
    0        Commented  No comment
    1
    2     0                      2
    3                1
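To try the regex-free version end to end, here is a minimal sketch. The sample DataFrame is hypothetical (the original data is not shown in this answer), and `dtype=object` is an assumption made so that the columns can hold both strings and integers, as columns read from a spreadsheet typically do.

```python
import pandas as pd


def replaceX(col):
    # True for every cell that is NOT "x" or "X"; where() keeps those cells
    cond = ~((col == "x") | (col == "X"))
    # The first row of the column decides the replacement value
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    elif col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col.where(cond, 0)


# Hypothetical sample data: one titled column and two untitled ones
df = pd.DataFrame(
    {
        "title": ["", "a", "x", "b"],
        "Unnamed: 2": ["Commented", "", "", "X"],
        "Unnamed: 3": ["No comment", "", "x", ""],
    },
    dtype=object,  # assumption: mixed string/integer cells
)

# apply() returns a new DataFrame, so assign the result to keep it
result = df.apply(replaceX)
print(result)
```

With this sample, the "x" in `title` becomes 0, the "X" under "Commented" becomes 1, and the "x" under "No comment" becomes 2. All other cells are left unchanged.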