How do you One Hot Encode columns with a list of strings as values?


I think you need str.join with str.get_dummies:

df = df['tickers'].str.join('|').str.get_dummies()
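
For reference, here is a minimal self-contained sketch of that one-liner. The sample rows are borrowed from the other answer on this page, and it assumes no ticker contains the `|` character, since `str.get_dummies` splits on `|` by default:

```python
import pandas as pd

# Sample rows borrowed from the other answer; `tickers` is the column name from the question.
df = pd.DataFrame({"tickers": [["DIS"],
                               ["AAPL", "AMZN", "BABA", "BAY"],
                               ["MCDO", "PEP"]]})

# Join each list into one '|'-separated string, then split it back into 0/1 columns.
dummies = df["tickers"].str.join("|").str.get_dummies()

print(dummies)
```

If a ticker could contain `|`, pick another separator and pass the same one to both calls, e.g. `str.join(";")` and `str.get_dummies(sep=";")`.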

Or:

from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
df = pd.DataFrame(mlb.fit_transform(df['tickers']), columns=mlb.classes_, index=df.index)
print(df)

   AAPL  ABT  ADBE  AMGN  AMZN  BABA  BAY  CVS  DIS  ECL  EMR  FAST  GE  \
1     0    0     0     0     0     0    0    0    1    0    0     0   0
2     1    0     0     0     1     1    1    0    0    0    0     0   0
3     0    0     0     0     0     0    0    0    0    0    0     0   0
4     0    1     1     1     0     0    0    1    0    0    0     0   0
5     0    1     0     0     0     0    0    1    1    1    1     1   1

   GOOGL  MCDO  PEP
1      0     0    0
2      0     0    0
3      0     1    1
4      0     0    0
5      1     0    0
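
The snippet above overwrites `df` with the encoded frame. If the indicator columns should sit next to the other columns instead, one option is to join them back. This is only a sketch: it assumes scikit-learn is installed, and the `id` column is hypothetical, added purely for illustration:

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

df = pd.DataFrame({"id": [10, 20, 30],  # hypothetical extra column, not from the question
                   "tickers": [["DIS"], ["AAPL", "AMZN"], ["MCDO", "PEP"]]})

mlb = MultiLabelBinarizer()
encoded = pd.DataFrame(mlb.fit_transform(df["tickers"]),
                       columns=mlb.classes_, index=df.index)

# Drop the list column and attach the 0/1 indicator columns by index.
result = df.drop(columns="tickers").join(encoded)
print(result)
```

Passing `index=df.index` is what makes the join line up when the original index is not a plain 0..n-1 range.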


You can use apply(pd.Series) and then get_dummies():

df = pd.DataFrame({"tickers":[["DIS"], ["AAPL","AMZN","BABA","BAY"],
                               ["MCDO","PEP"], ["ABT","ADBE","AMGN","CVS"],
                               ["ABT","CVS","DIS","ECL","EMR","FAST","GE","GOOGL"]]})
pd.get_dummies(df.tickers.apply(pd.Series), prefix="", prefix_sep="")

   AAPL  ABT  DIS  MCDO  ADBE  AMZN  CVS  PEP  AMGN  BABA  DIS  BAY  CVS  ECL  \
0     0    0    1     0     0     0    0    0     0     0    0    0    0    0
1     1    0    0     0     0     1    0    0     0     1    0    1    0    0
2     0    0    0     1     0     0    0    1     0     0    0    0    0    0
3     0    1    0     0     1     0    0    0     1     0    0    0    1    0
4     0    1    0     0     0     0    1    0     0     0    1    0    0    1

   EMR  FAST  GE  GOOGL
0    0     0   0      0
1    0     0   0      0
2    0     0   0      0
3    0     0   0      0
4    1     1   1      1
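
Note that `DIS` and `CVS` each appear twice in that output: `apply(pd.Series)` spreads the lists across positional columns, and `get_dummies` encodes each of those columns separately. One possible way to collapse the duplicates (a sketch, not part of the original answer) is to group the transposed frame by label and take the maximum, which keeps the values at 0/1:

```python
import pandas as pd

df = pd.DataFrame({"tickers": [["DIS"], ["AAPL", "AMZN", "BABA", "BAY"],
                               ["MCDO", "PEP"], ["ABT", "ADBE", "AMGN", "CVS"],
                               ["ABT", "CVS", "DIS", "ECL", "EMR", "FAST", "GE", "GOOGL"]]})

# Same call as above; the result has repeated column labels such as DIS and CVS.
wide = pd.get_dummies(df["tickers"].apply(pd.Series), prefix="", prefix_sep="")

# Transpose so the repeated labels become index values, merge them with max,
# then transpose back to one row per original record.
deduped = wide.astype(int).T.groupby(level=0).max().T
print(deduped)
```

The result has one column per distinct ticker, in sorted order, which matches the column set produced by the `MultiLabelBinarizer` approach.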