Pandas for Python: Exception: Data must be 1-dimensional

python pandas scikit-learn one-hot-encoding

Most sklearn methods don't care about column names, as they're mainly concerned with the math behind the ML algorithms they implement. You can add column names back onto the OneHotEncoder output after fit_transform(), if you can figure out the label encoding ahead of time.

First, grab the column names of the numeric predictors from the original dataset. The slice [1:-1] skips the first column (the categorical one, which we reserve for LabelEncoder) and the last column:

    X_cols = dataset.columns[1:-1]
    X_cols
    # Index(['Age', 'Salary'], dtype='object')

Now get the order of the encoded labels. LabelEncoder() stores its classes in sorted order, so in this case the integer mapping is alphabetical:

    labels = labelencoder_X.fit(X[:, 0]).classes_
    labels
    # ['France' 'Germany' 'Spain']
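To check that ordering on its own, here is a minimal sketch assuming scikit-learn and NumPy are installed. The `countries` array is made up for illustration; only `LabelEncoder` and its `classes_` attribute come from the answer.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical input, deliberately not in alphabetical order.
countries = np.array(["Spain", "France", "Germany", "France"])

encoder = LabelEncoder().fit(countries)

# classes_ holds the distinct labels in sorted order.
print(encoder.classes_)   # ['France' 'Germany' 'Spain']

# Each label is replaced by its position in classes_.
codes = encoder.transform(countries)
print(codes)              # [2 0 1 0]
```

Because the integer code is just the index into `classes_`, the one-hot columns produced from those codes should come out in the same order as `classes_`.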

Combine these column names, and then add them to X when you convert to DataFrame:

    # X gets re-used, so make sure to define encoded_cols after this line
    X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
    encoded_cols = np.append(labels, X_cols)
    # ...
    X = onehotencoder.fit_transform(X).toarray()
    encoded_df = pd.DataFrame(X, columns=encoded_cols)
    encoded_df

       France  Germany  Spain        Age        Salary
    0     1.0      0.0    0.0  44.000000  72000.000000
    1     0.0      0.0    1.0  27.000000  48000.000000
    2     0.0      1.0    0.0  30.000000  54000.000000
    3     0.0      0.0    1.0  38.000000  61000.000000
    4     0.0      1.0    0.0  40.000000  63777.777778
    5     1.0      0.0    0.0  35.000000  58000.000000
    6     0.0      0.0    1.0  38.777778  52000.000000
    7     1.0      0.0    0.0  48.000000  79000.000000
    8     0.0      1.0    0.0  50.000000  83000.000000
    9     1.0      0.0    0.0  37.000000  67000.000000
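The snippet above leaves out how `onehotencoder` was set up, so here is a self-contained sketch of the same idea rather than a reproduction of it. It assumes a reasonably recent scikit-learn, where `OneHotEncoder` accepts string columns directly and exposes the label order through `categories_`. The four rows and the `Country` and `Purchased` column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Made-up rows for illustration; not the dataset from the question.
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, 30.0, 38.0],
    "Salary": [72000.0, 48000.0, 54000.0, 61000.0],
    "Purchased": ["No", "Yes", "No", "No"],
})

# Names of the numeric predictors: skip the first and last columns.
numeric_cols = dataset.columns[1:-1]

# One-hot encode only the categorical column, then densify the sparse result.
encoder = OneHotEncoder()
country_onehot = encoder.fit_transform(dataset[["Country"]]).toarray()

# categories_[0] lists the labels in the same order as the one-hot columns.
labels = encoder.categories_[0]

# Put the one-hot block in front of the numeric columns, as in the answer.
X = np.hstack([country_onehot, dataset[numeric_cols].to_numpy()])
encoded_cols = np.append(labels, numeric_cols)

encoded_df = pd.DataFrame(X, columns=encoded_cols)
print(encoded_df)
```

Reading the names from `categories_` avoids relying on a separate LabelEncoder pass, but the column-naming step is the same: concatenate the label names with the remaining predictor names and pass the result as `columns=`.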

NB: For example data I'm using this dataset, which seems either very similar or identical to the one used by OP. Note how the output is identical to OP's X matrix.
