Does GridSearchCV perform cross-validation?

python machine-learning scikit-learn cross-validation grid-search

All scikit-learn estimators whose names end in CV (GridSearchCV, RidgeCV and so on) perform cross-validation internally. But you still need to keep a separate test set for measuring the final performance.
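As a small illustration of a CV-suffixed estimator doing its own cross-validation, here is a minimal sketch using RidgeCV. The diabetes dataset and the list of candidate alphas are my own assumptions for the example, and the held-out test split is left out here only for brevity.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RidgeCV

# Example data bundled with scikit-learn (an assumption for this sketch).
X, y = load_diabetes(return_X_y=True)

# RidgeCV scores every candidate alpha with 5-fold cross-validation
# and keeps the one with the best cross-validated score.
alphas = [0.01, 0.1, 1.0, 10.0]
model = RidgeCV(alphas=alphas, cv=5).fit(X, y)

print("alpha chosen by cross-validation:", model.alpha_)
```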

So first split your whole data set into train and test sets, and set the test set aside for now.

Then pass only the training data to the grid search. GridSearchCV splits this training data further into train and validation folds to tune the hyper-parameters passed to it and, by default, finally refits the model on the whole training data with the best parameters found.

Now evaluate this model on the test data you set aside at the beginning. This gives you an estimate close to the model's real-world performance.
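The steps above could look roughly like this. It is only a sketch: the iris dataset, the SVC estimator and the parameter grid are assumptions I picked for illustration, not anything from the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Example data bundled with scikit-learn (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# 1. Hold out a test set and leave it untouched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 2. The grid search sees only the training data; it splits that data
#    again into 5 folds to score each parameter combination.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)  # refit=True (the default) retrains on all of X_train

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", search.best_score_)

# 3. Score the refitted model once on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("held-out test accuracy:", test_accuracy)
```

The cross-validated score is what the search used to pick the parameters, so the number to report is the one from step 3.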

If you pass the whole data set to GridSearchCV, the test data leaks into the parameter tuning, and the final model may not perform as well on new, unseen data as its score suggests.

You can look at my other answers which describe the GridSearch in more detail:


Yes, GridSearchCV performs cross-validation. If I understand the concept correctly, you want to keep part of your data set unseen by the model in order to test it.

So you train your models on the training set and test them on the test set.

Here I was doing almost the same - you might want to check it...

CodeHunter
