
Is there a quicker way of running GridsearchCV


A few things:

  1. 10-fold CV is overkill and causes you to fit 10 models for each parameter combination. You can get an instant 2-3x speedup by switching to 5- or 3-fold CV (i.e., cv=3 in the GridSearchCV call) without any meaningful difference in performance estimation.
  2. Try fewer parameter options at each round. With 9x9 combinations, you're trying 81 different combinations on each run. Typically, you'll find better performance at one end of the scale or the other, so maybe start with a coarse grid of 3-4 options, and then go finer as you start to identify the area that's more interesting for your data. 3x3 options means a 9x speedup vs. what you're doing now.
  3. You can get a trivial speedup by setting n_jobs to 2 or more in your GridSearchCV call so you run multiple models at once. Depending on the size of your data, you may not be able to increase it too high, and you won't see an improvement increasing it past the number of cores you're running, but you can probably trim a bit of time that way (a combined sketch of these changes follows this list).
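
Putting those three suggestions together, here is a minimal sketch, using the same SVC setup as the question; the coarse 3x3 grid values are purely illustrative:

    import numpy as np
    from sklearn import svm, datasets
    from sklearn.model_selection import GridSearchCV

    # Stand-in data; substitute your own X and y.
    X, y = datasets.make_classification(n_samples=1000, random_state=0)

    clf = svm.SVC(kernel="rbf", cache_size=600)

    # Coarse 3x3 grid instead of 9x9, 3-fold CV instead of 10-fold,
    # and one job per core: the search fits 9*3 models instead of 81*10.
    param_grid = {"gamma": [1e-5, 1e-3, 1e-1], "C": [1e-1, 1e1, 1e3]}
    grid = GridSearchCV(clf, param_grid, cv=3, scoring="accuracy", n_jobs=-1)
    grid.fit(X, y)
    print(grid.best_params_)

Once the coarse search points to a promising region, rerun it with a finer grid centered on the winning values.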


Also, you could set probability=False inside the SVC estimator to avoid applying the expensive Platt calibration internally. (If being able to call predict_proba is crucial, run GridSearchCV with refit=False, and after picking the best parameter set in terms of model quality on the test set, just retrain the best estimator with probability=True on the whole training set.)
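A rough sketch of that refit-later workflow (with single-metric scoring, best_params_ is still populated even when refit=False; the grid values here are just placeholders):

    from sklearn import svm, datasets
    from sklearn.model_selection import GridSearchCV

    # Stand-in data; substitute your own training set.
    X, y = datasets.make_classification(n_samples=1000, random_state=0)

    # Search without Platt calibration and without refitting the winner.
    grid = GridSearchCV(
        svm.SVC(kernel="rbf", probability=False),
        {"gamma": [1e-3, 1e-1], "C": [1e0, 1e2]},
        cv=3,
        scoring="accuracy",
        refit=False,
    )
    grid.fit(X, y)

    # Retrain only the best parameter set with calibration enabled.
    best = svm.SVC(kernel="rbf", probability=True, **grid.best_params_)
    best.fit(X, y)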

Another step would be to use RandomizedSearchCV instead of GridSearchCV, which lets you reach comparable or better model quality in roughly the same time, with the search budget controlled by the n_iter parameter.

And, as already mentioned, use n_jobs=-1.


Adding to the other answers (such as not using 10-fold CV and trying fewer parameter options each round), there are other ways you can speed up your search.

Parallelize your code

Randy mentioned that you can use n_jobs to parallelize your training (this is based on the number of cores on your computer). The only difference with the code below is that it uses n_jobs=-1, which creates one job per core automatically. So if you have 4 cores, it will try to utilize all 4. The code below was run on an 8-core computer. It took 18.3 seconds with n_jobs=-1 on my computer, as opposed to 2 minutes 17 seconds without.

    import numpy as np
    from sklearn import svm
    from sklearn import datasets
    from sklearn.model_selection import GridSearchCV

    rng = np.random.RandomState(0)
    X, y = datasets.make_classification(n_samples=1000, random_state=rng)

    clf = svm.SVC(kernel="rbf", probability=True, cache_size=600)
    gamma_range = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1]
    c_range = [1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5]
    param_grid = dict(gamma=gamma_range, C=c_range)

    grid = GridSearchCV(clf, param_grid, cv=10, scoring="accuracy", n_jobs=-1)
    %time grid.fit(X, y)

Note that if you have access to a cluster, you can distribute your training with Dask or Ray.
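If you go that route, a rough sketch of the Dask approach (assuming dask.distributed is installed) is to hand scikit-learn's internal joblib parallelism to a Dask cluster:

    import joblib
    from dask.distributed import Client  # assumes dask.distributed is installed
    from sklearn import svm, datasets
    from sklearn.model_selection import GridSearchCV

    # Small stand-in search; substitute your own data and grid.
    X, y = datasets.make_classification(n_samples=1000, random_state=0)
    grid = GridSearchCV(svm.SVC(kernel="rbf"),
                        {"gamma": [1e-3, 1e-1], "C": [1e0, 1e2]},
                        cv=3, scoring="accuracy", n_jobs=-1)

    # Client() with no arguments starts a local cluster; pass a scheduler
    # address here to use a real cluster instead.
    client = Client()

    # Route the search's joblib parallelism through the Dask workers.
    with joblib.parallel_backend("dask"):
        grid.fit(X, y)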

Different Hyperparameter Optimization Techniques

Your code uses GridSearchCV which is an exhaustive search over specified parameter values for an estimator. Scikit-Learn also has RandomizedSearchCV which samples a given number of candidates from a parameter space with a specified distribution. Using randomized search for the code example below took 3.35 seconds.

    import numpy as np
    from sklearn import svm
    from sklearn import datasets
    from sklearn.model_selection import RandomizedSearchCV

    rng = np.random.RandomState(0)
    X, y = datasets.make_classification(n_samples=1000, random_state=rng)

    clf = svm.SVC(kernel="rbf", probability=True, cache_size=600)
    gamma_range = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1]
    c_range = [1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5]
    param_grid = dict(gamma=gamma_range, C=c_range)

    grid = RandomizedSearchCV(clf, param_grid, cv=10, scoring="accuracy", n_jobs=-1)
    %time grid.fit(X, y)

(Image from the scikit-learn documentation illustrating successive halving.)

Recently (scikit-learn 0.24, released December 2020), scikit-learn added the experimental hyperparameter search estimators halving grid search (HalvingGridSearchCV) and halving random search (HalvingRandomSearchCV). These techniques can be used to search the parameter space using successive halving. The image above shows that all hyperparameter candidates are evaluated with a small number of resources at the first iteration, and the more promising candidates are selected and given more resources during each successive iteration. You can use them by upgrading your scikit-learn (pip install --upgrade scikit-learn).
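
A minimal sketch of the halving grid search, reusing the same parameter grid as before (the factor value shown is just the default, included for illustration):

    import numpy as np
    from sklearn import svm, datasets
    from sklearn.experimental import enable_halving_search_cv  # noqa: enables the estimator
    from sklearn.model_selection import HalvingGridSearchCV

    rng = np.random.RandomState(0)
    X, y = datasets.make_classification(n_samples=1000, random_state=rng)

    clf = svm.SVC(kernel="rbf", probability=True, cache_size=600)
    param_grid = dict(
        gamma=[1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1],
        C=[1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5],
    )

    # Each round keeps roughly the best 1/factor of the candidates and gives
    # them factor-times more samples (the default resource).
    grid = HalvingGridSearchCV(clf, param_grid, factor=3, cv=5,
                               scoring="accuracy", n_jobs=-1)
    grid.fit(X, y)
    print(grid.best_params_)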