Is it possible to toggle a certain step in an sklearn pipeline?
Pipeline steps cannot currently be made optional in a grid search, but as a quick workaround you could wrap the PCA class in your own OptionalPCA component with a boolean parameter that turns PCA off when requested. You might want to have a look at hyperopt to set up more complex search spaces. I think it has good sklearn integration to support this kind of pattern by default, but I cannot find the doc anymore. Maybe have a look at this talk.

For the dependent parameters problem, GridSearchCV supports trees of parameters to handle this case, as demonstrated in the documentation.
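A minimal sketch of that workaround, assuming a hypothetical OptionalPCA wrapper (the name comes from the answer, but the parameter names use_pca and n_components, the iris data, and the step names reduce_dim and clf are my own choices for illustration). It also shows a "tree" of parameters written as a list of dicts, so that n_components is only searched when PCA is switched on:

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


class OptionalPCA(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper: applies PCA only when use_pca is True."""

    def __init__(self, use_pca=True, n_components=2):
        self.use_pca = use_pca
        self.n_components = n_components

    def fit(self, X, y=None):
        # Fit the inner PCA only if the step is switched on.
        if self.use_pca:
            self.pca_ = PCA(n_components=self.n_components).fit(X)
        return self

    def transform(self, X):
        # Pass the data through unchanged when PCA is switched off.
        return self.pca_.transform(X) if self.use_pca else X


X, y = load_iris(return_X_y=True)
pipe = Pipeline([("reduce_dim", OptionalPCA()), ("clf", SVC())])

# Each dict is searched as its own sub-grid, so the dependent parameter
# n_components is not varied pointlessly when PCA is off.
param_grid = [
    {"reduce_dim__use_pca": [False], "clf__C": [0.1, 10]},
    {
        "reduce_dim__use_pca": [True],
        "reduce_dim__n_components": [2, 3],
        "clf__C": [0.1, 10],
    },
]
grid_search = GridSearchCV(pipe, param_grid=param_grid, cv=3)
grid_search.fit(X, y)
print(grid_search.best_params_)
```

With a single flat dict instead of the list, the grid would also try every n_components value with PCA disabled, which repeats identical fits.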
From the docs:
Individual steps may also be replaced as parameters, and non-final steps may be ignored by setting them to None:
from sklearn.linear_model import LogisticRegression
params = dict(reduce_dim=[None, PCA(5), PCA(10)],
              clf=[SVC(), LogisticRegression()],
              clf__C=[0.1, 10, 100])
grid_search = GridSearchCV(pipe, param_grid=params)
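The quoted snippet relies on pipe, PCA, SVC and GridSearchCV being defined earlier in the docs. Here is a self-contained sketch of the same idea, with some assumptions of mine: the iris dataset, smaller PCA sizes (iris has only four features), cv=3, and a raised max_iter for LogisticRegression. It uses the string "passthrough", which scikit-learn 0.21 and later accept for skipping a non-final step, in place of None:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step names must match the keys used in the parameter grid below.
pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])

params = dict(
    # "passthrough" skips the step; the other entries swap in a configured PCA.
    reduce_dim=["passthrough", PCA(2), PCA(3)],
    # The final estimator itself is also treated as a searchable parameter.
    clf=[SVC(), LogisticRegression(max_iter=1000)],
    clf__C=[0.1, 10, 100],
)
grid_search = GridSearchCV(pipe, param_grid=params, cv=3)
grid_search.fit(X, y)
print(grid_search.best_params_)
```

Note that clf__C applies to whichever estimator is currently in the clf slot, which works here because both SVC and LogisticRegression expose a C parameter.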