GridSearchCV - XGBoost - Early Stopping

python-3.x scikit-learn regression data-science xgboost

When using early_stopping_rounds, you also have to pass eval_metric and eval_set as input parameters to the fit method. Early stopping works by calculating the error on an evaluation set after each boosting round. The error has to improve at least once every early_stopping_rounds rounds; otherwise, the generation of additional trees is stopped early.
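To make the stopping rule concrete, here is a small pure-Python sketch of the patience logic. It is only an illustration and not xgboost's actual implementation; the function name and the error values are made up for the example.

```python
def run_with_early_stopping(eval_errors, early_stopping_rounds):
    """Return (best_round, rounds_run) for a list of per-round evaluation errors.

    Hypothetical helper: training stops once the error has not improved
    for `early_stopping_rounds` consecutive rounds.
    """
    best_error = float("inf")
    best_round = -1
    for round_idx, error in enumerate(eval_errors):
        if error < best_error:
            # New best score: remember it and keep going.
            best_error = error
            best_round = round_idx
        elif round_idx - best_round >= early_stopping_rounds:
            # No improvement for too long: stop before using all rounds.
            return best_round, round_idx + 1
    return best_round, len(eval_errors)


# Made-up evaluation errors, one per boosting round.
errors = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
best_round, rounds_run = run_with_early_stopping(errors, early_stopping_rounds=3)
print(best_round, rounds_run)
```

With a patience of 3, the run stops after the third round without improvement, so the last value in the list is never reached.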

See the documentation of xgboost's fit method for details.

Here is a minimal, fully working example:

    import xgboost as xgb
    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import TimeSeriesSplit

    cv = 2

    trainX = [[1], [2], [3], [4], [5]]
    trainY = [1, 2, 3, 4, 5]

    # these are the evaluation sets
    testX = trainX
    testY = trainY

    paramGrid = {"subsample": [0.5, 0.8]}

    fit_params = {"early_stopping_rounds": 42,
                  "eval_metric": "mae",
                  "eval_set": [[testX, testY]]}

    model = xgb.XGBRegressor()
    # Note: get_n_splits() returns the integer 2, so GridSearchCV runs plain
    # 2-fold CV here; pass the TimeSeriesSplit object itself for ordered splits.
    gridsearch = GridSearchCV(model, paramGrid, verbose=1,
                              fit_params=fit_params,
                              cv=TimeSeriesSplit(n_splits=cv).get_n_splits([trainX, trainY]))
    gridsearch.fit(trainX, trainY)

An update to @glao's answer and a response to @Vasim's comment/question, as of sklearn 0.21.3. Note that fit_params has moved out of the GridSearchCV constructor and into the fit() method; also, the import specifically pulls in the sklearn wrapper module from xgboost:

    import xgboost.sklearn as xgb
    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import TimeSeriesSplit

    cv = 2

    trainX = [[1], [2], [3], [4], [5]]
    trainY = [1, 2, 3, 4, 5]

    # these are the evaluation sets
    testX = trainX
    testY = trainY

    paramGrid = {"subsample": [0.5, 0.8]}

    fit_params = {"early_stopping_rounds": 42,
                  "eval_metric": "mae",
                  "eval_set": [[testX, testY]]}

    model = xgb.XGBRegressor()
    gridsearch = GridSearchCV(model, paramGrid, verbose=1,
                              cv=TimeSeriesSplit(n_splits=cv).get_n_splits([trainX, trainY]))
    gridsearch.fit(trainX, trainY, **fit_params)
