confidence and prediction intervals with StatsModels

For test data you can try to use the following.

predictions = result.get_prediction(out_of_sample_df)predictions.summary_frame(alpha=0.05)

I found the summary_frame() method buried here and you can find the get_prediction() method here. You can change the significance level of the confidence interval and prediction interval by modifying the "alpha" parameter.

I am posting this here because this was the first post that comes up when looking for a solution for confidence & prediction intervals – even though this concerns itself with test data rather.

Here's a function to take a model, new data, and an arbitrary quantile, using this approach:

def ols_quantile(m, X, q):  # m: OLS model.  # X: X matrix.  # q: Quantile.  #  # Set alpha based on q.  a = q * 2  if q > 0.5:    a = 2 * (1 - q)  predictions = m.get_prediction(X)  frame = predictions.summary_frame(alpha=a)  if q > 0.5:    return frame.obs_ci_upper  return frame.obs_ci_lower

python statistics statsmodels

update see the second answer which is more recent. Some of the models and results classes have now a get_prediction method that provides additional information including prediction intervals and/or confidence intervals for the predicted mean.

old answer:

iv_l and iv_u give you the limits of the prediction interval for each point.

Prediction interval is the confidence interval for an observation and includes the estimate of the error.

I think, confidence interval for the mean prediction is not yet available in statsmodels.(Actually, the confidence interval for the fitted values is hiding inside the summary_table of influence_outlier, but I need to verify this.)

Proper prediction methods for statsmodels are on the TODO list.

Addition

Confidence intervals are there for OLS but the access is a bit clumsy.

To be included after running your script:

from statsmodels.stats.outliers_influence import summary_tablest, data, ss2 = summary_table(re, alpha=0.05)fittedvalues = data[:, 2]predict_mean_se  = data[:, 3]predict_mean_ci_low, predict_mean_ci_upp = data[:, 4:6].Tpredict_ci_low, predict_ci_upp = data[:, 6:8].T# Check we got the right thingsprint np.max(np.abs(re.fittedvalues - fittedvalues))print np.max(np.abs(iv_l - predict_ci_low))print np.max(np.abs(iv_u - predict_ci_upp))plt.plot(x, y, 'o')plt.plot(x, fittedvalues, '-', lw=2)plt.plot(x, predict_ci_low, 'r--', lw=2)plt.plot(x, predict_ci_upp, 'r--', lw=2)plt.plot(x, predict_mean_ci_low, 'r--', lw=2)plt.plot(x, predict_mean_ci_upp, 'r--', lw=2)plt.show()

enter image description here

This should give the same results as SAS, http://jpktd.blogspot.ca/2012/01/nice-thing-about-seeing-zeros.html

python statistics statsmodels

summary_frame and summary_table work well when you need exact results for a single quantile, but don't vectorize well. This will provide a normal approximation of the prediction interval (not confidence interval) and works for a vector of quantiles:

def ols_quantile(m, X, q):  # m: Statsmodels OLS model.  # X: X matrix of data to predict.  # q: Quantile.  #  from scipy.stats import norm  mean_pred = m.predict(X)  se = np.sqrt(m.scale)  return mean_pred + norm.ppf(q) * se

CodeHunter

confidence and prediction intervals with StatsModels

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last