Adding a regression line on a ggplot
In general, to provide your own formula you should use the arguments x and y, which correspond to the values you provided in ggplot(). In this case x will be interpreted as x.plot and y as y.plot. You can find more information about smoothing methods and formulas on the help page of stat_smooth(), as it is the default stat used by geom_smooth().
ggplot(data,aes(x.plot, y.plot)) + stat_summary(fun.data=mean_cl_normal) + geom_smooth(method='lm', formula= y~x)
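As a minimal sketch of a non-default formula, the snippet below fits a quadratic instead of a straight line. It assumes the built-in mtcars data set, with wt and mpg standing in for x.plot and y.plot; those column choices are illustrative and not from the original question. The formula still refers to x and y, not to the column names.

```r
library(ggplot2)

# Quadratic fit: the formula uses the placeholders x and y,
# which ggplot2 maps to wt and mpg from aes()
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)

print(p)
```

Writing formula = mpg ~ poly(wt, 2) here would not work as intended, because the smoother only sees the aesthetics x and y.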
If you are using the same x and y values that you supplied in the ggplot() call and need to plot a linear regression line, then you don't need the formula inside geom_smooth(); just supply method = "lm".
ggplot(data,aes(x.plot, y.plot)) + stat_summary(fun.data= mean_cl_normal) + geom_smooth(method='lm')
If you have a model fitted by multiple linear regression, the solution above won't work.
You have to create the line manually, as a data frame that contains predicted values for your original data frame (in your case, data).
It would look like this:
library(ggplot2)

# read dataset
df <- mtcars

# create multiple linear model
lm_fit <- lm(mpg ~ cyl + hp, data = df)
summary(lm_fit)

# save predictions of the model in the new data frame
# together with variable you want to plot against
predicted_df <- data.frame(mpg_pred = predict(lm_fit, df), hp = df$hp)

# this is the predicted line of multiple linear regression
ggplot(data = df, aes(x = mpg, y = hp)) +
  geom_point(color = 'blue') +
  geom_line(color = 'red', data = predicted_df, aes(x = mpg_pred, y = hp))

# this is predicted line comparing only chosen variables
ggplot(data = df, aes(x = mpg, y = hp)) +
  geom_point(color = 'blue') +
  geom_smooth(method = "lm", se = FALSE)
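A related sketch, under assumptions that are mine and not the answer's: if you want a single straight line from the multiple regression, one option is to predict over a grid where the plotted predictor varies and the other predictor is held fixed. Here hp goes on the x-axis (unlike the plots above), cyl is held at its median, and grid_df is a hypothetical helper name.

```r
library(ggplot2)

df <- mtcars
lm_fit <- lm(mpg ~ cyl + hp, data = df)

# Hypothetical grid: vary hp over its observed range, hold cyl at its median
grid_df <- data.frame(
  hp  = seq(min(df$hp), max(df$hp), length.out = 50),
  cyl = median(df$cyl)
)
grid_df$mpg_pred <- predict(lm_fit, newdata = grid_df)

# Points are the raw data; the red line is the model's prediction along hp
p <- ggplot(df, aes(x = hp, y = mpg)) +
  geom_point(color = "blue") +
  geom_line(data = grid_df, aes(y = mpg_pred), color = "red")

print(p)
```

The choice of the median is arbitrary; any fixed value of cyl gives a parallel line, shifted by the cyl coefficient.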
A simple and versatile solution is to draw a line using the slope and intercept arguments of geom_abline(). Example usage with a scatterplot and an lm object:
library(tidyverse)

petal.lm <- lm(Petal.Length ~ Petal.Width, iris)

ggplot(iris, aes(x = Petal.Width, y = Petal.Length)) +
  geom_point() +
  geom_abline(slope = coef(petal.lm)[[2]], intercept = coef(petal.lm)[[1]])
coef() is used to extract the coefficients of the model fitted by lm(). The numeric indexing assumes that (Intercept) is listed first, which is the case if the model includes an intercept. If you have some other linear model object or line to plot, just plug in the slope and intercept values similarly.