Simple linear regression using R

Data

test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)

1. Scatterplot and Pearson correlation coefficient
plot(test, exam, xlab = "Intermediate exams", ylab = "Final exams")
cor.test(test, exam)
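cor.test() returns a list, so the estimate and the p-value can also be pulled out directly if you want to reuse them, for example in a plot title. A small sketch using the same data (the object name ct is just illustrative):

```r
# Same data as in the Data section
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)

ct = cor.test(test, exam)
ct$estimate                 # Pearson r
ct$p.value                  # two-sided p-value
round(unname(ct$estimate), 3)
```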

2. Linear model computation
fit = lm(exam ~ test)
summary(fit)
layout(matrix(c(1,2,3,4),2,2))
plot(fit)
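If only one diagnostic plot is needed, the which argument of plot.lm() selects it, and the single-panel layout can be restored afterwards. A sketch:

```r
# Same data and model as above
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)
fit = lm(exam ~ test)

plot(fit, which = 1)   # residuals vs fitted values only
plot(fit, which = 2)   # normal Q-Q plot only
layout(1)              # restore the default single-panel layout
```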

3. Studentized residuals
Practical rule: any observation with an absolute studentized residual larger than 2 is considered an outlier.
rstudent(fit)
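The practical rule can be applied programmatically: rstudent() returns the (externally) studentized residuals, and which() flags the observations exceeding the cutoff. A sketch:

```r
# Same data and model as above
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)
fit = lm(exam ~ test)

rs = rstudent(fit)     # studentized residuals, one per observation
which(abs(rs) > 2)     # indices of candidate outliers
</imports>
```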

4. Standardized Coefficients
lm(scale(exam) ~ scale(test))
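In simple linear regression the standardized slope equals the Pearson correlation coefficient, which gives a quick consistency check on the output above:

```r
# Same data as above
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)

beta_std = coef(lm(scale(exam) ~ scale(test)))[2]  # standardized slope
r = cor(test, exam)                                # Pearson r
all.equal(unname(beta_std), r)                     # TRUE
```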

5. 95% Confidence Interval for the coefficients
confint(fit, level=0.95)
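The same interval can be reproduced by hand from the summary table as estimate ± t-quantile × standard error, which clarifies what confint() computes. A sketch:

```r
# Same data and model as above
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)
fit = lm(exam ~ test)

est = coef(summary(fit))[, "Estimate"]
se  = coef(summary(fit))[, "Std. Error"]
tq  = qt(0.975, df = fit$df.residual)              # 97.5% t-quantile
cbind(lower = est - tq * se, upper = est + tq * se)  # matches confint(fit)
```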

6. Regression line representation (simple)
plot(test, exam, xlab = "Intermediate exams", ylab = "Final exams", pch = 19, col = "cornflowerblue", cex = 1.5, cex.lab = 1.5, cex.axis = 1.5)
abline(lm(exam ~ test), col="brown4")

7. Regression line representation (richer edition)
fit = lm(exam ~ test)
newx = seq(min(test), max(test), length.out=100)
preds = predict(fit, newdata = data.frame(test=newx), interval = 'confidence')
plot(exam ~ test, xlab = "Intermediate exams", ylab = "Final exams", pch = 19, col = "grey37", cex = 1.5, cex.lab = 1.5, cex.axis = 1.5)
rsquare = round(summary(fit)$r.squared, 3)
text(1.5, 9, bquote(R^2 == .(rsquare)), cex = 1.5)
polygon(c(rev(newx), newx), c(rev(preds[ ,3]), preds[ ,2]), col = adjustcolor("grey", alpha.f = 0.4), border = NA)
abline(fit)
lines(newx, preds[ ,3], lty = 'dashed', col = 'red')
lines(newx, preds[ ,2], lty = 'dashed', col = 'red')
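A prediction interval for individual new observations (wider than the confidence band for the mean response) can be drawn the same way by changing the interval argument of predict(). A sketch, with styling chosen here for illustration:

```r
# Same data and model as above
test = c(5.5, 6, 7, 5, 8, 2, 9, 10, 2, 3, 4, 6.5, 8.5, 1)
exam = c(6.5, 6, 8, 7.5, 7, 4, 8, 10, 1, 5, 5, 6, 9, 5)
fit = lm(exam ~ test)
newx = seq(min(test), max(test), length.out = 100)

pred = predict(fit, newdata = data.frame(test = newx), interval = "prediction")
plot(exam ~ test, xlab = "Intermediate exams", ylab = "Final exams", pch = 19)
abline(fit)
lines(newx, pred[, "lwr"], lty = "dotted", col = "blue")  # lower prediction limit
lines(newx, pred[, "upr"], lty = "dotted", col = "blue")  # upper prediction limit
```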
The example above comes from Section 4.2 of the book "Στατιστική ανάλυση με τη γλώσσα R" (Statistical Analysis with the R Language; in Greek, ISBN: 978-960-93-9445-1), published in Thessaloniki, 2017.
