Analysis of covariance (ANCOVA) using R

Data

exam = c(65, 65, 60, 70, 55, 80, 40, 90, 50, 100, 30, 95)
gender = c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
gender = factor(gender, levels = c(0, 1), labels = c("Woman", "Man"))
age = c(29, 28, 22, 19, 18, 28, 16, 25, 17, 35, 15, 32)
ancova.df = data.frame(exam, gender, age)

1. Descriptive Statistics
my.summary = function(avar){
  # Format as "mean (sd)", both rounded to one decimal place
  return (paste0(round(mean(avar), 1), " (", round(sd(avar), 1), ")"))
}
print(my.summary(ancova.df$exam[ancova.df$gender == "Woman"]))
print(my.summary(ancova.df$exam[ancova.df$gender == "Man"]))
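
The same group summaries can also be obtained in one pass over the grouping factor. What follows is only a minimal sketch using base R's tapply(); the names group.n, group.mean, group.sd and group.summary are illustrative and not part of the original example, and the data are repeated so that the block runs on its own.

```r
# Data repeated from the "Data" step so the block is self-contained
exam = c(65, 65, 60, 70, 55, 80, 40, 90, 50, 100, 30, 95)
gender = factor(c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
                levels = c(0, 1), labels = c("Woman", "Man"))
age = c(29, 28, 22, 19, 18, 28, 16, 25, 17, 35, 15, 32)
ancova.df = data.frame(exam, gender, age)

# Group size, mean and standard deviation of the exam score per gender
group.n = tapply(ancova.df$exam, ancova.df$gender, length)
group.mean = tapply(ancova.df$exam, ancova.df$gender, mean)
group.sd = tapply(ancova.df$exam, ancova.df$gender, sd)

# Collect the three summaries in one table (illustrative name)
group.summary = data.frame(n = as.vector(group.n),
                           mean = round(as.vector(group.mean), 1),
                           sd = round(as.vector(group.sd), 1),
                           row.names = levels(ancova.df$gender))
print(group.summary)
```

With one row per group, the table is easy to extend with further statistics such as the median.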

2. Compare the exam scores of men and women (independent-samples t-test)
t.test(exam ~ gender, data = ancova.df, var.equal = TRUE)

3. Scatterplot of exam score against age
plot(age, exam, xlab = "Age", ylab = "Score", 
     main = "Age and performance correlation", 
     cex = 1.3, cex.main = 1.3, cex.axis = 1.3, cex.lab = 1.3)

4. Pearson correlation between age and exam
cor.test(exam, age)

5. Analysis of covariance (ANCOVA) with partial eta-squared effect sizes
library(heplots)
lm.ancova = lm(exam ~ age + gender, ancova.df)
etasq(lm.ancova, anova = TRUE, type = 3)
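
ANCOVA compares the groups after adjusting for the covariate, so it can help to look at the adjusted group means as well. The sketch below is one possible way to get them with base R's predict(), evaluating the fitted model for each gender at the overall mean age. The names newdata and adjusted.means are illustrative and do not come from the original example.

```r
# Data repeated from the "Data" step so the block is self-contained
exam = c(65, 65, 60, 70, 55, 80, 40, 90, 50, 100, 30, 95)
gender = factor(c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
                levels = c(0, 1), labels = c("Woman", "Man"))
age = c(29, 28, 22, 19, 18, 28, 16, 25, 17, 35, 15, 32)
ancova.df = data.frame(exam, gender, age)

# Same model as in the ANCOVA step
lm.ancova = lm(exam ~ age + gender, data = ancova.df)

# One row per gender, with age fixed at the overall sample mean
newdata = data.frame(age = mean(ancova.df$age),
                     gender = factor(c("Woman", "Man"),
                                     levels = levels(ancova.df$gender)))

# Predicted exam score for each gender at the mean age (adjusted means)
adjusted.means = predict(lm.ancova, newdata = newdata)
names(adjusted.means) = as.character(newdata$gender)
print(round(adjusted.means, 1))
```

Because the model has no interaction term, the difference between the two adjusted means equals the gender coefficient of the model, whatever age is chosen for the evaluation.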

6. Normality of residuals
allresiduals = residuals(lm.ancova)
library(moments)
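skewness(allresiduals)   # close to 0 for a symmetric distribution
kurtosis(allresiduals)   # moments::kurtosis is not excess kurtosis: a normal distribution gives 3
# The mean and SD are estimated from the same residuals, so this K-S p-value is only approximate
ks.test(allresiduals, "pnorm", mean(allresiduals), sd(allresiduals))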
shapiro.test(allresiduals)

7. Residuals: Histogram and Q-Q plot
hist(allresiduals, xlab = "Residuals", ylab = "Frequency", col = c("grey63"))
qqnorm(allresiduals)
qqline(allresiduals, col = 2, lwd = 2, lty = 2)

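8. Homogeneity of variances between the groups (Levene's test)
library(car)  # provides leveneTest(); load it explicitly in case it is not already attached
leveneTest(exam ~ gender, ancova.df)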
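
Another assumption often checked in ANCOVA is that the regression slope of the covariate is the same in both groups. The original example does not include this check, so the following is only a sketch of one common approach: fit a second model with an age-by-gender interaction and compare it with the additive model using base R's anova(). The names lm.additive, lm.interaction and slopes.test are illustrative.

```r
# Data repeated from the "Data" step so the block is self-contained
exam = c(65, 65, 60, 70, 55, 80, 40, 90, 50, 100, 30, 95)
gender = factor(c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
                levels = c(0, 1), labels = c("Woman", "Man"))
age = c(29, 28, 22, 19, 18, 28, 16, 25, 17, 35, 15, 32)
ancova.df = data.frame(exam, gender, age)

# Additive ANCOVA model: one common slope for age
lm.additive = lm(exam ~ age + gender, data = ancova.df)

# Model with interaction: a separate age slope for each gender
lm.interaction = lm(exam ~ age * gender, data = ancova.df)

# F test of the interaction term (nested model comparison)
slopes.test = anova(lm.additive, lm.interaction)
print(slopes.test)
```

A non-significant F test in the second row is usually read as no evidence against equal slopes, although with only 12 observations the test has little power.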

Remark: ANCOVA is a special case of the linear model, so the same analysis can also be run and read as a linear regression
lm.ancova = lm(exam ~ age + gender, ancova.df)
summary(lm.ancova)
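
The link between the two views can be checked numerically: for a term with one degree of freedom, the squared t statistic from the regression summary equals the F statistic obtained when that term is dropped from the model. The sketch below uses base R's drop1() for the F test as a stand-in for the heplots output above; the names coef.table, t.gender, f.table and f.gender are illustrative.

```r
# Data repeated from the "Data" step so the block is self-contained
exam = c(65, 65, 60, 70, 55, 80, 40, 90, 50, 100, 30, 95)
gender = factor(c(0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
                levels = c(0, 1), labels = c("Woman", "Man"))
age = c(29, 28, 22, 19, 18, 28, 16, 25, 17, 35, 15, 32)
ancova.df = data.frame(exam, gender, age)

lm.ancova = lm(exam ~ age + gender, data = ancova.df)

# t statistic of the gender coefficient from the regression summary
coef.table = summary(lm.ancova)$coefficients
t.gender = coef.table["genderMan", "t value"]

# F statistic for gender when it is dropped from the full model
f.table = drop1(lm.ancova, test = "F")
f.gender = f.table["gender", "F value"]

# The two statistics agree: t^2 equals F
print(c(t.squared = t.gender^2, F = f.gender))
```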

This example is taken from Section 5.5 of the book "Στατιστική ανάλυση με τη γλώσσα R" (in Greek, ISBN: 978-960-93-9445-1), published in Thessaloniki in 2017.
