---
title: "Issues in the interpretation of 2-way ANOVA model outputs"
output: slidy_presentation
---

## The coke data:

```{r}
library("ISwR")
data(coking)
attach(coking)
boxplot(time~temp*width,data=coking,
        col=(c("gold","darkgreen")),
        main="Time to process coke",
        xlab="Temperature . Oven Size", ylab="Time")
```

## Experimental design

```{r}
table(coking[,c('width','temp')])
```

Notation:

- $n_{ij}$ number of observations for level $i$ of Temperature and level $j$ of Width
- $n_{i.}$ number of observations for level $i$ of Temperature
- $n_{.j}$ number of observations for level $j$ of Width
- $n$ total number of observations

The experimental design is complete and balanced, with replications.

## Orthogonality

We also have:

$$
n_{ij} = \frac{n_{i.}n_{.j}}{n}
$$

This property is known as orthogonality of the design.

## Testing effects of factors in an orthogonal experimental design

Below we compare the output of several model fits.

## time~temp

```{r}
summary(lm(time~temp))
anova(lm(time~temp))
```

## Boxplot of time~temp ignoring the effect of width

```{r}
boxplot(time~temp,data=coking,
        col="red",
        main="Time to process coke",
        xlab="Temperature", ylab="Time")
```

Disregarding the potential effect of `width` makes us blind to the effect of `temp`.

## time~width

```{r}
summary(lm(time~width))
anova(lm(time~width))
```

## Boxplot of time~width ignoring the effect of temp

```{r}
boxplot(time~width,data=coking,
        col="red",
        main="Time to process coke",
        xlab="Width", ylab="Time")
```

## Take-home message \# 1:

It can be dangerous to draw conclusions from separate analyses of the various factors.

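
## Checking orthogonality numerically

The orthogonality condition $n_{ij} = n_{i.}n_{.j}/n$ stated earlier can be checked directly from the table of cell counts. The chunk below is a minimal sketch, assuming the `ISwR` package is installed; the object names `counts` and `expected` are introduced here for illustration only.

```{r}
library("ISwR")
data(coking)

# Observed cell counts n_ij (rows: width, columns: temp)
counts <- table(coking[, c("width", "temp")])

# Counts implied by the margins: n_i. * n_.j / n
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

counts
expected

# TRUE when every cell satisfies the orthogonality condition
isTRUE(all.equal(as.vector(counts), as.vector(expected)))
```

Comparing the observed counts with the product of the margins is only one way to phrase the check; any balanced complete design satisfies it automatically.
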
## time~temp+width

```{r}
summary(lm(time~temp+width))
```

## time~temp+width

```{r}
anova(lm(time~temp+width))
```

In this balanced design, the estimated temperature effect and its sum of squares are the same under the models `time~temp` and `time~temp+width`. The standard error, F statistic and p-value relative to temperature differ, because accounting for `width` reduces the residual variance.

##

```{r}
anova(lm(time~temp))
```

## Order of factors in the formula

Perhaps not surprisingly, `lm(time~temp+width)` and `lm(time~width+temp)` return strictly the same numerical output (only the order of the rows changes):

```{r}
anova(lm(time~temp+width))
anova(lm(time~width+temp))
```

## Analysis in un-balanced design:

Let us create an artificially unbalanced design by dropping the first observation:

```{r}
coking.unb <- coking[-1,]
table(coking.unb[,c('width','temp')])
```

Doing so, we have lost the orthogonality property.

## Order of factors in the formula for unbalanced design

In the unbalanced design, `lm(time~temp+width)` and `lm(time~width+temp)` still describe the same fitted model, but `anova()` no longer returns the same sums of squares and p-values. Its tests are sequential, so they depend on the order in which the factors enter the formula:

```{r}
anova(lm(time~temp+width,data=coking.unb))
anova(lm(time~width+temp,data=coking.unb))
```

## Take-home message \# 2:

The lack of orthogonality complicates the interpretation of analyses of variance.
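
## Checking the unbalanced design numerically

The loss of orthogonality and its consequence for the sequential sums of squares can be made explicit with a few lines of base R. This is a minimal sketch, assuming the `ISwR` package is installed; the object names `counts.unb`, `expected.unb`, `ss.first` and `ss.second` are introduced here for illustration only.

```{r}
library("ISwR")
data(coking)

# Same unbalanced data set as above: drop the first observation
coking.unb <- coking[-1, ]

# Observed cell counts versus the counts implied by the margins
counts.unb <- table(coking.unb[, c("width", "temp")])
expected.unb <- outer(rowSums(counts.unb), colSums(counts.unb)) / sum(counts.unb)

# Non-zero entries show where n_ij differs from n_i. * n_.j / n
round(counts.unb - expected.unb, 2)

# Sequential sum of squares for temp, entered first versus second
ss.first <- anova(lm(time ~ temp + width, data = coking.unb))["temp", "Sum Sq"]
ss.second <- anova(lm(time ~ width + temp, data = coking.unb))["temp", "Sum Sq"]
c(temp.first = ss.first, temp.second = ss.second)
```

When `temp` is entered first, its sum of squares ignores `width`; when it is entered second, it is adjusted for `width`. In an orthogonal design the two coincide, which is why the order did not matter before the observation was dropped.
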