Read the data, and verify that the ANOVA factors are stored as factors in the R object.
library(SPH.140.615)
summary(mosq)
## length individual cage
## Min. :49.30 1:6 A:8
## 1st Qu.:58.25 2:6 B:8
## Median :67.05 3:6 C:8
## Mean :66.63 4:6
## 3rd Qu.:72.03
## Max. :84.00
str(mosq)
## 'data.frame': 24 obs. of 3 variables:
## $ length : num 58.5 59.5 77.8 80.9 84 83.6 70.1 68.3 69.8 69.8 ...
## $ individual: Factor w/ 4 levels "1","2","3","4": 1 1 2 2 3 3 4 4 1 1 ...
## $ cage : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...
Plot the data.
par(las=1)
stripchart(split(jitter(mosq$length,factor=2), list(mosq$individual, mosq$cage)), ylab="", pch=1, vertical=TRUE)
abline(v=c(4.5,8.5),lty=2)
Let’s give this a go.
mosq.aov <- aov(length ~ cage / individual, data=mosq)
summary(mosq.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## cage 2 665.7 332.8 255.7 1.45e-10 ***
## cage:individual 9 1720.7 191.2 146.9 6.98e-11 ***
## Residuals 12 15.6 1.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This is wrong. By default, R tests both factors by calculating the mean-squares with the error in the denominator, and there is no option to tell it otherwise. Later, we will learn how to do this analysis with maximum likelihood estimation. For now, we use a custom function for the sums of squares approach, which is available in the SPH.140.615 package.
nested.anova(mosq.aov)
## Analysis of Variance Table
##
## Response: length
## Df Sum Sq Mean Sq F value Pr(>F)
## cage 2 665.68 332.84 1.7409 0.2295
## cage:individual 9 1720.68 191.19 146.8781 6.981e-11 ***
## Residuals 12 15.62 1.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This function is also written so you can easily access the entries in the ANOVA table.
str(nested.anova(mosq.aov))
## Classes 'anova' and 'data.frame': 3 obs. of 5 variables:
## $ Df : int 2 9 12
## $ Sum Sq : num 665.7 1720.7 15.6
## $ Mean Sq: num 332.8 191.2 1.3
## $ F value: num 1.74 146.88 NA
## $ Pr(>F) : num 2.30e-01 6.98e-11 NA
## - attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: length"
For example, to estimate the variances in a random effects model, we need to access the Mean Squares.
MS <- nested.anova(mosq.aov)$M
MS
## [1] 332.837917 191.186389 1.301667
Using the expected mean squares formulas from class, we can get the within, subgroup and group estimates.
Sw <- MS[3]
Ss <- (MS[2]-MS[3])/2
Sg <- (MS[1]-MS[2])/(2*4)
S <- c(Sw,Ss,Sg)
S
## [1] 1.301667 94.942361 17.706441
The standard deviations.
sqrt(S)
## [1] 1.140906 9.743837 4.207902
The percentage contributions to the total variability.
S/sum(S)
## [1] 0.01142309 0.83318974 0.15538717
round(100*S/sum(S),1)
## [1] 1.1 83.3 15.5
The data are available in the SPH.140.615 package.
str(flies)
## 'data.frame': 192 obs. of 3 variables:
## $ response: num 19 19.6 22.9 26 28.6 ...
## $ jar : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...
## $ strain : Factor w/ 8 levels "BS","LC","LDD",..: 3 3 3 3 3 3 3 3 3 3 ...
summary(flies)
## response jar strain
## Min. :17.15 A:64 BS :24
## 1st Qu.:25.51 B:64 LC :24
## Median :28.95 C:64 LDD :24
## Mean :29.52 NH :24
## 3rd Qu.:33.23 NKS :24
## Max. :48.45 OL :24
## (Other):48
Plot the data.
par(las=2)
stripchart(split(flies$response, list(flies$jar, flies$strain)), ylab="", pch=1, vertical=TRUE, xlim=c(1.2,23.8))
abline(v=seq(3.5,24,by=3),lty=2)
nested.anova(aov(response ~ strain / jar, data=flies))
## Analysis of Variance Table
##
## Response: response
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 7 1323.4 189.059 8.4673 0.0002196 ***
## strain:jar 16 357.3 22.328 0.8044 0.6793007
## Residuals 168 4663.2 27.757
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1