Chapter 2 - Comparing Means among Three or More Groups of Numbers:
Analysis of Variance

INTRODUCTION
The t-Test allows comparison of the means of two groups of numbers. However, in some cases it's useful to compare the means of three or more groups of numbers. Here the question is whether there is a significant difference among the means of the different groups. (The word "among" is used instead of the word "between" when the comparison involves more than two groups.) In these cases an Analysis of Variance (Anova) is required. Throughout this chapter, Anova refers to one-way Anova where only one independent variable is considered (see end of chapter for discussion on other types of Anova).

BACKGROUND EXAMPLE

Imagine you are a researcher in charge of a captive breeding program for an endangered butterfly. It turns out that the caterpillars of this butterfly, the cabbage white, feed on the leaves of plants in the mustard family. You are trying to figure out the most efficient way to raise these caterpillars in captivity and you set up an experiment to compare broccoli, cabbage and wild mustard as food for the caterpillars.

In your experiment, you raise caterpillars in your greenhouse in groups of 100 on ten patches each of broccoli, cabbage and wild mustard. For each patch, you count the number of caterpillars that survive to pupation (when they make their chrysalis) and calculate a percent survival in each patch. Table 2.1 shows the data from this experiment. A spreadsheet with these data can be found here.

Table 2.1. Percent survival of cabbage white butterflies raised on broccoli, cabbage and wild mustard.

Research Question: Is there a significant difference among the three diets in the survival rate of caterpillars to pupation?

If one of these food plants results in higher survival rates, it may be the best option for the breeding program. In order to answer your question, it makes sense to describe and graph the data, then perform an Analysis of Variance to determine whether there is a significant difference among the mean survival rates for the three diets.

ENTERING AND DESCRIBING THE DATA

See Chapter 1 for detailed directions on how to enter data and use formulas to calculate basic descriptive statistics (Table 1.4 should be particularly useful). Table 2.2 shows the raw data for the caterpillar diet experiment along with corresponding descriptive statistics.

Table 2.2. Raw data and descriptive statistics for data from the caterpillar diet experiment.

As shown in Table 2.2, the mean survival rates for caterpillars raised on broccoli and cabbage are very similar (79.6% and 79.4%), but the mean survival rate on wild mustard is greater (87.3%). Also, the maximum and minimum survival rates for wild mustard are greater than those for broccoli and cabbage. However, the ranges are large (23, 21 and 22), and there is a fair amount of overlap in the three groups of numbers. The next step in understanding any possible effects of diet on survival rate is to graph the data.
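For readers who work outside MS Excel, the same descriptive statistics can be sketched in a few lines of Python. The survival values below are hypothetical numbers made up for illustration; they are not the data in Table 2.1 or Table 2.2, and the names survival, describe and summary are simply choices for this sketch.

```python
import statistics

# Hypothetical percent-survival values, one number per patch.
# These are NOT the values from Table 2.1 / Table 2.2.
survival = {
    "broccoli": [72, 75, 78, 79, 80, 80, 81, 82, 85, 88],
    "cabbage": [71, 74, 77, 78, 79, 79, 80, 81, 84, 87],
    "wild mustard": [80, 83, 86, 87, 88, 88, 89, 90, 93, 96],
}


def describe(values):
    """Return the sample size, mean, minimum, maximum and range of one group."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# One summary per diet, in the same spirit as the bottom rows of Table 2.2.
summary = {diet: describe(values) for diet, values in survival.items()}
for diet, stats in summary.items():
    print(diet, stats)
```

As in the table, comparing the means alongside the minimum, maximum and range gives a first impression of how much the groups overlap.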

GRAPHING THE DATA

Table 2.3 shows how the caterpillar survival data must be entered in order to graph these data and to add mean values to the scatterplot. Here are detailed directions for making this type of graph (as shown in Figure 2.1 below).

Table 2.3. Data from the caterpillar diet experiment entered in the correct form for making a scatterplot that shows mean values. Category 1 corresponds to broccoli, category 2 to cabbage and category 3 to wild mustard.

Figure 2.1. Scatterplot showing the raw data from the caterpillar diet experiment along with the mean survival rate to pupation for each diet type.

Once you've created a scatterplot that includes mean values, it should be easier to interpret your data. As we discussed earlier when interpreting Table 2.2, the mean value, and the maximum and minimum values for the wild mustard diet are all greater than the corresponding values for the other two diets. Also, there is a fair amount of "spread" in the data as revealed by the relatively large range within each group. However, the scatterplot in Figure 2.1 reveals some additional information. Almost all of the survival rates for caterpillars raised on broccoli and cabbage are less than the mean survival rate for those raised on wild mustard. Likewise, almost all of the survival rates for caterpillars raised on wild mustard are greater than the means for broccoli and cabbage. So perhaps survival rate is significantly greater on wild mustard than on the other two plants. In order to confirm this, we need to do an analysis of variance.

TESTING THE DATA

An Analysis of Variance (Anova) is used to determine whether there is a statistically significant difference among the means of three or more groups of numbers. In the caterpillar example, Anova will allow us to test whether there's a significant difference in the survival rate among the three caterpillar diets.

In order to use MS Excel to do an Anova, the data must be entered as shown in Table 2.1. These directions will describe how to do the Anova and format your table.

Table 2.4. Table showing the MS Excel output from an Anova on the caterpillar survival data.

Table 2.5. Example of a formatted table showing the Anova output that should be included in a talk or paper.

Table 2.5 above includes the minimum information that should be reported for an Anova. It includes the values you will need to determine whether the results show statistical significance as well as additional information that is helpful in understanding how an Anova works.

The Essentials
In Table 2.5, count is simply the sample size; in this case it corresponds to the 10 patches of each food plant (broccoli, cabbage and wild mustard) in which you raised the caterpillars. The table also shows the mean (or average) survival rate for each of the three diets, and the variance, which measures the variation in survival rate on each diet: the higher the value for variance, the greater the variation. The variance is equal to the standard deviation squared, so if you wanted to know the standard deviation for each diet, you could simply take the square root of the variance (to take a square root in MS Excel, use the "sqrt" function).
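The relationship between variance and standard deviation can be checked with a short Python sketch. The values are hypothetical (not taken from Table 2.5), and the sketch assumes the sample variance, which divides by n - 1.

```python
import math
import statistics

# Hypothetical percent-survival values for one diet (not the chapter's data).
values = [72, 75, 78, 79, 80, 80, 81, 82, 85, 88]

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = statistics.variance(values)

# The standard deviation is the square root of the variance.
standard_deviation = math.sqrt(variance)

print(f"variance = {variance:.2f}")
print(f"standard deviation = {standard_deviation:.2f}")
```

Taking the square root by hand gives the same number as asking for the standard deviation directly (statistics.stdev in Python).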

The answer to whether there is a significant difference among the means is found by looking at the p-value. If p < 0.05, then there is a significant difference; if p > 0.05, there is not. In this case, p = 0.027 which is less than 0.05, so there is a significant difference among the mean survival rates of caterpillars raised on broccoli, cabbage and wild mustard. In precise terms, a p-value of 0.027 means that if diet truly had no effect on survival, there would be only a 2.7% chance of seeing differences among the means at least as large as these as a result of random chance alone. Therefore, since random chance alone is unlikely to produce differences this large, something interesting is probably going on, such as a real effect of food type on the survival of the caterpillars.

So, you can conclude that diet did have a significant effect on the survival rate of caterpillars in your experiment. By looking at the mean values and your scatterplot, you can conclude that survival rate was higher for caterpillars raised on wild mustard than for those raised on broccoli or cabbage. If you want to formally make pairwise comparisons between just two groups (such as wild mustard vs. broccoli) you need to do some additional testing. MS Excel is not set up to do this type of pairwise comparison. For details on making pairwise comparisons in conjunction with Anova, see Gotelli and Ellison (2004). Alternatively, a simple t-Test can be used to make a pairwise comparison, but should be interpreted with caution because doing multiple t-Tests can artificially increase the chance of finding statistical significance. Some suggestions for using MS Excel for pairwise comparison can be found in Appendix VIII.
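The caution about multiple t-Tests can be made concrete with a little arithmetic. The sketch below assumes each test uses a cutoff of 0.05 and, as a simplification, that the tests are independent of one another (pairwise tests that share groups are not strictly independent, so treat the result as a rough illustration). The function name familywise_error_rate is just a label chosen for this sketch.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05  # significance cutoff used for each individual t-Test

# Three diets give three possible pairs:
# broccoli vs. cabbage, broccoli vs. wild mustard, cabbage vs. wild mustard.
n_pairs = 3

rate = familywise_error_rate(alpha, n_pairs)
print(f"Chance of at least one false positive in {n_pairs} tests: {rate:.3f}")
```

Under these assumptions the chance of at least one spurious "significant" result is about 14%, nearly three times the 5% you accept for a single test.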

Other Parameters
Table 2.5 also includes three other parameters, df, MS and F. The parameter df is an abbreviation for degrees of freedom which is related to sample size. In general, the higher the value for df, the greater the ability to detect significant differences. MS stands for mean square. The two values shown in the table reflect the amount of variation between groups and the amount of variation within groups. In this case the Between Groups MS represents the amount of variation in survival rates among the groups of caterpillars raised on broccoli, cabbage and wild mustard. Likewise, the Within Groups MS represents the amount of variation in survival rates within the caterpillars raised on a given food type. In general, if there is a lot of variation within groups (such as a lot of "spread" in the broccoli data, the cabbage data and the wild mustard data) and not a lot of variation among groups, then there probably is not a significant difference among the means of the groups. On the other hand, if there is very little variation within groups (such as little "spread" in the broccoli data, etc.) but a lot of variation among the groups (such as little overlap in the data among the groups), there probably is a significant difference among the means. As explained in more detail below, F is a ratio that measures the amount of among group variation relative to within group variation.

MORE ON ANOVA AND ALTERNATIVE TESTS

The Concept Behind Anova
What follows is a very brief overview of how Anova works. For details, consult Gotelli and Ellison (2004).

In an Anova, the overall variation in the data is separated into the variation found within groups and the variation found among groups. Variation is quantified using measures called SS (SS is an abbreviation for Sum of Squares - literally the sum of squared differences between individual data points and mean values). Once the within group SS and among group SS values are calculated, they are converted to MS (mean squares) by dividing by df. The MS values can then be used to calculate F (F is a ratio and equals among group MS divided by within group MS). The higher the F ratio, the greater the among group variation relative to the within group variation and the more likely there are differences in the mean values among the groups.
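The steps above (SS, then MS, then F) can be sketched in plain Python. This is a minimal illustration for a one-way Anova, not a reproduction of what MS Excel does internally. The data are hypothetical and are not the values in Table 2.1, and the function name one_way_anova is an assumption of this sketch.

```python
def one_way_anova(groups):
    """Split the variation into among-group and within-group parts and return F."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Among-group SS: squared distance of each group mean from the grand mean,
    # weighted by the number of observations in that group.
    ss_among = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-group SS: squared distance of each data point from its own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )

    df_among = len(groups) - 1                 # number of groups minus 1
    df_within = len(all_values) - len(groups)  # total sample size minus number of groups

    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "SS_among": ss_among, "SS_within": ss_within,
        "df_among": df_among, "df_within": df_within,
        "MS_among": ms_among, "MS_within": ms_within,
        "F": ms_among / ms_within,
    }


# Hypothetical percent-survival values (NOT the data in Table 2.1).
broccoli = [72, 75, 78, 79, 80, 80, 81, 82, 85, 88]
cabbage = [71, 74, 77, 78, 79, 79, 80, 81, 84, 87]
wild_mustard = [80, 83, 86, 87, 88, 88, 89, 90, 93, 96]

result = one_way_anova([broccoli, cabbage, wild_mustard])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up numbers, F works out to about 11.65 on 2 and 27 degrees of freedom, because the among group MS is much larger than the within group MS.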

In an Anova, the calculated F ratio is then compared to a known F distribution. Similar to the way t works in the t-Test, the greater the value of F, the farther out it falls in the tail of the F distribution and the lower the p-value. A low p-value means that differences among the means as large as those observed would be unlikely to arise through random chance alone.
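One way to build intuition for this comparison is to approximate the F distribution by simulation rather than looking it up. The sketch below is a Monte Carlo approximation, which is a swapped-in teaching device and not how MS Excel computes the p-value. It repeatedly draws groups from a single normal population (so there is no real difference among them), calculates F each time, and counts how often the simulated F is at least as large as an observed F. The observed value of 11.65 comes from hypothetical data, and the function names, number of simulations and random seed are all assumptions of this sketch.

```python
import random


def f_ratio(groups):
    """Among-group MS divided by within-group MS for a one-way layout."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_among = ss_among / (len(groups) - 1)
    ms_within = ss_within / (len(values) - len(groups))
    return ms_among / ms_within


def simulated_p_value(observed_f, group_sizes, n_sims=5000, seed=1):
    """Fraction of no-difference simulations whose F is at least observed_f."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    at_least_as_large = 0
    for _ in range(n_sims):
        # Every group comes from the same population, so any differences
        # among the simulated group means are due to random chance alone.
        fake_groups = [[rng.gauss(0, 1) for _ in range(n)] for n in group_sizes]
        if f_ratio(fake_groups) >= observed_f:
            at_least_as_large += 1
    return at_least_as_large / n_sims


# Hypothetical observed F for three groups of 10 patches (not from Table 2.4).
p_estimate = simulated_p_value(observed_f=11.65, group_sizes=[10, 10, 10])
print(f"Approximate p-value: {p_estimate}")
```

The larger the observed F, the fewer simulated values reach it and the smaller the estimated p-value. The estimate is only as fine-grained as the number of simulations allows, so a very small p-value may show up as zero.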

Other Types of Anova
There are many different types of Anova. Gotelli and Ellison (2004) give an excellent review of the variety of models that can be used in Anova. Of particular interest is the two-way Anova which can test for the effects of two independent variables on a dependent variable. For example, if the caterpillar diet experiment had included patches of young plants and patches of old plants, the resulting data could be analyzed using a two-way Anova. In this case, the two independent variables being tested would be food type (broccoli vs. cabbage vs. wild mustard) and plant age (young vs. old).

When Anova is Appropriate
(See Gotelli and Ellison, 2004, for a thorough discussion of assumptions.)
• When the data are distributed normally (see Appendix III).
• When the variances within the groups being compared are equal.
• When the data are independent (see glossary for more on independence).
• When the data were collected in an unbiased manner. Though this manual does not go into detail on methods for data collection, it is important to stress that statistical tests cannot correct for data sets that were collected improperly. See Brower et al. (1998) or Krebs (1989) for more on unbiased sampling.

Alternatives to Anova
The Kruskal-Wallis One Way Anova is a non-parametric alternative to one-way Anova.
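As a rough sketch of how this alternative works, the Kruskal-Wallis test replaces the raw data with ranks and calculates a statistic, usually called H, from the sum of ranks in each group. The Python below computes H only; it leaves out the correction for tied values and the conversion of H to a p-value, so it should be read as an illustration rather than a complete test. The data are hypothetical and the function names are assumptions of this sketch.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without the correction for ties."""
    values = [v for g in groups for v in g]
    ranks = average_ranks(values)
    n_total = len(values)
    weighted_rank_sums = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted_rank_sums += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical percent-survival values (NOT the data in Table 2.1).
broccoli = [72, 75, 78, 79, 80, 80, 81, 82, 85, 88]
cabbage = [71, 74, 77, 78, 79, 79, 80, 81, 84, 87]
wild_mustard = [80, 83, 86, 87, 88, 88, 89, 90, 93, 96]

h_statistic = kruskal_wallis_h([broccoli, cabbage, wild_mustard])
print(f"H = {h_statistic:.3f}")
```

Because the test uses ranks rather than the raw values, it does not depend on the data being normally distributed.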