- These directions are for doing a single-factor analysis of variance to compare the mean values of three or more groups. If you only have two groups to compare, I recommend a t-Test.
- First, make sure your data are entered appropriately in a spreadsheet. For an example, you may use the butterfly data found here (you can see how the data are formatted in the screenshot below).
- Go to the Data tab, choose Data Analysis, select “Anova: Single Factor,” then click OK. (If you don’t have a Data Analysis option under your Data tab, see directions for installing the ToolPak here.) Your screen should look like this:
- Click in the Input Range box, then click and drag to highlight all three columns of data along with the column headings.
- Make sure the Grouped By: Columns option is selected and the Labels in First Row box is checked.
- Click OK. You should see output that looks like this:
- Congratulations - you just successfully completed an Anova test! Now for the hard part - formatting and interpreting your table.
- First, make sure you’re seeing all of the output. Sometimes the columns need to be widened so that you can see everything. To automatically make a column wide enough to show all of its contents, place your cursor on the line between two column headings (for example, between the letters A and B) and double-click. You should now see all of the information.
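If you would like to double-check the Excel output, the same test can be run outside of Excel. Below is a minimal sketch using Python's scipy library (Python and scipy are not part of these directions; they are just one convenient way to verify the result). The survival-rate values shown are illustrative placeholders, not the actual butterfly data, so substitute the three columns from your own spreadsheet.

    # A quick cross-check of Excel's "Anova: Single Factor" output using scipy.
    # The numbers below are illustrative placeholders, not the real butterfly data.
    from scipy import stats

    broccoli     = [72, 81, 79, 85, 76, 88, 74, 83, 80, 78]   # one list per spreadsheet column
    cabbage      = [70, 84, 77, 82, 75, 86, 73, 85, 81, 81]
    wild_mustard = [88, 91, 83, 90, 85, 92, 84, 89, 86, 85]

    f_stat, p_value = stats.f_oneway(broccoli, cabbage, wild_mustard)
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
    print("Significant at the 0.05 level" if p_value < 0.05 else "Not significant at the 0.05 level")

The F statistic and p-value it prints should match the F and P-value cells in the Excel output, apart from small rounding differences.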
Formatting Your Statistical Table
You will need to modify your output table before presenting it in a paper or talk. Some information can be deleted; other information needs to be re-labeled.
Your goal is to make your output look like Table 1 below. This bulleted list includes detailed directions for doing that - or, you can experiment by trial and error until you get it right.
- Delete the top two rows in your table.
- Delete the word "SUMMARY" and give your table a number, name, and description, such as "Table 1. Output of an Anova comparing the mean survival rate to pupation of caterpillars raised on broccoli, cabbage and wild mustard."
- Make all of the fonts normal rather than italic.
- Rename the categories under "Groups" so that they read "broccoli", "cabbage", and "wild mustard".
- Replace the word "Groups" with the word "diet".
- Highlight the cells with the entries "Sum", "796", "794", and "873", choose Delete from the Edit menu, make sure "Shift cells left" is chosen, then click OK.
- Highlight the cells with the entries "SS", "405.8", and "1318.9", choose Delete from the Edit menu, make sure "Shift cells left" is chosen, then click OK.
- Highlight the cells with the entries "F crit" and "3.354131", choose Delete from the Edit menu, make sure "Shift cells left" is chosen, then click OK.
- Delete the bottom two rows from the table (the row beginning with "Total" and the blank row above it).
- Retype all numbers so they appear with the correct number of digits to the right of the decimal point (see the discussion of reporting digits on page 15 under "Formatting Your Statistical Table").
Your table should now look like this:
Interpreting Your Output
Table 1 includes the minimum information that should be reported for an Anova: the values you will need to determine whether the results show statistical significance, as well as additional information that’s helpful in understanding how an Anova works.
The Essentials
In Table 1, count is simply the sample size – in this case it corresponds to the 10 patches of each food plant (broccoli, cabbage, and wild mustard) in which the caterpillars were raised. The table also shows the mean (or average) survival rate for each of the three diets, and the variance (a measure of the variation in survival rate) for each diet; the higher the value for variance, the greater the variation. The variance is equal to the standard deviation squared, so if you wanted to know the standard deviation for each diet, you could simply take the square root of the variance (to take a square root in MS Excel, use the SQRT function).
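If you want to see how count, mean, variance, and standard deviation relate to one another outside of Excel, here is a minimal sketch in Python (an illustration only; the values are placeholders rather than the actual butterfly data):

    # Summary statistics like those in the top half of Table 1, for one diet.
    # Values are illustrative placeholders, not the real data.
    import math
    import statistics

    broccoli = [72, 81, 79, 85, 76, 88, 74, 83, 80, 78]   # survival rates for one diet

    count = len(broccoli)                      # the "Count" column: sample size (10 patches)
    mean = statistics.mean(broccoli)           # the "Average" column
    variance = statistics.variance(broccoli)   # the "Variance" column (sample variance)
    std_dev = math.sqrt(variance)              # standard deviation = square root of variance,
                                               # the same as Excel's SQRT function
    print(count, mean, round(variance, 2), round(std_dev, 2))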
The answer to whether there is a significant difference among the means is found by looking at the p-value. If p < 0.05, then there is a significant difference; if p > 0.05, there is not. In this case, p = 0.027, which is less than 0.05, so there is a significant difference among the mean survival rates of caterpillars raised on broccoli, cabbage and wild mustard. In precise terms, a p-value of 0.027 means that if diet really had no effect, there would be only a 2.7% chance of getting differences among the means as large as these by random chance alone. Therefore, since random chance probably is not causing the differences, something interesting is probably going on, such as a real effect of food type on the survival of the caterpillars.
So, you can conclude that diet did have a significant effect on the survival rate of caterpillars in your experiment. By looking at the mean values and a scatterplot (always recommended to go along with your Anova; directions here), you can conclude that survival rate was higher for caterpillars raised on wild mustard than for those raised on broccoli or cabbage. If you want to formally make pairwise comparisons between just two groups (such as wild mustard vs. broccoli), you need to do some additional testing. Directions for pairwise comparisons can be found here.
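The pairwise directions linked above are not reproduced here; as a rough illustration of the idea, one common approach is to run a t-test on each pair of diets and then correct for making multiple comparisons. The sketch below uses a Bonferroni correction, which may or may not be the method in the linked directions, and again uses placeholder data:

    # One common follow-up to a significant Anova: pairwise t-tests with a
    # Bonferroni correction for running three comparisons. This is a generic
    # sketch, not necessarily the method in the directions linked above.
    # Data values are illustrative placeholders.
    from itertools import combinations
    from scipy import stats

    diets = {
        "broccoli":     [72, 81, 79, 85, 76, 88, 74, 83, 80, 78],
        "cabbage":      [70, 84, 77, 82, 75, 86, 73, 85, 81, 81],
        "wild mustard": [88, 91, 83, 90, 85, 92, 84, 89, 86, 85],
    }

    n_comparisons = 3
    for (name_a, a), (name_b, b) in combinations(diets.items(), 2):
        t_stat, p_value = stats.ttest_ind(a, b)
        corrected_p = min(p_value * n_comparisons, 1.0)   # Bonferroni-adjusted p-value
        print(f"{name_a} vs. {name_b}: corrected p = {corrected_p:.3f}")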
Other Parameters
Table 1 also includes three other parameters: df, MS, and F. The parameter df is an abbreviation for degrees of freedom, which is related to sample size. In general, the higher the value for df, the greater the ability to detect significant differences. MS stands for mean square; the two values shown in the table reflect the amount of variation between groups and the amount of variation within groups. In this case the Between Groups MS represents the amount of variation in survival rates among the groups of caterpillars raised on broccoli, cabbage and wild mustard. Likewise, the Within Groups MS represents the amount of variation in survival rates within the caterpillars raised on a given food type.

In general, if there is a lot of variation within groups (such as a lot of "spread" in the broccoli data, the cabbage data and the wild mustard data) and not a lot of variation among groups, then there probably is not a significant difference among the means of the groups. On the other hand, if there is very little variation within groups (such as little "spread" in the broccoli data, etc.) but a lot of variation among the groups (such as little overlap in the data among the groups), there probably is a significant difference among the means. As explained in more detail in the Anova section of the online Stats Manual, F is a ratio that measures the amount of among-group variation relative to within-group variation – it is simply the Between Groups MS divided by the Within Groups MS.
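To make the arithmetic behind df, MS, and F concrete, here is a short sketch that reproduces the bottom half of the example table from the SS and df values in the raw Excel output (405.8 and 1318.9, with three diets of 10 patches each). The scipy calls are just a convenient way to look up the p-value and critical F; they are not part of the Excel directions.

    # How SS, df, MS, F, and the p-value fit together for the example Anova table.
    from scipy import stats

    ss_between, df_between = 405.8, 3 - 1        # df between groups = number of groups - 1
    ss_within,  df_within  = 1318.9, 30 - 3      # df within groups = total sample size - number of groups

    ms_between = ss_between / df_between         # MS = SS / df  ->  202.9
    ms_within  = ss_within / df_within           # ->  about 48.8
    f_ratio = ms_between / ms_within             # F = Between Groups MS / Within Groups MS  ->  about 4.15

    p_value = stats.f.sf(f_ratio, df_between, df_within)   # -> about 0.027, as reported above
    f_crit = stats.f.ppf(0.95, df_between, df_within)      # -> about 3.354, the "F crit" deleted earlier
    print(f"F = {f_ratio:.2f}, p = {p_value:.3f}, F crit = {f_crit:.3f}")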