Time for the second blog! Can’t believe how fast this semester is going…quite scary actually. I thought I would base my blog on what we have been doing over the last 2 weeks, which is one-way ANOVAs (if you don’t know). Basically, an ANOVA is an analysis of variance, which is used to compare the means of 3 or more groups. If you only have two groups then you would use a t-test to compare the means. A one-way ANOVA is used to see if there is a difference between groups; however…and it is a big however, it does not tell you where the differences lie, so follow-up tests are done afterwards (I will go into detail later about the different tests which can be used and when they are used).
So…one-way ANOVAs here we come. Before I start talking about all the nitty gritty stuff with ANOVAs and attempt to explain it, I think it is best to start with the assumptions of ANOVA. There are 3 assumptions of ANOVA. The first one is that the scores in each group should be approximately normally distributed. The second is homogeneity of variance, meaning that the variances of the groups should be roughly equal. Levene’s test checks this, and its null hypothesis is that the variances are equal, so you do not want a significant result on your Levene’s test. The final assumption is that the groups are independent of one another. This is really dealt with in your methodology and how you collect your data; to ensure that groups are independent of one another it would probably be best to randomly assign participants to groups, which also reduces bias.
Now for the nitty gritty stuff! Like I said before, one-way ANOVAs are used to see if there is a difference between the groups (null hypothesis saying there isn’t a difference between the group means, alternative hypothesis saying there is a difference). These groups are the levels of a single factor. For example, if you think about the data we have been using in class, the single factor is alcohol consumption and the thing being measured is the number of driving errors. The factor has separate groups which (in our data) are the control group (who consumed no alcohol), a group who consumed a low amount of alcohol and a group who consumed a high amount of alcohol. So we would run a one-way ANOVA on the data produced and see if there is a significant difference between the groups or not. ANOVA produces an F value. An F value is calculated by dividing the mean square of the treatment (the difference between groups) by the mean square of the error (the difference within groups). Each mean square is the sum of squares divided by its degrees of freedom. For the same degrees of freedom, the higher the F value, the smaller the p value, so the more likely it is to be significant. Therefore a significant F would show that there is a difference between the groups you have tested, HOWEVER, (like I mentioned earlier) a significant F does not tell us where the difference lies.
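To make the F calculation a bit more concrete, here is a rough sketch in Python of the steps described above. The driving-error scores and the function name `one_way_f` are made up purely for illustration (they are not the class data), and in practice you would let SPSS or a stats library do this for you.

```python
from statistics import mean

# Hypothetical driving-error scores, invented purely for illustration.
groups = {
    "control": [2, 3, 4, 3],
    "low": [4, 5, 6, 5],
    "high": [8, 9, 10, 9],
}

def one_way_f(groups):
    """Return (F, df_between, df_within) for a dict of group name -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)

    # Between-groups (treatment) sum of squares: how far each group mean
    # sits from the grand mean, weighted by the group size.
    ss_between = sum(
        len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values()
    )
    # Within-groups (error) sum of squares: how far each score sits
    # from its own group mean.
    ss_within = sum(
        (x - mean(s)) ** 2 for s in groups.values() for x in s
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    # Mean square = sum of squares / degrees of freedom.
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

f_value, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

With these invented scores the group means are far apart and the scores within each group are close together, so the F value comes out large (it prints F(2, 9) = 56.00). Notice that the number alone still says nothing about which groups differ from each other.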
As I mentioned before, there are many follow-up tests to discover where the difference lies. These tests are separated into two main categories: planned contrasts (a priori) and post hoc tests (a posteriori). There are some important differences between these tests. Planned contrasts or a priori tests are contrasts which you decide on before you look at the data. Therefore they are based on your hypothesis and your prediction. This means that these contrasts are very specific. For example, if you think about the data we have been using over the last couple of weeks, it would make sense to compare the control group who drank no alcohol with people who drank some alcohol, whether it was low or high, as you would expect more driving errors from people who had consumed alcohol than from those who hadn’t had any. This means you would be comparing the control group against the low alcohol group and the high alcohol group together, so the low alcohol and high alcohol groups would be considered as one group. The way you do this is to code the groups or ‘weight’ the groups with coefficients. The coefficients always have to add up to zero. Therefore, for this comparison you would code the control group as -2 and code both the low alcohol group and the high alcohol group as 1. This groups the low and high alcohol groups together and treats them as one group, as they have the same weighting. Checking that you have weighted them correctly is simple: in this case it would be -2 + 1 + 1, which of course equals 0. You can also leave a group out of a comparison by coding it with a 0. For example, if you wanted to compare the driving errors of participants who drank a low amount of alcohol with those who had a high amount of alcohol, you wouldn’t need to include the control group (who didn’t have any alcohol). You would code the control group as 0 so it has no weight in the comparison.
You would code the low alcohol group as -1 and the high alcohol group as 1. It doesn’t particularly matter what the numbers are as long as they add up to 0 in the end. The low alcohol group and the high alcohol group have different numbers to each other this time as they are being compared to each other, not being classed as one group. So just a quick overview of that: any groups that you want to be considered as one group get the same weighting, whereas groups that are being compared against each other get weightings with opposite signs, and all the weightings in a contrast add up to 0.
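If it helps to see the weighting written out, here is a small sketch in Python using the two sets of coefficients from above. The group means and the helper name `contrast_value` are hypothetical and only there to show the arithmetic; the real analysis would also work out a significance test for each contrast, which this sketch does not attempt.

```python
# Group order for every list below: control, low alcohol, high alcohol.
contrasts = {
    "control vs. any alcohol": [-2, 1, 1],
    "low vs. high alcohol": [0, -1, 1],
}

# Hypothetical mean driving errors per group, invented for illustration.
group_means = [3.0, 5.0, 9.0]

def contrast_value(weights, means):
    """Weighted sum of the group means for one contrast."""
    # The weights in a contrast must add up to zero.
    if sum(weights) != 0:
        raise ValueError("contrast weights must sum to zero")
    return sum(w * m for w, m in zip(weights, means))

for name, weights in contrasts.items():
    print(name, "->", contrast_value(weights, group_means))
```

A contrast value of 0 would mean the two sides of the comparison have the same (weighted) mean, so the further the value is from 0, the bigger the difference the contrast is picking up. A group coded 0, like the control group in the second contrast, drops out of the sum entirely.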
Now that I have told you about the planned contrast/a priori tests I will tell you about post hoc/a posteriori tests. Post hoc tests are used when you haven’t planned any contrasts to test; therefore they are generally used when you have no idea what to expect (for example if there was no previous research in the area). Post hoc tests are driven by the data, whereas (like I said before) planned contrasts are driven by your hypothesis. Post hoc testing involves carrying out multiple tests to compare your groups. This increases the risk of making a type 1 error, which means you could say something is significant when it isn’t. This is because the more tests you run, the greater the chance that at least one of them will show up as significant even if there is no real difference. If you reduce the risk of a type 1 error then you increase the risk of a type 2 error, so a balance needs to be struck. This is one of the reasons why some people avoid post hoc tests and prefer to conduct planned contrasts based on their hypothesis. There are a variety of post hoc tests and they vary in how strict and conservative they are. One of the first is Fisher’s LSD (least significant difference). This test is quite liberal and will show significance more easily than some of the other tests that I will talk about. It only keeps the overall alpha level under control when there are three groups being compared, therefore this test should be avoided when you have a larger number of groups. Another of the more popular post hoc tests is Scheffé’s test. This is a conservative test, therefore reducing the risk of a type 1 error. It is said that this is a good test to use if complex comparisons are being made. Bonferroni is another test which is very strict. The Bonferroni test uses t-tests to compare all the possible pairs of groups, with the alpha level divided by the number of comparisons, so if you have 3 groups (1, 2, 3) you would have 3 possible comparisons (1 and 2, 2 and 3, 1 and 3).
Hopefully you can see that when there are a large number of groups it can get complicated and a lot more comparisons will have to be made. With the stricter tests it is harder to find significance, which means there is a risk of making a type 2 error (saying that something isn’t significant when it actually is). Because of this, stricter tests have less power, especially compared to planned contrasts. There are more post hoc tests you can choose to run besides the ones I have talked about.
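To show how quickly the comparisons pile up, here is a rough Python sketch. It counts the pairwise comparisons for a given number of groups, estimates the chance of at least one type 1 error if every test is run at the same alpha, and works out the Bonferroni-adjusted alpha (alpha divided by the number of comparisons). The function names and the alpha of .05 are my own assumptions, and the error-rate formula treats the tests as independent, which pairwise tests on the same data are not exactly, so take it as a rough guide only.

```python
from itertools import combinations

def pairwise_comparisons(group_names):
    """Every possible pair of groups, e.g. (1, 2), (1, 3), (2, 3)."""
    return list(combinations(group_names, 2))

def familywise_error(alpha, n_tests):
    """Chance of at least one type 1 error across n_tests tests,
    assuming (as a simplification) that the tests are independent."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Stricter per-test alpha: the overall alpha shared across the tests."""
    return alpha / n_tests

alpha = 0.05  # conventional significance level, assumed here

for k in (3, 4, 6):
    pairs = pairwise_comparisons(range(1, k + 1))
    m = len(pairs)
    print(
        f"{k} groups: {m} comparisons, "
        f"familywise error ~ {familywise_error(alpha, m):.3f}, "
        f"Bonferroni alpha = {bonferroni_alpha(alpha, m):.4f}"
    )
```

With 3 groups there are 3 comparisons, but with 6 groups there are already 15, and the adjusted alpha for each test gets very small. That is exactly why the strict tests find it harder to reach significance.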
Hopefully now you know about one-way ANOVAs and that there are different ways of analysing your data to find where the difference lies between the groups once you have established whether there is a difference or not. I personally think it makes a lot more sense to use planned contrasts as they are based on your predictions and hypothesis and you can test the specific things you want to test. They also have more statistical power than post hoc tests, so a real difference is less likely to be missed. I think post hoc tests should only be used as a last resort, when you genuinely do not have a clue in which direction your data will go.
I found loads and loads of websites that were helpful when I was reading up on this topic. Here are a few of the ones I thought were really helpful and that I used to help me write this blog.
That looks like a lot but they are all written in really easy language to understand. And just a quick comment for whoever reads this far down the blog after the web links: the Statistics Hell website is really good. It’s Dr. Andy Field’s website and I know a lot of you really like his book; well, this website is really useful as well. It looks odd when you go on it but it is all broken down and has videos to help explain the more difficult topics.