
Type 2 Error

Whatever we decide about the null hypothesis, we may be making either a correct decision or an error. The null may be true or false, but we can never know for certain, because we only have sample data to work with. So there are two kinds of error that we could make in hypothesis testing: a Type 1 Error or a Type 2 Error.

Type 2 Error: if the null hypothesis is false and we choose not to reject it on the evidence provided by our data, we are making a Type 2 Error. That is, you miss the effect when it really is there. In other words, it is a false negative: the alarm that should have gone off stays silent. (A false alarm is the opposite mistake, a Type 1 Error.) The alarm will sometimes fail purely by chance; the effect is present in the population, but the sample you drew doesn’t show it.
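The idea that a real effect can be missed purely by chance can be illustrated with a small simulation. This is only a sketch under stated assumptions: a two-sided z-test with known standard deviation, and a true effect of 0.5, sigma of 1, sample size of 30 and alpha of 0.05 that were all picked for illustration (none of these numbers come from the text above). The function name `simulate_miss_rate` is likewise hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def simulate_miss_rate(true_effect=0.5, sigma=1.0, n=30, alpha=0.05,
                       trials=2000, seed=1):
    """Estimate how often a two-sided z-test misses a real effect.

    The null hypothesis is "mean = 0". Every simulated sample is drawn
    from a population whose mean really is `true_effect`, so each
    failure to reject is a Type 2 Error.
    """
    rng = random.Random(seed)
    # Critical value of the standard normal for a two-sided test.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(true_effect, sigma) for _ in range(n)]
        z = mean(sample) * sqrt(n) / sigma  # z statistic, sigma known
        if abs(z) < z_crit:
            misses += 1  # null not rejected although it is false
    return misses / trials


miss_rate = simulate_miss_rate()
print(f"Estimated Type 2 Error rate: {miss_rate:.3f}")
```

Every sample here comes from a population in which the effect exists, so the printed fraction is an estimate of how often this particular design would fail to detect it.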

The smaller the sample, the more likely you are to commit a Type 2 Error, because the confidence interval is wider and is therefore more likely to overlap zero. The probability of making a Type 2 Error is difficult to determine, because it depends on the true size of the effect, which is unknown. For a given sample size, though, as the probability of a Type 1 Error decreases, that of a Type 2 Error increases. (A Type 1 Error happens if the null hypothesis is true and we choose to reject it on the evidence provided by our data.) So we are forced to compromise: which type of error do we most want to avoid?
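Both relationships in the paragraph above (sample size and the trade-off with the Type 1 Error rate) can be checked numerically. The sketch below assumes a two-sided z-test with a known standard deviation and a hypothetical true effect of 0.5 standard deviations; the function name `type2_error_rate` and all numeric inputs are illustrative assumptions, not values from the text.

```python
from math import sqrt
from statistics import NormalDist


def type2_error_rate(effect, sigma, n, alpha=0.05):
    """Probability of NOT rejecting H0 (mean = 0) when the true mean
    is `effect`, for a two-sided z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true effect the z statistic is centred here, not at 0.
    shift = effect * sqrt(n) / sigma
    # Chance that the statistic still lands inside (-z_crit, z_crit).
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Larger samples: fewer Type 2 Errors.
for n in (10, 30, 100):
    print(f"n={n:<4} P(Type 2 Error)={type2_error_rate(0.5, 1.0, n):.3f}")

# Stricter Type 1 Error rate at a fixed n: more Type 2 Errors.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:<5} P(Type 2 Error)={type2_error_rate(0.5, 1.0, 30, alpha):.3f}")
```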

For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug; that is, H0: there is no difference between the two drugs on average. A Type 2 Error would occur if it were concluded that the two drugs produced the same effect (no difference between them on average) when in fact they produced different ones.

The experimental results don't look different from what we would expect under the null hypothesis, but they are, perhaps because the effect isn't very big. For example: the rat pups weigh 17.9 grams and we conclude there is no effect. But "really" (if we only knew!) alcohol does reduce weight; the effect is just not big enough for us to see it in our data.

A Type 2 Error is frequently due to sample sizes being too small. The probability of a Type 2 Error is symbolized by β (beta) and written as: P(Type 2 Error) = β (but it is generally unknown).

The power of a statistical hypothesis test measures the test’s ability to reject the null hypothesis when it is actually false, that is, to make a correct decision. In other words, the power of a hypothesis test is the probability of not committing a Type 2 Error. It is calculated by subtracting the probability of a Type 2 Error from 1, usually expressed as: Power = 1 - P(Type 2 Error) = 1 - β, where β is the probability of a Type 2 Error. The maximum power a test can have is 1 and the minimum is 0. Ideally we want a test to have high power, close to 1.
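As a rough illustration of power as 1 minus the Type 2 Error probability, the sketch below computes power for several hypothetical effect sizes. It assumes a two-sided z-test with known standard deviation, a sample of 30 and a Type 1 Error rate of 0.05; these settings and the function name `power` are assumptions made for the example only.

```python
from math import sqrt
from statistics import NormalDist


def power(effect, sigma, n, alpha=0.05):
    """Power = 1 - P(Type 2 Error) for a two-sided z-test of H0: mean = 0
    when the true mean is `effect` and sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma
    # Probability of failing to reject H0 under the true effect.
    type2 = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - type2


# Bigger true effects are easier to detect with the same sample size.
for effect in (0.2, 0.5, 0.8):
    print(f"effect={effect}  power={power(effect, 1.0, 30):.3f}")
```

In this particular setup a small effect leaves the test with little power, which matches the point that a real but small effect is easy to miss.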

The Type 2 Error needs to be considered explicitly at the time you design your study. That's when you're supposed to work out the sample size needed to make sure your study has the power to detect anything useful. For this purpose the Type 2 Error rate is usually set to 20%, or 10% for really classy studies. The power of the study is then 80% (or 90% for a Type 2 Error rate of 10%). In other words, the study has enough power to detect the smallest worthwhile effects 80% (or 90%) of the time.
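A sample size calculation of this kind might look like the sketch below. It uses the common normal-approximation formula n = ((z for alpha + z for power) * sigma / effect)^2 for a two-sided one-sample z-test; the smallest worthwhile effect of 0.5, the sigma of 1 and the function name `required_sample_size` are assumptions chosen for illustration, and a real study design (for example a two-group comparison or a t-test) would need its own formula.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(effect, sigma, power=0.80, alpha=0.05):
    """Approximate n for a two-sided one-sample z-test to detect `effect`
    with the requested power (normal approximation, sigma known)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls Type 1 Error
    z_power = std_normal.inv_cdf(power)          # controls Type 2 Error
    return ceil(((z_alpha + z_power) * sigma / effect) ** 2)


n_80 = required_sample_size(effect=0.5, sigma=1.0, power=0.80)
n_90 = required_sample_size(effect=0.5, sigma=1.0, power=0.90)
print(f"80% power (Type 2 Error rate 20%): n = {n_80}")
print(f"90% power (Type 2 Error rate 10%): n = {n_90}")
```

Asking for 90% power rather than 80% raises the required sample size, which is the price of the lower Type 2 Error rate.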

The concept of power is most relevant when a study is being planned. After a study has been completed, we wish to make statements not about hypothetical alternative hypotheses but about the data, and the way to do this is with estimates and confidence intervals.

References:

http://www.bmj.com/collections/statsbk/3.shtml
http://www.unn.ac.uk
http://www.cas.lancs.ac.uk
http://www.uq.edu.au/~hmrburge/stats/errors/html
http://forrest.psych.unc.edu/research/vista-frames/help/lecturenotes/lecture07/definition.html


Last changed: February 20, 2007