Suppose that a researcher wishes to test if a certain kind of growth
hormone will produce faster growth in mice. She injects 10 mice with the
hormone and uses another 10 as a control. Three weeks later, she weighs the
mice and discovers that the mean weight of mice that have received the
injections is 12.05 g and the mean weight of control mice is 9.3 g. On
average, the mice receiving the hormone are heavier. Is 12.05 g significantly
different from 9.3 g? Or is it possible that the hormone has no effect and
that the weight difference between the two groups is due to chance?
This is like flipping a coin 10 times. You expect 5 heads and 5 tails, but you
might get 6 heads, 7 heads, or perhaps 8 heads. Similarly, if the hormone does
not work, you expect the means of the two groups to be similar, but they may
not be exactly the same.
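The coin analogy can be made concrete with a short calculation. The sketch below assumes a fair coin and uses a helper name, prob_heads, that is only illustrative; it shows how often each of the outcomes mentioned above turns up by chance alone.

```python
from math import comb

def prob_heads(k, n=10):
    """Probability of exactly k heads in n flips of a fair coin."""
    # comb(n, k) counts the flip sequences with k heads; each sequence
    # has probability (1/2) ** n.
    return comb(n, k) / 2 ** n

for k in (5, 6, 7, 8):
    print(f"P({k} heads in 10 flips) = {prob_heads(k):.3f}")
# P(5 heads in 10 flips) = 0.246
# P(6 heads in 10 flips) = 0.205
# P(7 heads in 10 flips) = 0.117
# P(8 heads in 10 flips) = 0.044
```

Even the expected result, exactly 5 heads, happens only about a quarter of the time, so some departure from the expected value is normal.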
| Group 1 - Hormone - Weight (grams) | Group 2 - No Hormone - Weight (grams) |
| 12.5 | 12 |
| 13 | 8.5 |
| 12 | 10 |
| 12 | 8 |
| 13 | 8 |
| 14 | 13.5 |
| 13 | 9 |
| 10.5 | 8.5 |
| 9.5 | 6.5 |
| 11 | 9 |
| Mean = 12.05 | Mean = 9.3 |
What is the chance that the two means would be as different as 12.05 g and
9.3 g if the hormone really did not work? Statistical tests assess whether
differences in the data are real or could plausibly be due to chance. In the
example above, we test whether the mean of group 1 is significantly different
from the mean of group 2. The alternative is that the difference is due to
chance or random fluctuations and the hormone did not cause additional
weight gain. The test gives a probability, called p: the chance of seeing a
difference at least this large if only chance were at work. If p is less than
1 out of 20 (<0.05), then we conclude that the difference is real. If p is
greater than 0.05, we conclude that the difference is not significant; it
could be due to chance.
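One way to see what "due to chance" means is to ask how often a random relabeling of the 20 mice would put at least as much total weight in the "hormone" group as was actually observed. The sketch below is an exact permutation test, which is not the method this text goes on to use, but it answers the same question directly. The function name is illustrative, and because both groups have 10 mice, comparing group totals is equivalent to comparing the difference in means.

```python
from itertools import combinations

# Weights in grams, taken from the table above.
hormone = [12.5, 13, 12, 12, 13, 14, 13, 10.5, 9.5, 11]
control = [12, 8.5, 10, 8, 8, 13.5, 9, 8.5, 6.5, 9]

def permutation_p_value(group1, group2):
    """Fraction of all relabelings whose 'group 1' total is at least
    as large as the observed group-1 total (a one-sided question)."""
    pooled = group1 + group2
    observed_total = sum(group1)
    n = len(group1)
    extreme = 0
    count = 0
    # Enumerate every way of choosing which n mice are called "group 1".
    for idx in combinations(range(len(pooled)), n):
        count += 1
        if sum(pooled[i] for i in idx) >= observed_total:
            extreme += 1
    return extreme / count

p = permutation_p_value(hormone, control)
print(f"Share of relabelings at least as extreme: {p:.4f}")
```

With 20 mice there are 184,756 possible relabelings, so the full enumeration runs quickly. A small share means that chance alone rarely produces a gap this large.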
There are several tests available for comparing means. A commonly used test
for data that are normally distributed is the t-test.
Sara's hypothesis is that newborn mice injected with the hormone will be
heavier after 3 weeks of growth than mice without the hormone.
The calculations for the test can be performed by hand but computer
software can do them very quickly. To perform the test, the weight data for
the two groups of mice above are entered into a t-test program.
The software reveals that p = 0.0012. If the hormone had no effect, the
probability of a difference between the two means (12.05 and 9.3) at least
this large arising by chance (random effects) would be 0.0012 (or 12 out of
10,000). Because p < 0.05, we conclude that the two means are really
different and that the difference is not due to chance. The researcher
accepts her hypothesis that the hormone produces faster growth. If p had been
greater than 0.05, we would reject her hypothesis and conclude that the two
means are not significantly different; the data would not show that the
hormone caused one group to be heavier.
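For readers who want to see what such a program does, here is a minimal sketch of the calculation. It assumes the pooled-variance (Student's) two-sample t-test with a one-tailed p-value; the text does not say which variant the software used, so treat that choice, and the helper names, as assumptions. The tail probability is obtained by numerically integrating the t-distribution's density with Simpson's rule rather than by calling a statistics library.

```python
import math
from statistics import mean, variance

# Weights in grams, taken from the table above.
hormone = [12.5, 13, 12, 12, 13, 14, 13, 10.5, 9.5, 11]
control = [12, 8.5, 10, 8, 8, 13.5, 9, 8.5, 6.5, 9]

def pooled_t(a, b):
    """t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # variance() is the sample variance (divides by n - 1).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=10000):
    """P(T >= t) for t >= 0: 0.5 minus the integral of the density on [0, t].
    steps must be even for Simpson's rule."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t, df = pooled_t(hormone, control)
p_one = upper_tail(t, df)
print(f"t = {t:.2f}, df = {df}, one-tailed p = {p_one:.4f}")
```

Under these assumptions the one-tailed result should land close to the 0.0012 reported above.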
The word "significant" has a slightly different meaning in
statistics than it does in general usage. In a statistical test of two means,
if the difference is unlikely to be due to chance, we conclude that the two
means are significantly different. In the example above, the mean weight of
group 1 is significantly greater than the mean weight of group 2.
There are two ways to perform this test. The following hypothesis would
lead us to perform a one-tailed test:
The mean weight of mice injected with the hormone
will be greater than the mean weight of the control mice.
This is a one-tailed test because the hypothesis
predicts a difference in only one direction: the weight of the hormone
mice will be greater than the weight of the control mice.
The following hypothesis would lead us to perform a two-tailed test:
The mean weight of mice injected with the hormone
will be different from the mean weight of the control mice.
This is two-tailed because the hypothesis allows
two possible outcomes. The hypothesis is true if the weight of hormone mice
is greater than the weight of control mice. The hypothesis is also true if
the weight of hormone mice is less than the weight of control mice.
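The practical consequence of this choice is the size of p. Because the t distribution is symmetric, a two-tailed p counts extreme results in both directions and is twice the smaller one-tailed p. The sketch below assumes the 0.0012 reported earlier is a one-tailed value, which the text does not state; the function name is illustrative.

```python
def two_tailed_from_one_tailed(p_one):
    """Two-tailed p for a symmetric test statistic, given a one-tailed p."""
    # Double the smaller tail so the result is the same whichever
    # direction the one-tailed test looked in; cap at 1.
    return min(1.0, 2 * min(p_one, 1 - p_one))

p_one = 0.0012  # value reported in the text, assumed here to be one-tailed
p_two = two_tailed_from_one_tailed(p_one)

for label, p in (("one-tailed", p_one), ("two-tailed", p_two)):
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: p = {p:.4f} ({verdict} at 0.05)")
```

For these data the conclusion is the same either way, but a result with a one-tailed p between 0.025 and 0.05 would be significant only under the one-tailed hypothesis.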