The population distribution for our first simulation study is normal with mean $\mu = 8.25$ and standard deviation $\sigma$, as pictured in Figure 5.11. [The article “Platelet Size in Myocardial Infarction” (British Med. J., 1983: 449-451) suggests this distribution for platelet volume in individuals with no history of serious heart problems.]
Figure 5.11 Normal distribution, with mean $\mu = 8.25$ and standard deviation $\sigma$
We actually performed four different experiments, with 500 replications for each one. In the first experiment, 500 samples of $n$ observations each were generated using Minitab; the other three experiments used three other values of the sample size $n$. The sample mean $\bar{x}$ was calculated for each sample, and the resulting histograms of $\bar{x}$ values appear in Figure 5.12.
Figure 5.12 Sample histograms for $\bar{x}$ based on 500 samples, each consisting of $n$ observations; panels (a) through (d) show the four sample sizes
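The experiment can be imitated in a few lines of Python instead of Minitab. This is only a sketch under stated assumptions: the population mean 8.25 and the 500 replications come from the text, but the value `SIGMA = 0.75` and the sample sizes in `SAMPLE_SIZES` are illustrative choices, not values recoverable from this passage, and the helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

MU = 8.25                        # population mean, as stated in the text
SIGMA = 0.75                     # ASSUMED for illustration; not given in this passage
SAMPLE_SIZES = (5, 10, 20, 30)   # ASSUMED for illustration; not given in this passage
REPLICATIONS = 500               # number of samples per experiment, as stated in the text


def simulate_sample_means(n, replications=REPLICATIONS, mu=MU, sigma=SIGMA, rng=None):
    """Draw `replications` normal samples of size n and return their sample means."""
    if rng is None:
        rng = random.Random()
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(replications)
    ]


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so a run can be repeated
    for n in SAMPLE_SIZES:
        means = simulate_sample_means(n, rng=rng)
        # Center and spread of the 500 simulated sample means for this n
        print(f"n={n:2d}  center={statistics.fmean(means):.3f}  "
              f"spread={statistics.stdev(means):.3f}")
```

Each call returns the 500 simulated sample means for one experiment; plotting them as a histogram, or simply comparing the printed center and spread across sample sizes, gives a rough analogue of the four panels.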
The first thing to notice about the histograms is their shape. To a reasonable approximation, each of the four looks like a normal curve. The resemblance would be even more striking if each histogram had been based on many more than 500 $\bar{x}$ values. Second, each histogram is centered approximately at 8.25, the mean of the population being sampled. Had the histograms been based on an unending sequence of $\bar{x}$ values, their centers would have been exactly the population mean, 8.25.
The final aspect of the histograms to note is their spread relative to one another. The larger the value of $n$, the more concentrated is the sampling distribution about the mean value. This is why the histograms for the two larger sample sizes are based on narrower class intervals than those for the two smaller sample sizes. For the larger sample sizes, most of the $\bar{x}$ values are quite close to 8.25. This is the effect of averaging. When $n$ is small, a single unusual $x$ value can result in an $\bar{x}$ value far from the center. With a larger sample size, any unusual $x$ values, when averaged in with the other sample values, still tend to yield an $\bar{x}$ value close to $\mu$. Combining these insights yields a result that should appeal to your intuition: $\bar{X}$ based on a large $n$ tends to be closer to $\mu$ than does $\bar{X}$ based on a small $n$.
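The shrinking spread can be made quantitative with a standard result that this passage does not itself state: for a random sample of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$,

\[
E(\bar{X}) = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} .
\]

The first equation matches the histograms all being centered near 8.25. The second says that the spread of the $\bar{x}$ values falls in proportion to $1/\sqrt{n}$, so quadrupling the sample size halves the standard deviation of $\bar{X}$.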