Consider the distribution shown in Figure 5.17 for the amount purchased (rounded to the nearest dollar) by a randomly selected customer at a particular gas station. A similar distribution for purchases in Britain (in £) appeared in the article “Data Mining for Fun and Profit,” Statistical Science, 2000: 111-131; there were big spikes at several values, including 30. The distribution is obviously quite non-normal.
We asked Minitab to select 1000 different samples, each consisting of $n$ observations, and calculate the value of the sample mean $\bar{X}$ for each one. Figure 5.18 is a histogram of the resulting 1000 values; this is the approximate sampling distribution of $\bar{X}$ under the specified circumstances. This distribution is clearly approximately normal even though the sample size is actually much smaller than 30, our rule-of-thumb cutoff for invoking the Central Limit Theorem. As further evidence for normality, Figure 5.19 shows a normal probability plot of the 1000 $\bar{x}$ values; the linear pattern is very prominent. It is typically not nonnormality in the central part of the population distribution that causes the CLT approximation to fail, but instead very substantial skewness.
Figure 5.17 Probability distribution of X = amount of gasoline purchased ($)
Figure 5.18 Approximate sampling distribution of the sample mean amount purchased when the sample size is $n$ and the population distribution is as shown in Figure 5.17
Figure 5.19 Normal probability plot from Minitab of the $\bar{x}$ values based on samples of size $n$
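The simulation experiment just described can be imitated without Minitab. The following is only a minimal sketch in Python: the spiky population (the dictionary `spikes` and the base weights), the sample size `n = 15`, and the random seed are all illustrative assumptions, not the values behind Figures 5.17 through 5.19. It draws 1000 samples, computes each sample mean, and compares the center and spread of those means with what the Central Limit Theorem predicts.

```python
import random
import statistics

# Hypothetical spiky, non-normal population: whole-dollar amounts 5..40 with
# a small base weight everywhere and extra weight at a few chosen values.
# These weights are assumptions for illustration only.
amounts = list(range(5, 41))
spikes = {10: 8.0, 15: 6.0, 20: 10.0, 25: 5.0, 30: 7.0}
weights = [1.0 + spikes.get(a, 0.0) for a in amounts]


def population_mean_sd():
    """Exact mean and standard deviation of the assumed population."""
    total = sum(weights)
    mu = sum(a * w for a, w in zip(amounts, weights)) / total
    var = sum(w * (a - mu) ** 2 for a, w in zip(amounts, weights)) / total
    return mu, var ** 0.5


def simulate_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each computed from a sample of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(amounts, weights=weights, k=n))
        for _ in range(reps)
    ]


n, reps = 15, 1000  # n = 15 is an assumed sample size well below 30
means = simulate_sample_means(n, reps)
mu, sigma = population_mean_sd()
se = sigma / n ** 0.5  # standard deviation of the sample mean

# If the sample means are approximately normal, roughly 68% of them
# should lie within one standard error of the population mean.
within_one_se = sum(abs(m - mu) <= se for m in means) / reps

print(f"population mean = {mu:.2f}, mean of sample means = {statistics.fmean(means):.2f}")
print(f"sigma/sqrt(n)   = {se:.2f}, sd of sample means   = {statistics.stdev(means):.2f}")
print(f"fraction of sample means within one standard error: {within_one_se:.3f}")
```

Replacing the assumed weights with a strongly skewed population and rerunning with the same `n` is a quick way to see the point of the last sentence above: it is heavy skewness, rather than spikes in the middle of the distribution, that tends to spoil the normal approximation at small sample sizes.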