5.4.2 The Central Limit Theorem

When the $X_{i}$'s are normally distributed, so is $\bar{X}$ for every sample size $n$. The derivations in Example 5.21 and the simulation experiment of Example 5.24 suggest that even when the population distribution is highly nonnormal, averaging produces a distribution more bell-shaped than the one being sampled. A reasonable conjecture is that

if $n$ is large, a suitable normal curve will approximate the actual distribution of $\bar{X}$.

The formal statement of this result is the most important theorem of probability.

The Central Limit Theorem (CLT)

Let $X_{1}, X_{2}, \dots, X_{n}$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^{2}$. Then if $n$ is sufficiently large,

$\bar{X}$ has approximately a normal distribution with

$\mu_{\bar{X}} = \mu$,

$\sigma_{\bar{X}}^{2} = \sigma^{2} / n$,

and the sample total $T_{o} = X_{1} + \cdots + X_{n}$ also has approximately a normal distribution with

$\mu_{T_{o}} = n \mu$,

$\sigma_{T_{o}}^{2} = n \sigma^{2}$.

The larger the value of $n$, the better the approximation.
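The mean and variance formulas in the theorem can be checked by simulation. The following is a minimal sketch, not part of the original text: it assumes an exponential population with rate 1 (so $\mu = 1$ and $\sigma^{2} = 1$), and the sample size, number of replications, seed, and function name are all illustrative choices.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an exponential(rate=1) population
    (an assumed, deliberately skewed population) and return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

mu, sigma2 = 1.0, 1.0   # mean and variance of the exponential(1) population
n = 40                  # illustrative sample size
means = simulate_sample_means(n, reps=20_000)

# Compare the simulated moments of the sample mean with mu and sigma^2 / n.
print("mean of sample means:    ", statistics.fmean(means), "theory:", mu)
print("variance of sample means:", statistics.pvariance(means), "theory:", sigma2 / n)
```

The two moment formulas hold for every $n$; what the CLT adds is that the shape of the distribution of $\bar{X}$ is approximately normal when $n$ is large.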

Figure 5.16 illustrates the Central Limit Theorem. According to the CLT, when $n$ is large and we wish to calculate a probability such as $P(a \leq \bar{X} \leq b)$, we need only “pretend” that $\bar{X}$ is normal, standardize it, and use the normal table. The resulting answer will be approximately correct. The exact answer could be obtained only by first finding the distribution of $\bar{X}$, so the CLT provides a truly impressive shortcut. The proof of the theorem involves much advanced mathematics.
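The "pretend, standardize, look up" recipe can be sketched in a few lines. This is an illustration under stated assumptions: the function name `clt_prob_mean` and the numbers in the usage line ($\mu = 50$, $\sigma = 10$, $n = 100$) are hypothetical and do not come from the text, and the standard normal cdf from Python's `statistics.NormalDist` stands in for the normal table.

```python
from math import sqrt
from statistics import NormalDist

def clt_prob_mean(a, b, mu, sigma, n):
    """Approximate P(a <= Xbar <= b) by treating Xbar as normal with
    mean mu and standard deviation sigma / sqrt(n)."""
    se = sigma / sqrt(n)          # standard deviation of the sample mean
    z_a = (a - mu) / se           # standardize the lower limit
    z_b = (b - mu) / se           # standardize the upper limit
    std_normal = NormalDist()     # plays the role of the normal table
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Hypothetical numbers: both limits standardize to z = -2 and z = 2.
p = clt_prob_mean(48, 52, mu=50, sigma=10, n=100)
print(round(p, 4))  # 0.9545
```

The result is only as good as the normal approximation itself, so it should be read as approximate unless the population is known to be normal.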

Figure 5.16 The Central Limit Theorem illustrated

EXAMPLE 5.27

EXAMPLE 5.28

The CLT provides insight into why many random variables have probability distributions that are approximately normal. For example, the measurement error in a scientific experiment can be thought of as the sum of a number of underlying perturbations and errors of small magnitude.

A practical difficulty in applying the CLT is in knowing when $n$ is sufficiently large. The problem is that the accuracy of the approximation for a particular $n$ depends on the shape of the original underlying distribution being sampled. If the underlying distribution is close to a normal density curve, then the approximation will be good even for a small $n$, whereas if it is far from being normal, then a large $n$ will be required.
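The dependence on population shape can be seen in a small experiment. The sketch below is an assumption-laden illustration, not a procedure from the text: it fixes a small sample size ($n = 5$), picks a symmetric population (uniform on $[0, 1]$) and a strongly skewed one (exponential with rate 1), and compares a simulated lower-tail probability for $\bar{X}$ with the value the normal approximation would give. The cutoff $z = -1.5$, the replication count, and the helper name are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def lower_tail_estimate(draw, mu, sigma, n, z=-1.5, reps=20_000, seed=1):
    """Estimate P(Xbar <= mu + z * sigma / sqrt(n)) by simulation.
    `draw` takes a random.Random instance and returns one observation."""
    rng = random.Random(seed)
    cutoff = mu + z * sigma / sqrt(n)
    hits = sum(
        fmean(draw(rng) for _ in range(n)) <= cutoff
        for _ in range(reps)
    )
    return hits / reps

n = 5                                   # deliberately small sample size
normal_value = NormalDist().cdf(-1.5)   # what the CLT approximation predicts

# Uniform(0, 1): mean 1/2, standard deviation sqrt(1/12); symmetric.
uniform_est = lower_tail_estimate(lambda r: r.random(), 0.5, sqrt(1 / 12), n)
# Exponential(rate=1): mean 1, standard deviation 1; strongly right-skewed.
expo_est = lower_tail_estimate(lambda r: r.expovariate(1.0), 1.0, 1.0, n)

print("normal approximation:  ", normal_value)
print("uniform population:    ", uniform_est)
print("exponential population:", expo_est)
```

Under these assumptions the uniform estimate should sit much closer to the normal value than the exponential one does, which is the behavior the paragraph describes.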

Rule of Thumb

The Central Limit Theorem can generally be used if $n > 30$.

There are population distributions for which even an $n$ of 40 or 50 does not suffice, but such distributions are rarely encountered in practice. On the other hand, the rule of thumb is often conservative; for many population distributions, an $n$ much less than 30 would suffice. For example, in the case of a uniform population distribution, the CLT gives a good approximation for $n \geq 12$.
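The uniform case is easy to probe numerically. As a rough sketch (the evaluation points, replication count, seed, and function name are assumptions made for illustration), the code below simulates the total $T_{o}$ of $n = 12$ observations from a uniform distribution on $[0, 1]$, for which $n\mu = 6$ and $n\sigma^{2} = 12 \cdot \tfrac{1}{12} = 1$, and compares its empirical cdf with the normal cdf that the CLT suggests.

```python
import random
from math import sqrt
from statistics import NormalDist

def uniform_total_cdf(points, n=12, reps=50_000, seed=2):
    """Empirical cdf of the total of n Uniform(0, 1) draws, evaluated at `points`."""
    rng = random.Random(seed)
    totals = [sum(rng.random() for _ in range(n)) for _ in range(reps)]
    return {x: sum(t <= x for t in totals) / reps for x in points}

n = 12
points = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
empirical = uniform_total_cdf(points, n=n)

# CLT approximation for the total: mean n * 1/2, variance n * 1/12.
approx = NormalDist(mu=n * 0.5, sigma=sqrt(n / 12))

for x in points:
    print(f"x = {x:4.1f}  simulated = {empirical[x]:.4f}  normal = {approx.cdf(x):.4f}")

max_gap = max(abs(empirical[x] - approx.cdf(x)) for x in points)
print("largest gap at these points:", max_gap)
```

The gaps reported here include simulation noise as well as approximation error, so they give only a rough sense of how good the approximation is.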

EXAMPLE 5.29

