4.3.7 Approximating the Binomial Distribution

Recall that the mean value and standard deviation of a binomial random variable $X$ are $μ_{X} = np$ and $σ_{X} = \sqrt{npq}$, respectively. Figure 4.25 displays a binomial probability histogram for the binomial distribution with $n = 25$, $p = .6$, for which $μ = 25(.6) = 15$ and $σ = \sqrt{25(.6)(.4)} = 2.449$.

Figure 4.25 Binomial probability histogram for $n = 25, p = .6$ with normal approximation curve superimposed

A normal curve with this $μ$ and $σ$ has been superimposed on the probability histogram. Although the probability histogram is a bit skewed (because $p \neq .5$), the normal curve gives a very good approximation, especially in the middle part of the picture. The area of any rectangle (probability of any particular $X$ value) except those in the extreme tails can be accurately approximated by the corresponding normal curve area.

Example

For example, $P (X = 10) = B (10; 25, .6) - B (9; 25, .6) = .021,$ whereas the area under the normal curve between 9.5 and 10.5 is $P (- 2.25 \leq Z \leq - 1.84) = .0207 .$
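The comparison above can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `normal_cdf` are illustrative and do not come from the text.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k) for n trials, success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(z):
    """Standard normal cdf, Phi(z), written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 25, 0.6
mu = n * p                       # mean np = 15
sigma = sqrt(n * p * (1 - p))    # standard deviation sqrt(npq), about 2.449

# Exact probability of the rectangle centered at 10
exact = binom_pmf(10, n, p)

# Normal curve area between 9.5 and 10.5
approx = normal_cdf((10.5 - mu) / sigma) - normal_cdf((9.5 - mu) / sigma)

print(f"exact  P(X = 10)       = {exact:.4f}")
print(f"normal area 9.5..10.5  = {approx:.4f}")
```

Because the code does not round the $z$ values to two decimals, its normal area may differ from a table-based calculation in the last digit.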

More generally, as long as the binomial probability histogram is not too skewed, binomial probabilities can be well approximated by normal curve areas. It is then customary to say that $X$ has approximately a normal distribution.

Proposition

Let $X$ be a binomial rv based on $n$ trials with success probability $p$. Then if the binomial probability histogram is not too skewed, $X$ has approximately a normal distribution with $μ = np$ and $σ = \sqrt{npq}$. In particular, for $x =$ a possible value of $X$,
$P(X \leq x) = B(x; n, p) \approx (\text{area under the normal curve to the left of } x + 0.5) = Φ\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right)$
In practice, the approximation is adequate provided that both $n p \geq 10$ and $n q \geq 10$ (i.e., the expected number of successes and the expected number of failures are both at least 10), since there is then enough symmetry in the underlying binomial distribution.
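As a hedged illustration of the proposition, the sketch below implements the continuity-corrected approximation alongside the exact cdf and the $np \geq 10$, $nq \geq 10$ check. The function names (`binom_cdf`, `normal_approx_cdf`, `approximation_adequate`) are assumptions chosen for this example.

```python
from math import comb, erf, sqrt

def binom_cdf(x, n, p):
    """Exact B(x; n, p) = P(X <= x), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) by Phi((x + 0.5 - np) / sqrt(npq))."""
    z = (x + 0.5 - n * p) / sqrt(n * p * (1 - p))
    return 0.5 * (1 + erf(z / sqrt(2)))

def approximation_adequate(n, p):
    """Rule of thumb from the text: both np >= 10 and nq >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p = 25, 0.6
for x in (10, 15, 20):
    print(x, round(binom_cdf(x, n, p), 4), round(normal_approx_cdf(x, n, p), 4))

print(approximation_adequate(50, 0.5))   # np = nq = 25, rule satisfied
print(approximation_adequate(25, 0.9))   # nq = 2.5, rule violated
```

Running the loop for a case that violates the rule, such as $n = 25$, $p = .9$, is a quick way to see how the agreement deteriorates when the histogram is strongly skewed.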

A direct proof of the approximation’s validity is quite difficult. In the next chapter we’ll see that it is a consequence of a more general result called the Central Limit Theorem. In all honesty, the approximation is not so important for probability calculation as it once was. This is because software can now calculate binomial probabilities exactly for quite large values of $n$ .

EX 4.20

When the objective of our investigation is to make an inference about a population proportion $p$, interest will focus on the sample proportion of successes $X/n$ rather than on $X$ itself. Because this proportion is just $X$ multiplied by the constant $1/n$, it will also have approximately a normal distribution with mean $μ = p$ and standard deviation $σ = \sqrt{pq/n}$, provided that both $np \geq 10$ and $nq \geq 10$. This normal approximation is the basis for several inferential procedures to be discussed in later chapters.
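The rescaling argument can be checked with a short sketch: dividing $X$ by $n$ divides both the mean and the standard deviation by $n$, so a value standardizes to the same $z$ on either scale. The function names and the choice $n = 25$, $p = .6$, $x = 12$ are illustrative assumptions, and no continuity correction is applied here.

```python
from math import sqrt, isclose

def count_mean_sd(n, p):
    """Mean and standard deviation of the count X: np and sqrt(npq)."""
    return n * p, sqrt(n * p * (1 - p))

def proportion_mean_sd(n, p):
    """Mean and standard deviation of the sample proportion X/n: p and sqrt(pq/n)."""
    return p, sqrt(p * (1 - p) / n)

n, p = 25, 0.6
mu_x, sd_x = count_mean_sd(n, p)
mu_phat, sd_phat = proportion_mean_sd(n, p)

# The proportion's standard deviation is the count's divided by n
print(isclose(sd_phat, sd_x / n))

# Standardizing x = 12 successes, or the proportion 12/25, gives the same z
x = 12
z_count = (x - mu_x) / sd_x
z_prop = (x / n - mu_phat) / sd_phat
print(isclose(z_count, z_prop))
```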

Youliang Zhong
