INTRODUCTION
A point estimate, because it is a single number, by itself provides no information about the precision and reliability of estimation. Consider, for example, using the statistic $\bar{X}$ to calculate a point estimate for the true average breaking strength (g) of paper towels of a certain brand, and suppose that $\bar{x} = 9322.7$. Because of sampling variability, it is virtually never the case that $\bar{x} = \mu$. The point estimate says nothing about how close it might be to $\mu$. An alternative to reporting a single sensible value for the parameter being estimated is to calculate and report an entire interval of plausible values: an interval estimate or confidence interval (CI). A confidence interval is always calculated by first selecting a confidence level, which is a measure of the degree of reliability of the interval. A confidence interval with a 95% confidence level for the true average breaking strength might have a lower limit of 9162.5 and an upper limit of 9482.9. Then at the 95% confidence level, any value of $\mu$ between 9162.5 and 9482.9 is plausible. A confidence level of 95% implies that 95% of all samples would give an interval that includes $\mu$, or whatever other parameter is being estimated, and only 5% of all samples would yield an erroneous interval. The most frequently used confidence levels are 95%, 99%, and 90%. The higher the confidence level, the more strongly we believe that the value of the parameter being estimated lies within the interval (an interpretation of any particular confidence level will be given shortly).
Information about the precision of an interval estimate is conveyed by the width of the interval. If the confidence level is high and the resulting interval is quite narrow, our knowledge of the value of the parameter is reasonably precise. A very wide confidence interval, however, gives the message that there is a great deal of uncertainty concerning the value of what we are estimating. Figure 7.1 shows 95% confidence intervals for true average breaking strengths of two different brands of paper towels. One of these intervals suggests precise knowledge about $\mu$, whereas the other suggests a very wide range of plausible values.
Figure 7.1 CIs indicating precise (brand 1) and imprecise (brand 2) information about $\mu$
7.1 Basic Properties of Confidence Intervals
The basic concepts and properties of confidence intervals (CIs) are most easily introduced by first focusing on a simple, albeit somewhat unrealistic, problem situation. Suppose that the parameter of interest is a population mean $\mu$ and that
- The population distribution is normal
- The value of the population standard deviation $\sigma$ is known
Normality of the population distribution is often a reasonable assumption. However, if the value of $\mu$ is unknown, it is typically implausible that the value of $\sigma$ would be available (knowledge of a population’s center typically precedes information concerning spread). We’ll develop methods based on less restrictive assumptions in Sections 7.2 and 7.3.
EXAMPLE 7.1 Industrial engineers who specialize in ergonomics are concerned with designing workspace and worker-operated devices so as to achieve high productivity and comfort. The article “Studies on Ergonomically Designed Alphanumeric Keyboards” (Human Factors, 1985: 175-187) reports on a study of preferred height for an experimental keyboard with large forearm-wrist support. A sample of trained typists was selected, and the preferred keyboard height was determined for each typist. The resulting sample average preferred height was . Assuming that the preferred height is normally distributed with (a value suggested by data in the article), obtain a confidence interval (interval of plausible values) for $\mu$, the true average preferred height for the population of all experienced typists.
The actual sample observations $x_1, x_2, \ldots, x_n$ are assumed to be the result of a random sample $X_1, \ldots, X_n$ from a normal distribution with mean value $\mu$ and standard deviation $\sigma$. The results described in Chapter 5 then imply that, irrespective of the sample size $n$, the sample mean $\bar{X}$ is normally distributed with expected value $\mu$ and standard deviation $\sigma/\sqrt{n}$. Standardizing $\bar{X}$ by first subtracting its expected value and then dividing by its standard deviation yields the standard normal variable
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (7.1)$$
Because the area under the standard normal curve between $-1.96$ and $1.96$ is .95,
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95 \qquad (7.2)$$
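Now let’s manipulate the inequalities inside the parentheses in (7.2) so that they appear in the equivalent form $l < \mu < u$, where the endpoints $l$ and $u$ involve $\bar{X}$ and $\sigma/\sqrt{n}$. This is achieved through the following sequence of operations, each yielding inequalities equivalent to the original ones.
- Multiply through by $\sigma/\sqrt{n}$: $-1.96 \cdot \sigma/\sqrt{n} < \bar{X} - \mu < 1.96 \cdot \sigma/\sqrt{n}$
- Subtract $\bar{X}$ from each term: $-\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < -\mu < -\bar{X} + 1.96 \cdot \sigma/\sqrt{n}$
- Multiply through by $-1$ to eliminate the minus sign in front of $\mu$ (which reverses the direction of each inequality): $\bar{X} + 1.96 \cdot \sigma/\sqrt{n} > \mu > \bar{X} - 1.96 \cdot \sigma/\sqrt{n}$
that is, $\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < \mu < \bar{X} + 1.96 \cdot \sigma/\sqrt{n}$
The equivalence of each set of inequalities to the original set implies that
$$P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95 \qquad (7.3)$$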
The event inside the parentheses in (7.3) has a somewhat unfamiliar appearance; previously, the random quantity has appeared in the middle with constants on both ends, as in $a \le Y \le b$. In (7.3) the random quantity appears on the two ends, whereas the unknown constant $\mu$ appears in the middle. To interpret (7.3), think of a random interval having left endpoint $\bar{X} - 1.96 \cdot \sigma/\sqrt{n}$ and right endpoint $\bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. In interval notation, this becomes
$$\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.4)$$
The interval (7.4) is random because the two endpoints of the interval involve a random variable. It is centered at the sample mean $\bar{X}$ and extends $1.96\sigma/\sqrt{n}$ to each side of $\bar{X}$. Thus the interval’s width is $2 \cdot (1.96) \cdot \sigma/\sqrt{n}$, a fixed number; only the location of the interval (its midpoint $\bar{X}$) is random (Figure 7.2). Now (7.3) can be paraphrased as “the probability is .95 that the random interval (7.4) includes or covers the true value of $\mu$.” Before any data is gathered, it is quite likely that $\mu$ will lie inside the interval (7.4).
Figure 7.2 The random interval (7.4) centered at $\bar{X}$
DEFINITION
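If, after observing $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, we compute the observed sample mean $\bar{x}$ and then substitute $\bar{x}$ into (7.4) in place of $\bar{X}$, the resulting fixed interval is called a 95% confidence interval for $\mu$. This CI can be expressed either as
$$\left(\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \text{ is a 95\% CI for } \mu$$
or as
$$\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \text{ with 95\% confidence}$$
A concise expression for the interval is $\bar{x} \pm 1.96 \cdot \sigma/\sqrt{n}$, where $-$ gives the left endpoint (lower limit) and $+$ gives the right endpoint (upper limit).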
EXAMPLE 7.2 (Example 7.1 continued)
The quantities needed for computation of the 95% CI for true average preferred height are $\sigma$, $n$, and $\bar{x}$. The resulting interval is
That is, we can be highly confident, at the 95% confidence level, that $\mu$ lies in this interval. This interval is relatively narrow, indicating that $\mu$ has been rather precisely estimated.
Interpreting a Confidence Level
The confidence level 95% for the interval just defined was inherited from the probability .95 for the random interval (7.4). Intervals having other levels of confidence will be introduced shortly. For now, though, consider how 95% confidence can be interpreted.
We started with an event whose probability was .95 (that the random interval (7.4) would capture the true value of $\mu$) and then used the data in Example 7.1 to compute the CI in Example 7.2. It is therefore tempting to conclude that $\mu$ is within this fixed interval with probability .95. But by substituting $\bar{x}$ for $\bar{X}$, all randomness disappears; the interval is not a random interval, and $\mu$ is a constant (unfortunately unknown to us). Thus it is incorrect to write the statement $P(\mu \text{ lies in this fixed interval}) = .95$.
A correct interpretation of “95% confidence” relies on the long-run relative frequency interpretation of probability: To say that an event $A$ has probability .95 is to say that if the experiment on which $A$ is defined is performed over and over again, in the long run $A$ will occur 95% of the time. Suppose we obtain another sample of typists’ preferred heights and compute another 95% interval. Now consider repeating this for a third sample, a fourth sample, a fifth sample, and so on. Let $A$ be the event that $\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < \mu < \bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. Since $P(A) = .95$, in the long run 95% of our computed CIs will contain $\mu$. This is illustrated in Figure 7.3, where the vertical line cuts the measurement axis at the true (but unknown) value of $\mu$. Notice that 7 of the 100 intervals shown fail to contain $\mu$. In the long run, only 5% of the intervals so constructed would fail to contain $\mu$.
According to this interpretation, the confidence level 95% is not so much a statement about any particular interval such as the one in Example 7.2. Instead it pertains to what would happen if a very large number of like intervals were to be constructed using the same CI formula. Although this may seem unsatisfactory, the root of the difficulty lies with our interpretation of probability: it applies to a long sequence of replications of an experiment rather than just a single replication. There is another approach to the construction and interpretation of CIs that uses the notion of subjective probability and Bayes’ theorem, but the technical details are beyond the scope of this text; the book by DeGroot, et al. (see the Chapter 6 bibliography) is a good source. The interval presented here (as well as each interval presented subsequently) is called a “classical” CI because its interpretation rests on the classical notion of probability.
Figure 7.3 One hundred 95% CIs (asterisks identify intervals that do not include $\mu$)
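The long-run interpretation can be checked numerically. The following is a minimal simulation sketch, not part of the original text: the population values `mu=50.0`, `sigma=4.0`, the sample size `n=25`, and the function name `simulate_coverage` are all hypothetical choices made only for illustration. It repeatedly draws normal samples, forms the known-$\sigma$ interval, and reports the fraction of intervals that capture $\mu$.

```python
import random
from statistics import NormalDist, fmean

def simulate_coverage(mu, sigma, n, reps=10_000, conf=0.95, seed=0):
    """Fraction of simulated known-sigma z intervals that contain mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z critical value z_{alpha/2}
    half_width = z * sigma / n ** 0.5              # fixed, because sigma is known
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

# Hypothetical population: mu = 50, sigma = 4, samples of size 25.
coverage = simulate_coverage(mu=50.0, sigma=4.0, n=25)
print(f"observed coverage over 10,000 intervals: {coverage:.3f}")
```

The observed fraction should land near .95 but will rarely equal it exactly, just as 93 of the 100 intervals in Figure 7.3 (rather than exactly 95) contain $\mu$.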
Other Levels of Confidence
The confidence level of 95% was inherited from the probability .95 for the initial inequalities in (7.2). If a confidence level of 99% is desired, the initial probability of .95 must be replaced by .99, which necessitates changing the $z$ critical value from 1.96 to 2.58. A 99% CI then results from using 2.58 in place of 1.96 in the formula for the 95% CI.
In fact, any desired level of confidence can be achieved by replacing 1.96 or 2.58 with the appropriate standard normal critical value. Recall from Chapter 4 the notation for a $z$ critical value: $z_\alpha$ is the number on the horizontal $z$ scale that captures upper tail area $\alpha$. As Figure 7.4 shows, a probability (i.e., central $z$ curve area) of $1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96.
Figure 7.4 $P(-z_{\alpha/2} \le Z < z_{\alpha/2}) = 1 - \alpha$
DEFINITION
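A $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ of a normal population when the value of $\sigma$ is known is given by
$$\left(\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.5)$$
or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$.
The formula (7.5) for the CI can also be expressed in words as
point estimate of $\mu$ $\pm$ ($z$ critical value)(standard error of the mean).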
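As a minimal sketch of formula (7.5), the function below computes the interval for any confidence level, obtaining the $z$ critical value from the standard library's `statistics.NormalDist`. The function names `z_critical` and `z_interval`, and the inputs in the usage line ($\bar{x} = 10.0$, $\sigma = 2.0$, $n = 16$), are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(conf):
    """z_{alpha/2}: the value capturing upper-tail area alpha/2, alpha = 1 - conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

def z_interval(xbar, sigma, n, conf=0.95):
    """CI (7.5) for a normal mean when sigma is known."""
    half_width = z_critical(conf) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical inputs: xbar = 10.0, sigma = 2.0, n = 16.
for conf in (0.90, 0.95, 0.99):
    lo, hi = z_interval(10.0, 2.0, 16, conf)
    print(f"{conf:.0%}: z = {z_critical(conf):.3f}, CI = ({lo:.2f}, {hi:.2f})")
```

The computed critical values agree with the tabled 1.645, 1.96, and 2.58 to the number of decimals shown in the text.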
EXAMPLE 7.3 The production process for engine control housing units of a particular type has recently been modified. Prior to this modification, historical data had suggested that the distribution of hole diameters for bushings on the housings was normal with a standard deviation of . It is believed that the modification has not affected the shape of the distribution or the standard deviation, but that the value of the mean diameter may have changed. A sample of 40 housing units is selected and hole diameter is determined for each one, resulting in a sample mean diameter of . Let’s calculate a confidence interval for true average hole diameter using a confidence level of 90%. This requires that $100(1 - \alpha) = 90$, from which $\alpha = .10$ and $z_{\alpha/2} = z_{.05} = 1.645$ (corresponding to a cumulative $z$-curve area of .9500). The desired interval is then
With a reasonably high degree of confidence, we can say that $\mu$ lies in this interval. This interval is rather narrow because of the small amount of variability in hole diameter.
Confidence Level, Precision, and Sample Size
Why settle for a confidence level of 95% when a level of 99% is achievable? Because the price paid for the higher confidence level is a wider interval. Since the 95% interval extends $1.96 \cdot \sigma/\sqrt{n}$ to each side of $\bar{x}$, the width of the interval is $2(1.96) \cdot \sigma/\sqrt{n} = 3.92 \cdot \sigma/\sqrt{n}$. Similarly, the width of the 99% interval is $2(2.58) \cdot \sigma/\sqrt{n} = 5.16 \cdot \sigma/\sqrt{n}$. That is, we have more confidence in the 99% interval precisely because it is wider. The higher the desired degree of confidence, the wider the resulting interval will be.
If we think of the width of the interval as specifying its precision or accuracy, then the confidence level (or reliability) of the interval is inversely related to its precision. A highly reliable interval estimate may be imprecise in that the endpoints of the interval may be far apart, whereas a precise interval may entail relatively low reliability. Thus it cannot be said unequivocally that a 99% interval is to be preferred to a 95% interval; the gain in reliability entails a loss in precision.
An appealing strategy is to specify both the desired confidence level and interval width and then determine the necessary sample size.
EXAMPLE 7.4 Extensive monitoring of a computer time-sharing system has suggested that response time to a particular editing command is normally distributed with standard deviation 25 millisec. A new operating system has been installed, and we wish to estimate the true average response time $\mu$ for the new environment. Assuming that response times are still normally distributed with $\sigma = 25$, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10? The sample size $n$ must satisfy
$$10 = 2 \cdot (1.96)(25/\sqrt{n})$$
Rearranging this equation gives
$$\sqrt{n} = 2 \cdot (1.96)(25)/10 = 9.80$$
so
$$n = (9.80)^2 = 96.04$$
Since $n$ must be an integer, a sample size of 97 is required.
A general formula for the sample size $n$ necessary to ensure an interval width $w$ is obtained from equating $w$ to $2 \cdot z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and solving for $n$.
The sample size necessary for the CI (7.5) to have a width $w$ is
$$n = \left(2 z_{\alpha/2} \cdot \frac{\sigma}{w}\right)^2$$
The smaller the desired width $w$, the larger $n$ must be. In addition, $n$ is an increasing function of $\sigma$ (more population variability necessitates a larger sample size) and of the confidence level $100(1 - \alpha)$ (as $\alpha$ decreases, $z_{\alpha/2}$ increases).
The half-width $1.96\sigma/\sqrt{n}$ of the 95% CI is sometimes called the bound on the error of estimation associated with a 95% confidence level. That is, with 95% confidence, the point estimate $\bar{x}$ will be no farther than this from $\mu$. Before obtaining data, an investigator may wish to determine a sample size for which a particular value of the bound is achieved. For example, with $\mu$ representing the average fuel efficiency (mpg) for all cars of a certain type, the objective of an investigation may be to estimate $\mu$ to within a specified amount with a specified level of confidence. More generally, if we wish to estimate $\mu$ to within an amount $B$ (the specified bound on the error of estimation) with $100(1 - \alpha)\%$ confidence, the necessary sample size results from replacing $2/w$ by $1/B$ in the formula in the preceding box.
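The sample-size formula, and its variant for a bound $B$ on the error of estimation, can be sketched as follows. The function names `n_for_width` and `n_for_bound` are hypothetical; the usage line reproduces the setting of Example 7.4 ($\sigma = 25$, width 10, 95% confidence). Rounding up with `ceil` reflects the requirement that $n$ be an integer large enough to meet the target.

```python
from math import ceil
from statistics import NormalDist

def n_for_width(sigma, width, conf=0.95):
    """Smallest n for which the known-sigma CI (7.5) has width at most `width`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    return ceil((2 * z * sigma / width) ** 2)

def n_for_bound(sigma, bound, conf=0.95):
    """Smallest n that estimates mu to within `bound` (replace 2/w by 1/B)."""
    return n_for_width(sigma, 2 * bound, conf)

# Setting of Example 7.4: sigma = 25, desired width 10, 95% confidence.
print(n_for_width(25, 10))   # 97
# The same requirement expressed as a bound of 5 on the error of estimation.
print(n_for_bound(25, 5))    # 97
```

Halving the desired width roughly quadruples the required sample size, since $n$ is proportional to $1/w^2$.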
Deriving a Confidence Interval
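Let $X_1, X_2, \ldots, X_n$ denote the sample on which the CI for a parameter $\theta$ is to be based.
Suppose a random variable satisfying the following two properties can be found:
- The variable is a function of both $X_1, \ldots, X_n$ and $\theta$.
- The probability distribution of the variable does not depend on $\theta$ or on any other unknown parameters.
Let $h(X_1, X_2, \ldots, X_n; \theta)$ denote this random variable. For example, if the population distribution is normal with known $\sigma$ and $\theta = \mu$, the variable $h(X_1, \ldots, X_n; \mu) = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ satisfies both properties; it clearly depends functionally on $\mu$, yet has the standard normal probability distribution irrespective of the value of $\mu$. In general, the form of the $h$ function is usually suggested by examining the distribution of an appropriate estimator $\hat{\theta}$.
For any $\alpha$ between 0 and 1, constants $a$ and $b$ can be found to satisfy
$$P(a < h(X_1, \ldots, X_n; \theta) < b) = 1 - \alpha \qquad (7.6)$$
Because of the second property, $a$ and $b$ do not depend on $\theta$. In the normal example, $a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$. Now suppose that the inequalities in (7.6) can be manipulated to isolate $\theta$, giving the equivalent probability statement
$$P(l(X_1, X_2, \ldots, X_n) < \theta < u(X_1, X_2, \ldots, X_n)) = 1 - \alpha$$
Then $l(x_1, x_2, \ldots, x_n)$ and $u(x_1, x_2, \ldots, x_n)$ are the lower and upper confidence limits, respectively, for a $100(1 - \alpha)\%$ CI. In the normal example, we saw that $l(X_1, \ldots, X_n) = \bar{X} - z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $u(X_1, \ldots, X_n) = \bar{X} + z_{\alpha/2} \cdot \sigma/\sqrt{n}$.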
EXAMPLE 7.5 A theoretical model suggests that the time to breakdown of an insulating fluid between electrodes at a particular voltage has an exponential distribution with parameter $\lambda$ (see Section 4.4). A random sample of $n = 10$ breakdown times yields the following sample data (in min): , . A 95% CI for $\lambda$ and for the true average breakdown time are desired.
Let $h(X_1, X_2, \ldots, X_n; \lambda) = 2\lambda \sum X_i$. It can be shown that this random variable has a probability distribution called a chi-squared distribution with $2n$ degrees of freedom (df) ($\nu = 2n$, where $\nu$ is the parameter of a chi-squared distribution as mentioned in Section 4.4). Appendix Table A.7 pictures a typical chi-squared density curve and tabulates critical values that capture specified tail areas. The relevant number of df here is $2(10) = 20$. The $\nu = 20$ row of the table shows that 34.170 captures upper-tail area .025 and 9.591 captures lower-tail area .025 (upper-tail area .975). Thus for $n = 10$,
$$P\left(9.591 < 2\lambda \sum X_i < 34.170\right) = .95$$
Division by $2\sum X_i$ isolates $\lambda$, yielding
$$P\left(9.591 / \left(2 \sum X_i\right) < \lambda < 34.170 / \left(2 \sum X_i\right)\right) = .95$$
The lower limit of the 95% CI for $\lambda$ is $9.591/(2\sum x_i)$, and the upper limit is $34.170/(2\sum x_i)$. For the given data, $\sum x_i = 550.87$, giving the interval (.00871, .03101).
The expected value of an exponential rv is $\mu = 1/\lambda$. Since
$$P\left(2 \sum X_i / 34.170 < 1/\lambda < 2 \sum X_i / 9.591\right) = .95$$
the 95% CI for true average breakdown time is $(2\sum x_i/34.170,\ 2\sum x_i/9.591) = (32.24, 114.87)$. This interval is obviously quite wide, reflecting substantial variability in breakdown times and a small sample size.
In general, the upper and lower confidence limits result from replacing each $<$ in (7.6) by $=$ and solving for $\theta$. In the insulating fluid example just considered, $2\lambda \sum x_i = 34.170$ gives $\lambda = 34.170/(2\sum x_i)$ as the upper confidence limit, and the lower limit is obtained from the other equation. Notice that the two interval limits are not equidistant from the point estimate, since the interval is not of the form $\hat{\theta} \pm c$.
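The arithmetic of Example 7.5 can be sketched in a few lines. The two chi-squared critical values are the tabled ones quoted in the example. The sample total of 550.87 used below is an assumption: it is the value consistent with the reported limits .03101, 32.24, and 114.87, not a figure read from the raw data. The function names are hypothetical.

```python
# Tabled chi-squared critical values for 20 df, as quoted in Example 7.5.
CHI2_UPPER_025_DF20 = 34.170   # captures upper-tail area .025
CHI2_LOWER_025_DF20 = 9.591    # captures lower-tail area .025

def exponential_rate_ci(total, lower_crit, upper_crit):
    """CI for lambda from the pivot 2 * lambda * sum(X_i) ~ chi-squared(2n)."""
    return lower_crit / (2 * total), upper_crit / (2 * total)

def exponential_mean_ci(total, lower_crit, upper_crit):
    """CI for the mean 1/lambda: invert the limits, which swaps their order."""
    lam_lo, lam_hi = exponential_rate_ci(total, lower_crit, upper_crit)
    return 1 / lam_hi, 1 / lam_lo

# Assumed sample total (min), chosen to agree with the limits reported in the example.
total = 550.87
lam_lo, lam_hi = exponential_rate_ci(total, CHI2_LOWER_025_DF20, CHI2_UPPER_025_DF20)
mu_lo, mu_hi = exponential_mean_ci(total, CHI2_LOWER_025_DF20, CHI2_UPPER_025_DF20)
print(f"95% CI for lambda: ({lam_lo:.5f}, {lam_hi:.5f})")
print(f"95% CI for mean:   ({mu_lo:.2f}, {mu_hi:.2f})")
```

Because the pivot is not symmetric, neither interval is centered at its point estimate, which is the point made in the closing remark of the example.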
Bootstrap Confidence Intervals
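The bootstrap technique was introduced in Chapter 6 as a way of estimating $\sigma_{\hat{\theta}}$. It can also be applied to obtain a CI for $\theta$. Consider again estimating the mean $\mu$ of a normal distribution when $\sigma$ is known. Let’s replace $\mu$ by $\theta$ and use $\hat{\theta} = \bar{X}$ as the point estimator. Notice that $1.96\sigma/\sqrt{n}$ is the 97.5th percentile of the distribution of $\hat{\theta} - \theta$ [that is, $P(\bar{X} - \mu < 1.96\sigma/\sqrt{n}) = P(Z < 1.96) = .9750$]. Similarly, $-1.96\sigma/\sqrt{n}$ is the 2.5th percentile, so
$$.95 = P(\text{2.5th percentile} < \hat{\theta} - \theta < \text{97.5th percentile}) = P(\hat{\theta} - \text{2.5th percentile} > \theta > \hat{\theta} - \text{97.5th percentile})$$
That is, with
$$l = \hat{\theta} - \text{97.5th percentile of } \hat{\theta} - \theta, \qquad u = \hat{\theta} - \text{2.5th percentile of } \hat{\theta} - \theta \qquad (7.7)$$
the CI for $\theta$ is $(l, u)$. In many cases, the percentiles in (7.7) cannot be calculated, but they can be estimated from bootstrap samples. Suppose we obtain $B = 1000$ bootstrap samples and calculate $\hat{\theta}_1^*, \ldots, \hat{\theta}_{1000}^*$, and $\bar{\theta}^*$ followed by the 1000 differences $\hat{\theta}_1^* - \bar{\theta}^*, \ldots, \hat{\theta}_{1000}^* - \bar{\theta}^*$. The 25th largest and 25th smallest of these differences are estimates of the unknown percentiles in (7.7). Consult the Devore and Berk or Efron books cited in Chapter 6 for more information.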
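A minimal sketch of the bootstrap recipe just described, applied to a sample mean, is given below. The data values, the seed, and the function name `bootstrap_ci_mean` are hypothetical and serve only to make the example runnable; resampling is done with replacement from the observed data.

```python
import random
from statistics import fmean

def bootstrap_ci_mean(data, B=1000, seed=0):
    """95% bootstrap CI for a mean, using percentiles of theta*_i - mean(theta*)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = fmean(data)                          # point estimate from the data
    # B bootstrap estimates, each from a resample of size n drawn with replacement.
    boot = [fmean(rng.choices(data, k=n)) for _ in range(B)]
    center = fmean(boot)                             # average of bootstrap estimates
    diffs = sorted(t - center for t in boot)
    k = round(0.025 * B)                             # 25 when B = 1000
    pct_low = diffs[k - 1]                           # 25th smallest difference
    pct_high = diffs[-k]                             # 25th largest difference
    # Limits (7.7): subtract the upper percentile for l, the lower percentile for u.
    return theta_hat - pct_high, theta_hat - pct_low

# Hypothetical sample, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4, 6.2, 5.0]
lo, hi = bootstrap_ci_mean(sample)
print(f"bootstrap 95% CI for the mean: ({lo:.2f}, {hi:.2f})")
```

Note the reversal in (7.7): the larger percentile determines the lower limit and the smaller percentile determines the upper limit.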
EXERCISES Section 7.1 (1-11)
- Consider a normal population distribution with the value of $\sigma$ known.
a. What is the confidence level for the interval ?
b. What is the confidence level for the interval ?
c. What value of in the CI formula (7.5) results in a confidence level of ?
d. Answer the question posed in part (c) for a confidence level of .
- Each of the following is a confidence interval for true average (i.e., population mean) resonance frequency (Hz) for all tennis rackets of a certain type:
a. What is the value of the sample mean resonance frequency?
b. Both intervals were calculated from the same sample data. The confidence level for one of these intervals is and for the other is . Which of the intervals has the confidence level, and why?
- Suppose that a random sample of 50 bottles of a particular brand of cough syrup is selected and the alcohol content of each bottle is determined. Let $\mu$ denote the average alcohol content for the population of all bottles of the brand under study. Suppose that the resulting 95% confidence interval is (7.8, 9.4).
a. Would a confidence interval calculated from this same sample have been narrower or wider than the given interval? Explain your reasoning.
b. Consider the following statement: There is a 95% chance that $\mu$ is between 7.8 and 9.4. Is this statement correct? Why or why not?
c. Consider the following statement: We can be highly confident that 95% of all bottles of this type of cough syrup have an alcohol content that is between 7.8 and 9.4. Is this statement correct? Why or why not?
d. Consider the following statement: If the process of selecting a sample of size 50 and then computing the corresponding 95% interval is repeated 100 times, 95 of the resulting intervals will include $\mu$. Is this statement correct? Why or why not?
- A CI is desired for the true average stray-load loss (watts) for a certain type of induction motor when the line current is held at 10 amps for a speed of . Assume that stray-load loss is normally distributed with .
a. Compute a for when and .
b. Compute a for when and 58.3.
c. Compute a CI for when and 58.3.
d. Compute an CI for when and 58.3.
e. How large must be if the width of the interval for is to be 1.0 ?
- Assume that the helium porosity (in percentage) of coal samples taken from any particular seam is normally distributed with true standard deviation .75 .
a. Compute a for the true average porosity of a certain seam if the average porosity for 20 specimens from the seam was 4.85.
b. Compute a for true average porosity of another seam based on 16 specimens with a sample average porosity of 4.56 .
c. How large a sample size is necessary if the width of the interval is to be .40 ?
d. What sample size is necessary to estimate true average porosity to within with confidence?
- On the basis of extensive tests, the yield point of a particular type of mild steel-reinforcing bar is known to be normally distributed with . The composition of bars has been slightly modified, but the modification is not believed to have affected either the normality or the value of .
a. Assuming this to be the case, if a sample of 25 modified bars resulted in a sample average yield point of , compute a for the true average yield point of the modified bar.
b. How would you modify the interval in part (a) to obtain a confidence level of ?
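- By how much must the sample size $n$ be increased if the width of the CI (7.5) is to be halved? If the sample size is increased by a factor of 25, what effect will this have on the width of the interval? Justify your assertions.
- Let $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1 + \alpha_2 = \alpha$. Then
$$P\left(-z_{\alpha_1} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha_2}\right) = 1 - \alpha$$
a. Use this equation to derive a more general expression for a $100(1 - \alpha)\%$ CI for $\mu$ of which the interval (7.5) is a special case.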
b. Let and . Does this result in a narrower or wider interval than the interval (7.5)?
- a. Under the same conditions as those leading to the interval (7.5), . Use this to derive a one-sided interval for that has infinite width and provides a lower confidence bound on . What is this interval for the data in Exercise 5(a)?
b. Generalize the result of part (a) to obtain a lower bound with confidence level .
c. What is an analogous interval to that of part (b) that provides an upper bound on ? Compute this interval for the data of Exercise 4(a).
- A random sample of heat pumps of a certain type yielded the following observations on lifetime (in years): 15.7 .7 4.8 .9 12.2 5.3 .6
a. Assume that the lifetime distribution is exponential and use an argument parallel to that of Example 7.5 to obtain a for expected (true average) lifetime.
b. How should the interval of part (a) be altered to achieve a confidence level of ?
c. What is a for the standard deviation of the lifetime distribution? [Hint: What is the standard deviation of an exponential random variable?]
- Consider the next 1000 95% CIs for $\mu$ that a statistical consultant will obtain for various clients. Suppose the data sets on which the intervals are based are selected independently of one another. How many of these 1000 intervals do you expect to capture the corresponding value of $\mu$? What is the probability that between 940 and 960 of these intervals contain the corresponding value of $\mu$? [Hint: Let $Y$ = the number among the 1000 intervals that contain $\mu$. What kind of random variable is $Y$?]
7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion
The CI for $\mu$ given in the previous section assumed that the population distribution is normal with the value of $\sigma$ known. We now present a large-sample CI whose validity does not require these assumptions. After showing how the argument leading to this interval generalizes to yield other large-sample intervals, we focus on an interval for a population proportion $p$.
Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
A Large-Sample Interval for $\mu$
Let $X_1, X_2, \ldots, X_n$ be a random sample from a population having a mean $\mu$ and standard deviation $\sigma$. Provided that $n$ is sufficiently large, the Central Limit Theorem (CLT) implies that $\bar{X}$ has approximately a normal distribution whatever the nature of the population distribution. It then follows that $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has approximately a standard normal distribution, so that
$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$
An argument parallel to that given in Section 7.1 yields $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ as a large-sample CI for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$. That is, when $n$ is large, the CI for $\mu$ given previously remains valid whatever the population distribution, provided that the qualifier “approximately” is inserted in front of the confidence level.
A practical difficulty with this development is that computation of the CI requires the value of $\sigma$, which will rarely be known. Consider replacing the population standard deviation $\sigma$ in $Z$ by the sample standard deviation $S$ to obtain the standardized variable
$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$
Previously, there was randomness only in the numerator of $Z$ by virtue of $\bar{X}$. In the new standardized variable, both $\bar{X}$ and $S$ vary in value from one sample to another. So it might seem that the distribution of the new variable should be more spread out than the $z$ curve to reflect the extra variation in the denominator. This is indeed true when $n$ is small. However, for large $n$ the substitution of $S$ for $\sigma$ adds little extra variability, so this variable also has approximately a standard normal distribution. Manipulation of the variable in a probability statement, as in the case of known $\sigma$, gives a general large-sample CI for $\mu$.
PROPOSITION
If $n$ is sufficiently large, the standardized variable
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has approximately a standard normal distribution. This implies that
$$\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \qquad (7.8)$$
is a large-sample confidence interval for $\mu$ with confidence level approximately $100(1 - \alpha)\%$. This formula is valid regardless of the shape of the population distribution.
In words, the CI (7.8) is
point estimate of $\mu$ $\pm$ ($z$ critical value)(estimated standard error of the mean).
Generally speaking, $n > 40$ will be sufficient to justify the use of this interval. This is somewhat more conservative than the rule of thumb for the CLT because of the additional variability introduced by using $S$ in place of $\sigma$.
EXAMPLE 7.6 Haven’t you always wanted to own a Porsche? The author thought maybe he could afford a Boxster, the cheapest model. So he went to www.cars.com on Nov. 18, 2009, and found a total of 1113 such cars listed. Asking prices ranged from $3499 to $130,000 (the latter price was one of only two exceeding $70,000). The prices depressed him, so he focused instead on odometer readings (miles). Here are reported readings for a sample of 50 of these Boxsters:
2948 | 2996 | 7197 | 8338 | 8500 | 8759 | 12710 | 12925 |
15767 | 20000 | 23247 | 24863 | 26000 | 26210 | 30552 | 30600 |
35700 | 36466 | 40316 | 40596 | 41021 | 41234 | 43000 | 44607 |
45000 | 45027 | 45442 | 46963 | 47978 | 49518 | 52000 | 53334 |
54208 | 56062 | 57000 | 57365 | 60020 | 60265 | 60803 | 62851 |
64404 | 72140 | 74594 | 79308 | 79500 | 80000 | 80000 | 84000 |
113000 | 118634 |
A boxplot of the data (Figure 7.5) shows that, except for the two outliers at the upper end, the distribution of values is reasonably symmetric (in fact, a normal probability plot exhibits a reasonably linear pattern, though the points corresponding to the two smallest and two largest observations are somewhat removed from a line fit through the remaining points).
Figure 7.5 A boxplot of the odometer reading data from Example 7.6
Summary quantities include $n = 50$, $\bar{x} = 45{,}679.4$, $\tilde{x} = 45{,}013.5$, $s = 26{,}641.675$, $f_s = 34{,}265$. The mean and median are reasonably close (if the two largest values were each reduced by 30,000, the mean would fall to 44,479.4, while the median would be unaffected). The boxplot and the magnitudes of $s$ and $f_s$ relative to the mean and median both indicate a substantial amount of variability. A confidence level of about 95% requires $z_{.025} = 1.96$, and the interval is
$$45{,}679.4 \pm (1.96)\left(\frac{26{,}641.675}{\sqrt{50}}\right) = 45{,}679.4 \pm 7384.7 = (38{,}294.7,\ 53{,}064.1)$$
That is, $38{,}294.7 < \mu < 53{,}064.1$ with 95% confidence. This interval is rather wide because a sample size of 50, even though large by our rule of thumb, is not large enough to overcome the substantial variability in the sample. We do not have a very precise estimate of the population mean odometer reading.
Is the interval we’ve calculated one of the 95% that in the long run includes the parameter being estimated, or is it one of the “bad” 5% that does not do so? Without knowing the value of $\mu$, we cannot tell. Remember that the confidence level refers to the long run capture percentage when the formula is used repeatedly on various samples; it cannot be interpreted for a single sample and the resulting interval.
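The computation in Example 7.6 can be reproduced directly from the 50 odometer readings listed above. This is a minimal sketch using only the standard library; the function name `large_sample_ci` is hypothetical, and the 95% level is passed as a default argument. Because the critical value is computed rather than rounded to 1.96, the limits may differ from hand calculation in the first decimal place.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev

# The 50 odometer readings (miles) listed in Example 7.6.
odometer = [
    2948, 2996, 7197, 8338, 8500, 8759, 12710, 12925,
    15767, 20000, 23247, 24863, 26000, 26210, 30552, 30600,
    35700, 36466, 40316, 40596, 41021, 41234, 43000, 44607,
    45000, 45027, 45442, 46963, 47978, 49518, 52000, 53334,
    54208, 56062, 57000, 57365, 60020, 60265, 60803, 62851,
    64404, 72140, 74594, 79308, 79500, 80000, 80000, 84000,
    113000, 118634,
]

def large_sample_ci(sample, conf=0.95):
    """Large-sample CI (7.8): xbar +/- z_{alpha/2} * s / sqrt(n)."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)                                # sample standard deviation
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

lo, hi = large_sample_ci(odometer)
print(f"n = {len(odometer)}, mean = {mean(odometer):.1f}, median = {median(odometer):.1f}")
print(f"s = {stdev(odometer):.1f}, 95% CI = ({lo:.1f}, {hi:.1f})")
```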
Unfortunately, the choice of sample size to yield a desired interval width is not as straightforward here as it was for the case of known $\sigma$. This is because the width of (7.8) is $2z_{\alpha/2}s/\sqrt{n}$. Since the value of $s$ is not available before the data has been gathered, the width of the interval cannot be determined solely by the choice of $n$. The only option for an investigator who wishes to specify a desired width is to make an educated guess as to what the value of $s$ might be. By being conservative and guessing a larger value of $s$, an $n$ larger than necessary will be chosen. The investigator may be able to specify a reasonably accurate value of the population range (the difference between the largest and smallest values). Then if the population distribution is not too skewed, dividing the range by 4 gives a ballpark value of what $s$ might be.
EXAMPLE 7.7 The charge-to-tap time (min) for carbon steel in one type of open hearth furnace is to be determined for each heat in a sample of size $n$. If the investigator believes that almost all times in the distribution are between 320 and 440, what sample size would be appropriate for estimating the true average time to within 5 min with a confidence level of 95%?
A reasonable value for $s$ is $(440 - 320)/4 = 30$. Thus
$$n = \left[\frac{(1.96)(30)}{5}\right]^2 = 138.3$$
Since the sample size must be an integer, $n = 139$ should be used. Note that estimating to within 5 min with the specified confidence level is equivalent to a CI width of 10 min.
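The range-divided-by-4 guess and the resulting sample size can be sketched as below. The function name `n_from_range_guess` is hypothetical, and a 95% confidence level is assumed as the default; the usage line uses the range 320 to 440 and the bound of 5 min implied by the stated CI width of 10 min.

```python
from math import ceil
from statistics import NormalDist

def n_from_range_guess(low, high, bound, conf=0.95):
    """Sample size to estimate a mean to within `bound`, guessing s as range/4."""
    s_guess = (high - low) / 4                       # ballpark standard deviation
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * s_guess / bound) ** 2)          # round up to an integer n

# Example 7.7: times believed to lie between 320 and 440, estimate to within 5 min.
print(n_from_range_guess(320, 440, 5))   # 139
```

A larger guess for $s$ gives a larger $n$, so overstating the range errs on the side of collecting more data than strictly needed.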
A General Large-Sample Confidence Interval
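The large-sample intervals $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ and $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ are special cases of a general large-sample CI for a parameter $\theta$. Suppose that $\hat{\theta}$ is an estimator satisfying the following properties: (1) It has approximately a normal distribution; (2) it is (at least approximately) unbiased; and (3) an expression for $\sigma_{\hat{\theta}}$, the standard deviation (standard error) of $\hat{\theta}$, is available. For example, in the case $\theta = \mu$, $\hat{\mu} = \bar{X}$ is an unbiased estimator whose distribution is approximately normal when $n$ is large and $\sigma_{\hat{\mu}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$. Standardizing $\hat{\theta}$ yields the rv $Z = (\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$, which has approximately a standard normal distribution. This justifies the probability statement
$$P\left(-z_{\alpha/2} < \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1 - \alpha \qquad (7.9)$$
Assume first that $\sigma_{\hat{\theta}}$ does not involve any unknown parameters (e.g., known $\sigma$ in the case $\theta = \mu$). Then replacing each $<$ in (7.9) by $=$ results in $\theta = \hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, so the lower and upper confidence limits are $\hat{\theta} - z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$ and $\hat{\theta} + z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, respectively. Now suppose that $\sigma_{\hat{\theta}}$ does not involve $\theta$ but does involve at least one other unknown parameter. Let $s_{\hat{\theta}}$ be the estimate of $\sigma_{\hat{\theta}}$ obtained by using estimates in place of the unknown parameters (e.g., $s/\sqrt{n}$ estimates $\sigma/\sqrt{n}$). Under general conditions (essentially that $s_{\hat{\theta}}$ be close to $\sigma_{\hat{\theta}}$ for most samples), a valid CI is $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$. The large-sample interval $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ is an example.
Finally, suppose that $\sigma_{\hat{\theta}}$ does involve the unknown $\theta$. For example, we shall see momentarily that this is the case when $\theta = p$, a population proportion. Then $(\hat{\theta} - \theta)/\sigma_{\hat{\theta}} = \pm z_{\alpha/2}$ can be difficult to solve. An approximate solution can often be obtained by replacing $\theta$ in $\sigma_{\hat{\theta}}$ by its estimate $\hat{\theta}$. This results in an estimated standard deviation $s_{\hat{\theta}}$, and the corresponding interval is again $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$.
In words, this CI is
point estimate $\pm$ ($z$ critical value)(estimated standard error of the estimator)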
A Confidence Interval for a Population Proportion
Let $p$ denote the proportion of “successes” in a population, where success identifies an individual or object that has a specified property (e.g., individuals who graduated from college, computers that do not need warranty service, etc.). A random sample of $n$ individuals or objects is to be selected, and $X$ is the number of successes in the sample. Provided that $n$ is small compared to the population size, $X$ can be regarded as a binomial rv with $E(X) = np$ and $\sigma_X = \sqrt{np(1 - p)}$. Furthermore, if both $np \ge 10$ and $nq \ge 10$, where $q = 1 - p$, $X$ has approximately a normal distribution.
The natural estimator of $p$ is $\hat{p} = X/n$, the sample fraction of successes. Since $\hat{p}$ is just $X$ multiplied by the constant $1/n$, $\hat{p}$ also has approximately a normal distribution. As shown in Section 6.1, $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. The standard deviation $\sigma_{\hat{p}}$ involves the unknown parameter $p$. Standardizing $\hat{p}$ by subtracting $p$ and dividing by $\sigma_{\hat{p}}$ then implies that
$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$
Proceeding as suggested in the subsection “Deriving a Confidence Interval” (Section 7.1), the confidence limits result from replacing each $<$ by $=$ and solving the resulting equation for $p$. But whereas the equations employed in deriving the large-sample CI for $\mu$ are linear in $\mu$, the equations here are quadratic ($p^2$ appears in the numerator when both sides of each equation are squared to eliminate the square root). The two roots are
$$p = \frac{\hat{p} + z_{\alpha/2}^2/(2n)}{1 + z_{\alpha/2}^2/n} \pm z_{\alpha/2} \frac{\sqrt{\hat{p}(1 - \hat{p})/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n}$$
PROPOSITION
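Let $\tilde p = \dfrac{\hat p + z_{\alpha/2}^2/(2n)}{1 + z_{\alpha/2}^2/n}$. Then a confidence interval for a population proportion $p$ with confidence level approximately $100(1-\alpha)\%$ is
$$\tilde p \pm z_{\alpha/2}\,\frac{\sqrt{\hat p \hat q/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n} \qquad (7.10)$$
where $\hat q = 1 - \hat p$ and, as before, the $-$ in (7.10) corresponds to the lower confidence limit and the $+$ to the upper confidence limit.
This is often referred to as the score CI for $p$.
If the sample size $n$ is very large, then $z^2/(2n)$ is generally quite negligible (small) compared to $\hat p$ and $z^2/n$ is quite negligible compared to 1, from which $\tilde p \approx \hat p$. In this case $z^2/(4n^2)$ is also negligible compared to $\hat p \hat q/n$ ($n^2$ is a much larger divisor than is $n$). As a result, the dominant term in the $\pm$ expression is $z_{\alpha/2}\sqrt{\hat p \hat q/n}$ and the score interval is approximately
$$\hat p \pm z_{\alpha/2}\sqrt{\hat p \hat q/n} \qquad (7.11)$$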
This latter interval has the general form of a large-sample interval suggested in the last subsection. The approximate CI (7.11) is the one that for decades has appeared in introductory statistics textbooks. It clearly has a much simpler and more appealing form than the score CI. So why bother with the latter?
First of all, suppose we use $z_{.025} = 1.96$ in the traditional formula (7.11). Then our nominal confidence level (the one we think we’re buying by using that $z$ critical value) is approximately 95%. So before a sample is selected, the probability that the random interval includes the actual value of $p$ (i.e., the coverage probability) should be about .95. But as Figure 7.6 shows for a fixed sample size $n$, the actual coverage probability for this interval can differ considerably from the nominal probability .95, particularly when $p$ is not close to .5 (the graph of coverage probability versus $p$ is very jagged because the underlying binomial probability distribution is discrete rather than continuous). This is generally speaking a deficiency of the traditional interval: the actual confidence level can be quite different from the nominal level even for reasonably large sample sizes. Recent research has shown that the score interval rectifies this behavior: for virtually all sample sizes and values of $p$, its actual confidence level will be quite close to the nominal level specified by the choice of $z_{\alpha/2}$. This is due largely to the fact that the score interval is shifted a bit toward .5 compared to the traditional interval. In particular, the midpoint $\tilde p$ of the score interval is always a bit closer to .5 than is the midpoint $\hat p$ of the traditional interval. This is especially important when $p$ is close to 0 or 1.
Figure 7.6 Actual coverage probability for the interval (7.11) for varying values of $p$ at a fixed sample size $n$
In addition, the score interval can be used with nearly all sample sizes and parameter values. It is thus not necessary to check the conditions $n\hat p \ge 10$ and $n(1 - \hat p) \ge 10$ that would be required were the traditional interval employed. So rather than asking when $n$ is large enough for (7.11) to yield a good approximation to (7.10), our recommendation is that the score CI should always be used. The slight additional tediousness of the computation is outweighed by the desirable properties of the interval.
EXAMPLE 7.8 The article “Repeatability and Reproducibility for Pass/Fail Data” (J. of Testing and Eval., 1997: 151-153) reported that in $n = 48$ trials in a particular laboratory, 16 resulted in ignition of a particular type of substrate by a lighted cigarette. Let $p$ denote the long-run proportion of all such trials that would result in ignition. A point estimate for $p$ is $\hat p = 16/48 = .333$. A confidence interval for $p$ with a confidence level of approximately 95% is
$$\frac{.333 + (1.96)^2/96 \pm 1.96\sqrt{(.333)(.667)/48 + (1.96)^2/9216}}{1 + (1.96)^2/48} = .346 \pm .129 = (.217, .475)$$
This interval is quite wide because a sample size of 48 is not at all large when estimating a proportion.
The traditional interval is
$$.333 \pm 1.96\sqrt{(.333)(.667)/48} = (.200, .467)$$
These two intervals would be in much closer agreement were the sample size substantially larger.
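The comparison between the two intervals can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names are illustrative, the critical value comes from Python's standard-library `statistics.NormalDist`, and the pair $n = 20$, $p = .01$ used for the coverage check is an arbitrary choice made here to show how badly the traditional interval can behave when $p$ is far from .5.

```python
from math import comb, sqrt
from statistics import NormalDist


def z_crit(conf):
    """Two-sided z critical value z_{alpha/2} for confidence level `conf`."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def traditional_interval(x, n, conf=0.95):
    """Traditional interval (7.11): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = z_crit(conf)
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(x, n, conf=0.95):
    """Score interval (7.10), centered at p_tilde rather than p_hat."""
    z = z_crit(conf)
    p_hat = x / n
    denom = 1 + z**2 / n
    p_tilde = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return p_tilde - half, p_tilde + half


def coverage(interval, n, p, conf=0.95):
    """Exact probability that the interval contains p when X ~ Bin(n, p)."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, conf)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


# Example 7.8: 16 ignitions in 48 trials.
score_lo, score_hi = score_interval(16, 48)
trad_lo, trad_hi = traditional_interval(16, 48)
print(f"score:       ({score_lo:.3f}, {score_hi:.3f})")
print(f"traditional: ({trad_lo:.3f}, {trad_hi:.3f})")

# Actual coverage of each nominal 95% interval at an illustrative (n, p) pair.
print("traditional coverage:", coverage(traditional_interval, 20, 0.01))
print("score coverage:      ", coverage(score_interval, 20, 0.01))
```

When $x = 0$ the traditional interval collapses to the single point 0 and cannot contain any positive $p$, which is one reason its coverage falls so far below the nominal level for small $p$.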
Equating the width of the CI for $p$ to a prespecified width $w$ gives a quadratic equation for the sample size $n$ necessary to give an interval with a desired degree of precision. Suppressing the subscript in $z_{\alpha/2}$, the solution is
$$n = \frac{2z^2 \hat p \hat q - z^2 w^2 \pm \sqrt{4z^4 \hat p \hat q(\hat p \hat q - w^2) + w^2 z^4}}{w^2} \qquad (7.12)$$
Neglecting the terms in the numerator involving $w^2$ gives
$$n \approx \frac{4z^2 \hat p \hat q}{w^2}$$
This latter expression is what results from equating the width of the traditional interval to $w$.
These formulas unfortunately involve the unknown $\hat p$. The most conservative approach is to take advantage of the fact that $\hat p \hat q$ $[= \hat p(1 - \hat p)]$ is maximized at $\hat p = .5$. Thus if $\hat p = \hat q = .5$ is used in (7.12), the width will be at most $w$ regardless of what value of $\hat p$ results from the sample. Alternatively, if the investigator believes strongly, based on prior information, that $p \le p_0 \le .5$, then $p_0$ can be used in place of $\hat p$. A similar comment applies when $p \ge p_0 \ge .5$.
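EXAMPLE 7.9 The width of the 95% CI in Example 7.8 is .258. The value of $n$ necessary to ensure a width of .10 irrespective of the value of $\hat p$ is
$$n = \frac{2(1.96)^2(.25) - (1.96)^2(.01) + \sqrt{4(1.96)^4(.25)(.25 - .01) + (.01)(1.96)^4}}{.01} = 380.3$$
Thus a sample size of 381 should be used. The expression for $n$ based on the traditional CI gives a slightly larger value of 385.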
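The two sample-size calculations can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive routine: the function names are hypothetical, the larger root of the quadratic is taken (the smaller root is not a usable sample size), and the default guess of .5 for the proportion is the conservative choice described in the text.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_for_score_width(w, p_guess=0.5, conf=0.95):
    """Sample size from (7.12) so the score CI has width at most w."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    pq = p_guess * (1 - p_guess)
    root = sqrt(4 * z**4 * pq * (pq - w**2) + w**2 * z**4)
    # Use the + root; round up because n must be an integer.
    return ceil((2 * z**2 * pq - z**2 * w**2 + root) / w**2)


def n_for_traditional_width(w, p_guess=0.5, conf=0.95):
    """Approximation obtained by dropping the w^2 terms in the numerator."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    pq = p_guess * (1 - p_guess)
    return ceil(4 * z**2 * pq / w**2)


n_score = n_for_score_width(0.10)
n_trad = n_for_traditional_width(0.10)
print(n_score, n_trad)
```

Supplying a prior bound such as `p_guess=0.3` in place of .5 gives a smaller required sample size, at the cost of relying on that prior belief.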
One-Sided Confidence Intervals (Confidence Bounds)
The confidence intervals discussed thus far give both a lower confidence bound and an upper confidence bound for the parameter being estimated. In some circumstances, an investigator will want only one of these two types of bounds. For example, a psychologist may wish to calculate a 95% upper confidence bound for true average reaction time to a particular stimulus, or a reliability engineer may want only a lower confidence bound for true average lifetime of components of a certain type. Because the cumulative area under the standard normal curve to the left of 1.645 is .95,
$$P\left(\frac{\bar X - \mu}{S/\sqrt{n}} < 1.645\right) \approx .95$$
Manipulating the inequality inside the parentheses to isolate $\mu$ on one side and replacing rv’s by calculated values gives the inequality $\mu > \bar x - 1.645\,s/\sqrt{n}$; the expression on the right is the desired lower confidence bound. Starting with $P(-1.645 < Z) \approx .95$ and manipulating the inequality results in the upper confidence bound. A similar argument gives a one-sided bound associated with any other confidence level.
PROPOSITION
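A large-sample upper confidence bound for $\mu$ is
$$\mu < \bar x + z_\alpha \cdot \frac{s}{\sqrt{n}}$$
and a large-sample lower confidence bound for $\mu$ is
$$\mu > \bar x - z_\alpha \cdot \frac{s}{\sqrt{n}}$$
A one-sided confidence bound for $p$ results from replacing $z_{\alpha/2}$ by $z_\alpha$ and $\pm$ by either $+$ or $-$ in the CI formula (7.10) for $p$. In all cases the confidence level is approximately $100(1-\alpha)\%$.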
EXAMPLE 7.10 Titanium and its alloys have found increasing use in aerospace and automotive applications because of durability and high strength-to-weight ratios. However, machining can be difficult because of low thermal conductivity. The article “Modeling and Multi-Objective Optimization of Process Parameters of Wire Electrical Discharge Machining Using Non-Dominated Sorting Genetic Algorithm-II” (J. of Engr. Manuf., 2012: 1186-2001) described an investigation into different settings that impact wire electrical discharge machining of titanium 6-2-4-2. One characteristic of interest was surface roughness of the metal after machining. A sample of 54 surface roughness observations resulted in a sample mean roughness of 1.9042 and a sample standard deviation of .1455. An upper confidence bound for true average roughness $\mu$ with confidence level 95% requires $z_{.05} = 1.645$ (not the $z_{.025} = 1.96$ value needed for a two-sided CI). The bound is
$$1.9042 + (1.645) \cdot \frac{.1455}{\sqrt{54}} = 1.9042 + .0326 = 1.9368$$
Thus we estimate with a confidence level of roughly 95% that $\mu < 1.9368$.
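As a rough check on the arithmetic, the large-sample one-sided bounds can be computed as follows. This is a sketch only: the function names are illustrative, and the 95% level used for the roughness data is an assumption suggested by the 1.645 critical value in the derivation above.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(xbar, s, n, conf=0.95):
    """Large-sample upper confidence bound: xbar + z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf)  # one-tailed critical value
    return xbar + z_alpha * s / sqrt(n)


def lower_bound(xbar, s, n, conf=0.95):
    """Large-sample lower confidence bound: xbar - z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf)
    return xbar - z_alpha * s / sqrt(n)


# Surface roughness summary from Example 7.10: n = 54, xbar = 1.9042, s = .1455.
ub = upper_bound(1.9042, 0.1455, 54)
print(round(ub, 4))
```

Because all of the error probability sits in one tail, the one-sided critical value is smaller than the two-sided one at the same confidence level, so the bound lies closer to the sample mean than the corresponding two-sided limit would.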
EXERCISES Section 7.2 (12-27)
- The following observations are lifetimes (days) subsequent to diagnosis for individuals suffering from blood cancer (“A Goodness of Fit Approach to the Class of Life Distributions with Unknown Age,” Quality and Reliability Engr. Intl., 2012: 761-766):
a. Can a confidence interval for true average lifetime be calculated without assuming anything about the nature of the lifetime distribution? Explain your reasoning. [Note: A normal probability plot of the data exhibits a reasonably linear pattern.]
b. Calculate and interpret a confidence interval with a confidence level for true average lifetime. [Hint: and .]
- The article “Gas Cooking, Kitchen Ventilation, and Exposure to Combustion Products” (Indoor Air, 2006: 65-73) reported that for a sample of 50 kitchens with gas cooking appliances monitored during a one-week period, the sample mean level (ppm) was 654.16, and the sample standard deviation was 164.43.
a. Calculate and interpret a (two-sided) confidence interval for true average level in the population of all homes from which the sample was selected.
b. Suppose the investigators had made a rough guess of 175 for the value of before collecting data. What sample size would be necessary to obtain an interval width of for a confidence level of ?
- The negative effects of ambient air pollution on children’s lung function have been well established, but less research is available about the impact of indoor air pollution. The authors of “Indoor Air Pollution and Lung Function Growth Among Children in Four Chinese Cities” (Indoor Air, 2012: 3-11) investigated the relationship between indoor air-pollution metrics and lung function growth among children ages 6-13 years living in four Chinese cities. For each subject in the study, the authors measured an important lung-capacity index known as FEV$_1$, the forced volume (in ml) of air that is exhaled in 1 second. Higher FEV$_1$ values are associated with greater lung capacity. Among the children in the study, 514 came from households that used coal for cooking or heating or both. Their FEV$_1$ mean was 1427 with a standard deviation of 325. (A complex statistical procedure was used to show that burning coal had a clear negative effect on mean FEV$_1$ levels.)
a. Calculate and interpret a (two-sided) confidence interval for true average level in the population of all children from which the sample was selected. Does it appear that the parameter of interest has been accurately estimated?
b. Suppose the investigators had made a rough guess of 320 for the value of before collecting data. What sample size would be necessary to obtain an interval width of for a confidence level of ?
- Determine the confidence level for each of the following large-sample one-sided confidence bounds:
a. Upper bound:
b. Lower bound:
c. Upper bound:
- The alternating current (AC) breakdown voltage of an insulating liquid indicates its dielectric strength. The article “Testing Practices for the AC Breakdown Voltage Testing of Insulation Liquids” (IEEE Electrical Insulation Magazine, 1995: 21-26) gave the accompanying sample observations on breakdown voltage of a particular circuit under certain conditions.
57 48 63 57 57 55 53 59 53 52 50 55 60 50 56 58
a. Construct a boxplot of the data and comment on interesting features.
b. Calculate and interpret a for true average breakdown voltage . Does it appear that has been precisely estimated? Explain.
c. Suppose the investigator believes that virtually all values of breakdown voltage are between 40 and 70 . What sample size would be appropriate for the to have a width of (so that is estimated to within with confidence)?
- Exercise 1.13 gave a sample of ultimate tensile strength observations (ksi). Use the accompanying descriptive statistics output from Minitab to calculate a lower confidence bound for true average ultimate tensile strength, and interpret the result.
N | Mean | Median | TrMean | StDev | SE Mean |
---|---|---|---|---|---|
153 | 135.39 | 135.40 | 135.41 | 4.59 | 0.37 |

Minimum | Maximum | Q1 | Q3 |
---|---|---|---|
122.20 | 147.70 | 132.95 | 138.25 |
- The U.S. Army commissioned a study to assess how deeply a bullet penetrates ceramic body armor (“Testing Body Armor Materials for Use by the U.S. Army-Phase III,” 2012). In the standard test, a cylindrical clay model is layered under the armor vest. A projectile is then fired, causing an indentation in the clay. The deepest impression in the clay is measured as an indication of survivability of someone wearing the armor. Here is data from one testing organization under particular experimental conditions; measurements (in ) were made using a manually controlled digital caliper:
22.4 | 23.6 | 24.0 | 24.9 | 25.5 | 25.6 |
25.8 | 26.1 | 26.4 | 26.7 | 27.4 | 27.6 |
28.3 | 29.0 | 29.1 | 29.6 | 29.7 | 29.8 |
29.9 | 30.0 | 30.4 | 30.5 | 30.7 | 30.7 |
31.0 | 31.0 | 31.4 | 31.6 | 31.7 | 31.9 |
31.9 | 32.0 | 32.1 | 32.4 | 32.5 | 32.5 |
32.6 | 32.9 | 33.1 | 33.3 | 33.5 | 33.5 |
33.5 | 33.5 | 33.6 | 33.6 | 33.8 | 33.9 |
34.1 | 34.2 | 34.6 | 34.6 | 35.0 | 35.2 |
35.2 | 35.4 | 35.4 | 35.4 | 35.5 | 35.7 |
35.8 | 36.0 | 36.0 | 36.0 | 36.1 | 36.1 |
36.2 | 36.4 | 36.6 | 37.0 | 37.4 | 37.5 |
37.5 | 38.0 | 38.7 | 38.8 | 39.8 | 41.0 |
42.0 | 42.1 | 44.6 | 48.3 | 55.0 |
a. Construct a boxplot of the data and comment on interesting features.
b. Construct a normal probability plot. Is it plausible that impression depth is normally distributed? Is a normal distribution assumption needed in order to calculate a confidence interval or bound for the true average depth using the foregoing data? Explain.
c. Use the accompanying Minitab output as a basis for calculating and interpreting an upper confidence bound for with a confidence level of . Variable Count Mean SE Mean StDev Q1 Median Q3 IQR
- The article “Limited Yield Estimation for Visual Defect Sources” (IEEE Trans. on Semiconductor Manuf., 1997: 17-23) reported that, in a study of a particular wafer inspection process, 356 dies were examined by an inspection probe and 201 of these passed the probe. Assuming a stable process, calculate a (two-sided) confidence interval for the proportion of all dies that pass the probe.
- TV advertising agencies face increasing challenges in reaching audience members because viewing TV programs via digital streaming is gaining in popularity. The Harris poll reported on November 13, 2012, that 53% of 2343 American adults surveyed said they have watched digitally streamed TV programming on some type of device.
a. Calculate and interpret a confidence interval at the confidence level for the proportion of all adult Americans who watched streamed programming up to that point in time.
b. What sample size would be required for the width of a to be at most .05 irrespective of the value of ?
- In a sample of 1000 randomly selected consumers who had opportunities to send in a rebate claim form after purchasing a product, 250 of these people said they never did so (“Rebates: Get What You Deserve,” Consumer Reports, May 2009: 7). Reasons cited for their behavior included too many steps in the process, amount too small, missed deadline, fear of being placed on a mailing list, lost receipt, and doubts about receiving the money. Calculate an upper confidence bound at the confidence level for the true proportion of such consumers who never apply for a rebate. Based on this bound, is there compelling evidence that the true proportion of such consumers is smaller than 1/3? Explain your reasoning.
- The technology underlying hip replacements has changed as these operations have become more popular (over 250,000 in the United States in 2008). Starting in 2003, highly durable ceramic hips were marketed. Unfortunately, for too many patients the increased durability has been counterbalanced by an increased incidence of squeaking. The May 11, 2008, issue of the New York Times reported that in one study of 143 individuals who received ceramic hips between 2003 and 2005, 10 of the hips developed squeaking.
a. Calculate a lower confidence bound at the confidence level for the true proportion of such hips that develop squeaking.
b. Interpret the confidence level used in (a).
- The Pew Forum on Religion and Public Life reported on Dec. 9, 2009, that in a survey of 2003 American adults, said they believed in astrology.
a. Calculate and interpret a confidence interval at the confidence level for the proportion of all adult Americans who believe in astrology.
b. What sample size would be required for the width of a to be at most .05 irrespective of the value of ?
- A sample of 56 research cotton samples resulted in a sample average percentage elongation of 8.17 and a sample standard deviation of 1.42 (“An Apparent Relation Between the Spiral Angle , the Percent Elongation , and the Dimensions of the Cotton Fiber,” Textile Research J., 1978: 407-410). Calculate a large-sample CI for the true average percentage elongation $\mu$. What assumptions are you making about the distribution of percentage elongation?
- A state legislator wishes to survey residents of her district to see what proportion of the electorate is aware of her position on using state funds to pay for abortions.
a. What sample size is necessary if the for is to have a width of at most .10 irrespective of ?
b. If the legislator has strong reason to believe that at least of the electorate know of her position, how large a sample size would you recommend?
- The superintendent of a large school district, having once had a course in probability and statistics, believes that the number of teachers absent on any given day has a Poisson distribution with parameter $\mu$. Use the accompanying data on absences for 50 days to obtain a large-sample CI for $\mu$. [Hint: The mean and variance of a Poisson variable both equal $\mu$, so
$$Z = \frac{\bar X - \mu}{\sqrt{\mu/n}}$$
has approximately a standard normal distribution. Now proceed as in the derivation of the interval for $p$ by making a probability statement (with probability $1 - \alpha$) and solving the resulting inequalities for $\mu$; see the argument just after (7.10).]
Number of absences | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Frequency | 1 | 4 | 8 | 10 | 8 | 7 | 5 | 3 | 2 | 1 | |
- Reconsider the CI (7.10) for $p$, and focus on a confidence level of 95%. Show that the confidence limits agree quite well with those of the traditional interval (7.11) once two successes and two failures have been appended to the sample [i.e., (7.11) based on $x + 2$ S’s in $n + 4$ trials]. [Hint: $1.96 \approx 2$. Note: Agresti and Coull showed that this adjustment of the traditional interval also has an actual confidence level close to the nominal level.]
7.3 Intervals Based on a Normal Population Distribution
The CI for $\mu$ presented in Section 7.2 is valid provided that $n$ is large. The resulting interval can be used whatever the nature of the population distribution. The CLT cannot be invoked, however, when $n$ is small. In this case, one way to proceed is to make a specific assumption about the form of the population distribution and then derive a CI tailored to that assumption. For example, we could develop a CI for $\mu$ when the population is described by a gamma distribution, another interval for the case of a Weibull distribution, and so on. Statisticians have indeed carried out this program for a number of different distributional families. Because the normal distribution is more frequently appropriate as a population model than is any other type of distribution, we will focus here on a CI for this situation.
ASSUMPTION The population of interest is normal, so that $X_1, \ldots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown.
The key result underlying the interval in Section 7.2 was that for large $n$, the rv $Z = (\bar X - \mu)/(S/\sqrt{n})$ has approximately a standard normal distribution. When $n$ is small, the additional variability in the denominator implies that the probability distribution of $(\bar X - \mu)/(S/\sqrt{n})$ will be more spread out than the standard normal distribution. The result on which inferences are based introduces a new family of probability distributions called $t$ distributions.
THEOREM
When $\bar X$ is the mean of a random sample of size $n$ from a normal distribution with mean $\mu$, the rv
$$T = \frac{\bar X - \mu}{S/\sqrt{n}} \qquad (7.13)$$
has a probability distribution called a $t$ distribution with $n - 1$ degrees of freedom (df).
Properties of $t$ Distributions
Before applying this theorem, a discussion of properties of $t$ distributions is in order. Although the variable of interest is still $(\bar X - \mu)/(S/\sqrt{n})$, we now denote it by $T$ to emphasize that it does not have a standard normal distribution when $n$ is small. Recall that a normal distribution is governed by two parameters; each different choice of $\mu$ in combination with $\sigma$ gives a particular normal distribution. Any particular $t$ distribution results from specifying the value of a single parameter, called the number of degrees of freedom, abbreviated df. We’ll denote this parameter by the Greek letter $\nu$. Possible values of $\nu$ are the positive integers 1, 2, 3, .... So there is a $t$ distribution with 1 df, another with 2 df, yet another with 3 df, and so on.
For any fixed value of $\nu$, the density function that specifies the associated $t$ curve is even more complicated than the normal density function. Fortunately, we need concern ourselves only with several of the more important features of these curves.
Properties of $t$ Distributions
Let $t_\nu$ denote the $t$ distribution with $\nu$ df.
- Each $t_\nu$ curve is bell-shaped and centered at 0.
- Each $t_\nu$ curve is more spread out than the standard normal ($z$) curve.
- As $\nu$ increases, the spread of the corresponding $t_\nu$ curve decreases.
- As $\nu \to \infty$, the sequence of $t_\nu$ curves approaches the standard normal curve (so the $z$ curve is often called the $t$ curve with df $= \infty$).
Figure 7.7 illustrates several of these properties for selected values of .
Figure 7.7 $t_\nu$ and $z$ curves
The number of df for $T$ in (7.13) is $n - 1$ because, although $S$ is based on the $n$ deviations $X_1 - \bar X, \ldots, X_n - \bar X$, the fact that $\sum (X_i - \bar X) = 0$ implies that only $n - 1$ of these are “freely determined.” The number of df for a $t$ variable is the number of freely determined deviations on which the estimated standard deviation in the denominator of $T$ is based.
The use of the $t$ distribution in making inferences requires notation for capturing $t$-curve tail areas analogous to $z_\alpha$ for the $z$ curve. You might think that $t_\alpha$ would do the trick. However, the desired value depends not only on the tail area captured but also on df.
NOTATION
Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the $t$ curve with $\nu$ df to the right of $t_{\alpha,\nu}$ is $\alpha$; $t_{\alpha,\nu}$ is called a $t$ critical value.
For example, $t_{.05,6}$ is the $t$ critical value that captures an upper-tail area of .05 under the $t$ curve with 6 df. The general notation is illustrated in Figure 7.8. Because $t$ curves are symmetric about zero, $-t_{\alpha,\nu}$ captures lower-tail area $\alpha$. Appendix Table A.5 gives $t_{\alpha,\nu}$ for selected values of $\alpha$ and $\nu$. This table also appears inside the back cover. The columns of the table correspond to different values of $\alpha$. To obtain $t_{.05,15}$, go to the $\alpha = .05$ column, look down to the $\nu = 15$ row, and read $t_{.05,15} = 1.753$. Similarly, $t_{.05,22} = 1.717$ (.05 column, $\nu = 22$ row), and $t_{.01,22} = 2.508$.
Figure 7.8 Illustration of a critical value
The values of $t_{\alpha,\nu}$ exhibit regular behavior as we move across a row or down a column. For fixed $\nu$, $t_{\alpha,\nu}$ increases as $\alpha$ decreases, since we must move farther to the right of zero to capture area $\alpha$ in the tail. For fixed $\alpha$, as $\nu$ is increased (i.e., as we look down any particular column of the $t$ table) the value of $t_{\alpha,\nu}$ decreases. This is because a larger value of $\nu$ implies a $t$ distribution with smaller spread, so it is not necessary to go so far from zero to capture tail area $\alpha$. Furthermore, $t_{\alpha,\nu}$ decreases more slowly as $\nu$ increases. Consequently, the table values are shown in increments of 2 between 30 df and 40 df and then jump to $\nu = 50$, 60, 120, and finally $\infty$. Because $t_\infty$ is the standard normal curve, the familiar $z_\alpha$ values appear in the last row of the table. The rule of thumb suggested earlier for use of the large-sample CI (if $n > 40$) comes from the approximate equality of the standard normal and $t$ distributions for $\nu \ge 40$.
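The One-Sample $t$ Confidence Interval
The standardized variable $T$ has a $t$ distribution with $n - 1$ df, and the area under the corresponding $t$ density curve between $-t_{\alpha/2,n-1}$ and $t_{\alpha/2,n-1}$ is $1 - \alpha$ (area $\alpha/2$ lies in each tail), so
$$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1 - \alpha \qquad (7.14)$$
Expression (7.14) differs from expressions in previous sections in that $T$ and $t_{\alpha/2,n-1}$ are used in place of $Z$ and $z_{\alpha/2}$, but it can be manipulated in the same manner to obtain a confidence interval for $\mu$.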
PROPOSITION
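Let $\bar x$ and $s$ be the sample mean and sample standard deviation computed from the results of a random sample from a normal population with mean $\mu$. Then a $100(1-\alpha)\%$ confidence interval for $\mu$ is
$$\left(\bar x - t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}},\ \bar x + t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}}\right) \qquad (7.15)$$
or, more compactly, $\bar x \pm t_{\alpha/2,n-1} \cdot s/\sqrt{n}$.
An upper confidence bound for $\mu$ is
$$\bar x + t_{\alpha,n-1} \cdot \frac{s}{\sqrt{n}}$$
and replacing $+$ by $-$ in this latter expression gives a lower confidence bound for $\mu$, both with confidence level $100(1-\alpha)\%$.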
EXAMPLE 7.11 Even as traditional markets for sweetgum lumber have declined, large section solid timbers traditionally used for construction bridges and mats have become increasingly scarce. The article “Development of Novel Industrial Laminated Planks from Sweetgum Lumber” (J. of Bridge Engr., 2008: 64-66) described the manufacturing and testing of composite beams designed to add value to low-grade sweetgum lumber.
Here is data on the modulus of rupture (psi; the article contained summary data expressed in ):
6807.99 | 7637.06 | 6663.28 | 6165.03 | 6991.41 | 6992.23 |
6981.46 | 7569.75 | 7437.88 | 6872.39 | 7663.18 | 6032.28 |
6906.04 | 6617.17 | 6984.12 | 7093.71 | 7659.50 | 7378.61 |
7295.54 | 6702.76 | 7440.17 | 8053.26 | 8284.75 | 7347.95 |
7422.69 | 7886.87 | 6316.67 | 7713.65 | 7503.33 | 7674.99 |
Figure 7.9 shows a normal probability plot from the R software. The straightness of the pattern in the plot provides strong support for assuming that the population distribution of MOR is at least approximately normal.
Figure 7.9 A normal probability plot of the modulus of rupture data
The sample mean and sample standard deviation are 7203.191 and 543.5400, respectively (for anyone bent on doing hand calculation, the computational burden is eased a bit by subtracting 6000 from each value to obtain ; then and , from which and as given).
Let’s now calculate a confidence interval for true average MOR using a confidence level of 95%. The CI is based on $n - 1 = 29$ degrees of freedom, so the necessary critical value is $t_{.025,29} = 2.045$. The interval estimate is now
$$\bar x \pm t_{.025,29} \cdot \frac{s}{\sqrt{n}} = 7203.191 \pm (2.045) \cdot \frac{543.5400}{\sqrt{30}} = 7203.191 \pm 202.938 = (7000.253, 7406.129)$$
We estimate that $7000.253 < \mu < 7406.129$ with 95% confidence. If we use the same formula on sample after sample, in the long run 95% of the calculated intervals will contain $\mu$. Since the value of $\mu$ is not available, we don’t know whether the calculated interval is one of the “good” 95% or the “bad” 5%. Even with the moderately large sample size, our interval is rather wide. This is a consequence of the substantial amount of sample variability in MOR values.
A lower 95% confidence bound would result from retaining only the lower confidence limit (the one with $-$) and replacing 2.045 with $t_{.05,29} = 1.699$.
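The interval in Example 7.11 can be reproduced directly from the raw observations. The sketch below is illustrative rather than definitive: the function name is hypothetical, and because Python's standard library has no $t$ quantile function, the critical value 2.045 is passed in by hand as it would be read from a $t$ table.

```python
from math import sqrt
from statistics import mean, stdev

# Modulus of rupture (psi) observations from Example 7.11.
mor = [
    6807.99, 7637.06, 6663.28, 6165.03, 6991.41, 6992.23,
    6981.46, 7569.75, 7437.88, 6872.39, 7663.18, 6032.28,
    6906.04, 6617.17, 6984.12, 7093.71, 7659.50, 7378.61,
    7295.54, 6702.76, 7440.17, 8053.26, 8284.75, 7347.95,
    7422.69, 7886.87, 6316.67, 7713.65, 7503.33, 7674.99,
]


def t_interval(data, t_crit):
    """One-sample t CI: xbar +/- t_crit * s / sqrt(n).

    t_crit must be the critical value for n - 1 df, supplied by the caller.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    half = t_crit * s / sqrt(n)
    return xbar - half, xbar + half


T_025_29 = 2.045  # table value for a 95% two-sided interval with 29 df
ci_lo, ci_hi = t_interval(mor, T_025_29)
print(f"n = {len(mor)}, xbar = {mean(mor):.3f}, s = {stdev(mor):.4f}")
print(f"95% CI: ({ci_lo:.3f}, {ci_hi:.3f})")
```

Keeping the critical value as an explicit argument makes the dependence on both the confidence level and the degrees of freedom visible; a statistics package with a $t$ quantile function could supply it instead.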
Unfortunately, it is not easy to select $n$ to control the width of the $t$ interval. This is because the width involves the unknown (before the data is collected) $s$ and because $n$ enters not only through $1/\sqrt{n}$ but also through $t_{\alpha/2,n-1}$. As a result, an appropriate $n$ can be obtained only by trial and error.
In Chapter 15, we will discuss a small-sample CI for $\mu$ that is valid provided only that the population distribution is symmetric, a weaker assumption than normality. However, when the population distribution is normal, the $t$ interval tends to be narrower than would be any other interval with the same confidence level.
A Prediction Interval for a Single Future Value
In many applications, the objective is to predict a single value of a variable to be observed at some future time, rather than to estimate the mean value of that variable.
EXAMPLE 7.12 Consider the following sample of fat content (in percentage) of randomly selected hot dogs (“Sensory and Mechanical Assessment of the Quality of Frankfurters,” J. of Texture Studies, 1990: 395-409):
Assuming that these were selected from a normal population distribution, a CI for (interval estimate of) the population mean fat content is
Suppose, however, you are going to eat a single hot dog of this type and want a prediction for the resulting fat content. A point prediction, analogous to a point estimate, is just . This prediction unfortunately gives no information about reliability or precision.
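The general setup is as follows: We have available a random sample $X_1, X_2, \ldots, X_n$ from a normal population distribution, and wish to predict the value of $X_{n+1}$, a single future observation (e.g., the lifetime of a single lightbulb to be purchased or the fuel efficiency of a single vehicle to be rented). A point predictor is $\bar X$, and the resulting prediction error is $\bar X - X_{n+1}$. The expected value of the prediction error is
$$E(\bar X - X_{n+1}) = E(\bar X) - E(X_{n+1}) = \mu - \mu = 0$$
Since $X_{n+1}$ is independent of $X_1, \ldots, X_n$, it is independent of $\bar X$, so the variance of the prediction error is
$$V(\bar X - X_{n+1}) = V(\bar X) + V(X_{n+1}) = \frac{\sigma^2}{n} + \sigma^2 = \sigma^2\left(1 + \frac{1}{n}\right)$$
The prediction error is normally distributed because it is a linear combination of independent, normally distributed rv’s. Thus
$$Z = \frac{(\bar X - X_{n+1}) - 0}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}} = \frac{\bar X - X_{n+1}}{\sigma\sqrt{1 + \frac{1}{n}}}$$
has a standard normal distribution. It can be shown that replacing $\sigma$ by the sample standard deviation $S$ (of $X_1, \ldots, X_n$) results in
$$T = \frac{\bar X - X_{n+1}}{S\sqrt{1 + \frac{1}{n}}} \sim t \text{ distribution with } n - 1 \text{ df}$$
Manipulating this $T$ variable as $T = (\bar X - \mu)/(S/\sqrt{n})$ was manipulated in the development of a CI gives the following result.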
PROPOSITION
A prediction interval (PI) for a single observation to be selected from a normal population distribution is
$$\bar x \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}} \qquad (7.16)$$
The prediction level is $100(1-\alpha)\%$. A lower prediction bound results from replacing $t_{\alpha/2}$ by $t_\alpha$ and discarding the $+$ part of (7.16); a similar modification gives an upper prediction bound.
The interpretation of a $100(1-\alpha)\%$ prediction level is similar to that of a confidence level. If the interval (7.16) is calculated for sample after sample and after each calculation $X_{n+1}$ is observed, in the long run $100(1-\alpha)\%$ of these intervals will include the corresponding future values.
EXAMPLE 7.13
(Example 7.12 continued)
With , and , a PI for the fat content of a single hot dog is
This interval is quite wide, indicating substantial uncertainty about fat content. Notice that the width of the PI is more than three times that of the CI.
The error of prediction is $\bar X - X_{n+1}$, a difference between two random variables, whereas the estimation error is $\bar X - \mu$, the difference between a random variable and a fixed (but unknown) value. The PI is wider than the CI because there is more variability in the prediction error (due to $X_{n+1}$) than in the estimation error. In fact, as $n$ gets arbitrarily large, the CI shrinks to the single value $\mu$, and the PI approaches $\mu \pm z_{\alpha/2} \cdot \sigma$. There is uncertainty about a single $X$ value even when there is no need to estimate.
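The contrast between the two intervals is easy to see numerically. The following sketch is an illustration under stated assumptions: the function names are hypothetical, and it reuses the modulus of rupture summary from Example 7.11 ($n = 30$, $\bar x = 7203.191$, $s = 543.54$, table value $t_{.025,29} = 2.045$) as input rather than the hot dog data.

```python
from math import sqrt


def t_conf_interval(xbar, s, n, t_crit):
    """CI for the mean: xbar +/- t_crit * s / sqrt(n)."""
    half = t_crit * s / sqrt(n)
    return xbar - half, xbar + half


def prediction_interval(xbar, s, n, t_crit):
    """PI for one future observation: xbar +/- t_crit * s * sqrt(1 + 1/n)."""
    half = t_crit * s * sqrt(1 + 1 / n)
    return xbar - half, xbar + half


# Summary statistics taken from Example 7.11 (assumed inputs for this sketch).
n, xbar, s, t_crit = 30, 7203.191, 543.54, 2.045

ci = t_conf_interval(xbar, s, n, t_crit)
pi = prediction_interval(xbar, s, n, t_crit)
ratio = (pi[1] - pi[0]) / (ci[1] - ci[0])
print("CI:", ci)
print("PI:", pi)
print("PI width / CI width:", ratio)
```

For fixed $s$ and critical value, the ratio of the two widths is $\sqrt{n + 1}$, so the gap between the prediction interval and the confidence interval grows with the sample size instead of disappearing.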
Tolerance Intervals
Consider a population of automobiles of a certain type, and suppose that under specified conditions, fuel efficiency (mpg) has a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then since the interval from -1.645 to 1.645 captures 90% of the area under the $z$ curve, 90% of all these automobiles will have fuel efficiency values between $\mu - 1.645\sigma$ and $\mu + 1.645\sigma$. But what if the values of $\mu$ and $\sigma$ are not known? We can take a sample of size $n$, determine the fuel efficiencies, $\bar x$ and $s$, and form the interval whose lower limit is $\bar x - 1.645s$ and whose upper limit is $\bar x + 1.645s$. However, because of sampling variability in the estimates of $\mu$ and $\sigma$, there is a good chance that the resulting interval will include less than 90% of the population values. Intuitively, to have an a priori 95% chance of the resulting interval including at least 90% of the population values, when $\bar x$ and $s$ are used in place of $\mu$ and $\sigma$ we should also replace 1.645 by some larger number. For example, when $n = 20$, the value 2.310 is such that we can be 95% confident that the interval $\bar x \pm 2.310s$ will include at least 90% of the fuel efficiency values in the population.
Let $k$ be a number between 0 and 100. A tolerance interval for capturing at least $k\%$ of the values in a normal population distribution with a confidence level 95% has the form
$$\bar x \pm (\text{tolerance critical value}) \cdot s$$
Tolerance critical values for $k = 90$, 95, and 99 in combination with various sample sizes are given in Appendix Table A.6. This table also includes critical values for a confidence level of 99% (these values are larger than the corresponding 95% values). Replacing $\pm$ by $+$ gives an upper tolerance bound, and using $-$ in place of $\pm$ results in a lower tolerance bound. Critical values for obtaining these one-sided bounds also appear in Appendix Table A.6.
EXAMPLE 7.14 As part of a larger project to study the behavior of stressed-skin panels, a structural component being used extensively in North America, the article “Time-Dependent Bending Properties of Lumber” (J. of Testing and Eval., 1996: 187-193) reported on various mechanical properties of Scotch pine lumber specimens. Consider the following observations on modulus of elasticity obtained 1 minute after loading in a certain configuration:
There is a pronounced linear pattern in a normal probability plot of the data. Relevant summary quantities are $n = 16$, $\bar x = 14{,}532.5$, $s = 2055.67$. For a confidence level of 95%, a two-sided tolerance interval for capturing at least 95% of the modulus of elasticity values for specimens of lumber in the population sampled uses the tolerance critical value of 2.903. The resulting interval is
$$14{,}532.5 \pm (2.903)(2055.67) = 14{,}532.5 \pm 5967.6 = (8{,}564.9,\ 20{,}500.1)$$
We can be highly confident that at least 95% of all lumber specimens have modulus of elasticity values between 8,564.9 and 20,500.1.
The 95% CI for $\mu$ is (13,437.3, 15,627.7), and the 95% prediction interval for the modulus of elasticity of a single lumber specimen is (10,017.0, 19,048.0). Both the prediction interval and the tolerance interval are substantially wider than the confidence interval.
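A tolerance interval has the same "center plus or minus multiplier times $s$" shape as the other intervals in this section; only the multiplier changes. The sketch below is a minimal illustration with a hypothetical function name. The summary values ($\bar x = 14{,}532.5$, $s = 2055.67$) are assumptions chosen to be consistent with the interval reported in Example 7.14, and the tolerance critical value 2.903 is the table value quoted there.

```python
def tolerance_interval(xbar, s, k_crit):
    """Two-sided normal tolerance interval: xbar +/- k_crit * s.

    k_crit is a tolerance critical value read from a table; it depends on
    the sample size, the percentage to be captured, and the confidence level.
    """
    half = k_crit * s
    return xbar - half, xbar + half


# Assumed summary values consistent with Example 7.14.
xbar, s = 14532.5, 2055.67
K_CRIT = 2.903  # tolerance critical value quoted in the example

tol_lo, tol_hi = tolerance_interval(xbar, s, K_CRIT)
print(f"tolerance interval: ({tol_lo:.1f}, {tol_hi:.1f})")
```

Unlike the confidence and prediction intervals, the half-width here does not involve $1/\sqrt{n}$ explicitly; the sample size enters only through the tabulated critical value.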
Intervals Based on Nonnormal Population Distributions
The one-sample $t$ CI for $\mu$ is robust to small or even moderate departures from normality unless $n$ is quite small. By this we mean that if a critical value for 95% confidence, for example, is used in calculating the interval, the actual confidence level will be reasonably close to the nominal 95% level. If, however, $n$ is small and the population distribution is highly nonnormal, then the actual confidence level may be considerably different from the one you think you are using when you obtain a particular critical value from the $t$ table. It would certainly be distressing to believe that your confidence level is about 95% when in fact it was really substantially lower! The bootstrap technique, introduced in Section 7.1, has been found to be quite successful at estimating parameters in a wide variety of nonnormal situations.
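The gap between nominal and actual confidence level can be explored by simulation. The sketch below is an illustration under assumptions not taken from the text: the population is exponential with mean 1 (a strongly skewed choice), the sample size is $n = 5$, and 2.776 is the $t$ critical value for 4 df at the nominal 95% level. The function name `t_interval_coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def t_interval_coverage(n=5, t_crit=2.776, reps=20000, seed=7):
    """Estimate how often xbar +/- t_crit*s/sqrt(n) covers the true mean
    when sampling from an exponential population with mean 1."""
    rng = random.Random(seed)
    true_mean = 1.0
    covered = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        half_width = t_crit * s / sqrt(n)
        if xbar - half_width <= true_mean <= xbar + half_width:
            covered += 1
    return covered / reps

coverage = t_interval_coverage()
print(f"nominal level 0.95, estimated actual coverage: {coverage:.3f}")
```

For this skewed population and small $n$, the estimated coverage falls noticeably short of the nominal 95%; rerunning with a larger `n` (and the matching $t$ critical value) should bring it closer.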
In contrast to the confidence interval, the validity of the prediction and tolerance intervals described in this section is closely tied to the normality assumption. These latter intervals should not be used in the absence of compelling evidence for normality. The excellent reference Statistical Intervals, cited in the bibliography at the end of this chapter, discusses alternative procedures of this sort for various other situations.
EXERCISES Section 7.3 (28-41)
- Determine the values of the following quantities:
a. b. c. d. e.
- Determine the $t$ critical value(s) that will capture the desired $t$-curve area in each of the following cases:
a. Central area
b. Central area
c. Central area
d. Central area
e. Upper-tail area
f. Lower-tail area
- Determine the $t$ critical value for a two-sided confidence interval in each of the following situations:
a. Confidence level
b. Confidence level
c. Confidence level
d. Confidence level
e. Confidence level
f. Confidence level
- Determine the $t$ critical value for a lower or an upper confidence bound for each of the situations described in Exercise 30.
- According to the article “Fatigue Testing of Condoms” (Polymer Testing, 2009: 567-571), “tests currently used for condoms are surrogates for the challenges they face in use,” including a test for holes, an inflation test, a package seal test, and tests of dimensions and lubricant quality (all fertile territory for the use of statistical methodology!). The investigators developed a new test that adds cyclic strain to a level well below breakage and determines the number of cycles to break. A sample of 20 condoms of one particular type resulted in a sample mean number of 1584 and a sample standard deviation of 607. Calculate and interpret a confidence interval at the confidence level for the true average number of cycles to break. [Note: The article presented the results of hypothesis tests based on the $t$ distribution; the validity of these depends on assuming normal population distributions.]
- The article “Measuring and Understanding the Aging of Kraft Insulating Paper in Power Transformers” (IEEE Electrical Insul. Mag., 1996: 28-34) contained the following observations on degree of polymerization for paper specimens for which viscosity times concentration fell in a certain middle range: 454 463 465
a. Construct a boxplot of the data and comment on any interesting features.
b. Is it plausible that the given sample observations were selected from a normal distribution?
c. Calculate a two-sided confidence interval for true average degree of polymerization (as did the authors of the article). Does the interval suggest that 440 is a plausible value for true average degree of polymerization? What about 450?
- A sample of 14 joint specimens of a particular type gave a sample mean proportional limit stress of and a sample standard deviation of (“Characterization of Bearing Strength Factors in Pegged Timber Connections,” J. of Structural Engr., 1997: 326-332).
a. Calculate and interpret a lower confidence bound for the true average proportional limit stress of all such joints. What, if any, assumptions did you make about the distribution of proportional limit stress?
b. Calculate and interpret a lower prediction bound for the proportional limit stress of a single joint of this type.
- Silicone implant augmentation rhinoplasty is used to correct congenital nose deformities. The success of the procedure depends on various biomechanical properties of the human nasal periosteum and fascia. The article “Biomechanics in Augmentation Rhinoplasty” (J. of Med. Engr. and Tech., 2005: 14-17) reported that for a sample of 15 (newly deceased) adults, the mean failure strain (%) was 25.0, and the standard deviation was 3.5.
a. Assuming a normal distribution for failure strain, estimate true average strain in a way that conveys information about precision and reliability.
b. Predict the strain for a single adult in a way that conveys information about precision and reliability. How does the prediction compare to the estimate calculated in part (a)?
- A normal probability plot of the observations on escape time given in Exercise 36 of Chapter 1 shows a substantial linear pattern; the sample mean and sample standard deviation are 370.69 and 24.36, respectively.
a. Calculate an upper confidence bound for population mean escape time using a confidence level of .
b. Calculate an upper prediction bound for the escape time of a single additional worker using a prediction level of . How does this bound compare with the confidence bound of part (a)?
c. Suppose that two additional workers will be chosen to participate in the simulated escape exercise. Denote their escape times by and , and let denote the average of these two values. Modify the formula for a PI for a single value to obtain a PI for , and calculate a two-sided interval based on the given escape data.
- A study of the ability of individuals to walk in a straight line (“Can We Really Walk Straight?” Amer. J. of Physical Anthro., 1992: 19-27) reported the accompanying data on cadence (strides per second) for a sample of $n = 20$ randomly selected healthy men.
A normal probability plot gives substantial support to the assumption that the population distribution of cadence is approximately normal. A descriptive summary of the data from Minitab follows:
Variable | N | Mean | Median | TrMean | StDev | SEMean |
---|---|---|---|---|---|---|
cadence | 20 | 0.9255 | 0.9300 | 0.9261 | 0.0809 | 0.0181 |

Variable | Min | Max | Q1 | Q3 |
---|---|---|---|---|
cadence | 0.7800 | 1.0600 | 0.8525 | 0.9600 |
a. Calculate and interpret a confidence interval for population mean cadence.
b. Calculate and interpret a prediction interval for the cadence of a single individual randomly selected from this population.
c. Calculate an interval that includes at least of the cadences in the population distribution using a confidence level of .
- Ultra high performance concrete (UHPC) is a relatively new construction material that is characterized by strong adhesive properties with other materials. The article “Adhesive Power of Ultra High Performance Concrete from a Thermodynamic Point of View” (J. of Materials in Civil Engr., 2012: 1050-1058) described an investigation of the intermolecular forces for UHPC connected to various substrates. The following work of adhesion measurements (in ) for UHPC specimens adhered to steel appeared in the article:
107.1 109.5 107.4 106.8 1
a. Is it plausible that the given sample observations were selected from a normal distribution?
b. Calculate a two-sided confidence interval for the true average work of adhesion for UHPC adhered to steel. Does the interval suggest that 107 is a plausible value for the true average work of adhesion for UHPC adhered to steel? What about 110?
c. Predict the resulting work of adhesion value resulting from a single future replication of the experiment by calculating a prediction interval, and compare the width of this interval to the width of the CI from (b).
d. Calculate an interval for which you can have a high degree of confidence that at least of all UHPC specimens adhered to steel will have work of adhesion values between the limits of the interval.
- Exercise 72 of Chapter 1 gave the following observations on a receptor binding measure (adjusted distribution volume) for a sample of 13 healthy individuals: 23, 39, .
a. Is it plausible that the population distribution from which this sample was selected is normal?
b. Calculate an interval for which you can be confident that at least of all healthy individuals in the population have adjusted distribution volumes lying between the limits of the interval.
c. Predict the adjusted distribution volume of a single healthy individual by calculating a prediction interval. How does this interval’s width compare to the width of the interval calculated in part (b)?
- Exercise 13 of Chapter 1 presented a sample of observations on ultimate tensile strength, and Exercise 17 of the previous section gave summary quantities and requested a large-sample confidence interval. Because the sample size is large, no assumptions about the population distribution are required for the validity of the CI.
a. Is any assumption about the tensile-strength distribution required prior to calculating a lower prediction bound for the tensile strength of the next specimen selected using the method described in this section? Explain.
b. Use a statistical software package to investigate the plausibility of a normal population distribution.
c. Calculate a lower prediction bound with a prediction level of for the ultimate tensile strength of the next specimen selected.
- A more extensive tabulation of $t$ critical values than what appears in this book shows that for the $t$ distribution with 20 df, the areas to the right of the values .687, .860, and 1.064 are .25, .20, and .15, respectively. What is the confidence level for each of the following three confidence intervals for the mean $\mu$ of a normal population distribution? Which of the three intervals would you recommend be used, and why?
a.
b.
c.
7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population
Although inferences concerning a population variance $\sigma^2$ or standard deviation $\sigma$ are usually of less interest than those about a mean or proportion, there are occasions when such procedures are needed. In the case of a normal population distribution, inferences are based on the following result concerning the sample variance $S^2$.
THEOREM
Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the rv
$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$
has a chi-squared ($\chi^2$) probability distribution with $n - 1$ df.
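A quick simulation offers a sanity check on the theorem. The sketch below is illustrative only: the values $n = 10$, $\mu = 50$, and $\sigma = 4$ are arbitrary assumptions, and the function name `simulate_scaled_variance` is hypothetical. It relies on the fact that a chi-squared rv with $\nu$ df has mean $\nu$ and variance $2\nu$, so the simulated values of $(n-1)S^2/\sigma^2$ should have mean near 9 and variance near 18.

```python
import random
from statistics import mean, variance

def simulate_scaled_variance(n=10, mu=50.0, sigma=4.0, reps=20000, seed=3):
    """Draw `reps` normal samples of size n and return the value of
    (n - 1) * S^2 / sigma^2 computed from each sample."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append((n - 1) * variance(sample) / sigma**2)
    return values

values = simulate_scaled_variance()
# A chi-squared rv with nu df has mean nu and variance 2*nu; here nu = n - 1 = 9.
print(f"simulated mean = {mean(values):.2f}, simulated variance = {variance(values):.2f}")
```

Changing `mu` or `sigma` should leave both summaries essentially unchanged, which reflects the point that the distribution of this rv does not depend on the population parameters.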
As discussed in Sections 4.4 and 7.1, the chi-squared distribution is a continuous probability distribution with a single parameter $\nu$, called the number of degrees of freedom, with possible values 1, 2, 3, .... The graphs of several $\chi^2$ probability density functions (pdf’s) are illustrated in Figure 7.10. Each pdf $f(x; \nu)$ is positive only for $x > 0$, and each has a positive skew (stretched out upper tail), though the distribution moves rightward and becomes more symmetric as $\nu$ increases. To specify inferential procedures that use the chi-squared distribution, we need notation analogous to that for a $t$ critical value $t_{\alpha,\nu}$.
Figure 7.10 Graphs of chi-squared density functions
NOTATION Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies to the right of $\chi^2_{\alpha,\nu}$.
Symmetry of $t$ distributions made it necessary to tabulate only upper-tailed $t$ critical values ($t_{\alpha,\nu}$ for small values of $\alpha$). The chi-squared distribution is not symmetric, so Appendix Table A.7 contains values of $\chi^2_{\alpha,\nu}$ both for $\alpha$ near 0 and near 1, as illustrated in Figure 7.11(b). For example, $\chi^2_{.025,14} = 26.119$, and $\chi^2_{.95,20}$ (the 5th percentile) $= 10.851$.
Figure 7.11 $\chi^2_{\alpha,\nu}$ notation illustrated
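The rv $(n-1)S^2/\sigma^2$ satisfies the two properties on which the general method for obtaining a CI is based: It is a function of the parameter of interest $\sigma^2$, yet its probability distribution (chi-squared) does not depend on this parameter. The area under a chi-squared curve with $\nu$ df to the right of $\chi^2_{\alpha/2,\nu}$ is $\alpha/2$, as is the area to the left of $\chi^2_{1-\alpha/2,\nu}$. Thus the area captured between these two critical values is $1 - \alpha$. As a consequence of this and the theorem just stated,
$$P\left(\chi^2_{1-\alpha/2,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha \qquad (7.17)$$
The inequalities in (7.17) are equivalent to
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
Substituting the computed value $s^2$ into the limits gives a CI for $\sigma^2$, and taking square roots gives an interval for $\sigma$.
A $100(1-\alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal population has lower limit
$$(n-1)s^2 / \chi^2_{\alpha/2,\,n-1}$$
and upper limit
$$(n-1)s^2 / \chi^2_{1-\alpha/2,\,n-1}$$
A confidence interval for $\sigma$ has lower and upper limits that are the square roots of the corresponding limits in the interval for $\sigma^2$. An upper or a lower confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit of the CI.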
EXAMPLE 7.15 The accompanying data on breakdown voltage of electrically stressed circuits was read from a normal probability plot that appeared in the article “Damage of Flexible Printed Wiring Boards Associated with Lightning-Induced Voltage Surges” (IEEE Transactions on Components, Hybrids, and Manuf. Tech., 1985: 214-220).
The straightness of the plot gave strong support to the assumption that breakdown voltage is approximately normally distributed.
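Let $\sigma^2$ denote the variance of the breakdown voltage distribution. The computed value of the sample variance is $s^2 = 137{,}324.3$, the point estimate of $\sigma^2$. With df $= n - 1 = 16$, a 95% CI requires $\chi^2_{.975,16} = 6.908$ and $\chi^2_{.025,16} = 28.845$. The interval is
$$\left(\frac{16(137{,}324.3)}{28.845},\ \frac{16(137{,}324.3)}{6.908}\right) = (76{,}172.3,\ 318{,}064.4)$$
Taking the square root of each endpoint yields (276.0, 564.0) as the 95% CI for $\sigma$. These intervals are quite wide, reflecting substantial variability in breakdown voltage in combination with a small sample size.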
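The arithmetic of this example can be sketched in a few lines. The inputs are assumptions chosen to be consistent with the reported interval (276.0, 564.0) for the standard deviation: $n = 17$, $s^2 = 137{,}324.3$, and chi-squared critical values 28.845 and 6.908 for 16 df at the 95% level, as would be read from a chi-squared table. The function name `variance_ci` is hypothetical, and the critical values are passed in because the Python standard library has no chi-squared quantile function.

```python
from math import sqrt

def variance_ci(n, s2, chi2_upper, chi2_lower):
    """CI for a normal variance: ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower).

    chi2_upper is the critical value with area alpha/2 to its right;
    chi2_lower is the critical value with area 1 - alpha/2 to its right.
    """
    df = n - 1
    return (df * s2 / chi2_upper, df * s2 / chi2_lower)

# Assumed inputs for Example 7.15 (see the paragraph above).
var_lo, var_hi = variance_ci(n=17, s2=137324.3, chi2_upper=28.845, chi2_lower=6.908)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)

print(f"CI for the variance:           ({var_lo:,.1f}, {var_hi:,.1f})")
print(f"CI for the standard deviation: ({sd_lo:.1f}, {sd_hi:.1f})")
```

Note that the larger critical value produces the lower limit, because it sits in the denominator; the interval is also not symmetric about $s^2$, a consequence of the skewness of the chi-squared distribution.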
CIs for $\sigma^2$ and $\sigma$ when the population distribution is not normal can be difficult to obtain. For such cases, consult a knowledgeable statistician.
EXERCISES Section 7.4 (42-46)
- Determine the values of the following quantities:
a. b.
c. d.
e. f.
- Determine the following:
a. The 95th percentile of the chi-squared distribution with
b. The 5th percentile of the chi-squared distribution with
c. , where is a chi-squared rv with
d. or , where is a chi-squared rv with
- The amount of lateral expansion (mils) was determined for a sample of pulsed-power gas metal arc welds used in LNG ship containment tanks. The resulting sample standard deviation was mils. Assuming normality, derive a for and for .
- Wire electrical-discharge machining (WEDM) is a process used to manufacture conductive hard metal components. It uses a continuously moving wire that serves as an electrode. Coating on the wire electrode allows for cooling of the wire electrode core and provides an improved cutting performance. The article “High-Performance Wire Electrodes for Wire Electrical-Discharge Machining-A Review” (J. of Engr. Manuf., 2012: 1757-1773) gave the following sample observations on total coating layer thickness (in ) of eight wire electrodes used for WEDM:
Calculate a for the standard deviation of the coating layer thickness distribution. Is this interval valid whatever the nature of the distribution? Explain.
- The article “Concrete Pressure on Formwork” (Mag. of Concrete Res., 2009: 407-417) gave the following observations on maximum concrete pressure :
a. Is it plausible that this sample was selected from a normal population distribution?
b. Calculate an upper confidence bound with confidence level for the population standard deviation of maximum pressure.
- Example 1.11 introduced the accompanying observations on bond strength.
11.5 | 12.1 | 9.9 | 9.3 | 7.8 | 6.2 | 6.6 | 7.0 |
13.4 | 17.1 | 9.3 | 5.6 | 5.7 | 5.4 | 5.2 | 5.1 |
4.9 | 10.7 | 15.2 | 8.5 | 4.2 | 4.0 | 3.9 | 3.8 |
3.6 | 3.4 | 20.6 | 25.5 | 13.8 | 12.6 | 13.1 | 8.9 |
8.2 | 10.7 | 14.2 | 7.6 | 5.2 | 5.5 | 5.1 | 5.0 |
5.2 | 4.8 | 4.1 | 3.8 | 3.7 | 3.6 | 3.6 | 3.6 |
a. Estimate true average bond strength in a way that conveys information about precision and reliability. [Hint: and .]
b. Calculate a for the proportion of all such bonds whose strength values would exceed 10 .
- The article “Distributions of Compressive Strength Obtained from Various Diameter Cores” (ACI Materials J., 2012: 597-606) described a study in which compressive strengths were determined for concrete specimens of various types, core diameters, and length-to-diameter ratios. For one particular type, diameter, and ratio, the 18 tested specimens resulted in a sample mean compressive strength of and a sample standard deviation of . Normality of the compressive strength distribution was judged to be quite plausible.
a. Calculate a confidence interval with confidence level for the true average compressive strength under these circumstances.
b. Calculate a lower prediction bound for the compressive strength of a single future specimen tested under the given circumstances. [Hint: 2.224.]
- For those of you who don’t already know, dragon boat racing is a competitive water sport that involves 20 paddlers propelling a boat across various race distances. It has become increasingly popular over the last few years. The article “Physiological and Physical Characteristics of Elite Dragon Boat Paddlers” (J. of Strength and Conditioning, 2013: 137-145) summarized an extensive statistical analysis of data obtained from a sample of 11 paddlers. It reported that a confidence interval for true average force during a simulated 200-m race was . Obtain a prediction interval for the force of a single randomly selected dragon boat paddler undergoing the simulated race.
- A journal article reports that a sample of size 5 was used as a basis for calculating a for the true average natural frequency of delaminated beams of a certain type. The resulting interval was (229.764, 233.504). You decide that a confidence level of is more appropriate than the level used. What are the limits of the interval? [Hint: Use the center of the interval and its width to determine $\bar{x}$ and $s$.]
- Unexplained respiratory symptoms reported by athletes are often incorrectly considered secondary to exercise-induced asthma. The article “High Prevalence of Exercise-Induced Laryngeal Obstruction in Athletes” (Medicine and Science in Sports and Exercise, 2013: 2030-2035) suggested that many such cases could instead be explained by obstruction of the larynx. In a sample of 88 athletes referred for an asthma workup, 31 were found to have the EILO condition.
a. Calculate and interpret a confidence interval using a confidence level for the true proportion of all athletes found to have the EILO condition under these circumstances.
b. What sample size is required if the desired width of the is to be at most .04, irrespective of the sample results?
c. Does the upper limit of the interval in (a) specify a upper confidence bound for the proportion being estimated? Explain.
- High concentration of the toxic element arsenic is all too common in groundwater. The article “Evaluation of Treatment Systems for the Removal of Arsenic from Groundwater” (Practice Periodical of Hazardous, Toxic, and Radioactive Waste Mgmt., 2005: 152-157) reported that for a sample of water specimens selected for treatment by coagulation, the sample mean arsenic concentration was , and the sample standard deviation was 4.1. The authors of the cited article used $t$-based methods to analyze their data, so hopefully had reason to believe that the distribution of arsenic concentration was normal.
a. Calculate and interpret a for true average arsenic concentration in all such water specimens.
b. Calculate a upper confidence bound for the standard deviation of the arsenic concentration distribution.
c. Predict the arsenic concentration for a single water specimen in a way that conveys information about precision and reliability.
- Aphid infestation of fruit trees can be controlled either by spraying with pesticide or by inundation with ladybugs. In a particular area, four different groves of fruit trees are selected for experimentation. The first three groves are sprayed with pesticides 1, 2, and 3, respectively, and the fourth is treated with ladybugs, with the following results on yield:
Treatment | ${n}_{i} =$ Number of Trees | ${\bar{x}}_{i}$ (Bushels/Tree) | ${s}_{i}$ |
---|---|---|---|
1 | 100 | 10.5 | 1.5 |
2 | 90 | 10.0 | 1.3 |
3 | 100 | 10.1 | 1.8 |
4 | 120 | 10.7 | 1.6 |
Let $\mu_i =$ the true average yield (bushels/tree) after receiving the $i$th treatment. Then
$$\theta = \tfrac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4$$
measures the difference in true average yields between treatment with pesticides and treatment with ladybugs. When $n_1$, $n_2$, $n_3$, and $n_4$ are all large, the estimator $\hat{\theta}$ obtained by replacing each $\mu_i$ by $\bar{X}_i$ is approximately normal. Use this to derive a large-sample $100(1-\alpha)\%$ CI for $\theta$, and compute the interval for the given data.
- It is important that face masks used by firefighters be able to withstand high temperatures because firefighters commonly work in temperatures of . In a test of one type of mask, 11 of 55 masks had lenses pop out at . Construct a upper confidence bound for the true proportion of masks of this type whose lenses would pop out at .
- A manufacturer of college textbooks is interested in estimating the strength of the bindings produced by a particular binding machine. Strength can be measured by recording the force required to pull the pages from the binding. If this force is measured in pounds, how many books should be tested to estimate the average force required to break the binding to within with confidence? Assume that $\sigma$ is known to be .8.
- The accompanying data on crack initiation depth was read from a lognormal probability plot that appeared in the article “Incorporating Small Fatigue Crack Growth in Probabilistic Life Prediction: Effect of Stress Ratio in Ti-6Al-2Sn-6Mo” (Intl. J. of Fatigue, 2013: 83-95). Although the pattern in the plot was quite straight, a normal probability plot of the data also shows a reasonably linear pattern. And a boxplot indicates that the distribution is quite symmetric in the middle of the data and only mildly skewed overall. It is therefore reasonable to estimate and predict using intervals.
a. Estimate the true average crack initiation depth with a and interpret the resulting interval.
b. Predict the value of a single crack initiation depth by constructing a PI.
c. Interpret in context the meaning of in (b).
- In Example 6.8, we introduced the concept of a censored experiment in which $n$ components are put on test and the experiment terminates as soon as $r$ of the components have failed. Suppose component lifetimes are independent, each having an exponential distribution with parameter $\lambda$. Let $Y_1$ denote the time at which the first failure occurs, $Y_2$ the time at which the second failure occurs, and so on, so that $T_r = Y_1 + \cdots + Y_r + (n - r)Y_r$ is the total accumulated lifetime at termination. Then it can be shown that $2\lambda T_r$ has a chi-squared distribution with $2r$ df. Use this fact to develop a $100(1-\alpha)\%$ CI formula for true average lifetime $1/\lambda$. Compute a CI from the data in Example 6.8.
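- Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous probability distribution having median $\tilde{\mu}$ (so that $P(X_i \le \tilde{\mu}) = P(X_i \ge \tilde{\mu}) = .5$).
a. Show that
$$P(\min(X_i) < \tilde{\mu} < \max(X_i)) = 1 - \left(\tfrac{1}{2}\right)^{n-1}$$
so that $(\min(x_i), \max(x_i))$ is a $100(1-\alpha)\%$ confidence interval for $\tilde{\mu}$ with $\alpha = \left(\tfrac{1}{2}\right)^{n-1}$. [Hint: The complement of the event $\{\min(X_i) < \tilde{\mu} < \max(X_i)\}$ is $\{\max(X_i) \le \tilde{\mu}\} \cup \{\min(X_i) \ge \tilde{\mu}\}$. But $\max(X_i) \le \tilde{\mu}$ iff $X_i \le \tilde{\mu}$ for all $i$.]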
b. For each of six normal male infants, the amount of the amino acid alanine was determined while the infants were on an isoleucine-free diet, resulting in the following data:2.70
Compute a for the true median amount of alanine for infants on such a diet (“The Essential Amino Acid Requirements of Infants,” Amer. J. of Nutrition, 1964: 322-330).
c. Let $x_{(2)}$ denote the second smallest of the $x_i$’s and $x_{(n-1)}$ denote the second largest of the $x_i$’s. What is the confidence level of the interval $(x_{(2)}, x_{(n-1)})$ for $\tilde{\mu}$?
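- Let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform distribution on the interval $[0, \theta]$, so that
$$f(x) = \begin{cases} 1/\theta & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$
Then if $Y = \max(X_i)$, it can be shown that the rv $U = Y/\theta$ has density function
$$f_U(u) = \begin{cases} n u^{n-1} & 0 \le u \le 1 \\ 0 & \text{otherwise} \end{cases}$$
a. Use $f_U(u)$ to verify that
$$P\left((\alpha/2)^{1/n} < \frac{Y}{\theta} \le (1 - \alpha/2)^{1/n}\right) = 1 - \alpha$$
and use this to derive a $100(1-\alpha)\%$ CI for $\theta$.
b. Verify that $P(\alpha^{1/n} \le Y/\theta \le 1) = 1 - \alpha$, and derive a $100(1-\alpha)\%$ CI for $\theta$ based on this probability statement.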
c. Which of the two intervals derived previously is shorter? If my waiting time for a morning bus is uniformly distributed and observed waiting times are , and , derive a for by using the shorter of the two intervals.
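- Let $0 \le \gamma \le \alpha$. Then a $100(1-\alpha)\%$ CI for $\mu$ when $n$ is large is
$$\left(\bar{x} - z_{\gamma} \cdot \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha-\gamma} \cdot \frac{s}{\sqrt{n}}\right)$$
The choice $\gamma = \alpha/2$ yields the usual interval derived in Section 7.2; if $\gamma \ne \alpha/2$, this interval is not symmetric about $\bar{x}$. The width of this interval is $w = s(z_{\gamma} + z_{\alpha-\gamma})/\sqrt{n}$. Show that $w$ is minimized for the choice $\gamma = \alpha/2$, so that the symmetric interval is the shortest. [Hints: (a) By definition of $z_{\alpha}$, $\Phi(z_{\alpha}) = 1 - \alpha$, so that $z_{\alpha} = \Phi^{-1}(1 - \alpha)$; (b) the relationship between the derivative of a function $y = f(x)$ and the inverse function $x = f^{-1}(y)$ is $(d/dy)\, f^{-1}(y) = 1/f'(x)$.]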
- Suppose are observed values resulting from a random sample from a symmetric but possibly heavy-tailed distribution. Let and denote the sample median and fourth spread, respectively. Chapter 11 of Understanding Robust and Exploratory Data Analysis (see the bibliography in Chapter 6) suggests the following robust for the population mean (point of symmetry):
The value of the quantity in parentheses is 2.10 for for , and 1.91 for . Compute this for the data of Exercise 45, and compare to the appropriate for a normal population distribution.
- a. Use the results of Example 7.5 to obtain a lower confidence bound for the parameter $\lambda$ of an exponential distribution, and calculate the bound based on the data given in the example.
b. If lifetime $X$ has an exponential distribution, the probability that lifetime exceeds $t$ is $P(X > t) = e^{-\lambda t}$. Use the result of part (a) to obtain a lower confidence bound for the probability that breakdown time exceeds .
BIBLIOGRAPHY
DeGroot, Morris, and Mark Schervish, Probability and Statistics (4th ed.), Addison-Wesley, Upper Saddle River, . A very good exposition of the general principles of statistical inference.
Devore, Jay, and Kenneth Berk, Modern Mathematical Statistics with Applications, Springer, New York, 2012. The exposition is a bit more comprehensive and sophisticated than that of the current book, and includes more material on bootstrapping.
Hahn, Gerald, and William Meeker, Statistical Intervals, Wiley, New York, 1991. Almost everything you ever wanted to know about statistical intervals (confidence, prediction, tolerance, and others).