Content
- 4.6.1 Sample Percentiles
- 4.6.2 A Probability Plot
- 4.6.3 Beyond Normality
- EXERCISES Section 4.6 (87-97)
Introduction
An investigator will often have obtained a numerical sample and wish to know whether it is plausible that it came from a population distribution of some particular type (e.g., from a normal distribution). For one thing, many formal procedures from statistical inference are based on the assumption that the population distribution is of a specified type. The use of such a procedure is inappropriate if the actual underlying probability distribution differs greatly from the assumed type. For example, the article “Toothpaste Detergents: A Potential Source of Oral Soft Tissue Damage” (Intl. J. of Dental Hygiene, 2008: 193-198) contains the following statement: “Because the sample number for each experiment (replication) was limited to three wells per treatment type, the data were assumed to be normally distributed.” As justification for this leap of faith, the authors wrote that “Descriptive statistics showed standard deviations that suggested a normal distribution to be highly likely.” Note: This argument is not very persuasive, because a standard deviation describes spread rather than shape, and three observations are far too few to reveal the form of a distribution.
Additionally, understanding the underlying distribution can sometimes give insight into the physical mechanisms involved in generating the data. An effective way to check a distributional assumption is to construct what is called a probability plot. The essence of such a plot is that if the distribution on which the plot is based is correct, the points in the plot should fall close to a straight line. If the actual distribution is quite different from the one used to construct the plot, the points will likely depart substantially from a linear pattern.
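The straight-line idea can be illustrated with a minimal Python sketch for the normal case. It makes several assumptions that are not stated above: the ith smallest of n observations is paired with the standard normal percentile at probability (i - 0.5)/n, which is one common plotting convention among several; the helper names `normal_plot_points` and `linearity` are hypothetical; and the two samples are idealized values built from exact quantiles rather than real data. The correlation coefficient of the plotted pairs serves here only as a rough numerical summary of how straight the pattern is, not as a formal test.

```python
import math
from statistics import NormalDist


def normal_plot_points(sample):
    """Return (z percentile, observation) pairs for a normal probability plot."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Assumed convention: the i-th smallest value sits at probability (i - 0.5)/n.
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(zs, xs))


def linearity(points):
    """Pearson correlation of the plotted pairs; values near 1 suggest a straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    z_bar = sum(zs) / len(zs)
    x_bar = sum(xs) / len(xs)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in points)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_zx / math.sqrt(s_zz * s_xx)


# Idealized illustration data (not real measurements).
n = 30
probs = [(i - 0.5) / n for i in range(1, n + 1)]
normal_shaped = [10 + 2 * NormalDist().inv_cdf(p) for p in probs]  # normal quantiles
skewed = [-math.log(1 - p) for p in probs]  # exponential quantiles, right-skewed

r_normal = linearity(normal_plot_points(normal_shaped))
r_skewed = linearity(normal_plot_points(skewed))
print(f"normal-shaped sample: r = {r_normal:.3f}")
print(f"skewed sample:        r = {r_skewed:.3f}")
```

Plotting the pairs returned by `normal_plot_points` (z percentile on the horizontal axis, observation on the vertical axis) gives the picture described above: the normal-shaped sample traces a straight line, while the skewed sample bends away from one, which is reflected in its smaller correlation.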