
Checking Model Assumptions

Design and Analysis of Experiments

Part of the book series: Springer Texts in Statistics (STS)

Abstract

Every model contains underlying assumptions about its form and about the distribution of error variables. This chapter discusses methods of checking such assumptions for the one-way analysis of variance model, including checking the normality, constant variance, and independence of the errors. In this chapter, and throughout the book, the model assumption checks are made by examining residual plots. In the case of unequal variances, a transformation of the data is suggested, as well as methods of data analysis that incorporate unequal variances. The normality assumption is checked through construction of half-normal probability plots. A real experiment illustrates the techniques, and the use of SAS and R software is demonstrated.
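
The residual plots mentioned above start from fitted values and residuals. The following is a minimal sketch of that calculation for a one-way layout, assuming the residuals are standardized by dividing by the square root of ssE/(n - 1); that scaling, the function name, and the numbers are illustrative assumptions, not taken from this page.

```python
from math import sqrt
from statistics import mean


def standardized_residuals(groups):
    """Return (fitted values, standardized residuals) for a one-way layout.

    groups: dict mapping a treatment label to its list of observations.
    """
    n = sum(len(obs) for obs in groups.values())
    fitted, resid = [], []
    for obs in groups.values():
        ybar = mean(obs)              # fitted value shared by the whole treatment
        for y in obs:
            fitted.append(ybar)
            resid.append(y - ybar)
    ss_e = sum(e * e for e in resid)  # error sum of squares
    scale = sqrt(ss_e / (n - 1))      # assumed standardization (see lead-in)
    return fitted, [e / scale for e in resid]


# Placeholder numbers for illustration only (not data from the book).
demo = {"A": [10.0, 12.0, 11.0], "B": [20.0, 26.0, 23.0]}
fitted, z = standardized_residuals(demo)
```

Plotting `z` against `fitted`, against treatment labels, or against observation order would then give the kinds of residual plots the abstract refers to.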

Author information

Corresponding author

Correspondence to Angela Dean.

Exercises

  1. Meat cooking experiment, continued

    Check the assumptions on the one-way analysis of variance model (3.3.1) for the meat cooking experiment, which was introduced in Exercise 14 of Chap. 3. The data were given in Table 3.14 (the order of collection of observations is not available).

  2. Soap experiment, continued

    Check the assumptions on the one-way analysis of variance model (3.3.1) for the soap experiment, which was introduced in Sect. 2.5.1. The data are reproduced in Table 5.15 (the order of collection of observations is not available).

Table 5.15 Weight loss for the soap experiment
Table 5.16 Melting times for margarine in seconds
  3. Margarine experiment (Amy L. Phelps, 1987)

    The data in Table 5.16 are the melting times in seconds for three different brands of margarine (coded 1–3) and one brand of butter (coded 4). The butter was used for comparison purposes. The sizes and shapes of the initial margarine/butter pats were as similar as possible, and these were melted one by one in a clean frying pan over a constant heat.

    (a) Check the equal-variance assumption on model (3.3.1) for these data. If a transformation is required, choose the best transformation of the form (5.6.3), and recheck the assumptions.

    (b) Using the transformed data, compute a \(95\%\) confidence interval comparing the average melting times for the margarines with the average melting time for the butter.

    (c) Repeat part (b) using the untransformed data and Satterthwaite’s approximation for unequal variances. Compare the results with those of part (b).

    (d) For this set of data, which analysis do you prefer? Why?
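
As a rough sketch of the calculation behind part (c), the code below estimates a contrast without pooling the variances and computes Satterthwaite's approximate degrees of freedom. The function name and the numbers are hypothetical placeholders (not the values in Table 5.16), and the t critical value needed to finish the interval is left to statistical tables or software.

```python
from math import sqrt
from statistics import mean, variance


def satterthwaite_contrast(groups, coeffs):
    """Return (estimate, standard error, approximate df) for sum(c_i * ybar_i),
    using a separate sample variance for each treatment."""
    estimate = 0.0
    parts = []   # c_i^2 * s_i^2 / r_i for each treatment
    dfs = []     # r_i - 1 for each treatment
    for obs, c in zip(groups, coeffs):
        r = len(obs)
        estimate += c * mean(obs)
        parts.append(c * c * variance(obs) / r)
        dfs.append(r - 1)
    var_hat = sum(parts)
    # Satterthwaite: df = (sum of parts)^2 / sum(part_i^2 / (r_i - 1))
    df = var_hat ** 2 / sum(p * p / d for p, d in zip(parts, dfs))
    return estimate, sqrt(var_hat), df


# Placeholder melting times, brands 1-3 followed by butter (illustrative only).
demo = [
    [160.0, 170.0, 165.0],
    [175.0, 180.0, 170.0],
    [150.0, 160.0, 155.0],
    [100.0, 120.0, 110.0],
]
# Average of the three margarines minus the butter.
est, se, df = satterthwaite_contrast(demo, [1 / 3, 1 / 3, 1 / 3, -1])
```

The interval would then be `est` plus or minus the appropriate t quantile on `df` degrees of freedom times `se`; the degrees of freedom are generally not an integer.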

  4. Reaction time experiment, continued

    The reaction time pilot experiment was described in Exercise 4 of Chap. 4. The experimenters were interested in the different effects on the reaction time of the aural and visual cues and also in the different effects of the elapsed time between the cue and the stimulus. There were six treatment combinations:

    $$\begin{aligned} \begin{array}{lcl} 1 = \text {aural, 5 seconds} &{} &{} 4 = \text {visual, 5 seconds}\\ 2 = \text {aural, 10 seconds} &{}&{} 5 = \text {visual, 10 seconds}\\ 3 = \text {aural, 15 seconds} &{}&{} 6 = \text {visual, 15 seconds}\\ \end{array} \end{aligned}$$

    The data are reproduced, together with their order of observation, in Table 5.17. The pilot experiment employed a single subject. Of concern to the experimenters was the possibility that the subject may show signs of fatigue. Consequently, fixed rest periods were enforced between every pair of observations.

    (a) Check whether or not the assumptions on the one-way analysis of variance model (3.3.1) are approximately satisfied for these data. Pay particular attention to the experimenters’ concerns about fatigue.

    (b) Suggest a way to design the experiment using more than one subject. (Hint: consider using subjects as blocks in the experiment).
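
One way to look for fatigue in part (a) is to arrange the residuals by order of observation. The sketch below does that and also computes a least-squares slope of residual on order as a rough numerical companion to the plot; it is not a formal test, and the record layout and function names are assumptions made for illustration.

```python
from statistics import mean


def residuals_by_order(records):
    """records: list of (order, treatment, response) tuples.
    Returns (order, residual) pairs sorted by observation order."""
    by_trt = {}
    for _, trt, y in records:
        by_trt.setdefault(trt, []).append(y)
    means = {trt: mean(ys) for trt, ys in by_trt.items()}
    return sorted((order, y - means[trt]) for order, trt, y in records)


def trend_slope(pairs):
    """Least-squares slope of residual on order; a clearly nonzero slope
    hints at drift over time (for example, fatigue)."""
    xs = [x for x, _ in pairs]
    es = [e for _, e in pairs]
    xbar, ebar = mean(xs), mean(es)
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sum((x - xbar) * (e - ebar) for x, e in pairs) / sxx


# Placeholder records (order, treatment, response); not the data in Table 5.17.
demo = [(1, "a", 1.0), (2, "b", 2.0), (3, "a", 3.0), (4, "b", 4.0)]
pairs = residuals_by_order(demo)
slope = trend_slope(pairs)
```

A plot of the `pairs` remains the main diagnostic; the slope only summarizes a linear drift and would miss other patterns.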

Table 5.17 Reaction times (in seconds) for the reaction time experiment
Table 5.18 Production rates for the catalyst experiment
  5. Catalyst experiment

    H. Smith, in the 1969 volume of the Journal of Quality Technology, described an experiment that investigated the effect of four reagents and three catalysts on the production rate in a catalyst plant. He coded the reagents as A, B, C, and D, and the catalysts as X, Y, and Z, giving twelve treatment combinations, coded as \(AX,~AY, \ldots , DZ\). Two observations were taken on each treatment combination, and these are shown in Table 5.18, together with the order in which the observations were collected.

    Are the assumptions on the one-way analysis of variance model (3.3.1) approximately satisfied for these data? If not, can you suggest what needs to be done in order to be able to analyze the experiment?

  6. Bicycle experiment (Debra Schomer, 1987)

    The bicycle experiment was run to compare the crank rates required to keep a bicycle at certain speeds, when the bicycle was in twelfth gear on flat ground. The speeds chosen were 5, 10, 15, 20, and 25 mph (coded 1–5). The data are given in Table 5.19. The experimenter fitted the one-way analysis of variance model (3.3.1) and plotted the standardized residuals. She commented in her report:

    Note the larger spread of the data at lower speeds. This is due to the fact that in such a high gear, to maintain such a low speed consistently for a long period of time is not only bad for the bike, it is rather difficult to do.

    Thus the experimenter was not surprised to find a difference in the variances of the error variables at different levels of the treatment factor.

    (a) Plot the standardized residuals against \(\widehat{y}_{it}\), compare the sample variances, and evaluate equality of the error variances for the treatments.

    (b) Choose the best transformation of the data of the form (5.6.3), and test the hypotheses that the linear and quadratic trends in crank rates due to the different speeds are negligible, using an overall significance level of 0.01.

    (c) Repeat part (b), using the untransformed data and Satterthwaite’s approximation for unequal variances.

    (d) Discuss the relative merits of the methods applied in parts (b) and (c).
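
Because the five speeds are equally spaced, the linear and quadratic trends in parts (b) and (c) can be estimated with the standard orthogonal polynomial coefficients for five levels. The sketch below computes the two contrast estimates only; the function name and numbers are placeholders, and the tests themselves still need a variance estimate (pooled for transformed data, or Satterthwaite's approximation otherwise).

```python
from statistics import mean

# Orthogonal polynomial coefficients for five equally spaced levels.
LINEAR = [-2, -1, 0, 1, 2]
QUADRATIC = [2, -1, -2, -1, 2]


def contrast_estimate(groups, coeffs):
    """Least-squares estimate sum(c_i * ybar_i) of a treatment contrast."""
    return sum(c * mean(obs) for c, obs in zip(coeffs, groups))


# Placeholder responses for speed codes 1-5 (not the data in Table 5.19).
demo = [[1.0, 1.0], [4.0, 4.0], [9.0, 9.0], [16.0, 16.0], [25.0, 25.0]]
linear_hat = contrast_estimate(demo, LINEAR)
quadratic_hat = contrast_estimate(demo, QUADRATIC)
```

With two tests and an overall level of 0.01, one simple option (an assumption here, not stated in the exercise) is to run each test at level 0.005.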

Table 5.19 Data for the bicycle experiment
  7. Dessert experiment

    (P. Clingan, Y. Deng, M. Geil, J. Mesaros, and J. Whitmore, 1996)

    The experimenters were interested in whether the melting rate of a frozen orange dessert would be affected (and, in particular, slowed down) by the addition of salt and/or sugar. At this point, they were not interested in taste testing. Six treatments were selected, as follows:

    $$\begin{aligned} \begin{array}{lcl} 1 = \text {1/8 tsp salt, 1/4 cup sugar} &{}&{} 4 = \text {1/4 tsp salt, 1/4 cup sugar}\\ 2 = \text {1/8 tsp salt, 1/2 cup sugar} &{}&{} 5 = \text {1/4 tsp salt, 1/2 cup sugar}\\ 3 = \text {1/8 tsp salt, 3/4 cup sugar} &{}&{} 6 = \text {1/4 tsp salt, 3/4 cup sugar}\\ \end{array} \end{aligned}$$

    For each observation of each treatment, the required amount of sugar and salt was added to the contents of a 12-ounce can of frozen orange juice together with 3 cups of water. The orange juice mixes were frozen in ice cube trays and allocated to random positions in a freezer. After 48 hours, the cubes were removed from the freezer, placed on a half-inch mesh wire grid, and allowed to melt into a container in the laboratory (which was held at 24.4 \(^\circ \)C) for 30 minutes. The percentage melting (by weight) of the cubes is recorded in Table 5.20. The coded position on the table during melting is also recorded.

    (a) Plot the data. Does it appear that the treatments have different effects on the melting of the frozen orange dessert?

    (b) Check whether the assumptions on the one-way analysis of variance model (3.3.1) are satisfied for these data. Pay particular attention to the equal-variance assumption.

    (c) Use Satterthwaite’s method to compare the pairs of treatments, using individual 99% confidence intervals. If doing the computations by hand, compute only the confidence intervals corresponding to the three most disparate pairs of treatment sample means.

    (d) What conclusions can you draw about the effects of the treatments on the melting of the frozen orange dessert? If your concern was to produce frozen dessert with a long melting time, which treatment would you recommend? What other factors should be taken into account before production of such a dessert?
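
For the hand computation suggested in part (c), the pairs to work on can be found by ranking all pairs of treatments by the absolute difference in their sample means. A small sketch follows, with a hypothetical function name and placeholder numbers rather than the values in Table 5.20.

```python
from itertools import combinations
from statistics import mean


def most_disparate_pairs(groups, k=3):
    """Return the k treatment pairs with the largest absolute difference
    in sample means, largest first.

    groups: dict mapping a treatment label to its list of observations.
    """
    means = {label: mean(obs) for label, obs in groups.items()}
    ranked = sorted(
        combinations(means, 2),
        key=lambda pair: abs(means[pair[0]] - means[pair[1]]),
        reverse=True,
    )
    return ranked[:k]


# Placeholder data for four treatments (illustrative only).
demo = {1: [1.0, 1.0], 2: [2.0, 2.0], 3: [4.0, 4.0], 4: [8.0, 8.0]}
top_pairs = most_disparate_pairs(demo)
```

Each selected pair would then get its own interval, with a standard error and degrees of freedom computed from the two sample variances involved.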

Table 5.20 Percentage melting of frozen orange cubes for the dessert experiment
  8. Wildflower experiment (Barbra Foderaro, 1986)

    An experiment was run to determine whether or not the germination rate of the endangered species of Ohio plant Froelichia floridana is affected by storage temperature or storage method. The two levels of the factor “temperature” were “spring temperature, 14–24 \(^\circ \)C” and “summer temperature, 18–27 \(^\circ \)C.” The two levels of the factor “storage” were “stratified” and “unstratified.” Thus, there were four treatment combinations in total. Seeds were divided randomly into sets of 20 and the sets assigned at random to the treatments. Each stratified set of seeds was placed in a mesh bag, spread out to avoid overlapping, buried in two inches of moist sand, and placed in a refrigeration unit for two weeks at 50 \(^\circ \)F. The unstratified sets of seeds were kept in a paper envelope at room temperature. After the stratification period, each set of seeds was placed on a dish with 5 ml of distilled deionized water, and the dishes were put into one of two growth chambers for two weeks according to their assigned level of temperature. At the end of this period, each dish was scored for the number of germinated seeds. The resulting data are given in Table 5.21.

    (a) For the original data, evaluate the constant-variance assumption on the one-way analysis of variance model (3.3.1) both graphically and by comparing sample variances.

    (b) It was noted by the experimenter that since the data were the numbers of germinated seeds out of a total of 20 seeds, the observations \(Y_{it}\) should have a binomial distribution. Does the corresponding transformation help to stabilize the variances?

    (c) Plot \(\ln (s^2_i)\) against \(\ln (\overline{y}_{i.})\) and discuss whether or not a power transformation of the form given in Eq. (5.6.3) might equalize the variances.

    (d) Use Scheffé’s method of multiple comparisons, in conjunction with Satterthwaite’s approximation, to construct 95% confidence intervals for all pairwise comparisons and for the two contrasts

      $$ \frac{1}{2}[1, 1, -1, -1] \quad \text{and} \quad \frac{1}{2}[1, -1, 1, -1] \,, $$

      which compare the effects of temperature and storage methods, respectively.
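
The sketch below illustrates the calculations behind parts (b) and (c). It assumes that the transformation paired with binomial counts is the arcsine square root of the proportion, and that Eq. (5.6.3), which is not reproduced on this page, is the usual power family: the response raised to the power 1 - q/2, or its logarithm when q = 2, where q is the slope of ln(s_i^2) on ln(ybar_i). The function names and numbers are hypothetical.

```python
from math import asin, log, sqrt
from statistics import mean, variance


def log_variance_slope(groups):
    """Least-squares slope q of ln(s_i^2) on ln(ybar_i) across treatments."""
    xs = [log(mean(obs)) for obs in groups]
    ys = [log(variance(obs)) for obs in groups]
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx


def power_transform(y, q):
    """Assumed form of the transformation: ln(y) if q = 2, else y**(1 - q/2)."""
    return log(y) if abs(q - 2) < 1e-9 else y ** (1 - q / 2)


def arcsine_sqrt(count, m=20):
    """Arcsine square-root transform of a count out of m trials."""
    return asin(sqrt(count / m))


# Placeholder data whose variances grow like the squared mean (illustrative only).
demo = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [4.0, 8.0, 12.0]]
q = log_variance_slope(demo)
```

Treatments with a zero sample mean or zero sample variance cannot be placed on the log-log plot, so they would need separate handling.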

Table 5.21 Data for the wildflower experiment
Table 5.22 Weights (in grams) for the spaghetti sauce experiment
  9. Spaghetti sauce experiment

    (K. Brewster, E. Cesmeli, J. Kosa, M. Smith, and M. Soliman, 1996)

    The spaghetti sauce experiment was run to compare the thicknesses of three particular brands of spaghetti sauce, both when stirred and unstirred. The six treatments were:

    $$\begin{aligned} \begin{array}{lcl} 1 = \text {store brand, unstirred} &{}&{} 2 = \text {store brand, stirred}\\ 3 = \text {national brand, unstirred} &{}&{} 4 = \text {national brand, stirred}\\ 5 = \text {gourmet brand, unstirred} &{}&{} 6 = \text {gourmet brand, stirred}\\ \end{array} \end{aligned}$$

    Part of the data collected is shown in Table 5.22. There are three observations per treatment, and the response variable is the weight (in grams) of sauce that flowed through a colander in a given period of time. A thicker sauce would give rise to smaller weights.

    (a) Check the assumptions on the one-way analysis of variance model (3.3.1).

    (b) Use Satterthwaite’s method to obtain simultaneous confidence intervals for the six preplanned contrasts

      $$ \tau _1 - \tau _2\,, \quad \tau _3 - \tau _4\,, \quad \tau _5 - \tau _6\,, \quad \tau _1 - \tau _5\,, \quad \tau _1 - \tau _3\,, \quad \tau _3 - \tau _5 \,. $$

      Select an overall confidence level of at least 94%.
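
If the Bonferroni method is used for part (b), which is an assumption here since the exercise does not name a method, the individual confidence level follows from dividing the overall error rate among the six intervals. A short worked example, with a hypothetical function name:

```python
def bonferroni_individual_level(overall_confidence, m):
    """Confidence level for each of m intervals so that all m hold
    simultaneously with probability at least overall_confidence."""
    alpha = 1 - overall_confidence   # overall error rate
    return 1 - alpha / m             # each interval gets alpha / m


# Six preplanned contrasts, overall confidence level of at least 94%.
level = bonferroni_individual_level(0.94, 6)
```

Here the overall error rate of 0.06 split six ways gives 0.01 per interval, so each contrast would get an individual 99% interval.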

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Dean, A., Voss, D., Draguljić, D. (2017). Checking Model Assumptions. In: Design and Analysis of Experiments. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-52250-0_5
