Split-Plot Designs: What, Why, and How
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Split-Plot Designs: What, Why, and How BRADLEY JONES SAS Institute, Cary, NC 27513 CHRISTOPHER J. NACHTSHEIM Carlson School of Management, University of Minnesota, Minneapolis, MN 55455 The past decade has seen rapid advances in the development of new methods for the design and analysis of split-plot experiments. Unfortunately, the value of these designs for industrial experimentation has not been fully appreciated. In this paper, we review recent developments and provide guidelines for the use of split-plot designs in industrial applications. Key Words: Block Designs; Completely Randomized Designs; Factorial Designs; Fractional Factorial De- signs; Optimal Designs; Random-Effects Model; Random-Run-Order Designs; Response Surface Designs; Robust Parameter Designs. Introduction ments previously thought to be completely random- ized experiments also exhibit split-plot structure. All industrial experiments are split-plot experiments. This surprising result adds creedence to the Daniel proclamation and has motivated a great deal of pi- T HISprovocative remark has been attributed to the famous industrial statistician, Cuthbert Daniel, by Box et al. (2005) in their well-known text oneering work in the design and analysis of split- plot experiments. We note in particular the identi- fication of minimum-aberration split-plot fractional on the design of experiments. Split-Plot experiments factorial designs by Bingham and Sitter and cowork- were invented by Fisher (1925) and their importance ers (Bingham and Sitter (1999, 2001, 2003), Bingham in industrial experimentation has been long recog- et al. (2004), Loeppky and Sitter (2002)), the devel- nized (Yates (1936)). It is also well known that many opment of equivalent-estimation split-plot designs for industrial experiments are fielded as split-plot exper- response surface and mixture designs (Kowalski et iments and yet erroneously analyzed as if they were al. (2002), Vining et al. (2005), Vining and Kowal- completely randomized designs. This is frequently ski (2008)), and the development of optimal split- the case when hard-to-change factors exist and eco- plot designs by Goos, Vandebroek, and Jones and nomic constraints preclude the use of complete ran- their coworkers (Goos (2002), Goos and Vandebroek domization. Recent work, most notably by Lucas and (2001, 2003, 2004), Jones and Goos (2007 and 2009)). his coworkers (Anbari and Lucas (1994), Ganju and Lucas (1997, 1999, 2005), Ju and Lucas (2002), Webb Our goal here is to review these recent develop- et al. (2004)) has demonstrated that many experi- ments and to provide guidelines for the practicing statistician on their use. Dr. Jones is Director of Research and Development for What Is a Split-Plot Design? the JMP Division of SAS Institute, Inc. His email address is bradley.jones@sas.com. In simple terms, a split-plot experiment is a Dr. Nachtsheim is the Curtis L. Carlson Professor and blocked experiment, where the blocks themselves Chair of the Operations and Management Science Department serve as experimental units for a subset of the factors. at the Carlson School of Management of the University of Min- Thus, there are two levels of experimental units. The nesota. His email address is nacht001@umn.edu. blocks are referred to as whole plots, while the experi- Journal of Quality Technology 340 Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 341 TABLE 1. Split-Plot Design and Data for Studying the Corrosion Resistance of Steel Bars (Box et al. (2005)) Temperature Coating Whole-plot (◦ C) (randomized order) 1. 360 C2 C3 C1 C4 73 83 67 89 2. 370 C1 C3 C4 C2 65 87 86 91 FIGURE 1. Split Plot Agricultural Layout. (Factor A is the whole-plot factor and factor B is the split-plot factor.) 3. 380 C3 C1 C2 C4 147 155 127 212 mental units within blocks are called split plots, split units, or subplots. Corresponding to the two levels of 4. 380 C4 C3 C2 C1 experimental units are two levels of randomization. 153 90 100 108 One randomization is conducted to determine the assignment of block-level treatments to whole plots. 5. 370 C4 C1 C3 C2 Then, as always in a blocked experiment, a random- 150 140 121 142 ization of treatments to split-plot experimental units occurs within each block or whole plot. 6. 360 C1 C4 C2 C3 33 54 8 46 Split-plot designs were originally developed by Fisher (1925) for use in agricultural experiments. As a simple illustration, consider a study of the effects of two irrigation methods (factor A) and two fertiliz- to-change factor. The experiment was designed to ers (factor B) on yield of a crop, using four available study the corrosion resistance of steel bars treated fields as experimental units. In this investigation, it with four coatings, C1 , C2 , C3 , and C4 , at three fur- is not possible to apply different irrigation methods nace temperatures, 360◦ C, 370◦ C, and 380◦ C. Fur- (factor A) in areas smaller than a field, although dif- nace temperature is the hard-to-change factor be- ferent fertilizer types (factor B) could be applied in cause of the time it takes to reset the furnace and relatively small areas. For example, if we subdivide reach a new equilibrium temperature. Once the equi- each whole plot (field) into two split plots, each of the librium temperature is reached, four steel bars with two fertilizer types can be applied once within each randomly assigned coatings C1 , C2 , C3 , and C4 are whole plot, as shown in Figure 1. In this split-plot randomly positioned in the furnace and heated. The design, a first randomization assigns the two irriga- layout of the experiment as performed is given in Ta- tion types to the four fields (whole plots); then within ble 1. Notice that each whole-plot treatment (tem- each field, a separate randomization is conducted to perature) is replicated twice and that there is just assign the two fertilizer types to the two split plots one complete replicate of the split-plot treatments within each field. (coatings) within each whole plot. Thus, we have six whole plots and four subplots within each whole plot. In industrial experiments, factors are often differ- Other illustrative examples of industrial exper- entiated with respect to the ease with which they can iments having both easy-to-change and hard-to- be changed from experimental run to experimental change factors include run. This may be due to the fact that a particular treatment is expensive or time-consuming to change, 1. Box and Jones (1992) describe an experiment or it may be due to the fact that the experiment is in which a packaged-foods manufacturer wished to be run in large batches and the batches can be to develop an optimal formulation of a cake subdivided later for additional treatments. Box et mix. Because cake mixes are made in large al. (2005) describe a prototypical split-plot experi- batches, the ingredient factors (flour, shorten- ment with one easy-to-change factor and one hard- ing, and egg powder) are hard to change. Many Vol. 41, No. 4, October 2009 www.asq.org
342 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM packages of cake mix are produced from each the ANOVA sums of squares are orthogonal, a stan- batch, and the individual packages of cake mix dard, mixed-model, ANOVA-based approach to the can be baked using different baking times and analysis is possible. This is the approach that is taken baking temperatures. Here, the easy-to-change, by most DOE textbooks that treat split-plot de- split-plot factors are time and temperature. signs (see, e.g., Kutner et al. (2005), Montgomery 2. Another example involving large batches and (2008), Box et al. (2005)). With this approach, all the subsequent processing of subbatches was experimental factor effects are assumed to be fixed. discussed by Bingham and Sitter (2001). In this The standard ANOVA model for the balanced two- instance, a company wished to study the ef- factor split-plot design, where there are a levels of fect of various factors on the swelling of a wood the whole-plot factor A and b levels of the split-plot product after it has been saturated with water treatment B, where each level of whole-plot factor A and allowed to dry. A batch is characterized is applied to c whole plots (so that the whole plot ef- by the type and size of wood being tested, the fect is nested within factor A), and where the number amounts of two additives used, and the level of runs is n = abc, is given by of moisture saturation. Batches can later be Yijk = μ... + αi + βj + (αβ)ij + γk(i) + εijk , (1) subdivided for processing, which is character- ized by three easy-to-change factors, i.e., pro- where: μ... is a constant; αi , the a whole-plot treat- cess time, pressure, and material density. ment effects, are constants subject to αi = 0; βj , the b split-plot treatment effects, are constants sub- 3. Bisgaard (2000) discusses the use of split-plot ject to βj = 0; (αβ)ij ,the ab interaction effects, arrangements in parameter design experiments. are constants subject to i (αβ)ij = 0 for all j and In one example, he describes an experiment in- volving four prototype combustion engine de- j (αβ)ij = 0 for all i; γk(i) , the nw = ac whole-plot 2 signs and two grades of gasoline. The engineers errors, are independent N (0, σw ); εijk are indepen- 2 preferred to make up the four motor designs dent N (0, σ ); i = 1, . . . , a, j = 1, . . . , b, k = 1, . . . , c. and test both of the gasoline types while a par- The ANOVA model (1) is appropriate for balanced ticular motor was on a test stand. Here the factorial and fractional factorial designs. When bal- hard-to-change factor is motor type and the ance is not present due to, e.g., missing observations easy-to-change factor is gasoline grade. or a more complex design structure, such as a re- sponse surface experiment, more general methods are The analysis of a split-plot experiment is more required for analysis. For that reason, we prefer a complex than that for a completely randomized ex- more general, regression-based modeling approach. periment due to the presence of both split-plot and The regression model for a split-plot experiment is whole-plot random errors. In the Box et al. corrosion- a straightforward extension of the standard multiple resistance example, a whole-plot effect is introduced regression model, with a term added to account for with each setting or re-setting of the furnace. This the whole-plot error. Let wi = (wi1 , wi2 , . . . , wimw ) may be due, e.g., to operator error in setting the tem- and si = (si1 , si2 , . . . , sims ) denote, respectively, the perature, to calibration error in the temperature con- settings of the mw whole-plot factors and the settings trols, or to changes in ambient conditions. Split-plot of the ms split plot factors for the ith run. The linear errors might arise due to lack of repeatability of the statistical model for the ith run is measurement system, to variation in the distribution of heat within the furnace, to variation in the thick- yi = f (wi , si )β + zi γ + εi , (2) ness of the coatings from steel bar to steel bar, and where f (wi , si ) gives the (1 × p) polynomial model so on. We will assume that the whole-plot errors are 2 expansion of the factor settings, β is the p × 1 pa- independent and identically distributed as N(0, σw ), rameter vector of factor effects, zi is an indicator that the split-plot errors are independent and iden- vector whose kth element is one when the ith run tically distributed as N(0, σ 2 ), and that whole-plot was assigned to the kth whole plot and zero if not, errors and the split-plot errors are mutually indepen- γ is a b × 1 vector of whole-plot random effects, and dent. Randomization guarantees that the whole-plot εi is the split-plot error. We prefer the regression ap- errors are mutually independent and that the split- proach for its simplicity and flexibility. It permits the plot errors are mutually independent within whole use of polynomial terms for continuous or quantita- plots. tive treatments, as well as the use of indicators for When the split-plot experiment is balanced and qualitative predictors. Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 343 In matrix form, model (2) can be expressed as reset (e.g., by changing its level and then changing it back), inadvertent split plotting occurs. Such de- Y = Xβ + Zγ + ε, (3) signs, termed random run order (RRO) designs by where Y is the n × 1 vector of responses, X is the Ganju and Lucas (1997) are widely used and always n × p model matrix with ith row given by f (wi , si ), analyzed inappropriately. We consider this issue in Z is an n × b matrix with ith row equal to zi , and greater detail, under “Validity” later, and discuss ε = (ε1 , . . . , εn ) . We assume that ε ∼ N(0n , σ 2 In ), here the relative cost of split-plot experiments and γ ∼ N(0c , σw 2 Ic ), and Cov(ε, γ ) = 0c×n , where In true CRDs. Simple accounting tells us that the cost denotes an n × n identity matrix. of a split-plot experiment can be significantly less because much of the cost (or time required) of run- Our model assumptions imply that the variance– ning a split-plot experiment is tied to changes in the covariance matrix of the response vector Y can be hard-to-change factors. If the cost of setting a hard- written as to-change factor is Cw and the cost of setting an V = σ 2 In + σ w 2 ZZ , (4) easy-to-change factor is Cs , and if there are nw whole where In denotes the n×n identity matrix. If the runs plots and ns split plots, the CRD cost is ns (Cw +Cs ), are grouped by whole plots, the variance–covariance whereas the SPD cost is nw Cw +ns Cs , so that the ad- matrix is block-diagonal, ditional cost for the CRD is (ns − nw )Cw . Split-plot ⎡ ⎤ V1 0 · · · 0 designs are useful precisely because this additional ⎢ 0 V2 · · · 0 ⎥ cost can be substantial. Anbari and Lucas (1994, V=⎢ ⎣ ... .. .. ⎥ . . ⎦ .. 2008) contrast the costs of CRDs and SPDs for vari- . . ous design scenarios. Bisgaard (2000) also discussed 0 0 · · · Vb cost functions. The ith matrix on the diagonal is given by V i = σ 2 I si + σ w 2 1si 1si , Efficiency where si is the number of observations in the ith Split-plot experiments are not just less expen- whole plot and 1si is an si × 1 vector of ones. sive to run than completely randomized experiments; For statistical inference, we are interested primar- they are often more efficient statistically. Ju and Lu- ily in the estimation of the fixed effects parameter cas (2002) show that, with a single hard-to-change vector β and the variance components σ 2 and σw 2 . factor or a single easy-to-change factor, use of a split- There are several alternative methods available for plot layout leads to increased precision in the esti- estimation of model parameters for this general ap- mates for all factor effects except for whole-plot main proach. Recommendations for estimation and testing effects. Overall measures of design efficiency have are taken up in the later section “How to analyze a been considered by Anbari and Lucas (1994) and split-plot design.” Goos and Vandebroek (2004). Considering two-level, full factorial designs only, Anbari and Lucas (1994) Why Use Split-Plot Designs? showed that the maximum variance of prediction over a cuboidal design region for a completely ran- In this paper, we advocate greater consideration domized design in some instances is greater than that of split-plot experiments for three reasons: cost, effi- for a blocked split-plot design involving one hard- ciency, and validity. In this section, we discuss each to-change factor. In other words, the G-efficiency of of these advantages in turn. the split-plot design can, at times, be higher than the corresponding CRD. Similarly, Goos and Van- Cost debroek (2001, 2004) demonstrated that the deter- The cost of running a set of treatments in split- minant of a D-optimal split-plot design (discussed plot order is generally less than the cost of the same later) frequently exceeds that of the corresponding experiment when completely randomized. As Ganju D-optimal completely randomized design. These re- and Lucas (1997, 1998, 2005) have noted, a properly sults indicate that running split-plot designs can rep- implemented completely randomized design (CRD) resent a win–win proposition: less cost with greater requires that all factors must be independently re- overall precision. As a result, many authors recom- set with each run. If a factor level does not change mend the routine use of split-plot layouts, even when from one run to the next and the factor level is not a CRD is a feasible alternative. Vol. 41, No. 4, October 2009 www.asq.org
344 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM Validity sions based on the split-plot analysis are exactly the opposite. Completely randomized designs are prescribed fre- quently in industry, but are typically not run as such One way to avoid these mistakes is to plan the in the presence of hard-to-change factors. One of two experiment as a split-plot design in the first place. shortcuts may be taken by the experimenter in the Not only does this avoid mistakes, it also leverages interest of saving time or money, i.e., the economic and statistical efficiencies already de- 1. A random-run-order (RRO) design—alternate- scribed. Interestingly, it is not necessarily the case ly referred to as a random-not-reset (RNR) that an RRO experiment is to be universally avoided. design—is utilized. As noted above, in this case, Webb et al. (2004) discuss comparisons of predic- the order of treatments is randomized, but the tion properties of experiments that are completely factor levels are not reset with each run of the randomized versus experiments that have one or experiment, or perhaps all factors, except the more factors that are not reset. They argue that, in hard-to-change factors are reset. Not resetting process-improvement experiments, the cost savings the levels of the hard-to-change factor leads to that accrues from not resetting each factor on every a correlation between adjacent runs. Statistical run may at times justify the use of an RRO design, tests that do not consider these correlations will even though prediction variances will be inflated and be biased (Ganju and Lucas (1997)). tests will be biased. 2. Although a random run order is provided to In general, we believe that considerations about the experimenter, the experimenter may decide the presence of hard-to-change factors and their po- to resort the treatment order to minimize the tential impact on the randomization employed should number of changes of the hard-to-change factor. be a part of the planning of every industrial ex- Although this leads to a split-plot design, the periment. That is, asking questions about necessary results are typically analyzed as if the design groupings of experimental runs is something that a had been fielded as a completely randomized consultant should do regularly. This is in contrast experiment. with the usual standard advice dispensed by con- sultants and short courses, that designs should al- Either scenario leads to an incorrect analysis. The ways be randomized. Finally, to ensure the validity Type I error rate for whole plot factors increases, of the experiment and the ensuing analysis, the ex- as does the Type II error rate for split-plot factors periment should be carefully monitored to verify that and whole-plot by split-plot interactions (see, e.g., whole-plot and split-plot factor settings are reset as Goos et al. (2006)). The corrosion-resistance exam- required by the design. ple of Table 1 is a good example. The table below provides p-values for tests of effects in the corrosion- resistance example for the standard (incorrect) anal- How to Choose a Split-Plot Design ysis and the split-plot analysis. At the α = .05 level of Choosing a “best” split-plot design for a given significance, the p-values for the completely random- design scenario can be a daunting task, even for a ized analysis lead to the conclusion that the temper- professional statistician. Facilities for construction of ature main effect is statistically significant, whereas split-plot designs are not as yet generally available in the coating and temperature-by-coating interaction software packages (with SAS/JMP being one excep- effects are not statistically significant. The conclu- tion). Moreover, introductory textbooks frequently consider only very simple, restrictive situations. Un- fortunately, in many cases, obtaining an appropriate p-Values for corrosion-resistance example design requires reading (and understanding) a rele- vant recent research paper or it requires access to Split-plot Standard software for generating the design. To make matters Effect analysis Analysis even more difficult, construction of split-plot exper- iments is an active area of research and differences Temperature (A) .209 .003 in opinion about the value of the various recent ad- Coating (B) .002 .386 vances exist. In what follows, we discuss the choice of Temp × Coating (A × B) .024 .852 split-plot designs in five areas: (1) two-level full facto- rial designs; (2) two-level fractional factorial designs; Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 345 (3) mixture and response surface designs; (4) split- TABLE 2. Full-Factorial Split-Plot Design with plot designs for robust product experiments; and (5) Split-Plot Blocking (in standard order) optimal designs. Whole plot Split plot Full Factorial Split-Plot Designs Whole plot number A B p q r Construction of single replicates of a two-level full factorial designs is straightforward. In this case, a 1 −1 −1 −1 −1 −1 completely randomized design in the whole-plot fac- −1 −1 1 1 −1 tors is conducted and, within each whole plot, a −1 −1 1 −1 1 completely randomized design in the split-plot fac- −1 −1 −1 1 1 tors is also conducted. Note that there are nw + 1 separate randomizations. The disadvantage of run- 2 −1 −1 1 −1 −1 ning the split-plot in a single replicate is the same −1 −1 −1 1 −1 as that for a completely randomized design: there −1 −1 −1 −1 1 are no degrees of freedom for error, either at the −1 −1 1 1 1 whole-plot level or the split-plot level, and so tests for various effects cannot be obtained without assum- 3 1 −1 −1 −1 −1 ing that various higher order interactions are negli- 1 −1 1 1 −1 gible. One design-based solution to this is to fully 1 −1 1 −1 1 replicate the experiment, although this may be ex- 1 −1 −1 1 1 pensive. Intermediate alternatives include replicat- ing the split-plots within a given whole plot (if the 4 1 −1 1 −1 −1 size of the whole plot permits), giving an estimate 1 −1 −1 1 −1 of the split-plot error variance, or the use of classi- 1 −1 −1 −1 1 cal split-plot blocking (as advocated by Ju and Lu- 1 −1 1 1 1 cas (2002)), where the split-plot treatments are run in blocks and the whole-plot factor level is reset for 5 −1 1 −1 −1 −1 each block. This increases the number of whole plots −1 1 1 1 −1 (while decreasing the number of split-plots per whole −1 1 1 −1 1 plot), providing an estimate of whole-plot error. −1 1 −1 1 1 An example of a blocked full-factorial split-plot 6 −1 1 1 −1 −1 design is given in Table 2. For this example, there −1 1 −1 1 −1 are two hard-to-change factors, A and B, and three −1 1 −1 −1 1 easy-to-change factors p, q, and r. Run as an un- −1 1 1 1 1 blocked full-factorial split-plot design, there would be 22 = 4 whole plots with each whole plot consist- ing of the 23 = 8 treatment combinations involving 7 1 1 −1 −1 −1 p, q, and r. Here we use b = pqr as the block gen- 1 1 1 1 −1 erator to divide the eight split-plot treatment com- 1 1 1 −1 1 binations in each whole plot into two blocks of size 1 1 −1 1 1 four. Each block is now treated as a separate whole- plot, thereby increasing the number of whole plots 8 1 1 1 −1 −1 from four to eight. An abbreviated ANOVA table 1 1 −1 1 −1 is shown in Table 3. (Note that the four degrees of 1 1 −1 −1 1 freedom for whole-plot error are obtained by pooling 1 1 1 1 1 each of the four whole-plot (= pqr) effects that are nested within the levels of A and B.) The fact that the number of whole plots has doubled will increase Fractional Factorial Split-Plot Designs the cost of the experiment. It is also important to reset the whole-plot factor level between whole plots If a full factorial split-plot design is too large, con- even if level does not change. sideration of a fractional factorial split-plot (FFSP) Vol. 41, No. 4, October 2009 www.asq.org
346 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM TABLE 3. ANOVA (Source and DF) for Design in Table 1 to designs that are obtained by crossing as Cartesian- product designs, and we adopt this terminology in Source DF what follows. The use of Cartesian-product designs is a simple strategy that is frequently used, but it A 1 is not guaranteed to give the FFSP design of max- B 1 imum resolution. An alternative approach is to use AB 1 the technique of split-plot confounding. Whole plot error 4 To illustrate the concept of split-plot confounding, we borrow from an example provided by Bingham p 1 and Sitter (1999). Assume that there are four whole- q 1 plot factors, A, B, C, and D, and three split-plot r 1 factors, p, q and r, resources permit at most 16 runs, pq 1 and we require a design of resolution III or greater. pr 1 Employing a Cartesian-product design, the highest qr 1 resolution possible is obtained by crossing a resolu- Ap 1 tion IV 24−1 design in the whole plots (based on the Aq 1 generator D = ABC ) with a 23−2 design in the split Ar 1 plots (based, for example, on the split-plot genera- Apq 1 tors q = p and r = p). This leads to a 2(4+3)−(1+2) Apr 1 FFSP with the following defining contrast subgroup: Aqr 1 Bp 1 I = ABCD = pq = pr = ABCDpq = ABCDpr Bq 1 = qr = ABCDqr, Br 1 Split-plot error 9 which is resolution II. A better design can be con- Bpq structed using split-plot confounding (Kempthorne Bpr (1952)) by including whole-plot factors in the split- Bqr plot factorial generators. In the current example, if ABp we use D = ABC as the whole-plot design generator ABq and q = BC p and r = AC p as the split-plot design ABr generators, the defining contrast subgroup is ABpq ABpr I = ABCD = BC pq = AC pr = CDpq = BDpr ABqr = BC qr = ADqr, Total 31 which is resolution IV. Huang et al. (1998, pp. 319-322) give tables of minimum aberration (MA) FFSP designs having five design may be warranted. As noted by Bisgaard through fourteen factors for 16, 32, and 64 runs and (2000), there are generally two approaches to con- these designs were extended by Bingham and Sitter structing FFSPs. The first is to (1) select a 2p1 −k1 (1999). fractional factorial design in the p1 whole-plot fac- tors of resolution r1 (or a full factorial if k1 = 0); To illustrate the construction of an FFSP design (2) select a 2p2 −k2 fractional factorial design of res- using the tables of Huang et al., consider again the olution r2 in the p2 split-plot factors (or a full fac- previous example involving four whole-plot factors torial if k2 = 0); and then (3) cross these designs to and three split-plot factors, to be fielded in 16 runs. obtain a 2(p1 +p2 )−(k1 +k2 ) fractional factorial design. In this case, we have n1 = 4, n2 = 3, k1 = 1, and k2 (This design will be a fractional factorial of resolu- = 2. The “word-length pattern/generators” for this tion min(r1 , r2 ), as long as k1 + k2 > 0). By cross- case are provided on line 9 of Table 1 (Huang et al. ing, we simply mean that every whole-plot treatment (1998), p. 319) and are ABCD, BCpq, and ACpr. combination is run in combination with every split- Hence, we have D = ABC, q = BCp, and r = ACp. plot treatment combination. Bisgaard (2000) refers The independent columns are A, B, C, and p. Most Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 347 TABLE 4. Minimum Aberration SPFF Design Whole plot A B C D p q r 1 −1 −1 −1 −1 −1 −1 −1 −1 −1 −1 −1 1 1 1 2 −1 −1 1 1 −1 1 1 −1 −1 1 1 1 −1 −1 3 −1 1 −1 1 1 −1 1 −1 1 −1 1 −1 1 −1 4 −1 1 1 −1 1 1 −1 −1 1 1 −1 −1 −1 1 5 1 −1 −1 1 −1 −1 1 1 −1 −1 1 1 1 −1 6 1 −1 1 −1 −1 1 −1 1 −1 1 −1 1 −1 1 7 1 1 −1 −1 1 −1 −1 FIGURE 2. Selecting Generators for Minimum Aberration 1 1 −1 −1 −1 1 1 Split-Plot Fractional Factorial Design. 8 1 1 1 1 −1 −1 −1 1 1 1 1 1 1 1 standard software packages for the design of experi- ments permit the user to specify generators or, equiv- alently, a preferred defining contrast subgroup. Fig- 32-run designs provided by Huang et al. (1998). He ure 2 illustrates this step for SAS/JMP. The resulting starts with a 16-run design that uses fewer whole design is displayed in standard order in Table 4. plots than those in the minimum aberration tables, While the tables in Huang et al. (1998) and Bing- and then adds 8 runs to this design using semi-folding ham and Sitter (1999) are useful, they are by no to break various alias chains. He shows that the 24- means exhaustive, and a number of papers have aug- run design is a good compromise between the 16- mented these results. One limitation of these designs and 32-run designs. We consider these designs in a is that the whole-plot designs were not replicated. bit more detail in our discussion of optimal split-plot Bingham et al. (2004) developed methods for split- designs below. ting the whole plots (as discussed above in the case of a full factorial split-plot design), thereby increasing Split-Plot Designs for Robust Product the number of whole plots while decreasing the num- Experiments ber of split plots per whole plot. McLeod and Brew- ster (2004) developed three methods for constructing The purpose of robust product experiments is to blocked minimum aberration 2-level fractional fac- identify settings of product-design factors, sometimes torial split-plot designs, for n = 32 runs. McLeod termed product-design parameters that lead to prod- and Brewster (2008) also developed foldover plans ucts or processes that are insensitive to the effects of for two-level fractional factorial designs. Relatedly, uncontrollable environmental factors. As illustrated Almimi et al. (2008) develop follow-up designs to re- by Box and Jones (1992), product-design parameters solve confounding in split-plot experiments. Kowal- for a cake mix might be the amount of flour, short- ski (2002) provides additional SPFF designs with 24 ening, and egg powder contained in the mix. Un- runs, in an effort to fill the gap between the 16- and controllable environmental factors include the baking Vol. 41, No. 4, October 2009 www.asq.org
348 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM time and baking temperature used by the consumer for either of alternatives (1) and (2). We refer to produce the cake. These ideas were introduced the reader to their paper for further details. by Taguchi (see, e.g., Taguchi (1987)), who recom- mended the use of Cartesian product experiments to Of course, while alternatives (1) and (2) arise fre- determine the optimal settings of the design parame- quently, environmental- and product-design factors ters. Specifically, Taguchi employed a designed exper- need not be of one type or the other. The case iment in the design factors—the inner array, a second may arise in which the set of hard-to-change factors designed experiment for the environmental factors— is comprised of both environmental- and product- the outer array, and these designs were crossed to design factors. Generally, in robust product experi- form the robust product experiment. A “best” set- ments, however, interest centers on the design-factor ting of the product design factors is one that leads to effects and the design-factor × environmental-factor an average response closest to the product designer’s interactions. Main effects of the environmental fac- target, and that also exhibits little variation as the tors are of secondary interest, as they have no bear- environmental factors are changed. ing on determination of best settings for the design factors. For this reason, Box and Jones (1992) con- Unfortunately, a barrier to the widespread use clude that environmental factors will be more appro- of crossed-array parameter-design experiments has priately used as the whole-plot treatment. been cost. Clearly, if n1 is the number of runs in the design-factors array and n2 is the number of runs A number of authors have explored the question in the environmental-factors array, then the required of how to choose a “best” split-plot design for a given number of runs—and therefore the number of factor set of design and environmental factors. Bisgaard resettings—is n = n1 × n2 . Thus, the size of the the (2000) considered the use of 2p1 −k1 × 2p2 −k2 frac- required robust parameter CRD will be prohibitive tional factorial split-plot experiments, where there unless both n1 and n2 are small. Recently, various are p1 whole-plot factors and p2 split-plot factors. He authors, beginning with Box and Jones (1992), have began by considering Cartesian-product designs, as demonstrated how split-plot designs can be used ef- advocated by Taguchi, and showed how to determine fectively to reduce the cost of robust product exper- the defining contrast subgroup for the combined ar- iments when there are hard-to-change factors. Box ray as a function of the defining contrast subgroups and Jones (1992) discuss three ways in which split- for the inner and outer arrays. He then noted that plot designs might arise in robust product experi- split-plot confounding can lead to better designs and ments: gave rules for implementing split-plot confounding. 1. The environmental factors are hard to change The problem of finding a “best” split-plot frac- while the product-design factors are easy to tional factorial design for robust product design was change. In the context of the cake-mix exam- taken up later by Bingham and Sitter (2003). They ple, suppose that the cake mixes can be eas- began by noting that their own tables of MA SPFF ily formulated as individual packages, but the designs (Bingham and Sitter (1999)) and those de- ovens are large and it takes a good deal of time veloped previously by Huang et al. (1998) were not to reset the temperature. In this case, the time directly applicable to product-design experiments. for the experiment may be minimized by using Let C denote any (controllable) product-design fac- time and temperature as whole-plot factors. tor and let N denote any environmental (noise) fac- tor. If we ignore the robust design aspect of the 2. The product-design factors are hard to change design problem, the importance of the various fac- and the environmental factors are easy to tors (of third order or less) to the investigator will change. In the context of the cake-mix example, typically be ranked in descending order of likely it may be that the cake mixes must be made significance—main effects, followed by second-order in large batches, which can be subdivided for effects, followed by third-order effects—as shown in baking at the different conditions indicated by the top panel of Table 5. However, in a product- the environmental-factor (outer) array. Thus, design experiment, we are less interested in main ef- we want to minimize the number of required fects of the noise factors, and Bingham and Sitter batches (whole plots). (2003) argue that a more relevant ranking of fac- 3. In this alternative, Box and Jones (1992) con- tor effects is as shown in the bottom panel of Table sidered the use of strip-plot experiments, which 5. The key point is that the importance ranking of can further reduce the cost of the experiment the control-by-noise interactions has been increased Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 349 TABLE 5. Ranking of Factor Effects in Standard Notice that, from Equation (4), we have vs. Parameter Design Experiments V = σ 2 In + σ w 2 ZZ Importance = σ 2 (In + dZZ ), (6) ranking Effect type 2 where d = σw /σ 2 is the variance ratio—the ratio of Standard experiment the whole-plot variance component to the split-plot variance component. From Equations (5) and (6), it 1. C, N follows that 2. C × C, N × N , C × N 3. C × C × C, C × C × N , C × N × N , CD (X, Z, V) = (σ 2 )−p X (In + dZZ )−1 X N ×N ×N ∝ X (In + dZZ )−1 X , (7) Robust parameter experiment which indicates that the determinant criterion de- pends on the unknown variance components only 1. C, N through their ratio, d. The absolute magnitudes are 2. C ×N not required for comparing designs. Two alternative 3. C × C, N × N designs, [X1 , Z1 ] and [X2 , Z2 ], can be compared by 4. C × C × N, C × N × N computing their relative efficiencies. The relative D- 5. C × C × C, N × N × N efficiency of design [X1 , Z1 ], relative to the design [X2 , Z2 ] is given by relative to other effects of the same order, consistent RED ([X1 , Z1 ], [X2 , Z2 ]) 1/p with the experimenter’s objectives in robust prod- X1 (In + dZ1 Z1 )−1 X1 uct experiments. As noted by Bingham and Sitter = . X2 (In + dZ2 Z2 )−1 X2 (2003), the low-order noise main effects must be re- tained because they are likely to be significant, and The relative efficiency gives the number times the it is important to keep other effects from becoming design [X2 , Z2 ] must be replicated in order to achieve aliased with them. They then modify the definition the overall precision of design [X1 , Z1 ]. of aberration to reflect the new importance rankings and use their search algorithm (Bingham and Sitter Letsinger et al. (1996) also gave the integrated (1999)) to produce two tables of MA FFSP designs variance of prediction criterion (IV) for split-plot de- for robust product design experiments. The first ta- signs as ble is for use when the product-design factors are CIV (X, Z, V) hard to change, the second is for applications when σ2 the environmental factors are hard to change. These = f (x)[X (In + dZZ )−1 X]−1 f (x)dx tables are useful for 16- and 32-run experiments and dx R R for total numbers of factors ranging from 5 to 10. ∝ Trace [X (In + dZZ )−1 X]−1 Split-Plot Designs for Response Surface Experiments × f (x)f (x)dx Relatively little research has been conducted with R regard to the construction of split-plot designs for = Trace{[X (In + dZZ )−1 X]−1 [MR ]}, (8) response surface experiments. Recall that, in a re- sponse surface experiment, we are typically inter- where x = (w , s ) denotes the vector of factor set- ested in estimating a response model comprised of tings, R is the region of interest, MR is the moment all first- and second-order polynomial terms. matrix, and we again note the dependence of the cri- Letsinger et al. (1996) first discussed the notion terion on the unknown variance ratio d. For the in- of optimality criteria in connection with split-plot tegrated variance criterion, the efficiency of design response surface experiments. They noted that a best [{X1 , Z1 }], relative to design [{X2 , Z2 }] is given by design will maximize, according to the determinant REIV ([X1 , Z1 ], [X2 , Z2 ]) criterion CD , Trace{[X2 (In + dZ2 Z2 )−1 X2 ]−1 [MR ]} = CD (X, Z, V) = X V−1 X . (5) Trace{[X1 (In + dZ1 Z1 )−1 X1 ]−1 [MR ]}. Vol. 41, No. 4, October 2009 www.asq.org
350 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM Letsinger et al. examined, for the case of four factors, the efficiencies of the Box–Behnken design, the uni- form precision design, the rotatable central compos- ite design, the small composite design for the case of one or two whole-plot factors. They conclude that the central composite design, run as a split-plot, seemed to give the best overall performance. Other recent developments in this area include Draper and John (1998), who discuss how to choose factor levels for their CUBE and STAR designs in split-plot situations. They use an index of rotata- bility as their criterion for choosing the factor lev- FIGURE 3. Equivalent Estimation Response Surface De- els to create optimal designs. Split-plot mixture re- sign. sponse surface experiments have only very recently been considered (Kowalski et al. (2002), Goos and Donev (2006)). 0) are replicated four times. In one of the four center- Recently, Vining et al. (2005) discussed the de- point whole plots, the split-plot treatments occupy sign and analysis of response surface experiments the four corners. This leads to 7 × (4 − 1) = 21 de- when run in the form of a split plot. Their paper grees of freedom for split-plot pure error. Also, be- established conditions under which ordinary least cause three of the four whole-plot center points have squares (OLS) estimates, which are usually inappro- the identical split-plot treatment structure (i.e., four priate for analysis of split-plot experiments, and the center points), they are true replicates, and we ob- preferred generalized least squares (GLS) estimates tain three degrees of freedom for whole-plot pure er- are equivalent. Their result builds on the Letsinger ror. In general, if the center points of the split-plot et al. (1996) paper just discussed, which demon- factors are replicated nspc times within each of the strated that OLS and GLS estimates are equivalent 2k whole-plot axial points and r whole-plot center for Cartesian-product split-plot designs. Vining et points, there will be (2k + r)(nspc − 1) degrees of al. (2005) then show how various points in central freedom for split-plot pure error and r − 1 degrees of composite designs and Box–Behnken designs can be freedom for whole-plot pure error. An unbiased es- replicated in such a way as to induce the equivalence timate of the split-plot pure-error variance is given of OLS and GLS. These designs, named equivalent by 2k 2 r estimation designs (EEDs) by the authors, have two l=1 sAl s2cp0i 2 ssp = + i=1 , useful features. First, the required replication struc- 2k r ture leads to designs with pure replicates of whole where s2Al and s2cp0i are the usual sample variances plots and split plots, so that model-free estimates of 2 of the responses from the lth whole-point axial point the variance components σw and σ 2 are provided. and the ith whole-plot center point containing split- Second, OLS software, which is widely available, can plot center points. An estimate of the whole-plot be used to obtain parameter estimates. variance component is also easy to compute. Let An equivalent-estimation design for a split-plot re- Ȳi. denote the mean of the values in the ith repli- sponse surface scenario, developed by Vining et al. cated whole plot and let s2wp denote the usual sam- (2005), is shown in Figure 3, and the design points ple variance of the r replicated whole-plot means, 2 are listed in Table 6. In this example, there are two Ȳ1. , . . . , Ȳr. . An unbiased estimate of σw is given by hard-to-change factors, Z1 and Z2 , and there are s2sp two easy-to-change factors, X1 and X2 , and the ex- s2w = s2wp − . perimenter is interested in estimating all first- and nspc second-order terms. The design consists of a Box– As was the case with the standard ANOVA estimator Behnken design with 12 whole plots corresponding (9), this estimator can, at times, be negative. to the four corner points, the four axial points, and four center points. In the four axial whole plots and in In summary, equivalent estimation designs for three of the four center-point whole plots, the center response-surface experiments designs possess two no- points for the split-plot treatments (i.e., X1 = X2 = table properties. First, they permit the use of OLS Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 351 TABLE 6. Equivalent Estimation Split-Plot Response Surface Design Whole plot Z1 Z2 X1 X2 Whole plot Z1 Z2 X1 X2 1 −1 −1 0 0 7 0 −1 −1 0 (Corner) −1 −1 0 0 (Axial) 0 −1 1 0 −1 −1 0 0 0 −1 0 −1 −1 −1 0 0 0 −1 0 1 2 1 −1 0 0 8 0 1 −1 0 (Corner) 1 −1 0 0 (Axial) 0 1 1 0 1 −1 0 0 0 1 0 −1 1 −1 0 0 0 1 0 1 3 −1 1 0 0 9 0 0 −1 −1 (Corner) −1 1 0 0 (Center) 0 0 1 −1 −1 1 0 0 0 0 −1 1 −1 1 0 0 0 0 1 1 4 1 1 0 0 10 0 0 0 0 (Corner) 1 1 0 0 (Center) 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 5 −1 0 −1 0 11 0 0 0 0 (Axial) −1 0 1 0 (Center) 0 0 0 0 −1 0 0 −1 0 0 0 0 −1 0 0 1 0 0 0 0 6 1 0 −1 0 12 0 0 0 0 (Axial) 1 0 1 0 (Center) 0 0 0 0 1 0 0 −1 0 0 0 0 1 0 0 1 0 0 0 0 for obtaining factor-effects estimates. Second, the au- approach. Optimal designs for split-plot experiments thors’ recommended replication structure leads to were first proposed by Goos and Vandebroek (2001). model-free estimates of the whole-plot and split-plot The general approach is as follows: variance components. However, these designs are not 1. Specify the number of whole plots, nw . without drawbacks. Inadvertent changes to the de- sign during implementation, e.g., the occurrence of 2. Specify the number of split plots per whole plot, a missing observation, will destroy the OLS–GLS ns . equivalence. Moreover, Goos (2006) noted that the 3. Specify the response model, f (w, s). relatively large proportion of runs devoted to pure 4. Specify a prior estimate of the variance ratio, replication significantly reduces the overall efficiency d. of the design. Goos advocates the use of optimal ex- 5. Use computer software to construct the design perimental designs, which we take up next. for steps 1 through 4 that maximizes the D- optimality criterion of Equation (5). Optimal Split-Plot Designs 6. Study the sensitivity of the optimal design to small changes in d, nw , and ns . A general approach to the problem of constructing split-plot designs is provided by the optimal-design The algorithm given by Goos and Vandebroek (2001) Vol. 41, No. 4, October 2009 www.asq.org
352 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM also requires that the user specify a candidate set of TABLE 7. D-Optimal Split-Plot Screening Design possible design points. This is usually the 2p vertices of the hypercube ([−1, 1]p ) for main-effects models Whole plot A B C D p q r or for main-effects models plus interactions. It may consist of points from the 3p factorial when pure 1 −1 −1 −1 1 −1 −1 1 quadratic effects are present. An alternative algo- 1 −1 −1 −1 1 1 1 −1 rithm based on the coordinate exchange algorithm (Meyer and Nachtsheim (1995)), which does not re- 2 −1 −1 1 −1 −1 −1 −1 quire candidate sets, was later developed by Jones 2 −1 −1 1 −1 1 1 1 and Goos (2007). Note that the specified numbers of whole plots and split plots must both be large enough to enable estimation of all of the desired model ef- 3 −1 1 −1 −1 −1 1 1 fects and that the designs produced are optimal for 3 −1 1 −1 −1 1 −1 −1 the model as specified in item (3) above. 4 −1 1 1 1 −1 −1 −1 The approach is general in the sense that it can 4 −1 1 1 1 1 1 1 be applied to virtually any split-plot design prob- lem and, unlike the minimum-aberration designs de- 5 1 −1 −1 −1 −1 1 −1 scribed earlier, there is no need to limit run sizes to 5 1 −1 −1 −1 1 −1 1 powers of two. We demonstrate the use of the op- timal split-plot design approach in connection with 6 1 −1 1 1 −1 1 −1 three of the examples already discussed. 6 1 −1 1 1 1 −1 1 Example 1: Screening 7 1 1 −1 1 −1 1 1 Consider the screening design that was con- 7 1 1 −1 1 1 −1 −1 structed earlier involving four hard-to-change fac- tors and three easy-to-change factors. The minimum- 8 1 1 1 −1 −1 −1 1 aberration design, shown in Table 4, was based on 8 1 1 1 −1 1 1 −1 eight whole plots with two split plots per whole plot. Thus, we specify nw = 8, ns = 2, and the model of interest is the seven-factor first-order model. The D-optimal design (produced by SAS/JMP) is shown by using semi-folding of 16-run designs or balanced in Table 7. Note that the design is balanced over- incomplete blocks. In Kowalski’s case 1, based on all and that every whole plot is separately balanced. balanced incomplete block designs, there were two The design, in fact, is orthogonal, although it is not a whole-plot factors, four split-plot factors, and the ob- regular orthogonal design. For nonregular orthogonal jective was to run the experiment in four whole-plots designs, effects are not directly aliased with other ef- with six split plots per whole plot in order to es- fects, as they are for MA fractional factorial designs. timate all main effects and two-factor interactions. This example illustrates advantages and disadvan- The design recommended by Kowalski is displayed tages of the optimal design approach. The simplicity in Table 8. The optimal design approach produced of this approach is undeniable, and whereas the MA the design shown in Table 9. In both cases, all main design is a D-optimal design for the first-order model, effects and second-order interactions are estimable, a D-optimal design need not be an MA design. One and in both cases, the designs are balanced both might question the marginal value of a resolution IV overall and within whole plots. The D-efficiency of design in this instance, when most of the two-factor the combinatoric design, relative to the D-optimal interactions are aliased with two or more other two- design, is 88.9%. factor interactions. Correctly identifying the active two-factor interactions will be difficult, at best, with- Example 3: Response Surface Designs out substantial followup testing. Vining et al. (2005) developed an EED based on 2 hard-to-change and 2 easy-to-change factors, in Example 2: Main Effects Plus Interactions 12 whole plots having 4 split plots per whole plot. We noted that Kowalski (2002) provided ap- The design, shown previously in Figure 3, is a Box– proaches for constructing 24-run split-plot designs Behnken design with the split-plot and whole-plot Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 353 TABLE 8. 24 Run Design for Estimating Main Effects TABLE 9. 24 Run D-Optimal Design for Estimating and Two-Factor Interactions (Kowalski, 2002) Main Effects and Two-Factor Interactions Whole plot A B p q r s Whole plot A B p q r s 1 1 −1 1 −1 −1 −1 1 −1 −1 −1 −1 −1 1 1 1 −1 −1 1 −1 −1 1 −1 −1 −1 −1 1 −1 1 1 −1 −1 −1 1 −1 1 −1 −1 −1 1 −1 −1 1 1 −1 1 1 −1 1 1 −1 −1 1 −1 −1 −1 1 1 −1 1 −1 1 1 1 −1 −1 1 1 −1 1 1 1 −1 −1 1 1 1 1 −1 −1 1 1 1 −1 2 −1 1 −1 1 −1 −1 2 −1 1 −1 −1 1 −1 2 −1 1 −1 −1 −1 −1 2 −1 1 −1 −1 −1 1 2 −1 1 −1 1 −1 1 2 −1 1 1 1 1 −1 2 −1 1 −1 1 1 −1 2 −1 1 1 1 −1 1 2 −1 1 −1 1 1 1 2 −1 1 1 −1 1 1 2 −1 1 1 −1 1 1 2 −1 1 1 1 −1 −1 3 1 1 1 1 −1 −1 3 1 1 1 −1 1 −1 3 1 −1 −1 −1 −1 −1 3 1 1 1 −1 −1 1 3 1 −1 −1 1 −1 1 3 1 1 −1 1 1 −1 3 1 −1 −1 1 1 −1 3 1 1 −1 1 −1 1 3 1 −1 −1 1 1 1 3 1 1 −1 −1 1 1 3 1 −1 1 −1 1 1 3 1 −1 1 1 −1 −1 4 −1 −1 1 −1 1 −1 4 −1 −1 1 −1 −1 1 4 1 1 −1 −1 1 1 4 −1 −1 −1 1 1 −1 4 1 1 −1 1 −1 −1 4 −1 −1 −1 1 −1 1 4 1 1 1 −1 −1 −1 4 −1 −1 −1 −1 −1 −1 4 1 1 1 −1 −1 1 4 −1 −1 1 1 1 1 4 1 1 1 −1 1 −1 4 1 1 1 1 1 1 replication structure chosen in such a way that an EED results. The D-optimal design for the same second-order model is listed in Table 10 and dis- approach emphasizes a different characteristic of the played in Figure 4. It is clear that the D-optimal design and the resulting designs can be quite differ- design has traded split-plot and whole-plot center- ent. Our view is that all of these designs are excel- point replicates for treatment combinations at the lent, especially when viewed against the alternatives boundaries of the design space. Notice, for example, of running a CRD incorrectly or simply not running a that there are no center-point whole plots and that designed experiment. The particular choice depends all whole plots include corner points. Consistent with on the goals of the experiment. its criterion, the optimal-design approach is sacrific- In this subsection, we have emphasized the use of ing information on variance components for informa- the D-optimality criterion, although alternative cri- tion on factor effects. teria and computer-based approaches do exist. For These results indicate that there are currently a example, the ability to construct split-plot designs variety of approaches for obtaining split-plot designs. that are optimal by the average variance criterion (8) We have discussed the use of MA fractional facto- may be constructed, although general-purpose soft- rial designs, various combinatoric modifications to ware for doing so is not yet available. Such designs the MA designs, EED response surface designs, and would be particularly appropriate for estimating re- optimal split-plot designs. The results show that each sponse surfaces. In an alternative computer-based Vol. 41, No. 4, October 2009 www.asq.org
354 BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM TABLE 10. D-Optimal Split-Plot Response Surface Design Whole plot Z1 Z2 X1 X2 Whole plot Z1 Z2 X1 X2 1 −1 −1 −1 −1 7 1 −1 −1 −1 1 −1 −1 −1 1 7 1 −1 −1 1 1 −1 −1 1 −1 7 1 −1 0 1 1 −1 −1 1 1 7 1 −1 1 0 2 −1 1 −1 −1 8 1 1 −1 −1 2 −1 1 −1 1 8 1 1 −1 1 2 −1 1 1 −1 8 1 1 0 −1 2 −1 1 1 1 8 1 1 1 0 3 1 −1 −1 −1 9 −1 0 −1 −1 3 1 −1 −1 1 9 −1 0 −1 1 3 1 −1 1 −1 9 −1 0 0 0 3 1 −1 1 1 9 −1 0 1 1 4 1 1 −1 −1 10 1 0 −1 0 4 1 1 −1 1 10 1 0 0 −1 4 1 1 1 −1 10 1 0 1 −1 4 1 1 1 1 10 1 0 1 1 5 −1 −1 −1 1 11 0 −1 −1 −1 5 −1 −1 0 −1 11 0 −1 −1 0 5 −1 −1 1 0 11 0 −1 0 1 5 −1 −1 1 1 11 0 −1 1 −1 6 −1 1 −1 −1 12 0 1 −1 0 6 −1 1 −1 1 12 0 1 0 1 6 −1 1 0 0 12 0 1 1 −1 6 −1 1 1 −1 12 0 1 1 1 approach, Trinca and Gilmour (2001) presented a cated split-plot experiments—and move on to more strategy for constructing multistratum response sur- complex cases, including the analysis of unreplicated face designs (split plots are a special case of these). split-plot designs and a completely general approach, This is a multistage approach that moves from the based on restricted maximimum likelihood estima- highest stratum to the lowest, where, at each stage, tion (REML). designs are constructed to ensure near orthogonal- ity between strata. Weighted trace-optimal designs— Balanced, Replicated Split-Plot Designs not generally available—would be useful for robust- When full factorial designs are balanced and repli- product designs, because the rankings of effects uti- cated, ANOVA model (1) is fully applicable. The lized by Bingham and Sitter (2003) could be easily ANOVA table for the corrosion-resistance experi- reflected in the design-optimality criterion. ment is given in Table 11. Note that there are two replicates of each of the three whole-plot factor set- How to Analyze a Split-Plot Design tings, leading to three degrees of freedom for pure In this section, we review methods for the anal- whole-plot error. In contrast, the split-plot treat- ysis of split-plot designs. We begin with the sim- ments are not replicated within the whole plots and, plest case—the ANOVA approach for balanced, repli- for this reason, there are no degrees of freedom for Journal of Quality Technology Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW 355 From the corrosion-resistance ANOVA in Table (11), we obtain s2 = 124.5 and s2w = (4813.2 − 124.5)/4 = 1172.2. We note that, if MS(split-plot error) is greater than MS(whole-plot error), the ANOVA- based estimated variance component for whole-plot effects, s2w , will be negative. Procedures for con- structing confidence intervals for the variance compo- nents are detailed in standard texts (see, e.g., Kutner et al. (2005)). As noted previously, in the corrosion-resistance example, whole plots are replicated, but split plots are not. In this case, a pure-error estimate of whole- FIGURE 4. D-optimal Split-Plot Response Surface De- plot error is available, whereas the estimate of the sign Based on 12 Whole Plots. (Note: Figure shows layouts split-plot error is based on the whole-plot error by for two whole plots at each corner. split-plot factor interaction. In choosing a design, the experimenter must weigh the importance of true pure split-plot error. An assumption frequently made replicates at both levels of the experiment and their is that there is no whole-plot error by split-plot fac- implications for the power of the factor-effects tests tor (i.e., block-by-treatment) interaction and, from to be conducted. In industrial experiments, the cost this, we obtain 3 × 3 = 9 degrees of freedom for split- of whole plots frequently precludes the use of repli- plot error. From an examination of the EMS column, cates at the whole-plot level and methods for the we discern (by setting all fixed effects to zero in the analysis of unreplicated experiments are required. expressions) that the whole-plot error mean square We turn now to the analysis of unreplicated split- is the appropriate term for the denominator of the plot designs. F -test for the effect of temperature, and the sub- plot error mean square is the appropriate denomina- Unreplicated Two-Level Factorial Split-Plot tor for F -tests for the effects of coatings and for the Designs temperature-by-coating interaction. The general rule As is the case with completely randomized un- for testing factor effects is simple: whole-plot main ef- replicated two-level factorial experiments, an exper- fects are tested based on the whole-plot mean square imenter has a number of options for the analysis of error, while all other effects—split-plot main effects unreplicated split-plot two-level designs. If the design and split-plot by whole-plot interactions—are tested permits the estimation of higher order interactions using the split-plot mean square error. and if it can be safely assumed that these higher or- From the expected mean square column of the der interactions are not active, the estimates can be ANOVA table (Table 11), we note that pooled to provide unbiased estimates of experimental error at both levels of the experiment. Factor-effects E[MS(Whole-Plot Error)] = σ 2 + bσw 2 tests can then be conducted in standard fashion. An- E[MS(Split-Plot Error)] = σ 2 . other approach recommended by Daniel (1959) was It follows that the use of two half-normal (or normal) plots of ef- E[MS(Whole-Plot Error)] fects, one for whole-plot factor main effects and in- 2 σw = teractions and one for split-plot factor effects and b split-plot-factor by whole-plot-factor interactions. E[MS(Split-Plot Error)] − . b To illustrate these approaches, we use data from We obtain unbiased estimators of σ 2 and σw 2 by sub- a robust product design reported by Lewis et al. stituting observed for expected mean squares as (1997). This experiment has also been discussed by Bingham and Sitter (2003) and Loeppky and Sit- s2 = MS(Split-Plot Error) ter (2002). The experiment studied the effects of MS(Whole-Plot Error) customer usage factors on the time between fail- s2w = b ures of a wafer handling subsystem in an automated MS(Split-Plot Error) memory that relied on optical-pattern recognition − . (9) b to position a wafer for repair. The response was a Vol. 41, No. 4, October 2009 www.asq.org
You can also read