Split-Plot Designs: What, Why, and How

Page created by Clara Avila
 
CONTINUE READING
Split-Plot Designs:
                            What, Why, and How
                                                   BRADLEY JONES
                                              SAS Institute, Cary, NC 27513

                                        CHRISTOPHER J. NACHTSHEIM
                Carlson School of Management, University of Minnesota, Minneapolis, MN 55455

             The past decade has seen rapid advances in the development of new methods for the design and analysis
         of split-plot experiments. Unfortunately, the value of these designs for industrial experimentation has not
         been fully appreciated. In this paper, we review recent developments and provide guidelines for the use of
         split-plot designs in industrial applications.

         Key Words: Block Designs; Completely Randomized Designs; Factorial Designs; Fractional Factorial De-
         signs; Optimal Designs; Random-Effects Model; Random-Run-Order Designs; Response Surface Designs;
         Robust Parameter Designs.

                    Introduction                                      ments previously thought to be completely random-
                                                                      ized experiments also exhibit split-plot structure.
    All industrial experiments are split-plot experiments.            This surprising result adds creedence to the Daniel
                                                                      proclamation and has motivated a great deal of pi-

  T    HISprovocative remark has been attributed to
       the famous industrial statistician, Cuthbert
Daniel, by Box et al. (2005) in their well-known text
                                                                      oneering work in the design and analysis of split-
                                                                      plot experiments. We note in particular the identi-
                                                                      fication of minimum-aberration split-plot fractional
on the design of experiments. Split-Plot experiments                  factorial designs by Bingham and Sitter and cowork-
were invented by Fisher (1925) and their importance                   ers (Bingham and Sitter (1999, 2001, 2003), Bingham
in industrial experimentation has been long recog-                    et al. (2004), Loeppky and Sitter (2002)), the devel-
nized (Yates (1936)). It is also well known that many                 opment of equivalent-estimation split-plot designs for
industrial experiments are fielded as split-plot exper-                response surface and mixture designs (Kowalski et
iments and yet erroneously analyzed as if they were                   al. (2002), Vining et al. (2005), Vining and Kowal-
completely randomized designs. This is frequently                     ski (2008)), and the development of optimal split-
the case when hard-to-change factors exist and eco-                   plot designs by Goos, Vandebroek, and Jones and
nomic constraints preclude the use of complete ran-                   their coworkers (Goos (2002), Goos and Vandebroek
domization. Recent work, most notably by Lucas and                    (2001, 2003, 2004), Jones and Goos (2007 and 2009)).
his coworkers (Anbari and Lucas (1994), Ganju and
Lucas (1997, 1999, 2005), Ju and Lucas (2002), Webb                      Our goal here is to review these recent develop-
et al. (2004)) has demonstrated that many experi-                     ments and to provide guidelines for the practicing
                                                                      statistician on their use.

    Dr. Jones is Director of Research and Development for                   What Is a Split-Plot Design?
the JMP Division of SAS Institute, Inc. His email address
is bradley.jones@sas.com.                                                In simple terms, a split-plot experiment is a
    Dr. Nachtsheim is the Curtis L. Carlson Professor and             blocked experiment, where the blocks themselves
Chair of the Operations and Management Science Department             serve as experimental units for a subset of the factors.
at the Carlson School of Management of the University of Min-         Thus, there are two levels of experimental units. The
nesota. His email address is nacht001@umn.edu.                        blocks are referred to as whole plots, while the experi-

Journal of Quality Technology                                   340                               Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                              341

                                                                        TABLE 1. Split-Plot Design and Data for
                                                                     Studying the Corrosion Resistance of Steel Bars
                                                                                   (Box et al. (2005))

                                                                               Temperature             Coating
                                                                Whole-plot        (◦ C)           (randomized order)

                                                                     1.             360         C2     C3     C1       C4
                                                                                                73     83     67       89

                                                                     2.             370         C1     C3     C4       C2
                                                                                                65     87     86       91
   FIGURE 1. Split Plot Agricultural Layout. (Factor A is
the whole-plot factor and factor B is the split-plot factor.)        3.             380         C3     C1    C2        C4
                                                                                                147    155   127       212
mental units within blocks are called split plots, split
units, or subplots. Corresponding to the two levels of               4.             380         C4     C3    C2        C1
experimental units are two levels of randomization.                                             153    90    100       108
One randomization is conducted to determine the
assignment of block-level treatments to whole plots.                 5.             370         C4     C1    C3        C2
Then, as always in a blocked experiment, a random-                                              150    140   121       142
ization of treatments to split-plot experimental units
occurs within each block or whole plot.                              6.             360         C1     C4     C2       C3
                                                                                                33     54     8        46
   Split-plot designs were originally developed by
Fisher (1925) for use in agricultural experiments. As
a simple illustration, consider a study of the effects
of two irrigation methods (factor A) and two fertiliz-          to-change factor. The experiment was designed to
ers (factor B) on yield of a crop, using four available         study the corrosion resistance of steel bars treated
fields as experimental units. In this investigation, it          with four coatings, C1 , C2 , C3 , and C4 , at three fur-
is not possible to apply different irrigation methods            nace temperatures, 360◦ C, 370◦ C, and 380◦ C. Fur-
(factor A) in areas smaller than a field, although dif-          nace temperature is the hard-to-change factor be-
ferent fertilizer types (factor B) could be applied in          cause of the time it takes to reset the furnace and
relatively small areas. For example, if we subdivide            reach a new equilibrium temperature. Once the equi-
each whole plot (field) into two split plots, each of the        librium temperature is reached, four steel bars with
two fertilizer types can be applied once within each            randomly assigned coatings C1 , C2 , C3 , and C4 are
whole plot, as shown in Figure 1. In this split-plot            randomly positioned in the furnace and heated. The
design, a first randomization assigns the two irriga-            layout of the experiment as performed is given in Ta-
tion types to the four fields (whole plots); then within         ble 1. Notice that each whole-plot treatment (tem-
each field, a separate randomization is conducted to             perature) is replicated twice and that there is just
assign the two fertilizer types to the two split plots          one complete replicate of the split-plot treatments
within each field.                                               (coatings) within each whole plot. Thus, we have six
                                                                whole plots and four subplots within each whole plot.
   In industrial experiments, factors are often differ-
                                                                  Other illustrative examples of industrial exper-
entiated with respect to the ease with which they can
                                                                iments having both easy-to-change and hard-to-
be changed from experimental run to experimental
                                                                change factors include
run. This may be due to the fact that a particular
treatment is expensive or time-consuming to change,               1. Box and Jones (1992) describe an experiment
or it may be due to the fact that the experiment is                  in which a packaged-foods manufacturer wished
to be run in large batches and the batches can be                    to develop an optimal formulation of a cake
subdivided later for additional treatments. Box et                   mix. Because cake mixes are made in large
al. (2005) describe a prototypical split-plot experi-                batches, the ingredient factors (flour, shorten-
ment with one easy-to-change factor and one hard-                    ing, and egg powder) are hard to change. Many

Vol. 41, No. 4, October 2009                                                                                 www.asq.org
342                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

      packages of cake mix are produced from each           the ANOVA sums of squares are orthogonal, a stan-
      batch, and the individual packages of cake mix        dard, mixed-model, ANOVA-based approach to the
      can be baked using different baking times and          analysis is possible. This is the approach that is taken
      baking temperatures. Here, the easy-to-change,        by most DOE textbooks that treat split-plot de-
      split-plot factors are time and temperature.          signs (see, e.g., Kutner et al. (2005), Montgomery
  2. Another example involving large batches and            (2008), Box et al. (2005)). With this approach, all
     the subsequent processing of subbatches was            experimental factor effects are assumed to be fixed.
     discussed by Bingham and Sitter (2001). In this        The standard ANOVA model for the balanced two-
     instance, a company wished to study the ef-            factor split-plot design, where there are a levels of
     fect of various factors on the swelling of a wood      the whole-plot factor A and b levels of the split-plot
     product after it has been saturated with water         treatment B, where each level of whole-plot factor A
     and allowed to dry. A batch is characterized           is applied to c whole plots (so that the whole plot ef-
     by the type and size of wood being tested, the         fect is nested within factor A), and where the number
     amounts of two additives used, and the level           of runs is n = abc, is given by
     of moisture saturation. Batches can later be              Yijk = μ... + αi + βj + (αβ)ij + γk(i) + εijk ,               (1)
     subdivided for processing, which is character-
     ized by three easy-to-change factors, i.e., pro-       where: μ... is a constant; αi , the a whole-plot          treat-
     cess time, pressure, and material density.             ment effects, are constants subject to                αi = 0; βj ,
                                                            the b split-plot
                                                                            treatment effects, are constants sub-
  3. Bisgaard (2000) discusses the use of split-plot
                                                            ject to     βj = 0; (αβ)ij ,the ab interaction effects,
     arrangements in parameter design experiments.
                                                            are constants subject to i (αβ)ij = 0 for all j and
                                                            
     In one example, he describes an experiment in-
     volving four prototype combustion engine de-              j (αβ)ij = 0 for all i; γk(i) , the nw = ac whole-plot
                                                                                                 2
     signs and two grades of gasoline. The engineers        errors, are independent N (0, σw       ); εijk are indepen-
                                                                         2
     preferred to make up the four motor designs            dent N (0, σ ); i = 1, . . . , a, j = 1, . . . , b, k = 1, . . . , c.
     and test both of the gasoline types while a par-       The ANOVA model (1) is appropriate for balanced
     ticular motor was on a test stand. Here the            factorial and fractional factorial designs. When bal-
     hard-to-change factor is motor type and the            ance is not present due to, e.g., missing observations
     easy-to-change factor is gasoline grade.               or a more complex design structure, such as a re-
                                                            sponse surface experiment, more general methods are
   The analysis of a split-plot experiment is more          required for analysis. For that reason, we prefer a
complex than that for a completely randomized ex-           more general, regression-based modeling approach.
periment due to the presence of both split-plot and            The regression model for a split-plot experiment is
whole-plot random errors. In the Box et al. corrosion-      a straightforward extension of the standard multiple
resistance example, a whole-plot effect is introduced        regression model, with a term added to account for
with each setting or re-setting of the furnace. This        the whole-plot error. Let wi = (wi1 , wi2 , . . . , wimw )
may be due, e.g., to operator error in setting the tem-     and si = (si1 , si2 , . . . , sims ) denote, respectively, the
perature, to calibration error in the temperature con-      settings of the mw whole-plot factors and the settings
trols, or to changes in ambient conditions. Split-plot      of the ms split plot factors for the ith run. The linear
errors might arise due to lack of repeatability of the      statistical model for the ith run is
measurement system, to variation in the distribution
of heat within the furnace, to variation in the thick-                      yi = f  (wi , si )β + zi  γ + εi ,             (2)
ness of the coatings from steel bar to steel bar, and
                                                            where f  (wi , si ) gives the (1 × p) polynomial model
so on. We will assume that the whole-plot errors are
                                                     2      expansion of the factor settings, β is the p × 1 pa-
independent and identically distributed as N(0, σw     ),
                                                            rameter vector of factor effects, zi is an indicator
that the split-plot errors are independent and iden-
                                                            vector whose kth element is one when the ith run
tically distributed as N(0, σ 2 ), and that whole-plot
                                                            was assigned to the kth whole plot and zero if not,
errors and the split-plot errors are mutually indepen-
                                                            γ is a b × 1 vector of whole-plot random effects, and
dent. Randomization guarantees that the whole-plot
                                                            εi is the split-plot error. We prefer the regression ap-
errors are mutually independent and that the split-
                                                            proach for its simplicity and flexibility. It permits the
plot errors are mutually independent within whole
                                                            use of polynomial terms for continuous or quantita-
plots.
                                                            tive treatments, as well as the use of indicators for
  When the split-plot experiment is balanced and            qualitative predictors.

Journal of Quality Technology                                                                 Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                           343

   In matrix form, model (2) can be expressed as                reset (e.g., by changing its level and then changing
                                                                it back), inadvertent split plotting occurs. Such de-
                   Y = Xβ + Zγ + ε,                      (3)
                                                                signs, termed random run order (RRO) designs by
where Y is the n × 1 vector of responses, X is the              Ganju and Lucas (1997) are widely used and always
n × p model matrix with ith row given by f  (wi , si ),        analyzed inappropriately. We consider this issue in
Z is an n × b matrix with ith row equal to zi , and            greater detail, under “Validity” later, and discuss
ε = (ε1 , . . . , εn ) . We assume that ε ∼ N(0n , σ 2 In ),   here the relative cost of split-plot experiments and
γ ∼ N(0c , σw     2
                    Ic ), and Cov(ε, γ ) = 0c×n , where In      true CRDs. Simple accounting tells us that the cost
denotes an n × n identity matrix.                               of a split-plot experiment can be significantly less
                                                                because much of the cost (or time required) of run-
   Our model assumptions imply that the variance–
                                                                ning a split-plot experiment is tied to changes in the
covariance matrix of the response vector Y can be
                                                                hard-to-change factors. If the cost of setting a hard-
written as
                                                                to-change factor is Cw and the cost of setting an
               V = σ 2 In + σ w
                              2
                                ZZ ,         (4)
                                                                easy-to-change factor is Cs , and if there are nw whole
where In denotes the n×n identity matrix. If the runs           plots and ns split plots, the CRD cost is ns (Cw +Cs ),
are grouped by whole plots, the variance–covariance             whereas the SPD cost is nw Cw +ns Cs , so that the ad-
matrix is block-diagonal,                                       ditional cost for the CRD is (ns − nw )Cw . Split-plot
                 ⎡                    ⎤
                   V1 0 · · · 0                                 designs are useful precisely because this additional
                 ⎢ 0 V2 · · · 0 ⎥                               cost can be substantial. Anbari and Lucas (1994,
            V=⎢  ⎣ ...    ..       .. ⎥ .
                                    . ⎦
                             ..                                 2008) contrast the costs of CRDs and SPDs for vari-
                           .    .
                                                                ous design scenarios. Bisgaard (2000) also discussed
                      0       0    · · · Vb
                                                                cost functions.
The ith matrix on the diagonal is given by
                V i = σ 2 I si + σ w
                                   2
                                     1si 1si ,                 Efficiency

where si is the number of observations in the ith                  Split-plot experiments are not just less expen-
whole plot and 1si is an si × 1 vector of ones.                 sive to run than completely randomized experiments;
   For statistical inference, we are interested primar-         they are often more efficient statistically. Ju and Lu-
ily in the estimation of the fixed effects parameter              cas (2002) show that, with a single hard-to-change
vector β and the variance components σ 2 and σw      2
                                                       .        factor or a single easy-to-change factor, use of a split-
There are several alternative methods available for             plot layout leads to increased precision in the esti-
estimation of model parameters for this general ap-             mates for all factor effects except for whole-plot main
proach. Recommendations for estimation and testing              effects. Overall measures of design efficiency have
are taken up in the later section “How to analyze a             been considered by Anbari and Lucas (1994) and
split-plot design.”                                             Goos and Vandebroek (2004). Considering two-level,
                                                                full factorial designs only, Anbari and Lucas (1994)
       Why Use Split-Plot Designs?                              showed that the maximum variance of prediction
                                                                over a cuboidal design region for a completely ran-
   In this paper, we advocate greater consideration             domized design in some instances is greater than that
of split-plot experiments for three reasons: cost, effi-          for a blocked split-plot design involving one hard-
ciency, and validity. In this section, we discuss each          to-change factor. In other words, the G-efficiency of
of these advantages in turn.                                    the split-plot design can, at times, be higher than
                                                                the corresponding CRD. Similarly, Goos and Van-
Cost
                                                                debroek (2001, 2004) demonstrated that the deter-
   The cost of running a set of treatments in split-            minant of a D-optimal split-plot design (discussed
plot order is generally less than the cost of the same          later) frequently exceeds that of the corresponding
experiment when completely randomized. As Ganju                 D-optimal completely randomized design. These re-
and Lucas (1997, 1998, 2005) have noted, a properly             sults indicate that running split-plot designs can rep-
implemented completely randomized design (CRD)                  resent a win–win proposition: less cost with greater
requires that all factors must be independently re-             overall precision. As a result, many authors recom-
set with each run. If a factor level does not change            mend the routine use of split-plot layouts, even when
from one run to the next and the factor level is not            a CRD is a feasible alternative.

Vol. 41, No. 4, October 2009                                                                                www.asq.org
344                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

Validity                                                    sions based on the split-plot analysis are exactly the
                                                            opposite.
   Completely randomized designs are prescribed fre-
quently in industry, but are typically not run as such         One way to avoid these mistakes is to plan the
in the presence of hard-to-change factors. One of two       experiment as a split-plot design in the first place.
shortcuts may be taken by the experimenter in the           Not only does this avoid mistakes, it also leverages
interest of saving time or money, i.e.,                     the economic and statistical efficiencies already de-
  1. A random-run-order (RRO) design—alternate-             scribed. Interestingly, it is not necessarily the case
     ly referred to as a random-not-reset (RNR)             that an RRO experiment is to be universally avoided.
     design—is utilized. As noted above, in this case,      Webb et al. (2004) discuss comparisons of predic-
     the order of treatments is randomized, but the         tion properties of experiments that are completely
     factor levels are not reset with each run of the       randomized versus experiments that have one or
     experiment, or perhaps all factors, except the         more factors that are not reset. They argue that, in
     hard-to-change factors are reset. Not resetting        process-improvement experiments, the cost savings
     the levels of the hard-to-change factor leads to       that accrues from not resetting each factor on every
     a correlation between adjacent runs. Statistical       run may at times justify the use of an RRO design,
     tests that do not consider these correlations will     even though prediction variances will be inflated and
     be biased (Ganju and Lucas (1997)).                    tests will be biased.
  2. Although a random run order is provided to                In general, we believe that considerations about
     the experimenter, the experimenter may decide          the presence of hard-to-change factors and their po-
     to resort the treatment order to minimize the          tential impact on the randomization employed should
     number of changes of the hard-to-change factor.        be a part of the planning of every industrial ex-
     Although this leads to a split-plot design, the        periment. That is, asking questions about necessary
     results are typically analyzed as if the design        groupings of experimental runs is something that a
     had been fielded as a completely randomized             consultant should do regularly. This is in contrast
     experiment.                                            with the usual standard advice dispensed by con-
                                                            sultants and short courses, that designs should al-
   Either scenario leads to an incorrect analysis. The
                                                            ways be randomized. Finally, to ensure the validity
Type I error rate for whole plot factors increases,
                                                            of the experiment and the ensuing analysis, the ex-
as does the Type II error rate for split-plot factors
                                                            periment should be carefully monitored to verify that
and whole-plot by split-plot interactions (see, e.g.,
                                                            whole-plot and split-plot factor settings are reset as
Goos et al. (2006)). The corrosion-resistance exam-
                                                            required by the design.
ple of Table 1 is a good example. The table below
provides p-values for tests of effects in the corrosion-
resistance example for the standard (incorrect) anal-
                                                              How to Choose a Split-Plot Design
ysis and the split-plot analysis. At the α = .05 level of      Choosing a “best” split-plot design for a given
significance, the p-values for the completely random-        design scenario can be a daunting task, even for a
ized analysis lead to the conclusion that the temper-       professional statistician. Facilities for construction of
ature main effect is statistically significant, whereas       split-plot designs are not as yet generally available in
the coating and temperature-by-coating interaction          software packages (with SAS/JMP being one excep-
effects are not statistically significant. The conclu-        tion). Moreover, introductory textbooks frequently
                                                            consider only very simple, restrictive situations. Un-
                                                            fortunately, in many cases, obtaining an appropriate
      p-Values for corrosion-resistance example             design requires reading (and understanding) a rele-
                                                            vant recent research paper or it requires access to
                                Split-plot    Standard      software for generating the design. To make matters
           Effect                 analysis     Analysis      even more difficult, construction of split-plot exper-
                                                            iments is an active area of research and differences
Temperature (A)                  .209          .003        in opinion about the value of the various recent ad-
Coating (B)                      .002         .386         vances exist. In what follows, we discuss the choice of
Temp × Coating (A × B)           .024         .852         split-plot designs in five areas: (1) two-level full facto-
                                                            rial designs; (2) two-level fractional factorial designs;

Journal of Quality Technology                                                            Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                            345

(3) mixture and response surface designs; (4) split-             TABLE 2. Full-Factorial Split-Plot Design with
plot designs for robust product experiments; and (5)                Split-Plot Blocking (in standard order)
optimal designs.
                                                                                    Whole plot           Split plot
Full Factorial Split-Plot Designs
                                                           Whole plot number        A        B       p      q         r
   Construction of single replicates of a two-level full
factorial designs is straightforward. In this case, a                1              −1      −1      −1      −1    −1
completely randomized design in the whole-plot fac-                                 −1      −1       1       1    −1
tors is conducted and, within each whole plot, a                                    −1      −1       1      −1     1
completely randomized design in the split-plot fac-                                 −1      −1      −1       1     1
tors is also conducted. Note that there are nw + 1
separate randomizations. The disadvantage of run-                    2              −1      −1       1      −1    −1
ning the split-plot in a single replicate is the same                               −1      −1      −1       1    −1
as that for a completely randomized design: there                                   −1      −1      −1      −1     1
are no degrees of freedom for error, either at the                                  −1      −1       1       1     1
whole-plot level or the split-plot level, and so tests
for various effects cannot be obtained without assum-                 3                1     −1      −1      −1    −1
ing that various higher order interactions are negli-                                 1     −1       1       1    −1
gible. One design-based solution to this is to fully                                  1     −1       1      −1     1
replicate the experiment, although this may be ex-                                    1     −1      −1       1     1
pensive. Intermediate alternatives include replicat-
ing the split-plots within a given whole plot (if the                4                1     −1       1      −1    −1
size of the whole plot permits), giving an estimate                                   1     −1      −1       1    −1
of the split-plot error variance, or the use of classi-                               1     −1      −1      −1     1
cal split-plot blocking (as advocated by Ju and Lu-                                   1     −1       1       1     1
cas (2002)), where the split-plot treatments are run
in blocks and the whole-plot factor level is reset for
                                                                     5              −1        1     −1      −1    −1
each block. This increases the number of whole plots
                                                                                    −1        1      1       1    −1
(while decreasing the number of split-plots per whole
                                                                                    −1        1      1      −1     1
plot), providing an estimate of whole-plot error.
                                                                                    −1        1     −1       1     1
   An example of a blocked full-factorial split-plot
                                                                     6              −1        1      1      −1    −1
design is given in Table 2. For this example, there
                                                                                    −1        1     −1       1    −1
are two hard-to-change factors, A and B, and three
                                                                                    −1        1     −1      −1     1
easy-to-change factors p, q, and r. Run as an un-
                                                                                    −1        1      1       1     1
blocked full-factorial split-plot design, there would
be 22 = 4 whole plots with each whole plot consist-
ing of the 23 = 8 treatment combinations involving                   7                1       1     −1      −1    −1
p, q, and r. Here we use b = pqr as the block gen-                                    1       1      1       1    −1
erator to divide the eight split-plot treatment com-                                  1       1      1      −1     1
binations in each whole plot into two blocks of size                                  1       1     −1       1     1
four. Each block is now treated as a separate whole-
plot, thereby increasing the number of whole plots                   8                1       1      1      −1    −1
from four to eight. An abbreviated ANOVA table                                        1       1     −1       1    −1
is shown in Table 3. (Note that the four degrees of                                   1       1     −1      −1     1
freedom for whole-plot error are obtained by pooling                                  1       1      1       1     1
each of the four whole-plot (= pqr) effects that are
nested within the levels of A and B.) The fact that
the number of whole plots has doubled will increase
                                                           Fractional Factorial Split-Plot Designs
the cost of the experiment. It is also important to
reset the whole-plot factor level between whole plots         If a full factorial split-plot design is too large, con-
even if level does not change.                             sideration of a fractional factorial split-plot (FFSP)

Vol. 41, No. 4, October 2009                                                                              www.asq.org
346                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

TABLE 3. ANOVA (Source and DF) for Design in Table 1         to designs that are obtained by crossing as Cartesian-
                                                             product designs, and we adopt this terminology in
              Source                         DF              what follows. The use of Cartesian-product designs
                                                             is a simple strategy that is frequently used, but it
         A                                   1               is not guaranteed to give the FFSP design of max-
         B                                   1               imum resolution. An alternative approach is to use
         AB                                  1               the technique of split-plot confounding.
         Whole plot error                    4                  To illustrate the concept of split-plot confounding,
                                                             we borrow from an example provided by Bingham
         p                                   1               and Sitter (1999). Assume that there are four whole-
         q                                   1               plot factors, A, B, C, and D, and three split-plot
         r                                   1               factors, p, q and r, resources permit at most 16 runs,
         pq                                  1               and we require a design of resolution III or greater.
         pr                                  1               Employing a Cartesian-product design, the highest
         qr                                  1               resolution possible is obtained by crossing a resolu-
         Ap                                  1               tion IV 24−1 design in the whole plots (based on the
         Aq                                  1               generator D = ABC ) with a 23−2 design in the split
         Ar                                  1               plots (based, for example, on the split-plot genera-
         Apq                                 1               tors q = p and r = p). This leads to a 2(4+3)−(1+2)
         Apr                                 1               FFSP with the following defining contrast subgroup:
         Aqr                                 1
         Bp                                  1                  I = ABCD = pq = pr = ABCDpq = ABCDpr
         Bq                                  1                    = qr = ABCDqr,
         Br                                  1
         Split-plot error                    9               which is resolution II. A better design can be con-
            Bpq                                              structed using split-plot confounding (Kempthorne
            Bpr                                              (1952)) by including whole-plot factors in the split-
            Bqr                                              plot factorial generators. In the current example, if
            ABp                                              we use D = ABC as the whole-plot design generator
            ABq                                              and q = BC p and r = AC p as the split-plot design
            ABr                                              generators, the defining contrast subgroup is
            ABpq
            ABpr                                               I = ABCD = BC pq = AC pr = CDpq = BDpr
            ABqr                                                 = BC qr = ADqr,

         Total                              31               which is resolution IV.
                                                                Huang et al. (1998, pp. 319-322) give tables of
                                                             minimum aberration (MA) FFSP designs having five
design may be warranted. As noted by Bisgaard                through fourteen factors for 16, 32, and 64 runs and
(2000), there are generally two approaches to con-           these designs were extended by Bingham and Sitter
structing FFSPs. The first is to (1) select a 2p1 −k1         (1999).
fractional factorial design in the p1 whole-plot fac-
tors of resolution r1 (or a full factorial if k1 = 0);          To illustrate the construction of an FFSP design
(2) select a 2p2 −k2 fractional factorial design of res-     using the tables of Huang et al., consider again the
olution r2 in the p2 split-plot factors (or a full fac-      previous example involving four whole-plot factors
torial if k2 = 0); and then (3) cross these designs to       and three split-plot factors, to be fielded in 16 runs.
obtain a 2(p1 +p2 )−(k1 +k2 ) fractional factorial design.   In this case, we have n1 = 4, n2 = 3, k1 = 1, and k2
(This design will be a fractional factorial of resolu-       = 2. The “word-length pattern/generators” for this
tion min(r1 , r2 ), as long as k1 + k2 > 0). By cross-       case are provided on line 9 of Table 1 (Huang et al.
ing, we simply mean that every whole-plot treatment          (1998), p. 319) and are ABCD, BCpq, and ACpr.
combination is run in combination with every split-          Hence, we have D = ABC, q = BCp, and r = ACp.
plot treatment combination. Bisgaard (2000) refers           The independent columns are A, B, C, and p. Most

Journal of Quality Technology                                                           Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                        347

                                                                 TABLE 4. Minimum Aberration SPFF Design

                                                           Whole plot     A     B     C     D     p     q      r

                                                                1        −1    −1    −1    −1    −1     −1    −1
                                                                         −1    −1    −1    −1     1      1     1

                                                                2        −1    −1      1     1   −1      1     1
                                                                         −1    −1      1     1    1     −1    −1

                                                                3        −1      1   −1      1    1     −1     1
                                                                         −1      1   −1      1   −1      1    −1

                                                                4        −1      1     1   −1     1      1    −1
                                                                         −1      1     1   −1    −1     −1     1

                                                                5          1   −1    −1      1   −1     −1     1
                                                                           1   −1    −1      1    1      1    −1

                                                                6          1   −1      1   −1    −1      1    −1
                                                                           1   −1      1   −1     1     −1     1

                                                                7          1     1   −1    −1     1     −1    −1
   FIGURE 2. Selecting Generators for Minimum Aberration
                                                                           1     1   −1    −1    −1      1     1
Split-Plot Fractional Factorial Design.
                                                                8          1     1     1     1   −1     −1    −1
                                                                           1     1     1     1    1      1     1
standard software packages for the design of experi-
ments permit the user to specify generators or, equiv-
alently, a preferred defining contrast subgroup. Fig-
                                                           32-run designs provided by Huang et al. (1998). He
ure 2 illustrates this step for SAS/JMP. The resulting
                                                           starts with a 16-run design that uses fewer whole
design is displayed in standard order in Table 4.
                                                           plots than those in the minimum aberration tables,
   While the tables in Huang et al. (1998) and Bing-       and then adds 8 runs to this design using semi-folding
ham and Sitter (1999) are useful, they are by no           to break various alias chains. He shows that the 24-
means exhaustive, and a number of papers have aug-         run design is a good compromise between the 16-
mented these results. One limitation of these designs      and 32-run designs. We consider these designs in a
is that the whole-plot designs were not replicated.        bit more detail in our discussion of optimal split-plot
Bingham et al. (2004) developed methods for split-         designs below.
ting the whole plots (as discussed above in the case of
a full factorial split-plot design), thereby increasing
                                                           Split-Plot Designs for Robust Product
the number of whole plots while decreasing the num-
                                                           Experiments
ber of split plots per whole plot. McLeod and Brew-
ster (2004) developed three methods for constructing          The purpose of robust product experiments is to
blocked minimum aberration 2-level fractional fac-         identify settings of product-design factors, sometimes
torial split-plot designs, for n = 32 runs. McLeod         termed product-design parameters that lead to prod-
and Brewster (2008) also developed foldover plans          ucts or processes that are insensitive to the effects of
for two-level fractional factorial designs. Relatedly,     uncontrollable environmental factors. As illustrated
Almimi et al. (2008) develop follow-up designs to re-      by Box and Jones (1992), product-design parameters
solve confounding in split-plot experiments. Kowal-        for a cake mix might be the amount of flour, short-
ski (2002) provides additional SPFF designs with 24        ening, and egg powder contained in the mix. Un-
runs, in an effort to fill the gap between the 16- and       controllable environmental factors include the baking

Vol. 41, No. 4, October 2009                                                                          www.asq.org
348                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

time and baking temperature used by the consumer               for either of alternatives (1) and (2). We refer
to produce the cake. These ideas were introduced               the reader to their paper for further details.
by Taguchi (see, e.g., Taguchi (1987)), who recom-
mended the use of Cartesian product experiments to           Of course, while alternatives (1) and (2) arise fre-
determine the optimal settings of the design parame-      quently, environmental- and product-design factors
ters. Specifically, Taguchi employed a designed exper-     need not be of one type or the other. The case
iment in the design factors—the inner array, a second     may arise in which the set of hard-to-change factors
designed experiment for the environmental factors—        is comprised of both environmental- and product-
the outer array, and these designs were crossed to        design factors. Generally, in robust product experi-
form the robust product experiment. A “best” set-         ments, however, interest centers on the design-factor
ting of the product design factors is one that leads to   effects and the design-factor × environmental-factor
an average response closest to the product designer’s     interactions. Main effects of the environmental fac-
target, and that also exhibits little variation as the    tors are of secondary interest, as they have no bear-
environmental factors are changed.                        ing on determination of best settings for the design
                                                          factors. For this reason, Box and Jones (1992) con-
   Unfortunately, a barrier to the widespread use         clude that environmental factors will be more appro-
of crossed-array parameter-design experiments has         priately used as the whole-plot treatment.
been cost. Clearly, if n1 is the number of runs in the
design-factors array and n2 is the number of runs            A number of authors have explored the question
in the environmental-factors array, then the required     of how to choose a “best” split-plot design for a given
number of runs—and therefore the number of factor         set of design and environmental factors. Bisgaard
resettings—is n = n1 × n2 . Thus, the size of the the     (2000) considered the use of 2p1 −k1 × 2p2 −k2 frac-
required robust parameter CRD will be prohibitive         tional factorial split-plot experiments, where there
unless both n1 and n2 are small. Recently, various        are p1 whole-plot factors and p2 split-plot factors. He
authors, beginning with Box and Jones (1992), have        began by considering Cartesian-product designs, as
demonstrated how split-plot designs can be used ef-       advocated by Taguchi, and showed how to determine
fectively to reduce the cost of robust product exper-     the defining contrast subgroup for the combined ar-
iments when there are hard-to-change factors. Box         ray as a function of the defining contrast subgroups
and Jones (1992) discuss three ways in which split-       for the inner and outer arrays. He then noted that
plot designs might arise in robust product experi-        split-plot confounding can lead to better designs and
ments:                                                    gave rules for implementing split-plot confounding.

  1. The environmental factors are hard to change            The problem of finding a “best” split-plot frac-
     while the product-design factors are easy to         tional factorial design for robust product design was
     change. In the context of the cake-mix exam-         taken up later by Bingham and Sitter (2003). They
     ple, suppose that the cake mixes can be eas-         began by noting that their own tables of MA SPFF
     ily formulated as individual packages, but the       designs (Bingham and Sitter (1999)) and those de-
     ovens are large and it takes a good deal of time     veloped previously by Huang et al. (1998) were not
     to reset the temperature. In this case, the time     directly applicable to product-design experiments.
     for the experiment may be minimized by using         Let C denote any (controllable) product-design fac-
     time and temperature as whole-plot factors.          tor and let N denote any environmental (noise) fac-
                                                          tor. If we ignore the robust design aspect of the
  2. The product-design factors are hard to change        design problem, the importance of the various fac-
     and the environmental factors are easy to            tors (of third order or less) to the investigator will
     change. In the context of the cake-mix example,      typically be ranked in descending order of likely
     it may be that the cake mixes must be made           significance—main effects, followed by second-order
     in large batches, which can be subdivided for        effects, followed by third-order effects—as shown in
     baking at the different conditions indicated by       the top panel of Table 5. However, in a product-
     the environmental-factor (outer) array. Thus,        design experiment, we are less interested in main ef-
     we want to minimize the number of required           fects of the noise factors, and Bingham and Sitter
     batches (whole plots).                               (2003) argue that a more relevant ranking of fac-
  3. In this alternative, Box and Jones (1992) con-       tor effects is as shown in the bottom panel of Table
     sidered the use of strip-plot experiments, which     5. The key point is that the importance ranking of
     can further reduce the cost of the experiment        the control-by-noise interactions has been increased

Journal of Quality Technology                                                        Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                        349

    TABLE 5. Ranking of Factor Effects in Standard        Notice that, from Equation (4), we have
         vs. Parameter Design Experiments
                                                                          V = σ 2 In + σ w
                                                                                         2
                                                                                           ZZ
Importance                                                                     = σ 2 (In + dZZ ),             (6)
  ranking                      Effect type                             2
                                                         where d = σw   /σ 2 is the variance ratio—the ratio of
                 Standard experiment                     the whole-plot variance component to the split-plot
                                                         variance component. From Equations (5) and (6), it
     1.       C, N                                       follows that
     2.       C × C, N × N , C × N                                                                       
     3.       C × C × C, C × C × N , C × N × N ,              CD (X, Z, V) = (σ 2 )−p X (In + dZZ )−1 X
                                                                                                  
                N ×N ×N                                                     ∝ X (In + dZZ )−1 X ,       (7)
             Robust parameter experiment                 which indicates that the determinant criterion de-
                                                         pends on the unknown variance components only
     1.       C, N                                       through their ratio, d. The absolute magnitudes are
     2.       C ×N                                       not required for comparing designs. Two alternative
     3.       C × C, N × N                               designs, [X1 , Z1 ] and [X2 , Z2 ], can be compared by
     4.       C × C × N, C × N × N                       computing their relative efficiencies. The relative D-
     5.       C × C × C, N × N × N                       efficiency of design [X1 , Z1 ], relative to the design
                                                         [X2 , Z2 ] is given by

relative to other effects of the same order, consistent         RED ([X1 , Z1 ], [X2 , Z2 ])
                                                                                                   1/p
with the experimenter’s objectives in robust prod-                      X1  (In + dZ1 Z1  )−1 X1 
uct experiments. As noted by Bingham and Sitter                    =                                   .
                                                                          X2 (In + dZ2 Z2  )−1 X2 
(2003), the low-order noise main effects must be re-
tained because they are likely to be significant, and     The relative efficiency gives the number times the
it is important to keep other effects from becoming       design [X2 , Z2 ] must be replicated in order to achieve
aliased with them. They then modify the definition        the overall precision of design [X1 , Z1 ].
of aberration to reflect the new importance rankings
and use their search algorithm (Bingham and Sitter          Letsinger et al. (1996) also gave the integrated
(1999)) to produce two tables of MA FFSP designs         variance of prediction criterion (IV) for split-plot de-
for robust product design experiments. The first ta-      signs as
ble is for use when the product-design factors are       CIV (X, Z, V)
hard to change, the second is for applications when                     
                                                                   σ2
the environmental factors are hard to change. These           =           f  (x)[X (In + dZZ )−1 X]−1 f  (x)dx
tables are useful for 16- and 32-run experiments and                 dx  R
                                                                   R
                                                                       
for total numbers of factors ranging from 5 to 10.
                                                              ∝ Trace [X (In + dZZ )−1 X]−1
Split-Plot Designs for Response Surface
Experiments                                                                                   
                                                                                        
                                                                          ×         f (x)f (x)dx
   Relatively little research has been conducted with                           R
regard to the construction of split-plot designs for
                                                              = Trace{[X (In + dZZ )−1 X]−1 [MR ]},
                                                                           
                                                                                                               (8)
response surface experiments. Recall that, in a re-
sponse surface experiment, we are typically inter-       where x = (w , s ) denotes the vector of factor set-
ested in estimating a response model comprised of        tings, R is the region of interest, MR is the moment
all first- and second-order polynomial terms.             matrix, and we again note the dependence of the cri-
   Letsinger et al. (1996) first discussed the notion     terion on the unknown variance ratio d. For the in-
of optimality criteria in connection with split-plot     tegrated variance criterion, the efficiency of design
response surface experiments. They noted that a best     [{X1 , Z1 }], relative to design [{X2 , Z2 }] is given by
design will maximize, according to the determinant         REIV ([X1 , Z1 ], [X2 , Z2 ])
criterion CD ,                                                    Trace{[X2 (In + dZ2 Z2 )−1 X2 ]−1 [MR ]}
                                                             =
            CD (X, Z, V) = X V−1 X .          (5)              Trace{[X1 (In + dZ1 Z1 )−1 X1 ]−1 [MR ]}.

Vol. 41, No. 4, October 2009                                                                          www.asq.org
350                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

Letsinger et al. examined, for the case of four factors,
the efficiencies of the Box–Behnken design, the uni-
form precision design, the rotatable central compos-
ite design, the small composite design for the case of
one or two whole-plot factors. They conclude that the
central composite design, run as a split-plot, seemed
to give the best overall performance.
   Other recent developments in this area include
Draper and John (1998), who discuss how to choose
factor levels for their CUBE and STAR designs in
split-plot situations. They use an index of rotata-
bility as their criterion for choosing the factor lev-
                                                              FIGURE 3. Equivalent Estimation Response Surface De-
els to create optimal designs. Split-plot mixture re-
                                                           sign.
sponse surface experiments have only very recently
been considered (Kowalski et al. (2002), Goos and
Donev (2006)).                                             0) are replicated four times. In one of the four center-
   Recently, Vining et al. (2005) discussed the de-        point whole plots, the split-plot treatments occupy
sign and analysis of response surface experiments          the four corners. This leads to 7 × (4 − 1) = 21 de-
when run in the form of a split plot. Their paper          grees of freedom for split-plot pure error. Also, be-
established conditions under which ordinary least          cause three of the four whole-plot center points have
squares (OLS) estimates, which are usually inappro-        the identical split-plot treatment structure (i.e., four
priate for analysis of split-plot experiments, and the     center points), they are true replicates, and we ob-
preferred generalized least squares (GLS) estimates        tain three degrees of freedom for whole-plot pure er-
are equivalent. Their result builds on the Letsinger       ror. In general, if the center points of the split-plot
et al. (1996) paper just discussed, which demon-           factors are replicated nspc times within each of the
strated that OLS and GLS estimates are equivalent          2k whole-plot axial points and r whole-plot center
for Cartesian-product split-plot designs. Vining et        points, there will be (2k + r)(nspc − 1) degrees of
al. (2005) then show how various points in central         freedom for split-plot pure error and r − 1 degrees of
composite designs and Box–Behnken designs can be           freedom for whole-plot pure error. An unbiased es-
replicated in such a way as to induce the equivalence      timate of the split-plot pure-error variance is given
of OLS and GLS. These designs, named equivalent            by
                                                                              2k 2        r
estimation designs (EEDs) by the authors, have two                               l=1 sAl          s2cp0i
                                                                        2
                                                                       ssp =             + i=1           ,
useful features. First, the required replication struc-                           2k            r
ture leads to designs with pure replicates of whole
                                                           where s2Al and s2cp0i are the usual sample variances
plots and split plots, so that model-free estimates of
                              2                            of the responses from the lth whole-point axial point
the variance components σw       and σ 2 are provided.
                                                           and the ith whole-plot center point containing split-
Second, OLS software, which is widely available, can
                                                           plot center points. An estimate of the whole-plot
be used to obtain parameter estimates.
                                                           variance component is also easy to compute. Let
   An equivalent-estimation design for a split-plot re-    Ȳi. denote the mean of the values in the ith repli-
sponse surface scenario, developed by Vining et al.        cated whole plot and let s2wp denote the usual sam-
(2005), is shown in Figure 3, and the design points        ple variance of the r replicated whole-plot means,
                                                                                                          2
are listed in Table 6. In this example, there are two      Ȳ1. , . . . , Ȳr. . An unbiased estimate of σw is given by
hard-to-change factors, Z1 and Z2 , and there are
                                                                                              s2sp
two easy-to-change factors, X1 and X2 , and the ex-                            s2w = s2wp −        .
perimenter is interested in estimating all first- and                                          nspc
second-order terms. The design consists of a Box–
                                                           As was the case with the standard ANOVA estimator
Behnken design with 12 whole plots corresponding
                                                           (9), this estimator can, at times, be negative.
to the four corner points, the four axial points, and
four center points. In the four axial whole plots and in      In summary, equivalent estimation designs for
three of the four center-point whole plots, the center     response-surface experiments designs possess two no-
points for the split-plot treatments (i.e., X1 = X2 =      table properties. First, they permit the use of OLS

Journal of Quality Technology                                                            Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                          351

                           TABLE 6. Equivalent Estimation Split-Plot Response Surface Design

  Whole plot         Z1         Z2        X1        X2        Whole plot         Z1        Z2       X1         X2

      1              −1         −1          0         0            7               0       −1       −1          0
   (Corner)          −1         −1          0         0         (Axial)            0       −1        1          0
                     −1         −1          0         0                            0       −1        0         −1
                     −1         −1          0         0                            0       −1        0          1

      2                1        −1          0         0            8               0           1    −1          0
   (Corner)            1        −1          0         0         (Axial)            0           1     1          0
                       1        −1          0         0                            0           1     0         −1
                       1        −1          0         0                            0           1     0          1

      3              −1           1         0         0             9              0           0    −1         −1
   (Corner)          −1           1         0         0         (Center)           0           0     1         −1
                     −1           1         0         0                            0           0    −1          1
                     −1           1         0         0                            0           0     1          1

      4                1          1         0         0            10              0           0      0          0
   (Corner)            1          1         0         0         (Center)           0           0      0          0
                       1          1         0         0                            0           0      0          0
                       1          1         0         0                            0           0      0          0

      5              −1           0       −1         0             11              0           0      0          0
    (Axial)          −1           0        1         0          (Center)           0           0      0          0
                     −1           0        0        −1                             0           0      0          0
                     −1           0        0         1                             0           0      0          0

      6                1          0       −1         0             12              0           0      0          0
    (Axial)            1          0        1         0          (Center)           0           0      0          0
                       1          0        0        −1                             0           0      0          0
                       1          0        0         1                             0           0      0          0

for obtaining factor-effects estimates. Second, the au-        approach. Optimal designs for split-plot experiments
thors’ recommended replication structure leads to             were first proposed by Goos and Vandebroek (2001).
model-free estimates of the whole-plot and split-plot         The general approach is as follows:
variance components. However, these designs are not
                                                                 1. Specify the number of whole plots, nw .
without drawbacks. Inadvertent changes to the de-
sign during implementation, e.g., the occurrence of              2. Specify the number of split plots per whole plot,
a missing observation, will destroy the OLS–GLS                     ns .
equivalence. Moreover, Goos (2006) noted that the                3. Specify the response model, f  (w, s).
relatively large proportion of runs devoted to pure              4. Specify a prior estimate of the variance ratio,
replication significantly reduces the overall efficiency               d.
of the design. Goos advocates the use of optimal ex-             5. Use computer software to construct the design
perimental designs, which we take up next.                          for steps 1 through 4 that maximizes the D-
                                                                    optimality criterion of Equation (5).
Optimal Split-Plot Designs
                                                                 6. Study the sensitivity of the optimal design to
                                                                    small changes in d, nw , and ns .
   A general approach to the problem of constructing
split-plot designs is provided by the optimal-design          The algorithm given by Goos and Vandebroek (2001)

Vol. 41, No. 4, October 2009                                                                              www.asq.org
352                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

also requires that the user specify a candidate set of         TABLE 7. D-Optimal Split-Plot Screening Design
possible design points. This is usually the 2p vertices
of the hypercube ([−1, 1]p ) for main-effects models        Whole plot     A     B     C      D     p     q      r
or for main-effects models plus interactions. It may
consist of points from the 3p factorial when pure               1        −1     −1    −1      1   −1    −1       1
quadratic effects are present. An alternative algo-              1        −1     −1    −1      1    1     1      −1
rithm based on the coordinate exchange algorithm
(Meyer and Nachtsheim (1995)), which does not re-
                                                                2        −1     −1      1   −1    −1    −1      −1
quire candidate sets, was later developed by Jones
                                                                2        −1     −1      1   −1     1     1       1
and Goos (2007). Note that the specified numbers of
whole plots and split plots must both be large enough
to enable estimation of all of the desired model ef-            3        −1       1   −1    −1    −1     1       1
fects and that the designs produced are optimal for             3        −1       1   −1    −1     1    −1      −1
the model as specified in item (3) above.
                                                                4        −1       1     1     1   −1    −1      −1
   The approach is general in the sense that it can             4        −1       1     1     1    1     1       1
be applied to virtually any split-plot design prob-
lem and, unlike the minimum-aberration designs de-              5          1    −1    −1    −1    −1     1      −1
scribed earlier, there is no need to limit run sizes to         5          1    −1    −1    −1     1    −1       1
powers of two. We demonstrate the use of the op-
timal split-plot design approach in connection with
                                                                6          1    −1      1     1   −1     1      −1
three of the examples already discussed.
                                                                6          1    −1      1     1    1    −1       1
Example 1: Screening
                                                                7          1      1   −1      1   −1     1       1
   Consider the screening design that was con-                  7          1      1   −1      1    1    −1      −1
structed earlier involving four hard-to-change fac-
tors and three easy-to-change factors. The minimum-
                                                                8          1      1     1   −1    −1    −1       1
aberration design, shown in Table 4, was based on
                                                                8          1      1     1   −1     1     1      −1
eight whole plots with two split plots per whole plot.
Thus, we specify nw = 8, ns = 2, and the model
of interest is the seven-factor first-order model. The
D-optimal design (produced by SAS/JMP) is shown            by using semi-folding of 16-run designs or balanced
in Table 7. Note that the design is balanced over-         incomplete blocks. In Kowalski’s case 1, based on
all and that every whole plot is separately balanced.      balanced incomplete block designs, there were two
The design, in fact, is orthogonal, although it is not a   whole-plot factors, four split-plot factors, and the ob-
regular orthogonal design. For nonregular orthogonal       jective was to run the experiment in four whole-plots
designs, effects are not directly aliased with other ef-    with six split plots per whole plot in order to es-
fects, as they are for MA fractional factorial designs.    timate all main effects and two-factor interactions.
This example illustrates advantages and disadvan-          The design recommended by Kowalski is displayed
tages of the optimal design approach. The simplicity       in Table 8. The optimal design approach produced
of this approach is undeniable, and whereas the MA         the design shown in Table 9. In both cases, all main
design is a D-optimal design for the first-order model,     effects and second-order interactions are estimable,
a D-optimal design need not be an MA design. One           and in both cases, the designs are balanced both
might question the marginal value of a resolution IV       overall and within whole plots. The D-efficiency of
design in this instance, when most of the two-factor       the combinatoric design, relative to the D-optimal
interactions are aliased with two or more other two-       design, is 88.9%.
factor interactions. Correctly identifying the active
two-factor interactions will be difficult, at best, with-    Example 3: Response Surface Designs
out substantial followup testing.                            Vining et al. (2005) developed an EED based on
                                                           2 hard-to-change and 2 easy-to-change factors, in
Example 2: Main Effects Plus Interactions
                                                           12 whole plots having 4 split plots per whole plot.
   We noted that Kowalski (2002) provided ap-              The design, shown previously in Figure 3, is a Box–
proaches for constructing 24-run split-plot designs        Behnken design with the split-plot and whole-plot

Journal of Quality Technology                                                          Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                      353

  TABLE 8. 24 Run Design for Estimating Main Effects          TABLE 9. 24 Run D-Optimal Design for Estimating
     and Two-Factor Interactions (Kowalski, 2002)               Main Effects and Two-Factor Interactions

Whole plot      A      B       p     q      r       s     Whole plot      A      B      p      q      r        s

     1            1    −1       1   −1      −1     −1
                                                               1         −1     −1     −1     −1     −1         1
     1            1    −1      −1    1      −1     −1
                                                               1         −1     −1     −1     −1      1        −1
     1            1    −1      −1   −1       1     −1
                                                               1         −1     −1     −1      1     −1        −1
     1            1    −1       1    1      −1      1
                                                               1         −1     −1      1     −1     −1        −1
     1            1    −1       1   −1       1      1
                                                               1         −1     −1      1      1     −1         1
     1            1    −1      −1    1       1      1
                                                               1         −1     −1      1      1      1        −1
     2          −1       1     −1    1      −1     −1
     2          −1       1     −1   −1       1     −1          2         −1       1    −1     −1     −1        −1
     2          −1       1     −1   −1      −1      1          2         −1       1    −1      1     −1         1
     2          −1       1      1    1       1     −1          2         −1       1    −1      1      1        −1
     2          −1       1      1    1      −1      1          2         −1       1    −1      1      1         1
     2          −1       1      1   −1       1      1          2         −1       1     1     −1      1         1
                                                               2         −1       1     1      1     −1        −1
     3            1      1      1    1      −1     −1
     3            1      1      1   −1       1     −1          3           1    −1     −1     −1     −1        −1
     3            1      1      1   −1      −1      1          3           1    −1     −1      1     −1         1
     3            1      1     −1    1       1     −1          3           1    −1     −1      1      1        −1
     3            1      1     −1    1      −1      1          3           1    −1     −1      1      1         1
     3            1      1     −1   −1       1      1          3           1    −1      1     −1      1         1
                                                               3           1    −1      1      1     −1        −1
     4          −1     −1       1   −1       1     −1
     4          −1     −1       1   −1      −1      1
                                                               4           1      1    −1     −1      1         1
     4          −1     −1      −1    1       1     −1
                                                               4           1      1    −1      1     −1        −1
     4          −1     −1      −1    1      −1      1
                                                               4           1      1     1     −1     −1        −1
     4          −1     −1      −1   −1      −1     −1
                                                               4           1      1     1     −1     −1         1
     4          −1     −1       1    1       1      1
                                                               4           1      1     1     −1      1        −1
                                                               4           1      1     1      1      1         1

replication structure chosen in such a way that an
EED results. The D-optimal design for the same
second-order model is listed in Table 10 and dis-         approach emphasizes a different characteristic of the
played in Figure 4. It is clear that the D-optimal        design and the resulting designs can be quite differ-
design has traded split-plot and whole-plot center-       ent. Our view is that all of these designs are excel-
point replicates for treatment combinations at the        lent, especially when viewed against the alternatives
boundaries of the design space. Notice, for example,      of running a CRD incorrectly or simply not running a
that there are no center-point whole plots and that       designed experiment. The particular choice depends
all whole plots include corner points. Consistent with    on the goals of the experiment.
its criterion, the optimal-design approach is sacrific-
                                                             In this subsection, we have emphasized the use of
ing information on variance components for informa-
                                                          the D-optimality criterion, although alternative cri-
tion on factor effects.
                                                          teria and computer-based approaches do exist. For
   These results indicate that there are currently a      example, the ability to construct split-plot designs
variety of approaches for obtaining split-plot designs.   that are optimal by the average variance criterion (8)
We have discussed the use of MA fractional facto-         may be constructed, although general-purpose soft-
rial designs, various combinatoric modifications to        ware for doing so is not yet available. Such designs
the MA designs, EED response surface designs, and         would be particularly appropriate for estimating re-
optimal split-plot designs. The results show that each    sponse surfaces. In an alternative computer-based

Vol. 41, No. 4, October 2009                                                                        www.asq.org
354                             BRADLEY JONES AND CHRISTOPHER J. NACHTSHEIM

                                TABLE 10. D-Optimal Split-Plot Response Surface Design

  Whole plot         Z1         Z2        X1        X2        Whole plot        Z1        Z2        X1        X2

       1             −1         −1        −1        −1             7              1       −1        −1        −1
       1             −1         −1        −1         1             7              1       −1        −1         1
       1             −1         −1         1        −1             7              1       −1         0         1
       1             −1         −1         1         1             7              1       −1         1         0

       2             −1           1       −1        −1             8              1         1       −1        −1
       2             −1           1       −1         1             8              1         1       −1         1
       2             −1           1        1        −1             8              1         1        0        −1
       2             −1           1        1         1             8              1         1        1         0

       3               1        −1        −1        −1             9            −1          0       −1        −1
       3               1        −1        −1         1             9            −1          0       −1         1
       3               1        −1         1        −1             9            −1          0        0         0
       3               1        −1         1         1             9            −1          0        1         1

       4               1          1       −1        −1            10              1         0       −1         0
       4               1          1       −1         1            10              1         0        0        −1
       4               1          1        1        −1            10              1         0        1        −1
       4               1          1        1         1            10              1         0        1         1

       5             −1         −1        −1         1            11              0       −1        −1        −1
       5             −1         −1         0        −1            11              0       −1        −1         0
       5             −1         −1         1         0            11              0       −1         0         1
       5             −1         −1         1         1            11              0       −1         1        −1

       6             −1           1       −1        −1            12              0         1       −1         0
       6             −1           1       −1         1            12              0         1        0         1
       6             −1           1        0         0            12              0         1        1        −1
       6             −1           1        1        −1            12              0         1        1         1

approach, Trinca and Gilmour (2001) presented a              cated split-plot experiments—and move on to more
strategy for constructing multistratum response sur-         complex cases, including the analysis of unreplicated
face designs (split plots are a special case of these).      split-plot designs and a completely general approach,
This is a multistage approach that moves from the            based on restricted maximimum likelihood estima-
highest stratum to the lowest, where, at each stage,         tion (REML).
designs are constructed to ensure near orthogonal-
ity between strata. Weighted trace-optimal designs—          Balanced, Replicated Split-Plot Designs
not generally available—would be useful for robust-
                                                                When full factorial designs are balanced and repli-
product designs, because the rankings of effects uti-
                                                             cated, ANOVA model (1) is fully applicable. The
lized by Bingham and Sitter (2003) could be easily
                                                             ANOVA table for the corrosion-resistance experi-
reflected in the design-optimality criterion.
                                                             ment is given in Table 11. Note that there are two
                                                             replicates of each of the three whole-plot factor set-
 How to Analyze a Split-Plot Design
                                                             tings, leading to three degrees of freedom for pure
   In this section, we review methods for the anal-          whole-plot error. In contrast, the split-plot treat-
ysis of split-plot designs. We begin with the sim-           ments are not replicated within the whole plots and,
plest case—the ANOVA approach for balanced, repli-           for this reason, there are no degrees of freedom for

Journal of Quality Technology                                                            Vol. 41, No. 4, October 2009
SPLIT-PLOT DESIGNS: WHAT, WHY, AND HOW                                          355

                                                            From the corrosion-resistance ANOVA in Table (11),
                                                            we obtain s2 = 124.5 and s2w = (4813.2 − 124.5)/4 =
                                                            1172.2. We note that, if MS(split-plot error) is
                                                            greater than MS(whole-plot error), the ANOVA-
                                                            based estimated variance component for whole-plot
                                                            effects, s2w , will be negative. Procedures for con-
                                                            structing confidence intervals for the variance compo-
                                                            nents are detailed in standard texts (see, e.g., Kutner
                                                            et al. (2005)).
                                                               As noted previously, in the corrosion-resistance
                                                            example, whole plots are replicated, but split plots
                                                            are not. In this case, a pure-error estimate of whole-
    FIGURE 4. D-optimal Split-Plot Response Surface De-
                                                            plot error is available, whereas the estimate of the
sign Based on 12 Whole Plots. (Note: Figure shows layouts
                                                            split-plot error is based on the whole-plot error by
for two whole plots at each corner.
                                                            split-plot factor interaction. In choosing a design,
                                                            the experimenter must weigh the importance of true
pure split-plot error. An assumption frequently made        replicates at both levels of the experiment and their
is that there is no whole-plot error by split-plot fac-     implications for the power of the factor-effects tests
tor (i.e., block-by-treatment) interaction and, from        to be conducted. In industrial experiments, the cost
this, we obtain 3 × 3 = 9 degrees of freedom for split-     of whole plots frequently precludes the use of repli-
plot error. From an examination of the EMS column,          cates at the whole-plot level and methods for the
we discern (by setting all fixed effects to zero in the       analysis of unreplicated experiments are required.
expressions) that the whole-plot error mean square          We turn now to the analysis of unreplicated split-
is the appropriate term for the denominator of the          plot designs.
F -test for the effect of temperature, and the sub-
plot error mean square is the appropriate denomina-         Unreplicated Two-Level Factorial Split-Plot
tor for F -tests for the effects of coatings and for the     Designs
temperature-by-coating interaction. The general rule
                                                               As is the case with completely randomized un-
for testing factor effects is simple: whole-plot main ef-
                                                            replicated two-level factorial experiments, an exper-
fects are tested based on the whole-plot mean square
                                                            imenter has a number of options for the analysis of
error, while all other effects—split-plot main effects
                                                            unreplicated split-plot two-level designs. If the design
and split-plot by whole-plot interactions—are tested
                                                            permits the estimation of higher order interactions
using the split-plot mean square error.
                                                            and if it can be safely assumed that these higher or-
  From the expected mean square column of the               der interactions are not active, the estimates can be
ANOVA table (Table 11), we note that                        pooled to provide unbiased estimates of experimental
                                                            error at both levels of the experiment. Factor-effects
       E[MS(Whole-Plot Error)] = σ 2 + bσw
                                         2
                                                            tests can then be conducted in standard fashion. An-
         E[MS(Split-Plot Error)] = σ 2 .                    other approach recommended by Daniel (1959) was
It follows that                                             the use of two half-normal (or normal) plots of ef-
                E[MS(Whole-Plot Error)]                     fects, one for whole-plot factor main effects and in-
           2
          σw =                                              teractions and one for split-plot factor effects and
                            b
                                                            split-plot-factor by whole-plot-factor interactions.
                  E[MS(Split-Plot Error)]
                −                         .
                              b                                To illustrate these approaches, we use data from
We obtain unbiased estimators of σ 2 and σw 2
                                              by sub-       a robust product design reported by Lewis et al.
stituting observed for expected mean squares as             (1997). This experiment has also been discussed by
                                                            Bingham and Sitter (2003) and Loeppky and Sit-
            s2 = MS(Split-Plot Error)                       ter (2002). The experiment studied the effects of
                  MS(Whole-Plot Error)                      customer usage factors on the time between fail-
            s2w =
                            b                               ures of a wafer handling subsystem in an automated
                    MS(Split-Plot Error)                    memory that relied on optical-pattern recognition
                  −                      .           (9)
                              b                             to position a wafer for repair. The response was a

Vol. 41, No. 4, October 2009                                                                           www.asq.org
You can also read