The Utility of Alternative Fit Indices in Tests of Measurement Invariance

 
Meade, A. W., Johnson, E. C., & Braddy, P. W. (2006, August). The Utility of Alternative Fit Indices in Tests of Measurement Invariance. Paper
presented at the annual Academy of Management conference, Atlanta, GA.

                              The Utility of Alternative Fit Indices in
                                Tests of Measurement Invariance

                                                       Adam W. Meade
                                                       Emily C. Johnson
                                                       Phillip W. Braddy

                                               North Carolina State University

                 Confirmatory factor analytic tests of measurement invariance (MI) based on the chi-square
                 statistic are known to be sensitive to sample size. For this reason, Cheung and Rensvold (2002)
                 recommended using alternative fit indices in MI investigations. However, previous studies have
                 not established the power of fit indices to detect data with a lack of invariance. In this study, we
                 investigated the performance of fit indices with simulated data known to not be invariant. Our
                 results indicate that alternative fit indices can be successfully used in MI investigations.
                 Specifically, we suggest reporting McDonald’s noncentrality index along with CFI and
                 Gamma-hat.

          Measurement invariance (MI) can be considered the degree to which measurements
conducted under different conditions yield measures of the same attributes (Drasgow, 1984;
Horn & McArdle, 1992). These different conditions include stability of measurement over time
(Chan, 1998; Chan & Schmitt, 2000), across different populations (e.g., cultures, Riordan &
Vandenberg, 1994; gender, Marsh, 1985, 1987; age groups, Marsh & Hocevar, 1985), across rater
groups (e.g., Facteau & Craig, 2001), or across different media of measurement administration
(Chan & Schmitt, 1997; Ployhart, Weekley, Holtz, & Kemp, 2003). Recently, there has been a
substantial increase in research involving tests of MI, due in part to an increased awareness
of the importance of comparing equivalent measures as well as increased access to and
understanding of the methodology used to perform tests of MI (Meade & Lautenschlager, 2004;
Vandenberg, 2002).
          Though multiple methods of establishing MI exist, multiple-group confirmatory factor
analysis (CFA) has been the most commonly used method in organizational research (Vandenberg &
Lance, 2000). With these tests, constrained and free CFA models are typically compared using a
chi-square-based likelihood ratio test (LRT; sometimes called a chi-square difference test).
However, like chi-square tests of overall model fit, the LRT has been shown to be sensitive to
sample size (Brannick, 1995; Kelloway, 1995; Meade & Lautenschlager, 2004). Thus, in large
samples, power to detect even trivial differences in the properties of a measure between
groups is extremely high, potentially leading to over-identification of a lack of invariance
(LOI). For this reason, Cheung and Rensvold (2002) examined the potential use of change in
alternative fit indices in MI investigations. As with overall model fit, these alternative fit
indices (AFIs) are less strongly affected by sample size than is chi-square in measurement
invariance tests. While their groundbreaking work is extremely promising, one crucial omission
from Cheung and Rensvold’s study is that they only examined the performance of AFIs under the
null hypothesis of perfect MI between groups. Thus, while they recommended the use of some
AFIs, the power of these indices to detect a lack of invariance between groups is unknown.
This deficiency in the literature precludes more widespread use of AFIs in MI studies, as
researchers have no indication that AFIs are sensitive to an LOI. One reason for this omission
from their study is that no standard measure or amount of effect size has been established in
MI research. Thus, it is difficult to justify the simulation of any one level of LOI between
groups. This study overcomes this limitation by generating data with many levels of a lack of
invariance (from trivial to severe) in order to examine the performance of AFIs in MI tests of
equal factor loadings.
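The sample-size sensitivity of the LRT described above can be illustrated with a minimal sketch. It assumes the usual maximum likelihood convention that the test statistic equals (N - 1) times the difference in minimized fit-function values (multi-group software may use a slightly different multiplier), and it uses a closed form for the chi-square tail that is exact only for even degrees of freedom. The fit-function difference of 0.02 and the 4 degrees of freedom are hypothetical values chosen for illustration; they are not taken from this study.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variate.

    Exact closed form for even df = 2k:
        exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def lrt_p_value(delta_f: float, n: int, delta_df: int):
    """Chi-square difference statistic and p-value for a fixed fit difference.

    Assumes statistic = (n - 1) * delta_f (a common ML convention).
    """
    stat = (n - 1) * delta_f
    return stat, chi2_sf_even_df(stat, delta_df)


if __name__ == "__main__":
    DELTA_F = 0.02    # hypothetical difference in minimized fit functions
    DELTA_DF = 4      # hypothetical difference in degrees of freedom
    for n in (100, 200, 500, 1000, 5000):
        stat, p = lrt_p_value(DELTA_F, n, DELTA_DF)
        print(f"N={n:5d}  delta chi-square={stat:8.2f}  p={p:.4f}")
```

With the discrepancy held constant, the statistic grows linearly in N, so the same degree of misfit moves from nonsignificant to significant purely as a function of sample size.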
AFIs in Measurement Invariance                                             2

CFA Tests of MI
          Measurement invariance can be technically defined in terms of probabilities such that
in order for MI to exist, the probability of observed responses conditioned upon latent scores
must be unaffected by group membership (Meredith & Millsap, 1992; Millsap, 1995). Commonly used
CFA tests of MI involve simultaneously fitting a measurement model to two or more data samples.
The multi-group CFA measurement model between p observed variables and m latent factors is
given by the equation:
                    $X_g = \tau_g + \Lambda_g \xi_g + \delta_g$,                (1)
where $X_g$ is a $p \times 1$ vector of observed scores, $\tau_g$ is a $p \times 1$ vector of
intercepts, $\Lambda_g$ is a $p \times m$ matrix of factor loadings, $\xi_g$ is an
$m \times 1$ vector of latent variable scores, $\delta_g$ is a $p \times 1$ vector of unique
factor scores, and g denotes that these parameters are group specific. Observed variable
covariances are then given as:
                    $\Sigma_g = \Lambda_g \Phi_g \Lambda_g' + \Theta_g$,                (2)
where $\Sigma_g$ is a $p \times p$ matrix of observed score covariances, $\Phi_g$ is an
$m \times m$ latent variance/covariance matrix, and $\Theta_g$ is a $p \times p$ diagonal
matrix of unique variances.
          MI can therefore exist for multiple parts of the CFA model. For instance, if
$\Lambda_g = \Lambda_{g'}$ for all groups, metric invariance is said to exist (Horn & McArdle,
1992); if $\tau_g = \tau_{g'}$ for all groups, scalar invariance is indicated (Meredith, 1993);
and if $\Theta_g = \Theta_{g'}$ for all groups, uniqueness invariance exists. If all three
types of invariance are found, strict factorial invariance is indicated (Meredith, 1993) such
that differences in observed score means or covariances are a product of differences in latent
means (sometimes called impact; Holland & Wainer, 1993) or latent covariances.
          Typically, when conducting CFA MI tests, a sequence of nested multi-group models is
examined in order to detect an LOI across samples. In the first model, both data sets
(representing groups, time periods, etc.) are examined simultaneously, holding only the pattern
of factor loadings invariant. This model serves two functions. First, it serves as a test of
configural invariance (Horn & McArdle, 1992); that is, poor fit of this model indicates either
that the same factor structure does not hold for the two samples or that the model is
misspecified in one or both samples. Second, the configural invariance model serves as a
baseline of model fit for comparison to other, more restrictive models. Once adequate fit is
established for this model, tests of equality of parameters in the CFA model are conducted in a
series of sequential models in which, typically, factor loadings, intercepts, and uniqueness
terms or other model parameters are constrained in sequence. Once a statistically significant
decrement in model fit is witnessed, an LOI is indicated in those parameters most recently
constrained (see Vandenberg & Lance, 2000 for a review). While there is some disagreement as to
how many model parameters must be equal before MI is established, the most commonly
investigated portion of the MI model is the equality of factor loadings (Vandenberg & Lance,
2000). Moreover, factor loadings and item intercepts are generally considered to be the aspects
of the model most essential for MI to be established (Meade & Kroustalis, in press). For this
reason, we focused on MI tests of factor loadings (metric invariance) for this initial
investigation.

Alternative Fit Indices (AFIs)
          We could locate only one published study that has simulated data in order to determine
the feasibility of using differences in AFIs to establish measurement invariance. In this
study, Cheung and Rensvold (2002) achieved several important goals. First, they specified three
criteria desirable in an AFI used for establishing MI. These include (1) independence between
the overall fit in the baseline model and the change in AFI witnessed with the imposed model
constraints (ΔAFI), (2) insensitivity of the AFI to model complexity, and (3) a lack of
redundancy with other AFIs. The first of these criteria is important because the degree to
which sampling error is present in the data should influence the baseline and constrained
models to the same degree. The extent to which this is true will be manifest via a lack of
correlation between the initial AFI value and the ΔAFI associated with the additional
constraints on the model. Cheung and Rensvold investigated the performance of twenty AFIs with
regard to these three criteria. These twenty included χ², χ²/df (Wheaton, Muthen, Alwin, &
Summers, 1977), Root Mean Squared Error of Approximation (RMSEA; Steiger, 1989), the
Noncentrality Parameter (NCP; Steiger, Shapiro, & Browne, 1985), Akaike’s Information Criterion
(AIC; Akaike, 1987), Browne and Cudeck’s Criterion (1989), the Expected Cross-Validation Index
(ECVI; Browne & Cudeck, 1993), Normed Fit Index (NFI; Bentler & Bonett, 1980), Relative Fit
Index (RFI; Bollen, 1986), Incremental Fit Index (IFI; Bollen, 1989), Tucker-Lewis Index (TLI;
Tucker & Lewis, 1973), Comparative Fit Index (CFI; Bentler, 1990), Relative Non-Centrality
Index (RNI; McDonald & Marsh, 1990), Parsimony-Adjusted NFI (James, Mulaik, & Brett, 1982),
Parsimonious CFI (Arbuckle & Wothke, 1999), Gamma-hat (Steiger, 1989), rescaled AIC (Cudeck &
Browne, 1983), Cross-Validation Index (CVI; Browne & Cudeck, 1989), McDonald’s (1989)
Non-Centrality Index, and Critical N (Hoelter, 1983).
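Three of the indices listed above (CFI, Gamma-hat, and McDonald’s Non-Centrality Index) can be computed directly from model chi-square values. The sketch below uses commonly cited textbook definitions; it is an assumption that these match the exact formulas used by any particular program or by the studies discussed here, and conventions differ on whether N - 1, N, or N minus the number of groups is used as the multiplier. All numeric inputs are made up for illustration.

```python
import math


def mcdonald_nci(chi2: float, df: int, n: int) -> float:
    """McDonald's Non-Centrality Index: exp(-0.5 * (chi2 - df) / (n - 1))."""
    return math.exp(-0.5 * (chi2 - df) / (n - 1))


def gamma_hat(chi2: float, df: int, n: int, p: int) -> float:
    """Gamma-hat: p / (p + 2 * (chi2 - df) / (n - 1)), p = number of observed variables."""
    return p / (p + 2.0 * (chi2 - df) / (n - 1))


def cfi(chi2: float, df: int, chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index relative to a null (independence) model."""
    d_model = max(chi2 - df, 0.0)
    d_null = max(chi2_null - df_null, d_model, 0.0)
    return 1.0 if d_null == 0.0 else 1.0 - d_model / d_null


def delta_afi(baseline_value: float, constrained_value: float) -> float:
    """Change in an AFI, coded baseline minus constrained (positive = worse fit)."""
    return baseline_value - constrained_value


if __name__ == "__main__":
    # Hypothetical fit results (not from the paper): total N, item count,
    # and chi-square/df for the null, baseline, and constrained models.
    n, p = 400, 16
    chi2_null, df_null = 2400.0, 240
    chi2_base, df_base = 230.0, 206
    chi2_con, df_con = 262.0, 222

    d_cfi = delta_afi(cfi(chi2_base, df_base, chi2_null, df_null),
                      cfi(chi2_con, df_con, chi2_null, df_null))
    d_gamma = delta_afi(gamma_hat(chi2_base, df_base, n, p),
                        gamma_hat(chi2_con, df_con, n, p))
    d_nci = delta_afi(mcdonald_nci(chi2_base, df_base, n),
                      mcdonald_nci(chi2_con, df_con, n))
    print(f"delta CFI={d_cfi:.4f}  delta Gamma-hat={d_gamma:.4f}  delta NCI={d_nci:.4f}")
```

Coding the change as baseline minus constrained keeps the values generally positive when the constraints worsen fit; the opposite sign convention is equally common.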

          In order to assess the AFIs, Cheung and Rensvold (2002) simulated data under a variety
of conditions varying the number of factors, factor variances, correlations between factors,
number of items per factor, factor loadings, and sample size. Importantly, they only simulated
data that had no LOI in the population. They then conducted ANOVAs in order to determine the
effect of the number of items, factors, and the interaction between the two on the AFIs. Of the
AFIs, only RMSEA was immune from all simulated factors. They also examined the correlation
between the initial AFI value and the ΔAFI. Using this criterion, only NCP, IFI, CFI, RNI,
Gamma-hat, McDonald’s NCI, and Critical N showed nonsignificant correlations. Moreover, using a
six-way ANOVA, they found that of the indices mentioned above, only NCP and Critical N showed a
dependence on sample size accounting for more than 5% of the variance in the change in the fit
index. Given their results, they suggested only reporting results of CFI, Gamma-hat, and
McDonald’s NCI, as IFI and RNI correlated extremely highly with CFI.

The current study
          In this study, we expand on the work of Cheung and Rensvold (2002) by assessing the
utility of differences in AFIs (ΔAFIs) for detecting a lack of MI in item factor loadings. In
order to achieve this goal, we simulated data under a constant factor model in two groups.
Several conditions of sample size and differential functioning (DF) of item factor loadings
between groups were then simulated.

                     METHOD

          In order to evaluate the performance of ΔAFIs for detecting an LOI, we simulated
item-level data for one group, then modified the properties of these data in several ways for
some items (our DF items) in order to simulate item-level data for another hypothetical group.
We decided to investigate the potential of AFIs for detecting an LOI in factor loadings. While
there is some consensus that tests of item intercepts are also necessary for establishing MI,
tests of factor loadings always occur before tests of item intercepts (Vandenberg & Lance,
2000) and thus seemed a good starting point in this initial investigation of the feasibility of
these indices to evaluate MI.

Initial Data Properties
          An initial structural model was developed for two correlated eight-item scales
representing “Group 1.” Several conditions of “Group 2” data were created by modifying Group 1
data to simulate DF in factor loadings for some items. Group 1 item intercepts were set at zero
for all data, and uniqueness terms were created so that item variance was equal to unity.
Moreover, a population correlation of .3 between the latent factors was constant across all
study conditions (cf. Cheung & Rensvold, 2002). Factor loadings for Group 1 and Group 2 can be
seen in Table 1. Once population data were simulated, sampling error was introduced into
simulated sample data. Three hundred sample replications, each containing sampling error, were
simulated for each of the study conditions.
                         ------------------------------------
                              Insert Table 1 about here
                         ------------------------------------
          The study design constituted a 5 (sample size) x 20 (magnitude of DF) fully crossed
design. Sample sizes from 100 to 500 were simulated in increments of 100 for both Group 1 and
Group 2 data. Sample sizes were always equal in Group 1 and 2 MI comparisons. We simulated DF
for 4 of 16 items, with two DF items per factor. The amount of DF in item factor loadings
varied from a difference between groups of .02 to .40 in increments of .02. These differences
in factor loadings were created by subtracting the amount of DF from the Group 1 factor loading
in order to create the Group 2 factor loading for items indicated as DF in Table 1. The
magnitude of DF across the DF items was uniform in all conditions.

Analyses
          A CFA baseline model was estimated in which the correct factor structure (see Table 1)
was specified for both Group 1 and Group 2. Next, a constrained model was estimated in which
the entire factor loading matrix was constrained to be equal for the Group 1 and Group 2 data.
Correlation matrices were analyzed and factor variances were standardized in order to achieve
model identification for all conditions. Results from models with standardized latent variances
are equal to those using referent indicators when latent variances are known to be invariant
across groups. A probability value of .05 was used in computing LRTs; LISREL 8.54 (Jöreskog &
Sörbom, 1996) was used for all analyses.
          We also examined the change in several AFIs between baseline and constrained models,
focusing on the AFIs found to be most promising by Cheung and Rensvold (2002). Their study
revealed that many ΔAFIs had the disadvantageous property of being correlated with initial
model fit; thus, we focused on those AFIs found not to have this property. Specifically, we
concentrated our investigation and reporting of results on the ΔCFI,

ΔGamma-hat, ΔMcDonald’s NCI, ΔNCP, ΔIFI, ΔRNI, and ΔCritical N. We also examined ΔRMSEA, as
that index was found by Cheung and Rensvold to be independent of model complexity.
          We were primarily concerned with identifying AFIs that were both (1) sensitive to the
magnitude of DF and (2) not sensitive to sample size. Thus, we assessed the suitability of each
AFI by conducting ANOVAs using SAS’s Proc GLM. In each model, the AFI was entered as the
dependent variable, with sample size and magnitude of DF as predictors. We then calculated
effect size measures for the magnitude of DF, sample size, and the interaction between the two.
Optimal AFIs are identified by large effect size values for level of DF and small effect size
values for both the sample size and the interaction between sample size and level of DF.
          We also graphed the relationship between ΔAFIs and the amount of DF simulated. Such
graphs provide a visual indication of the relationship between the AFIs and the amount of DF
present. Moreover, they present information much more succinctly than a series of large tables.
These graphs feature the amount of DF simulated on the x-axis with the value of the change in
the fit statistic on the y-axis.

                      RESULTS

          None of the 60,000 analyses resulted in convergence errors or inadmissible solutions.
The effect size estimates for level of DF, sample size, and the interaction between the two are
presented in Tables 2 and 3. While the data in these tables are the same, Table 2 sorts fit
indices by the effect of DF while Table 3 sorts the AFIs by the effect of sample size and the
interaction between sample size and level of DF.
                         ------------------------------------------
                              Insert Tables 2 and 3 about here
                         ------------------------------------------
          As can be seen in Tables 2 and 3, all AFIs outperform chi-square both in being
responsive to DF and in being insensitive to sample size, with the exception of the NCP and
Critical N. Because the degrees of freedom of the baseline and constrained models were the same
in all study conditions, the NCP (defined as chi-square minus degrees of freedom) and
chi-square had equal effect size estimates. As can be seen in the tables, no one index was
superior to the others for both criteria (maximum sensitivity to DF and minimum sensitivity to
sample size). Gamma-hat, McDonald’s NCI, IFI, and RNI were somewhat more sensitive to DF than
the other indices. CFI and RMSEA showed considerably lower effects of sample size, though RMSEA
showed a sizable effect due to the interaction between DF and sample size. Conversely, IFI,
RNI, McDonald’s NCI, and Gamma-hat showed almost no effect of the interaction between sample
size and DF, but small effects of sample size. Interestingly, Critical N showed considerably
worse properties than did chi-square. These patterns can be seen in Figure 1, in which the
level of the AFIs is plotted by DF and sample size (chi-square is plotted for comparison).
Based on these results, it appears that Gamma-hat, McDonald’s NCI, IFI, RNI, and CFI are among
the most promising AFIs for establishing MI.
                         ------------------------------------------
                              Insert Figure 1 about here
                         ------------------------------------------
          We also examined the correlation between the AFIs, as highly correlated indices
provide little unique information. As can be seen in Table 4, we found that McDonald’s NCI,
RNI, IFI, and Gamma-hat were very highly correlated. Thus, like Cheung and Rensvold (2002), our
results suggest that reporting all four ΔAFIs would provide largely redundant information.
                         ------------------------------------------
                              Insert Table 4 about here
                         ------------------------------------------
          In order for the ΔAFIs to be of utility to applied researchers, cutoff values need to
be established so the indices can be used in practice. Based on their simulation work, Cheung
and Rensvold (2002) suggested values of .01 for ΔCFI, .001 for ΔGamma-hat, and .02 for
ΔMcDonald’s NCI¹. They did not provide recommendations for other indices as, like this study,
they found that those indices correlated so highly as to not provide unique useful information.
Based on our analyses, we concur with Cheung and Rensvold (2002) and also recommend reporting
ΔCFI, ΔGamma-hat, and ΔMcDonald’s NCI. As such, we evaluated the cutoff scores recommended by
Cheung and Rensvold by creating cutoff values in ΔAFIs for these indices and plotting the
percentage of samples in which an LOI was detected (out of 300 replications) for each level of
DF for these three ΔAFIs and the LRT. These plots can be seen in Figure 2.
                         ------------------------------------------
                              Insert Figure 2 about here
                         ------------------------------------------

¹ Note that Cheung and Rensvold report negative values for these indices. In this study, we
calculated the ΔAFIs in order to keep ΔAFI values (generally) positive. We have changed the
sign on the recommended cutoff values from Cheung and Rensvold to be consistent with our
coding.
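The cutoff evaluation described above amounts to counting, within each study condition, the share of replications whose ΔAFI exceeds the recommended value. A minimal sketch follows; it assumes the positive sign coding noted in the footnote (baseline minus constrained) and a strict "greater than" comparison, which is an assumption, since the text does not say how ties at the cutoff were treated. The function name and the sample values are hypothetical.

```python
from typing import Dict, List

# Cutoffs attributed in the text to Cheung and Rensvold (2002), with the sign
# flipped to match the positive coding used in this study.
CUTOFFS: Dict[str, float] = {
    "CFI": 0.01,
    "Gamma-hat": 0.001,
    "McDonald NCI": 0.02,
}


def detection_rate(delta_values: List[float], cutoff: float) -> float:
    """Percentage of replications in which the delta-AFI exceeds the cutoff."""
    if not delta_values:
        raise ValueError("at least one replication is required")
    flagged = sum(1 for d in delta_values if d > cutoff)
    return 100.0 * flagged / len(delta_values)


def rates_by_index(deltas: Dict[str, List[float]]) -> Dict[str, float]:
    """Detection rate for every index that has a known cutoff."""
    return {name: detection_rate(values, CUTOFFS[name])
            for name, values in deltas.items() if name in CUTOFFS}


if __name__ == "__main__":
    # Made-up delta values for four replications of one condition.
    toy = {
        "CFI": [0.004, 0.012, 0.009, 0.015],
        "McDonald NCI": [0.025, 0.031, 0.018, 0.040],
    }
    for name, rate in rates_by_index(toy).items():
        print(f"{name}: LOI detected in {rate:.1f}% of replications")
```

In the study design each condition would supply 300 such values per index, and the resulting percentages are what the power plots display against the level of DF.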

          As can be seen in Figure 2, it appears that Cheung and Rensvold suggested a cutoff
value for the ΔCFI that is somewhat out of line with those of the other fit indices, in that
their ΔCFI value is considerably less sensitive to DF than the others. Figure 3 plots these
same data, though organizing the results by fit index to allow a better visualization of the
effects of sample size. As can be seen in Figure 3, none of the AFIs were unaffected by sample
size. This is to be expected, however, because although the mean of the fit indices may not
vary by sample size, their sampling distributions will still be affected (Marsh, Balla, &
McDonald, 1988). In other words, when examining the fit of two models that fit equally well in
the population, larger samples will be associated with less variation around the mean AFI than
will smaller samples due to less sampling error. Thus, when comparing a constrained and
baseline model with a given level of DF, larger sample sizes lead to less variation in the
difference between the model AFIs and thus a higher percentage of the 300 replications in which
an LOI is deemed significant than with smaller sample sizes.
                         ------------------------------------------
                              Insert Figure 3 about here
                         ------------------------------------------

                    DISCUSSION

          As recognition of MI as an important [...] their earlier work in several important
ways. First, we examined the extent to which the AFIs are sensitive to DF. Second, we examined
insensitivity to sample size when an LOI exists. Third, we have demonstrated the relationship
between several AFIs and many levels of DF for several sample size conditions. Fourth, we
evaluated the power (% significant analyses) for the AFIs using Cheung and Rensvold’s (2002)
recommended cutoff values.
          The results of our study largely concur with those of Cheung and Rensvold (2002) in
that we found that ΔCFI, ΔGamma-hat, and ΔMcDonald’s NCI were among the most promising AFIs in
that they were (1) less sensitive to sample size than was chi-square, (2) more sensitive to DF
than chi-square, and (3) generally provided non-redundant information with other AFIs. However,
we found that Cheung and Rensvold’s recommended cutoff values affected the performance of the
AFIs for detecting an LOI. In particular, the recommended value for ΔCFI seems excessively
large. For example, when 4 of 16 items showed DF with factor loading differences of .3, power
to detect this difference was below 50% in all sample size conditions with the ΔCFI (see Figure
3). In contrast, the ΔGamma-hat, ΔMcDonald’s NCI, and the LRT all showed power near 100% for
sample sizes of 200 and larger for these data. In this study, the ΔMcDonald’s NCI of .02 seemed
to perform optimally of the four indices. The LRT and ΔGamma-hat seemed overly sensitive to
small ( [...] 5000) as a condition. At these large sample sizes, power to

detect an LOI is very high with the LRT, and it may well be that researchers dealing with these
large sample sizes are the most likely to pursue using a ΔAFI to evaluate MI. Third, we
simulated data that were somewhat idealized as compared to those simulated by Cheung and
Rensvold (2002). Our factor model was simulated to be ‘clean’. In other words, our population
model used to derive our sample replications had zero values for cross-loadings. While our
choice of factor model was no more arbitrary than was Cheung and Rensvold’s (2002), the better
fit associated with our model may be less likely to be encountered in practice.
          We consider our study to be an initial expansion of earlier work on ΔAFIs for
evaluating MI. Future research needs to address the performance of these indices in identifying
an LOI in item intercepts, uniqueness terms, factor variances and covariances, and latent
means. Also, the effects of model misspecification and model complexity on AFIs need to be
examined under conditions in which MI does not hold. Importantly, a follow-up study that
included very large sample sizes would also be valuable.
          In sum, it appears that examining ΔAFIs may be a valuable tool for establishing MI.
These indices could supplement or replace the LRT for some data conditions. However, further
study is needed before widespread implementation should proceed.

                  REFERENCES

Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-322.
Arbuckle, J. L., & Wothke, W. (1999). Amos 4.0 user's guide. Chicago: SmallWaters.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin,
         107, 238-246.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis
         of covariance structures. Psychological Bulletin, 88, 588-606.
Bollen, K. A. (1986). Sample size and Bentler and Bonett's nonnormed fit index. Psychometrika,
         51, 375-377.
Bollen, K. A. (1989). A new incremental fit index for general structural equation models.
         Sociological Methods and Research, 17, 303-316.
Brannick, M. T. (1995). Critical comments on applying covariance structure modeling. Journal of
         Organizational Behavior, 16(3), 201-213.
Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance
         structures. Multivariate Behavioral Research, 24, 445-455.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen &
         J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA:
         Sage.
Chan, D. (1998). The conceptualization and analysis of change over time: An integrative
         approach incorporating longitudinal mean and covariance structures analysis (LMACS)
         and multiple indicator latent growth modeling (MLGM). Organizational Research Methods,
         1(4), 421-483.
Chan, D., & Schmitt, N. (1997). Video-based versus paper-and-pencil method of assessment in
         situational judgment tests: Subgroup differences in test performance and face validity
         perceptions. Journal of Applied Psychology, 82(1), 143-159.
Chan, D., & Schmitt, N. (2000). Interindividual differences in intraindividual changes in
         proactivity during organizational entry: A latent growth modeling approach to
         understanding newcomer adaptation. Journal of Applied Psychology, 85(2), 190-210.
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing
         measurement invariance. Structural Equation Modeling, 9(2), 233-255.
Cudeck, R., & Browne, M. W. (1983). Cross-validation of covariance structures. Multivariate
         Behavioral Research, 18(2), 147-168.
Drasgow, F. (1984). Scrutinizing psychological tests: Measurement equivalence and equivalent
         relations with external variables are the central issues. Psychological Bulletin,
         95(1), 134-135.
Facteau, J. D., & Craig, S. B. (2001). Are performance appraisal ratings from different rating
         sources comparable? Journal of Applied Psychology, 86(2), 215-227.
Hoelter, J. W. (1983). The analysis of covariance structures: Goodness-of-fit indices.
         Sociological Methods & Research, 11, 325-344.
Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.
Horn, J. L., & McArdle, J. J. (1992). A practical and theoretical guide to measurement
         invariance in aging research. Experimental Aging Research, 18(3-4), 117-144.
James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models and
         data. Beverly Hills: Sage.
Jöreskog, K., & Sörbom, D. (1996). LISREL 8: User's reference guide. Chicago: Scientific
         Software International.
Kelloway, E. K. (1995). Structural equation modeling in perspective. Journal of
         Organizational Behavior, 16(3), 215-224.
Marsh, H. W. (1985). The structure of masculinity/femininity: An application of
         confirmatory factor analysis to higher-order factor structures and factorial invariance.
         Multivariate Behavioral Research, 20(4), 427-449.
Marsh, H. W. (1987). The factorial invariance of responses by males and females to a
         multidimensional self-concept instrument: Substantive and methodological issues.
         Multivariate Behavioral Research, 22(4), 457-480.
Marsh, H. W., & Hocevar, D. (1985). Application of confirmatory factor analysis to the study of
         self-concept: First- and higher order factor models and their invariance across groups.
         Psychological Bulletin, 97(3), 562-582.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in
         confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103(3),
         391-410.
McDonald, R. P. (1989). An index of goodness-of-fit based on noncentrality. Journal of
         Classification, 6, 97-103.
McDonald, R. P., & Marsh, H. W. (1990). Choosing a multivariate model: Noncentrality and
         goodness of fit. Psychological Bulletin, 107, 247-255.
Meade, A. W., & Lautenschlager, G. J. (2004). A Monte-Carlo study of confirmatory factor
         analytic tests of measurement equivalence/invariance. Structural Equation
         Modeling, 11(1), 60-72.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance.
         Psychometrika, 58(4), 525-543.
Meredith, W., & Millsap, R. E. (1992). On the misuse of manifest variables in the detection
         of measurement bias. Psychometrika, 57(2), 289-311.
Millsap, R. E. (1995). Measurement invariance, predictive invariance, and the duality
         paradox. Multivariate Behavioral Research, 30(4), 577-605.
Ployhart, R. E., Weekley, J. A., Holtz, B. C., & Kemp, C. (2003). Web-based and
         paper-and-pencil testing of applicants in a proctored setting: Are personality, biodata
         and situational judgment tests comparable? Personnel Psychology, 56(3), 733-752.
Riordan, C. M., & Vandenberg, R. J. (1994). A central question in cross-cultural research:
         Do employees of different cultures interpret work-related measures in an equivalent
         manner? Journal of Management, 20(3), 643-671.
Steiger, J. H. (1989). EzPATH: Causal modeling. Evanston, IL: SYSTAT.
Steiger, J. H., Shapiro, A., & Browne, M. W. (1985). On the multivariate asymptotic
         distribution of sequential chi-square statistics. Psychometrika, 50, 253-263.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor
         analysis. Psychometrika, 38, 1-10.
Vandenberg, R. J. (2002). Toward a further understanding of and improvement in
         measurement invariance methods and procedures. Organizational Research
         Methods, 5(2), 139-158.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement
         invariance literature: Suggestions, practices, and recommendations for organizational
         research. Organizational Research Methods, 3(1), 4-69.
Wheaton, B., Muthén, B., Alwin, D. F., & Summers, G. F. (1977). Assessing reliability and
         stability in panel models. In D. R. Heise (Ed.), Sociological methodology (pp. 84-136).
         San Francisco: Jossey-Bass.

Author Contact Info:

Adam W. Meade
Department of Psychology
North Carolina State University
Campus Box 7650
Raleigh, NC 27695-7650
Phone: 919-513-4857
Fax: 919-515-1716
E-mail: awmeade@ncsu.edu

                                          TABLE 1

               Population Factor Loadings for Group 1 and Group 2 Data

                                     Group 1                  Group 2
                        Item   Factor 1   Factor 2      Factor 1   Factor 2
                          1      .80         -            .80         -
                          2      .70         -            .70         -
                          3      .60         -            .60         -
                          4      .50         -            .50         -
                          5      .80         -            XX          -
                          6      .70         -            XX          -
                          7      .60         -            .60         -
                          8      .50         -            .50         -
                          9       -         .80            -         .80
                         10       -         .70            -         .70
                         11       -         .60            -         XX
                         12       -         .50            -         XX
                         13       -         .80            -         .80
                         14       -         .70            -         .70
                         15       -         .60            -         .60
                         16       -         .50            -         .50

Note: XX indicates DF item with variable magnitude of DF. Numeric Group 2 loadings are
equal to their Group 1 counterparts (i.e., are not DF items).
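
As an illustration of the design in Table 1, the following minimal Python sketch builds the two population loading matrices. It assumes, purely for illustration, that DF is introduced by lowering each XX loading in Group 2 by a fixed amount relative to its Group 1 value; that mechanism, the function names, and the `df_amount` parameter are hypothetical and are not taken from the study.

```python
# Group 1 loadings follow Table 1: the pattern .80/.70/.60/.50 repeats
# twice within each factor (items 1-8 on Factor 1, items 9-16 on Factor 2).
PATTERN = [0.80, 0.70, 0.60, 0.50]
DF_ITEMS = (5, 6, 11, 12)  # items marked XX in Table 1 (1-based item numbers)


def group1_loadings():
    """Return 16 rows of [factor1, factor2]; 0.0 stands for a '-' cell."""
    rows = []
    for item in range(1, 17):
        value = PATTERN[(item - 1) % 4]
        rows.append([value, 0.0] if item <= 8 else [0.0, value])
    return rows


def group2_loadings(df_amount):
    """Copy Group 1, then lower each XX loading by df_amount.

    ASSUMPTION: subtracting a constant is only one way to create DF;
    the table itself does not say how the XX values were generated.
    """
    rows = [row[:] for row in group1_loadings()]
    for item in DF_ITEMS:
        col = 0 if item <= 8 else 1
        rows[item - 1][col] = round(rows[item - 1][col] - df_amount, 10)
    return rows
```

Calling `group2_loadings(0.10)` under this assumption changes only the four XX cells and leaves the remaining twelve loadings equal to their Group 1 counterparts.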

                                      TABLE 2

Omega-Squared Effect Size Estimates for the Amount of DF and Sample Size on AFI
Indices; Sorted by the Effect of Amount of DF.

                     AFI           Amount of DF   Sample Size    DF*N
                                       (DF)           (N)

                 Gamma Hat            0.824          0.007       0.000
               McDonald’s NCI         0.824          0.007       0.000
                     IFI              0.812          0.005       0.000
                     RNI              0.811          0.006       0.000
                     CFI              0.722          0.002       0.002
                   RMSEA              0.651          0.001       0.022
                 Chi-square           0.588          0.010       0.130
                    NCP               0.588          0.010       0.130
                  Critical-N          0.389          0.013       0.198
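
Each omega-squared value can be read as the estimated proportion of variance in the change in an index that is attributable to an effect. A minimal sketch of the standard fixed-effects ANOVA estimator follows; the function name and argument names are hypothetical, and the source does not state which software produced the reported estimates.

```python
def omega_squared(ss_effect, df_effect, ss_total, ms_error):
    """Omega-squared for one effect in a fixed-effects ANOVA.

    ss_effect : sum of squares for the effect (e.g., Amount of DF)
    df_effect : degrees of freedom for that effect
    ss_total  : total sum of squares
    ms_error  : mean square error (within-cell variance estimate)
    """
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)
```

For example, an effect with nine degrees of freedom that accounts for most of the total sum of squares, with a small error mean square, yields a value close to the ratio of the effect sum of squares to the total.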

                                         TABLE 3

Omega-Squared Effect Size Estimates for the Amount of DF and Sample Size on AFI
Indices; Sorted by the Effects of Sample Size.

                        AFI           Amount of DF   Sample Size    DF*N
                                          (DF)           (N)

                        CFI              0.722          0.002       0.002
                        IFI              0.812          0.005       0.000
                        RNI              0.811          0.006       0.000
                  McDonald’s NCI         0.824          0.007       0.000
                    Gamma Hat            0.824          0.007       0.000
                      RMSEA              0.651          0.001       0.022
                    Chi-square           0.588          0.010       0.130
                        NCP              0.588          0.010       0.130
                     Critical-N          0.389          0.013       0.198

             Note: Table sorted by the sum of the effects of N and DF*N.

                                            TABLE 4

                                 Correlations Between AFIs

                 Chi-sq.  CFI    Crit. N  G-hat  IFI    McD NCI  RMSEA  RNI    NCP
Chi-square       1.00
CFI              0.83     1.00
Critical N       0.94     0.64   1.00
Gamma-hat        0.86     0.94   0.70     1.00
IFI              0.85     0.96   0.68     0.99   1.00
McDonald’s NCI   0.87     0.93   0.71     1.00   0.99   1.00
RMSEA            0.87     0.88   0.78     0.89   0.87   0.89     1.00
RNI              0.85     0.96   0.68     0.99   1.00   0.99     0.87   1.00
NCP              1.00     0.83   0.94     0.86   0.85   0.87     0.87   0.85   1.00
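
The correlations of 1.00 between chi-square and NCP, and between Gamma-hat and McDonald’s NCI, are consistent with how these indices are commonly defined. The sketch below uses widely cited single-sample forms of the three noncentrality-based indices; these forms, the function names, and the input values are assumptions for illustration, and multi-group software may use a slightly different sample size term.

```python
import math


def ncp(chisq, df):
    """Noncentrality parameter estimate, left untruncated (chi-square minus df)."""
    return chisq - df


def mcdonald_nci(chisq, df, n):
    """McDonald's noncentrality index for a sample of size n."""
    return math.exp(-0.5 * (chisq - df) / (n - 1))


def gamma_hat(chisq, df, n, p):
    """Gamma-hat, where p is the number of observed variables."""
    return p / (p + 2.0 * (chisq - df) / (n - 1))


# Hypothetical nested models: unconstrained vs. constrained loadings.
free = {"chisq": 150.0, "df": 100}
constrained = {"chisq": 180.0, "df": 112}

# The change in NCP equals the change in chi-square minus the change in df,
# so with a fixed df difference the two changes are perfectly correlated.
delta_ncp = ncp(**constrained) - ncp(**free)
delta_chisq = constrained["chisq"] - free["chisq"]
delta_df = constrained["df"] - free["df"]
```

Because McDonald's NCI and Gamma-hat are both monotone functions of the same rescaled noncentrality term, (chi-square minus df) divided by (n minus 1), changes in one closely track changes in the other.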

                                         FIGURE 1

                     Changes in AFIs by Level of DF and Sample Size

[Figure 1: nine line-plot panels, one per index: Change in Chi-Square, Change in Gamma Hat,
Change in McDonald's NCI, Change in IFI, Change in RNI, Change in CFI, Change in RMSEA,
Change in NCP, and Change in Critical N. In each panel the x-axis is Amount of DF (0.02 to
0.38) and separate lines show sample sizes of 100, 200, 300, 400, and 500.]

                                         FIGURE 2

            Percentage of Significant Analyses by Level of DF and Sample Size

[Figure 2: five line-plot panels, one per sample size (N=100, N=200, N=300, N=400, N=500).
In each panel the x-axis is Amount of DF (0.02 to 0.38), the y-axis is % Significant, and
separate lines show chi-square, Gamma-hat, McD's NCI, and CFI.]

                                         FIGURE 3

            Percentage of Significant Analyses by Level of DF and Sample Size

[Figure 3: four line-plot panels, one per index: Change in Chi-Square, Change in Gamma-hat,
Change in McDonald's NCI, and Change in CFI. In each panel the x-axis is Amount of DF (0.02
to 0.38), the y-axis is % Significant, and separate lines show sample sizes of 100, 200, 300,
400, and 500.]