Testing and Reducing Skindex-29 Using Rasch Analysis: Skindex-17

ORIGINAL ARTICLE

      Testing and Reducing Skindex-29 Using Rasch
      Analysis: Skindex-17
      Tamar E.C. Nijsten¹, Francesca Sampogna², Mary-Margaret Chren³ and Damiano D. Abeni²

      The Skindex is a well-studied dermatology-specific health-related quality of life (HRQOL) instrument. The
      objective of this study was to test Skindex-29 using Rasch analysis and, if necessary, to refine it so that it would
      fit this item response theory based model. The Skindex-29 of 454 Italian dermatological patients was subjected
      to Rasch analysis to investigate threshold order, differential item functioning (DIF), and item and overall fit to
      the model. The Skindex-29 did not fit the Rasch model (P<0.001). The 5-point scoring system was re-grouped
      into three categories and demonstrated logical response order for all but one item. Rasch analyses of a
      combined emotion and social functioning subscale of Skindex-29 resulted in a 12-item psychosocial subscale.
      Five of seven items were retained in a symptoms subscale. Both subscales fitted the model (P = 0.32 and 0.13,
      respectively) without significant individual item misfit or DIF (P>0.05). Classical psychometric properties such
      as response distribution, item–rest correlation, item complexity, and internal consistency of the two subscales of
      Skindex-17 were at least adequate. The Skindex-17 is a Rasch reduced version of Skindex-29, with two
      independent scores that can be used in the measurement of HRQOL in dermatological patients.
      Journal of Investigative Dermatology (2006) 126, 1244–1250. doi:10.1038/sj.jid.5700212; published online 16 March 2006

      ¹Department of Dermatology, Erasmus Medical Center, Rotterdam, The Netherlands; ²Health Services
      Research Unit, Dermatological Institute IDI-IRCCS, Rome, Italy and ³Department of Dermatology,
      University of California at San Francisco, San Francisco, California, USA
      Correspondence: Dr Damiano D. Abeni, Health Services Research Unit, IDI – IRCCS, via dei Monti di
      Creta, 104, I-00167 Rome, Italy. E-mail: d.abeni@idi.it
      Abbreviations: DIF, differential item functioning; HRQOL, health-related quality of life
      Received 26 September 2005; revised 12 December 2005; accepted 3 January 2006; published online
      16 March 2006

      INTRODUCTION
      The Skindex is a health-related quality of life (HRQOL) instrument designed to measure the effects
      of skin disease on patients’ lives. This dermatology-specific questionnaire has been extensively
      studied and refined in different population samples (Chren et al., 1996, 1997; Abeni et al., 2002;
      Augustin et al., 2004). Although the original Skindex consisted of 61 items (Chren et al., 1996),
      a refinement study demonstrated that Skindex-29 decreased respondent burden and improved its
      discriminative and evaluative capability (Chren et al., 1997). The methodology in both the original
      and the refinement study was based on classical test theory to retain or discard items.
         For decades, classical test theory has been the most commonly used approach to create and reduce
      patient-based assessments, and includes multiple psychometric features (Chren, 1999; Guyatt et al.,
      2002; Nijsten et al., 2006). Although classical test theory has been proven to be a valid and
      reliable methodology and there is no consensus on how to reduce existing HRQOL instruments (Coste
      et al., 1997; Prieto et al., 2003; Nijsten et al., 2006), this approach is unable to solve some
      fundamental measurement issues such as response order (i.e., logical ordering of the response
      categories), additivity, which requires unidimensionality of the measurement, and differential item
      functioning (DIF) (assessing the effect of external factors, including diagnosis, on item response)
      (McHorney, 1997; Michell, 2003; Tesio, 2003; Tennant et al., 2004a).
         Recently, item response theory-based models such as the Rasch model have become widely available,
      which should encourage psychometric research to address some of these fundamental measurement issues
      (Rasch, 1960; McHorney, 1997; Michell, 2003; Streiner and Norman, 2003; Tennant et al., 2004a).
      Basically, these models attempt to ensure the transition from representational (nominal, ordinal,
      interval, and ratio scales) (Stevens, 1946) to fundamental measurement (e.g., weight (g), distance
      (m), or temperature (°C)) (Michell, 1997, 2003; Chren, 2005). This transition is important because
      it allows investigators to calculate mean scores and change of scores without the restrictions
      associated with nonparametric, representational measurement. Rasch analysis is one of the most
      commonly used item response theory models to create new or test existing HRQOL instruments (Tennant
      et al., 2004a).
         The overall goal of this work was to use item response theory to examine the responses to
      Skindex-29 of a large sample of patients. In this study, we used the Rasch model to test threshold
      order, item fit, and DIF. In addition, we aspired to create a reduced version of this instrument
      that fitted the Rasch model and behaved psychometrically well among a heterogeneous group of
      dermatological patients.
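      To make the notion of threshold order concrete, the following is a minimal Python sketch of
      category probabilities under a polytomous Rasch (partial credit) parameterization. It is an
      illustration under stated assumptions, not the software or the exact procedure used in this
      study: the function names and all threshold values are hypothetical and were chosen only to
      mimic an item whose first two thresholds are reversed.

```python
import math


def category_probabilities(theta, thresholds):
    """Category probabilities for one item under a partial credit Rasch model.

    theta: person location in logits.
    thresholds: threshold locations tau_1..tau_m in logits (hypothetical values).
    Returns a list with the probabilities of categories 0..m.
    """
    # Category x has log-numerator sum_{k<=x} (theta - tau_k); category 0 has 0.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + (theta - tau))
    peak = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_are_ordered(thresholds):
    """True when every threshold lies above the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


# Hypothetical thresholds: a 5-category item with thresholds 1 and 2 reversed,
# and a 3-category item with ordered thresholds.
disordered_item = [0.4, -0.6, 0.8, 1.6]
ordered_item = [-0.5, 1.0]

for theta in (-2.0, 0.0, 2.0):
    probs = category_probabilities(theta, disordered_item)
    print(theta, [round(p, 3) for p in probs])
```

      With the reversed pair of thresholds, category 1 is never the most probable response at any
      person location, which corresponds to the pattern described in this article as threshold
      disorder.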

1244 Journal of Investigative Dermatology (2006), Volume 126          © 2006 The Society for Investigative Dermatology
TEC Nijsten et al.
                                                                                                                              Rasch Reduced Skindex

Table 1. Demographic and disease characteristics of the study population¹

                                 Total (n=454)     Sample 1 (n=227)   Sample 2 (n=227)
                                 No. of            No. of             No. of
                                 patients (%)      patients (%)       patients (%)

Gender
    Female                       270 (59.7)        131 (57.7)         139 (61.7)
    Male                         182 (40.3)         96 (42.3)          86 (38.2)

Age (years)
    <45                          225 (49.8)        110 (48.9)         115 (51.1)
    45–65                        162 (35.8)         86 (37.9)          76 (33.8)
    >65                           65 (14.4)         31 (13.7)          34 (15.1)

Diagnosis
    Acne                         151 (33.3)         73 (32.2)          78 (34.4)
    Psoriasis                     76 (16.7)         40 (17.6)          36 (15.9)
    Seborrheic dermatitis         54 (12.0)         27 (11.9)          27 (11.9)
    Alopecia areata               46 (10.2)         25 (11.0)          21 (9.3)
    Vitiligo                      27 (6.0)          13 (5.7)           14 (6.2)
    Nevi                         100 (21.7)         49 (21.6)          51 (22.5)

Physician global assessment
    Very mild                     64 (14.2)         30 (13.2)          34 (15.0)
    Mild                         160 (35.4)         87 (38.3)          73 (32.2)
    Moderate                     189 (41.8)         97 (42.7)          92 (40.5)
    Severe/very severe            31 (6.9)          13 (5.7)           18 (7.9)

¹Totals may vary because of missing values.

Figure 1. The category probability curve for item 24 (“sensitive skin”) (n = 454). [Plot: probability
(0.0 to 1.0) against person location (logits, –3 to 3), with one curve per response category 0 to 4.]

RESULTS
Study population
The majority of the 454 patients were young to middle-aged women (Table 1). More than 60% of the
participants were diagnosed with an inflammatory skin disease such as acne, psoriasis, and seborrheic
dermatitis, and almost half of the patients were graded as having at least moderate disease severity.
No significant differences of demographic or disease characteristics were detected between the two
random samples of the study population.

Rasch analysis
Threshold order for all items. The thresholds of the original 5-point scoring system of Skindex-29
were significantly disordered for all 29 items in sample 1. For example, the probability that a
person with little HRQOL impairment scored 0 was higher than that of scoring 1, which is defined as
threshold order, but those with little impairment were more likely to score 2 than 1, which is
defined as threshold disorder, for item 24 (Figure 1). The order between 2, 3, and 4 appeared to be
logical for this item. In fact, the category probability curves demonstrated a disordered pattern
between 1 and 2 for all items. In addition, 20 of 29 items (69.0%) showed a logical transition
between response category 2 and 3 and 25 of 29 (86.2%) items had a good transition between category
3 and 4. However, re-grouping the original Skindex scores into 0 (0), 1 (1 and 2), 2 (3), and 3 (4)
demonstrated reversed thresholds in 8 of the 29 items. To avoid creating individual scoring systems
for these items or deleting them from the reduced instrument, the original response categories were
re-grouped into three levels, as 0 (0), 1 (1 and 2), and 2 (3 and 4), which resulted in ordered
thresholds for all items, except item 30.

Individual item fit and DIF for the psychosocial subscale. Although item 30 was the first to be
discarded due to reversed thresholds, it also misfitted the Rasch model (P = 0.01) and demonstrated
DIF for age (P = 0.03) and diagnosis (P = 0.0005). Each of the other eight items that were deleted
demonstrated significant item misfit in the Rasch model (P≤0.02), except for item 12, which
demonstrated significant DIF for both age (P = 0.008) and gender (P = 0.003). Items 2 and 19 showed
significant DIF for diagnosis and item 2 for age (P≤0.006). Items 3, 9, 13, 15, and 19 were
primarily discarded because they misfitted the Rasch model.
   Overall, the 12 remaining items of the psychosocial subscale fitted the Rasch model well
(total-item χ² = 39.45, df = 36, and P = 0.32) (Table S1). After removing the “Rasch factor”, a
principal component analysis demonstrated that the first component accounted for only 16.3% of the
variance.

Individual item fit and DIF for the symptoms subscale. Although the overall fit for the seven
symptoms-related items was acceptable (total-item χ² = 40.28, df = 35, and P = 0.13), item 24 was
deleted due to significant DIF for gender (P<0.001) and age (P = 0.02). Subsequently, item 7
misfitted the model (residual = 1.40 and P = 0.01) and was discarded. The remaining items showed a
good overall fit to the Rasch model (total-item χ² = 29.31, df = 25, and P = 0.25) (Table S2). After
extracting the Rasch factor, the second factor only reflected 27.7% of the variance of this subscale.

Validation of Rasch analyses
The items retained in the analyses using sample 1 (presented data) or 2 were comparable, except that
the validation

sample did not reject item 2 in the psychosocial and 7 in the symptoms subscale. The overall summary
fit statistics of both groups of patients were nonsignificant (P = 0.32 vs 0.16 and P = 0.25 vs 0.12
for the psychosocial and symptoms subscale, respectively), with similar Cronbach’s α’s (0.81 vs 0.83
and 0.77 vs 0.78 for the psychosocial and symptoms subscales, respectively). Also, both Rasch reduced
subscales captured more than 85% of the total variance of the original Skindex-29. Adding item 2 to
the retained psychosocial items of sample 1 resulted in a poor fit to the Rasch model (P = 0.002 and
item 2 showed significant item misfit and DIF for age) and adding item 7 to the symptoms items of
sample 1 resulted in borderline overall fit (P = 0.06), but with individual item misfit. Therefore,
items 2 and 7 were not included in the final Rasch reduced Skindex, which we named Skindex-17
(Appendix S2).

Classical psychometric properties of Skindex-17
Validity of Skindex-17. The face validity of Skindex-17 appeared to be adequate because it
distinguished symptoms from functioning and emotions, which are likely to show some extent of
overlap. As hypothesized, patients with acne, psoriasis, and seborrheic dermatitis reported
significantly higher scores for both subscales of Skindex-17 than nevi patients (P<0.001) (Table S3).
The new Skindex-17 subscales correlated very well to excellent with Skindex-29 and its original
corresponding subscales (r≥0.83). The items of Skindex-17 of both subscales represented 94% of the
variance of the total score of Skindex-29 (adjusted R² = 0.94). Both new subscales captured at least
87% of the variance of their original Skindex-29 subscales (adjusted R² scores of the psychosocial
items were 0.95 and 0.87 for functioning and emotions, respectively, and 0.87 of the reduced symptoms
subscale compared to its original version).

Item behaviour of Skindex-17. Of the 12 items of the psychosocial subscale, five items (8, 14, 17,
24, and 27) showed a suboptimal response distribution with more than 70% of the 514 participants who
scored “0”. The “floor” effect of these items disappeared after deleting the 100 patients with nevi
(21.6% of the total study population). The correlation coefficients between the 12 items of the
psychosocial subscale ranged from 0.31 to 0.64 and for the symptom subscale from 0.33 to 0.57. Each
of the retained items correlated moderately (r≥0.40) with at least one other item in its subscale
(data not shown), suggesting item redundancy. For both subscales, the Cronbach’s α’s were ≥0.70 for
the items and the item–rest correlation coefficients were ≥0.35, suggesting good internal consistency
and reasonable “homogeneity” (Table 2). None of the items of the psychosocial subscale correlated
equally or more strongly with the symptoms subscale and vice versa, which indicates good item discrimi-

Table 2. Classical psychometric properties of the Skindex-17 (n=454)¹

                                     Factor analysis²
                                     Factor 1,            Factor 2,
                                     psychosocial         symptom
                                     subscale             subscale                        Cronbach’s   Item–rest
Items                                (eigenvalue=4.7)     (eigenvalue=1.5)   Uniqueness   α            correlation

 4 “work or hobbies”                 0.61*                0.41*              0.39         0.93         0.67
 5 “social life”                     0.79*                0.23               0.26         0.92         0.80
 6 “depressed”                       0.64*                0.27               0.43         0.93         0.66
 8 “stay at home”                    0.74*                0.19               0.36         0.93         0.72
11 “closeness with loved ones”       0.74*                0.26               0.34         0.92         0.75
14 “do things by themselves”         0.65*                0.25               0.42         0.92         0.67
17 “showing affection”               0.69*                0.21               0.39         0.93         0.70
21 “embarrassed”                     0.73*                0.25               0.30         0.92         0.73
23 “frustrated”                      0.75*                0.15               0.35         0.93         0.73
25 “desire to be with people”        0.83*                0.09               0.29         0.93         0.78
26 “humiliated”                      0.72*                0.08               0.39         0.93         0.68
29 “sex life”                        0.57*                0.12               0.60         0.93         0.56
 1 “hurts”                           0.42*                0.54*              0.49         0.74         0.52
10 “itches”                          0.09                 0.67*              0.51         0.72         0.55
16 “bothered by water”               0.14                 0.43*              0.72         0.78         0.40
19 “irritated”                       0.33                 0.58*              0.53         0.72         0.57
27 “bleeding”                        0.20                 0.77*              0.35         0.70         0.35

¹Except for factor analysis, the analyses were performed for the psychosocial and symptom subscales separately.
²Principal component analysis followed by orthogonal rotation. Factors were retained if eigenvalue >1.
Values marked * indicate loadings ≥0.40.
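
The internal-consistency columns of Table 2 rest on two classical statistics, Cronbach’s α and the
item–rest correlation. The following is a minimal Python sketch of their conventional textbook
definitions. It is an assumption-laden illustration rather than the authors’ computation: the
function names and the toy response matrix are hypothetical, and the statistical package used in
the study may differ in details such as how α is reported per item.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_rest_correlations(rows):
    """Correlation of each item with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        result.append(pearson(item, rest))
    return result


# Hypothetical toy data: six respondents, three items scored 0 to 2.
toy_scores = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [0, 1, 0],
    [2, 2, 2],
    [1, 0, 1],
]
alpha = cronbach_alpha(toy_scores)
rest_r = item_rest_correlations(toy_scores)
print(round(alpha, 2), [round(r, 2) for r in rest_r])
```

Correlating each item with the sum of the other items, rather than with the full total, avoids
inflating the coefficient by including the item in its own criterion.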


nant validity (data not shown). Factor analysis of Skindex-17 confirms its two-dimensional structure
and separates the psychosocial from the symptoms subscale (Table 2). Items 1 and 4 demonstrate
moderate item complexity because their loadings on both factors are ≥0.40.

Categorization of Skindex-17
For both subscales we performed mixture analysis to categorize the two scores of Skindex-17
(Table S4). The scores of the psychosocial subscale varied between 1 and 26, which could consist of
three distinct distributions with mean scores of about 2, 7, and 15, which were subsequently
categorized as having little (<5), moderate (5–9), and high (>9) impact on patients’ lives. The total
symptoms score could be dichotomized; patients who reported 5 or more on this subscale could be
categorized as experiencing a lot of symptoms.

DISCUSSION
Skindex-17 vs Skindex-29
The Skindex-17 (Appendix S2) is derived from Skindex-29, which is a well-designed and extensively
studied HRQOL instrument in dermatology. However, a Rasch reduced version of Skindex-29 has practical
advantages such as decreased response burden and theoretical advantages such as fit to an item
response theory model. This transition from representational (Stevens, 1946) to fundamental
measurement theory has multiple implications, such as that it ensures unidimensionality, additivity,
response order, absence of DIF, and specific objectivity (Michell, 1997, 2003), and is considered to
be the new standard in psychometric methodology (McHorney, 1997; Michell, 2003). Although the
retained items with response categories in Skindex-17 are identical to the original items, the scores
should be collapsed into a 3-point scoring system and the three subscales with an overall score have
been replaced by two subscales (psychosocial and symptoms) with separate summing scores. In HRQOL
assessments, it is not uncommon to report multiple summing scores without a total overall score
(Hemingway et al., 1997; Houssien et al., 1997) or to collapse scoring categories (Piccinelli et al.,
1993; Nijsten et al., 2006). Collapsing or reducing response categories does not necessarily decrease
the precision of an HRQOL instrument (Piccinelli et al., 1993; De Jong et al., 1997). The fact that
the subscales of Skindex-17 reflected more than 85% of the total variance of their counterparts of
Skindex-29 and that the correlations between the (sub)scales were excellent suggests that deleting 12
items from Skindex-29 has resulted in a minimal loss of information. Moreover, this reduction process
did not substantially alter the excellent psychometric properties of the original Skindex-29 (Chren
et al., 1997).
    Comparing Skindex-16, which focuses on degree of bother rather than frequency and is shortened
based on patients’ perspectives (Chren et al., 2001), with Skindex-17, which is the result of the
statistical model, is of interest. Of the 12 items that remained relatively the same in Skindex-16
and -29, nine items (75%) were both in Skindex-17 and -16. Overall, a simple and preliminary
comparison suggests that these different reduction approaches (patient-based vs mathematics-based)
may result in similar reduced instruments. Nevertheless, it would be interesting to compare the
outcomes of Skindex-17 and -16 in a population to study these different ways of model selection in
more detail.

Effect of external factors on HRQOL impairment
It has been demonstrated in the development of new HRQOL instruments for psoriasis and atopic
dermatitis that DIF across demographic variables is an important consideration (McKenna et al., 2003;
Whalley et al., 2004; Nijsten et al., 2005, 2006) and Skindex-17 is the first dermatology-specific
instrument to address the issue of response differences of patients as a function of age, gender,
diagnosis, and disease severity. Of interest, none of the items of Skindex-17 showed significant DIF
for education and marital status (data not shown). Of importance for a dermatology-specific HRQOL
instrument is that the items of the Skindex-17 psychosocial subscale behave similarly among patients
with different diagnoses even with varying levels of clinical disease severity. For example, a
psoriasis patient who has little HRQOL impairment responds in a similar way to each of the items as a
vitiligo patient with the same level of impairment. In contrast to the psychosocial subscale, the
scores of the symptoms subscale should be compared between different skin diseases cautiously because
symptoms scores depend on patients’ disease. Intuitively, it is not surprising that items that assess
skin symptoms behave differently among different skin conditions. For example, DIF analyses
demonstrated that individuals with seborrheic dermatitis were significantly more frequently bothered
by water but significantly less by bleeding compared to patients with psoriasis or acne, despite the
same level of HRQOL impairment (P<0.001). Ideally, the symptoms subscale should only be used to
compare the impact of symptoms within and not between skin diseases. Although to a lesser extent than
for diagnosis, the scores of the symptoms subscale should be compared with caution across gender
because women are more likely to give higher scores, but not significantly, than men for most
symptoms items (Table S2).
    Measurement of (very) limited HRQOL impairment in “benign” skin conditions such as nevi and
(senile) warts remains challenging, because generic and dermatology-specific instruments may not be
sensitive enough. Despite the floor effect of 9 of 12 psychosocial items and three of five symptoms
items (data not shown) among nevi patients and that none of the respondents with nevi reported
“often” or “all the time” for any of the 17 items (except for item 21, “embarrassed”), Skindex-17 was
able to select those who indicated at least moderate psychosocial impairment and those with more than
a few symptoms (12 and 5% of the nevi patients, respectively). Re-doing the Rasch analysis in sample
1 without the 49 nevi patients resulted in exactly the same item retention with adequate fit to the
model (P = 0.25, Cronbach’s α = 0.93). In the future, it would be interesting to study the effect of
cultural background on the HRQOL tools that are used in international collaborations (Bullinger
et al., 1993; Tennant et al., 2004b). In addition to static assessments, change in HRQOL impairment
may also depend


on external factors such as age, education, and marital status (Hemingway et al., 1997; Unaeze et al.,
in press). Therefore, the effect of patients’ characteristics should be carefully evaluated before
comparing HRQOL scores of different patient populations, as is done in clinical trials or surveys,
especially if the tool used has not been tested for DIF previously.

Sensitivity analyses
To test whether combining the functioning and emotion subscales affected our findings, we re-did the
Rasch analyses accepting the three dimensions of Skindex-29 and analyzing each separately and
combined. Neither approach improved the reduction process of this instrument (data not shown). For
example, the emotion subscale did not fit the overall Rasch model (P<0.0001), and even after
shortening it to three items based on item fit and DIF its overall fit to the Rasch model remained
very poor (P = 0.004). Alternatively, Rasch analyses of the three subscales combined resulted in
significant DIF for diagnosis (P≤0.006) for six out of seven symptoms-related items. These a
posteriori Rasch analyses demonstrate how the selection of the included items affects the result
because item fit and DIF are determined by the total amount of underlying construct (i.e., HRQOL
impairment due to skin conditions), which entirely depends on the items selected. This underlying
methodology also plays a role in the order of deleting items that demonstrate significant item misfit
and/or DIF because it may change the outcome substantially.

CONCLUSION
The Skindex-17 is a Rasch-reduced version of Skindex-29 that overcomes some of the fundamental issues
in measurement. This reduced questionnaire behaved psychometrically very well in this study
population and may be a valuable tool in the measurement of patient-based outcomes in dermatology,
but its psychometric properties and clinical value should first be tested in other populations.

groups: (1) acne; (2) psoriasis; (3) seborrheic dermatitis; (4) alopecia areata; (5) vitiligo; and
(6) nevi. These diagnostic groups were selected because they represent relatively homogeneous groups
of diagnoses, may be associated with comparable HRQOL impairment, and consisted of at least 25
patients. As there were relatively more patients with nevi, we randomly selected 100 of the 258
individuals whose major concern was nevi. Subsequently, the 454 subjects were divided into two
groups. Both selection procedures used the random number generator method. The sampling of two
smaller subsets was done because the χ² statistics used in the Rasch model are very sensitive to the
sample size (if n is large, even slight deviations from unidimensionality may be statistically
significant) (Tesio, 2003) and it enabled us to compare the findings in both samples.
    Age was categorized into three levels (<45; 45–64; and ≥65 years). The 5-point physician global
assessment was re-grouped into four levels ((1) very mild; (2) mild; (3) moderate; and (4) severe and
very severe), because the very severe score was reported in only 1% of the patients. Ethical approval
was granted by the institutional review board and the study was carried out in compliance with the
Declaration of Helsinki Principles.

Statistics
Rasch analysis. Statistical analyses have been described in detail previously (Nijsten et al., 2005,
2006). The same procedure of Rasch analysis was followed for samples 1 and 2. To ensure a homogeneous
and ordered scoring system, the threshold order was examined for each of the 29 items. Owing to
practical reasons, the scoring categories were not altered for each item separately, but according to
the optimal ordering for the vast majority of the items.
    Based on a cluster analysis that included Skindex-29 (Sampogna et al., 2004) that showed a high
correlation between functional and emotional subscales, we combined these two subscales (psychosocial
subscale), and analyzed them separately from the symptoms subscale (21 and 7 items, respectively). As
the data of the two subscales of sample 1 and 2 did not fit the rating scale model (P<0.001), we used
the unrestricted Rasch model for all analyses. In
      MATERIALS AND METHODS                                                      the Rasch model, individual item misfit was present if an item
      Participants and settings                                                  showed a significant w2 statistic or a large positive (42.5) or negative
      The design of the study included the characteristics of its participants   (o2.5) standardized residual. To ensure that responses to
      who were all aged 18 or more and attending the outpatient clinics of       individual items were unaffected by external factors (age, gender,
      dermatology and dermatological surgery of a large Italian dermato-         diagnosis, and disease severity) to the measurement tool, DIF
      logical hospital on predetermined days. More than 80% of the               analyses were performed (Angoff, 1993). For the symptoms subscale,
      patients are referred to this institution by their general practitioner.   DIF for diagnosis was not assessed because preliminary analyses
      After signing an informed consent, the Italian version of Skindex-29,      showed that all symptom-related items had significant DIF for
      which was created according to the guidelines for the cross-cultural       diagnosis (Po0.0001), except items 1 and 26 (P ¼ 0.38 and 0.05,
      adaptation of the HRQOL measures (Guillemin et al., 1993) and              respectively). Overall fit of the data to the model was assessed using
      validated (Abeni et al., 2001, 2002), was administered (Appendix           item–trait interaction.
      S1). After the visit, the physician registered the diagnosis and rated         To confirm the unidimensionality of the retained items that fitted
      the severity of the skin condition by answering the question ‘‘In your     the Rasch model, a principal component analysis was performed. If
      experience, among all patients you have seen with this condition,          the first factor after discounting the ‘‘Rasch factor’’ accounted for
      how severe is this patient’s condition?’’, which was defined as the        o40% of the variance, the scale was assumed to be unidimensional.
      physician global assessment. The possible responses were: very
      mild, mild, moderate, severe, and very severe. The 2,242                   Classical psychometric features of Skindex-17. To test the
      dermatological patients presented various dermatological diagnoses.        validity of Skindex-17, we examined the relation between the
      However, in the present study, we restricted the analyses to 454           original and the short version of the Skindex using Pearson’s
      patients whose diagnoses were part of the following diagnostic             correlation coefficients (r) and the adjusted R2 in a linear regression


model. Construct validity was assessed by testing the assumption that patients with inflammatory dermatoses such as psoriasis and acne would have higher scores than patients with nevi. In addition, multiple classical test theory-based psychometric features of Skindex-17 and its items were studied, including response distribution, item–rest correlation, item discriminant validity, item complexity, and Cronbach’s α’s (Nijsten et al., 2005, 2006). A principal component analysis followed by oblique rotation was performed to test the proposed subscaling of Skindex-17. Factors with eigenvalues of 1 or more were retained.

Categorization of scores. The scores of the new Rasch reduced psychosocial and symptoms subscale of the Skindex were entered into a mixture analysis using computer-assisted mixture analysis (Haughton, 1997). This statistical method detects whether multiple, distinct subdistributions are present in the distribution of the scores of all the individuals and would, therefore, facilitate the interpretation of the new scores. We used the same procedure as described previously for the categorization of another HRQOL instrument (Nijsten et al., 2006). In brief, we estimated the weights for the maximum number of grid points suggesting Poisson distributions, which were subsequently used as starting values for the expectation–maximization algorithm. Then, a much smaller number of weights and support points were computed with nonparametric maximum likelihood estimations of 1 and by combining support points that are very close to each other or have very low weights. The support points and weights that remained were then entered in a fixed mixture to classify each observation in one of the different mixture components using posterior probabilities.
All statistical tests were two sided. Significance was assessed at an α level <0.05, with the exceptions outlined above. Stata version 7.0 (Stata Corp., College Station, TX) was used to estimate Cronbach’s α, the correlation coefficients, and the principal axis analyses. SPSS version 10.0 for Windows (SPSS Inc., Chicago, IL) was used to randomize the sample in two groups. Rasch analyses were performed using RUMM2020 (RUMM Laboratory Pty Ltd, Perth, Australia). To categorize the Rasch version of the Skindex, computer-assisted mixture analysis 2.0 was used (Boehning and Schlattmann, 2005).

CONFLICT OF INTEREST
The authors state no conflict of interest.

ACKNOWLEDGMENTS
This study was financially supported, in part, by the Italian Ministry of Health, Rome, Italy. Dr Chren was supported by an Independent Scientist Award (#K02 AR 02203-01) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health. Tamar Nijsten was supported by a grant from the Fund for Scientific Research-Flanders, Belgium (FWO-Vlaanderen).

SUPPLEMENTARY MATERIAL
Table S1. Individual item fit to the Rasch model and DIF for the items of the psychosocial subscale of Skindex-17 across demographic and disease characteristics (sample 1, n = 227 patients).
Table S2. Individual item fit to the Rasch model and DIF for the items of the symptoms subscale of Skindex-17 across demographic and disease characteristics (sample 1, n = 227 patients).
Table S3. For the total and the 2 random samples, the mean and median (25th and 75th percentiles) of Skindex-17 scores among patients with different skin conditions.
Table S4. Categorization of the psychosocial and symptoms subscale of Skindex-17 based on mixture analysis (n = 454)*.
Appendix S1. Skindex-29 items.
Appendix S2. Skindex-17 items.

REFERENCES
Abeni D, Picardi A, Pasquini P, Melchi CF, Chren MM (2002) Further evidence of the validity and reliability of the Skindex-29: an Italian study on 2,242 dermatological outpatients. Dermatology 204:43–9
Abeni D, Picardi A, Puddu P, Pasquini P, Chren MM (2001) Construction and validation of the Italian version of Skindex-29, a new instrument to measure quality of life in dermatology (in Italian, abstract in English). G Ital Dermatol Venereol 136:73–6
Angoff WH (1993) Perspectives on differential item functioning methodology. In: Differential item functioning (Holland PW, Wainer H, eds), Hillsdale, NJ: Lawrence Erlbaum
Augustin M, Wenninger K, Amon U, Schroth MJ, Kuster W, Chren M et al. (2004) German adaptation of the Skindex-29 questionnaire on quality of life in dermatology: validation and clinical results. Dermatology 209:14–20
Boehning DB, Schlattmann P (2005) Computer assisted analysis of mixtures (C.A.MAN). Available at: ftp.ukbf.fu-berlin.de/sozmed/caman.html (accessed 20 August)
Bullinger M, Anderson R, Cella D, Aaronson N (1993) Developing and evaluating cross-cultural instruments from minimum requirements to optimal models. Qual Life Res 12:451–9
Chren MM (1999) Understanding research about quality of life and other health outcomes. J Cutan Med Surg 3:312–6
Chren MM, Lasek RJ, Sahay AP, Sands LP (2001) Measurement properties of Skindex-16: a brief quality-of-life measure for patients with skin diseases. J Cutan Med Surg 5:105–10
Chren MM (2005) Measurement of vital signs of skin diseases. J Invest Dermatol 125:vii–ix
Chren MM, Lasek RJ, Flocke SA, Zyzanski SJ (1997) Improved discriminative and evaluative capability of a refined version of Skindex, a quality-of-life instrument for patients with skin diseases. Arch Dermatol 133:1433–40
Chren MM, Lasek RJ, Quinn LM, Mostow EN, Zyzanski SJ (1996) Skindex, a quality-of-life measure for patients with skin disease: reliability, validity, and responsiveness. J Invest Dermatol 107:707–13
Coste J, Guillemin F, Pouchot J, Fermanian J (1997) Methodological approaches to shortening composite measurement scales. J Clin Epidemiol 50:247–52
de Jong Z, van der Heijde D, McKenna SP, Whalley D (1997) The reliability and construct validity of the RAQoL: a rheumatoid arthritis-specific quality of life instrument. Br J Rheumatol 36:878–83
Guillemin F, Bombardier C, Beaton D (1993) Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol 46:1417–32
Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR, Clinical Significance Consensus Meeting Group (2002) Methods to explain the clinical significance of health status measures. Mayo Clin Proc 77:371–83
Haughton D (1997) Packages for estimating finite mixtures: a review. Am Stat 51:194–204
Hemingway H, Stafford M, Stansfeld S, Shipley M, Marmot M (1997) Is the SF-36 a valid measure of change in population health? Results from the Whitehall II Study. BMJ 315:1273–9
Houssien DA, McKenna SP, Scott DL (1997) The Nottingham Health Profile as a measure of disease activity and outcome in rheumatoid arthritis. Br J Rheumatol 36:69–73
McHorney CA (1997) Generic health measurement: past accomplishments and a measurement paradigm for the 21st century. Ann Intern Med 127:743–50
McKenna SP, Cook SA, Whalley D, Doward LC, Richards HL, Griffiths CE et al. (2003) Development of the PSORIQoL, a psoriasis-specific measure


of quality of life designed for use in clinical practice and trials. Br J Dermatol 149:323–31
Michell J (1997) Quantitative science and the definition of measurement in psychology. Br J Psychol 88:355–83
Michell J (2003) Measurement: a beginner’s guide. J Appl Meas 4:298–308
Nijsten T, Unaeze J, Stern RS (2006) Refinement and reduction of the Impact of Psoriasis Questionnaire: classical test theory vs Rasch analysis. Br J Dermatol. 16 January [E-pub ahead of print, doi:10.1111/j.1365-2133.2005.07066.X]
Nijsten T, Whalley D, Gelfand J, Margolis DJ, McKenna SP, Stern RS (2005) The psychometric properties of the Psoriasis Disability Index in United States patients. J Invest Dermatol 125:665–72
Piccinelli M, Bisoffi G, Bon MG, Cunico L, Tansella M (1993) Validity and test–retest reliability of the Italian version of the 12-item General Health Questionnaire in general practice: a comparison between three scoring methods. Compr Psychiatry 34:198–205
Prieto L, Alonso J, Lamarca R (2003) Classical test theory versus Rasch analysis for quality of life questionnaire reduction. Health Qual Life Outcomes 1:27
Rasch G (1960) Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press (reprinted, 1980)
Sampogna F, Sera F, Abeni D, IDI Multipurpose Psoriasis Research on Vital Experiences (IMPROVE) Investigators (2004) Measures of clinical severity, quality of life, and psychological distress in patients with psoriasis: a cluster analysis. J Invest Dermatol 122:602–7
Stevens SS (1946) On the theory of scales. Science 103:677–88
Streiner DL, Norman GF (2003) Health measurement scales: a practical guide to their development and use. 3rd edn. Oxford: Oxford University Press
Tennant A, McKenna SP, Hagell P (2004a) Application of Rasch analysis in the development and application of quality of life instruments. Value Health 7:S22–6
Tennant A, Penta M, Tesio L et al. (2004b) Assessing and adjusting for cross-cultural validity of impairment and activity limitation scales through differential item functioning within the framework of the Rasch model: the PRO-ESOR project. Med Care 42:I37–48
Tesio L (2003) Measuring behaviours and perceptions: Rasch analysis as a tool for rehabilitation research. J Rehabil Med 35:105–15
Unaeze J, Nijsten T, Murphy A, Ravichandran C, Stern RS (in press) Impact of psoriasis on health related quality of life decrease over time: an 11-year prospective study. J Invest Dermatol
Whalley D, McKenna SP, Dewar AL, Erdman RA, Kohlmann T, Niero M et al. (2004) A new instrument for assessing quality of life in atopic dermatitis: international development of the Quality of Life Index for Atopic Dermatitis (QoLIAD). Br J Dermatol 150:274–83
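
Illustrative note on internal consistency. The Statistics section names Cronbach's α as one of the classical psychometric features examined for Skindex-17. The following is a minimal Python sketch of how that coefficient is conventionally computed from item-level responses; it is not the authors' code (they report using Stata 7.0), and the function name and the small data set are hypothetical. It assumes complete responses and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five respondents, three items scored in three categories (0-2)
demo = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 2, 1],
    [0, 0, 0],
    [2, 1, 2],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Applied per subscale, the function would take one row per patient and one column per retained item; how missing responses were handled in the study is not stated here, so the sketch does not attempt it.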
