SEXUAL ASSAULT EDUCATION PROGRAMS: A META-ANALYTIC EXAMINATION OF THEIR EFFECTIVENESS
Psychology of Women Quarterly, 29 (2005), 374–388. Blackwell Publishing. Printed in the USA. Copyright © 2005 Division 35, American Psychological Association. 0361-6843/05

Linda A. Anderson, Oregon State University
Susan C. Whiston, Indiana University

Meta-analyses of the effectiveness of college sexual assault education programs on seven outcome measure categories were conducted using 69 studies that involved 102 treatment interventions and 18,172 participants. Five of the outcome categories had significant average effect sizes (i.e., rape attitudes, rape-related attitudes, rape knowledge, behavioral intent, and incidence of sexual assault), while the outcome areas of rape empathy and rape awareness behaviors did not have average effect sizes that differed from zero. A significant finding of this study is that longer interventions are more effective than brief interventions in altering both rape attitudes and rape-related attitudes. Moderator analyses also suggest that the content of programming, type of presenter, gender of the audience, and type of audience may also be associated with greater program effectiveness. Implications for research and practice are discussed.

Linda A. Anderson, University Counseling and Psychological Services, Oregon State University; Susan Whiston, Department of Counseling & Educational Psychology, Indiana University.

Address correspondence and reprint requests to: Linda A. Anderson, University Counseling and Psychological Services, Oregon State University, 500 Snell Hall, Corvallis, OR 97331. E-mail: Linda.Anderson@oregonstate.edu

The disturbingly high incidence of sexual assault experienced by college women has been widely documented over the last few decades (e.g., Abbey, Ross, McDuffie, & McAuslan, 1996; Brener, McMahon, Warren, & Douglas, 1999; Koss, Gidycz, & Wisniewski, 1987). Consequently, the need for sexual assault prevention on college campuses nationwide has become increasingly apparent. The federal government has acknowledged the importance of this issue by mandating that sexual assault prevention efforts be conducted on campuses that receive federal funding (Neville & Heppner, 2002). As a result, college education programs have emerged as one of the more popular strategies for sexual assault prevention.

Although interventions have been developed and implemented at various colleges and universities across the United States since the 1980s, few of these programs have been empirically evaluated (Schewe & O’Donohue, 1993a; Yeater & O’Donohue, 1999). McCall (1993) summarized this situation by contending, “[S]exual assault prevention programming remains a confused, scattered, and sporadic enterprise with little scientific underpinning” (p. 277). Fortunately, in recent years the research on these programs has expanded; however, drawing conclusions from the myriad of dissertations and publications in this area can be a daunting task. Consequently, despite increases in recent research, little is actually known about the overall effectiveness of these programs and whether they produce lasting attitudinal or behavioral changes (Heppner, Neville, Smith, Kivlighan, & Gershuny, 1999).

In an attempt to understand the value of these programs, several narrative reviews of the literature have been published (e.g., Bachar & Koss, 2001; Breitenbecher, 2000; Gidycz, Rich, & Marioni, 2002; Lonsway, 1996; Schewe & O’Donohue, 1993a; Yeater & O’Donohue, 1999). While several reviewers have concluded that most programs display short-term effectiveness in altering rape-supportive attitudes, there is little understanding of the impact of these interventions beyond this point. Unfortunately, narrative review of such a broad range of findings has several limitations. First, past narrative reviews of the sexual assault education literature typically have not included unpublished studies, and thus may tend to inflate the effectiveness of programming (Brecklin & Forde, 2001; Breitenbecher, 2000). Furthermore, narrative reviews do not provide quantitative indices of the degree to which particular program approaches are effective, nor do they typically systematically identify factors that may moderate or influence program effectiveness. For many involved in sexual assault education, it is not sufficient to know if these interventions are generally effective because they are interested in developing programs that have the most significant impact on participants. Hence, identification and analysis of moderator variables may be particularly important.

Meta-analysis is a technique that overcomes some of the limitations of narrative reviews and provides a numerical
indicator of program effectiveness (i.e., effect size) that allows individuals to determine the degree to which interventions are effective (Lipsey & Wilson, 2001). Furthermore, meta-analytic techniques can be used to identify variables that influence effect size (i.e., moderator variables). There have been two previous meta-analytic reviews of sexual assault education programs. The first meta-analytic review of the sexual assault education literature (Flores & Hartlaub, 1998) yielded an average effect size of .30. This effect size indicates that those attending a sexual assault education program would have outcomes almost a third of a standard deviation better than participants who had not attended a program. Flores and Hartlaub’s meta-analysis included only 11 studies that used rape-myth acceptance as the sole outcome measure. The second meta-analysis found an overall mean effect size of .35, based on 45 studies (Brecklin & Forde, 2001). Brecklin and Forde (2001) were also able to identify several variables that moderated effect size. They found that programs were more effective for men in single-gender than in mixed-gender groups. They also found that published studies had larger effect sizes, supporting the need to include both published and unpublished research in future reviews. However, Brecklin and Forde’s (2001) meta-analysis considered only one category of outcome, rape attitudes. The current study was designed to expand on previous analyses of the effectiveness of sexual assault education programs by (a) examining a more diverse set of outcomes and (b) exploring whether a wider spectrum of program factors (e.g., type of presenter, content of program) may influence program effectiveness.

Concerning the first goal mentioned above, there are a variety of outcome measures that have been used by individual researchers to examine the impact of sexual assault education programs. Until recently, most studies have relied exclusively upon attitudinal measures to assess the effectiveness of sexual assault education. Several researchers have questioned this restricted focus on attitudes as the only indicator of change (e.g., Heppner, Humphrey, Hillenbrand-Gunn, & DeBord, 1995; Lonsway, 1996; Schewe & O’Donohue, 1993a) because there is still debate on whether a reduction in rape-supportive attitudes will reduce the actual incidence of sexual assault. Consequently, investigators have begun to utilize measures that assess other outcomes, such as knowledge about sexual assault, behaviors thought to be associated with sexual assault, and incidence of perpetration or victimization. Due to the recent expansion of outcome measurement in the literature, it was considered important to incorporate diverse outcome measures in the current meta-analysis.

In this investigation, seven different outcome variables were analyzed separately: rape attitudes, rape empathy, rape-related attitudes, rape knowledge, behavioral intentions, rape awareness behaviors, and incidence of sexual assault. In essence, seven separate meta-analyses were conducted, one for each distinct outcome variable. These outcomes represent various attitudinal, knowledge, and behavioral categories and were adapted from construct categories offered by Breitenbecher (2000). Although these outcome categories are likely to be correlated, they were analyzed separately to obtain information about the differential impact of programming on various constructs, particularly in light of concerns about the tenuous link between attitudes and behaviors. Lipsey and Wilson (2001) recommended this approach, as averaging effect sizes across all of these constructs would result in more ambiguous and less meaningful results.

Attitude measurements employed in the sexual assault education literature represent a diverse range and thus were divided into three categories. For the purpose of this investigation, dependent measures that may be categorized as rape attitudes were those that assessed attitudes specific to rape, such as rape myth acceptance, attitudes toward rape, and rape victim blame. This category is similar to the outcomes analyzed in prior meta-analyses. Rape empathy was the second outcome category. It included scales designed to measure empathy specifically related to rape and the degree to which participants identified with either rape victims or perpetrators. This outcome was differentiated because reviewers have specifically identified empathy as a construct targeted in educational programming (Lonsway, 1996; Schewe, 2002). The third attitudinal construct was rape-related attitudes. The scales incorporated into this category did not measure rape-specific attitudes, but assessed attitudes thought to promote the occurrence of sexual assault. This category included measures of sex-role stereotyping, attitudes toward women, and adversarial sexual beliefs. These measures were not included in past meta-analyses; thus, this category may contribute additional information about the impact of programming.

The fourth outcome, rape-related knowledge, consisted of measures of participants’ factual knowledge about sexual assault. The final three outcome categories reflected varying dimensions of behavior, which are typically defined and assessed differently for women and men. Behavioral intentions included participants’ self-reported intent to rape or to engage in certain dating behaviors. Rape awareness behaviors referred to the actual self-reported or observed behaviors of participants that may reflect heightened awareness about sexual assault (e.g., differences in dating behaviors or willingness to volunteer for rape prevention efforts). The final outcome category included in this study was the actual incidence of sexual assault perpetration and victimization following an intervention.

The second major goal of the current investigation was to examine the impact of several potential moderating variables. Descriptive characteristics about the study (e.g., published or unpublished) and its methods and procedures (e.g., sample size, random assignment, and time of follow-up measure) were incorporated in the moderator analysis to examine if the type and quality of the research influence effect size. In addition, it was deemed critical to
examine both characteristics of the participants and the interventions to delineate the attributes of effective programs. Attending to issues of gender is particularly important in analyzing sexual assault education programs because rape is primarily a crime committed by males against females, and therefore there are likely to be gender differences on the outcome variables. The gender of the audience was targeted for moderator analysis because the question of whether all-male, all-female, or mixed groups are more effective has been frequently debated. In a narrative review, Breitenbecher (2000) tentatively concluded that single-gender programs appeared more effective. Similarly, Berkowitz (2002) and Rozee and Koss (2001) argued that mixed-gender programs are less effective because men may become defensive in the presence of women. Brecklin and Forde (2001) explored this issue in their meta-analysis and concluded that single-gender programs are indeed more effective at reducing men’s rape-supportive attitudes. This investigation sought to replicate prior conclusions concerning rape attitudes, while also exploring the impact of single- and mixed-gender programming on additional outcome constructs.

Based on the recommendations of previous reviewers (e.g., Breitenbecher, 2000; Lonsway, 1996; Yeater & O’Donohue, 1999), additional moderators examined in this study included status of the intervention facilitator (e.g., peer, graduate student, or professional), type of population that received the intervention (e.g., fraternity members), length of the program, and intervention content. Surprisingly, previous reviews have not explored in detail the degree to which content influences program effectiveness; thus, practitioners have little information concerning what type of content to include to maximize a program’s impact. Based on a comprehensive review of the literature and an analysis of over 100 program descriptions, four general types of rape education programs were identified. Although some interventions exhibited elements of more than one type of program, an attempt was made to categorize the content of each program based upon its primary focus. The nature of coding this particular item was somewhat subjective; however, this strategy provided at least an initial exploration of the effect of differences in content to guide practitioners in program development.

In sum, this meta-analysis expands on previous reviews by focusing on additional characteristics of the participants, facilitators, and content of the interventions and how these factors may moderate outcome. Furthermore, because previous reviews have examined the effect of sexual assault education programs on attitudinal variables only, this meta-analysis examined additional outcomes related to knowledge and behavior. Finally, additional statistical procedures as recommended by Hedges and Olkin (1985) and Lipsey and Wilson (2001) were implemented to calculate weighted effect sizes and to consider methodological factors of studies in drawing conclusions.

METHOD

Selection of Studies

Several strategies were used to ensure that virtually all pertinent data, both published and unpublished, were used in this meta-analysis. Initially, seven computerized reference database systems were searched: PsycINFO, ERIC, MEDLINE, Dissertation Abstracts Online, Criminal Justice Abstracts, Sociological Abstracts, and the Social Science Citation Index. A number of combinations of key words (e.g., rape, sexual assault, prevention, intervention) were used to identify pertinent studies, dissertations or theses, and program evaluation reports. The second step for identifying relevant studies was to examine the references of the articles obtained and of previous reviews. The third step involved searching by hand the last 5 years of journals that have traditionally published articles relating to sexual assault to find articles that were comparatively recent and not in databases or cited by other researchers. The final strategy was to contact authors who have published multiple articles concerning sexual assault education and to request their assistance in locating studies. For more detailed information concerning methodological procedures, refer to Anderson (2003).

To have been eligible for inclusion, studies must have examined an intervention intended to reduce negative attitudes and/or behaviors associated with sexual assault. Eligible studies must have measured the effectiveness of an intervention using one or more quantitative dependent measures designed to assess one of seven different outcome categories: rape attitudes, rape empathy, rape-related attitudes, rape knowledge, behavioral intentions, rape awareness behaviors, and/or incidence of sexual assault. Because the prevalence, definition, and general understanding of sexual assault may vary by culture, only studies that involved North American college students as participants were included in this meta-analysis. Furthermore, a study had to compare one or more interventions with a control group. Control groups could be placebo, wait-list, minimal treatment, or no-treatment groups. Either random assignment of participants or a pretest was required to ensure equivalence of groups on outcome measures prior to the intervention. Consistent with the suggestions of Hedges and Olkin (1985), neither treatment-versus-treatment nor pretest–posttest comparisons were included so as to calculate the least biased estimate of the effectiveness of rape education programs. Finally, each study also needed to provide the necessary information to calculate effect sizes, such as means and standard deviations or other statistical test information.

Of the 120 studies identified as research on sexual assault education programs, 51 of those studies could not be used for this meta-analysis. Many of those studies (n = 19) were eliminated because they lacked a control group. Other studies (n = 11) were not included because they duplicated another study (e.g., a dissertation later published as a
journal article), and some studies (n = 10) did not provide sufficient information to calculate effect sizes. Other studies were eliminated because the participants were not college students (n = 4) or because the studies included no relevant dependent measures (n = 2), lacked pretest or random assignment (n = 4), or did not involve an intervention (n = 1). After being screened, 69 studies met the criteria for inclusion in this review.

Coding Procedures

Consistent with the recommendations of Stock (1994), a manual was developed to guide coding. Attention was given to coding of moderator variables to determine whether factors such as methodological sophistication or content of the intervention moderated the effect size of an investigation. As suggested by Lipsey and Wilson (2001), moderator variables were divided into three categories: source descriptors, methods and procedures, and substantive issues (i.e., characteristics of the participants and the intervention).

Source descriptors coded for this study included publication form (journal, dissertation/thesis, unpublished) and year of publication. Information concerning methods and procedures was coded as follows: unit and type of assignment to conditions (individual random, group random, individual nonrandom, group nonrandom), nature of control group (placebo, no treatment, minimal information), time of follow-up measure, and overall quality of study. Guidelines were developed to standardize evaluation of the overall quality of the study (see Anderson, 2003). We further coded for attempts to control for social desirability or demand characteristics, sample attrition, standardization of measures, and reliability of measures; however, these factors were not included in the moderator analyses because the information was inconsistently reported.

Information coded about the participants included gender, mean age, race, gender of audience (all-female, women from a mixed-gender group, all-male, men from a mixed-gender group, women and men combined), and type of audience (Greek members, general students, high risk). Information about the intervention was coded as follows: status of facilitator (peer, graduate student, or professional), length of program, and content of program. The content of the program was coded as follows: (a) informative (providing factual information and statistics, review of myths and facts, discussion of consequences of rape, identification of characteristics of rape scenarios); (b) empathy focused (helping participants develop empathy for rape victims); (c) socialization focused (examining gender-role stereotyping, societal messages that influence rape, oppression); (d) risk reducing (teaching specific strategies to reduce one’s risk of rape); (e) more than one content; (f) cannot determine; and (g) other. Theoretical foundation was also coded, but could not be analyzed as a moderator due to the dearth of studies that included this variable.

The first author coded each article, and the second author (a professor of counseling psychology) coded a randomly selected portion of articles (n = 12) so that reliability estimates could be obtained. The rate of agreement between the two coders was calculated according to the four types of moderator variables. The rate of agreement for source descriptors combined with methods and procedures was 95.5%, for participant characteristics 97.6%, and for intervention characteristics 92.1%.

Analysis

Statistical analysis was conducted in accordance with the guidelines provided by Lipsey and Wilson (2001). To identify effects on behavior, knowledge, and attitudes, separate meta-analyses were conducted for each of the seven outcome categories. Each effect size (g) was calculated by subtracting the mean of the control group from the mean of the experimental group and dividing by the pooled standard deviation. When means and standard deviations were not available from primary research studies, procedures summarized by Lipsey and Wilson (2001) to estimate standardized mean difference effect sizes from statistics (e.g., t tests) were utilized. Following the calculation of effect sizes, these values were then transformed to d to correct for small sample bias (Hedges & Olkin, 1985). Once effect sizes were calculated for each study, adjusted for bias and error, and combined within each construct, the standard error of the estimate was calculated and each d was then weighted by its inverse variance, which produced the aggregated effect size estimate d+ (Hedges & Olkin, 1985). This procedure also corrects for bias because it requires each effect size to be weighted by a value (inverse variance weight) that represents its precision (Lipsey & Wilson, 2001).

Lipsey and Wilson (2001) and Rosenthal (1991) argued that effect sizes within a distribution need to be statistically independent and suggested using only one effect size for each construct examined. One method for ensuring independence in outcome measures is to not use data from the same measure more than one time. For example, some studies will assess outcome at the end of treatment and then later in a follow-up analysis to determine the long-term effect of the intervention. We opted to calculate effect sizes in this meta-analysis using measures from the last follow-up analysis conducted for each study. Given that effect size tends to decrease with time, this conservative method was chosen to enhance our understanding of the long-term effectiveness of programming, which may not be reflected in an immediate posttest. As another way to ensure independence of data, we averaged the effect sizes within each category to produce one effect size per outcome. Finally, when there was more than one treatment intervention within a single study, effect sizes were calculated for each treatment relative to the control group, to avoid losing valuable information about different types of programs.

Due to our interest in examining gender issues, effect sizes for women and men were coded separately whenever possible, in an effort to determine whether women and men in single-gender groups benefit more from programming
than those in mixed-gender groups. However, some authors reported the results only for women and men combined.

Following the calculation of mean weighted effect sizes for each outcome construct, 95% confidence intervals were computed to determine the significance of each mean effect size. If the confidence interval does not include zero, then the mean effect is significantly different from zero (p < .05). Next, a homogeneity analysis was conducted to determine whether each effect size represented a common population mean. Homogeneity testing was performed according to procedures described by Hedges and Olkin (1985) and was based upon the Q statistic, which follows a chi-square distribution with k − 1 degrees of freedom (k being the number of effect sizes). If the effect size distribution was found to be heterogeneous (a significant Q value), this heterogeneity indicated that the dispersion of effect sizes around their mean was greater than that which would be expected from sampling error alone (Lipsey & Wilson, 2001). In this circumstance, analyses of moderator variables should be conducted in order to identify additional sources of variance. In addition, as recommended by Hedges and Olkin (1985), outliers were examined in an effort to determine if the removal of these values would cause the distribution to achieve homogeneity. For this investigation, an outlier was defined as an effect size that fell two standard deviations above or below the mean of the distribution, which is a common procedure according to Lipsey and Wilson (2001).

Of the 13 variables identified as potential moderators, 7 were categorical and were investigated using a procedure analogous to analyses of variance (ANOVA; Hedges & Olkin, 1985). Modified weighted least squares regression (Hedges & Olkin, 1985) was utilized for analysis of the six continuous variables.

RESULTS

This meta-analysis included results from 69 articles representing 102 treatment interventions. Because some interventions were evaluated using multiple outcome measures, and effect sizes were coded separately for women and men whenever possible, the results are based on 262 effect sizes. Approximately 43.9% of these effect sizes represent rape attitudes, 7.6% rape empathy, 21.4% rape-related attitudes, 6.9% knowledge, 9.5% behavioral intentions, 5.7% rape awareness behaviors, and 5% incidence of sexual assault. Before discussing the results of the seven separate meta-analyses, we will provide a brief overview of all of the studies included.

Studies Overview

The studies included in this meta-analytic review involved 18,172 participants, of which 48.7% were women (SD = 35). The participants’ average age was 20.3 (SD = 2.3). Race could be determined in 71% of the studies as follows: 4.1% African American, 4.3% Asian American, 84% Caucasian, 3.1% Latino/a, and 4.9% other participants. The studies were authored between 1978 and 2002 (M = 1995.1, SD = 4.7), 58% were published in academic journals, 37.6% were dissertations or theses, and 4.3% were conference papers or other unpublished works. Regarding methodology, 68% of studies used random assignment (of either groups or individuals), while 17% of studies attempted to control for social desirability or demand characteristics. The number of outcome measures used in each study ranged from 1 to 26 with an average of 4.1 measures per study (SD = 3.7); however, only an average of 3.6 (SD = 2.7) of the measures could be coded. Reliability estimates were reported for the majority of measures. Regarding the type of control group employed, 59.4% used a no-treatment control, 10.1% used a wait-list control, 24.6% used a placebo intervention, and 5.8% used minimal information groups for comparison with treatment groups. For additional information about the studies included, refer to Anderson (2003).

Concerning the effectiveness of the 102 interventions evaluated, Table 1 provides the overall effect sizes for each of the seven outcome categories. As Table 1 indicates, the mean weighted effect size for sexual assault education programs ranged from .061 for rape awareness behavior measures to .574 for measures of rape knowledge. As reflected by the confidence intervals, the effect sizes for rape

Table 1
Sample Size, Mean Weighted Effect Size, 95% Confidence Interval, Homogeneity Test, and Fail-Safe N for Seven Outcome Categories

Outcome Category         k     d+       95% C.I.      Homogeneity Test     Fail-Safe N
Rape attitudes           115   .211**   .175/.246     Q(114) = 215.41**    569
Rape empathy             20    .072     −.003/.147    Q(19) = 45.86**      —
Rape-related attitudes   56    .125**   .076/.174     Q(55) = 79.68*       86
Rape knowledge           18    .574**   .498/.650     Q(17) = 119.54**     118
Behavioral intent        25    .136**   .054/.217     Q(24) = 42.47*       16
Awareness behavior       15    .061     −.018/.140    Q(14) = 19.24        —
Incidence                13    .101**   .036/.167     Q(12) = 33.47**      7

Note. k = Number of effect sizes, d+ = Mean weighted effect size, C.I. = Confidence interval, and Q = Hedges & Olkin’s (1985) measure of homogeneity.
* p < .05. ** p < .01.
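As an illustration of how the columns of Table 1 relate to one another, the following Python sketch computes a bias-corrected effect size d, the inverse-variance weighted mean d+, its 95% confidence interval, and the Q statistic, and then applies the "confidence interval excludes zero" rule to the values reported in Table 1. It is a minimal sketch under stated assumptions, not the authors' actual computation: the function names are hypothetical, the variance formula is the usual large-sample approximation for a standardized mean difference, the interval uses the normal critical value 1.96, and the per-study inputs in the usage section are invented for demonstration only.

```python
import math


def hedges_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Bias-corrected standardized mean difference and its variance (assumed formulas)."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    g = (mean_t - mean_c) / pooled_sd          # uncorrected effect size g
    d = (1 - 3 / (4 * (n_t + n_c) - 9)) * g    # small-sample bias correction
    var = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return d, var


def aggregate(effects):
    """effects: list of (d, variance). Returns (d_plus, (ci_low, ci_high), q)."""
    weights = [1 / var for _, var in effects]  # inverse-variance weights
    d_plus = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))           # standard error of d_plus
    ci = (d_plus - 1.96 * se, d_plus + 1.96 * se)
    # Q: weighted squared deviations around d_plus (chi-square, k - 1 df)
    q = sum(w * (d - d_plus) ** 2 for w, (d, _) in zip(weights, effects))
    return d_plus, ci, q


# Values as reported in Table 1: (d+, CI lower bound, CI upper bound)
TABLE_1 = {
    "Rape attitudes": (0.211, 0.175, 0.246),
    "Rape empathy": (0.072, -0.003, 0.147),
    "Rape-related attitudes": (0.125, 0.076, 0.174),
    "Rape knowledge": (0.574, 0.498, 0.650),
    "Behavioral intent": (0.136, 0.054, 0.217),
    "Awareness behavior": (0.061, -0.018, 0.140),
    "Incidence": (0.101, 0.036, 0.167),
}


def significant_outcomes(table):
    """Outcome categories whose 95% confidence interval excludes zero."""
    return [name for name, (_, low, high) in table.items() if low > 0 or high < 0]


if __name__ == "__main__":
    # Hypothetical study summaries, NOT data from the reviewed studies.
    toy_effects = [
        hedges_d(10, 8, 4, 4, 20, 20),
        hedges_d(12, 11, 5, 5, 40, 40),
    ]
    print(aggregate(toy_effects))
    print(significant_outcomes(TABLE_1))
```

Applied to the reported intervals, the rule singles out the same five categories that Table 1 marks as significant. Weighting by inverse variance means that larger, more precise studies pull d+ toward their own estimates, which is why the sketch carries a variance alongside every d.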
knowledge, rape attitudes, behavioral intent, rape-related attitudes, and incidence of sexual assault were all significant at the .05 level. For the five significant effect sizes, fail-safe Ns (Rosenthal, 1991) are also provided in Table 1. This statistic provides an estimate of the number of unpublished studies with null results required to reduce the mean weighted effect size to a value that is no longer statistically significant.

The five outcome categories that had significant effect sizes were examined to determine if the average effect sizes represented a homogeneous group of effect sizes or a heterogeneous group that should be further explored through moderator analysis. Although rape knowledge and incidence had significant mean effect sizes and significant tests of homogeneity, there were too few studies in these categories to ascertain reliable findings; thus, moderator analyses were not conducted for these outcomes. Moderator analyses were performed on the outcome categories of rape attitudes, rape-related attitudes, and behavioral intentions because these tests of homogeneity were significant, and a sufficient number of studies were available in these categories for analysis.

Rape Attitudes

There were 115 effect sizes in this outcome category, based on 89 treatment interventions conducted within 57 studies, which produced an average effect size of .211. On average, 41% of the participants were women (SD = 29.5). The most common outcome measure was the Rape Myth Acceptance Scale (Burt, 1980), and the average time for follow-up assessment was 34.9 days. The average intervention lasted 142.6 minutes, but there was substantial variation in length of programs (SD = 362.1). For the homogeneity analysis, results (Q = 214.41, p < .001) indicated moderator analyses were warranted. Outlier analysis revealed the presence of six outliers. However, deletion of these outliers only minimally reduced the overall effect size (i.e., from .211 to .209) and did not impact its significance. Furthermore, the Q statistic retained significance. These outliers were consequently retained for moderator analysis.

The results of moderator analysis for the seven categorical variables are presented in Table 2. When moderator variables are categorical, the interpretation is analogous to ANOVA, where Q_B reflects the portion explained by the categorical variable and Q_W indicates the residual pooled-within-groups portion. Journal articles were found to have a greater mean effect size than unpublished works. An analysis of the gender of the audience also revealed significant between-group differences. Women who received an intervention in an all-female group displayed the largest effect sizes, although only three studies were found, and this value was not significant. Finally, the content of the intervention appeared to moderate mean effect size, and interventions categorized as “risk reducing” resulted in the largest mean effect size, while empathy-focused programs and programs in which the content was unable to be determined did not produce significant change. However, because of significant heterogeneity within most of these categories, caution should be exercised in interpreting these results. Random assignment, nature of control group, type of population, and status of facilitator were not found to be significant moderators of effect size.

When a significant overall test of homogeneity (Q) exists and moderator variables are continuous (as compared to categorical in the previous analysis), then it is common to use regression to test which of the moderator variables contributes uniquely to variation in effect size. Two indices are used to assess the overall fit of the regression models, with Q_R reflecting a partitioning of the variability into the portion associated with the regression model and Q_E indicating the variability unaccounted for by the model. Examination of the six predictor variables (see Table 3) revealed that only two of them (length of intervention and number of participants) made significant contributions to variation among rape attitude effect sizes. The positive direction of each B-weight indicates that longer interventions and larger sample sizes are both associated with more positive change in rape attitudes.

Rape-Related Attitudes

This outcome category contained 56 effect sizes, based upon 43 treatment interventions from 26 studies, that produced an effect size of .125. On average, 38% of the participants were women (SD = 33). The average length of the intervention was 240.1 (SD = 526.0) minutes, and the follow-up assessment was conducted an average of 36.7 (SD = 111.5) days after the termination of treatment. The homogeneity test revealed significant variation among effect sizes (Q = 79.68, p < .05). Three effect sizes met the criteria as outliers; however, because deletion of these outliers did not reduce the Q statistic beyond significant heterogeneity and did not substantially change the average effect size, these effect sizes were retained in the moderator analysis.

Concerning rape-related attitudes, the moderator analyses for the categorical variables appear in Table 4. Nonrandom assignment and nature of the control group appeared to impact the magnitude of effect size. Once again it should be noted that some of the categories did not obtain within-group homogeneity. In addition, only one study used a wait-list control group; this category was dropped from analysis. Average effect sizes also differed significantly by the type of population that received the intervention, the status of the facilitator, and the content of the intervention. However, these significant between-group findings are tempered by the lack of homogeneity within several of the categories.

Modified weighted least squares regression analysis (see Table 5) was conducted for the six continuous variables in the same manner as previously described for the rape attitudes outcome, and the regression model was statistically
significant. The overall rating of the quality of the study and the time of measurement each contributed significantly to the prediction of effect size. More specifically, studies with higher overall quality ratings and studies with a greater time lapse between the end of the intervention and the time of measurement were both associated with smaller effect sizes. Furthermore, consistent with the outcome category rape attitudes, length of intervention was significant, with longer interventions being associated with larger effect sizes.

Table 2
Categorical Moderators of Rape Attitudes Outcome

Variable & Class        k    d+     95% C.I.     QW         QB
Type of publication                                         4.00∗
  Journal               60   .239   .19/.28      129.43∗∗
  Diss/thesis/unpub     55   .164   .11/.22      81.98∗∗
Random assignment                                           1.74
  Individual random     47   .215   .16/.27      94.92∗∗
  Group random          32   .241   .17/.31      27.33
  Nonrandom             36   .181   .12/.24      91.42∗∗
Nature of control group                                     7.15
  No treatment          71   .245   .20/.29      133.50∗∗
  Wait-list             5    .081   −.06/.22     10.59∗
  Attention placebo     33   .202   .13/.27      47.27∗
  Minimal information   6    .140   .03/.25      16.90∗∗
Type of population                                          7.49
  General students      94   .205   .17/.25      154.21∗∗
  Greek members         13   .290   .20/.38      40.22∗∗
  High-risk             5    .011   −.22/.24     10.39∗
  Other                 3    .045   −.21/.30     3.10
Gender of audience                                          14.83∗∗
  Female/female group   3    .287   −.013/.587   2.43
  Female/mixed group    27   .236   .163/.310    50.52∗∗
  Male/male group       24   .111   .027/.196    38.46∗
  Male/mixed group      29   .124   .039/.209    36.25
  Female/male combined  32   .273   .217/.329    72.93∗∗
Status/facilitator                                          3.59
  Peer                  19   .246   .17/.33      29.81∗
  Graduate student      30   .172   .10/.25      30.22
  Professional          28   .233   .17/.30      68.28∗∗
  Combination           16   .154   .05/.26      38.94∗∗
  Unknown-N/A           22   .222   .12/.32      44.58∗∗
Content/intervention                                        29.48∗∗
  Information           39   .257   .20/.31      69.78∗∗
  Empathy               9    .130   −.02/.28     18.69∗
  Socialization         13   .327   .19/.46      19.17
  Risk reduction        5    .435   .28/.59      12.53∗
  More than one         36   .165   .10/.23      56.34∗
  Other                 13   −.019  −.14/.11     9.41

Note. k = Number of effect sizes, d+ = Mean weighted effect size, C.I. = Confidence interval, QW = Homogeneity within each class, and QB = Homogeneity between classes.
∗ p < .05. ∗∗ p < .01.

Behavioral Intentions

Based upon 25 effect sizes obtained from 22 treatment interventions within 16 studies, the overall d+ for studies using measures of behavioral intentions was .136. In this outcome category, 31% of the participants were women. The average length of the intervention was 215 (SD = 386.2) minutes, and the follow-up assessment was conducted an average of 19.4 (SD = 43.9) days later. The homogeneity test revealed significant variation among effect sizes. Outlier analysis revealed the presence of one outlier (d+ = −.89), and removal of this outlier from the distribution and recalculation of the average weighted effect size increased the value from .136 to .15; however, the Q statistic dropped from 42.47 to 33.77, which borders on statistical significance (p = .068). This outlier was excluded from further moderator analysis because, unlike previously identified outliers, the study displayed several outstanding
methodological characteristics that may have impacted the results and was not typical of the general pool of studies. Although this exclusion caused the Q statistic to drop below significance, moderator analysis was nevertheless conducted to explore variations in effect sizes.

As Table 6 reflects, three variables appeared to moderate effect size for this outcome category: nature of the control group, gender of audience, and status of the facilitator. Modified weighted least squares regression analysis was conducted for the six continuous variables for this outcome as well. The regression model was not statistically significant, while the residual also did not attain statistical significance. This finding indicated that the regression model was not correctly specified for this outcome; thus, it was not appropriate to conclude that any of the regression coefficients was significantly different from zero.

Table 3
Continuous Moderators for Rape Attitudes Outcome

Predictor                       B        SEB     β       z
Year of publication             −.0091   .0049   −.147   −1.8548
Overall quality                 .0021    .0192   .009    .1083
Length of intervention          .0003    .0001   .227    2.7921∗∗
Percent attrition               −.0009   .0010   −.068   −.8993
Time of measurement             −.0002   .0004   −.045   −.5195
N of sample (after attrition)   .0002    .0001   .218    2.9792∗∗

Note. QR (6) = 19.134∗∗, QE (82) = 196.279∗∗, and R² = .09. B = Unstandardized regression coefficient, SEB = Corrected standard error value, β = Standardized regression coefficient, and z = z-test of significance.
∗∗ p < .01.

DISCUSSION

The primary purpose of this meta-analysis was to investigate the effectiveness of sexual assault education programs on college campuses using both published and unpublished studies. Specifically, we were interested in whether these programs influenced attitudinal outcomes, knowledge measures, and behavioral indices. Seven separate meta-analyses were conducted to determine the impact programming has on these distinct outcome categories. The results of this investigation indicate that the efficacy of sexual assault education programming on college campuses appears to differ depending on which types of outcomes are considered.

The outcome category that evidenced the most positive change was rape knowledge, which produced a mean effect size of .57. This finding indicates that those who participated in a sexual assault education program displayed greater factual knowledge about rape than those who did not attend a program. The positive effect size for rape knowledge could be considered to produce a “medium” effect using the guidelines suggested by Cohen (1988). The second largest effect size was found for the rape attitudes category (.21), which suggests that sexual assault education programming has a small but positive influence on rape attitudes. This mean effect size is somewhat lower than the previous meta-analyses of Brecklin and Forde (2001; .35) and Flores and Hartlaub (1998; .30). Possible reasons for these differences are the larger number of studies included in this investigation, implementation of controls for data dependence, and weighting of effect sizes as suggested by Hedges and Olkin (1985) and Lipsey and Wilson (2001). Although the effect sizes for behavioral intentions, rape-related attitudes, and incidence of sexual assault were statistically significant (.14, .12, and .10, respectively), the influence of sexual assault programs on these outcomes may be of little clinical significance, as these effect sizes do not reach the criteria for a small effect size (i.e., .20 to .40) as suggested by Cohen (1988). Sexual assault education programming did not appear to have any impact on rape empathy or rape awareness behaviors because the studies produced overall mean effect sizes that were not significantly different from zero.

Consequently, the answer to the question “Are sexual assault education programs effective?” indeed depends upon the criteria used to define effectiveness (Breitenbecher, 2000). If effectiveness is defined solely as a decrease in sexual assault, then there is little support available from the current pool of studies. Although a decline in incidence may be the ultimate goal of education programming, the extreme difficulty in obtaining accurate long-term statistics regarding involvement with sexual assault following an intervention (Schewe & O’Donohue, 1993a) indicates that additional outcomes should be considered. Our findings do indicate that sexual assault education programs are somewhat effective in changing attitudes toward rape and increasing rape knowledge. However, due to the dearth of studies using behavioral outcomes, more research using behavioral indices is needed before definitive conclusions can be reached.

In addition, the effect sizes of certain outcome constructs were most likely influenced by characteristics of the intervention, participants, and research methodology. For example, rape attitudes may be more subject to demand characteristics than other outcomes because these attitudes are often overtly discussed and disputed in sexual assault education workshops (Lonsway, 1996), while other outcome measures may be less directly associated with program content. A factor that may have influenced the effect sizes for behavioral variables pertains to when the outcome was measured. In general, effect sizes tend to decrease when there is a longer time between when the intervention is delivered and when the outcome is measured. Outcomes such as rape awareness behaviors had longer average follow-up measurement times (89 days) compared to rape attitudes (35 days). Hence, it is difficult to determine whether the lower effect sizes associated with behavioral outcomes relative to attitudinal measures are due to program impact or to time of assessment.

An advantage of meta-analytic methodology is the ability to examine variables that influence program outcome
through moderator analysis. A significant finding of this meta-analysis is that longer interventions (i.e., length of time exposed to material in minutes) seemed to be more effective in altering both rape attitudes and rape-related attitudes. Interestingly, the range in length of interventions was substantial (7 to 2,520 minutes); it seems sensible to conclude that a 7-minute intervention would be less effective than a much longer intervention. Although we did not specifically test single- versus multi-session programming, these findings suggest that semester-long courses or possibly multi-session workshops may be more effective in promoting positive change. Flores and Hartlaub (1998) and Brecklin and Forde (2001), however, did not find an association between the length of the intervention and program effectiveness. This difference may in part be due to their analysis of this variable as categorical, whereas we analyzed length of intervention as a continuous variable. We believe that, given the larger number of studies in our meta-analyses, our finding that longer interventions were more effective may more accurately represent the research in this area. Hence, we would encourage those designing educational programs to institute longer, more thorough interventions rather than brief programs. Because the attention span of students may be limited during one sitting, an educator might consider multi-session programming.

Table 4
Categorical Moderators for Rape-Related Attitudes Outcome

Variable & Class        k    d+     95% C.I.     QW         QB
Type of publication                                         .81
  Journal               25   .139   .08/.20      48.85∗∗
  Diss/thesis/unpub     31   .090   .00/.18      30.03
Random assignment                                           9.03∗
  Individual random     25   .056   −.02/.13     23.65
  Group random          14   .077   −.04/.19     5.08
  Nonrandom             17   .214   .14/.29      41.91∗∗
Nature of control group                                     14.92∗∗
  No treatment          38   .200   .14/.26      53.26∗
  Wait-list             1    --     --           --
  Attention placebo     11   .053   −.06/.17     7.66
  Minimal information   6    −.028  −.14/.08     3.83
Type of population                                          6.38∗
  General students      43   .098   .04/.16      53.62
  Greek members         8    .242   .13/.35      14.68∗
  High-risk             5    .011   −.22/.24     5.00
  Other                 0    --     --           --
Gender of audience                                          1.99
  Female/female group   1    --     --           --
  Female/mixed group    14   .114   .032/.196    33.09∗∗
  Male/male group       17   .169   .066/.273    19.20
  Male/mixed group      16   .094   −.002/.191   17.98
  Female/male combined  8    .123   −.001/.247   7.42
Status/facilitator                                          10.56∗
  Peer                  6    .157   .02/.29      15.41∗∗
  Graduate student      21   .029   −.06/.12     12.23
  Professional          12   .209   .13/.29      27.83∗∗
  Combination           10   .171   .02/.32      5.64
  Unknown-N/A           7    .032   −.12/.18     8.01
Content/intervention                                        14.73∗
  Information           19   .217   .13/.30      20.95
  Empathy               4    .094   −.22/.41     1.60
  Socialization         8    .300   .11/.48      6.00
  Risk reduction        2    .203   −.15/.56     2.33
  More than one         17   .030   −.04/.10     18.27
  Other                 5    .125   −.03/.28     15.79∗∗

Note. k = Number of effect sizes, d+ = Mean weighted effect size, C.I. = Confidence interval, QW = Homogeneity within each class, and QB = Homogeneity between classes.
∗ p < .05. ∗∗ p < .01.

This study also found that the status of the facilitator appears to influence changes in rape-related attitudes
and behavioral intentions. Professional presenters were more successful, while graduate students and peer presenters were generally less successful in promoting positive changes. Although there should be some caution in interpreting these results, these findings do raise questions about the common practice of employing peer facilitators. Peer education is popular not only in rape education but also in a number of other health-related educational programs (e.g., substance abuse, HIV, sexuality); however, both Backett-Milburn and Wilson (2000) and Parkin and McKeganey (2000) have questioned whether there is sufficient research to support this prevalent approach. Walker and Avis (1999) suggested several reasons why peer intervention might fail, including a lack of investment in peer education (viewing peers as “cheap labor”); lack of appreciation of the complexity of the peer education process and the need for highly skilled personnel; and inadequate supervision, training, and support. Consequently, it may be beneficial to address these concerns in future research before any conclusions can be offered concerning the effectiveness of peer educators.

Table 5
Continuous Moderators for Rape-Related Attitudes Outcome

Predictor                       B        SEB     β       z
Year of publication             .0066    .0072   .123    .9192
Overall quality                 −.0665   .0264   −.348   −2.5186∗
Length of intervention          .0004    .0001   .481    3.1913∗∗
Percent attrition               .0012    .0014   .109    .8195
Time of measurement             −.0014   .0005   −.410   −2.7104∗∗
N of sample (after attrition)   .00002   .0001   −.024   −.1870

Note. QR (6) = 24.757∗∗, QE (49) = 54.779 (ns), and R² = .31. B = Unstandardized regression coefficient, SEB = Corrected standard error value, β = Standardized regression coefficient, and z = z-test of significance.
∗ p < .05. ∗∗ p < .01.

Table 6
Categorical Moderators for Behavioral Intent Outcome

Variable & Class        k    d+     95% C.I.     QW         QB
Type of publication                                         .03
  Journal               12   .142   .01/.27      13.66
  Diss/thesis/unpub     12   .157   .05/.26      20.08∗
Random assignment                                           4.47
  Individual random     14   .074   −.04/.19     14.90
  Group random          5    .278   .12/.44      8.07
  Nonrandom             5    .187   −.01/.38     6.32
Nature of control group                                     10.38∗∗
  No treatment          16   .235   .12/.35      21.73
  Attention placebo     6    .009   −.13/.14     1.67
Type of population                                          1.69
  General students      14   .178   .07/.28      26.62∗
  Greek members         4    .090   −.07/.25     2.38
  High-risk             4    .207   −.11/.52     3.07
Gender of audience                                          10.77∗
  Female/female group   2    .552   .15/.96      .43
  Female/mixed group    2    −.095  −.33/.14     .00
  Male/male group       14   .133   .02/.24      18.98
  Male/mixed group      5    .265   .10/.43      3.59
  Female/male combined  1    --     --           --
Status/facilitator                                          12.51∗
  Peer                  8    .043   −.07/.16     2.13
  Graduate student      5    .124   −.07/.32     4.98
  Professional          3    .449   .09/.81      1.04
  Combination           3    .168   −.08/.41     4.89
  Unknown-N/A           5    .427   .21/.64      8.22
Content/intervention                                        5.27
  Information           6    .095   −.05/.24     8.33
  Empathy               5    .074   −.13/.27     1.96
  Socialization         4    .359   .13/.58      1.25
  Risk reduction        4    .105   −.07/.28     7.28
  More than one         5    .232   .03/.44      9.68∗

Note. k = Number of effect sizes, d+ = Mean weighted effect size, C.I. = Confidence interval, QW = Homogeneity within each class, and QB = Homogeneity between classes.
∗ p < .05. ∗∗ p < .01.

Another significant moderator of effect size for both rape attitudes and rape-related attitudes was the content of the intervention. The results suggest that interventions that focus on gender-role socialization, provide general information about rape, discuss rape myths/facts, and address risk-reduction strategies have a more positive impact on participants’ attitudes than rape empathy programs and interventions with unspecified contents. However, some considerations should be addressed before concluding that rape empathy interventions are ineffective. First, the difference in effectiveness for these programs could be associated with the types of outcome measures utilized to assess positive change. These attitudinal measures tend to assess concepts discussed in myth/fact and socialization-focused programs, while these concepts may not be as directly addressed in empathy programs. In addition, these findings include only attitudinal data; thus, whether programs with different content have any differential impact upon the behavior of participants is unknown. Consequently, more empirical research examining the content of programming is needed.

Another pertinent finding was that programs that included more than one topic appeared to be less effective than more focused programs, which may indicate that more in-depth programming produces better outcomes than sessions that cover multiple topics more superficially. Furthermore, this finding may be related to our finding that longer interventions are more effective and that attempting to cover information too quickly may result in weak effects that have little long-term impact. A final issue to consider when evaluating the content of sexual assault education interventions is that the type of program offered may vary depending upon the gender of the participants. Women are more likely to receive a risk-reduction intervention, while men may be more likely to receive an empathy intervention.
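As a rough check on how the between-class statistic QB in Table 6 relates to the class-level values reported there, the following Python sketch recomputes QB for the status/facilitator moderator. The class weights are not reported in the source, so the sketch assumes fixed-effect inverse-variance weights and recovers them from the half-width of each 95% confidence interval (assuming a normal critical value of 1.96). The function names are hypothetical and are not taken from the authors' analysis.

```python
# Sketch: recompute the between-class homogeneity statistic Q_B from
# class-level summaries (mean weighted effect size d+ and its 95% C.I.).
# ASSUMPTION: fixed-effect inverse-variance weights, recovered from the
# reported confidence intervals; the source does not list the weights.

Z_95 = 1.96  # normal critical value assumed to underlie the 95% C.I.

# (class label, d+, C.I. lower, C.I. upper), Status/facilitator rows, Table 6
STATUS_FACILITATOR = [
    ("Peer", 0.043, -0.07, 0.16),
    ("Graduate student", 0.124, -0.07, 0.32),
    ("Professional", 0.449, 0.09, 0.81),
    ("Combination", 0.168, -0.08, 0.41),
    ("Unknown-N/A", 0.427, 0.21, 0.64),
]


def class_weight(ci_lower, ci_upper, z=Z_95):
    """Inverse-variance weight implied by a symmetric confidence interval."""
    standard_error = (ci_upper - ci_lower) / (2.0 * z)
    return 1.0 / standard_error ** 2


def q_between(classes):
    """Return (Q_B, grand mean) for a list of (label, d+, lower, upper)."""
    weights = [class_weight(lower, upper) for _, _, lower, upper in classes]
    means = [d_plus for _, d_plus, _, _ in classes]
    grand_mean = sum(w * d for w, d in zip(weights, means)) / sum(weights)
    q_b = sum(w * (d - grand_mean) ** 2 for w, d in zip(weights, means))
    return q_b, grand_mean


q_b, grand_mean = q_between(STATUS_FACILITATOR)
print(f"Q_B = {q_b:.2f}, weighted grand mean = {grand_mean:.3f}")
```

With the rounded table entries, this yields a QB of about 12.4, close to the reported 12.51; the small gap is plausibly due to rounding in the published intervals. With 4 degrees of freedom (five classes), that value lies between the chi-square critical values of 9.488 (p = .05) and 13.277 (p = .01), which is consistent with the single asterisk attached to this moderator in Table 6.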
Due to gender differences in rape attitudes and behaviors, the gender of participants may influence findings of overall effectiveness within a particular content category.

Type of audience was also a significant moderator of effect size for rape-related attitudes. Greek members appeared to be the most positively impacted by educational programming, which is of interest because it is often thought that fraternity and sorority members are at greater risk to experience sexual assault (e.g., Copenhaver & Grauerholz, 1991; Sandy, 1990; Schwartz & DeKeseredy, 1997). Although high-risk populations did not appear to demonstrate positive changes in attitudes, this finding must be observed with caution because this category included only five studies and consisted of heterogeneous groups. Consequently, more research is needed to explore the difference in responsiveness to education among specific high-risk groups.

Another important moderator is the gender of the audience. For women, a significant positive effect size was found for rape attitudes when the program was conducted with mixed-gender groups. Although a relatively high effect size (.29) was found for women in all-female groups, this value was not significant and was based on only three studies. In contrast, tentative findings for behavioral intentions suggest that women may have a better outcome in an all-female setting and that mixed-gender programming may not be effective. However, these findings are based on only four studies, and thus further research is needed before any conclusion is drawn. Surprisingly, there was no evidence from these data that men are more likely to benefit from programming administered in all-male groups as compared to men in mixed-gender groups. Although there was no significant difference between the two groups, it is surprising to note that men from mixed-gender groups displayed a larger effect size for behavioral intentions.

These results contradict Brecklin and Forde’s (2001) findings that single-gender programs were more effective for men than mixed-gender programs. Differences between our study and Brecklin and Forde’s may provide some insights into reasons for the conflicting findings. It should be noted that Brecklin and Forde’s results were related only to rape attitudes and their meta-analysis did not include behavioral measures. In addition, the current investigation included a larger number of studies and controlled for data dependency. Because we included effect sizes only from the last follow-up evaluation, our study may also offer a more accurate indication of the longer-term effectiveness of programming. Considering the significance of this issue and the recent support for single-gender programs above mixed-gender programs (e.g., Berkowitz, 2002; Gidycz et al., 2002; Rozee & Koss, 2001; Schewe, 2002), more empirical research on this question is necessary.

Schewe and O’Donohue (1993a) and Yeater and O’Donohue (1999) have voiced concerns about the lack of methodological sophistication in sexual assault education research and the potential to create programs based on misleading findings; therefore, particular attention was focused in this meta-analysis on incorporating research methodology and design variables into our analyses. Our findings suggest that studies that are published, are rated as lower quality, lack random assignment, have larger sample sizes, and employ no-treatment control groups have larger effect sizes. Collectively, these results suggest that low methodological standards may lead to potentially erroneous conclusions about the effectiveness of sexual assault education interventions. Although these findings are not consistent across every methodological characteristic and varied across outcome variables, methodological rigor is necessary in future research to provide more precise findings. Brecklin and Forde (2001) also found publication bias in their meta-analysis, which suggests that unpublished studies should continue to be included in future reviews of this literature.

Consistent with Brecklin and Forde (2001) and Flores and Hartlaub (1998), we found that for rape-related attitudes, the length of time between the end of treatment and the assessment of the impact of the program was a significant moderator of treatment effectiveness. Therefore, there are consistent findings that the positive effects of treatment tend to diminish over time.

Limitations

There are several limitations to this study that must be acknowledged. First, the results of any meta-analytic review are only as sound as the studies included in the analysis (Lipsey & Wilson, 2001). Although criteria were specified a priori to exclude studies with serious methodological problems, it should be noted that many studies contained some limitations, which in turn restricted the conclusions of this meta-analysis. Another limitation concerns the amount of unexplained variance found in many of the univariate moderator analyses, as well as the underspecification of the regression model for the rape attitudes outcome. These conditions suggest that several of the findings from the moderator analyses should be viewed with caution, because there were additional sources of variance that remain unexplained. In particular, the possibility of interaction effects must be considered because the findings of moderator analyses may be influenced by other potentially related variables. Moreover, the small number of studies included in the behavioral intentions moderator analysis also limits the generalizability of these findings. Although attempts were made to limit the number of moderator analyses, another issue concerns a potential Type I error due to the number of univariate analyses that were conducted for each outcome. However, given our adherence to the procedures suggested by Hedges and Olkin (1985) and Lipsey and Wilson (see Anderson, 2003 for details), the probability of a Type I error was reduced.

Although attention was given to systematic and objective coding, certain moderator variables were more sensitive to coder subjectivity. In particular, it was challenging to code