Psychotherapy for the Treatment of Depression: A Comprehensive Review of Controlled Outcome Research

Psychological Bulletin                                                                                        Copyright 1990 by the American Psychological Association, Inc.
1990, Vol. 108, No. 1, 30-49                                                                                                                         0033-2909/90/$00.75

                     Psychotherapy for the Treatment of Depression:
                 A Comprehensive Review of Controlled Outcome Research
                                    Leslie A. Robinson, Jeffrey S. Berman, and Robert A. Neimeyer
                                                         Memphis State University

                              Previous quantitative reviews of research on the efficacy of psychotherapy for depression have in-
                              cluded only a subset of the available research or limited their focus to a single outcome measure.
                              The present review offers a more comprehensive quantitative integration of this literature. Using
                              studies that compared psychotherapy with either no treatment or another form of treatment, this
                              article assesses (a) the overall effectiveness of psychotherapy for depressed clients, (b) its effectiveness
                              relative to pharmacotherapy, and (c) the clinical significance of treatment outcomes. Findings from
                              the review confirm that depressed clients benefit substantially from psychotherapy, and these gains
                              appear comparable to those observed with pharmacotherapy. Initial analysis suggested some differ-
                              ences in the efficacy of various types of treatment; however, once the influence of investigator alle-
                              giance was removed, there remained no evidence for the relative superiority of any 1 approach. In
                              view of these results, the focus of future research should be less on differentiating among psychother-
                              apies for depression than on identifying the factors responsible for improvement.

   Depression is a prevalent clinical disorder with high eco-                          its a different etiological model for depression. For example, ad-
nomic and emotional costs. Epidemiological research has indi-                          vocates of behavioral approaches treat depression as a conse-
cated that 10%-20% of the population experience a major de-                            quence of a low rate of response-contingent positive reinforce-
pressive episode at some point in their lifetime (Boyd & Weiss-                        ment. The object of this therapy, then, is to increase
man, 1981), with the incidence highest during the adult years                          reinforcement, either by encouraging participation in pleasant
when family and career responsibilities may be most adversely                          activities (Lewinsohn, 1974) or by building the assertion skills
affected (Weissman & Myers, 1978). Although the remission                              necessary to elicit social rewards (LaPointe & Rimm, 1980;
rate for depressive disorders is relatively high (Beck, 1967, chap.                    Sanchez, Lewinsohn, & Larson, 1980). A second treatment ap-
3), a substantial portion of those afflicted remain chronically                        proach, cognitive therapy, is derived from Beck's (1967) view of
depressed (Weissman & Klerman, 1977), and those who do im-                             depression as an affective response to negative beliefs. Modify-
prove are at an increased risk for further episodes (Belsher &                         ing these unproductive beliefs is the primary focus of cognitive
Costello, 1988; Kessler, 1978; Klerman, 1978).                                         therapy. In addition, a number of treatment packages have been
   Until recently, depression was treated almost exclusively with                      developed that explicitly integrate elements from both cognitive
medication, traditional insight-oriented therapy, or a combina-                        and behavioral models. Examples include Lewinsohn's Coping
tion of the two. However, the 1970s witnessed the development                          with Depression course (Lewinsohn, Steinmetz, Antonuccio, &
of a number of new therapeutic approaches, each of which pos-                          Teri, 1985) and Rehm's self-control therapy (Fuchs & Rehm,
                                                                                        1977).
                                                                                          With the development of these new therapies has come a dra-
   This research was supported by a Centers of Excellence grant                        matic increase in outcome research on the efficacy of treat-
awarded to the Department of Psychology at Memphis State University                    ments for depression (Brown & Lewinsohn, 1984; Elkin, Par-
by the state of Tennessee. The research was also aided by a Faculty Re-                loff, Hadley, & Autry, 1985; McLean, 1981). Two major evalua-
search Grant awarded to Jeffrey S. Berman by Memphis State Univer-                     tion strategies have been used: (a) comparisons between treated
sity. While the research was being conducted, Leslie A. Robinson was
                                                                                       clients and wait-list controls and (b) comparisons between cli-
supported by a Van Vleet Memorial Fellowship.
   Portions of this work were presented at the annual meeting of the                   ents receiving different types of psychotherapy. The results of
Society for Psychotherapy Research, Santa Fe, New Mexico, June 1988.                   studies using wait-list comparisons have generally indicated
The research was also discussed as part of the symposium Psychother-                   that a number of treatment approaches are effective. However,
apy for Specific Disorders: Depression, Agoraphobia, and Anorexia Ner-                 as Kazdin (1981) has pointed out, the focus in recent depres-
vosa (Jeffrey S. Berman, Chair), conducted at the 96th annual meeting                  sion research has been not so much on comparisons with un-
of the American Psychological Association, Atlanta, Georgia, August                    treated controls as on comparisons between different types of
 1988.                                                                                 therapy, and here the results have been remarkably inconsistent.
   We would like to thank Marna Barrett, Crystal Blyler, Diane Camp,
                                                                                       For example, Shaw (1977) found that cognitive therapy was
and Chris Quails for their assistance in locating and coding the research
studies included in the review.
                                                                                       more effective than behavioral treatment, whereas others (e.g.,
   Correspondence concerning this article should be addressed to                       Hodgson, 1981; LaPointe & Rimm, 1980) have reached the op-
Jeffrey S. Berman, Department of Psychology, Memphis State Univer-                     posite conclusion. Similarly mixed results have been obtained
sity, Memphis, Tennessee 38152.                                                        when cognitive or behavioral methods have been compared

with more traditional approaches (e.g., Fleming & Thornton,                   Kornblith, 1979; Whitehead, 1979), whereas others have con-
1980; LaPointe & Rimm, 1980; Thompson & Gallagher, 1984).                     cluded that behavioral techniques have not yet received ade-
To further complicate matters, few studies have obtained con-                 quate empirical support (e.g., DeRubeis & Hollon, 1981; Hol-
sistent results across all outcome measures or assessment                     lon, 1981; Kovacs, 1980). In a few instances, it has been sug-
points.                                                                       gested that treatments incorporating both cognitive and
   In attempting to account for these inconsistencies, research-              behavioral components may be more effective than either ap-
ers have advanced two primary hypotheses. First, it has been                  proach alone (e.g., Blaney, 1981; Rehm & Kornblith, 1979;
suggested that genuine differences in efficacy do exist but have              Whitehead, 1979). Other reviewers have simply concluded that
been obscured by variations across studies in factors such as                 at this point, there is no clear treatment of choice (e.g., Emmel-
treatment procedures, client selection, and therapist training.               kamp, 1986; McLean, 1982; Rush, 1982). Furthermore, al-
According to this view, consistent differences in therapy out-                though differences in client populations and therapy formats
come might emerge in large-scale, tightly controlled research                 have often been cited as contributing to inconsistent results
programs. Such is the philosophy underlying the Treatment of                  across studies, few reviewers have systematically examined re-
Depression Collaborative Research Program (Elkin et al.,                      search relating these variables to outcome. Most often, they
1985), a multisite project initiated by the National Institute of             have simply noted that there are at present no clear prognostic
Mental Health to evaluate the efficacy of cognitive-behavioral,               or prescriptive indicators and have called for further research
interpersonal, and pharmacological treatments for depression.                 in these areas (e.g., Blaney, 1981; DeRubeis & Hollon, 1981;
   In contrast, others have suggested that there are no significant           Kovacs, 1980; McLean, 1981; Rush, 1982; Whitehead, 1979).
differences in the effects of the various therapies, partly because              Given the large number of studies on the treatment of depres-
of considerable overlap in their treatment methods. Both Kaz-                 sion and the complexity of their results, other reviewers have
din (1981) and McLean (1982) have pointed out that there may                  turned to quantitative techniques for summarizing the re-
well be no treatment techniques that are used only within a                   search. The advantage of these quantitative, or "meta-analytic,"
particular therapy. Instead,                                                  procedures (e.g., see Glass, McGaw, & Smith, 1981) is that they
                                                                              provide a powerful method for identifying trends that might
    what researchers and clinicians presently have . . . is broad agree-      otherwise be overlooked. In addition, these techniques permit
    ment on the characteristics of clinical depression . . . ; divergent      the systematic assessment of factors that vary across individual
    theory to account for the hypothesized mechanisms responsible for
    the etiology, maintenance, and reversal of depression; and strik-         studies.
    ingly similar treatment procedures deriving from these diverse the-          Although several previous quantitative reviews of the psycho-
    ories. (McLean, 1982, p. 22)                                              therapy literature (Dush, Hirt, & Schroeder, 1983; Shapiro &
                                                                              Shapiro, 1982; Smith et al., 1980) have reported separate analy-
   If the methods and effects of different forms of psychotherapy             ses of depressed samples, in each case these analyses were lim-
are similar, why then do some studies reveal reliable differences             ited to a small sample of the research on psychotherapy treat-
in treatment efficacy, whereas others do not? One possible ex-                ments for depression. Three other reviews that focused specifi-
planation involves the impact of the researcher's allegiance. In              cally on depression have suffered from a similar limitation. In
previous reviews of psychotherapy research (Berman, Miller, &                 two of these reviews (Quality Assurance Project, 1983; Stein-
Massman, 1985; Smith, Glass, & Miller, 1980, chap. 5), the re-                brueck, Maxwell, & Howard, 1983), the primary purpose was
sults of comparisons between therapies have been found to vary                to estimate the relative benefits of pharmacotherapy and psy-
according to the theoretical preference of the investigator. Al-              chotherapy for depression; the third review (Conte, Plutchik,
though it seems reasonable to expect that a similar effect might              Wild, & Karasu, 1986) used a "box-score" summary technique
be operative in the depression literature, this possibility has not           to evaluate treatments combining pharmacotherapy with psy-
yet been explored.                                                            chotherapy. Given their focus on drug treatments, however,
   The typical response to the inconsistency in the existing liter-           these reviews included relatively few studies that assessed the
ature has been a call for further research, in the hope that better           efficacy of psychotherapy alone. Thus, many substantive issues
designed studies might reveal some client or treatment format                 concerning psychotherapy for depression could not be ad-
variables that could account for the contradictory findings of                dressed.
previous research. Additional studies would undoubtedly con-                     In contrast, two recent reviews (Dobson, 1989; Nietzel, Rus-
tribute to the literature, but it appears unlikely that they can              sell, Hemmings, & Gretter, 1987) have included substantially
provide any final answers to the central problems in depression               larger portions of the literature on psychotherapy for depres-
research. As Fiske (1983) has pointed out, although "the single               sion. However, even these reviews were less than comprehensive,
study may stimulate or irritate in a healthy fashion, only the                because they were both limited to studies that used the Beck
distillations from the entire body of research in an area have                Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, &
lasting effects" (p. 65).                                                     Erbaugh, 1961) as an outcome measure, and only the BDI was
   This type of broad integration of the depression treatment                 analyzed. Moreover, the two reviews reached markedly different
literature has been attempted in a number of recent narrative                 conclusions. Nietzel et al. (1987) found that individual treat-
reviews. However, as of yet no clear and consistent conclusions               ment was more effective than group treatment, but there were
have been reached. For example, some commentators have indi-                  no reliable differences in the efficacy of cognitive, behavioral,
cated that the research supports the efficacy of both cognitive               and other forms of therapy. In contrast, Dobson (1989) con-
and behavioral approaches to the treatment of depression (e.g.,               cluded that Beck's cognitive therapy was more effective than
Blaney, 1981; Emmelkamp, 1986; McLean, 1982; Rehm &                           other therapeutic approaches. Perhaps one reason for these di-
32                                               L. ROBINSON, J. BERMAN, AND R. NEIMEYER

vergent findings is that neither review adjusted for the potential              substantially from the more individualized goals of the treatments in-
influence of investigator allegiance in the studies they exam-                  cluded in the reviewed studies.
ined.                                                                              The studies selected for review investigated a variety of psychothera-
   The purpose of this review was to provide a more complete                    peutic methods. Distinctions between the different types of therapy were
                                                                                often difficult to draw, in part because of overlap in the treatment tech-
quantitative summary of the controlled research evidence on
                                                                                niques. However, a careful inspection of the method sections of these
psychotherapy for depression. In the first of the following analy-
                                                                                studies indicated that most therapies could be classified into one of four
ses, we examined the efficacy of psychotherapy for depression                   categories: (a) cognitive, (b) behavioral, (c) cognitive-behavioral, and
by using evidence from a substantially larger number of empiri-                 (d) general verbal therapy. The first category, cognitive therapy, included
cal studies than have been included in any previous quantitative                those treatments that focused primarily on the evaluation and modifi-
review of this literature. In a second set of analyses, we used the             cation of cognitive patterns. For example, treatments that involved at-
available research to assess the comparative efficacy of psycho-                tributional retraining or challenging irrational beliefs were classified as
therapy and the leading alternative treatment for depression,                   cognitive. However, therapies that simply directed clients to substitute
pharmacotherapy. Finally, we evaluated the clinical significance                positive thoughts or images for negative ones were excluded from this
of psychotherapy by comparing treated clients with nonde-                       category. Such treatments differ from cognitive therapy as it is usually
                                                                                practiced in that no evaluation of existing cognitions is undertaken by
pressed individuals. Thus, our analyses allowed us to assess the
                                                                                the client (Ledwidge, 1978; also see Miller & Berman, 1983). The be-
effectiveness of psychotherapy for depression, its benefits rela-               havioral therapy category included treatments designed to decrease de-
tive to pharmacotherapy, and how much former clients resem-                     pression by changing behavioral patterns (e.g., by increasing assertive
ble those who are not depressed.                                                behavior or participation in pleasant activities). Therapies that included
                                                                                both cognitive and behavioral components were classified as cognitive-
                                                                                behavioral treatments. The final category, general verbal therapy, com-
          Analysis 1: Effectiveness of Psychotherapy                            prised treatments such as psychodynamic therapy, client-centered ap-
                                                                                proaches, and other forms of interpersonal therapy such as that outlined
   In this first set of analyses, we investigated not only the overall
                                                                                by Klerman, Weissman, Rounsaville, and Chevron (1984). The com-
effectiveness of psychotherapy but also the relative effectiveness              monality among these treatments is that each places relatively greater
of different forms of treatment. In addition, we examined a                     emphasis on insight than on the acquisition of a set of specific skills.
number of other substantive issues in research on the treatment                 Although the general verbal category was broad, too few studies were
of depression, including the role of investigator allegiance, the               available to evaluate the effectiveness of the specific therapies within this
impact of group and individual therapy formats, the impor-                      group.
tance of diagnostic screening procedures, and the influence of                     Treatment outcomes were assessed in the studies by a variety of in-
other variables such as therapist training, length of treatment,                struments. Many of the instruments were designed specifically to assess
and client characteristics.                                                     depressive symptomatology, but some were more general or evaluated
                                                                                other areas of functioning. We classified the following scales as specific
                                                                                measures of depression: the BDI (Beck et al., 1961), the Zung Self-Rat-
Method                                                                          ing Depression Scale (Zung, 1965), the Depression Adjective Check List
                                                                                (Lubin, 1965), the Hamilton Rating Scale for Depression (Hamilton,
   Studies. The first analysis was based on a total of 58 studies of psycho-     1960), the Depression Scale of the Minnesota Multiphasic Personality
therapy for the treatment of depression. (See Appendix A for a list of          Inventory (Hathaway & McKinley, 1967), the D-30 (Dempsey, 1964),
the references for these studies.) The studies were identified through a        and the Center for Epidemiologic Studies Depression Scale (Radloff,
search of the volumes of Psychological Abstracts (1976-1986), refer-             1977).
ences of published reviews and outcome studies, and an issue-by-issue              Descriptive characteristics of the 58 studies reviewed are presented
examination of the 1985 and 1986 volumes of relevant journals.1 Of the          in Table 1. As the table reveals, the typical client was a middle-aged
58 studies included in this analysis, 47 were not included in the recent        woman who was experiencing moderate depression as measured by the
Dobson (1989) review and 40 were not covered in the analyses by Niet-           BDI. The therapy was usually brief, and treatment sessions occurred
zel et al. (1987).                                                              approximately once a week. In the 34 studies that included a follow-up
   We used several criteria in selecting studies for inclusion in the review.   assessment, this follow-up was conducted an average of 13 weeks
First, to assess the effects of therapy on depressive disorders (rather than    (range = 2-52) after treatment termination.
depressive moods), the analysis was restricted to studies using samples            In 28 (48%) of the studies, clients were solicited from the community
identified as primarily suffering from depression. Thus, studies that de-       through media announcements. Another 14 investigations (24%) relied
scribed clients in more general terms (e.g., neurotic) or in terms of an-       on students solicited in a university setting, and 9 studies (16%) used
other specific diagnostic category (e.g., alcoholic) were excluded, even        traditionally referred outpatients. In the remaining 7 studies (12%), ei-
when the researcher reported that the clients were also depressed. Stud-        ther the referral source was not reported or both solicited and tradition-
ies using inpatient samples and those that focused on children or adoles-       ally referred clients were included.
cents were also omitted, because treatment methods used with these                 Although all studies focused on the treatment of depression, some
groups often differ from those that are the focus of this review.               screened clients more rigorously than did others. In 20 investigations
   Second, we included a study in the review only if it contained a com-        (35%), clients were required to meet formal diagnostic criteria for a
parison between treatment and no treatment or between different types           depressive disorder in order to be included. Most often, the Research
or formats of therapy. Thus, we omitted case histories and studies using
simple pre-post designs.
                                                                                  1
   Third, because our primary interest was in the effects of psychother-            The issue-by-issue search was conducted for the following journals:
apy, we excluded research on treatments that did not have a prominent           Archives of General Psychiatry, Behavior Modification, Behavior Ther-
verbal component. Thus, treatments such as exercise and bibliotherapy           apy, Cognitive Therapy and Research, Journal of Clinical Psychology,
were not considered. In addition, we omitted the very few studies of            Journal of Consulting and Clinical Psychology, Journal of Counseling
family and marital therapy because their interactional focus differed           Psychology, and Psychotherapy: Theory, Research and Practice.

Table 1                                                                       these measures would have inflated the estimate of overall effect size,
Characteristics of Psychotherapy Studies                                      because investigators are more likely to provide adequate information
                                                                              on measures that reveal reliable group differences. Thus, when findings
          Study characteristic                       M             Range      were not reported or were described simply as nonsignificant, we con-
No. of clients                                     40.4            9-155      servatively estimated the effect size to be zero.
Clients per group                                  14.3            4-47          Hedges (1982) has identified a small-sample bias in the estimate of
Percentage attrition (posttreatment)               11.0            0-65       effect size. Although this bias is of practical concern only when sample
Percentage female clients                          79.6           50-100      sizes are quite small (i.e., fewer than 20 subjects), we applied Hedges'
Client age (years)                                 39.4           19-71       (1982, Formula 4) correction for the bias to all effect sizes reported in
Initial Beck Depression Inventory score            22.7           12-30       our analyses.
No. of therapists                                   4.0            1-18          Preliminary analyses. Differences between groups were usually as-
Weeks of treatment                                  6.9            1-36       sessed by more than one outcome measure (M = 6.0, range = 1-25).
No. of sessions                                     8.7            1-46       Moreover, many studies reported results for more than one treatment
Note. Each mean is based on at least 42 of the 58 studies.                    comparison (M = 2.1, range = 1-6). Other reviewers have treated the
                                                                              effect sizes derived from individual outcome measures as separate obser-
                                                                              vations in their analyses (e.g., see Smith et al., 1980, chap. 4). However,
                                                                              this procedure arbitrarily weights studies according to the number of
Diagnostic Criteria (Spitzer, Endicott, & Robins, 1978) were used, but        outcome measures and treatment comparisons reported. Even worse,
other investigators used the Feighner criteria (Feighner et al., 1972) or     multiple effect sizes derived from the same study may not represent sta-
the Diagnostic and Statistical Manual of Mental Disorders (American           tistically independent observations. If effect sizes within a study are not
Psychiatric Association, 1980). The 38 other studies (65%) used less          independent, then using the effect size as the unit of analysis can seri-
stringent selection criteria, such as scores on self-report measures of       ously underestimate error variance and inflate tests of statistical signifi-
depression.                                                                   cance (e.g., see Glass et al., 1981, chap. 6).
   Estimating treatment effects. Each outcome measure reported in a              To assess the issue of nonindependence in our data, we first examined
study was expressed in terms of Cohen's (1977) d, a standardized mea-         whether the variation in effect sizes derived from different treatment
sure of effect size. Cohen's d is defined as                                  comparisons was greater than the variation of effect sizes within a single
                                                                              treatment comparison. Our focus in this analysis was on the most fre-
        d = \frac{m_1 - m_2}{s}                                        (1)    quent type of comparison, that between treated clients and untreated
                                                                              controls. We conducted an analysis of variance in which the unit of
where m_1 and m_2 represent group means and s is the pooled within-           analysis was the individual effect size (N = 354) and the independent
groups standard deviation. For comparisons between controls and               variable was the treatment comparison (N = 78). This analysis revealed
treated clients, the control mean was subtracted from the treatment           that the variability among effect sizes drawn from different treatment
mean. Thus, an effect size of 0.5 indicated that the mean of the treated      comparisons was indeed far greater than the variability of effect sizes
group was one half of a standard deviation larger than the mean of the        within a treatment comparison, intraclass R = .66, F(77, 276) = 9.68,
control group. In direct comparisons between different types or formats       p
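
The effect-size computation described under "Estimating treatment effects" can be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: the function names are hypothetical, and the correction factor 1 - 3 / (4 * df - 1) is the commonly used approximation for small-sample bias, assumed here because the text cites Hedges (1982, Formula 4) without reproducing it.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled within-groups standard deviation for two groups."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return math.sqrt(pooled_var)


def cohens_d(mean_treated, mean_control, sd_treated, sd_control, n_treated, n_control):
    """d = (m1 - m2) / s, with the control mean subtracted from the treatment mean."""
    s = pooled_sd(sd_treated, n_treated, sd_control, n_control)
    return (mean_treated - mean_control) / s


def small_sample_correction(d, n_a, n_b):
    """Shrink d to offset small-sample bias.

    ASSUMPTION: uses the common approximation 1 - 3 / (4 * df - 1);
    the exact formula applied in the review is not shown in the text.
    """
    df = n_a + n_b - 2
    return d * (1.0 - 3.0 / (4.0 * df - 1.0))


def effect_size_or_zero(d):
    """Findings not reported, or described only as nonsignificant, count as zero."""
    return 0.0 if d is None else d


# Hypothetical numbers, chosen only to illustrate the arithmetic.
d = cohens_d(12.0, 10.0, 4.0, 4.0, 14, 14)
d_corrected = small_sample_correction(d, 14, 14)
```

With equal standard deviations of 4.0 and a two-point difference in means, d is 0.5, which matches the interpretation given in the text: the treated group's mean lies one half of a standard deviation above the control group's mean. The correction reduces this only slightly for groups of 14, consistent with the remark that the bias matters mainly for very small samples.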

Table 2                                                                           One might also expect that the variability in the findings of studies
Efficacy of Psychotherapy at Posttreatment and Follow-Up                       would decrease as the sample sizes of the studies increased. To evaluate
                                                                               this issue, we conducted a regression analysis in which we used the sam-
                                                          Effect size          ple size of the study to predict the overall treatment effect. As expected,
                              N of                                             the squared residuals or errors from this regression analysis decreased as
 Assessment                  studies               M                    SD     the sample size of the study increased, r(35) = -.35, p = .03. Therefore,
Posttreatment                  37                 0.73*                 0.69   studies with larger samples not only provided more conservative esti-
Follow-up                       9                 0.68*                 0.68   mates of treatment effects but also offered more reliable estimates.
Note. Means and standard deviations are based on weighted least                Given this difference in the reliability of studies with different sample
squares analyses in which effect sizes were weighted by sample size.           sizes, we conducted the following analyses using a weighted least squares
*p
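The weighting by sample size mentioned in the table note can be illustrated with a short Python sketch. It assumes that the weighted mean and standard deviation use each study's sample size directly as its weight; the exact estimator used in the review is not specified in this section, and the effect sizes and sample sizes below are hypothetical.

```python
import math


def weighted_mean_sd(effect_sizes, sample_sizes):
    """Mean and SD of study effect sizes, weighting each study by its sample size.

    Assumption: weights are the raw sample sizes and the SD is the square root
    of the weighted average squared deviation from the weighted mean.
    """
    total = sum(sample_sizes)
    mean = sum(n * d for d, n in zip(effect_sizes, sample_sizes)) / total
    var = sum(n * (d - mean) ** 2 for d, n in zip(effect_sizes, sample_sizes)) / total
    return mean, math.sqrt(var)


# Hypothetical studies: the largest study reports the smallest effect.
effects = [0.2, 0.8, 1.4]
sizes = [60, 30, 10]

mean, sd = weighted_mean_sd(effects, sizes)
unweighted = sum(effects) / len(effects)
print(round(mean, 2), round(unweighted, 2))  # the weighted mean is pulled toward the large study
```

In this made-up example the weighted mean is smaller than the simple average because the large study, which carries the most weight, reports the most conservative effect.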
Table 5
Direct Comparisons Between Different Types of Psychotherapy

                                              N of      Effect sizeᵃ      Estimate if
Comparison                                   studies     M        SD     no allegianceᵇ
Cognitive vs. behavioral                       12       0.12     0.33     0.12 (0.09)
Cognitive vs. cognitive-behavioral              4      -0.03     0.24    -0.03 (0.12)
Behavioral vs. cognitive-behavioral             8      -0.24*    0.20    -0.16 (0.10)
Cognitive vs. general verbal                    7       0.47*    0.30    -0.15 (0.20)
Behavioral vs. general verbal                  14       0.27*    0.33     0.15 (0.13)
Cognitive-behavioral vs. general verbal         8       0.37*    0.38     0.09 (0.27)

Note. Means, standard deviations, and standard errors are based on
weighted least squares analyses in which effect sizes were weighted by
sample size.
ᵃ Positive numbers indicate that the first therapy in the comparison was
more effective; negative numbers indicate that the second therapy in the
comparison was more effective.
ᵇ Standard error of the estimated effect size is in parentheses.
*p

    Type of treatment. Table 4 presents the effects of four specific
types of psychotherapy: (a) cognitive, (b) behavioral, (c) cognitive-
behavioral, and (d) general verbal. For each type of therapy,
treated clients improved more than did wait-list controls. The
effect sizes for some therapies were larger than for others, but
this variation was not reliable, F(3, 35) = 0.92, p = .4. However,
it is difficult to judge the relative efficacy of the treatments from
this evidence, because the results were drawn from different sets
of studies. For instance, studies evaluating general verbal therapy
might have used clients that were less amenable to treatment
than the client samples used in studies of other types of
therapy. Thus, differences in effect size may be due to variations
across studies in background variables such as sample characteristics
rather than to differences in therapeutic efficacy.
    Investigations that directly compare two or more types of
treatment provide a better assessment of relative efficacy, because
background variables are held relatively constant across
the therapies. Thus, any differences between the treatment
groups can be attributed more clearly to differences in the therapies
themselves. Table 5 presents the results of studies that directly
compared two or more therapies. Effect sizes were coded
so that positive numbers indicate that the first therapy in the
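One informal way to read the final column of Table 5 is to divide each allegiance-adjusted estimate by its standard error. The Python sketch below does this with the values printed in the table. Treating the ratio as an approximate z statistic, and using 2 as a rough cutoff, are assumptions made here for illustration and are not the authors' stated test.

```python
# Allegiance-adjusted estimates and standard errors as printed in Table 5.
adjusted = {
    "Cognitive vs. behavioral": (0.12, 0.09),
    "Cognitive vs. cognitive-behavioral": (-0.03, 0.12),
    "Behavioral vs. cognitive-behavioral": (-0.16, 0.10),
    "Cognitive vs. general verbal": (-0.15, 0.20),
    "Behavioral vs. general verbal": (0.15, 0.13),
    "Cognitive-behavioral vs. general verbal": (0.09, 0.27),
}

# Ratio of estimate to standard error (illustrative, approximate z statistic).
ratios = {name: est / se for name, (est, se) in adjusted.items()}
largest = max(abs(r) for r in ratios.values())

for name, ratio in ratios.items():
    print(f"{name}: {ratio:+.2f}")
```

Under these assumptions no adjusted estimate is as large as twice its standard error, which is in line with the conclusion that the therapies did not differ reliably once allegiance was taken into account.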

Table 6
Efficacy of Individual and Group Psychotherapy

                        N of          Effect size
Format                 studies        M         SD
Individual therapy       16          0.83*     0.77
Group therapy            15          0.84*     0.60

Note. Means and standard deviations are based on weighted least squares
analyses in which effect sizes were weighted by sample size.

Therefore, it seems unlikely that the relation between outcome
and allegiance is solely attributable to the tendency of researchers
to write introductions tailored to their results.
   What would be the relative efficacy of the different therapies
in the absence of an allegiance effect? To answer this question,
we used the regression equation for the relation between effect
size and allegiance rating to estimate the effect size that would
have occurred if the investigators had no preference for either
of the therapies being compared. The estimated mean effect size
and its standard error are presented for each comparison in the      *p .9. Further data relevant to this point
tapes, or observers to ensure that the therapies were delivered      were provided by five investigations that directly compared in-
in an appropriate manner. If the similarity in outcome across        dividual and group approaches. From this smaller set of studies,
therapies is simply a reflection of poor treatment implementa-       effect sizes were calculated so that a positive number indicated
tion, then one might expect the use of these monitoring proce-       that individual therapy was more effective and a negative num-
dures to increase the differences observed between treatments.       ber indicated that group therapy was superior. Although indi-
However, in direct comparisons between treatments, our analy-        vidual therapy tended to produce better results than group ther-
ses revealed no reliable difference in the absolute magnitude of     apy in these five studies, the effect sizes representing the differ-
effect sizes from 14 studies that used these monitoring proce-       ence between the two methods (M = 0.31, SD = 0.35) did not
dures (M = 0.27, SD = 0.15) and 16 studies that did not (M =         differ reliably from zero, t(4) = 1.95, p = .1.
0.38, SD = 0.41), t(28) = 1.04, p = .3.                                 In studies comparing group therapy with a wait-list control
   Another method of attempting to standardize treatment de-         group, the number of clients per therapy group ranged from 3
livery involves the use of therapy manuals. In some of the stud-     to 12 (M = 7). Some might suspect that smaller therapy groups
ies, therapists conducted treatment according to detailed manu-      would be more effective than larger groups. However, we de-
als; in other studies, no such formalized therapy program was        tected no consistent relation when we correlated group size with
used. When treatments were directly compared with one an-            the outcome of therapy, r(12) = -.13, p = .1. Thus, small treat-
other, however, the absolute magnitude of effect sizes from 11       ment groups did not appear to be any more effective than ther-
studies that used formal manuals (M = 0.28, SD = 0.30) did           apy groups with more clients.
not differ reliably from the absolute magnitude of effect sizes         Type of outcome. A wide range of outcome measures was
from 14 studies in which no manuals were used (M = 0.34,             used in these studies. Some of the instruments were designed
SD = 0.18), t(23) = 0.55, p = .6. Similar results were observed      specifically to assess changes in depressive symptomatology.
when we examined studies comparing treated groups with wait-         Others measured constructs that were less directly related to
list controls. The effect sizes of 14 studies using manual-driven    depression. To determine whether the effects of therapy varied
therapies (M = 0.82, SD = 0.64) did not differ systematically        across these different domains of outcome, effect sizes repre-
from the effect sizes of 17 studies for which no manual was de-      senting comparisons of treated clients with wait-list controls
veloped (M = 0.84, SD = 0.74), t(29) = 0.07, p = .9. Although        were calculated separately for measures of depression and mea-
the use of treatment manuals has increased in recent years,          sures of other constructs. These results are presented in the top
these data provide no indication that their use either increases     half of Table 7. For both types of outcome, there was a reliable
therapeutic efficacy or allows for a finer differentiation of the    effect of therapy, with treated clients improving more than wait-
relative effectiveness of treatments.                                list controls, but the difference between the effect sizes for the
   The length of treatment might also affect the outcome of psy-     different measures was not reliable, t(41) = 1.37, p = .2. How-
chotherapy. To address this issue, we correlated the results of      ever, in the 20 studies that included both types of measures,
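The group comparisons reported on the preceding page (for example, studies with and without treatment manuals) are t tests on study-level effect sizes. The following Python sketch computes an unweighted pooled-variance t statistic from the reported means, standard deviations, and numbers of studies. Because the published analyses were weighted by sample size, the value obtained here is only an approximation and need not match the reported statistic exactly; the function name is hypothetical.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Two-sample t statistic with pooled variance, computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Wait-list comparisons: 17 studies without a manual versus 14 manual-driven studies.
t, df = pooled_t(m1=0.84, sd1=0.74, n1=17, m2=0.82, sd2=0.64, n2=14)
print(df, round(t, 2))
```

The degrees of freedom agree with the reported t(29), and the statistic is close to zero, in keeping with the conclusion that manual use was unrelated to outcome.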

Table 7
Efficacy of Psychotherapy for Different Types of Outcome Measures

    Measure               N of          Effect size
    characteristic       studies        M         SD
    Focus
      Depression           29          0.93*     0.76
      Other                20          0.64*     0.63
    Source
      Self-report          29          0.85*     0.73
      Observer              7          0.81*     0.77

Note. Means and standard deviations are based on weighted least squares
analyses in which effect sizes were weighted by sample size.
*p

for both types of studies in the top half of Table 8. As this table
reveals, the effects of therapy were virtually identical for formally
diagnosed samples and for those that were not formally
diagnosed, t(27) = 0.08, p > .9.
    A final sample characteristic that might affect study outcome
is the referral source of the clients. In most of the studies, clients
were obtained through media announcements. Only rarely
were traditional outpatient referral procedures used. To assess
the influence of referral source, we compared the results of
studies using outpatients with those of studies using student or
community volunteers. Effect sizes for comparisons of treated
clients with wait-list controls are presented for the three groups
of studies in the bottom half of Table 8. As the values suggest,
there was no reliable relation between effect size and referral

2 studies in which all therapists were fully trained professionals            psychotherapy studies reviewed earlier in terms of age (M =41.1 years),
(M = 0.65, SD = 0.05), although this numerical difference was                 gender (M = 68.5% female), and initial BDI score (M = 25.5). In all
not statistically significant, t(19) = 0.70, p = .5.                          except one of the studies, clients were required to meet formal diagnos-
                                                                              tic criteria for depression. Only three of these studies (21%) reported
                                                                              that clients were solicited through media announcements. Five studies
     Analysis 2: Comparisons With Pharmacotherapy                             (36%) used traditionally referred outpatients, and the remaining seven
                                                                              studies (47%) either included both solicited and traditionally referred
   As our analyses indicate, psychotherapy appears effective in
                                                                              clients or did not report the source of their subjects.
helping depressed clients. A further question, though, is the                    A variety of psychotherapeutic approaches were evaluated in this set
effectiveness of psychotherapy relative to the leading alternative            of studies. Cognitive-behavioral therapy was used in eight investigations
treatment, pharmacotherapy. Several previous quantitative re-                 (53%), and behavioral treatments were tested in three studies (20%).
views have suggested that psychotherapy compares favorably                    One study (7%) examined purely cognitive methods, whereas four stud-
with drug therapy for the treatment of depression (e.g., Quality              ies (27%) assessed a general verbal therapy. In one study, the type of
Assurance Project, 1983; Smith et al., 1980; Steinbrueck et al.,              therapy was not reported.
 1983). However, in each of these reviews the efficacy of psycho-                In 12 of the studies (80%), tricyclic antidepressants constituted the
therapy and pharmacotherapy was estimated from different sets                 pharmacotherapy under investigation, with amitriptyline most often
                                                                              administered. One study (7%) examined a tetracyclic, and another 2
of studies. Thus, differences in variables such as client charac-
                                                                              (13%) evaluated benzodiazepines. A final study allowed prescribing
teristics may account for the apparent variation in treatment
                                                                              physicians to use any of a variety of psychoactive drugs. Mean dosages
effects. In addition, the benefits of combining psychotherapy                 of these medications were generally within accepted therapeutic ranges,
with pharmacotherapy have received scant attention. Although                  although usually at the lower end. Treatment lasted on average 12 weeks,
several reviewers using narrative or box-score techniques have                allowing in most cases an adequate trial of the medication.
suggested that the combination approach may be more effective                    For each study, effect sizes representing differences in treatment out-
than either treatment alone (e.g., Conte et al., 1986; Weissman,              come were calculated by using the procedures outlined for the earlier
 1979), the evidence has not been subjected to rigorous inferen-              psychotherapy analyses. Once again, multiple outcomes within a treat-
tial analysis.                                                                ment comparison and multiple treatment comparisons within a study
   Fortunately, a number of studies have recently been pub-                   were averaged to ensure that estimates of error were based on indepen-
lished in which both psychotherapy and pharmacotherapy or                     dent observations. As before, a weighted least squares procedure was
                                                                              used to compensate for differences in the sample sizes of the studies.
the combination of the two have been evaluated for the treat-
ment of depression. The following analyses provide a quantita-
tive summary of this research.                                                Results
                                                                                 The top half of Table 9 presents the results of comparisons
Method                                                                        between psychotherapy, pharmacotherapy, and the combina-
   To locate appropriate studies, we conducted computerized searches          tion of both. As can be seen, psychotherapy appeared more
of both medical and psychological data bases and examined the refer-          effective than pharmacotherapy in the treatment of depression.
ences of published reviews and outcome studies in this area. In addition,     However, the outcomes of a combination approach did not
recent volumes of relevant journals were searched on an issue-by-issue        differ systematically from the outcomes of either treatment
basis.4 We selected studies containing at least one of the following com-     alone. Thus, the benefits of psychotherapy and pharmacother-
parisons: (a) psychotherapy versus pharmacotherapy, (b) a psychother-         apy do not appear to be additive.
apy-pharmacotherapy combination versus psychotherapy alone, and                  Could the superiority of psychotherapy over pharmacother-
(c) a combination treatment versus pharmacotherapy alone. Compari-            apy be an artifact of researcher allegiance? To investigate this
sons with "treatment as usual" were excluded when the treatment-as-           possibility, we had two independent raters judge researcher alle-
usual condition could not be clearly classified as psychotherapy, phar-
                                                                              giance by using the procedures outlined in the preceding psy-
macotherapy, or the combination of the two. In addition, when a psy-
                                                                              chotherapy review.6 We then used these allegiance ratings to
chotherapy-pharmacotherapy combination was evaluated, we required
that it be compared with one of its components. For example, a compar-        predict the effect size that would have occurred if the research-
ison between cognitive therapy and the combination of psychodynamic
therapy and imipramine would have been excluded, because the differ-
                                                                                 4
ence between the two conditions reflects not only the addition of phar-            The issue-by-issue search was conducted for the following journals:
macotherapy but also a difference in the type of psychotherapy. Other         American Journal of Psychiatry, Archives of General Psychiatry, Behav-
inclusion criteria paralleled those outlined earlier for the larger psycho-   ior Therapy, Behaviour Research and Therapy, British Journal of Psy-
therapy review. Thus, we included only studies using adult outpatients        chiatry, Cognitive Therapy and Research, Journal of Consulting and
suffering from unipolar depression, and once again the focus was on           Clinical Psychology, and Psychopharmacology Bulletin.
                                                                                 5
treatments with individual or group formats.                                       One additional study (Daneman, 1961), which provided a compari-
   Using these criteria we were able to locate 15 studies that provided       son between a psychotherapy-pharmacotherapy combination and phar-
the relevant comparisons between psychotherapy, pharmacotherapy, or           macotherapy alone, was eliminated from our analyses because the effect
the combination of the two.5 (Appendix B presents a list of the refer-        size derived from the study was more than nine standard deviations
ences for the studies.) Many of these studies were reported in more than      above the mean of other studies providing this same comparison. We
one publication. In fact, on average more than two reports were pub-          also conducted our analyses with this study included. Although inclu-
lished from each research project. Thus, the actual number of separate        sion of the study altered effect size values, it did not change any of the
studies in this area is far smaller than might be expected on the basis of    conclusions.
                                                                                 6
the number of publications.                                                        The intraclass correlation for the reliability of the mean of the two
   The client samples used in these studies were similar to those of the      raters was .85.
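The allegiance adjustment described above, in which effect size is regressed on the allegiance rating and the prediction is read off at a rating of zero, can be sketched in Python as follows. The rating scale, the data, and the function name are hypothetical, and the sketch omits the standard error of the estimate that the review also reports.

```python
def weighted_line(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical studies: a positive allegiance rating means the investigators
# favored the first therapy in the comparison; 0 means no preference.
allegiance = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
effect = [-0.3, -0.1, 0.1, 0.3, 0.5, 0.7]
n = [20, 30, 25, 40, 30, 20]  # sample sizes used as weights

intercept, slope = weighted_line(allegiance, effect, n)
raw_mean = sum(ni * di for ni, di in zip(n, effect)) / sum(n)
print(round(raw_mean, 2), round(intercept, 2))  # observed mean versus estimate at zero allegiance
```

In this made-up data set the observed weighted mean effect is about twice the intercept, which illustrates how an apparent advantage can shrink once investigator preference is held at zero.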

Table 9
Relative Efficacy of Psychotherapy, Pharmacotherapy, and the Combination of Both

                                       N of      Effect sizeᵃ      Estimate if
Comparison                            studies     M        SD     no allegianceᵇ
All pharmacological treatments
   Psychotherapy vs. drug therapy        8       0.13*    0.12     0.07 (0.04)
   Combination vs. psychotherapy        12       0.01     0.25    -0.01 (0.08)
   Combination vs. drug therapy          5       0.17     0.24    -0.05 (0.21)
Tricyclic antidepressants only
   Psychotherapy vs. tricyclics          7       0.12*    0.13     0.07 (0.04)
   Combination vs. psychotherapy         9       0.02     0.25    -0.05 (0.08)
   Combination vs. tricyclics            4       0.15     0.27    -0.05 (0.26)

Note. Means, standard deviations, and standard errors are based on
weighted least squares analyses in which effect sizes were weighted by
sample size.
ᵃ Positive numbers indicate that the first therapy in the comparison was
more effective; negative numbers indicate that the second therapy in the
comparison was more effective.
ᵇ Standard error of the estimated effect size is in parentheses.
*p

Method

    We were able to identify a total of 39 studies reporting normative data
on the BDI. (See Appendix C for a list of the references for the studies.)
Of these studies, 20 had been previously identified by Nietzel et al.
(1987) in their search of the 1978 through September 1985 issues of
six psychology journals.7 We located another 19 normative studies by
extending Nietzel et al.'s journal search from September 1985 through
November 1987 and by reviewing bibliographies of articles on the
assessment of depression. These normative studies included samples that
were similar in terms of their age and gender to the clients treated in the
psychotherapy studies. The average normative sample consisted of 64%
female subjects, and the average age of those in the sample was 30 years.
    In 28 of the normative studies, the BDI was administered to a group
of subjects unscreened for mental health difficulties. Examples include
investigations that randomly sampled community residents and those
using data from large groups of university students. In 12 studies, norms
were reported for individuals who had been screened on some measure
of mental health. An example is a study that obtained data from subjects
who did not meet Research Diagnostic Criteria for depression.
Not surprisingly, BDI scores for nondistressed samples were lower than
those derived from samples of the general population, t(38) = 2.14, p =

Table 10
Beck Depression Inventory (BDI) Scores for Normative Samples, Clients Treated
With Psychotherapy, and Untreated Controls

                            N of            BDIᵃ
Group                      studies        M        SD
Normative samples
  General population         28           7.0      1.3
  Nondistressed              12           4.9      2.0
Treated clients
  Pretreatment               22          21.8      4.7
  Posttreatment              22          11.8      4.4
Untreated controls
  Pretreatment               22          20.7      5.0
  Posttreatment              22          18.1      5.2

Note. Means and standard deviations are based on weighted least squares
analyses in which BDI scores were weighted by sample size.
ᵃ BDI scores from 10 to 20 indicate mild depression; from 20 to 30,
moderate depression; and over 30, severe depression (Kendall, Hollon,
Beck, Hammen, & Ingram, 1987).

treatments (e.g., Bowers & Clum, 1988; Casey & Berman, 1985;
Dush et al., 1983; Miller & Berman, 1983; Shapiro & Shapiro,
1983). However, these previous reviews were not restricted to
depressed samples. It may be that depression is particularly
responsive to common curative factors occurring in both psychotherapy
and placebo treatments. If so, then the specific procedures
that define the type of treatment may be less important
in alleviating depression than previously recognized.
   Despite their improvement, clients treated with psychotherapy
remain distinguishable from healthy controls. When we
compared the BDI scores of treated clients with norms derived
from the general population and from nondistressed samples,
treated clients were found to be more depressed at the end of
therapy than both normative groups. However, the magnitude
of improvement over the course of therapy appeared impressive.
On average, clients were functioning within one standard
deviation of the general population after treatment, compared
with a pretreatment difference of more than two standard
deviations. Such a change clearly represents substantial improvement.
                                                                           The research evidence demonstrates, furthermore, that the
                                                                        benefits of psychotherapy for depression are not short-lived. In
be distinguished from healthy individuals who have not sought           those studies that included a follow-up assessment, improve-
treatment.                                                              ment at posttreatment was quite similar to that observed at a
   Jacobson et al. (1984) have suggested that one way to judge
the impact of treatment is to express client functioning relative
to the mean and standard deviation of a population of healthy
normal individuals. To generate such estimates, we pooled the
standard deviations of BDI scores reported in studies of the general
population and those reported in studies of nondepressed
samples, and we used the pooled values as estimates of the
dispersion of individual scores in each normative population. We
then expressed the BDI scores of clients treated with psychotherapy
in terms of these normative standard deviations. The
results, presented in Figure 1, indicate that before treatment
clients reported substantially more depression on the BDI than

[Figure 1: plot not recoverable. The surviving labels mark the posttreatment
and pretreatment BDI means relative to the normative mean μ; the
pretreatment label begins "μ + 2.4".]
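The normative comparison described above amounts to expressing a client mean as a distance from the normative mean in units of the normative standard deviation. The Python sketch below applies this to the Table 10 means for treated clients and the general population. The pooled standard deviation of individual BDI scores is not reported in this section, so the value of 6.0 used here is a placeholder assumption and not a figure from the review.

```python
def normative_distance(score, norm_mean, norm_sd):
    """Distance of a score from the normative mean, in normative SD units."""
    return (score - norm_mean) / norm_sd


GENERAL_POPULATION_MEAN = 7.0   # Table 10
PRETREATMENT_MEAN = 21.8        # Table 10, treated clients
POSTTREATMENT_MEAN = 11.8       # Table 10, treated clients
ASSUMED_NORMATIVE_SD = 6.0      # placeholder assumption, not reported in this section

z_pre = normative_distance(PRETREATMENT_MEAN, GENERAL_POPULATION_MEAN, ASSUMED_NORMATIVE_SD)
z_post = normative_distance(POSTTREATMENT_MEAN, GENERAL_POPULATION_MEAN, ASSUMED_NORMATIVE_SD)
print(round(z_pre, 2), round(z_post, 2))
```

With the assumed dispersion, treated clients start more than two standard deviations above the general population mean and finish within one, which is the pattern the text describes.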

argues against the widely held assumption that follow-up assess-       of specific forms of treatment. Although such arguments have
ment is essential because of unusually high relapse rates for de-      merit, the onus would appear to be on the proponents of such
pression. As in the more general review by Nicholson and Her-          hypotheses to offer convincing empirical evidence in support of
man (1983), our results suggest that follow-up findings often          them.
add little to information obtained at the end of therapy. Thus,           The relative efficacy of group and individual therapy has only
rather than including a costly follow-up assessment, psychother-       recently become an issue in depression research, perhaps be-
apy researchers might justifiably choose to invest resources in        cause group therapy for depression has traditionally been con-
other aspects of their design. For example, researchers could          sidered a contradiction in terms. It has been feared that the spe-
focus on obtaining larger samples of clients and therapists,           cial needs of depressed clients could not be met in a group set-
thereby increasing both the power and generalizability of their        ting and that depressed clients might act as a burden on a group
analyses.                                                              because of their withdrawal, pessimism, and self-absorption.
   Although all forms of psychotherapy were more effective than        However, our analyses indicated that both group and individual
no treatment, our initial analyses indicated that there might be       treatment formats produced more improvement than no treat-
some variation in the efficacy of different types of therapy. Such     ment, and the effects of the two approaches were comparable.
a difference has also been reported in a recent review by Dobson       Moreover, there was no evidence that treatment groups with
(1989), who concluded that cognitive therapy for depression            many members were less effective than smaller therapy groups.
was superior to other approaches. However, after more detailed         Although this lack of difference in the efficacy of individual and
analyses, we discovered that differences in the efficacy of differ-    group treatments is consistent with the findings of several previ-
ent treatments may be an artifact of the theoretical allegiances       ous reviews (Miller & Herman, 1983; Shapiro & Shapiro, 1982;
of the researchers conducting these studies. When a particular         Smith et al., 1980, chap. 5), it conflicts with Nietzel et al.'s
type of therapy was preferred by an investigator, it tended to         (1987) and Dush et al.'s (1983) evidence for the superiority of
produce more favorable results than the treatment with which           individual therapy. The difference between our results and those
it was being compared. This pattern is similar to findings re-         of Nietzel et al. is particularly striking, because their review
ported by Herman et al. (1985) and Smith et al. (1980, chap. 5),       also focused specifically on the treatment of depression. How-
who also found that the outcome of a psychotherapy study var-          ever, it should be noted that our reviews differ not only in terms
ied according to the allegiance of the researcher. Furthermore,        of the studies included but also in the outcome measures ana-
the effect is consistent with Ro'senthaTs (1969, 1976) well-           lyzed.
known research demonstrating the influence of experimenter                Although variations in sample characteristics such as age,
expectations. To take this influence into account, we used re-         sex, and initial symptom severity have often been offered as ex-
gression analysis to predict the outcomes that would have oc-          planations for inconsistencies in the results of different studies,
curred under conditions of no allegiance. This predictive analy-       we found no evidence that any of these variables were systemat-
sis indicated that if all of the researchers had been neutral, there   ically related to outcome. Furthermore, studies that included
would have been no reliable differences in the effectiveness of        only clients meeting formal diagnostic criteria for depression
the various types of therapy.                                          generated results that were virtually identical to those observed
   The allegiance of the investigator also appeared to play a role     in studies using less rigorous client selection procedures. The
in comparisons between psychological and pharmacological               reliance on self-report measures for client selection has often
treatments for depression. Although psychotherapy initially ap-        been criticized on the grounds that these measures were not
peared more effective than drugs, this difference was not reli-        developed as diagnostic tools. However, our data suggest that, at
able once we adjusted for the influence of investigator alle-          least for the problem of depression, treatment findings based
giance. Our analyses also indicated that combinations of psy-          on self-identified clients will mirror the results obtained from
chotherapy with pharmacotherapy were not systematically                formally diagnosed samples.
more effective than either of the treatments alone. Furthermore,          Self-report instruments have been viewed as suspect not only
the efficacy of psychotherapy and pharmacotherapy remained             when used for client selection but also when used as an outcome
comparable even when we restricted the analysis to studies us-         measure. The underlying assumption is that clients may overes-
ing tricyclic antidepressants, the best established of the pharma-     timate the benefits of treatment, whereas measures obtained
cological interventions for depression.                                from observers will exhibit less bias. However, in a recent meta-
   Advocates of drug therapy might argue that the effectiveness        analysis, Lambert, Hatch, Kingston, and Edwards (1986) found
of pharmacotherapy was underestimated in these studies for a           greater indications of change on an interviewer measure of de-
variety of reasons. For example, the clinical response to phar-        pression than on two self-report scales. In our analysis of a
macotherapy might have been better had higher levels of the            broad range of self-report and observer measures, we found that
drugs been prescribed. In addition, some have suggested that           treatment effects derived from self-report instruments were
the efficacy of pharmacological treatment depends partly on cli-       comparable in size to those obtained from observer measures.
ent characteristics such as chronicity and the presence of endog-      As in the Lambert et al. (1986) review, there was no evidence
enous symptoms (e.g., Becker & Schuckit, 1978; Weissman,               that measures based on the self-report of the client yielded
 1981). Thus, given different dosages or client samples, pharma-       overly positive estimates of treatment efficacy.
cotherapy could have appeared more effective. Alternatively,              Our analysis did suggest that the effects of therapy differed
advocates of particular psychological or pharmacological ap-           according to the content of the outcome measures. Instruments
proaches may argue that our classification system was overly           specifically designed to assess depression tended to produce
broad, thereby obscuring important differences in the efficacy         larger effects than measures assessing other constructs. Such
findings might suggest that the psychotherapies used in these studies are more effective for
depression than for other types of complaints. However, it must be remembered that we only
included studies with clients whose primary problem was depression. Therefore, an equally
plausible explanation for these results is that change is more likely to occur for the problems
or symptoms that initially prompt a client to enter treatment.
   In our review, we included studies that varied considerably in terms of the quality of their
research design. Critics of quantitative review methods (e.g., Wilson, 1985; Wilson & Rachman,
1983) have objected to this practice, arguing that the quality of the research should be taken
into account in the selection of studies. Like other quantitative reviewers (e.g., Glass et al.,
1981, chap. 2), we consider the impact of study quality to be an empirical question. In fact, our
analyses generally failed to detect differences as a function of research design characteristics.
For instance, findings from studies with differential rates of attrition did not differ reliably
from the findings of studies in which dropout rates were equivalent across treatment groups.
Moreover, consistent with the findings of Berman and Norton (1985), investigations that used
fully trained professionals as therapists did not report greater treatment benefits than studies
that relied on graduate students to administer therapy. Furthermore, studies that used treatment
manuals or that monitored therapist behavior during treatment yielded findings similar to those
of studies that did not include these procedures for ensuring correct treatment delivery.
   Of the various study characteristics examined, only the number of clients per study related to
treatment outcome. As our preliminary analysis of the psychotherapy data indicated, studies with
fewer clients yielded larger effects than studies with many clients. The most likely explanation
for this relation is that it reflects a publication bias. That is, journals are more likely to
publish studies that achieve statistically significant differences between groups, and small
studies need larger effects to achieve statistical significance. One implication of this finding
is that literature reviews of published research (whether they are narrative summaries or
quantitative reviews such as our own) may yield overly generous estimates of treatment effects.
   Because of the way in which we conducted our analyses, however, such a publication bias was
probably minimized in this review. For example, our statistical analyses gave less weight to
studies with smaller samples, and it is these small-sample studies that are most likely to
inflate estimates of treatment effects. In addition, our procedures for estimating treatment
effects were often conservative. Thus, when results were reported simply as nonsignificant, we
estimated that effect size to be zero. Whether or not these conservative estimation procedures
fully offset the influence of a publication bias, our review can at least be viewed as a
conservative summary of the published evidence on the efficacy of psychotherapy for depression.
   The findings in this review are consistent with more general quantitative reviews of the
psychotherapy outcome literature (e.g., Smith et al., 1980). As with these other reviews, we
found few differences in the efficacy of various therapeutic methods. However, unlike these
general analyses, ours provided an evaluation of comparative treatment efficacy within a single
diagnostic category. Thus, the similarity in the effects of various therapies cannot be
attributed to their application to different nosological groups. Even when the client sample was
restricted to the diagnosis of depression, different types of psychotherapy yielded equivalent
benefits. Furthermore, the similarity between our findings and those of general reviews including
other diagnostic groups raises the provocative possibility that diagnosis has much less impact on
treatment response than previously assumed. Psychotherapy may produce similar benefits not only
across different types of therapy but also across different types of clients. Thus, extensive
efforts to establish the diagnostic purity of samples for psychotherapy research may be
unnecessary.
   The evidence from the review, coupled with that of previous quantitative reviews, indicates
that additional comparisons between competing therapies for depression are not likely to prove
informative. Comparative studies are useful primarily in establishing new approaches as viable
therapeutic options rather than in elucidating the mechanisms through which they effect change.
As Kazdin (1981) has noted in a commentary on depression outcome research: "Relatively little is
known about major treatment contenders, including cognitive therapy and behavior therapy, and the
comparative studies do not shed light on whether these individual treatments operate in many of
the ways proposed on conceptual grounds" (p. 319). Indeed, the key question now is not whether
psychotherapy works for the treatment of depression but rather how these therapies produce their
benefits.
   In a recent analysis of the cognitive theory of depression, Hollon, DeRubeis, and Evans (1987)
pointed out that the similarity in the effects of differing therapies does not necessarily
disprove the causal significance of cognitive changes as a mediator of recovery from depression.
The possibility remains that all therapies are effective because all of them activate the
cognitive changes that are the specific target of cognitive therapy. Although this argument could
be true, it must be recognized that a comparable position could be advanced with equal
plausibility by proponents of behavior therapy. That is, therapies may promote recovery by
producing behavioral changes, whether by accident or by design. In this sense, cognitive or
behavioral changes may constitute nonspecific effects that occur naturally over the course of
therapy regardless of whether they are the identified goal of treatment. Alternatively, factors
such as clients' expectations of improvement, their acceptance of the therapeutic rationale, or
the quality of the therapeutic relationship may be the central mechanisms through which
therapeutic change occurs (e.g., see Frank, 1982; Zeiss, Lewinsohn, & Munoz, 1979).
   Surprisingly little research has been directed toward evaluating these hypothesized mediators
of change. Curative factors common to all therapies have generally been mentioned only as post
hoc explanations in comparative studies finding nonsignificant differences in treatment outcomes.
If researchers are to progress in their understanding of how psychotherapy benefits clients,
these common factors may need to become a more central focus of future research efforts.

                                        References
American Psychiatric Association. (1980). Diagnostic and Statistical Manual of Mental Disorders
  (3rd ed.). Washington, DC: Author.
Atkinson, A. K., & Rickel, A. U. (1984). Postpartum depression in primiparous parents. Journal of
  Abnormal Psychology, 93, 115-119.