Chapter 25
SYMPTOMS, HEALTH-RELATED QUALITY OF LIFE AND PATIENT
SATISFACTION: USING THESE PATIENT-REPORTED OUTCOMES
IN PEOPLE WITH GASTROESOPHAGEAL REFLUX DISEASE

S. Wood-Dauphinee (1) and D. Korolija (2)

1 School of Physical and Occupational Therapy, Department of Epidemiology and Biostatistics, Department of Medicine,
  McGill University, Montreal, Quebec, Canada
2 University Surgical Clinic, Clinical Hospital Center Zagreb, Zagreb, Croatia

Introduction

Patient-reported outcomes, defined as any report coming directly from the person whose life is
affected by a health problem, are becoming increasingly important in helping professionals
determine the impact of their treatments. This represents a change in focus. Traditionally,
clinicians and researchers were primarily interested in outcomes related to morbidity and
mortality. This approach was consistent with the biomedical model of disease that relied on tests
to identify pathology or changes in physiological processes. A treatment was judged to be
successful if the biologic test returned to normal, as was often the case for acute conditions. A
more recent approach, termed an outcomes model [1], suggests that medical care is designed to
focus on how people feel and how they are able to function as well as how long they live. Measures
of symptoms, health-related quality of life (HRQL) and patient satisfaction are considered to be
appropriate outcomes. While there is clearly overlap between the models, the outcomes model
focuses attention on the determinants of patient outcomes and relies primarily on reports by
patients to judge the result of the treatment. It is particularly appropriate for diseases that
are chronic in nature. Currently, it is believed that both types of outcomes are important. The
physiological measures reflect the value system of the professionals and provide information that
helps confirm their clinical impression. The patient-reported measures reflect the subjective
evaluation and reporting of the illness experience and its treatment [2]. These measures reflect
the patients’ value system.
    A recent study from Austria [3] illustrates the different points of view of clinicians and
patients in rating the importance of different outcomes. This study examined the expectations of
patients with gastroesophageal reflux disease (GERD) in terms of the outcomes of laparoscopic
anti-reflux surgery. Responses of 70 patients to open-ended questions provided the following
information. Relief of GERD symptoms was expected by 92.8%; 84.3% anticipated a return to usual
daily and work-related activities; and for 72.9% an improved quality of life was important.
Successful surgery without complications was named by 52.9% of patients and protection from a
future Barrett’s esophagus or cancer was noted by 48.6%. Only two patients expected normalization
of pH values and healing of the esophagus.
    These results demonstrate that patients’ expectations are generally different from those of
the clinician. In fact, patients primarily seek medical care because of bothersome symptoms. It
should, thus, be anticipated that symptom relief would be their top expectation. The ability to
assume usual patterns of daily activities including work, as well as customary roles, was also
important. These latter factors are well-accepted components of quality of life as it relates to
health, in other words HRQL, and this was also a priority of the patients. Satisfaction with the
process and end result of surgery is another patient-reported outcome that reflects their
expectations and the degree to which they were met.
    The purposes of this chapter are to describe the rationale for using these patient-reported
outcomes, the criteria by which the measures should be selected and how they can be used in both
clinical research and daily surgical practice. Information will also be provided about available
measures to assess these constructs in individuals with GERD.

Why are patient-reported outcomes useful in the care of individuals
with GERD and in clinical studies of GERD?

Including patient-reported outcomes in clinical practice and in clinical studies of GERD provides
several important benefits. First, as noted previously, it allows the clinician or the
investigator to characterize the impact of GERD and its treatment in terms that are of value to
and understood by the patients. In fact, patient-reported measures specifically reflect their
point of view, as most often, patients provided input as to the content of the measures. Second,
certain components of these patient-reported measures may be independent predictors of surgical
outcomes. For example, measures of HRQL usually contain a component that assesses mental health in
terms of anxiety and depression. It has been shown that people with these problems are less
satisfied with the outcome of surgery than those without such psychiatric comorbidity [4], [5]. A
change in symptoms may forecast increasing severity and this information may provide insight into
the progression of GERD, or decreasing symptoms may denote treatment adherence or recovery. For
instance, the resolution of heartburn is highly correlated with post-operative return to normal
24-h pH monitoring [6]. Third, measures of symptoms are excellent indicators of the severity of
the disease, and those assessing HRQL portray the impact on functioning, engagement in daily
activities and participation in life events. Finally, data specifically related to patient
satisfaction provide information on the quality of care provided. A well-developed measure of
patient satisfaction will indicate both the patients’ perceptions of the process of care and the
outcomes of treatment. Surgeons, in particular, have long been concerned with the results of
surgical care that reflect the patient’s subsequent health state [7]. This information is also
useful to administrators and payers concerned with the quality of care.

Which patient-reported outcomes are important to measure?

Symptoms

Prior to surgery for GERD, patients undergo a number of objective tests including 24-hr pH
monitoring, esophageal manometry, esophagogastroduodenoscopy with a biopsy and histological
examination of the gastroesophageal junction, as well as a careful history and assessment of
symptoms. Traditionally, these pre-operative tests were repeated post-operatively at various
points in time, and physiologic changes demonstrating normalization of pH values and lower
esophageal sphincter pressure, along with the elimination of reflux that signified esophagitis
healing, were used as indicators of operative success. Today, while the evaluations are similar,
the overt surgical objective is mainly focused on the alleviation of symptoms.
    Symptoms have been defined by the Patient-reported Outcomes Harmonization Group in 2002 as
“the subjective experience of abnormal function, sensations, or appearance, generally indicating
disorder or disease” [8]. Surgeons aim to decrease the presence, severity, frequency and duration
of symptoms, as they portray the sensory changes perceived by the patient [9]. Operative success
is often judged by patient reports of few remaining or new symptoms, negligible complications and
a limited need for further medications [10], [11].
    A number of findings have led to this focus on symptoms. First, severity of the symptoms has
not been found to be strongly associated with the pathological extent of the reflux or other
physiological parameters. For example, studies of esophageal pH monitoring or manometry are not
highly correlated with reported symptom severity [12], [13]. Symptoms and esophageal lesions do
not always correlate strongly [14]. There are few differences in symptoms between patients with
Barrett’s esophagus, erosive and non-erosive GERD [15]. Some studies have found that surgery is of
value for people with severe symptoms regardless of the endoscopic appearance of the esophageal
mucosa [16], [17]. Finally, as mentioned previously, stress-related symptoms and psychiatric
diagnoses are independent predictors of the surgical outcome [4], [5], [18], [19].
    GERD may be associated with many symptoms but the primary one is heartburn. Others, including
acid regurgitation, epigastric pain, belching, bloating, nausea, vomiting and dysphagia, may range
from causing mild impairment to severe disability [20]. Despite the long-standing interest in
symptoms and the recent reaffirmation of their importance as outcomes, there appears to be no
unified approach to the assessment of symptoms or an in-depth knowledge of the best way
to do so, either in daily practice or research [21]. There is some agreement that self-report is
the most appropriate approach [8], [22], partially because there is no strong correlation between
patient and clinician reporting. In some circumstances, clinicians tend to underestimate the
presence of symptoms and their severity compared to those who actually have GERD [23], and at
other times, particularly when estimating treatment response, investigators report more positive
results than do the patients [24].
    Self-completed symptom questionnaires have been developed and validated and some examples will
be presented later in the chapter. There is, however, a tendency in the literature to use
traditional, clinical, ordinal scales asking questions about the presence, frequency and severity
of common symptoms, rather than standardized measures. In fact, surgical investigators frequently
report the use of a “standard scale” but the meaning of the term is unclear. Information about the
origin and psychometric properties is seldom provided.
    In a recent international, multidisciplinary workshop [21], the issue of symptom reporting in
trials was extensively addressed. Using heartburn as an example of a common, salient symptom in
GERD, Bytzer [22] discussed issues related to its assessment that ranged from problems in defining
heartburn itself, its severity, and frequency, to when, how and by whom heartburn and other
symptoms should be measured. While a large number of symptom-response measures have been reported
in the literature, there is little consistency of approach and a general lack of validation
studies. In connection with this workshop, Wyrwich and Staebler Tardino [25] provided a blueprint
for creating symptom scales that uses a cognitive psychology framework approach to development.
They also gave general information about developing optimal scales and interpreting their results.
    In sum, there is considerable evidence that relief of symptoms caused by GERD is at times more
complex than simply correcting the pathologic lesion. How patients perceive the sensations and
respond to them must be taken into consideration. Because of the lack of physiological and
pathologic markers for differentiating disease severity, an individual’s description of his or her
symptoms is a predominant source of information for the surgeon in making a diagnosis, monitoring
a patient and assessing the outcomes of surgery [8], [21], [26]. Given the increasing importance
of their use in outcomes assessment, increased attention to the development and validation of
symptom scales is warranted.

Health-related quality of life

Patients seek medical care because of symptoms. Often, however, it is not because of their
presence, or even their severity, but because of the distress they cause the patient by intruding
on daily activities and life in general. In other words, it is how the symptoms impact on their
HRQL. In GERD it is evident how problems related to eating, drinking, sleeping, pain, and reduced
vitality impair life’s quality. In fact, it has been well documented that people with GERD have a
lower quality of life than those without this disorder [27], [28].
    While no one definition has received universal acceptance, there is a general consensus that
measures of HRQL are multi-dimensional and should assess physical, mental, social and
role-functioning, a person’s perception of overall well-being and symptoms to a greater or lesser
extent depending on the type of measure [29]. As noted by Guyatt and colleagues [30], HRQL is a
concept that embraces the World Health Organization’s definition of health [31] by incorporating
both personal health status and social well-being. It reflects people’s subjective perceptions of
how they feel and function.
    There are two main types of HRQL measures [32]. Generic measures cover the full range of
domains and can be used across different patient populations to compare the impact of various
diseases. Some generic measures have normative data, by age and sex, from ostensibly healthy
populations. When available, these data make it possible to compare people with GERD, for example,
to those without the condition, perhaps prior to and after surgery. Moreover, because generic
measures are broad in scope, they sometimes help identify previously undisclosed problems that are
not tapped by a measure specifically for people with GERD. This latter type of measure, known as
disease-specific, focuses on the specific feelings, dysfunctions and symptoms associated with a
given condition. Such measures are, therefore, able to detect treatment effects and mirror changes
in patient status.
Generic and disease-specific measures may be health profiles, which usually yield sub-scales for
each domain allowing the assessment of interventions on the different components of HRQL, or
utility measures, derived from economic and decision theory. These measures have preference
weights incorporated in their scoring. Utility measures provide a single numerical estimate of
HRQL that includes patient choices about both duration of life and its quality. Some profiles also
generate a single number for analysis. Today, many studies use a combination of a generic and a
disease-specific measure so that response to change is captured, but no important aspect of the
person’s HRQL is missed.
    Finally, clinicians and investigators also use a single item to evaluate HRQL. While common,
particularly in clinical practice, these single-item ratings have not usually been tested for
their measurement properties, and are known not to be very reliable. Moreover, they provide no
help in explaining why patients respond the way they do.

Patient satisfaction

Patient satisfaction refers to the patient’s perceptions of both the quality of treatment provided
and its effectiveness. A measure of satisfaction is one that documents patients’ assessments or
affective responses to different dimensions of the treatment experience [33]–[35]. Typically, it
compares the process and outcomes of the treatment experience with prior expectations that may or
may not have been met or surpassed [36], [37]. Although individual patients may have different
expectations for the distinct components of treatment or care, their individual expectations and
satisfaction with the various components are independent predictors of overall satisfaction [38].
    Different conceptual frameworks for understanding patient satisfaction have been proposed
[37], [39], and used as a basis for the development of measures. In general terms, the frameworks
include sociodemographic, personal, medical and functional characteristics of the patients, their
values, preferences and expectations of treatment outcome, prior experiences with treatment for
the current and other disorders, the way treatment is delivered and experienced, as well as its
impact on symptoms, function and HRQL. In some models, access, cost and convenience are
incorporated. There has been at least one suggestion that satisfaction with the treatment process
should be assessed separately from that of the outcome of the treatment [33].
    In recent years, most surgical investigators evaluating the impact of various surgical
procedures and approaches in GERD have selected patient satisfaction as one of the outcomes. In
the majority of studies, the degree of satisfaction reported one to five years after the surgery
was high. A few reports were not as glowing. For example, Bessell and colleagues [40] found that
27% of those patients who exchanged severe pre-operative heartburn for severe dysphagia after the
surgery would not have the surgery again.
    Another study [41], assessing surgical outcomes in routine clinical practice rather than in a
referral centre, reported similar overall outcomes in the face of less positive data about
complications, symptoms, and medication use after surgery as well as the need for post-surgical
dilatation or repeat operations. These investigators attributed the positive global response
regarding satisfaction to a type of measure that fails to include specific components of the
process or outcomes of care [42]. This is not an uncommon finding. Global ratings of satisfaction
tend to be positively skewed [43]. Patients report high levels of satisfaction in the face of
other negative information [43], [44]. Additionally, they tend to be less satisfied if asked about
specific areas [44]. In fact, there are many problems with global single-item ratings, although
they are easy to use and intuitively appealing. Because the dimensions within the satisfaction
construct are not named, it is not known what factors the patient took into consideration or
excluded when making the rating, why elements received the assigned ratings, or how they were
combined [45].
    Other investigators [17], [46] used a “standard” series of questions about such areas as the
success of the surgery, whether or not the patient would again decide to undergo the surgery, and
difficulties experienced since the operation. Each question was treated individually and provided
descriptive information about the patients’ responses. While this approach is somewhat more
informative, it is unknown whether all salient aspects that are important to patients were
included, and it is still difficult to form a concrete picture of the patient’s judgement of the
process and outcomes of care.
    In summary, the single-item ratings or questionnaires with only a few items used in surgical
studies to assess patient satisfaction have not been carefully developed and examined for their
psychometric properties. In other words, they have not been developed in the currently accepted
rigorous manner [39].
    To the best of our knowledge, only one group of researchers [47] has developed and validated a
measure of patient satisfaction for GERD patients, the Treatment Satisfaction Questionnaire for
Gastroesophageal Reflux Disease (TSQ-G). The measure was developed using input from patient focus
groups, physicians and the literature and it was tested appropriately for reliability and
validity. Unfortunately, the measure is targeted for GERD patients being managed by medications
and, thus, is not appropriate as an outcome of surgery. Nonetheless, it is a model for the
development of such a measure for use with patients undergoing surgery.
    Our concerns about the assessment of treatment satisfaction mirror those of Revicki [48]. He
pointed out that considerable attention needs to be given to the psychometric properties of
satisfaction measures including the theoretical model that underpins the instrument, reliability,
all types of validity and the interpretability of the numerical scores.
    It should be noted that surgical investigators working in GERD are not alone in their
difficulty assessing satisfaction. A few years ago an analysis of 195 studies found that little
attention had been given to the development of satisfaction measures and this in itself cast doubt
on the credibility of the satisfaction findings [49]. It is an area that needs immediate
attention. Specifically, we need to develop measures of satisfaction that reflect the components
of global satisfaction such as personal expectations, indicators of quality of treatment as well
as the outcomes of care as judged by the patients.

Issues in selecting patient-reported outcomes
for use in clinical practice and research

First, we are interested in selecting standardized measures that are “evaluative” in purpose [50].
A standardized measure is one that has been published along with information about its
psychometric properties, and instructions as to with whom, when and how it is to be administered,
scored and interpreted. Conversely, “ad hoc” measures, most often created by clinicians, are those
without formal testing or established measurement properties. An evaluative measure is designed to
assess an individual at a baseline point, and again at one or more points later in time,
principally to determine if change has occurred. Evaluative measures may also discriminate between
different groups and predict future events as well.
    The content of the instrument is of primary interest. While content may be influenced by the
literature and information from clinicians, as alluded to previously, most content should come
directly from patients and reflect their issues and concerns. There is a growing consensus that
the “content” validity, or the adequacy with which the items sample the construct being assessed
by the measure, can only be judged by the persons being evaluated [2]. It is, thus, important that
patients with the specific health problem have had major input. Assuring that this is the case is
an early step in the selection process.
    In general terms, validity refers to the ability of an instrument to measure what it is
supposed to measure. Beyond “content”, information on criterion and construct validity may be
available. Criterion validity evaluates the relationship between the measure of interest and a
criterion measure or “gold standard”, concurrently or in the future. For concurrent criterion
validity a new disease-specific measure of HRQL for people with GERD should correlate moderately
with a well-known disease-specific measure of HRQL. While the criterion measure may not be “gold”
it should at least be “silver”! For predictive criterion validity one may test if the score on a
measure of HRQL taken two weeks after surgery will forecast return to work. Given the difficulty
of finding gold standards for patient-reported measures, construct validity is more often
reported. Construct validation examines if the measure performs according to theoretical
expectations by examining the direction and magnitude of relationships with other variables. For
example, one might hypothesize and test if a generic measure of HRQL can discriminate among groups
of patients who have different levels of symptoms, or if it will negatively correlate with a
measure of pain. A measure of patient satisfaction following an anterior partial fundoplication
for GERD should be positively correlated with the degree of symptom resolution or an objective
outcome such as results from 24-hr gastric pH monitoring.
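
The construct-validity hypotheses described above (a negative correlation between HRQL and pain, a positive correlation between satisfaction and symptom resolution) can be checked with a simple correlation coefficient. The following is only a minimal sketch in Python. The function name `pearson_r` and all of the scores are invented for illustration and are not taken from any study or instrument cited in this chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical, invented scores for six patients (assumption: illustration only).
hrql_score = [72, 65, 58, 50, 41, 35]            # higher = better HRQL
pain_score = [1, 2, 4, 5, 7, 8]                  # higher = more pain
satisfaction = [9, 8, 8, 6, 4, 3]                # higher = more satisfied
symptom_resolution = [95, 90, 70, 60, 30, 20]    # percent of symptoms resolved

r_hrql_pain = pearson_r(hrql_score, pain_score)
r_sat_resolution = pearson_r(satisfaction, symptom_resolution)

# The hypotheses predict a negative first coefficient and a positive second one.
print(f"HRQL vs pain: r = {r_hrql_pain:.2f}")
print(f"satisfaction vs symptom resolution: r = {r_sat_resolution:.2f}")
```

With real questionnaire data, which are often ordinal, a rank-based coefficient such as Spearman's may be the more defensible choice. The Pearson form is used here only because it is the shortest to write out in full.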
    Reliability reflects the extent to which a measure is free from random error and it refers to
the reproducibility or stability of the measure over time. This is termed test-retest reliability.
It also includes estimates of internal consistency, or how well the items in a scale relate to
each other and to the total score. While there are several test statistics to assess reliability,
the reliability coefficients are interpreted similarly. A coefficient of 0.89 means that 89% of
the variance is true variance, related to the patients in the sample, and 11% is the amount of
random error. For groups, as in research, a coefficient of 0.70 is the minimum level, but for use
in clinical practice the minimum has been set between 0.85 and 0.90 [51].
    The last psychometric property is responsiveness, or the ability of the measure to accurately
detect patient change when it has occurred [52]. Most approaches to test responsiveness depend on
assessing patients periodically over time during a period of anticipated change, and evaluating
the change that occurs [53], [54]. While various approaches to quantifying responsiveness exist,
clinical studies primarily report one of the variants of “effect size”. Coined by Cohen [55], this
term simply means a standardized, unitless measure of change. Today such variants are termed
“effect sizes” [56], “standardized response means” [57] or “responsiveness statistics” [58].
    Potential users of measures also need direction in how to interpret the score. By
“interpretability” we mean the capacity to assign a qualitative meaning to a quantitative score
[59]. One approach to the interpretation of change that is “distribution based” employs effect
sizes [60]. Cohen [55] suggested that 0.2, 0.5 and 0.8 represent small, medium and large effect
sizes. While these values are somewhat arbitrary [61], they are used in the literature. The second
approach, termed “anchor-based” [60], examines the relationship between the change score on the
instrument being tested and that on another measure that is well-known, associated with the test
measure and clinically meaningful [61]. Population norms, severity classifications, symptom scores
and global ratings of change by patients or physicians as well as the minimum important difference
(MID) have all been used. The MID, the smallest change that patients perceive as beneficial [62],
is another useful piece of information for potential users of a measure as it not only has
implications for sample size in investigations, but it can provide some direction about how an
individual patient is doing. Readers wishing more information on the psychometric properties of
measures are referred to work by the Scientific Advisory Committee of the Medical Outcomes Trust
[63].
    In addition to knowledge about psychometric properties, the potential user of an instrument
needs other information before making a choice. For example, does the timeframe associated with
the questions or items fit the intended use? Patients can be asked to consider their responses in
terms of the past 24 hours, a week, month or even a year. The choice depends on the typical
illness or recovery trajectory of the patients or on the design of the study. Which response
options are provided for the patient? Are they dichotomous (yes/no), made up of several ordinal
categories (poor – fair – good – excellent) or presented as a visual analog scale? Are population
norms available for the country of the study which can be used for comparison purposes? This is
particularly useful for generic measures of HRQL so clinicians or investigators can compare their
patients’ or study subjects’ scores to age- and sex-matched population values. What is the burden
on subjects? More specifically, how long does the instrument take to complete, and does it include
questions that are potentially upsetting for the patient? What is the burden on the professional?
How easily is the measure scored? Can the scoring be automated? Does one have to obtain permission
to use the measure, and if so, is there an associated cost? All of this information needs to be
ascertained prior to selecting a measure.
    Moreover, today, clinical research is conducted in countries around the world, and thus, the
demand for instruments that can be used internationally has risen dramatically. By now many
instruments have been culturally adapted, translated into different languages and then retested
psychometrically to ensure that the language, meaning and performance of the instrument remain
consistent. There are different methods to enhance cross-cultural comparability, and while
guidelines are available [64]–[66], it is a time-consuming process. Investigators or clinicians
planning to use a patient-reported measure in their clinical practice or research project should
determine if the measure they select has undergone such a process and is available for use.
    Brief information about the psychometric properties of patient-reported measures used in
people with
GERD and answers to some of the questions raised in this section of the Chapter are presented in
Table 1, but such information accumulates over time, and so a potential user should refer to
recent literature.

Patient-reported outcomes currently used in people with GERD

A number of articles have reviewed the development, psychometric performance and applications of
patient-reported measures of symptoms and HRQL for people with GERD [67]–[71]. Table 1 revisits
this information and presents those measures appearing in the surgical literature, along with
information on the different domains tapped in each measure, the number of items per domain, the
time-frame within which patients are to consider their responses and how the measures are scored.
Additional information is provided about the approach to content development (specifically if
patient input had been sought), other aspects of validity, estimates of reliability and how
responsiveness has been examined. Some measures have information about the minimal important
difference in score that patients can detect as well. When known, the languages in which the
measure is available are stated in the text. It is acknowledged, however, that other language
versions, unknown to the authors, may exist in the international literature.
    The Gastrointestinal Symptom Rating Scale (GSRS) [72] and the Gastroesophageal Reflux Disease
Health-related Quality of Life (GERD-HRQL) scale [73] have been available for a number of years
and appear frequently in surgical investigations. Both these measures were recently recommended
for use by the European Association for Endoscopic Surgery [74]. The GSRS has been used in
Scandinavian, UK and US samples. The Symptom Questionnaire for Gastroesophageal Reflux Disease
[75] is more recent and has been employed in one study of the long-term follow-up of patients
after laparoscopic Nissen fundoplication [76].
    The gastrointestinal-specific and the GERD-specific measures of HRQL have also been widely
used in surgical studies. Both the Gastrointestinal Quality of Life Index (GIQLI) [77] and the
Quality of Life in Reflux and Dyspepsia (QOLRAD) [78] were recommended by the European Association
for Endoscopic Surgery [74], and the GIQLI [77] was recommended specifically for outcome
assessment by the European Study Group for Antireflux Surgery [79]. It is available in English
[77], French [80], German [81] and Spanish [82]. The QOLRAD was developed in French and English
[78].
    In terms of generic HRQL measures, people with GERD have mainly been assessed using two
well-known measures: the Psychological General Well Being Index (PGWB Index) and the Medical
Outcomes Study Short Form-36 (SF-36). These measures were recommended by the European Association
for Endoscopic Surgery [74] partially because individuals with GERD score lower on these measures
than ostensibly healthy individuals and their scores decrease as symptoms become more severe
[83]–[85].
    The PGWB Index was developed as a measure of subjective well-being or distress [86]. The Index
comprises six domains: anxiety, depressed mood, positive well-being, self control, general health
and vitality. The domains contain 3–5 items, each of which is scored on a 6-point ordinal scale.
Domain scores and a total score can be calculated. Higher values denote better quality of life.
Internal consistency and test-retest reliability as well as construct and criterion validity were
moderate to strong [86]–[89]. PGWB total scores were able to discriminate between individuals with
and without heartburn [83]. Moreover, sensitivity to change in response to treatment has been
demonstrated in patients with upper gastrointestinal symptoms [88]–[91] and a change of 4 points
on the Index is a clinically meaningful difference in people with GERD [83]. Swedish norms are
available [89].
    The SF-36 is a generic measure of perceived health status that incorporates behavioural
functioning, subjective well-being and perceptions of health, by assessing eight health concepts:
limitations in physical activities due to health problems; limitations in role activities due to
physical health problems; pain; limitations in social activities due to health problems; general
mental health; limitations in usual role activities due to emotional problems; vitality; and
general health perceptions [92]. The questionnaire is made up of 36 items that are divided into
the 8 scales. The scores on all scales range from 0 to 100, with higher scores reflecting better
health. There is also a computerized method of scoring two major components, physical and mental
health. Each component has been standardized
Table 1. Patient-reported measures of symptoms and health-related quality of life used in surgical studies of people with GERD

1 Symptoms

Instrument           Domains               # of    Time       Scoring                       Reliability             Validity                 Responsiveness
                                           items   frame

Gastrointestinal     Reflux syndrome        2       Past       7-point ordinal scale [1–7]   Internal consistency:   Content: Developed       Effect Sizes and
Symptom Rating       Abdominal pain        3       1–2        No Discomfort – Severe        Alphas – Moderate       using literature and     Standardized
Scale (GSRS) [72]    Indigestion           4       weeks      Discomfort                    to Moderately High      professional input       Response Means:
                     Diarrhea              3                  Sum domain scores             [115]–[116]             [72]                     Adequate [115]–[118]
                     Constipation          3                  Calculate domain means
                                                              Higher score : greater        Test-retest:            Construct: Adequate;     Minimal Important
                                                              severity                      ICCs – Moderate         with SF-36 & PGWB        Difference: 0.5 per
                                                                                            to Moderately High      Scales [116], [118]      Item [117]
                                                                                            [116]–[117]
                                                                                                                    Discriminative
                                                                                                                    Adequate; between
                                                                                                                    different symptom
                                                                                                                    severities & responses
                                                                                                                    to treatment [83],
                                                                                                                    [115], [116]
Gastroesophageal     Heartburn             6       Current    6-point ordinal scale [0–5]                           Content: Face validity   Pre-post treatment
Reflux Disease        Dysphagia             2                  No symptoms –                                         judged by clinicians     changes evident on
Health-related       Bloating              1                  Incapacitating Symptoms                                                        symptom scale [73]
Quality of Life      Medication impact     1                  Sum 10 symptom items                                  Construct: Adequate;
Scale                                                         Higher score : greater                                Correlates with
(GERD-HRQL)          Satisfaction with     1                  severity                                              degree of
[73]                 condition                                3-point categorical scale                             esophagitis [12]
                                                              (Satisfied – Neutral –
                                                              Unsatisfied)                                           Discriminative
                                                                                                                    Adequate; between
                                                                                                                    satisfied/unsatisfied
                                                                                                                    patients and
                                                                                                                    medical/surgical
                                                                                                                    treatments [73]
Symptom              Heartburn             1       Current    4-point ordinal scale [0–3]   Test-retest             Content: Face validity   Responsiveness
Questionnaire for    Regurgitation         1                  Severity                      ICC – High [75]         judged by clinicians     Index:
Gastroesophageal     Epigastric / chest    1                  5-point ordinal scale [0–4]                                                    Adequate [75]
Reflux Disease        pain                                     Frequency                                             Construct: Adequate
[75]                 Epigastric fullness   1                  Severity × Frequency =                                Correlates with          Minimal Important
                     Dysphagia             1                  [0–12] points per item                                disease activity         Difference = [5–10]
                     Cough                 1                  Total Score = [0–72]                                  measures and SF-36       points [75]
                                                              Higher score : greater                                cross-sectionally
                                                              symptom impact                                        and longitudinally [75]

2 Health-related quality of life: gastrointestinal and disease-specific

Gastrointestinal      Symptoms               19        Past 2     5-point ordinal scale [0–4]   Internal Consistency   Developed by              Demonstrated
Quality of Life       Emotional status        5        weeks      Severity or frequency of      Alpha – High [77]      interviews with           expected gradient
Index (GIQLI) [77]    Physical status         7                   symptoms, dysfunctions                               patients and              pre-post surgery and
                      Social activities       4                                                 Test-retest            clinicians and from       follow-up [119], [120]
                      Treatment stress        1                   Total score [0–144]           ICC – High [77]        the literature [77]
                                                                  Higher score : Better                                Construct: Adequate;
                                                                  HRQL                                                 Correlates with
                                                                                                                       measures of QOL &
                                                                                                                       well-being [77]
                                                                                                                       Data on normal
                                                                                                                       people available [77]
Quality of Life       Emotional distress       5       Past       7-point ordinal scale [1–7]   Internal Consistency   Developed by focus        Effect Sizes and
in Reflux and          Sleep disturbance        5       week       Severity: none at all – a     Alphas – High          groups with patients      Standardized
Dyspepsia             Food/drink               6                  great deal                    (total and domain      (N.A., Australia and      Response means
(QOLRAD) [78]         problems                                    Frequency: none of the        scores) [78]           Europe), clinicians,      Adequate [117]
                      Physical/social          5                  time – all of the time                               and a literature
                      functioning                                                               Test-retest            review [78]               Minimal Important
                      Vitality                 4                  Total score and domain        ICC – Moderately                                 Difference: 0.5 per
                                                                  scores                        High [117]             Construct: Adequate;      Item [15]
                                                                  Higher score : better                                Correlates with SF-
                                                                  HRQL                                                 36 and GSRS [78]
                                                                                                                       Discriminative
                                                                                                                       Adequate; Better
                                                                                                                       with symptom
                                                                                                                       severity than
                                                                                                                       frequency [78]

Abbreviations: ICC = Intraclass Correlation Coefficient; Alpha = Cronbach’s alpha; QoL = Quality of Life
Reliability coefficients: ≥0.80 = high; 0.60–0.79 = moderately high; 0.40–0.59 = moderate
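
The scoring rules summarized in Table 1 can be illustrated with a short sketch. It is not taken from the instrument developers. It assumes that each item of the Symptom Questionnaire for Gastroesophageal Reflux Disease [75] is scored as severity (0–3) multiplied by frequency (0–4), which is consistent with the 0–12 range per item and the 0–72 total for six items shown in the table. The function names are hypothetical, and the handling of coefficients below 0.40 or between the listed bands is an assumption, since the table footnote does not label them.

```python
from typing import Optional, Sequence, Tuple


def item_score(severity: int, frequency: int) -> int:
    """Score one symptom item as severity (0-3) x frequency (0-4), giving 0-12.

    Assumption: the product rule is inferred from the ranges in Table 1.
    """
    if not 0 <= severity <= 3:
        raise ValueError("severity must be between 0 and 3")
    if not 0 <= frequency <= 4:
        raise ValueError("frequency must be between 0 and 4")
    return severity * frequency


def total_score(items: Sequence[Tuple[int, int]]) -> int:
    """Sum the six item scores; the total ranges from 0 to 72."""
    if len(items) != 6:
        raise ValueError("expected six (severity, frequency) pairs")
    return sum(item_score(severity, frequency) for severity, frequency in items)


def reliability_label(coefficient: float) -> Optional[str]:
    """Map a reliability coefficient to the qualitative labels in the Table 1 footnote.

    Assumption: band edges are treated as lower bounds (>= 0.80, >= 0.60, >= 0.40);
    values below 0.40 are not labelled in the table, so None is returned.
    """
    if coefficient >= 0.80:
        return "high"
    if coefficient >= 0.60:
        return "moderately high"
    if coefficient >= 0.40:
        return "moderate"
    return None


if __name__ == "__main__":
    # Hypothetical patient: (severity, frequency) for heartburn, regurgitation,
    # epigastric/chest pain, epigastric fullness, dysphagia and cough.
    answers = [(2, 3), (1, 2), (0, 0), (1, 1), (0, 0), (3, 4)]
    print(total_score(answers))
    print(reliability_label(0.72))
```

Higher totals indicate greater symptom impact, so a score computed this way should be read in the same direction as the table indicates.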

to have a mean of 50 and a standard deviation of 10 [93]. One version of the SF-36 asks people to
think about their health over the past four weeks and another version uses a one-week recall
period.
    Good to excellent internal consistency and test-retest reliability have been demonstrated in
diverse patient groups including those with GERD [88], [94]. Subscales of the SF-36 (pain and
general health perceptions) and the component summary scores were able to discriminate between
people with GERD reporting no heartburn and those reporting heartburn symptoms [83].
Responsiveness to treatment has also been demonstrated in people with GERD [83], [88]. As part of
an international initiative that used a standard protocol, the SF-36 has been translated,
culturally adapted and revalidated in over 50 languages. Norms for many countries are available
[95].

Issues in using patient-reported outcomes in clinical research

Using measures of symptoms, HRQL and patient satisfaction in surgical studies requires additional
considerations in both the planning and execution of the investigation. Specific guidelines for
selecting measures have already been discussed. This section will focus on the successful use of
these measures in a study.
    When stating the objectives in the study protocol it is important to identify that symptom
resolution, improved HRQL or high satisfaction with the treatment are defined outcomes, each with
a hypothesis attached to it, and that they will be as rigorously evaluated as the more traditional
outcomes. This is crucial in a multi-centered study so that these outcomes are not considered as
“add-ons” by co-investigators who may treat them with less rigour than used with traditional
assessments.
    When patients are asked to participate in a study and informed consent is sought, they should
also be told about the study and what participation will entail [96]. In studies using
patient-reported outcomes this means that patients should agree to complete questionnaires or be
interviewed face-to-face or over the telephone at specific points in time. Some trials have
actually asked patients to complete a set of forms as part of the eligibility criteria. Providing
this information up front and making sure that patients understand the commitment will help ensure
continued involvement. Moreover, patient-reported measures rely on the ability of the patient to
provide the answers. The patient must, therefore, have sufficient reading ability or someone must
read the questions to him or her to obtain the response. This is an acceptable practice, but ad
hoc translation of the question by a family member, a researcher or even a qualified translator is
not permitted, as a bias may be introduced by the way the question is translated and asked [96].
While proxy respondents have a place in research, their reports are not patient-reported measures
[2].
    When designing the study, the timing of the assessments should be planned within the context
of the surgery and the recovery trajectory [97]. Baseline assessments of symptoms and HRQL are
essential in both observational and controlled studies. In both types, one comparison will be
between pre-surgery and post-surgery at various points in the recovery trajectory. In a controlled
trial the baseline assessment should be administered prior to randomization so as to eliminate any
possible bias resulting from knowledge of the allocation either on the part of the patient or the
individual administering the measure. This baseline assessment in a controlled situation also
provides data for group comparisons at study entry as well as allowing between-group comparisons
over time.
    From the previous paragraph it is clear that preoperative assessments are important to provide
baseline data. Yet the period just before imminent surgery is probably not the best time to ask
patients to complete questionnaires and provide reflective responses. Completion at an earlier
point in time, perhaps at the last visit to the doctor, or through a telephone interview a few
days prior to surgery might yield more considered answers.
    Another issue to think about in terms of appropriate timing is that the immediate effects,
particularly when an open approach to surgery for GERD is used, will be negative on most HRQL
domains. Moreover, there will be after-effects and possibly new symptoms with which the patients
must deal. However, by four weeks after the operation, patients will likely associate positive
changes in eating or level of pain with an improved quality of life. If one wants information on
the patient’s perceptions of the care process, assessments of treatment satisfaction are best made
directly after discharge when details are fresh in patients’ minds.
Satisfaction with the outcome of the operation, however, must wait a sufficient time until the
person is fully recovered from the surgery itself and probably until the long-term effects are
apparent.
    The investigators should also plan where and how the assessments will be made. The location
might be the doctor’s office, a clinic or hospital, or the home [96]. Ideally assessments should
take place in a consistent location but often this is impractical. A professional setting provides
a milieu in which conditions are more controllable and personnel responsible for administering the
questionnaires can make sure that the patient completes them without input from family or friends
[96]. Telephone interviews, however, are widely used, provide data similar to face-to-face
interviews [98] and control for timing and patient completion. If an external person is involved
in administering the questionnaire, that individual should not be part of the treatment team and
preferably should be unaware of the objective of the study and the group assignment if it is a
controlled trial. Providing questionnaires for patients to complete at a later date, or mailing
questionnaires for completion, are other accepted approaches but ones that often result in
considerable missing data.
    Several points are important to remember. We know that data obtained from self-completed forms
are slightly different from those obtained through interviews so it is preferable to select one
approach [99]–[101]. Feasibility may dictate, however, that modes of administration are mixed. In
any case, detailed instructions must be provided to personnel responsible for collecting the data
and procedures should be established that facilitate compliance with questionnaire completion. It
is also essential to clarify what should be done with the questionnaire when completed. Most often
direct entry using computers and electronic transmission is used. Sometimes patients respond
directly on a computer. Instructions on preserving confidentiality are also essential.
    Detailed descriptions of analytic methods are clearly beyond the scope of this chapter, and so
only a few general points will be made. First, it is important to have statistical expertise when
the study is being planned. HRQL or symptom scores are seldom the primary endpoints upon which
sample size is calculated, and therefore, investigators need to be sure that they have sufficient
subjects to make the comparisons they plan. Moreover, most HRQL measures are multidimensional and
made up of subscales. This again may increase the number of endpoints. Not only do we select one
or more multidimensional measures, but we take measurements at several points in time. An outline
for data analysis should, thus, be made in the planning stages. All these issues are within the
purview of the statistician or someone very familiar with multi-level and multivariate analyses.
    Finally, procedures to contend with missing items within measures, or missing data forms, need
to be defined. Missing data within a measure are generally dealt with according to the following
process. If at least 50% of the questions or items in a subscale have been completed, a mean score
calculated from the completed items of that subscale can be imputed to replace the missing values.
While this may decrease the variance in the data, it will probably not have a major impact on the
results [102], [103]. Missing forms are more of a problem. If they are missing at random because
someone forgot to mail the questionnaire to the patient or the patient missed a follow-up visit
because he or she was on a holiday, it is not too serious. Forms not missing at random, which is
the more common scenario, may be telling us that the patient is sicker (or healthier) or perhaps
more upset with the results of treatment than the average patient. In other words there may be a
health-related reason that the questionnaire was not completed. For these cases it is important
that a protocol is developed to handle the situation. Several options are available and all rely
on statistical expertise and use of appropriate statistical packages [103].

In clinical practice

While symptoms have traditionally been assessed, advocates of patient-reported outcomes have
supported the use of other such measures in daily clinical practice. In particular, the assessment
of HRQL has been seen as an aid to screening for unidentified problems, making decisions about
treatment, monitoring patient status and response to treatment, as well as a mechanism for quality
assurance [104]. Barriers, however, were identified to routine use for conceptual, methodological,
practical and attitudinal reasons [105]. Scepticism about the importance of the measures was
voiced. Practitioners preferred traditional, pathologic or physiologic tests and
did not understand the usefulness of information from both types of measures. They cited time and
resource constraints, and were concerned about the costs of administering the tools, collecting
the information, compiling it rapidly, interpreting and using it.
    Over the years a number of these concerns have been addressed. Studies have shown that
assessing HRQL in different practice settings is feasible and is easily incorporated into the
office or clinic routine [106]. Briefer and more precise disease-specific measures have been
developed and computer-assisted technology is available to provide instant scoring and feedback to
the clinicians. Information about population values and the amount of change in patient status
required to reflect an important difference, as perceived by the patient, has added ease to the
interpretation of the scores.
    A number of studies, both controlled trials and other designs, have investigated the impact of
the use of HRQL information on doctor-patient communication. To summarize, the provision of
information to the clinician seems to have an impact on the process of care. It increases the
identification of previously unrecognized problems [107]–[109], improves doctor-patient
communication and facilitates more emotional support for patients [107], [110], and increases
physicians’ awareness of their patients’ problems and concerns [106], [107]. Moreover, the process
was perceived as useful by most physicians and it was acceptable to patients and office and clinic
staff [108], [109]. Finally, it did not significantly increase the time of the doctor-patient
interaction [110]. Providing HRQL information to the clinicians appears to have had less impact on
the outcomes of care. With the exception of the systematic review by Espallargues and colleagues
[109], there was no reported impact on patient satisfaction, which was generally high [107],
[111]. There was also little evidence of change in management decisions as the result of HRQL
knowledge [107], [111], and most studies did not find that it influenced health-related quality of
life. There is, however, some recent evidence in people with cancer that providing HRQL
information to the clinician positively impacted the patient’s HRQL, particularly in mental health
and role performance areas [112].
    The studies referenced in the previous paragraph were all conducted using patients with
problems other than GERD. While the use of patient-reported outcomes in follow-up after an
operation was often reported in the GERD literature, articles related to the value of these tools
in clinical decision-making were very scarce. One study conducted in Montreal [113] had been
presented at the 58th Annual Meeting of the Central Surgical Association in 2001. The ensuing
discussion included questions to the presenting author and these questions and their answers were
provided at the end of the article. One question was about the practical use of continued
assessment of HRQL and why a simple satisfaction rating scale was not sufficient. Her responses
indicated that the 5-point satisfaction measure varied little across patients and this was not
sufficient as a sole endpoint, but that the HRQL scores yielded practical information. For
example, if there were unexpected responses on the questionnaire, patients were asked to return to
the clinic for re-studies. As was pointed out by the author, after you start to use HRQL
questionnaires and “you get a feel for what is normal and abnormal, they help drive
decision-making in a practical way”. Hopefully, patient-reported measures will be seen as an
adjunct to traditional care in the future. Their use can be seen as formalizing what clinicians
have been implicitly doing forever when they ask a patient “How are you?”

Conclusions

Patient-reported outcomes have been advocated following surgery for GERD for the past several
years [114]. Those outlined in this chapter, as well as others such as “adherence to treatment”,
are important for measuring the impact of GERD and its treatment. Clinicians and researchers who
use these measures should select them carefully according to their reliability, validity and
responsiveness, as well as to information about how the score is interpreted. It is also important
to think about how traditional, objective tests are related to patient-reported measures, and to
use each type as appropriate. Objective tests and measures provide information about the medical
status of the patient that is essential for management of the disease. Patient-reported measures
give information about the individual’s perception of the symptoms and dysfunctions and how they
impact on the quality of their lives before and in response to treatment. Both are important.

References

[1] Kaplan RM (2002) Quality of life: An outcomes perspective. Arch Phys Med Rehabil 83 (Suppl 2): S44–S50
[2] Patrick DL (2003) Patient-reported outcomes (PROs): an organizing tool for concepts, measures and applications. Qual Life Newsletter 31: 1–5
[3] Kamolz T, Pointner R (2002) Expectations of patients with gastroesophageal reflux disease for the outcome of laparoscopic antireflux surgery. Surg Laparosc Endosc Percutan Tech 12: 389–392
[4] Velanovich V, Karmy-Jones R (2001) Psychiatric disorders affect outcomes of antireflux operations for gastroesophageal reflux disease. Surg Endosc 15: 171–175
[5] Velanovich V (2003) The effect of chronic pain syndromes and psychoemotional disorders in symptomatic and quality of life outcomes of antireflux surgery. J Gastrointest Surg 7: 53–58
[6] Eubanks TR, Omelanczuk P, Richards C et al (2000) Outcome of laparoscopic antireflux procedures. Am J Surg 179: 391–395
[7] Codman EA (1914) The product of a hospital. Surg Gyn Obst 18: 491–496
[8] McColl E (2004) Best practice in symptom assessment: a review. Gut 53 (Suppl IV): IV49–IV54
[9] Price DD, Harkins SW, Baker C (1987) Sensory-affective relationships among different types of clinical and experimental pain. Pain 28: 297–307
[10] Stein HJ, Feussner H, Siewert JR (1998) Antireflux surgery: a current comparison of open and laparoscopic approaches. Hepatogastroenterology 45: 1328–1337
[11] Pope CE II (1992) The quality of life following antireflux surgery. World J Surg 16: 355–358
[12] Velanovich V, Karmy-Jones R (1998) Measuring gastroesophageal reflux disease: relationship between health-related quality-of-life scores and physiologic parameters. Am Surg 64: 649–653
[13] Shi G, Tatum RP, Joehl RJ et al (1999) Esophageal sensitivity and symptom perception in gastroesophageal reflux disease. Curr Gastroenterol Rep 1: 214–219
[14] Katzka DA (1999) Digestive system disorders: gastroesophageal reflux disease. Clin Evidence 1: 145–153
[15] Kulig M, Leodolter A, Vieth M et al (2003) Quality of life in relation to symptoms in patients with gastro-oesophageal reflux disease – an analysis based on the ProGERD initiative. Aliment Pharmacol Ther 18: 767–776
[16] Watson DI, Foreman D, Devitt PG (1997) Preoperative endoscopic grading of esophagitis versus outcome after laparoscopic Nissen fundoplication. Am J Gastroenterol 92: 222–225
[17] Bammer T, Freeman M, Shahriari A et al (2002) Outcome of laparoscopic antireflux surgery in patients with nonerosive reflux disease. J Gastrointest Surg 6: 730–737
[18] Kamolz T, Bammer T, Granderath FA et al (2001) Laparoscopic antireflux surgery in gastro-oesophageal reflux disease patients with concomitant anxiety disorders. Dig Liver Dis 33: 659–664
[19] Kamolz T, Granderath FA, Pointner R (2003) Does major depression affect the outcome of laparoscopic antireflux surgery? Surg Endosc 17: 55–60
[20] Falk GW (2001) Gastroesophageal reflux disease and Barrett’s esophagus. Endoscopy 33: 109–118
[21] Dent J, Armstrong D, Delaney B et al (2004) Symptom evaluation in reflux disease: workshop background, processes, terminology, recommendations and discussion outputs. Gut 53 (Suppl IV): IV1–IV24
[22] Bytzer P (2004) Assessment of reflux symptom severity: methodological options and their attributes. Gut 53 (Suppl IV): IV28–IV34
[23] Stephens RJ, Hopwood P, Girling DJ et al (1997) Randomized trials with quality of life end-points: are doctors’ ratings of patients’ physical symptoms interchangeable with patients’ self-ratings? Qual Life Res 6: 225–236
[24] Sandmark S, Carlsson R, Fausa O et al (1988) Omeprazole or ranitidine in the treatment of reflux esophagitis. Results of a double-blind, randomized Scandinavian multicenter study. Scand J Gastroenterol 23: 625–632
[25] Wyrwich KW, Staebler Tardino VM (2004) A blueprint for symptom scales and responses: measurement and reporting. Gut 53 (Suppl IV): IV45–IV48
[26] Shaw M (2004) Diagnostic utility of reflux disease symptoms. Gut 53 (Suppl IV): IV25–IV27
[27] Revicki DA, Wood M, Maton PN et al (1998) The impact of gastroesophageal reflux disease on health-related quality of life. Am J Med 104: 252–258
[28] Eloubeidi MA, Provenzale D (2000) Health-related quality of life and severity of symptoms in patients with Barrett’s esophagus and gastroesophageal reflux disease patients without Barrett’s esophagus. Am J Gastroenterol 95: 1881–1887
[29] Berzon RA, Hays RD, Shumaker SA (1993) International use, application and performance of health-related quality of life instruments. Qual Life Res 2: 367–368
[30] Guyatt GH, Feeny DH, Patrick DL (1993) Measuring health-related quality of life. Ann Intern Med 118: 622–629
[31] World Health Organization (1948) WHO Constitution. Geneva: WHO
[32] Guyatt G, Veldhuyzen Van Zanten S, Feeny D et al (1989) Measuring quality of life in clinical trials: a taxonomy and review. Can Med Assoc J 140: 1441–1448
[33] Hudak PL, Wright JG (2000) The characteristics of patient satisfaction measures. Spine 25: 3167–3177
[34] Linder-Pelz S (1982) Toward a theory of patient satisfaction. Soc Sci Med 16: 577–582
[35] Ware JE, Davies-Avery A, Stewart AL (1978) The measurement and meaning of patient satisfaction. Health Med Care Serv Rev 1: 13–15
[36] Kravitz RL (1996) Patients’ expectations for medical care: an expanded formulation based on review of the literature. Med Care Res Review 53: 3–27
[37] Patrick DL, Martin ML, Bushnell DM et al (2003) Measuring satisfaction with migraine treatment: expectations, importance, outcomes and global ratings. Clin Ther 25: 2920–2935
[38] Locker D, Dunt D (1978) Theoretical and methodological issues in sociological studies of consumer satisfaction with medical care. Soc Sci Med 12: 283–292
[39] Weaver M, Patrick DL, Markson PD et al (1997) Issues in the measurement of treatment satisfaction. Am J Manag Care 3: 579–594
[40] Bessell JR, Finch R, Gotley DC et al (2000) Chronic dysphagia following laparoscopic fundoplication. Br J Surg 87: 1341–1345
[41] Vakil N, Shaw M, Kirby R (2003) Clinical effectiveness of laparoscopic fundoplication in a U.S. community. Am J Med 114: 1–5
[42] Cleary P, McNeil B (1998) Patient satisfaction as an indicator of quality of care. Inquiry 22: 25–36
[43] Patrick DL, Erickson P (1993) Health status and health policy: quality of life in health care evaluation and resource allocation. Oxford University Press, New York
[44] Dougall A, Russel A, Rutin G et al (2000) Rethinking patient satisfaction: patient experiences of an open access flexible sigmoidoscopy service. Soc Sci Med 50: 53–62
[45] Feinstein AR (1987) Global indexes and scales. In: Clinimetrics. Yale University Press, New Haven, pp 91–103
[46] Granderath FA, Kamolz T, Schweiger M et al (2002) Long term follow-up after laparoscopic refundoplication for failed antireflux surgery: quality of life, symptomatic outcome, and patient satisfaction. J Gastrointest Surg 6: 812–818
[47] Coyne KS, Wiklund I, Schmier J et al (2003) Development and validation of a disease-specific treatment satisfaction questionnaire for gastro-oesophageal reflux disease. Aliment Pharmacol Ther 18: 907–915
[48] Revicki DA (2004) Patient assessment of treatment satisfaction: methods and practical issues. Gut 53 (Suppl IV): IV40–IV44
[49] Sitzia J (1999) How valid and reliable are patient satisfaction data? An analysis of 195 studies. Int J Qual Health Care 11: 319–328
[50] Kirshner B, Guyatt G (1985) A methodological framework for assessing health indices. J Chron Dis 38: 27–36
[51] Portney LG, Watkins MP (2000) Reliability. In: Foundations of clinical research: applications to practice (2nd ed), Prentice Hall Health, New Jersey, pp 61–77
[52] de Bruin AF, Diederiks JPM, de Witte LP et al (1997) Assessing the responsiveness of a functional status measure: the Sickness Impact Profile versus the SIP68. J Clin Epidemiol 50: 529–540
[53] Husted JA, Cook RJ, Farewell VT et al (2000) Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol 53: 459–468
[54] Terwee CB, Dekker FW, Wiersinga WM et al (2003) On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation. Qual Life Res 12: 349–362
[55] Cohen J (1988) Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale: Lawrence Erlbaum
[56] Kazis L, Anderson J, Meenan R (1989) Effect sizes for interpreting changes in health status. Med Care 27 (Suppl): 178–189
[57] Liang MH, Fossel AH, Larson MG (1990) Comparisons of five health status instruments for orthopaedic evaluation. Med Care 28: 632–642
[58] Guyatt G, Walter S, Norman G (1987) Measuring change over time: assessing the usefulness of evaluative instruments. J Chron Dis 40: 171–178
[59] Ware JE, Keller SD (1996) Interpreting general health measures. In: Quality of life and pharmacoeconomics in clinical trials (Spilker B, ed). Philadelphia: Lippincott-Raven, pp 445–460
[60] Lydick EG, Epstein RS (1993) Interpretation of quality of life changes. Qual Life Res 2: 221–226
[61] Guyatt G, Osoba D, Wu A et al (2002) Methods to explain the clinical significance of health status measures. Mayo Clin Proc 77: 371–383