Drug Effectiveness Review Project - Systematic Review Methods and Procedures
Drug Effectiveness Review Project
Systematic Review Methods and Procedures
Revised January 2011

Principal Investigator: Marian McDonagh, PharmD
Oregon Evidence-based Practice Center
Oregon Health & Science University
Mark Helfand, MD, MPH, Director

Copyright © 2011 by Oregon Health & Science University, Portland, Oregon 97239. All rights reserved.
TABLE OF CONTENTS

Introduction ............................................................................ 3
Review Methods .......................................................................... 3
    Conflict of Interest Policy ........................................................ 3
    Drug Effectiveness Review Project Topic Selection Process .......................... 3
    Formulating Key Questions .......................................................... 4
    The Clinical Advisory Group ........................................................ 4
    Searching the Literature and Other Sources of Data ................................. 5
        MEDLINE and other database searches ............................................ 5
        Dossier solicitation ........................................................... 6
        Web resources .................................................................. 6
    Study Selection and Inclusion ...................................................... 7
        Application of study design criteria ........................................... 7
        Cut-off date for new drug inclusion ............................................ 8
        Inclusion of active-control and placebo-controlled trials ...................... 8
            Pooled analyses ............................................................ 9
            Systematic reviews ........................................................ 10
            Single-arm studies: Cohort or open-label extension of a trial ............. 10
            Unpublished studies or data ............................................... 11
        Process for determining study eligibility ..................................... 11
    Quality Assessment of Individual Studies ......................................... 12
    Systematic Reviews ................................................................ 14
    Data Synthesis .................................................................... 16
    Applicability ..................................................................... 16
    Grading the Strength of the Overall Body of Evidence .............................. 17
    Summary Table ..................................................................... 17
    Peer Review and Public Comment .................................................... 18
        Peer review ................................................................... 18
        Public comment ................................................................ 18
    Updating Reports .................................................................. 18
    Single Drug Addendum to Reports ................................................... 19
    Outline of a Typical Drug Effectiveness Review Project Report ..................... 19
References ............................................................................. 20

Tables
Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials ...... 13
Table 2. Strength of evidence grades and definitions ................................... 17

Drug Effectiveness Review Project January 2011 Systematic Review Methods 2 of 21
Introduction

The methodology used by the Evidence-based Practice Centers in producing comparative systematic reviews for the Drug Effectiveness Review Project is described here. The methods follow the principles of “best evidence”, focusing on randomized controlled trials with direct comparisons and health outcomes wherever possible. The methods we use evolve as international methods for evidence review evolve, incorporating newly developed methods, as appropriate, in pursuit of our goal of producing high-quality systematic reviews that meet the needs of the Participating Organizations of the Drug Effectiveness Review Project (see “About DERP” for more information on the Participating Organizations who govern the project).

Review Methods

Conflict of Interest Policy

Drug Effectiveness Review Project investigators and staff comply with a conflict of interest policy that requires a formal, written declaration that they hold no financial interests in any pharmaceutical company for the duration of their work on Drug Effectiveness Review Project projects. Prior to initiating work, all investigators and staff sign a form indicating they have no conflicts of interest. This assurance of an absence of financial interests in pharmaceutical companies is declared annually by any investigator or staff member continuing to work with the Drug Effectiveness Review Project. For clinicians invited to participate in a Clinical Advisory Group, the Center for Evidence-based Policy obtains declarations of conflicts of interest; the policy on these conflicts is discussed in the section on Clinical Advisory Groups.

Drug Effectiveness Review Project Topic Selection Process

When new topics are considered, the Drug Effectiveness Review Project Participating Organizations follow an explicit selection process over a 3-month period.
This process ensures that all organizations participate equally and that topics selected are relevant for the majority of participants. This process is undertaken at various points throughout the 3-year Drug Effectiveness Review Project contract cycle depending on the needs of the Participating Organizations and funds available. When new topics are considered, the Center for Evidence-based Policy solicits topics from each Participating Organization. The initial list is circulated among Participating Organizations, and if there are a large number of potential topics (e.g., 10 or more) a vote is taken to narrow the list down to approximately 5 to receive further consideration. The number of topics chosen for additional work depends on the number of new reports to be initiated, but in general is not more than 5. The original topic submissions include the general scope (drugs and populations) and the reasoning behind the proposed topic. After reviewing the list of topics proposed, the Participating Organizations discuss the pros and cons of each potential topic prior to having the Center for Evidence-based Policy proceed with the production of briefing papers. Briefing papers include original participant submissions, pros and cons, and an overview of available evidence completed by the Oregon Evidence-based Practice Center. For each proposed topic the Oregon Evidence-based Practice Center conducts a search of MEDLINE
using a search strategy designed specifically to identify systematic reviews. The Oregon Evidence-based Practice Center also searches the Websites of the Agency for Healthcare Research and Quality, the Canadian Agency for Drugs and Technologies in Health, the Cochrane Collaboration, Effective Health Care, the National Coordinating Centre for Health Technology Assessment, the National Institute for Clinical Excellence, and the National Health Service Centre for Reviews and Dissemination to identify high-quality systematic reviews relevant to the proposed topics. Additionally, a MEDLINE search for randomized controlled trials pertaining to the new topic is included to estimate the size of the proposed topic.

Formulating Key Questions

Based on the discussion held during topic selection, Key Questions are formulated and serve to define the scope of a Drug Effectiveness Review Project report. In general, the Key Questions follow this template (bracketed elements are filled in for each review):

1. What is the comparative effectiveness of [included drugs] for treatment of [condition] in [population]?
2. What are the comparative harms of [included drugs] for treatment of [condition] in [population]?
3. Does the comparative effectiveness or harms of [included drugs] vary in patient subgroups defined by demographics (age, racial groups, gender, etc.), socioeconomic status, use of other medications, or presence of comorbidities?

The questions are modified to best suit the particular review and can include specific outcomes of focus, such as mortality or symptom relief. Additional questions or sub-questions can be used when they add important nuance. However, the study inclusion criteria are intended to provide the detailed information on the specific drugs and outcome measures included. Draft Key Questions are brought to the Participating Organization group for discussion and comment. A second draft of the Key Questions is then formulated and again discussed with the Participating Organizations. Clinical experts, identified by the Participating Organizations, are consulted via teleconference to provide assistance in refining the Key Questions.
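The scope elements behind this Key Question template (drugs, condition, population, and the optional outcomes and subgroups of focus) can be represented as structured data. The following sketch is purely illustrative: the class and field names are hypothetical and are not part of any Drug Effectiveness Review Project tooling.

```python
from dataclasses import dataclass, field

@dataclass
class KeyQuestionScope:
    """Hypothetical container for the scope elements of a Key Question."""
    drugs: list                # included interventions (proprietary and generic names)
    condition: str             # condition treated
    population: str            # population of interest
    outcomes: list = field(default_factory=list)   # e.g., mortality, symptom relief
    subgroups: list = field(default_factory=list)  # e.g., age, sex, race, comorbidities

    def effectiveness_question(self) -> str:
        # Instantiates template question 1 from the scope elements.
        return (f"What is the comparative effectiveness of {', '.join(self.drugs)} "
                f"for treatment of {self.condition} in {self.population}?")

kq = KeyQuestionScope(
    drugs=["peginterferon alfa-2a", "peginterferon alfa-2b"],
    condition="chronic hepatitis C infection",
    population="adults",
)
print(kq.effectiveness_question())
```

Keeping the scope in one structure mirrors the process described above: the same elements drive the Key Questions, the study inclusion criteria, and later the subgroup analyses.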
Following modifications, this set of draft Key Questions is posted to the Drug Effectiveness Review Project Website for public comment. Public comments and Oregon Evidence-based Practice Center responses are documented in a spreadsheet and discussed with the Participating Organizations, and after any approved modifications the final Key Questions are posted to the Drug Effectiveness Review Project Website.

The Clinical Advisory Group

In general, the purpose of the Clinical Advisory Group is to provide insight and assistance to Drug Effectiveness Review Project researchers and participants by offering clinically relevant counsel throughout the stages of an original review. Currently, advisory groups are utilized for updates of existing Drug Effectiveness Review Project reviews on a case-by-case basis, depending largely on whether a change in scope has occurred between the previous report and the current update.
The Center for Evidence-based Policy identifies potential clinical advisors based initially on suggestions from the Participating Organizations, who are asked to recommend clinical experts who best represent their constituencies and who also have significant recent experience in providing direct patient care. The Center for Evidence-based Policy gathers conflict of interest information from each clinician and coordinates the assembly of a balanced Clinical Advisory Group. The composition of the Clinical Advisory Group and the members' conflict of interest declarations are reviewed and discussed by the Participating Organizations prior to any clinician being contacted. Once the Clinical Advisory Group has been formalized, the Center for Evidence-based Policy arranges a teleconference with the advisors and the Evidence-based Practice Center researchers. In general, consultation with the Clinical Advisory Group focuses on the scope of the initial Key Questions with regard to the relevant aspects of the population, including identification of the most important subgroups, interventions, comparators, outcomes, and study design. Further information regarding clinical experience with certain drug therapies and/or disease-state management may also be discussed during the initial meeting and can arise throughout the review process. For example, to assist with quality assessment, it may be useful to seek guidance from the Clinical Advisory Group in identifying the most important baseline prognostic factors. Additionally, we consult with the Clinical Advisory Group members to determine which subset of outcomes they consider important enough to warrant formal grading of the strength of evidence. Evidence-based Practice Center researchers consider suggestions made by the Clinical Advisory Group members, and any proposed modifications are then discussed with the Participating Organizations.
All Clinical Advisory Group members volunteer their time and expertise and are not monetarily compensated by the Center for Evidence-based Policy, the Drug Effectiveness Review Project, or the Participating Organizations. Clinical Advisory Group members have the option of becoming peer reviewers; however, this is not a required function of the group. Experts who participated in the Clinical Advisory Group are listed on the Drug Effectiveness Review Project Website.

Searching the Literature and Other Sources of Data

MEDLINE and other database searches

Searches for Drug Effectiveness Review Project reports are generally conducted in consultation with a medical librarian. At a minimum, MEDLINE, the Cochrane Central Register of Controlled Trials, the Cochrane Database of Systematic Reviews, and the Database of Abstracts of Reviews of Effects are searched. Other databases (e.g., PsycINFO, CancerLit) may be searched depending on the topic. Search strategies generally combine all included interventions (using proprietary and generic names) and populations; see the example below.

Sample search strategy: Pegylated interferons for hepatitis C infection
(Numbers in parentheses represent the number of studies retrieved)

1  exp Hepatitis C/ or hepatitis C.mp. or hcv.mp. (36716)
2  Pegasys.mp. (50)
3  Peg-intron.mp. (25)
4  peginterferon alfa 2a.mp. (679)
5  peginterferon alfa 2b.mp. (521)
6  Interferon Alfa 2a.mp. or exp Interferon Alfa-2a/ (3015)
7  Interferon Alfa-2b.mp. or exp Interferon Alfa-2b/ (4167)
8  6 or 7 (6677)
9  exp Interferons/ (83358)
10  2a.mp. (21935)
11  2b.mp. (17454)
12  9 and (10 or 11) (7636)
13  exp Polyethylene Glycols/ (26140)
14  pegylat$.mp. (2121)
15  peginterferon$.mp. (967)
16  13 or 14 or 15 (27324)
17  (8 or 12) and 16 (988)
18  2 or 3 or 4 or 5 or 17 (1027)
19  ribavirin.mp. or exp Ribavirin/ (5099)
20  1 and 18 and 19 (697)
21  from 20 keep 1-697 (697)

Databases are searched twice: once at the beginning of the review and again 2 to 3 months prior to submission of the draft report.

Dossier solicitation

The Center for Evidence-based Policy requests dossiers from all pharmaceutical companies that manufacture any drug included in an individual report. Dossiers are intended to provide a complete list of citations for all relevant studies of which the manufacturer is aware. We also request unpublished study information and data, with the understanding that once the report is published the public may obtain the information by requesting a copy of the dossier – in effect making it public. Any dossier marked “confidential” is not accepted. A copy of the most recent product label is also requested. Dossiers are reviewed by Drug Effectiveness Review Project staff for relevant trials or other data that may not have been captured in MEDLINE or Web searches and for unpublished data. An accounting of which companies provided dossiers is included in the Results section of the report.

Web resources

At a minimum, the following Website must be searched for relevant information.
• US Food and Drug Administration Center for Drug Evaluation and Research, Drugs@FDA
  http://www.accessdata.fda.gov/scripts/cder/drugsatfda/
  o This site may be searched by drug name or active ingredient (not drug class) for statistical and medical reviews written by US Food and Drug Administration personnel examining information submitted by pharmaceutical companies to the US Food and Drug Administration for drug approval. However, the Website typically does not have documents related to older drugs and very new drugs. Reviews may be downloaded and hand searched for trials. The Center for Drug Evaluation and Research site also lists any postmarketing study commitments that are conducted after the US Food and Drug Administration has approved a product for marketing (e.g., studies requiring the sponsor to demonstrate clinical benefit of a product following accelerated approval).
  o The Medical and Statistical review documents contain information about trials submitted as part of the New Drug Application and their results. Information contained in the US Food and Drug Administration reviews is typically not adequate to assess trial quality. However, these data are used to verify, or add to, data obtained from published manuscripts of these trials. In addition, the studies submitted to the US Food and Drug Administration are compared with those found in the published literature and with unpublished studies submitted by manufacturers to identify any remaining unpublished studies. The results of the trials reported in the US Food and Drug Administration documents are compared to those reported in published reports of the same studies to identify variation in outcome reporting. A summary of the findings of the search of US Food and Drug Administration documents is included in the Results section of the report.

At the discretion of the Lead Investigator, the following Websites may be searched for relevant information. Other sites may also be included in the search if appropriate:

• ClinicalTrials.gov
  http://www.clinicaltrials.gov/
  o Information on planned and ongoing trials, maintained by the National Institutes of Health.

• Clinical Study Results
  http://www.clinicalstudyresults.org/
  o Sponsored by the Pharmaceutical Research and Manufacturers of America; provides results of clinical studies completed since October 2002, mostly from Phase III and Phase IV studies. It includes a link to the electronic version of the drug label, a bibliography of articles on the drug in question with links to the articles where possible, and a complete summary of each hypothesis-testing trial (regardless of outcome) that has not been published in a peer-reviewed journal.

• Lilly Clinical Trials
  http://www.lillytrials.com/
  o One of several pharmaceutical manufacturers that have established clinical trial registries for their own products.
Trial results are searchable by therapeutic area or product; new and ongoing trials are included.

• Current Controlled Trials
  http://www.controlled-trials.com/
  o Established to promote the exchange of information about ongoing randomized controlled trials worldwide. Allows searching across multiple clinical trial registers, including the National Health Service register in England and United States ClinicalTrials.gov, and provides direct access to BioMed Central.

Study Selection and Inclusion

Application of study design criteria

In order for any study report to be selected for inclusion in a Drug Effectiveness Review Project review, it must meet all eligibility criteria for populations, interventions, outcomes, and study designs, as explicitly specified, a priori, in the Key Questions determined by the Participating Organizations. The reviewers, with the approval of the Participating Organizations, set the study design criteria. For effectiveness outcomes, the starting point for inclusion is controlled clinical trials and good-quality systematic reviews; for outcomes related to harms (adverse events), these same designs are included, as well as cohort studies with a control group and case-control
studies. Within these study designs, direct comparisons (head-to-head studies) are the primary focus of synthesis of the evidence. Determining eligibility of these studies is straightforward and is based on the study reflecting a direct comparison of at least 2 drugs included in the Drug Effectiveness Review Project report, and meeting population and outcome criteria. However, under the tenets of a best evidence approach, inclusion criteria for Drug Effectiveness Review Project reports are written to allow inclusion of placebo-controlled trials, active-control trials, single-arm cohort or open-label extension studies, and meta-analyses not based on results of a systematic review (“pooled analyses”) when necessary to fill gaps in evidence in instances when direct comparisons between drugs have not been made. This may be extended to include situations where direct comparisons are available, but these studies do not report outcomes important to the Drug Effectiveness Review Project Participating Organizations. Determining the eligibility of good-quality systematic reviews can also be complicated and depends on how similar the scope of the review is to the scope of the Drug Effectiveness Review Project report, and how recent the evidence included in the review is. Evidence not meeting study design inclusion criteria may be included in the report if the evidence is clearly identified as not meeting the criteria but is being included as a matter of record. Examples of such information are US Food and Drug Administration MedWatch reports and case series, especially those leading to black box warnings in the product label. These can be reported in the introduction/background section or in the section discussing evidence on adverse events. 
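The a priori eligibility logic described above — every prespecified criterion for population, intervention, outcome, and study design must be met, with a head-to-head comparison requiring at least 2 included drugs — can be sketched roughly as a filter. All names and fields below are hypothetical illustrations, not Drug Effectiveness Review Project software, and the gap-filling rules for placebo-controlled and active-control trials are deliberately not modeled.

```python
# Hypothetical sketch of a priori eligibility screening: a study is included
# only if it meets ALL prespecified criteria. Drug names and dict fields are
# illustrative only; direct (head-to-head) comparisons only are modeled.

INCLUDED_DRUGS = {"drug_a", "drug_b", "drug_c"}
ELIGIBLE_DESIGNS = {"rct", "systematic_review"}                 # effectiveness outcomes
HARMS_DESIGNS = ELIGIBLE_DESIGNS | {"cohort", "case_control"}   # harms outcomes

def is_eligible(study: dict, for_harms: bool = False) -> bool:
    designs = HARMS_DESIGNS if for_harms else ELIGIBLE_DESIGNS
    if study["design"] not in designs:
        return False
    # Direct comparison: at least 2 included drugs among the study arms.
    compared = INCLUDED_DRUGS.intersection(study["arms"])
    return len(compared) >= 2 and study["population_ok"] and study["outcomes_ok"]

study = {"design": "rct", "arms": {"drug_a", "drug_b"},
         "population_ok": True, "outcomes_ok": True}
print(is_eligible(study))  # a head-to-head RCT meeting all criteria → True
```

Note how the acceptable design set widens for harms outcomes, mirroring the distinction drawn above between effectiveness evidence and adverse event evidence.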
Cut-off date for new drug inclusion

If a new drug is introduced to the market, the last date for inclusion of that drug in the report is 15 calendar days after the date the dossiers are due to be submitted by the pharmaceutical manufacturers. Additionally, if the drug has not been approved at the time initial dossier solicitations are sent out, the manufacturer must notify the Center for Evidence-based Policy of the pending approval date and of its intent to submit a dossier prior to the dossier submission deadline. This ensures that every pharmaceutical manufacturer is given a fair chance to submit dossiers related to the newly approved drug.

Inclusion of active-control and placebo-controlled trials

In Drug Effectiveness Review Project reviews, good-quality randomized controlled trials that directly compare different drugs (head-to-head trials) provide the most valid evidence of their comparative effectiveness. However, Drug Effectiveness Review Project reviewers often face instances in which direct comparisons between one or more included drugs have not been studied. Or, even when direct comparative evidence is available, it may be limited in quality, quantity, clinical impact, generalizability, and/or other important elements. Limitations in the quality of direct comparative evidence are determined based on objective assessment of internal validity using predefined criteria. Limitations in the quantity of direct comparative evidence are determined based on consideration of the adequacy of the number of studies and subject sample sizes. Regarding clinical impact, a common limitation of clinical trials in general is their under-reporting of important health outcomes such as quality of life and functional capacity. Likewise, regarding generalizability, clinical trials are often criticized for using narrowly defined populations and for their under-reporting of treatment outcomes in subgroups based on age, sex, race, and common comorbidities.
In cases where such gaps in direct comparative evidence exist, trials that compare included drugs to placebo are considered for their usefulness as a source for qualitative or quantitative indirect comparisons of effectiveness and harms. Inclusion criteria for Drug Effectiveness Review Project reviews are written to allow inclusion of placebo-controlled trials to fill gaps in direct evidence. Judgments regarding what constitutes a gap are made on a case-by-case basis, but are based on the principles of strength of evidence, including the risk of bias, consistency, precision, directness, and applicability of the direct evidence.(1) A description of the rationale for judgments regarding the sufficiency of head-to-head trial evidence and the utilization of placebo-controlled trials is provided in each Drug Effectiveness Review Project report. In updates, as new head-to-head trials emerge that correspond to previously defined gaps in evidence, reviewers should consider removing placebo-controlled trials that may no longer be useful and should revise the description of their rationale accordingly. As with head-to-head trials, the quality of any placebo-controlled trials that contribute data to the synthesis is assessed using the same standardized criteria, and their data are abstracted into evidence tables. On the other hand, for areas of a Drug Effectiveness Review Project review where direct comparative evidence is deemed sufficient, evidence from placebo-controlled trials is included, but data abstraction and quality assessment of those trials is not required. The method of synthesis of evidence from placebo-controlled trials is determined on a case-by-case basis. Pursuit of qualitative or quantitative indirect comparison is never required, and decisions to do so must depend on consideration of clinical, methodological, and statistical heterogeneity across the individual studies.
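When a quantitative indirect comparison is pursued, one widely used approach is the adjusted indirect comparison of Bucher and colleagues, in which two drugs are compared through their common placebo anchor. The sketch below, with made-up numbers, illustrates the arithmetic only; it is not a Drug Effectiveness Review Project implementation.

```python
import math

def bucher_indirect(log_or_a_vs_p, se_a, log_or_b_vs_p, se_b):
    """Adjusted indirect comparison of drug A vs. drug B through a shared placebo arm.

    The indirect log odds ratio is the difference of the two direct log odds
    ratios; its variance is the sum of the two variances.
    """
    log_or_ab = log_or_a_vs_p - log_or_b_vs_p
    se_ab = math.sqrt(se_a ** 2 + se_b ** 2)
    ci_low = math.exp(log_or_ab - 1.96 * se_ab)
    ci_high = math.exp(log_or_ab + 1.96 * se_ab)
    return math.exp(log_or_ab), (ci_low, ci_high)

# Illustrative (made-up) inputs: drug A vs. placebo OR = 2.0, drug B vs. placebo OR = 1.5
or_ab, ci = bucher_indirect(math.log(2.0), 0.20, math.log(1.5), 0.25)
print(round(or_ab, 2))  # indirect OR of A vs. B = 1.33
```

Because the comparison is anchored on the shared placebo arm, randomization is preserved within each trial; the price is wider confidence intervals, since the variances of the two direct estimates add.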
Guidance on methods for quantitative indirect synthesis can be found elsewhere.(2) In many cases, when excess heterogeneity is present, a general discussion of findings from placebo-controlled trials can be useful for identifying which individual drugs have any evidence of effect in the gap areas, compared with those that do not even have basic efficacy data.

Trials that compare one of the drugs included in the review against a drug that is not included are called “active-control” trials in the Drug Effectiveness Review Project, to differentiate them from trials with direct comparisons among the included drugs. These studies are included only in specific, infrequent situations. Where there is no direct evidence, and no or very limited evidence from placebo-controlled trials, evidence from active-control trials may be relevant. However, such indirect comparisons are typically only useful when the comparator (the active-control drug) is the same across the included studies. If there is significant heterogeneity in the comparators, such studies are unlikely to provide good indirect evidence for comparing one included drug to another.

Pooled analyses

A pooled analysis is a meta-analysis of a group of highly selected studies. There is typically no related search strategy to identify the articles (although sometimes a noncomprehensive search is conducted), and no quality assessment of the included trials. Pooled analyses are not systematic in nature. However, there are limited situations where this level of evidence may be useful and admissible in a Drug Effectiveness Review Project report. Similar to the use of placebo-controlled trial evidence, pooled analyses are used to provide evidence where no or insufficient evidence exists. For example, a pooled analysis may present data on subgroups when primary data on these subgroups are not obtainable from the primary sources, or may supplement information on outcomes not reported in the primary sources.
In these cases, the
primary studies have been published or are available to the Drug Effectiveness Review Project authors in sufficient detail to assess the quality of the studies. However, pooled analyses of results already available to the Drug Effectiveness Review Project authors from primary sources are not included. Because pooled analyses do not follow a systematic approach to identifying and assessing the studies, Drug Effectiveness Review Project authors undertake an independent analysis of these studies. If the pooled analysis is the only source of data from component studies (e.g., results of the primary studies included in the meta-analysis are not published), it can be included at the Drug Effectiveness Review Project report author’s discretion, but its limitations are made clear – primarily the limited ability to assess the quality of the component studies.

Systematic reviews

As part of a high-quality approach to evidence review, existing systematic reviews are considered for inclusion in Drug Effectiveness Review Project reports along with other types of evidence. The intention is to include reviews that directly address the Key Questions posed in the report and that meet minimum standards for quality. In order for a review to be considered for inclusion in a Drug Effectiveness Review Project report, the review must meet at least 2 criteria that indicate it is “systematic”. First, the review must include a comprehensive search method for the evidence. This entails searching multiple sources of information (electronic databases, reference lists, hand searching journals, etc.). Second, the review must provide (or at least describe) the search terms used to retrieve the evidence. Other information, such as the reporting of study eligibility criteria and quality assessment, is not required to determine whether a review qualifies as “systematic”, although this information is useful.
The review also must address questions that are similar enough to those posed in the report to provide useful information to the report readers. Reviews that evaluate a “class effect” of drugs grouped together compared with other interventions are unlikely to be useful in a Drug Effectiveness Review Project report. Moreover, reviews that compare only a small proportion of the drugs in a large class (for example, 2 of 7 drugs) may not be useful. The report authors determine the usefulness of the review in the larger context of the drug class. Additional inclusion criteria are determined based on what is known about the underlying evidence base. For example, in an area where there are many existing reviews spanning many years, the authors may choose a cut-off date and examine only the most recent reviews for inclusion, such as those within 2 years of the Drug Effectiveness Review Project search dates. In other areas, this may not be a reliable approach if the underlying literature is older and has not changed in recent years.

Single-arm studies: Cohort or open-label extension of a trial

There are 2 types of studies considered here: observational studies of patients receiving a drug included in the Drug Effectiveness Review Project report with no comparison group that is relevant to the review, and open-label extension studies of a randomized controlled trial. Single-group studies are included under the “best evidence” approach only if the study adds important evidence on harms that is not available from other, higher quality studies. This means that the study must have an exposure duration longer than that of the included trials and that no comparative evidence is available. The minimum duration (e.g., 2 years of follow-up) is
determined a priori based on current knowledge of the drugs' potential adverse events, taking into account the existing evidence from trials. Caveats in using open-label extension studies include that the study population is derived from a clinical trial, where populations are typically highly selected (a narrow set of inclusion criteria), and that patients continuing into the extension are often those who had an adequate response and/or tolerated the drug during the trial period. There are no clearly agreed-upon criteria for evaluating the quality of such studies. Single-group cohort studies are evaluated under the same criteria used to evaluate the quality of cohort studies with a comparison group. It can be difficult to determine the mean or minimum duration of follow-up (exposure to the drug) in these studies, which may make them less useful in evaluating longer-term harms. However, this type of study may include a more broadly defined group of patients than those included in trials and could potentially increase the applicability of the evidence.

Unpublished studies or data

Unpublished studies may be identified through pharmaceutical manufacturer dossiers, US Food and Drug Administration documents, or trial registries. Pharmaceutical manufacturer dossiers may also contain previously unpublished supplemental data from published studies. Unpublished data or studies cannot be submitted by pharmaceutical companies after the dossier process timeline (e.g., not through the public comment process for draft reports).

Unpublished studies

Unpublished studies identified through pharmaceutical manufacturer dossiers, US Food and Drug Administration documents, or trial registries are included only if the study meets the inclusion criteria established in the key questions and sufficient detail is provided to assess study quality.
At a minimum, information must be provided on the comparability of groups at baseline, the number of patients analyzed, whether an intention-to-treat analysis was conducted, and the type of statistical test used. If this information is not present in the dossier submission, the study is not included.

Supplemental data

In situations where additional data are provided regarding a published study, such as additional outcomes or subgroup data not included in the published manuscript, analyses of these data will be included if details of the data analysis are provided. Specifically, the type of statistical test used, the numbers analyzed, and whether an intention-to-treat analysis was conducted must be reported. In general, raw data will not be analyzed by the review team. Additional data on subgroups will only be included for direct, head-to-head comparisons of included drugs, and it is expected that any analyses conducted on the data will be adequately described for reviewer evaluation. Study quality assessment will be based on the fully published study details.

Process for determining study eligibility

Overall, determination of study eligibility is based on reviewer judgment using a 2-step process. In order to reduce potential bias and ensure accuracy and reproducibility, all study reports identified in searches are assessed for eligibility by at least 2 qualified reviewers ("dual review"), and final selection decisions are made using a consensus process. Qualified reviewers are limited
to individuals with adequate training and experience to apply the inclusion criteria with consistency and accuracy. The first step of the study selection process involves assessment of the titles and abstracts identified through literature searches for a preliminary determination of study eligibility. Only study reports with titles and abstracts that are unequivocally ineligible are rejected at this stage. For all other reports, the full-text articles are obtained and read in detail for the second step in the determination of eligibility. Both steps in the study selection process involve "dual review", and it is up to the reviewers to decide which of 2 "dual review" options to use. Where possible, it is desirable for 2 independent reviewers to complete eligibility assessments for each report in duplicate. However, it is also acceptable for a first reviewer to complete eligibility assessments and a second reviewer to check the accuracy of the first reviewer's assessments. If the 2 reviewers are unable to agree, a third party, as senior reviewer, is consulted. Results of eligibility assessments for all screened reports are displayed in a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement-based diagram.(3) The flow diagram depicts the flow of information through the different phases of the systematic review. It maps out the number of records identified, included, and excluded, and the reasons for exclusions. Studies are excluded if they do not meet predetermined inclusion criteria as defined in the Key Questions, for example when the population, intervention, comparator, outcomes, or study design does not meet eligibility requirements. Studies with results presented only in an abstract of a conference proceeding are excluded and are described as not meeting study design criteria. Similarly, systematic reviews are excluded if they are outdated or of poor quality.
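The record accounting behind such a PRISMA flow diagram can be sketched as a simple conservation check. The class, stage names, and counts below are illustrative assumptions, not figures from any Drug Effectiveness Review Project report:

```python
# Minimal sketch of tallying records for a PRISMA-style flow diagram.
from dataclasses import dataclass, field

@dataclass
class PrismaFlow:
    identified: int = 0              # records found by all searches
    duplicates_removed: int = 0
    excluded_at_abstract: int = 0    # rejected at title/abstract screening
    fulltext_assessed: int = 0
    fulltext_excluded: dict = field(default_factory=dict)  # reason -> count
    included: int = 0

    def screened(self):
        return self.identified - self.duplicates_removed

    def check(self):
        # Records must be conserved at each screening stage.
        assert self.screened() == self.excluded_at_abstract + self.fulltext_assessed
        assert self.fulltext_assessed == sum(self.fulltext_excluded.values()) + self.included

flow = PrismaFlow(identified=1200, duplicates_removed=200,
                  excluded_at_abstract=850, fulltext_assessed=150,
                  fulltext_excluded={"wrong population": 40, "wrong study design": 60,
                                     "non-English language": 10, "not retrievable": 5},
                  included=35)
flow.check()  # raises AssertionError if any stage's counts do not add up
```

Tracking exclusion reasons as a dictionary also yields the per-reason counts that the excluded-trials appendix reports.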
Other reasons for study exclusion are publication in a non-English language or failure to retrieve the article after all retrieval efforts were exhausted. Finally, for reader convenience, all Drug Effectiveness Review Project reports contain an Appendix that lists the reasons for exclusion for all individual trials excluded at the full-text level. For updates, this list is cumulative, within a 3- to 5-page limit. If it exceeds that limit, the excluded trials list consists of trials for that specific update only, and readers are directed to the older versions of the reports, available on the Drug Effectiveness Review Project Website, for trials excluded previously.

Quality Assessment of Individual Studies

For trials, we assess internal validity (quality) based on predefined criteria. These criteria are based on those used by the US Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination (United Kingdom).(4, 5) We rate the internal validity of each trial based on the methods used for randomization, allocation concealment, and blinding; the similarity of compared groups at baseline; maintenance of comparable groups; adequate reporting of dropouts, attrition, crossover, adherence, and contamination; loss to follow-up; and the use of intention-to-treat analysis. Trials with a fatal flaw are rated poor quality; trials that meet all criteria are rated good quality; the remainder are rated fair quality. As the fair-quality category is broad, studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only possibly valid. A poor-quality trial is not valid; the results are at least as likely to reflect flaws in the study design as a true difference between the compared drugs. A fatal flaw is reflected by failure to meet combinations of items of the quality assessment checklist.
A particular randomized trial might
receive 2 different ratings, one for effectiveness and another for adverse events. More detailed descriptions of how each checklist item is assessed are presented in Table 1, below.

Table 1. Drug Effectiveness Review Project guidelines to assess quality of trials

1. Was the assignment to the treatment groups really random?
Yes: Use of the term "randomized" alone is not sufficient for a judgment of "Yes". An explicit description of the method for sequence generation must be provided. Adequate approaches include computer-generated random numbers and random numbers tables.
No: Randomization was either not attempted or was based on an inferior approach (e.g., alternation, case record number, birth date, or day of week).
Unclear: Insufficient detail provided to make a judgment of yes or no.

2. Was the treatment allocation concealed?
Yes: Adequate approaches to concealment of randomization: centralized or pharmacy-controlled randomization, serially-numbered identical containers, or an on-site computer-based system with a randomization sequence that is not readable until allocation. Note: If a trial did not use adequate allocation concealment methods, the highest rating it can receive is "Fair".
No: Inferior approaches to concealment of randomization: use of alternation, case record number, birth date, or day of week; open random numbers lists; serially numbered envelopes.
Unclear: No details about allocation methods. A statement that "allocation was concealed" is not sufficient; details must be provided.

3. Were groups similar at baseline in terms of prognostic factors?
Yes: Parallel design: no clinically important differences. Crossover design: comparison of baseline characteristics must be made based on order of randomization. Prognostic factors are important to consider and are discussed a priori with clinical advisory groups. A statistically significant difference does not automatically constitute a clinically important difference.
No: Clinically important differences.
Unclear: Parallel design: statement of "no differences at baseline" but data not reported, data not reported by group, or no mention at all of baseline characteristics. Crossover design: only the baseline characteristics of the overall group reported.

4. Were eligibility criteria specified?
Yes: Eligibility criteria were specified a priori.
No: Criteria not reported, or description of enrolled patients only.

5. Were outcome assessors blinded to treatment allocation?
6. Was the care provider blinded?
7. Was the patient blinded?
Yes: Explicit statement(s) that outcome assessors/care provider/patient were blinded. Double-dummy studies and use of identically appearing treatments are also considered sufficient blinding methods for patients and care providers.
No: No blinding used; open-label.
Unclear, described as double-blind: Study described as double-blind but no details provided on how blinding was carried out or who was specifically blinded.
Not reported: No information about blinding.

8. Did the article include an intention-to-treat analysis or provide the data needed to calculate it (that is, number assigned to each group, number of subjects who finished in each group, and their results)?
Yes: All patients that were randomized were included in the analysis; imputation methods (e.g., last observation carried forward) should be clearly described. OR exclusion of 5% of patients or less is acceptable, given that the reasons for exclusion are not related to outcome (e.g., did not take study medication) and that the exclusions would not be expected to have an important impact on the effect size.
No: Exclusion of greater than 5% of patients from analysis, OR less than 5% with reasons that may affect the outcome (e.g., adverse events, lack of efficacy) or reasons that may be due to bias (e.g., investigator decision).
Unclear: Numbers analyzed are not reported.

9. Did the study maintain comparable groups?
Yes: No attrition, OR the groups analyzed remained similar in terms of their baseline prognostic factors.
No: Groups analyzed had clinically important differences in important baseline prognostic factors.
Unclear: There was attrition, but insufficient information to determine whether the groups analyzed had clinically important differences in important baseline prognostic factors.

10. Were levels of crossovers (≤ 5%), adherence (≤ 20%), and contamination (≤ 5%) acceptable?
Yes: Levels of crossovers, adherence, and contamination were below the specified cut-offs.
No: Levels of crossovers, adherence, and contamination were above the specified cut-offs.
Unclear: Insufficient information provided to determine the levels of crossovers, adherence, and contamination.

11. Was the rate of overall attrition and the difference between groups in attrition within acceptable levels?
Overall attrition: There is no empirical evidence to support establishment of a specific level of attrition that is universally considered "important".
The level of attrition considered important will vary by review and is determined a priori by the review teams. Attrition refers to discontinuation for ANY reason, including loss to follow-up, lack of efficacy, adverse events, investigator decision, protocol violation, consent withdrawal, etc.
Yes: The overall attrition rate was below the level established by the review team.
No: The overall attrition rate was above the level established by the review team.
Unclear: Insufficient information provided to determine the level of attrition.

Differential attrition:
Yes: The absolute difference between groups in the rate of attrition was below 10%.
No: The difference between groups in the overall attrition rate, or in the rate of attrition for a specific reason (e.g., adverse events, protocol violations, etc.), was 10% or more.
Unclear: Insufficient information provided to determine the level of attrition.

Systematic Reviews

Included systematic reviews are rated for quality based on a clear statement of the question(s) the review is intended to answer; reporting of inclusion criteria; the methods used for identifying literature (the search strategy), validity assessment, and synthesis of evidence; and the details provided about included studies. Reviews are categorized as good quality when all criteria are met. Because there are different methods available for assessing the quality of systematic reviews, and none has become the standard, reviewers can use one of the following: AMSTAR,(6-8) Oxman and Guyatt,(9, 10) or the Centre for Reviews and Dissemination criteria (below).(4)
1. Is there a clear review question, and are inclusion/exclusion criteria reported relating to the primary studies?
a. A good-quality review should focus on a well-defined question or set of questions, which ideally refer to the inclusion/exclusion criteria by which decisions are made on whether to include or exclude primary studies. The criteria should relate to the 4 components of study design: indications (patient populations), interventions (drugs), comparators, and outcomes of interest. In addition, details should be reported relating to the process of decision-making, i.e., how many reviewers were involved, whether the studies were examined independently, and how disagreements between reviewers were resolved.

2. Is there evidence of a substantial effort to search for all relevant research?
a. This is usually the case if details of electronic database searches and other identification strategies are given. Details of the search terms used and of any date and language restrictions should be presented. In addition, descriptions of hand-searching, attempts to identify unpublished material, and any contact with authors, industry, and research institutes should be provided. The appropriateness of the database(s) searched by the authors should also be considered; for example, if only MEDLINE is searched for a review looking at health education, it is unlikely that all relevant studies will have been located.

3. Is the validity of included studies adequately assessed?
a. A systematic assessment of the quality of primary studies should include an explanation of the criteria used (e.g., method of randomization, whether outcome assessment was blinded, whether analysis was on an intention-to-treat basis). Authors may use either a published checklist or scale, or one that they have designed specifically for their review. Again, the process relating to the assessment should be reported (i.e., how many reviewers were involved, whether the assessment was independent, and how discrepancies between reviewers were resolved).

4.
Is sufficient detail of the individual studies presented?
a. The review should demonstrate that the studies included are suitable to answer the question posed and that a judgment on the appropriateness of the authors' conclusions can be made. If a paper includes a table giving information on the design and results of the individual studies, or includes a narrative description of the studies within the text, this criterion is usually fulfilled. If relevant, the tables or text should include information on study design, sample size in each study group, patient characteristics, description of interventions, settings, outcome measures, follow-up, drop-out rate (withdrawals), effectiveness results, and adverse events.

5. Are the primary studies summarized appropriately?
a. The authors should attempt to synthesize the results from individual studies. In all cases, there should be a narrative summary of results, which may or may not be accompanied by a quantitative summary (meta-analysis). For reviews that use a meta-analysis, heterogeneity between studies should be assessed using statistical techniques. If heterogeneity is present, the possible reasons (including chance) should be investigated. In addition, the individual evaluations should be weighted in some way (e.g., according to sample size or
inverse of the variance) so that studies that are considered to provide the most reliable data have greater impact on the summary statistic.

Data Synthesis

For the Drug Effectiveness Review Project, evidence tables showing the study characteristics, quality ratings, and results for all included studies are constructed. Studies are reviewed using a hierarchy-of-evidence approach, in which the best evidence is the focus of the synthesis for each question, population, intervention, and outcome addressed. Studies that evaluate one drug against another provide direct evidence of comparative effectiveness and harms. Where possible, these data are the primary focus. Direct comparisons are preferred over indirect comparisons; similarly, effectiveness and long-term or serious adverse event outcomes are preferred to efficacy and short-term tolerability outcomes. In theory, trials that compare a drug with other drug classes or with placebo can also provide evidence about effectiveness. This is known as an indirect comparison and can be difficult to interpret for a number of reasons, primarily heterogeneity of trial populations, interventions, and outcomes assessment across the studies. Data from indirect comparisons are used to support direct comparisons, where they exist, and are used as the primary comparison where no direct comparisons exist. Indirect comparisons are interpreted with caution. Quantitative analyses are conducted using meta-analyses of outcomes reported by a sufficient number of studies that are homogeneous enough that combining their results can be justified. In general, the Drug Effectiveness Review Project follows the guidance on meta-analysis put forth for Evidence-based Practice Centers in the Evidence-based Practice Center Methods Guide.(2) In order to determine whether meta-analysis can be meaningfully performed, we consider the quality of the studies and the heterogeneity among studies in design, patient population, interventions, and outcomes.
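As a rough, non-authoritative sketch of the heterogeneity statistics and random-effects pooling involved in such meta-analyses, the following uses invented effect estimates and assumes the DerSimonian-Laird estimator, one common choice for the between-study variance:

```python
# Sketch: Cochran's Q, I-squared, and a DerSimonian-Laird random-effects
# pooled estimate from per-study effect estimates and standard errors.
import math

def pool_random_effects(effects, ses):
    w = [1 / se**2 for se in ses]                       # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed)**2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    # I-squared: proportion of variation due to heterogeneity, in percent
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance tau^2
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1 / (se**2 + tau2) for se in ses]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return pooled, se_pooled, q, i2, tau2

# Three hypothetical trials reporting log odds ratios with standard errors:
pooled, se, q, i2, tau2 = pool_random_effects([-0.4, -0.1, -0.6], [0.15, 0.20, 0.25])
```

Real analyses would be done with established statistical packages and in consultation with statisticians; the point of the sketch is only to show how Q and I² quantify heterogeneity and how the between-study variance widens the pooled estimate's uncertainty.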
The Q statistic and the I² statistic (the proportion of variation in study estimates due to heterogeneity) are calculated to assess statistical heterogeneity between studies.(11, 12) If significant heterogeneity is shown, potential sources are then examined through analysis of subgroups defined by study design, study quality, patient population, and variation in interventions. Meta-regression models may be used to formally test for differences between subgroups with respect to outcomes.(13, 14) Random-effects models to estimate pooled effects are preferred in most cases, unless a case can be made that a fixed-effect model is more appropriate. Other analyses, including adjusted indirect meta-analysis and mixed treatment effects models (network meta-analysis), are done in consultation with statisticians experienced in conducting these analyses and using the most up-to-date and appropriate methods. When it is determined that it is unwise to pool data from a group of studies, the data are summarized qualitatively. When synthesizing unpublished evidence, reviewers conduct sensitivity analyses where possible to determine whether there is an indication of bias when unpublished data are included. The source of unpublished information is clearly noted in the text of the report, with a statement that the data are unpublished, have not undergone a medical journal's peer review process, and should be interpreted cautiously.

Applicability
An assessment of applicability is undertaken in Drug Effectiveness Review Project reports. The applicability assessment is tailored to the Key Questions and, if possible, to the population to whom the questions are intended to apply. These are defined in advance with the help of the Clinical Advisory Group and the Drug Effectiveness Review Project Participating Organizations. A discussion of applicability appears immediately before the Summary Table in Drug Effectiveness Review Project reports.

Grading the Strength of the Overall Body of Evidence

Strength of evidence is assessed for the main outcomes of each Key Question, as determined by the Participating Organizations and with input from the Clinical Advisory Group, and generally follows the method used by the Evidence-based Practice Center program.(1) Individual lead investigators may instead choose to use the GRADE approach to grading the strength of evidence if they are more familiar with that system.(15-17) In either system, the main domains considered in assessing the strength of a body of evidence for a given outcome are the risk of bias of the included studies, the directness of the studies in measuring the outcome and comparison in question, and the consistency and precision of the results of the studies. Poor-quality studies do not contribute to the assessment of overall risk of bias for a body of evidence because their results are not synthesized with the fair- and good-quality study results. After assessing each of these items for the group of studies, an overall assessment is made. The evidence can be described as low, moderate, or high strength of evidence. In addition, when there is either no evidence available or the evidence is too limited or too indirect to support conclusions about comparative effectiveness, the evidence can be described as insufficient to determine the strength of evidence. The Evidence-based Practice Center definitions of these terms are listed in Table 2, below.
The tables used to assess each outcome are presented in an Appendix. A summary grade of the strength of evidence can be included in the Summary Table, and a paragraph describing the strength of evidence for each main outcome can also be used. Tables showing the grading of individual domains for each outcome assessed are included as an appendix in Drug Effectiveness Review Project reports; overall assessments (low, moderate, high, or insufficient strength of evidence) are included as part of the Summary Table found at the end of each report.

Table 2. Strength of evidence grades and definitions

High: High confidence that the evidence reflects the true effect. Further research is very unlikely to change our confidence in the estimate of effect.
Moderate: Moderate confidence that the evidence reflects the true effect. Further research may change our confidence in the estimate of effect and may change the estimate.
Low: Low confidence that the evidence reflects the true effect. Further research is likely to change our confidence in the estimate of effect and is likely to change the estimate.
Insufficient: Evidence either is unavailable or does not permit estimation of an effect.
Source: Owens et al, 2009(18)

Summary Table