The Canadian Task Force on Preventive Health Care (CTFPHC) was reestablished in 2010 with a mandate to develop and disseminate clinical practice guidelines (CPGs) for primary and preventive care. The CTFPHC uses the GRADE (grading of recommendations, assessment, development, and evaluation) system to rate the quality of its evidence and the strength of its recommendation statements. The GRADE system provides a structured and transparent process for guideline development that begins at framing key questions and proceeds through the evaluation of evidence for benefits and harms, as well as incorporation of patient preferences and resource implications, to arrive at recommendations. This article outlines key concepts of the GRADE process to assist primary care practitioners in understanding the GRADE recommendations and discussing these recommendations with patients.
Background
Family physicians and other primary care health professionals often seek guidance from CPGs about how to better manage their patients. Family physicians are confronted with a bewildering array of CPGs developed by a large variety of government agencies and professional organizations. It is estimated that there are currently at least 2400 guidelines in the Agency for Healthcare Research and Quality’s National Guideline Clearinghouse1; more than 6400 guidelines in the database of the Guidelines International Network2; and more than 2700 in the Canadian Medical Association’s CPG database.3 Each database includes multiple guidelines on the same topic, often with conflicting recommendations.4,5 Recently, there has also been increasing concern about the quality of CPGs owing to potential bias on the part of the guideline developers6,7 or the quality of the evidence used to develop the CPGs.5,7–9 For family physicians, these issues raise concerns about the validity of the recommendations and create confusion over which to apply in practice.
Family physicians are also confronted with a diverse range of systems used in CPGs to rate the quality of scientific evidence and the strength of recommendations. These different rating systems make it difficult for family physicians to understand and effectively communicate the benefits and harms of the practices recommended in CPGs to their patients. In 1979, the Canadian Task Force on the Periodic Health Examination published one of the first systems to explicitly characterize the quality of evidence and strength of recommendations.10 This system ranked quality of evidence from I to III and classified the strength of recommendations from A to E. Although widely adopted because of its simplicity, this system did not provide a detailed quality assessment of the evidence for benefits and harms important to patients, nor did it explicitly consider the benefits versus possible harms in the strength of recommendations.11 By 2002, at least 121 different systems had been developed that were used in publications, systematic reviews, and CPGs.12 More recently, government and professional organizations tasked with the development of CPGs in Canada, the United States, Australia, and Europe have developed and implemented a variety of systems to evaluate the quality of evidence and rate the strength of recommendations for CPGs.13–15 Many of these systems use different letters, numbers, or symbols to communicate similar recommendations on specific health issues, and often the same letter or number has different meanings in the various systems.16
Why GRADE?
To overcome the problems related to the inconsistent rating of evidence and the confusion with different rating systems, an international group of health professionals, researchers, and guideline developers created the GRADE system in 2004.17 The GRADE system rates the quality of evidence and grades the strength of recommendations in systematic reviews, health technology assessments, and CPGs. The GRADE system is structured and transparent. It is designed for systematic reviews (eg, Cochrane systematic reviews) and guidelines that examine alternative management strategies or interventions, which might include no intervention or current best practice.18 The GRADE system also informs clinician and patient decision making in clinical practice settings and supports production of informed health policy. It is now used or endorsed by at least 70 different organizations throughout the world, including the World Health Organization, UpToDate, and the Cochrane Collaboration.19
The CTFPHC was reestablished with a mandate to develop and disseminate CPGs for primary and preventive care based on systematic analysis of scientific evidence.20 The CTFPHC guidelines address primary or secondary prevention of conditions with a substantial health burden; topics are selected based on literature review and input from practitioners and the public.
How do I interpret GRADE recommendations?
Many family physicians and primary care health professionals (who are the target audience of the CTFPHC guidelines) are potentially unfamiliar with GRADE processes and therefore might be unsure of how to interpret the potential benefits and harms of practices recommended by the CTFPHC. This article outlines key concepts of the GRADE process using examples from the recently published CTFPHC guidelines on breast cancer screening21 to assist primary care practitioners in understanding the GRADE recommendations and discussing these recommendations with patients.
Although family physicians and other primary health care providers need not be aware of all the steps and processes involved in the development of CPGs using the GRADE methodology, consideration of several key elements in the GRADE guideline development process will ensure an overall understanding of the quality of evidence and strength of recommendations provided by this system. These elements include an understanding of the analytic framework and methods used in the literature review, the summaries of evidence tables, and the GRADE recommendations and how they can inform physician-patient decision making in clinical practice. More complete and detailed descriptions of the GRADE process for guideline developers and authors of systematic reviews have recently been published (www.gradeworkinggroup.org/publications/index.htm).
An overview of the CTFPHC guideline development process that highlights these key elements of GRADE is presented in Figure 1.
Figure 1. Steps in the CTFPHC guideline development process
CTFPHC: Canadian Task Force on Preventive Health Care.
*Highlighted steps are discussed in the paper.
Does the guideline apply to my patient?
For practising family physicians and other primary health care practitioners, the analytic framework is important because it shows the patient populations to which the guideline recommendations apply. This framework also identifies issues that were included or excluded from consideration in guideline development. The analytic framework and key questions provide the foundation for the literature review and guideline recommendation. This framework consists of a flow diagram with key questions and contextual questions. Key questions are those of main importance to clinicians and patients; they define the scope and focus of the evidence reviews. The contextual questions provide further information about how to interpret and apply the recommendations in our diverse Canadian settings; they also provide information about values and preferences, cost-effectiveness, and process and outcome indicators. Key questions are answered with a full systematic review, while for contextual questions a review of key studies and other systematic reviews is performed only for literature published in the past 5 years.
In the development of the analytic framework, guideline developers define the patient population, the intervention of interest, the comparator, and the outcome of interest. The process is also known as PICO (patient, intervention, comparator, outcome) and is now a widely accepted standard for development of guidelines and systematic reviews. An example of an analytic framework and key questions is shown in Figure 2 and Box 1.
Figure 2. Example of an analytical framework using the breast cancer screening evidence review
BSE: breast self-examination, CBE: clinical breast examination, MRI: magnetic resonance imaging.
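The PICO elements described above can be captured in a small data structure. The following Python sketch is illustrative only: the class name `PicoQuestion`, its `as_sentence` method, and the wording of the example question are assumptions chosen to match the mammography example used in this article, not the CTFPHC's actual key questions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """One key question framed as patient, intervention, comparator, outcome."""
    population: str
    intervention: str
    comparator: str
    outcome: str

    def as_sentence(self) -> str:
        # Render the four PICO elements as a single answerable question.
        return (
            f"In {self.population}, does {self.intervention} "
            f"compared with {self.comparator} change {self.outcome}?"
        )


# Hypothetical wording, for illustration only.
question = PicoQuestion(
    population="average-risk women aged 40 to 49 years",
    intervention="screening mammography",
    comparator="no screening",
    outcome="breast cancer mortality",
)
print(question.as_sentence())
```

Writing the question this way makes it easy to check whether a given patient falls inside the population the guideline was written for.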
How good is the evidence?
In GRADE, the quality of evidence is rated on a 4-point scale of high, moderate, low, or very low, depending on the certainty that the results reflect the true effect of the intervention on the outcome (Table 1).22 Evidence is graded as high quality when the CTFPHC has high confidence that the true effect of an intervention or approach lies close to the estimate of effect, while lower-quality evidence indicates that the true effect might be substantially different from the estimate of effect.22 The GRADE system considers several factors in determining the quality of the evidence. As a starting point, evidence from randomized controlled trials begins as high-quality evidence, while evidence from observational studies begins as low-quality evidence. Evidence can then be downgraded or upgraded. It is downgraded based on consideration of 5 factors: risk of bias, inconsistency, indirectness, imprecision, and publication bias. It can be upgraded based on 3 factors: large effect, dose response, and consideration of all possible confounders (Tables 2 and 3).23–29
Table 1. Interpretation of evidence levels used to GRADE CTFPHC recommendations
Table 2. Factors that lead to decreasing evidence quality in the GRADE framework
Table 3. Factors that lead to increasing evidence quality in the GRADE framework
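The starting levels and the downgrading and upgrading factors described above can be sketched as a simple rule. The Python below is a minimal illustration, not the CTFPHC's procedure: the function name `rate_quality`, the design labels `"rct"` and `"observational"`, and the idea of counting each concern as a whole number of levels (then clamping to the 4-point scale) are all assumptions. In practice, guideline panels apply judgment rather than arithmetic.

```python
# The 4-point GRADE scale, ordered from lowest to highest quality.
LEVELS = ("very low", "low", "moderate", "high")

# Factors named in the text; the labels themselves are illustrative.
DOWNGRADE_FACTORS = frozenset(
    {"risk of bias", "inconsistency", "indirectness", "imprecision", "publication bias"}
)
UPGRADE_FACTORS = frozenset(
    {"large effect", "dose response", "all possible confounders considered"}
)

# Starting level by study design, as described in the text.
START = {"rct": "high", "observational": "low"}


def rate_quality(design, downgrades=None, upgrades=None):
    """Return a GRADE-style quality level.

    downgrades and upgrades map a factor name to the number of levels moved
    (an assumption of this sketch).
    """
    if design not in START:
        raise ValueError(f"unknown study design: {design!r}")
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    score = LEVELS.index(START[design])
    score -= sum(downgrades.values())
    score += sum(upgrades.values())
    # Clamp so the result always stays on the 4-point scale.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


# Randomized trials with one serious concern about risk of bias.
print(rate_quality("rct", downgrades={"risk of bias": 1}))
# Observational studies with no further adjustments.
print(rate_quality("observational"))
```

The first call mirrors the kind of judgment described for trials with a serious risk of bias; the value it returns is the output of this sketch, not a restatement of any rating published by the CTFPHC.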
For example, the evidence supporting the use of hormone replacement therapy for postmenopausal women in the early 1990s would have received a low-quality rating in the GRADE system because it was based on inconsistent observational studies.30 Such a rating means that further research is very likely to have an important effect on the confidence in the estimate of effect and is likely to change the estimate. In fact, further research did show increased cardiovascular harms with hormone replacement, and this evidence ultimately reversed the recommendation for hormone replacement therapy. In summary, the GRADE system attempts to improve the estimate of the certainty of effects, thus providing clinicians and patients with more precise information on which to base their decisions.
Box 1. Example of key questions using the breast cancer screening evidence review
Key questions
BSE—breast self-examination, CBE—clinical breast examination, MRI—magnetic resonance imaging.
The GRADE evidence tables
The GRADE system has developed specific approaches for the presentation of the results of the systematic literature reviews based on the analytic framework. The CTFPHC uses the GRADE evidence profile to present its results. The evidence profile table summarizes the size of the study population, the effect of the intervention, and the quality of the evidence. Table 4 provides an example of an evidence profile developed for the CTFPHC guideline on screening for breast cancer in women aged 40 to 49 years.4
Evidence summary of benefits associated with screening mammography: The content of the evidence profile table is provided in 13 standardized columns. The first column provides information about the number of studies and the study design used to determine the effectiveness of screening mammography for women in this age range (N = 8 RCTs). Columns 2 to 7 provide an assessment of the quality of these studies. Footnotes provide further explanations as required. For instance, in column 3 (risk of bias) we indicate a serious concern about the potential risk of bias in the studies. This is based on the fact that only 3 of the 8 trials were considered truly randomized; in 5 of the trials it was not clear if investigators were blinded to the groups to which the patients were assigned or whether those enrolling patients were aware of which group patients were being assigned to. There were no other concerns about quality: results of all trials were consistent, the patients and the interventions were similar to the patients that we were studying, the sample sizes were large, the CIs were narrow, and there was no evidence of publication bias. Columns 8 to 11 in the table present the summary of our meta-analysis to determine the overall effectiveness of mammography screening in women aged 40 to 49 years. The numbers of deaths seen in the control and experimental groups are provided in columns 8 and 9. In columns 10 and 11, the estimates of the relative and absolute risk reductions that can be attributed to screening mammography are provided. Relative risk is used to compare risks between 2 different groups of people, often those who were exposed to an intervention and those who were not. Meta-analysis of mammography screening studies with women aged 40 to 49 years found a 15% reduction in the risk of death from breast cancer (equivalent to an RR of 0.85) for women who were screened compared with women who were not screened.
Absolute risk focuses on an individual’s risk of getting a disease in a specific period of time and can be expressed as a percentage or a rate (eg, 10% or 1 in 10). In this example, this means that, as a result of screening, 474 fewer women per million (or approximately 1 in 2100) will die of breast cancer. Column 12 provides an overall rating of the quality. Column 13 highlights the importance of the results.
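The relationship between the relative and absolute figures quoted above can be checked with a few lines of arithmetic. The Python below is a sketch using only the two numbers given in the text (an RR of 0.85 and 474 fewer deaths per million); the function names are illustrative. The implied baseline risk is not reported in the text and is derived here under the assumption that the absolute reduction equals the baseline risk multiplied by the relative risk reduction.

```python
def relative_risk_reduction(rr):
    """Relative risk reduction implied by a relative risk (eg, 0.85 gives 0.15)."""
    return 1.0 - rr


def number_needed_to_screen(arr):
    """Number of people screened to prevent 1 death, given the absolute risk reduction."""
    return 1.0 / arr


rr = 0.85                # relative risk from the meta-analysis
arr = 474 / 1_000_000    # absolute risk reduction: 474 fewer deaths per million

rrr = relative_risk_reduction(rr)
nns = number_needed_to_screen(arr)

# Assumption of this sketch: ARR = baseline risk x RRR, so baseline = ARR / RRR.
implied_baseline = arr / rrr

print(f"Relative risk reduction: {rrr:.0%}")            # 15%
print(f"Number needed to screen: about {nns:.0f}")      # about 2110
print(f"Implied baseline risk: about {implied_baseline * 1_000_000:.0f} per million")
```

The same 15% relative reduction would correspond to a much larger or smaller absolute benefit in a population with a different baseline risk, which is why the evidence profile reports both.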
How does GRADE translate evidence into recommendations?
In GRADE, the assessment of the quality of evidence and the strength of recommendations are separate. At present, GRADE recommendations are reported as either strong or weak. In addition to quality of evidence, GRADE also explicitly considers the balance between the benefits and harms, the values and preferences of patients, and the resource implications of an intervention in the determination of the strength of recommendations. While the quality of evidence and the balance between the benefits and harms are considered by the CTFPHC to be the most important elements, guideline developers might choose to place some or limited emphasis on resource implications and might have limited data on the values and preferences of patients for specific interventions.
Strong recommendations are more likely when there is a large difference between the benefits and harms and certainty around that difference, when there is greater certainty or similarity in values and preferences, and when the evidence quality is higher.31 Weak recommendations indicate that greater uncertainty exists (Figure 3).29 Strong recommendations can be made even with low-quality evidence, assuming that the balance between benefits and harms is clear and values and preferences are consistent, while weak recommendations can be made based on high-quality evidence. As an example, although only anecdotal evidence (low quality) suggests that parachutes are an effective intervention to reduce morbidity and mortality associated with jumping from an airplane,32 the recommendation to use a parachute would be classified as strong.
Figure 3. Balancing the benefits and harms to determine the strength of a recommendation
Based on a concept presented by Santesso and Gauld.29
When the CTFPHC makes strong recommendations, clinicians can interpret this to mean that most individuals should receive the intervention in question, while for weak recommendations the focus shifts to helping patients make informed decisions, taking into account the benefits and harms, as well as their individual values and preferences. With weak recommendations, clinicians must recognize that different choices will be appropriate for different patients. For example, a weak recommendation against mammography in average-risk women aged 40 to 49 years implies that (although most women of this age would not choose to be screened) regular screening could be appropriate in a 40- to 49-year-old woman who places a relatively higher value on preventing death from breast cancer and a relatively lower value on avoiding unnecessary tests and procedures. Similarly, a weak recommendation for mammography in average-risk women aged 50 to 74 years implies that screening would not necessarily be required or appropriate in a woman of this age who places a relatively lower value on preventing death from breast cancer and a relatively higher value on avoiding unnecessary tests and procedures.
How can GRADE recommendations inform patient-physician decision making?
Effective communication between physicians and patients is a key concept of family medicine and has been associated with improved clinical outcomes for patients with a variety of health conditions.33 Guideline developers are faced with the challenge of providing easily understood information on benefits and harms of recommendations to inform the discussion between physician and patient and assist in the decision-making process. Patients with the same condition or risk factors might have quite different values or preferences, life circumstances, or access to medical care. Understanding these differences along with the benefits and harms of the guideline recommendations would support shared decision making by the patient and physician.
To support informed physician-patient decision making related to their guidelines, the CTFPHC has developed several tools in a collaborative manner with researchers and knowledge translation experts in Canada. These tools also undergo a process of internal and external peer review and user testing with patients and physicians to ensure that the scientific information is correct and presented in an easily understood format. Tools that have been developed and tested for the CTFPHC guideline on screening with mammography include a video illustrating a doctor-patient interaction about screening, a list of frequently asked questions on breast cancer screening for patients, a flowchart to help women gauge whether screening is right for them, and decision aids that present risks and benefits in ways that patients can understand. An example of a tool developed to assist in decision making for screening with mammography for breast cancer for women between the ages of 40 and 49 is shown in Figure 4.* These tools are available on the CTFPHC website (http://canadiantaskforce.ca).
Family physicians and GRADE
The GRADE system provides a rigorous approach to the development of CPGs that is increasingly being used by many professional and government organizations throughout the world. Family physicians need to be able to appreciate the benefits and harms and the certainty of evidence behind clinical recommendations. The use of the GRADE methodology by the developers of CPGs and systematic reviews can provide family physicians and other primary care health professionals with a guide-post of high quality for CPGs and systematic reviews. With the increasing use of GRADE, family physicians and other primary care health professionals should become familiar with the GRADE approach to assessment of the quality of evidence and the strength of recommendations so that they can effectively use CPGs and systematic reviews developed by this approach when making decisions with their patients.
Footnotes
Competing interests
None declared
* Figure 4 is available at www.cfp.ca. Go to the full text of this article online, then click on CFPlus in the menu at the top right-hand side of the page.
Copyright© the College of Family Physicians of Canada