In family medicine, time is of the essence. Family physicians must make decisions quickly, while still retaining a scientific approach and communicating with our patients to reach mutually agreeable solutions. For much of our work, we use a standard set of approaches—a “mindline” that enables a routine.1,2 Each day, we also encounter situations for which we do not have a mindline and then must check some advisory source. It is not possible for front-line family physicians to appraise the primary research for every situation; we must use research summarized by others into material we can look up quickly.3 Many clinical decisions can be informed by recommendations from practice guidelines.
Guidelines include systematically developed statements to assist practitioner and patient decisions about appropriate health care choices for specific clinical circumstances.2 On any given topic, there might be several guidelines with vastly different recommendations, even though they are supposedly based on the same evidence. How can this be? And how do we choose the “right” guidelines for our practice?
To choose well, it is essential to understand the process of guideline production; Schünemann et al4 describe its components. The process of the Canadian Task Force on Preventive Health Care (CTFPHC) is outlined in Box 1.5 The CTFPHC is a group of volunteer experts in primary care practice and critical appraisal methods. They do not have specific expertise in each topic they work on, but act as a “jury” to judge the evidence from experts. The CTFPHC establishes a question and then contracts an Evidence Review and Synthesis Centre to conduct a systematic review, aided by clinical experts. The CTFPHC considers this review when making its recommendations. At several stages, peers and experts provide their external opinions. It is a human process and is subject to frailties of judgment, but having a formal process with checks and balances that is open to external view reduces the chance of human error.
Box 1. The CTFPHC guideline development process
Topic selection
▸ Scoping exercise: what has already been done?
Topic refinement
▸ Develop scope and PICO approach
▸ Decide outcomes and harms
▸ Draft questions
Protocol development
▸ Rank outcomes and harms
▸ Define questions
▸ Draft protocol
▸ Peer review of protocol is done (by external experts)
▸ Finalize protocol
▸ Register protocol with PROSPERO
▸ Post protocol online
Systematic review
▸ Undertake review (by an Evidence Review and Synthesis Centre)
▸ Produce draft of systematic review
▸ Working group and external experts comment on systematic review
Assign evidence ratings
▸ Assign GRADE ratings of evidence
▸ Solicit comments from task force and external experts
▸ Respond to comments
Draft guidelines written
▸ Guidelines are peer reviewed
▸ Respond to comments and modify if necessary
Report for publication written
▸ Knowledge translation tools developed
▸ Submitted for publication
▸ Journal performs editorial and peer review
▸ Respond to comments and submit final version
Publication
CTFPHC—Canadian Task Force on Preventive Health Care; GRADE—Grading of Recommendations Assessment, Development and Evaluation; PICO—patient or population, intervention, comparison, and outcome.
Adapted from the CTFPHC.5
Describing the evidence
Every guideline expresses its decisions using a scale, and understanding the various scales is important. Table 1 shows examples of scales used by the CTFPHC and the United States Preventive Services Task Force over time. Initially, they both used a 5-point A-to-E Likert scale. Then the negative recommendations were combined into a single category, and the I recommendation was added for situations in which evidence is inadequate. The CTFPHC now uses the GRADE (Grading of Recommendations Assessment, Development and Evaluation) system, developed by an international group led by Canadians.6 The CTFPHC has chosen to use a version with only 4 points. Notably, there is no middle category, forcing guideline developers to decide for or against. When benefit is absent, small, uncertain, or not likely to outweigh harms, then the GRADE system recommends against the activity.
Originally, guidelines focused primarily on the type of evidence supporting a recommendation and whether that evidence showed the intervention had any effect. Subsequently, guideline developers differentiated the quality of the evidence from the size of the effect. Higher-quality evidence with greater precision provides greater trust in the estimate of the effect. Thus, high-quality evidence might show that an intervention has a very small effect and is, therefore, not worth doing. More recently, we in the medical profession have better recognized that our actions, while intended to do good, also create harms. Therefore, high-quality guidelines now assess the quality of the evidence to judge the presence and size of the benefit, measure the severity of harms, and then reach a conclusion that balances the two. Ultimately, the information should be shared with patients so that we understand their perspectives and can help them make decisions that are right for them. We have written about this process for preventive care,7,8 but it applies at each stage of illness care as well. Different guideline developers use varying systems, and readers must understand the meaning of the systems to interpret recommendations properly.
In the rush of daily practice, we must make binary (yes or no) decisions: to do something or nothing, to screen or not screen, to treat or not treat. Yet most medical data are continuous, with variation in risk as the measurement changes. For example, the risk of disease often increases with age, as does the likelihood of finding treatable disease, but the potential benefit of extending life might be smaller for those with fewer years left to live. The risk of harm often increases substantially with age, so that at a certain point the potential for harm is greater than any benefit. Thus, screening for diabetes and treating it might be worthwhile in middle age. However, a 90-year-old person whose hemoglobin A1c level is slightly above the diagnostic threshold need not worry, as the complications of diabetes take a long time to appear. Drug treatment might cause adverse effects, with no benefit. This balance must not be lost for the sake of writing simple rules.
How to evaluate guidelines
Understanding how guidelines are developed helps us critically appraise those guidelines we encounter in practice. The Appraisal of Guidelines for Research and Evaluation (AGREE) instrument was developed by a Canadian-based group to assist users with assessing variability in guideline quality.9 Table 2 lists the headings for the AGREE II approaches.9 These headings appear overwhelming, but not all of them carry equal weight. Their value varies from topic to topic, and we indicate a rough measure of their importance. Three are critical: rigour of development, minimization of preconceptions and bias, and the way the evidence is used to develop recommendations. A guideline that is written for primary care, is focused on the components of health issues we deal with in primary care, and was created under the supervision of a group based in primary care is often more useful for our needs than one that covers the whole depth of a problem and was written by other specialists who are mostly focused on the care of advanced and complex cases.
Recently, a group of family medicine–based researchers published the Guideline Trustworthiness, Relevance, and Utility Scoring Tool (G-TRUST) approach to classifying the value of guidelines (Box 2).10 This tool considers threats to relevance, to evidence, and to interpretation. If a guideline is not relevant, we need go no further. If the evidence is not considered properly or there are threats to unbiased interpretation, we can reject the guideline. This enables quick focus on a smaller number of guidelines that are likely to be good. During a resident seminar in 2015, 4 groups of residents were each asked to assess a different guideline on prostate cancer screening using the AGREE II approach. Table 3 shows their findings.9,11–14 Had the G-TRUST approach been available at the time, the residents would likely have been even more discriminating.
Box 2. G-TRUST approach to classifying the value of guidelines
Relevance threats
1. The patient populations and conditions are relevant to my clinical setting
2. The recommendations are clear and actionable
3. The recommendations focus on improving patient-oriented outcomes, explicitly comparing benefits versus harms to support clinical decision making
Evidence threats
4. The guidelines are based on a systematic review of the research data
5. The recommendation statements important to you are based on graded evidence and include a description of the quality (eg, strong, weak) of the evidence
6. The guideline development includes a research analyst, such as a statistician or epidemiologist
Interpretation threats
7. The chair of the guideline development committee and a majority of the rest of the committee are free of declared financial conflicts of interest, and the guideline development group did not receive industry funding for developing the guideline
8. The guideline development includes members from the most relevant specialties and includes other key stakeholders, such as patients, payer organizations, and public health entities, when applicable
G-TRUST—Guideline Trustworthiness, Relevance, and Utility Scoring Tool.
Reproduced with permission from Shaughnessy AF, Vaswani A, Andrews BK, Erlich DR, D’Amico F, Lexchin J, et al. Developing a clinician friendly tool to identify useful clinical practice guidelines: G-TRUST. Ann Fam Med 2017;15(5):413–8.10 Copyright American Academy of Family Physicians, 2017. All rights reserved.
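The triage described in the text (stop if a guideline is not relevant, then reject it if the evidence or its interpretation is under threat) can be sketched in a few lines of Python. This is a minimal illustration only, not the published G-TRUST scoring: the names `Threat`, `ITEMS`, and `triage` are hypothetical, the item wording is abbreviated from the box above, and the rule that a single unmet item is enough to set a guideline aside is an assumption made to keep the sketch simple.

```python
from enum import Enum


class Threat(Enum):
    RELEVANCE = "relevance"
    EVIDENCE = "evidence"
    INTERPRETATION = "interpretation"


# The eight G-TRUST items, keyed by item number and grouped by threat type.
# The wording is abbreviated from the box above.
ITEMS = {
    1: (Threat.RELEVANCE, "populations and conditions relevant to my setting"),
    2: (Threat.RELEVANCE, "recommendations clear and actionable"),
    3: (Threat.RELEVANCE, "patient-oriented outcomes, benefits weighed against harms"),
    4: (Threat.EVIDENCE, "based on a systematic review"),
    5: (Threat.EVIDENCE, "key recommendations rest on graded evidence"),
    6: (Threat.EVIDENCE, "research analyst involved in development"),
    7: (Threat.INTERPRETATION, "chair and majority free of financial conflicts, no industry funding"),
    8: (Threat.INTERPRETATION, "relevant specialties and stakeholders included"),
}


def triage(answers):
    """Return (keep, reason) for one guideline.

    `answers` maps each item number (1 to 8) to True if the item is met.
    Threat groups are checked in order, stopping at the first unmet group.
    Assumption: a single unmet item is enough to set the guideline aside.
    """
    missing = sorted(set(ITEMS) - set(answers))
    if missing:
        raise ValueError(f"unanswered items: {missing}")
    for threat in Threat:  # Enum iterates in definition order
        unmet = [n for n, (t, _) in ITEMS.items() if t is threat and not answers[n]]
        if unmet:
            items = ", ".join(str(n) for n in unmet)
            return False, f"{threat.value} threat: items {items}"
    return True, "no threats flagged"


if __name__ == "__main__":
    # Hypothetical appraisal: relevant, but not based on a systematic review.
    example = {n: True for n in ITEMS}
    example[4] = False
    print(triage(example))
```

Checking relevance first mirrors the advice that an irrelevant guideline needs no further appraisal; in practice a reader might tolerate some unmet items rather than apply the strict cut-off assumed here.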
Conflicts of interest
Guidelines written by disease advocacy groups are common. While the concerns of these groups are understandable, too often their guidelines are sponsored by commercial interests, with varying degrees of concealment. The evidence is strong that sponsored research studies and guidelines are nearly always biased toward the sponsor’s preferences when compared with work on the same topic that has no such sponsorship.15,16 Guideline groups that do not base their recommendations on a high-quality systematic review, or guidelines written by experts who already have beliefs about the topic, are at risk of allowing preconceptions to supersede the evidence.17,18 Some guideline committees include token representation from family physicians, who might not have much skill in research and critical appraisal. Such representatives are placed in a difficult position: it is hard for them to push back against disease “experts” who are promoting their own agendas.
As an illustrative example, Box 3 shows the process used to create the recent “Men’s Health Guidelines for Family Medicine,”19 widely distributed by Acerus Pharma, which sells hormone products. The panel of 50 names listed is stated to have
[a] balanced composition ... (ie, specialists, family physicians, nurse practitioners and pharmacists) [that] is instrumental in limiting the many types of bias, including competing or conflicting interests. This approach obviates the need to have the panel members declare any existing or potential competing interests. For further information on competing interests contact cssam.ca or the individual panel members directly.19
Box 3. Development process of the Men’s Health Guidelines for Family Medicine
The guidelines were developed as follows:
▸ Extensive review of existing literature and guidelines
▸ Preparation of a working document for review by the Men’s Health Review Panel
▸ Redraft of guidelines circulated for peer review and feedback
▸ Comments presented to the Men’s Health Review Panel
▸ Guidelines redrafted with modifications
▸ Guidelines published and made available to health practitioners
Data from the Men’s Health Review Panel.19
Eleven members of the guideline committee are urologists and 20 are family physicians, but it is unclear how involved many of them were in considering the evidence. The website that is said to contain their conflict of interest statements is not accessible to non-members. While a round of discussions took place, the final document was not reviewed and approved by all members of the advisory board whose names are listed. These features make it difficult to accept their guideline recommendations.
Conclusion
The process of assessing guidelines is complex and might seem overwhelming; however, after you have used the AGREE II or G-TRUST process once or twice, it becomes easier. As in any chain of reasoning, each weak link in a guideline weakens the whole structure. Readers should focus on the critical issues: Does this guideline help us solve the problems that we face in our practice? Are the recommendations based on an open-minded and transparent process of assessing the evidence? Guidelines are often most useful when they address controversial issues, where they confirm there is doubt about what to do and help us explain this to our patients. Increasingly, we can expect guideline producers to provide tools to help us communicate the evidence to our patients.
Physicians need not do this whole procedure every day. Once you have found the best of a set of guidelines, you can keep using that guideline until it goes out of date—usually about 3 to 5 years after publication.
We can routinely use some trustworthy sources. The College of Family Physicians of Canada Prevention in Hand website20 collects carefully curated resources for prevention, including guidelines. The CTFPHC is a highly regarded source.21 Guidelines from other countries are more difficult to use because the context of clinical practice might differ, and sometimes the evidence used might be limited to that country. The United States Preventive Services Task Force22 is a good source, but in accord with the general approach to medicine in the United States, it tends to recommend more active policies and give less weight to harms than Canadian recommendations do. Useful guidelines are also available from the National Institute for Health and Care Excellence23 in England, the Scottish Intercollegiate Guidelines Network,24 and the Royal Australian College of General Practitioners,25 especially their Handbook of Non-Drug Interventions.26
Notes
Key points
▸ Family physicians must make decisions quickly, while still retaining a scientific approach and communicating with our patients to reach mutually agreeable solutions. Guidelines help clinicians to make good decisions, especially in uncertain situations.
▸ Guidelines are often most useful when they address controversial issues, where they confirm there is doubt about what to do and help us explain this to our patients. Increasingly, we can expect guideline producers to provide tools to help us communicate the evidence to our patients.
▸ Readers should focus on the critical issues when choosing guidelines: Does this guideline help us solve the problems that we face in our practice? Are the recommendations based on an open-minded and transparent process of assessing the evidence?
▸ The AGREE II (Appraisal of Guidelines for Research and Evaluation) and G-TRUST (Guideline Trustworthiness, Relevance, and Utility Scoring Tool) approaches are useful in assessing the quality of guidelines.
Footnotes
Competing interests
All authors have completed the International Committee of Medical Journal Editors’ Unified Competing Interest form (available on request from the corresponding author).
Dr Singh reports grants from Merck Canada, personal fees from Pendopharm, and personal fees from Ferring Canada, outside the submitted work. The other authors declare that they have no competing interests.
This article is eligible for Mainpro+ certified Self-Learning credits. To earn credits, go to www.cfp.ca and click on the Mainpro+ link.
The French translation of this article can be found at www.cfp.ca in the table of contents of the May 2018 issue on page e225.
Copyright © the College of Family Physicians of Canada