Abstract
Objective To validate a case definition for speech and language disorders in community-dwelling older adults and to determine the prevalence of speech and language disorders in a primary care population.
Design This is a combined case definition validation and cross-sectional prevalence study. Chart review was considered the reference standard and was used to estimate prevalence. This study used de-identified electronic medical record data from participating SAPCReN-CPCSSN (Southern Alberta Primary Care Research Network–Canadian Primary Care Sentinel Surveillance Network) primary care clinics.
Setting Southern Alberta.
Participants Men and women aged 55 years and older who had visited a SAPCReN-CPCSSN physician or nurse practitioner at least once in the 2 years before the beginning of the study.
Main outcome measures Validation analysis included estimation of sensitivity, specificity, positive predictive value, and negative predictive value. Prevalence was the other main outcome measure.
Results The prevalence of speech and language disorders within the sample of 1384 patients was 1.2%. The case definition had a favourable specificity (99.9%, 95% CI 99.6% to 100.0%), positive predictive value (75.6%, 95% CI 25.4% to 96.6%), and negative predictive value (99.0%, 95% CI 98.8% to 99.2%). Sensitivity was not sufficient for validity (18.8%, 95% CI 4.05% to 45.6%).
Conclusion The case definition did not meet an acceptable standard for validity and thus cannot be used for future epidemiologic research. However, owing to the case definition’s high positive predictive value, it might be useful for clinical purposes and for cohort studies. Finally, while the case definition did not prove valid, this study has provided a conservative estimate of prevalence (1.2%) given the case definition’s high specificity.
The Canadian population is aging rapidly, creating high service demand and long wait times for geriatricians and other specialists.1,2 Speech and language disorders among older adults, which might result from prevalent conditions such as cerebrovascular disease, dementia, and other neurologic disorders, might increasingly be managed in primary care settings.3 Disorders that predominantly affect older adults include the speech disorders dysarthria, apraxia, and stuttering, and the language disorder aphasia.3 These speech and language disorders can be developmental or acquired and can coexist in adulthood. For example, stuttering is most commonly a developmental speech disorder, but also is infrequently acquired in adulthood.4 Moreover, it is not uncommon to find 2 disorders of speech or language occurring together as a result of a stroke or traumatic brain injury. McNeil et al reported that apraxia rarely occurs on its own and can be the “primary” or “secondary” disorder.5 Because there is overlap between speech and language disorders among older adults in terms of underlying cause and symptom presentation, it is reasonable to study the prevalence of such disorders, as they co-occur.
Determining the prevalence of speech and language disorders among older adults is challenging. A comprehensive literature review revealed that the most recent available prevalence estimate for speech and language disorders affecting Canadian older adults (determined as a set of related disorders) was published in 2005 by Speech-Language and Audiology Canada.6 They estimated that approximately 13% of the older adult population had speech or language disorders and 1% experienced stuttering. To understand how this prevalence estimate was determined and how representative it is of the general older adult population, we might look to previous efforts to describe the prevalence of speech and language disorders.
First, we consider the speech disorders: stuttering, dysarthria, and apraxia. Many studies cite 1% as a general estimate of prevalence for stuttering. The US National Institute on Deafness and Other Communication Disorders notes that more than 3 million Americans (about 1%) stutter.7 However, Yairi and Ambrose state that this prevalence statistic is the mean prevalence for the general population.8 Previous studies report that the prevalence of stuttering is considerably higher in young and elementary school–aged children than for other age groups.8 Very few studies refer directly to the incidence or prevalence of stuttering in the older age categories, and those that do typically do not report a specific value. The National Institute on Deafness and Other Communication Disorders states that as many as 1 in 4 of those who begin to stutter in childhood will continue for the rest of their lives,7 and a 2002 study by Craig et al reported that stuttering might persist into adulthood for up to 20% of those who develop a stutter as children.9
According to the American Speech-Language-Hearing Association, dysarthria’s prevalence is not fully known owing to substantial variation in the location where brain damage occurred, what concomitant diseases were present, and how diagnosis was obtained.10 A further complication is that dysarthria is a disorder that might be acute, episodic, or chronic in presentation.10 For example, the American Speech-Language-Hearing Association notes that between 8% and 60% of stroke patients experience dysarthria, while 10% to 65% of those with a traumatic brain injury are diagnosed with the disorder.10 These estimates might include occurrences of dysarthria immediately following the stroke or injury; later occurrences still considered to be resulting from the stroke or injury; cases that resolve at some point during the recovery process; and cases that persist for a sufficiently long time to be deemed chronic. Approximately 25% to 50% of patients with multiple sclerosis experience dysarthria.11 Similarly, prevalence estimates for apraxia are few. A 2013 study that assessed the type of motor speech disorders presenting to the Mayo Clinic speech pathology practice found apraxia to be the primary disorder in only 6.9% of cases.12
Second, when considering language disorders, prevalence estimates vary depending on definitions used to classify aphasia types. In his characterization of aphasia in The Handbook of Language and Speech Disorders, Code notes that while a wealth of research has been undertaken on the prevalence and incidence of aphasia in patients after stroke, there is a lack of uniformity in both the research methodology and the clinical definitions of aphasia used in different studies.13 According to Musser et al, the incidence of aphasia in the United States is 80 000 per year among stroke patients, with a population prevalence of 1 million.14 Speech-Language and Audiology Canada states that up to 30% of patients experience aphasia after a stroke,3 which 2 other studies15,16 confirm, while more than 100 000 people are estimated to be living with aphasia in Canada. Data concerning prevalence of aphasia in nonstroke patients are similarly limited. Norman et al examined communication disorders among veterans with traumatic brain injury and determined that within their cohort of 303 716 veterans, 0.2% had aphasia.17
Given the heterogeneity in definitions of speech and language disorders, it is perhaps unsurprising that prevalence estimates are difficult to ascertain. Nevertheless, physicians in primary care settings are making such diagnoses, or caring for older adults with existing diagnoses of speech and language disorders. These prevalence data might be found in electronic medical records (EMRs). Electronic medical records are an important source of clinical data in family practice. They contain diagnoses, prescriptions, billing information, referrals, laboratory testing, and other information recorded by attending physicians. The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) provides a cleaned, coded, and de-identified data model for EMR-derived primary care data.18 The Canadian Primary Care Sentinel Surveillance Network comprises 11 practice-based primary care research networks across Canada to provide chronic disease surveillance and improve access to high-quality data for research and patient care.18 The Southern Alberta Primary Care Research Network (SAPCReN) is a regional node of CPCSSN that collects data on primary care patients in southern Alberta whose family physicians and nurse practitioners participate in CPCSSN. Patients whose data are included within a CPCSSN network are likely to be representative of the general Canadian population, particularly for older adults. 
Statistics Canada noted that Alberta had a higher-than-average proportion of its population without a designated primary health care practitioner (18.0%); however, older adults (who were considered to be those aged ≥ 65 years) were the least likely age group to be without a primary health care provider.19 The report estimated that 6.5% of older male patients and 5.3% of older female patients did not report having a designated primary health care provider.19 Based on these findings, it might be reasonable to consider a primary care–based older adult population (such as that found in the SAPCReN-CPCSSN database) as comparable to the general, community-dwelling population of Canadian older adults.
The purpose of this study was to estimate the prevalence of speech and language disorders among older adults in primary care clinics using EMR-derived data and to evaluate the accuracy of the proposed case definition.
METHODS
This retrospective cross-sectional validation study was based on the 2014 methodology of Williamson et al.20 The case definition was formulated a priori and focused on 4 disorders: 3 speech disorders (apraxia, dysarthria, and stuttering) and 1 language disorder (aphasia).
This study was conducted at the University of Alberta in Edmonton using de-identified EMR data from participating clinics in SAPCReN, a regional CPCSSN network. At the time of sampling, CPCSSN extracted data from 600 primary care physicians for 750 000 patients across Canada, and there were 44 590 patients aged 55 years and older within the SAPCReN database (not limited by active status). Electronic medical records in use by SAPCReN-CPCSSN sentinels included Med Access, Wolf, and Mediplan.
Ethics approval for this study was received from the University of Alberta Health Research Ethics Board.
Case definition
The case definition for speech and language disorders in older adults was developed through a review of the literature, including research papers, medical textbooks, Web resources, and educational resources published by professional organizations. The literature review was supplemented with an exploratory search of the SAPCReN-CPCSSN database to identify relevant key words. Subsequently, a speech-language pathologist (T.H.) was consulted to confirm the appropriateness and completeness of the case definition. Congenital anomalies such as cleft palate were not considered sufficient evidence of “caseness,” nor was referral to a speech-language pathologist unless defined text phrases or codes from the International Classification of Diseases, version 9 (ICD-9), were included.21 The search terms for this study comprised possible or reasonably likely terms derived from the aforementioned literature search. The terms included might thus be considered a hypothesis for how such diagnoses might be recorded by attending physicians or nurse practitioners in EMRs. Table 1 details the text phrases and ICD-9 codes included in the case definition.
Following completion of this process, a CPCSSN data manager translated the case definition into an electronic algorithm. This algorithm consisted of a list of text terms and phrases and related ICD-9 codes that could be used to search the sample. The algorithm was then applied to patients in the sample. Any single mention of a text item or ICD-9 billing code from the case definition found in the list of encounters, health conditions, or billing codes was considered sufficient evidence for caseness. The algorithm searched all available parts of the EMR excluding the SOAP (subjective, objective, assessment, plan) notes owing to privacy restrictions.
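The any-single-mention rule described above can be sketched in a few lines of Python. This is an illustration only, not the CPCSSN data manager's algorithm: the record layout (`encounters`, `health_conditions`, `billing_codes`), the function name `is_case`, and the term and code lists are assumptions made for the example. The four disorder names come from the case definition, but the ICD-9 codes shown are placeholders and should not be read as the contents of Table 1.

```python
# Hypothetical sketch of an "any single mention" case-finding rule.
# Term and code lists are illustrative placeholders, not Table 1.
TEXT_TERMS = ("aphasia", "dysarthria", "apraxia", "stutter")
ICD9_CODES = {"784.3", "784.5", "307.0"}  # assumed examples only


def is_case(record: dict) -> bool:
    """Return True if any searched field mentions a term or code."""
    # SOAP notes are deliberately absent: the source says they were not searched.
    free_text = record.get("encounters", []) + record.get("health_conditions", [])
    for entry in free_text:
        lowered = entry.lower()
        if any(term in lowered for term in TEXT_TERMS):
            return True
    return any(code in ICD9_CODES for code in record.get("billing_codes", []))


# Three made-up patient records to exercise the rule.
patients = {
    "A": {"encounters": ["Follow-up for hypertension"], "billing_codes": ["401"]},
    "B": {"health_conditions": ["Dysarthria post stroke"], "billing_codes": []},
    "C": {"encounters": ["Annual review"], "billing_codes": ["784.3"]},
}
flagged = sorted(pid for pid, rec in patients.items() if is_case(rec))
print(flagged)
```

A simple substring match like this one would also flag negated mentions (for example, "no aphasia"), and it cannot find diagnoses recorded only in fields that are not searched. Both are properties of this sketch and may or may not apply to the actual algorithm.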
Reference standard
The reference standard was a chart review. Five reviewers participated in the full chart review, including 1 epidemiologist, 2 research assistants trained in epidemiologic methods, 1 medical student, and 1 physician. Reviewers were blinded to the classification of the chart as a case or noncase according to the CPCSSN algorithm. However, reviewers were able to discuss charts and reach consensus on caseness during the review process and after the review had been completed. Cases deemed “suspect” or “uncertain” were discussed and the final decision was made by the reviewers or, in the event of disagreement, by the speech-language pathologist. The reference standard for this study was subsequently used to estimate prevalence in the study sample.
Sample
A sample of “sentinels” (primary care physicians or nurse practitioners participating in CPCSSN, from whom EMR data are gathered) was contacted with the purpose of gaining permission to access patients’ charts remotely, at a distribution of approximately 60 charts per clinic. The sample of 1000 patients was chosen to ensure that the confidence interval for the estimate of sensitivity would be no wider than 20%, assuming a prevalence of 13% and a .05 level of significance. Sampling was done in December 2014.
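The sample size reasoning can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval and a worst-case sensitivity of 0.5. The source does not state which formula was used, so both are assumptions, as is the function name.

```python
import math


def sensitivity_ci_width(n_patients, prevalence, sensitivity=0.5, z=1.96):
    """Full width of a Wald 95% CI for sensitivity.

    Sensitivity is estimated only among true cases, so the effective
    sample size is the expected number of cases, not all patients.
    """
    expected_cases = n_patients * prevalence
    half_width = z * math.sqrt(sensitivity * (1 - sensitivity) / expected_cases)
    return 2 * half_width


# Planned design: 1000 patients, assumed prevalence of 13%.
width = sensitivity_ci_width(1000, 0.13)
print(f"{width:.3f}")
```

Under these assumptions, 1000 patients would yield about 130 expected cases and an interval narrower than 20 percentage points. Because the width depends on the number of cases rather than the number of charts, a much lower true prevalence widens the interval considerably.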
Inclusion criteria for this study were as follows: adults aged 55 years and older, who were registered patients and had seen their SAPCReN-CPCSSN sentinel physician or nurse practitioner in the 2 years before sampling.
Unforeseen technical issues occurred midway through the study, which involved loss of access to charts managed by 1 EMR vendor. This vendor reconfigured the search interface, removing certain search fields (including the field that reviewers used to search for patient charts) after the chart review was under way. This resulted in the loss of reviewers’ ability to access the charts. The solution for this problem was a partial re-randomization of sample charts from sites using a different EMR vendor, which occurred in February 2015. The charts that had been completed by sites using the first EMR vendor were kept for analysis and were included in the sample. Further contribution of charts from other SAPCReN sentinels resulted in more charts available for review (and thus more charts were included in the sample) than were originally estimated as necessary to demonstrate statistical significance. The chart review was undertaken from January to June 2015.
Statistical analysis
The measures used to determine validity were sensitivity, specificity, positive predictive value, and negative predictive value, all with 95% CIs. The case definition would be considered valid at 70% sensitivity and specificity, in accordance with the standards of validity outlined by Williamson et al in 2014.20
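The source does not say how the 95% CIs were computed. One common choice for proportions based on small counts is the exact binomial (Clopper-Pearson) interval, sketched below using only the standard library. The function names and the example counts (3 of 16) are assumptions made for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for a monotone function f."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes in n trials."""
    # Lower bound: P(X >= x) equals alpha/2 (increasing in p).
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: P(X <= x) equals alpha/2 (decreasing in p).
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


low, high = exact_ci(3, 16)  # assumed counts, for illustration
print(f"{low:.2%} to {high:.2%}")
```

With so few positive charts, an exact interval of this kind is very wide, which is why a point estimate alone says little about whether the 70% threshold was plausibly met.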
Interrater reliability was assessed using a random subsample of 10 EMR charts to ensure that reviewers were consistent in their assessment of caseness. Owing to the technical issues involving loss of access to some charts, only 3 of the 5 reviewers were able to participate in the interrater reliability check. The Fleiss κ coefficient was used to indicate chance-corrected agreement among reviewers.
All analyses were performed using the statistical software Stata/IC, version 13, except for calculation of the Fleiss κ, for which SAS was used.
RESULTS
Of the total SAPCReN patient population of 44 590, 30 215 patients met the inclusion criteria of being 55 years and older and having visited a SAPCReN-CPCSSN sentinel within the 2 years before the point of sampling. Whereas 1000 patients were initially included in the sample, the second sampling resulted in the inclusion of an additional 514 patients. Thus, the chart review included a total of 1514 patients.
After the chart review, 117 charts were excluded owing to missing data (ie, charts that were incomplete when access was lost and charts found to be unsearchable in the EMR databases owing to lack of identifiers). Additionally, 13 deceased patients were excluded. Thus, statistical analysis included 1384 patients in total, as illustrated in the study flow diagram (Figure 1).
Table 2 summarizes patient characteristics of the final sample. The sample consisted of 64.3% women with a mean age of 67.5 years. Each site contributed 9.1% to 22.5% of patient EMR data to the sample.
The case definition and its search algorithm for speech and language disorders identified a prevalence of 1.2% within the sample (95% CI 0.66% to 1.87%). While specificity was favourable at 99.9% (95% CI 99.6% to 100.0%), sensitivity was considerably lower at 18.8% (95% CI 4.05% to 45.6%). Positive and negative predictive values were determined to be 75.6% (95% CI 25.4% to 96.6%) and 99.0% (95% CI 98.8% to 99.2%), respectively. Table 3 provides a 2 × 2 table and Table 4 summarizes validation metrics.
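For readers who want to see how the four validation measures follow from a 2 × 2 table, a minimal sketch is given here. The cell counts are reconstructed assumptions chosen to be consistent with the reported sample size, sensitivity, and specificity; they are not copied from Table 3. The source also does not state how the predictive values were computed, so the direct ratios below need not match the reported figures exactly.

```python
def validation_metrics(tp, fp, fn, tn):
    """Validation measures from a 2 x 2 table (algorithm vs chart review)."""
    return {
        "sensitivity": tp / (tp + fn),   # cases the algorithm found
        "specificity": tn / (tn + fp),   # noncases correctly left unflagged
        "ppv": tp / (tp + fp),           # flagged charts that are true cases
        "npv": tn / (tn + fn),           # unflagged charts that are true noncases
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


# Assumed counts, reconstructed for illustration only (total = 1384).
metrics = validation_metrics(tp=3, fp=1, fn=13, tn=1367)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The sketch makes the pattern in the results easy to see: when true cases are rare, a handful of missed cases is enough to pull sensitivity down sharply while specificity and negative predictive value remain close to 100%.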
The mean age of patients identified by the chart review as having been diagnosed with a speech or language disorder was approximately 10 years higher than that of the study population. There was little difference in distribution of sex between the study population and those determined to be cases. There was also little variation from the sample distribution in terms of site.
The Fleiss κ statistic for speech and language disorders was −0.034 (P = .57).
DISCUSSION
The purpose of this study was to estimate the prevalence of speech and language disorders among older adults in primary care clinics using EMR-derived data and to evaluate the accuracy of the proposed case definition.
Using the results of the reference-standard chart review, it was determined that the sample prevalence of speech and language disorders in older, community-dwelling primary care patients was approximately 1.2%. Speech-Language and Audiology Canada reported a prevalence estimate of 13% for the older adult Canadian population.6 Our study is limited to adults receiving primary care services and, thus, the substantial disparity between our prevalence and that reported by Speech-Language and Audiology Canada might reflect the differences between community-dwelling and hospital-attending populations. There are no recent estimates available in Canada of the burden of speech and language disorders as a set of conditions within the national population against which to compare our prevalence statistic.
The case definition validated in this study has very high specificity, which suggests it is effective at ruling in cases of speech and language disorders. The low sensitivity, however, indicates it is ineffective at ruling out patients who have a speech or language disorder. Thus, as the sensitivity did not meet the acceptable minimum of 70%,20 this case definition cannot be considered valid for epidemiologic purposes.
A possible reason for low sensitivity is misclassification in the chart review. After the statistical analysis was completed, the CPCSSN data for the 13 discrepant cases were analyzed to determine likelihood of error on the part of the algorithm. Of those 13 discrepant cases, only 1 was found to have clear evidence of a speech or language disorder and might therefore be considered a misclassification by the algorithm.
As stated previously, the Fleiss κ statistic was too low to provide any indication of the interrater reliability of the chart review. While our reviewers agreed 90% of the time, this was not expressed through the results of the Fleiss κ analysis. This is likely due to a “paradox” of the Cohen κ and the Fleiss κ whereby high interrater agreement and low prevalence of the condition in question might result in a low κ statistic.22
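The κ paradox can be illustrated with a small calculation. The sketch below implements the standard Fleiss κ formula and applies it to a hypothetical set of ratings: 3 reviewers and 10 charts (as in the reliability check), with 9 charts unanimously rated noncase and 1 chart rated a case by a single reviewer. The rating pattern is an assumption chosen to match 90% agreement; the actual ratings are not given in the source.

```python
def fleiss_kappa(table):
    """Fleiss kappa; table[i][j] = raters assigning chart i to category j."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    n_categories = len(table[0])
    # Share of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in table) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each chart.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings; columns are (case, noncase).
ratings = [[0, 3]] * 9 + [[1, 2]]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

Under this assumed pattern, κ is slightly negative (about −0.034) even though the reviewers agree on 9 of 10 charts. Because nearly every rating is "noncase," chance-expected agreement is already above 93%, leaving almost no room for observed agreement to exceed it.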
Limitations
This study has 3 main limitations. First, technical issues involving one of the EMR systems resulted in a disruption of the chart review and forced us to obtain a sample that was not evenly distributed among sites (and thus EMR systems) as intended.
Second, as we were unable to reinvestigate charts following completion of the chart review, we cannot accurately determine whether misclassification occurred for false-negative cases. On a related note, speech and language disorders lack quantitative or more “objective”23 criteria on which a diagnosis might be made. For example, there are no medications or laboratory tests that lend themselves to identification of speech or language disorders within EMR databases. In the absence of such measures, chart reviewers might have been more likely to designate caseness in error for a speech or language disorder than they might for a disease such as diabetes where definitive medications and laboratory tests exist and billing codes are more frequently available.23
Last, the design of this study did not include patient data from outside the primary care setting and focused on EMRs as the source of data. Thus, the prevalence estimated is ascertained from evidence of speech and language disorders documented in patients’ EMRs. The case definition did not attempt to classify the cases beyond what was written in the sample patients’ EMRs.
Conclusion
A comprehensive review of the literature revealed no previous studies in which case definitions for speech and language disorders had been validated in the older adult primary care population. There has been little previous research as to how speech and language disorders in adults might be recorded in primary care–based EMRs. This study contributes to the existing literature in this area by providing a prevalence estimate for speech and language disorders among older adults in primary care that was not available until now. Despite limitations of the study, the prevalence estimate of 1.2% is important to consider in comparison with existing more general estimates among the older adult population. This difference in prevalence rates is an important avenue for further research that will focus on establishing a prevalence estimate that considers multiple factors. The case definition developed in this study is one of very few to be assessed for speech and language disorders among older adults in epidemiologic research, and it might be effective for use in quality improvement activities owing to its high positive and negative predictive values. Finally, this study also contributes to the literature in this area by highlighting epidemiologic methodology and challenges inherent in the use of clinical data sources.
Acknowledgments
We acknowledge the contributions of data managers Dave Jackson and Matt Taylor for their work generating the sample, establishing remote access to the clinics, developing the data-gathering tool, providing technical assistance, and providing expertise in interpreting SAPCReN data. We also acknowledge the work of participating investigators Sue Ross, Hilary Fast, Deb Slade, and Meghan Doraty, who acted as chart reviewers and provided input on determination of caseness. We thank the Canadian Primary Care Sentinel Surveillance Network, with whom we collaborated on this project, and Selphee Tang, for determining the Fleiss κ statistic for the analysis of interrater reliability.
Notes
Editor’s key points
▸ A comprehensive review of the literature revealed no previous studies in which case definitions for speech and language disorders were validated in the older adult primary care population. There has been little research as to how speech and language disorders in adults might be recorded in primary care–based electronic medical records. This study contributes to the existing literature by providing a prevalence estimate and developing a case definition for speech and language disorders among older adults in primary care.
▸ Using the results of a chart review, this study determined that the prevalence of speech and language disorders in the sample of older, community-dwelling primary care patients was approximately 1.2%.
▸ The case definition validated in this study has very high specificity (99.9%, 95% CI 99.6% to 100.0%). The low sensitivity, however (18.8%, 95% CI 4.05% to 45.6%), indicates it is ineffective at ruling out patients who have a speech or language disorder. Thus, as the sensitivity did not meet the acceptable minimum of 70%, this case definition cannot be considered valid for epidemiologic purposes, although the high positive and negative predictive values might make it effective for use in quality improvement activities.
Footnotes
Contributors
All authors contributed to the concept and design of the study; data gathering, analysis, and interpretation; and preparing the manuscript for submission.
Competing interests
None declared
This article has been peer reviewed.
Copyright © the College of Family Physicians of Canada