Original Research
Accepted Manuscript
March 29, 2023

Factors Associated With Neuroradiology Diagnostic Errors at a Large Tertiary-Care Academic Medical Center: A Case-Control Study

Abstract

Background: Numerous studies have explored factors associated with diagnostic errors in neuroradiology; however, large-scale multivariable analyses are lacking.
Objective: To evaluate associations of interpretation time, shift volume, care setting, day of week, and trainee participation with diagnostic errors by neuroradiologists at a large academic medical center.
Methods: This retrospective case-control study using a large tertiary-care academic medical center's neuroradiology quality assurance database evaluated CT and MRI examinations for which neuroradiologists had assigned RADPEER scores. The database was searched from January 2014 through March 2020 for examinations without (RADPEER score 1) or with (RADPEER scores 2a, 2b, 3a, 3b, or 4) diagnostic error. For each examination with error, two examinations without error were randomly selected (unless only one examination could be identified), matched by interpreting radiologist and examination type, to form case and control groups, respectively. Marginal mixed-effects logistic regression models were used to assess associations of diagnostic error with interpretation time (number of minutes since the immediately preceding report's completion), shift volume (number of examinations interpreted during the shift), emergency/inpatient setting, weekend interpretation, and trainee participation in interpretation.
Results: The case group included 564 examinations in 564 patients (mean age, 50.0±25.0 years; 309 men, 255 women); the control group included 1019 examinations in 1019 patients (mean age, 52.5±23.2 years; 540 men, 479 women). In case versus control group, mean interpretation time was 16.3±17.2 versus 14.8±16.7 minutes; mean shift volume was 50.0±22.1 versus 45.4±22.9 examinations. In univariable models, diagnostic error was associated with shift volume (OR=1.22, p<.001) and weekend interpretation (OR=1.60, p<.001), but not interpretation time, emergency/inpatient setting, or trainee participation (p>.05). However, in multivariable models, diagnostic error was independently associated with interpretation time (OR=1.18, p=.003), shift volume (OR=1.27, p<.001), and weekend interpretation (OR=1.69, p=.02). In subanalysis, diagnostic error showed independent associations on weekdays with interpretation time (OR=1.18, p=.003) and shift volume (OR=1.27, p<.001); such associations were not observed on weekends (interpretation time: p=.62; shift volume: p=.58).
Conclusion: Diagnostic errors in neuroradiology were associated with longer interpretation times, higher shift volumes, and weekend interpretation.
Clinical Impact: These findings should be considered when designing workflow-related interventions seeking to reduce neuroradiology interpretation errors.

Highlights

Key Finding
Diagnostic error of neuroradiology examinations was independently associated with longer interpretation time (OR, 1.18), higher shift volume (OR, 1.27), and weekend interpretation (OR, 1.69) but not associated with emergency/inpatient setting or trainee participation. In subanalysis, diagnostic error was associated with interpretation time and shift volume on weekdays but not weekends.
Importance
The identified risk factors for diagnostic error in neuroradiology could be used to guide targeted quality improvement interventions.
Medical error is the third leading cause of death in the United States, accounting for an estimated 250,000 deaths per year [1]. Reported major error rates in radiology range from 2% to 6% [2-5], and the rates of detected radiologic errors have remained similar over time [6-10]. An analysis of diagnostic errors among 2.9 million imaging examinations from an international teleradiology practice found higher shift volumes to be associated with diagnostic errors [11]. This finding raises concern for potential future increases in error rates given radiologists' increasing workloads [12].
Neuroradiology has been a topic of interest in the literature on diagnostic error. A 2019 analysis of malpractice claims in diagnostic and interventional neuroradiology found a median plaintiff award of USD 2,877,847 and a median settlement of USD 1,950,000 [13]. Qualitative studies have described common blind spots and error patterns in imaging of the neurovasculature, head and neck, and skull base [14-16], and additional studies have analyzed diagnostic errors in neuroradiology using small datasets [3, 17]. Similar to the previously noted finding for radiology examinations in general [11, 12], a study of neuroradiology examinations found higher shift volumes to be associated with diagnostic errors, although shift volume was the only variable analyzed [18]. Large-scale studies of neuroradiology examinations incorporating multivariable analyses of potential risk factors for diagnostic error are lacking. Identification of specific risk factors could guide corrective interventions at both the radiologist (individual) and system (group) levels. The goal of this study was to evaluate associations of examination interpretation time, shift volume, care setting, day of the week, and trainee participation with diagnostic errors by neuroradiologists at a large academic medical center.

Methods

Study Setting

Institutional review board approval was obtained for this retrospective HIPAA-compliant case-control study, with a waiver of the requirement for written informed consent.
The study was performed at the University of California, Davis, a large tertiary-care academic medical center. The neuroradiology quality assurance (QA) database was initiated on January 1, 2014, and has been previously described [19]. Each day that a neuroradiologist is on clinical service, the neuroradiologist is presented with three CT or MRI examinations (including examinations of the brain, head and neck, or spine, as well as MR angiography [MRA] and CT angiography [CTA] examinations), randomly selected by software, to review and assign a score (1, 2a, 2b, 3a, 3b, or 4) using the American College of Radiology RADPEER system; a score of 4 could no longer be assigned after this score was eliminated from the RADPEER system in May 2016. Examinations assigned a RADPEER score of 2a, 2b, 3a, 3b, or 4 are flagged and further reviewed, either by two additional attending neuroradiologists or by the entire neuroradiology division during a quarterly QA conference, to reach a consensus RADPEER score; examinations assigned a RADPEER score of 1 do not undergo further review. The reviewed examinations are entered into the QA database, along with the associated RADPEER score (the score of 1 if assigned by the initially designated radiologist, or the consensus RADPEER score otherwise).
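As a rough illustration of this QA workflow, the daily random case assignment and flagging logic might look like the following sketch (all function and field names are hypothetical; this is not the center's actual software):

```python
import random

# RADPEER scores currently in use; a score of 4 was retired in May 2016.
RADPEER_SCORES = {"1", "2a", "2b", "3a", "3b"}

def select_daily_qa_cases(eligible_exam_ids, n_cases=3, rng=None):
    """Randomly pick the day's CT/MRI examinations for peer review."""
    rng = rng or random.Random()
    pool = list(eligible_exam_ids)
    return rng.sample(pool, k=min(n_cases, len(pool)))

def record_qa_score(qa_database, exam_id, score):
    """Log a reviewer's RADPEER score; scores other than 1 are flagged for
    consensus review (two additional attendings or the QA conference)."""
    if score not in RADPEER_SCORES:
        raise ValueError(f"unknown RADPEER score: {score!r}")
    entry = {"exam_id": exam_id, "score": score,
             "needs_consensus_review": score != "1"}
    qa_database.append(entry)
    return entry

# Example usage
qa_db = []
for exam in select_daily_qa_cases(range(1000), rng=random.Random(42)):
    record_qa_score(qa_db, exam, "1")
```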
All members of the neuroradiology division are full-time employees who have completed neuroradiology fellowships. Division members have similar yearly productivity and read similar neuroimaging case mixes. All division members rotate with similar frequency among five clinical services: weekday routine-hours emergency/inpatient (9-hour shift), weekday routine-hours spine (9-hour shift), weekday routine-hours outpatient (9-hour shift), weekday evening emergency/inpatient (6-hour shift), and weekend emergency/inpatient (8- to 12-hour shift). Given this workflow, examinations interpreted on weekends are primarily emergency/inpatient examinations. The division uses two shared worklists: outpatient and emergency/inpatient; examinations are selected from the worklists in chronologic order, although examinations assigned special priority (e.g., stat priority) are selected ahead of chronologic order. There are no additional subspecialized rotations, and specific examination subsets (e.g., head and neck examinations) are not directed to individual neuroradiologists for interpretation.

Patients and Examinations

Separate searches were conducted of the neuroradiology QA database from January 2014 through March 2020 for examinations with a RADPEER score of 1 (i.e., no diagnostic error) and for examinations with a RADPEER score of 2a, 2b, 3a, 3b, or 4 (i.e., presence of diagnostic error), hereafter described as the control and case groups, respectively [20]. For each examination in the case group, matched examinations in the control group were identified by the name of the interpreting neuroradiologist and the examination's CPT code. In each group, examinations for which a match could not be found in the other group were excluded. Then, for each remaining examination in the case group, two matched examinations in the control group were randomly selected for inclusion; only one matched examination from the control group was selected if two matched examinations could not be identified (e.g., because of faculty turnover over the course of the study period and/or low volumes for certain examinations such as spine MRA examinations). After random selection of matched examinations, additional examinations were excluded if information for the examination could not be extracted from the database because of technical errors or if the examination's interpretation time (as described later in the Methods) had an outlier value (based on the 95th percentile of the distribution of interpretation times for the given group); examinations with outlier interpretation times were excluded given the likelihood that such interpretation times did not reflect time dedicated to interpretation of the given examination. After these exclusions, additional examinations were excluded to maintain the 1:2 matching between the case and control groups (unless only 1:1 matching was possible). The remaining examinations formed the final study sample.
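A minimal sketch of this matching procedure is shown below (in pandas; column names such as radiologist and cpt_code are assumptions, and the study itself performed matching with the SAS SURVEYSELECT procedure):

```python
import pandas as pd

def match_controls(cases: pd.DataFrame, controls: pd.DataFrame,
                   ratio: int = 2, seed: int = 0) -> pd.DataFrame:
    """For each case examination, randomly sample up to `ratio` control
    examinations with the same interpreting radiologist and CPT code.
    Controls are drawn without replacement; a case with no eligible
    control would be excluded, as in the study."""
    matched, pool = [], controls.copy()
    for _, case in cases.iterrows():
        eligible = pool[(pool["radiologist"] == case["radiologist"]) &
                        (pool["cpt_code"] == case["cpt_code"])]
        if eligible.empty:
            continue  # no match available: the case is excluded
        take = eligible.sample(n=min(ratio, len(eligible)), random_state=seed)
        matched.append(take)
        pool = pool.drop(take.index)  # do not reuse a control examination
    return pd.concat(matched) if matched else pool.iloc[0:0]
```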
All patients in the case group were included in earlier studies using the neuroradiology QA database that explored a range of questions relating to diagnostic error in neuroradiology [1416, 18, 19]. Unique elements of the current study include the comparison with a control group, the multivariable analysis among possible risk factors for diagnostic error, and the control for the interpreting neuroradiologist and examination type.

Data Collection

For each included examination in the case and control groups, the following variables were extracted from the radiology departmental database: interpretation time, shift volume, care setting (emergency/inpatient vs outpatient), day of interpretation (weekday vs weekend), and trainee participation. Interpretation time was defined as the time in minutes between the interpreting neuroradiologist's finalization of the report of the immediately preceding examination and finalization of the report of the given examination. For the first examination interpreted during a shift, interpretation time was computed with respect to the last report finalized during the preceding shift, resulting in all such examinations having an outlier interpretation time and thus being excluded. Shift volume was defined as the total number of CT and MRI examinations that the interpreting neuroradiologist interpreted during the shift in which the given examination was interpreted. For examinations comprising multiple unique accession numbers (e.g., head MRI-MRA), each accession number was counted as a single examination for purposes of determining shift volume, and each such accession number was assigned the same interpretation time (i.e., the time since finalization of the prior report). Trainee participation was defined as the involvement of a radiology resident or fellow in the generation of the official report.
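The two continuous variables could be derived from report-finalization timestamps roughly as follows (a sketch under assumed column names; it simplifies two details noted above, namely the handling of the first report of a shift, which is marked NaN here rather than computed against the prior shift and excluded, and the shared interpretation time assigned to multi-accession examinations):

```python
import pandas as pd

def add_workflow_variables(reports: pd.DataFrame) -> pd.DataFrame:
    """Derive interpretation time and shift volume. Expects columns:
    radiologist, shift_id, accession, finalized_at (datetime)."""
    reports = reports.sort_values("finalized_at").copy()
    grouped = reports.groupby(["radiologist", "shift_id"])
    # Minutes since the immediately preceding report finalized in the shift;
    # NaN for a shift's first report (an outlier exclusion in the study).
    reports["interp_time_min"] = (
        grouped["finalized_at"].diff().dt.total_seconds() / 60
    )
    # Every accession number counts once toward the shift's volume.
    reports["shift_volume"] = grouped["accession"].transform("count")
    return reports
```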

Statistical Analysis

Descriptive statistics (mean and SD for continuous variables; count and percentage for categoric variables) were obtained for patient age and sex as well as for examination characteristics including interpretation time, shift volume, care setting, day of interpretation, and trainee participation. Formal significance testing was not performed for differences in the descriptive statistics between the two groups.
Conditional and marginal mixed-effects logistic regression models were used to identify predictors of diagnostic error [21]. Examination characteristics were treated as fixed effects. The interpreting radiologist and examination type were treated as random effects to account for potential correlations among multiple examinations from the same radiologist and examination type. The conditional models generated ORs describing the effect of the examination characteristics on an individual radiologist's risk of diagnostic error, whereas the marginal models generated ORs describing the mean effect of the characteristic at the group (i.e., neuroradiology division as a whole) level. In the models, the continuous variables (interpretation time and shift volume) were standardized by their SDs; thus, the models' estimates represent the effect of a 1-SD change in the variable on the odds of diagnostic error.
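In symbols, the conditional model can be sketched as follows (notation ours, not the authors'; u and v are random intercepts for radiologist r(i) and examination type t(i), all five candidate characteristics are shown as fixed effects, and a tilde marks a continuous covariate divided by its SD):

```latex
\operatorname{logit}\Pr\bigl(Y_i = 1 \mid u_{r(i)}, v_{t(i)}\bigr)
  = \beta_0 + u_{r(i)} + v_{t(i)}
  + \beta_1 \tilde{x}^{\mathrm{time}}_i + \beta_2 \tilde{x}^{\mathrm{vol}}_i
  + \beta_3 x^{\mathrm{ED/IP}}_i + \beta_4 x^{\mathrm{wknd}}_i
  + \beta_5 x^{\mathrm{trainee}}_i,
\qquad u_r \sim \mathcal{N}(0, \sigma_u^2), \quad v_t \sim \mathcal{N}(0, \sigma_v^2)
```

Here exp(β1) is the conditional OR per 1-SD increase in interpretation time for a given radiologist and examination type, whereas the marginal models report the corresponding population-averaged OR.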
Univariable mixed-effects logistic regression models were first fit for each examination characteristic. Characteristics with a p value of .1 or less in either the conditional or marginal model were further evaluated in conditional and marginal multivariable mixed-effects models that yielded adjusted ORs. The multivariable models also included interaction terms between day of interpretation and the other characteristics in the multivariable models. Additional subgroup analyses were performed, stratifying by weekday versus weekend interpretation: interpretation time and shift volume were summarized, and univariable and multivariable marginal and conditional mixed-effects logistic regression models were fit separately for weekday and weekend examinations using the characteristics included in the earlier multivariable models.
P values of less than .05 were considered statistically significant. All statistical analyses were performed using SAS Studio version 5.2 (SAS Institute). Matching was performed using the SURVEYSELECT procedure, and modeling was performed using the GLIMMIX procedure.
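As a rough open-source analog of the marginal models fit in GLIMMIX, a population-averaged logistic model clustered on radiologist can be sketched with a GEE in statsmodels (a simplification: it ignores the examination-type random effect, and all column names are assumptions):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_marginal_model(df: pd.DataFrame):
    """Population-averaged logistic regression for diagnostic error,
    clustering examinations by interpreting radiologist. Expects columns:
    error (0/1), interp_time_min, shift_volume, weekend (0/1), radiologist."""
    df = df.copy()
    # Scale continuous predictors by their SDs so ORs are per 1-SD change.
    for col in ("interp_time_min", "shift_volume"):
        df[f"{col}_sd"] = df[col] / df[col].std()
    model = sm.GEE.from_formula(
        "error ~ interp_time_min_sd + shift_volume_sd + weekend",
        groups="radiologist",
        data=df,
        family=sm.families.Binomial(),
        cov_struct=sm.cov_struct.Exchangeable(),
    )
    result = model.fit()
    return result, np.exp(result.params)  # coefficients as odds ratios
```

A conditional analog would instead add radiologist- and examination-type-level random intercepts via a mixed-effects logistic model (statsmodels offers an approximate Bayesian version in BinomialBayesMixedGLM).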

Results

Patients and Examinations

The search of the neuroradiology QA database during the study period yielded 654 examinations in the case group and 8098 examinations in the control group. A total of 36 and 1593 examinations in the two groups, respectively, were initially excluded because of the absence of a matching examination in the other group. Then, 618 and 1159 examinations in the two groups, respectively, were selected after initial 1:2 matching (unless only one control examination was available for a given case examination). Subsequently, nine examinations in the control group were excluded because information for the examination could not be extracted from the database because of technical errors, and 38 and 57 examinations in the two groups, respectively, were excluded because of outlier interpretation times (defined in each group as an interpretation time greater than 98 minutes on the basis of the 95th percentile of interpretation times for each group). After those exclusions, the matching process was repeated, whereby 13 and 31 examinations, respectively, were excluded because of the absence of a matching examination among the remaining examinations in the other group, and three and 43 examinations, respectively, were excluded to maintain the 1:2 matching (unless only one control examination was available for a given case examination). This process resulted in a final study sample of 564 examinations in 564 patients in the case group and 1019 examinations in 1019 patients in the control group. Figure 1 shows the flow of patient selection in both groups. Between the two groups, examinations were interpreted by 16 unique neuroradiologists (with 1-41 years of postfellowship clinical experience). Among the examinations in the case group, the RADPEER score was 2a in 46 (8.2%), 2b in 175 (31.0%), 3a in 26 (4.6%), and 3b in 317 (56.2%); no examination received a RADPEER score of 4.
Fig. 1—Flowchart shows patient selection in case and control groups. QA = quality assurance.
Table 1 presents descriptive characteristics for patients and examinations in both groups. The case group comprised 309 (54.8%) men and 255 (45.2%) women; the control group comprised 540 (53.0%) men and 479 (47.0%) women. The mean patient age was 50.0 ± 25.0 [SD] years for the case group versus 52.5 ± 23.2 years for the control group. The mean interpretation time was 16.3 ± 17.2 [SD] minutes in the case group versus 14.8 ± 16.7 minutes in the control group. The mean shift volume was 50.0 ± 22.1 [SD] examinations in the case group versus 45.4 ± 22.9 examinations in the control group. The percentage of examinations from the emergency/inpatient setting was 63.7% in the case group versus 59.9% in the control group. The percentage of examinations with weekend interpretation was 20.4% in the case group versus 13.8% in the control group. The percentage of examinations with trainee participation in the interpretation was 42.6% in the case group versus 43.4% in the control group.
TABLE 1: Summary of Characteristics of Patients and Examinations

Factor | Case Group (n = 564) | Control Group (n = 1019)
Patient characteristic
  Sex
    Male | 309 (54.8) | 540 (53.0)
    Female | 255 (45.2) | 479 (47.0)
  Age (y), mean ± SD | 50.0 ± 25.0 | 52.5 ± 23.2
Examination characteristic
  Interpretation time (min), mean ± SD | 16.3 ± 17.2 | 14.8 ± 16.7
  Shift volume (examinations per shift), mean ± SD | 50.0 ± 22.1 | 45.4 ± 22.9
  Emergency/inpatient setting | 359 (63.7) | 610 (59.9)
  Weekend interpretation | 115 (20.4) | 141 (13.8)
  Trainee participation | 224 (42.6) | 442 (43.4)

Note—Unless otherwise indicated, data are expressed as number of patients or examinations, with percentage in parentheses. Case and control groups correspond with examinations with and without diagnostic error, respectively.

Results of Regression Models for Predicting Diagnostic Error

Table 2 shows the results of the univariable mixed-effects marginal and conditional logistic regression models. Diagnostic error was significantly associated with higher shift volume (both models: OR = 1.22, p < .001) and weekend interpretation (both models: OR = 1.60, p < .001). Diagnostic error was not significantly associated in either model with interpretation time, emergency/inpatient setting, or trainee participation (all p > .05).
TABLE 2: Results From Univariable Mixed-Effects Logistic Regression Models for Predicting Diagnostic Error

Factor | Marginal OR (95% CI) | p | Conditional OR (95% CI) | p
Interpretation time (min) | 1.09 (0.99–1.23) | .10 | 1.09 (0.99–1.21) | .10
Shift volume (examinations per shift) | 1.22 (1.11–1.36) | < .001 | 1.22 (1.12–1.36) | < .001
Emergency/inpatient setting | 1.17 (0.95–1.45) | .14 | 1.17 (0.95–1.45) | .14
Weekend interpretation | 1.60 (1.22–2.09) | < .001 | 1.60 (1.22–2.09) | < .001
Trainee participation | 0.97 (0.79–1.19) | .75 | 0.97 (0.79–1.19) | .75

Note—Marginal models describe a factor's mean effect on the risk of error at the neuroradiology division level. Conditional models describe a factor's effect on the risk of error for individual radiologists. Model estimates represent the effect of a 1-SD change for continuous variables and the effect of the presence of binary variables.

Table 3 shows the results of the multivariable mixed-effects marginal and conditional logistic regression models. In the marginal model, diagnostic error was independently associated with interpretation time (OR = 1.18, p = .003), shift volume (OR = 1.27, p < .001), and weekend interpretation (OR = 1.69, p = .02). The interaction terms (interpretation time × weekend interpretation and shift volume × weekend interpretation) were not significant independent predictors of diagnostic error (both p > .05). The conditional multivariable mixed-effects model yielded similar results.
TABLE 3: Results From Multivariable Mixed-Effects Logistic Regression Models for Predicting Diagnostic Error

Factor | Marginal Adjusted OR (95% CI) | p | Conditional Adjusted OR (95% CI) | p
Interpretation time (min)^a | 1.18 (1.06–1.31) | .003 | 1.18 (1.06–1.31) | .003
Shift volume (examinations per shift)^a | 1.27 (1.11–1.46) | < .001 | 1.27 (1.11–1.46) | < .001
Weekend interpretation | 1.69^b (1.01–2.91) | .02 | 1.69^b (1.01–2.85) | .02
Interpretation time × weekend interpretation^a | 0.91 (0.61–1.34) | .21 | 0.91 (0.61–1.34) | .21
Shift volume × weekend interpretation^a | 0.91 (0.64–1.28) | .08 | 0.91 (0.64–1.28) | .08

Note—Marginal models describe a factor's mean effect on the risk of error at the neuroradiology division level. Conditional models describe a factor's effect on the risk of error for individual radiologists. Model estimates represent the effect of a 1-SD change for continuous variables and the effect of the presence of binary variables.
^a Results for interpretation time and shift volume were computed for weekdays, whereas results for the interaction terms were computed for weekends.
^b Dependent on the levels of interpretation time and shift volume; the reported OR is evaluated at the overall mean for both interpretation time (15.4 minutes) and shift volume (47.0 examinations).
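Footnote b follows from the interaction structure of the model: with interpretation-time and shift-volume interaction terms included, the weekend OR is covariate-dependent (notation as in our earlier sketch, with a tilde marking an SD-scaled covariate):

```latex
\mathrm{OR}_{\mathrm{wknd}}\bigl(\tilde{x}^{\mathrm{time}}, \tilde{x}^{\mathrm{vol}}\bigr)
  = \exp\bigl(\beta_{\mathrm{wknd}}
      + \beta_{\mathrm{time}\times\mathrm{wknd}}\,\tilde{x}^{\mathrm{time}}
      + \beta_{\mathrm{vol}\times\mathrm{wknd}}\,\tilde{x}^{\mathrm{vol}}\bigr)
```

The reported value of 1.69 corresponds to evaluating this expression at the overall means (interpretation time, 15.4 minutes; shift volume, 47.0 examinations).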
Further subgroup analyses were performed, stratifying by weekday (n = 1327) versus weekend (n = 256) interpretation. The frequency of diagnostic error was 33.8% (449/1327) on weekdays versus 44.9% (115/256) on weekends. Mean interpretation time on weekdays was 17.5 ± 18.6 minutes in the case group and 15.2 ± 17.3 minutes in the control group and on weekends was 11.7 ± 8.9 minutes in the case group and 12.3 ± 12.0 minutes in the control group. Mean shift volume on weekdays was 40.4 ± 18.6 examinations in the case group and 43.5 ± 18.9 examinations in the control group and on weekends was 75.5 ± 13.7 examinations in the case group and 76.6 ± 18.3 examinations in the control group.
Table 4 shows the results of the univariable and multivariable regression analyses in the subgroups. For multivariable models on weekdays, diagnostic error showed significant independent associations with interpretation time (both marginal and conditional models: OR = 1.18, p = .003) and shift volume (both marginal and conditional models: OR = 1.27, p < .001). However, for multivariable models on weekends, diagnostic error was not significantly associated with interpretation time (both marginal and conditional models: OR = 0.91, p = .62) or shift volume (both marginal and conditional models: OR = 0.91, p = .58).
TABLE 4: Subgroup Analysis of Logistic Regression Models for Predicting Diagnostic Error Based on Interpretation Time and Shift Volume, Stratifying by Weekday Versus Weekend Interpretation

Subgroup, Model, and Factor | Marginal OR (95% CI) | p | Conditional OR (95% CI) | p
Weekday (n = 1327)
  Univariable
    Interpretation time (min) | 1.13 (1.03–1.25) | .03 | 1.13 (1.01–1.25) | .03
    Shift volume (examinations per shift) | 1.21 (1.06–1.38) | .005 | 1.21 (1.06–1.38) | .005
  Multivariable
    Interpretation time (min) | 1.18 (1.06–1.31) | .003 | 1.18 (1.06–1.31) | .003
    Shift volume (examinations per shift) | 1.27 (1.11–1.46) | < .001 | 1.27 (1.11–1.47) | < .001
Weekend (n = 256)
  Univariable
    Interpretation time (min) | 0.91 (0.61–1.35) | .64 | 0.91 (0.61–1.35) | .64
    Shift volume (examinations per shift) | 0.91 (0.64–1.29) | .59 | 0.91 (0.65–1.29) | .59
  Multivariable
    Interpretation time (min) | 0.91 (0.61–1.35) | .62 | 0.91 (0.61–1.34) | .62
    Shift volume (examinations per shift) | 0.91 (0.64–1.28) | .58 | 0.91 (0.64–1.28) | .58

Note—Marginal models describe a factor's mean effect on the risk of error at the neuroradiology division level. Conditional models describe a factor's effect on the risk of error for individual radiologists. Model estimates represent the effect of a 1-SD change in the continuous variables.

Discussion

This study showed statistically significant associations of diagnostic error in neuroimaging examinations with longer interpretation times, higher shift volumes, and weekend interpretation. In marginal and conditional mixed-effects logistic regression models, these three variables were risk factors for diagnostic error at the neuroradiology division and individual neuroradiologist levels, respectively. In subanalyses, longer interpretation times and higher shift volumes were significantly associated with a higher likelihood of diagnostic error on weekdays, but not on weekends. Diagnostic error was not significantly associated with emergency/inpatient setting or trainee participation in the interpretation.
Interpretation times were longer among neuroimaging examinations with diagnostic errors than among those without. Data regarding the association between interpretation times and diagnostic error are scarce. A prior study found that, for abdominal imaging examinations, faster interpretation speed was associated with a higher error rate [22]. However, that study entailed a prospective intervention whereby radiologists were assigned examinations to interpret twice as fast as their mean baseline reporting time; thus, it did not reflect variations in interpretation times occurring during routine practice. Potential reasons for the longer interpretation times for examinations with diagnostic error in the current study include greater complexity of such examinations and interruptions occurring during the longer interpretation periods.
The present finding of an association between higher shift volumes and diagnostic error is consistent with the results of a prior study of 2.9 million examinations from an international teleradiology practice [11]. Numerous additional studies have found an association between error rates and longer radiology workdays [11, 23-25], with a peak in errors after the 10th hour of work [11, 25]. In an additional study relating to errors in neuroradiology, errors were more likely to be perceptual rather than interpretive in nature for shifts with higher volumes and for examinations interpreted later during shifts [17]. Radiologist fatigue may account for these associations of error with shift volume or duration.
In the current study, weekend interpretation was a significant risk factor for diagnostic error; to our knowledge, this finding has not been previously reported. In subgroup analysis, longer interpretation times and higher shift volumes were significantly associated with a higher likelihood of diagnostic error on weekdays, but not on weekends. This disparity may in part relate to the larger sample size (and thus greater statistical power) for weekday than for weekend examinations. However, the higher ORs for diagnostic errors for interpretation time and shift volume on weekdays than on weekends, as well as the difference in frequency of diagnostic error between weekdays and weekends, suggest a true influence of weekend interpretation on the relationship of diagnostic error with interpretation time and shift volume. In addition, interpretation times were shorter and shift volumes were higher on weekends than on weekdays in both the study and control groups; it is possible that diagnostic error rates are sensitive to changes in interpretation time and shift volume only within specific ranges for those variables, and that the variables were outside of such ranges on weekends.
Prior literature has highlighted a range of strategies for reducing diagnostic error, including interpretation by subspecialty radiologists [7, 26], decreases in shift volumes [11], limits on shift lengths to less than 10 hours [11, 25], reductions in noninterpretive tasks during clinical shifts [27], radiologist participation in multispecialty tumor boards [19], and (as previously discussed in contrast with the present findings) reduced interpretation speed [22]. In a 2014 study, on-call radiologists received a mean of 72 telephone calls (mean total call-handling time, 108 minutes) during a typical 12-hour overnight shift; after allowing an additional 90 minutes for other interruptions, breaks, consultations, and conferences, this call volume left radiologists less than 1 second (0.86 second) to view each image during the shift [28]. In 2012, the Royal College of Radiologists [29] issued a national guideline recommending that radiologists interpret up to two complex CT or MRI examinations, and up to six CT or MRI examinations overall, per hour. In 2022, the Japanese College of Radiology [30] issued a national guideline recommending that radiologists interpret up to four examinations per hour. The American College of Radiology has not issued such guidelines for radiologists in the United States. Further prospective studies are warranted to assess the impact of specific corrective workflow interventions on diagnostic error.
This study had limitations. First, it used a single-center retrospective design. Second, given the large volume of examinations without diagnostic error in the neuroradiology QA database, only a fraction of such cases were included in the analyses, and examinations were randomly selected using a matching process with respect to examinations with diagnostic error. Third, the determination of the presence of diagnostic error is a subjective process, and examinations without diagnostic error based on the assessment of the initially assigned radiologist were not reviewed by additional radiologists. Fourth, all radiologists were fellowship-trained neuroradiologists at an academic medical center; findings may have differed for general radiologists or for neuroradiologists at community practices. Fifth, the association of the interpreting neuroradiologist's years of experience with diagnostic error was not evaluated, as prior work using the neuroradiology QA database did not find this factor to be associated with errors [19]. Sixth, when evaluating the effect of trainee participation on diagnostic error, stratification was not performed on the basis of level of training (i.e., junior or senior resident vs neuroradiology fellow). Nonetheless, the observed lack of an association between trainee participation and diagnostic error is similar to the results of prior studies in neuroradiology [31, 32]. Seventh, shift length was not evaluated as a risk factor for diagnostic error given that shifts were generally shorter than 10 hours, whereas prior literature suggests a significant increase in errors at a threshold longer than 10 hours [11, 25]. Thus, potential interactions of the identified risk factors with long shifts are unknown. Eighth, the assessment of shift volume reflected solely the total number of examinations interpreted over the course of the shift and did not account for variation in types of examinations (e.g., CT vs MRI; brain vs spine) interpreted during the course of the shift, the number of examinations that the neuroradiologist had interpreted before the given examination, or the time of day (e.g., morning, afternoon, evening) that the examination was interpreted. Ninth, interpretation time was determined as the time since completion of the immediately preceding report; this approach assumed that the entire intervening period was dedicated to interpretation of the given examination. Tenth, the regression analyses did not account for potential clustering effects among multiple examinations in individual patients. Eleventh, the analysis did not account for examination complexity, whether related to patient factors, the type of examination, or the examination's findings. Finally, subanalyses were not performed with respect to specific RADPEER scores (2a, 2b, 3a, or 3b).
In conclusion, diagnostic errors by attending neuroradiologists at a single large academic medical center were significantly associated with longer interpretation times, higher shift volumes, and weekend interpretation. The associations with diagnostic error of longer interpretation times and higher shift volumes were observed for weekday, but not for weekend, interpretations. These findings should be considered when designing workflow-related and other interventions seeking to reduce errors in neuroimaging interpretation.

Footnotes

Provenance and review: Not solicited; externally peer reviewed.
Peer reviewers: All reviewers chose not to disclose their identities.

References

1. Makary MA, Daniel M. Medical error: the third leading cause of death in the US. BMJ 2016; 353:i2139
2. Wu MZ, McInnes MDF, Macdonald DB, Kielar AZ, Duigenan S. CT in adults: systematic review and meta-analysis of interpretation discrepancy rates. Radiology 2014; 270:717–735
3. Babiarz LS, Yousem DM. Quality control in neuroradiology: discrepancies in image interpretation among academic neuroradiologists. AJNR 2012; 33:37–42
4. Borgstede JP, Lewis RS, Bhargavan M, Sunshine JH. RADPEER quality assurance program: a multifacility study of interpretive disagreement rates. J Am Coll Radiol 2004; 1:59–65
5. Kim YW, Mansfield LT. Fool me twice: delayed diagnoses in radiology with emphasis on perpetuated errors. AJR 2014; 202:465–470
6. Garland LH. On the scientific evaluation of diagnostic procedures. Radiology 1949; 52:309–328
7. Loevner LA, Sonners AI, Schulman BJ, et al. Reinterpretation of cross-sectional images in patients with head and neck cancer in the setting of a multidisciplinary cancer center. AJNR 2002; 23:1622–1626
8. Rosenkrantz AB, Duszak R Jr, Babb JS, Glover M, Kang SK. Discrepancy rates and clinical impact of imaging secondary interpretations: a systematic review and meta-analysis. J Am Coll Radiol 2018; 15:1222–1231
9. Kabadi SJ, Krishnaraj A. Strategies for improving the value of the radiology report: a retrospective analysis of errors in formally over-read studies. J Am Coll Radiol 2017; 14:459–466
10. Abujudeh HH, Boland GW, Kaewlai R, et al. Abdominal and pelvic computed tomography (CT) interpretation: discrepancy rates among experienced radiologists. Eur Radiol 2010; 20:1952–1957
11. Hanna TN, Lamoureux C, Krupinski EA, Weber S, Johnson JO. Effect of shift, schedule, and volume on interpretive accuracy: a retrospective analysis of 2.9 million radiologic examinations. Radiology 2018; 287:205–212
12. McDonald RJ, Schwartz KM, Eckel LJ, et al. The effects of changes in utilization and technological advancements of cross-sectional imaging on radiologist workload. Acad Radiol 2015; 22:1191–1198
13. Deckey DG, Eltorai AEM, Jindal G, Daniels AH. Analysis of malpractice claims involving diagnostic and interventional neuroradiology. J Am Coll Radiol 2019; 16:764–769
14. Ferguson A, Assadsangabi R, Chang J, et al. Analysis of misses in imaging of head and neck pathology by attending neuroradiologists at a single tertiary academic medical centre. Clin Radiol 2021; 76:786.e9–786.e13
15. Biddle G, Assadsangabi R, Broadhead K, Hacein-Bey L, Ivanovic V. Diagnostic errors in cerebrovascular pathology: retrospective analysis of a neuroradiology database at a large tertiary academic medical center. AJNR 2022; 43:1271–1278
16. Vong S, Chang J, Assadsangabi R, Ivanovic V. Analysis of perceptual errors in skull-base pathology. Neuroradiol J 2022 Jun 18 [published online]
17. Patel SH, Stanton CL, Miller SG, Patrie JT, Itri JN, Shepherd TM. Risk factors for perceptual-versus-interpretative errors in diagnostic neuroradiology. AJNR 2019; 40:1252–1256
18. Ivanovic V, Paydar A, Chang YM, et al. Impact of shift volume on neuroradiology diagnostic errors at a large tertiary academic center. Acad Radiol 2022 Sep 27 [published online]
19. Ivanovic V, Assadsangabi R, Hacein-Bey L, et al. Neuroradiology diagnostic errors at a tertiary academic centre: effect of participation in tumour boards and physician experience. Clin Radiol 2022; 77:607–612
20. Goldberg-Stein S, Frigini LA, Long S, et al. ACR RADPEER committee white paper with 2016 updates: revised scoring system, new classifications, self-review, and subspecialized reports. J Am Coll Radiol 2017; 14:1080–1086
21. Agresti A. Categorical data analysis, 3rd ed. Wiley & Sons, 2012
22. Sokolovskaya E, Shinde T, Ruchman RB, et al. The effect of faster reporting speed for imaging studies on the number of misses and interpretation errors: a pilot study. J Am Coll Radiol 2015; 12:683–688
23. Krupinski EA, Berbaum KS, Caldwell RT, Schartz KM, Kim J. Long radiology workdays reduce detection and accommodation accuracy. J Am Coll Radiol 2010; 7:698–704
24. Krupinski EA, Berbaum KS, Caldwell RT, Schartz KM, Madsen MT, Kramer DJ. Do long radiology workdays affect nodule detection in dynamic CT interpretation? J Am Coll Radiol 2012; 9:191–198
25. Ruutiainen AT, Durand DJ, Scanlon MH, Itri JN. Increased error rates in preliminary reports issued by radiology residents working more than 10 consecutive hours overnight. Acad Radiol 2013; 20:305–311
26. Hatzoglou V, Omuro AM, Haque S, et al. Second-opinion interpretations of neuroimaging studies by oncologic neuroradiologists can help reduce errors in cancer care. Cancer 2016; 122:2708–2714
27. Lee MH, Schemmel AJ, Pooler BD, et al. Workflow dynamics and the imaging value chain: quantifying the effect of designating a nonimage-interpretive task workflow. Curr Probl Diagn Radiol 2017; 46:275–281
28. Balint BJ, Steenburg SD, Lin H, Shen C, Steele JL, Gunderman RB. Do telephone call interruptions have an impact on radiology resident diagnostic accuracy? Acad Radiol 2014; 21:1623–1628
29. The Royal College of Radiologists website. Clinical radiology workload: guidance on radiologists' reporting figures. www.rcr.ac.uk. Accessed Oct 1, 2022
30. Japanese College of Radiology webpage. JCR statement on appropriate workload of radiologists. jcr.or.jp/english-page. Accessed Oct 1, 2022
31. Viertel VG, Babiarz LS, Carone M, Lewin JS, Yousem DM. Quality control in neuroradiology: impact of trainees on discrepancy rates. AJNR 2012; 33:1032–1036
32. Lian K, Bharatha A, Aviv RI, Symons SP. Interpretation errors in CT angiography of the head and neck and the benefit of double reading. AJNR 2011; 32:2132–2135

Information & Authors

Published In

American Journal of Roentgenology
PubMed: 36988269

History

Accepted: March 23, 2023
Version of record online: March 29, 2023

Keywords

  1. quality improvement
  2. radiologic errors
  3. workload

Authors

Affiliations

Vladimir Ivanovic, MD
Department of Radiology, Section of Neuroradiology, Medical College of Wisconsin, Milwaukee, WI
Kenneth Broadhead
Department of Statistics, Colorado State University, Fort Collins, CO
Ryan Beck, MD
Department of Radiology, Section of Neuroradiology, Medical College of Wisconsin, Milwaukee, WI
Yu-Ming Chang, MD, PhD
Department of Radiology, Section of Neuroradiology, Beth Israel Deaconess Medical Center, Boston, MA
Alireza Paydar, MD
Department of Radiology, Section of Neuroradiology, University of California Davis Medical Center, Sacramento, CA
Garrick Biddle, MD
Department of Radiology, Section of Neuroradiology, University of California Davis Medical Center, Sacramento, CA
Lotfi Hacein-Bey, MD
Department of Radiology, Section of Neuroradiology, University of California Davis Medical Center, Sacramento, CA
Lihong Qi, PhD
Department of Public Health Sciences, School of Medicine, University of California Davis, Davis, CA

Notes

Corresponding author: Vladimir Ivanovic, MD, Medical College of Wisconsin, Department of Radiology, 8701 Watertown Plank Road, Milwaukee, WI 53226; Phone: 414-805-3122; Fax: 414-259-9290; [email protected]
