Abstract
Objectives:
To examine whether radiologists’ performances are consistent throughout a reading session and whether any changes in performance over the reading task differ depending on experience of the reader.
Methods:
The performance of ten radiologists reading a test set of 60 mammographic cases without breaks was assessed using a 2 × 3 factorial ANOVA design. Participants were categorized as more (≥2,000 mammogram readings per year) or less (<2,000 readings per year) experienced. Three series of 20 cases were chosen to ensure comparable difficulty and presented in the same sequence to all readers. A radiologist typically takes around 30 min to complete each 20-case series, for a total of approximately 90 min for the 60 mammographic cases. The sensitivity, specificity, lesion sensitivity, and area under the ROC curve were calculated for each series. We hypothesized that the order in which a series was read (i.e. fixed-series sequence) would have a significant main effect on the participants’ performance. We also determined whether significant interactions existed between the fixed-series sequence and radiologist experience.
Results:
Significant linear interactions were found between experience and the fixed sequence of the series for sensitivity (F[1] = 5.762, p = .04, partial η2 = .41) and lesion sensitivity (F[1] = 6.993, p = .03, partial η2 = .46). The two groups’ mean scores were similar for the first series but progressively diverged. By the end of the third series, significant differences in sensitivity and lesion sensitivity were evident, with the more experienced individuals demonstrating improving and the less experienced declining performance. Neither experience nor series sequence significantly affected the specificity or the area under the ROC curve.
Conclusions:
Radiologists’ performance may change considerably during a reading session, apparently as a function of experience, with less experienced radiologists declining in sensitivity and lesion sensitivity while more experienced radiologists actually improve. With the increasing demands on radiologists to undertake high-volume reporting, we suggest that junior radiologists be made aware of possible sensitivity and lesion sensitivity deterioration over time so they can schedule breaks during continuous reading sessions that are appropriate to them, rather than try to emulate their more experienced colleagues.
Advances in knowledge:
Less-experienced radiologists demonstrated a reduction in mammographic diagnostic accuracy in later stages of the reporting sessions. This may suggest that extending the duration of reporting sessions to compensate for increasing workloads may not represent the optimal solution for less-experienced radiologists.
Introduction
Breast cancer represents 11.7% of all cancer cases worldwide. With an estimated 2.3 million new cases in 2020, it has now passed lung cancer as the world’s most commonly diagnosed cancer.1 Mammography screening plays a crucial role in the early detection of breast cancer2: breast cancer detected early is considered potentially curable, as females have a 96% chance of surviving five years and a 90.1% chance of surviving 10 years after diagnosis.3 However, due to a shortage of available radiologists, breast radiologists in many countries (including Australia, the United Kingdom, and the United States) are under increasing pressure to read more mammograms in their daily clinical practices.4–7 The consequent long reading sessions and ever-expanding cognitive load may result in interpretation errors and missed diagnoses,8,9 and such errors have been attributed to vigilance decrement, mental fatigue, and a decrease in target detection sensitivity as the time spent on the task increases.5,8–12
The vigilance decrement phenomenon was noticed early, as it was the focus of the research of the British psychologist Norman Mackworth during World War II, when he was appointed to investigate why British radar operators demonstrated an increased tendency to miss critical signals, such as enemy submarines, at a rate of approximately 10–15% after only 30 min on the watch.13–15 Since then, numerous laboratory psychological trials16 and studies in various domains9,17 have empirically examined the phenomenon. Notably, definitions of vigilance decrement vary, and while researchers in radiology have variously defined it as a reduction in sensitivity or cancer detection rates,8–10,18 it has been defined differently in other fields, such as in cognitive psychology, where it is considered a reduced ability to sustain attention on a given task.19 In this paper, we use a definition from previous research in radiology: ‘A vigilance decrement is a decline in sensitivity to detect targets with time on task’.9
Vigilance decrement is often considered distinct from fatigue, which has been described in previous radiology research and could also affect radiologists’ performance during readings.11,12,20 Fatigue is a feeling of weariness from physical or mental exertion that is characterized by a reduced capacity for activities and reduced efficacy in those activities that are performed.11,20 Cognitive psychologists divide fatigue into sleep-related (endogenous) and task-related (exogenous) factors, depending on its source.21 Previous studies on fatigue in radiologists have focused on endogenous causes, including time of day,22 circadian influences,23 length of time awake,24,25 and amount of recent sleep.24,26 However, few studies have focused on specific task-related causes of fatigue, including how a radiologist’s performance changes from the beginning of a typical reading session to the end. Such exogenous fatigue is more prevalent for tasks that have high cognitive demands, require intense and prolonged concentration, and involve repetitive activities, and these same factors that lead to fatigue may also result in vigilance decrement and poorer performance.19,27–32 Critically, these characteristics are common features of mammographic interpretation, and because vigilance decrement and fatigue can cause important interpretation errors, the impact on radiologists’ mammographic screening performance warrants investigation.9
We identified eight radiologic studies that investigated how radiologists’ performance might change during a radiologic reading session,9,18,33–39 but a number of these studies did not consider mammogram images, and there was no consensus among the remainder as to the presence of performance changes during the session. Taylor-Phillips, Stinton and Krupinski8 hypothesized that this lack of consensus is attributable to radiologists’ experience levels, but no study has yet focused upon experience or expertise as a factor in radiologists’ susceptibility to performance changes during reading sessions. The current work addresses this gap by investigating performance changes during a mammographic reading session and the relationship between such changes and radiologists’ experience levels.
Methods and materials
Ethical approval, informed consent, and participant confidentiality
Ethical approval was obtained from the University of Sydney Human Research Ethics Committee, Project No. 2019/169. Participants electronically signed informed consent forms, whereby they agreed to the use of their anonymized data for research purposes and potential publication in peer-reviewed journals. Each participant was assigned a unique identifier, and the researchers were blinded to the participants’ identities.
Study participants and procedures
The reading performance of ten Australian and New Zealand radiologists was prospectively assessed in a workshop setting. The radiologists were recruited during the 2019 Royal Australian and New Zealand College of Radiologists (RANZCR) 70th Annual Conference in Auckland, New Zealand, through an advertisement on the conference website. A test set of 60 mammographic cases from the BreastScreen Reader Assessment Strategy (BREAST) software was used to assess the radiologists’ performance and to collect reader-specific demographic data and information on the number of mammogram cases read annually by each participant.40
BREAST
The BREAST program is used regularly in Australia and New Zealand by BreastScreen radiologists. It is based on digital screen reading test sets that aim to assess clinicians’ accuracy in detecting breast cancer and in assessing cancer-free cases. Each BREAST test set includes 60 de-identified digital mammography cases (40 normal and 20 abnormal). Each case contains a craniocaudal and a mediolateral oblique view of each breast. Normal examinations are cases that were not recalled at screening, or were recalled and confirmed as benign, or confirmed normal by two experienced readers. All normal examinations included a follow-up of normal screening mammograms obtained in the succeeding screening round to confirm their true-negative status. Abnormal examinations consist of recalled cases confirmed to have malignancies by two experienced readers and later by biopsy examination.40
Study test set
It is well established that mammographic cases vary in their levels of difficulty, which impacts radiologists’ performance.41 Accordingly, the test set used consisted of three series of 20 cases in which efforts were made to ensure a similar average level of difficulty between the series. This was done to guarantee that any significant differences identified between any two series were less likely to be associated with difficulty levels and more likely to reflect radiologist performance levels. The method we used to maintain similar difficulty between the three series is explained in the next paragraph.
The cases were obtained from six BREAST test sets previously read by a total of 711 radiologists (each read by 55–232 radiologists), either online or in workshops held during RANZCR Annual Conferences from 2013 to 2018. In total, there were 360 mammographic cases. We assessed the difficulty level of each case by calculating its easy-to-diagnose index (ETDI). The ETDI was defined as the proportion of correct reports per case, with higher ETDI values indicating greater proportions of correct interpretations.42 For the cancer cases, the ETDIs were based on correct identification of positive mammograms and correct localization of the malignant lesion within a 50-pixel radius from the lesion center. Of the 360 cases, three series of 20 cases were selected to create the current test set. The ages of the screened females whose mammographic exams were used ranged from 43 to 88 years (mean = 61.38, standard deviation = 8.63), and the selection included a random mix of mammographic densities based on BI-RADS density ratings: four women had fatty breasts (0–25% glandular), 27 had fibroglandular breasts (25%–50% glandular), 27 had heterogeneous breasts (50%–75% glandular), and two had extremely dense breasts (75%–100% glandular). The difficulty level of each series of 20 cases was similar to that of the other two series, based on their ETDI scores. Furthermore, the three series contained similar distributions of normal and abnormal cases. The first series included 13 normal versus seven cancer cases, the next series had 14 normal versus six cancer cases, and the last series contained 13 normal versus seven cancer cases.
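As a rough illustration of the ETDI definition above, the following Python sketch computes the index for a single case and checks the 50-pixel localization criterion. The function names, the input layout, and the choice to treat a mark exactly 50 pixels from the lesion center as correct are assumptions made for this example; they are not taken from the BREAST software.

```python
from typing import Sequence, Tuple


def etdi(correct_flags: Sequence[bool]) -> float:
    """Easy-to-diagnose index: proportion of readers who reported the case correctly."""
    if not correct_flags:
        raise ValueError("at least one reading is required")
    return sum(correct_flags) / len(correct_flags)


def lesion_within_radius(
    marked_xy: Tuple[float, float],
    lesion_xy: Tuple[float, float],
    radius_px: float = 50.0,
) -> bool:
    """True if the reader's mark lies within radius_px of the lesion center.

    Assumption: the boundary itself (distance == radius_px) counts as correct.
    """
    dx = marked_xy[0] - lesion_xy[0]
    dy = marked_xy[1] - lesion_xy[1]
    return dx * dx + dy * dy <= radius_px * radius_px


# Hypothetical readings of one case by four readers (True = correct report).
example_readings = [True, True, False, True]
print(etdi(example_readings))
```

A higher value means that a larger share of previous readers interpreted the case correctly, which is why the index can be used to balance difficulty across the three series.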
To confirm that there were no significant variations in difficulty or readability, we compared the previously calculated ETDIs across the three series. Normal and cancer cases were compared separately, and the latter were again analyzed in terms of positive image identification and malignancy localization. A Kruskal-Wallis test was used to compare the three series with the level of statistical significance set to .05 (two-tailed). The results showed no significant variations between the three series of images in reading difficulty for normal cases (K-W χ2(2) = .052, p = .975) or cancer cases in terms of either positive image identification (K-W χ2(2) = .268, p = .875) or malignancy localization (K-W χ2(2) = .055, p = .973).
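The comparison above can be sketched in plain Python. This is a from-scratch illustration of the Kruskal-Wallis H statistic with the usual tie correction, not the SPSS routine used in the study, and the ETDI values in the example are hypothetical. With three groups the reference chi-square distribution has 2 degrees of freedom, for which the upper-tail probability reduces to exp(−H/2); applying that formula to the reported statistics gives values close to the reported p values, with small differences attributable to rounding of the statistic.

```python
import math
from typing import Dict, Sequence


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    if n < 2:
        raise ValueError("need at least two observations")
    rank_of: Dict[float, float] = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_df2(h: float) -> float:
    """Upper-tail chi-square probability; valid only for 2 degrees of freedom."""
    return math.exp(-h / 2)


# Hypothetical ETDI values for three short series (not study data).
series_1 = [0.62, 0.71, 0.80, 0.55]
series_2 = [0.60, 0.74, 0.78, 0.58]
series_3 = [0.65, 0.70, 0.82, 0.52]
h_stat = kruskal_wallis_h([series_1, series_2, series_3])
print(h_stat, p_value_df2(h_stat))

# Reported statistics from the text, converted to p values.
for chi2 in (0.052, 0.268, 0.055):
    print(chi2, round(p_value_df2(chi2), 3))
```

For designs with more than three groups, the shortcut in `p_value_df2` no longer applies and a general chi-square distribution function would be needed.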
Test set reading
Readings took place in the BREAST workshop between 8 a.m. and 8 p.m. Session time slot randomization was not possible; the times were selected by the participants or allocated according to their availability. The reading room was carefully controlled in terms of ambient lighting, workstation design, viewing platform, and furniture arrangement to simulate a clinical environment. Each radiologist was given two hours to review the test set using the BREAST platform. The mean time to complete the BREAST test set in our study was approximately 90 min, which is similar to that in another study that used multiple BREAST test sets,43 so we assumed that it takes approximately half an hour to complete each series of 20 mammographic cases. The 60 selected cases (three series of 20 cases each) were presented in the same sequence (fixed-series sequence) to all readers, who read all the images without breaks. The 60 mammographic cases were therefore read continuously, without division or interruption between the three series of 20 cases. Details on the review settings and reading room preparations for BREAST workshops have been published previously.22,43,44
Performance data, sequential position of the series of 20 mammographic cases, and radiologists’ experience levels
Once the radiologists had assessed the test set of 60 mammographic cases in a single session, we calculated the following performance metrics (dependent variables) for each of the three series of 20 cases: sensitivity, specificity, lesion sensitivity, and area under the receiver operating characteristic curve (ROC AUC). Thus, each radiologist’s performance was measured on the same dependent variable on three occasions (first, second, and third series of 20 cases). The participants were divided according to their experience based on the number of annual mammographic readings into the two following groups:
More experienced: radiologists who read ≥2,000 mammograms per year.
Less experienced: radiologists who read <2,000 mammograms per year.
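The per-series metrics and the experience grouping described above might be computed along the following lines. This is a minimal sketch: the function names and the 0/1 encoding of truth and reader calls are assumptions, and lesion sensitivity and ROC AUC are omitted because they require lesion locations and confidence ratings that are not modeled here.

```python
from typing import List, Sequence


def sensitivity(truth: Sequence[int], calls: Sequence[int]) -> float:
    """True positives divided by all cancer cases (truth == 1)."""
    positives = [c for t, c in zip(truth, calls) if t == 1]
    if not positives:
        raise ValueError("no cancer cases in this series")
    return sum(positives) / len(positives)


def specificity(truth: Sequence[int], calls: Sequence[int]) -> float:
    """True negatives divided by all normal cases (truth == 0)."""
    negatives = [c for t, c in zip(truth, calls) if t == 0]
    if not negatives:
        raise ValueError("no normal cases in this series")
    return sum(1 for c in negatives if c == 0) / len(negatives)


def experience_group(annual_reads: int) -> str:
    """Apply the 2,000-readings-per-year threshold used in the study."""
    return "more" if annual_reads >= 2000 else "less"


def split_into_series(items: Sequence, size: int = 20) -> List[list]:
    """Cut a reading sequence into consecutive series of `size` cases."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


# Hypothetical five-case example (1 = cancer / recalled, 0 = normal / not recalled).
truth = [1, 1, 0, 0, 0]
calls = [1, 0, 0, 0, 1]
print(sensitivity(truth, calls), specificity(truth, calls))
print(experience_group(2500), experience_group(800))
```

Applying `split_into_series` to a reader's 60 responses yields the three 20-case series on which each metric is then evaluated separately.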
Statistical analysis
All data were imported into IBM SPSS Statistics version 24 (IBM, Armonk, NY, USA). We conducted an analysis of variance (ANOVA) using a 2 (group) × 3 (test series) factorial design, with repeated measures on the second factor, to examine the potential for significant variations in performance for both groups (more and less experienced radiologists) while reading the three series presented in a fixed-series sequence, looking for any differences in performance across the series according to experience. The primary hypothesis regarding the main effect of the fixed-series sequence on the radiologists’ performance was tested by analyzing whether the performance by all radiologists was consistent throughout the reading session. Secondarily, we also examined for an interaction between the fixed-series sequence and the radiologists’ experience by analyzing whether performance trends across the series differed between the two groups.
We performed the factorial ANOVA using planned linear and quadratic contrasts; the significance level was set to α = .05. While a linear main effect would identify a significant difference between the first and third series in the radiologists’ average performance, a quadratic main effect would identify a significant difference between the average scores in the second series versus the first or third series. A linear interaction would determine whether both experience groups started at approximately the same performance level in the first series but showed significantly different levels by the end of the third series. A quadratic interaction would determine whether the two experience groups had significantly different trajectories (bendiness) from the first to third series.
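The planned contrasts described above can be illustrated with the standard orthogonal polynomial weights for three equally spaced levels, (−1, 0, 1) for the linear trend and (1, −2, 1) for the quadratic trend. The sketch below relies on a known equivalence rather than on the authors' SPSS procedure: with two groups, the group × contrast interaction F equals the square of a pooled-variance two-sample t statistic computed on each reader's contrast score. The reader scores are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple

LINEAR = (-1, 0, 1)      # third series minus first series
QUADRATIC = (1, -2, 1)   # first and third series versus the second


def contrast_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of one reader's scores across the three series."""
    return sum(w * s for w, s in zip(weights, scores))


def interaction_f(
    group_a: Sequence[Sequence[float]],
    group_b: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[float, int]:
    """F statistic (1, df_error) for the group x contrast interaction."""
    a = [contrast_score(reader, weights) for reader in group_a]
    b = [contrast_score(reader, weights) for reader in group_b]
    df_error = len(a) + len(b) - 2
    pooled_var = (
        (len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)
    ) / df_error
    standard_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    t_stat = (mean(a) - mean(b)) / standard_error
    return t_stat ** 2, df_error


# Hypothetical sensitivity scores (%) per reader for series 1, 2, and 3.
more_experienced = [(80, 90, 95), (75, 95, 85), (85, 90, 100)]
less_experienced = [(80, 75, 60), (75, 80, 65), (78, 70, 55)]

f_linear, df_error = interaction_f(more_experienced, less_experienced, LINEAR)
f_quadratic, _ = interaction_f(more_experienced, less_experienced, QUADRATIC)
print(df_error, f_linear, f_quadratic)
```

A large linear-interaction F indicates that the change from the first to the third series differs between the two groups, which is the pattern the study set out to test.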
The assumptions that must be met when performing factorial ANOVA were satisfied. These were measuring dependent variables (i.e. performance metrics) at the continuous level, having independent variables (i.e. the sequential position of the series of 20 mammographic cases) that consisted of at least two categorical and related groups (i.e. each subject measured on three occasions on the same dependent variable), not having significant outliers in the related groups, and having normally distributed residuals from the ANOVA models.
Results
Demographic and clinical characteristics of the participants
Table 1 presents the demographic details of the participating radiologists along with their clinical characteristics and annual experience with mammogram reading.
Table 1.
Demographic and Clinical Characteristics of the Participating Radiologists
| | More experiencedᵃ | Less experiencedᵇ | Total |
|---|---|---|---|
| Number of radiologists | 5 | 5 | 10 |
| Age (y) | 58.7 (12.1) | 38 (2.6) | 49.8 (14) |
| No. of years since qualification as a radiologist | 26 (12.1) | 9.2 (6.2) | 17.6 (12.6) |
| No. of years of reading mammograms | 22.2 (11.1) | 5.5 (2.6) | 14.7 (11.9) |
Note: Values are means (standard deviations).
ᵃ Radiologists who read ≥2000 mammograms per year.
ᵇ Radiologists who read <2000 mammograms per year.
Factorial ANOVA results
The ANOVA results revealed two significant linear interactions between experience and the fixed-series sequence: one for sensitivity and the other for lesion sensitivity. As shown in the profile plots in Figures 1 and 2, the effects were opposite for the two experience groups. The mean scores of the two groups were similar in the first series of 20 but then diverged, with those of the more experienced group tending to increase and those of the less experienced group tending to decrease. Conversely, ANOVA revealed no influence of experience or the fixed-series sequence on specificity or the ROC AUC.
Figure 1.

Profile plots of sensitivity over the first, second, and third series of 20 mammogram cases according to radiologists’ experience.
Figure 2.

Profile plots of lesion sensitivity over the first, second, and third series of 20 mammogram cases according to radiologists’ experience.
Main effect of the sequential presentation of mammographic cases (fixed-series sequence) on the entire sample of radiologists
The ANOVA revealed no linear or quadratic main effect of the fixed-series sequence for the entire sample of radiologists for any performance metric (Table 2).
Table 2.
Main Effect of the Fixed-Series Sequence for the Three Series of 20 Mammographic Cases on the Entire Sample of Radiologists
| Performance Metric | First Series | Second Series | Third Series | Linear Main Effect (pᵃ) | Quadratic Main Effect (pᵃ) |
|---|---|---|---|---|---|
| Sensitivity | 78.55 (4.04) | 84.98 (6.56) | 77.12 (5.53) | .833 | .230 |
| Lesion Sensitivity | 71.77 (6.40) | 81.64 (6.01) | 77.12 (5.53) | .443 | .158 |
| Specificity | 83.06 (2.61) | 76.38 (3.91) | 81.52 (3.85) | .663 | .159 |
| ROC AUC | 0.833 (0.01) | 0.856 (0.03) | 0.838 (0.02) | .870 | .529 |
ROC AUC, area under the receiver operating characteristic curve.
Note: Values are means (standard errors).
ᵃ p = significance level (two-tailed).
Interaction between experience and the fixed-series sequence
Table 3 displays the effects of the linear and quadratic interaction between experience and the fixed-series sequence on radiologists’ performance. Significant linear interactions were found between experience and the fixed-series sequence in terms of sensitivity (F1 = 5.762, p = .043, partial η2 = .419) and lesion sensitivity (F1 = 6.993, p = .030, partial η2 = .466; Figures 1 and 2, Table 3). The interaction shows that more and less experienced readers had similar mean performance in the first series but diverged thereafter, with the mean performance of the more experienced readers tending to improve and that of the less experienced readers tending to decline. In contrast, the linear interactions were not significant in terms of specificity (F1 = 5.128, p = .053, partial η2 = .391; Figure 3 and Table 3) or ROC AUC (F1 = 2.642, p = .143, partial η2 = .248; Figure 4 and Table 3). No significant quadratic interaction effect on the radiologists’ performance was evident between experience and the fixed-series sequence (Table 3).
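The reported effect sizes can be cross-checked against the F statistics with the usual identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below assumes an error term with 8 degrees of freedom (10 readers minus 2 groups); this value is not stated in the text and is an assumption of the example, although it reproduces the reported figures to three decimals.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumption: error degrees of freedom = 10 readers - 2 experience groups.
DF_EFFECT = 1
DF_ERROR = 8

# F statistics for the linear interaction, as reported in the text.
reported_f = {
    "sensitivity": 5.762,
    "lesion sensitivity": 6.993,
    "specificity": 5.128,
    "ROC AUC": 2.642,
}

for metric, f_value in reported_f.items():
    print(metric, round(partial_eta_squared(f_value, DF_EFFECT, DF_ERROR), 3))
```

Agreement between these recomputed values and the reported ones is a consistency check on the reporting only; it does not by itself confirm the underlying analysis.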
Table 3.
Interaction between experience and the fixed-series sequence for the three series of 20 mammographic cases
| Performance Metric | Experience Group | First Series | Second Series | Third Series | Linear Interactionᵃ (pᵈ) | Quadratic Interactionᵃ (pᵈ) |
|---|---|---|---|---|---|---|
| Sensitivity | Moreᵇ | 79.9 (5.72) | 93.3 (9.28) | 94.2 (7.83) | .043ᵉ | .866 |
| | Lessᶜ | 77.1 (5.72) | 76.6 (9.28) | 59.9 (7.83) | | |
| Lesion Sensitivity | Moreᵇ | 71.4 (9.05) | 89.9 (8.50) | 94.2 (7.83) | .030ᵉ | .991 |
| | Lessᶜ | 72.1 (9.05) | 73.3 (8.50) | 59.9 (7.83) | | |
| Specificity | Moreᵇ | 84.6 (3.6) | 77.1 (5.53) | 75.3 (5.44) | .053 | .448 |
| | Lessᶜ | 81.5 (3.69) | 75.6 (5.5) | 87.6 (5.44) | | |
| ROC AUC | Moreᵇ | 0.841 (0.02) | 0.910 (0.04) | 0.899 (0.03) | .143 | .562 |
| | Lessᶜ | 0.824 (0.02) | 0.802 (0.04) | 0.777 (0.03) | | |
ROC AUC, area under the receiver operating characteristic curve.
Note: Values are means (standard errors).
ᵃ Interaction between radiologists’ experience and the sequential position of the three series of 20 mammographic cases.
ᵇ Radiologists who read ≥ 2,000 mammograms per year.
ᶜ Radiologists who read < 2,000 mammograms per year.
ᵈ p = significance level (two-tailed).
ᵉ Significant at the .05 level (two-tailed).
Figure 3.

Profile plots of specificity over the first, second, and third series of 20 mammogram cases according to radiologists’ experience.
Figure 4.
Profile plots of ROC AUC over the first, second, and third series of 20 mammogram cases according to radiologists’ experience.
Discussion
Today’s high volumes of radiologic reporting and long mammographic reporting sessions can entail heavy cognitive load, causing performance to deteriorate over time. This work shows that while more and less experienced radiologists started with similar sensitivity and lesion sensitivity levels at the beginning of a mammographic reading session, they progressively diverged, showing significant differences by the end of the third series of readings. Specifically, the less experienced radiologists’ sensitivity and lesion sensitivity levels decreased as readings progressed, while the more experienced radiologists’ levels increased. Changes in specificity in the opposite directions were also observed, but these were statistically non-significant.
The decline observed in both sensitivity measures among less experienced radiologists in the last 20 cases suggests a decline in performance that could be due to vigilance decrement (that is, a decline in target detection sensitivity as time spent on task increases) or fatigue. Our findings are in line with studies in other domains. Donald et al,17 who examined potential vigilance decrements in operators of real-time closed-circuit television (CCTV) surveillance, found that the performance of novices and generalists (i.e. operators with no or little training and experience in CCTV surveillance tasks) declined after 30 min on the task, whereas the performance of specialists (i.e. operators with more training and experience) remained stable for the first hour. This performance decline in less experienced personnel may be associated with neurophysiological features, such as increased cerebral blood flow velocity (a surrogate of attentional resource utilization), as novices expend greater effort than experts when performing vigilance tasks.17,45,46 This can lead to mental shortcuts in decision-making by less experienced radiologists and subsequent diagnostic errors.47 By contrast, experienced individuals use mental shortcuts only for efficiency purposes in common tasks and thus make more efficient use of their attentional resources to minimize cognitive load, which may even improve performance as the task progresses,17,45,46 as in our study. The improvement among the more experienced readers was unexpected and cannot be explained by the available data; we believe that these readers would eventually have reached a plateau, as fatigue must ultimately set in and performance decline.
However, performance improvements similar to those observed in our study for more experienced readers have been explained in non-radiologic domains by the ability of experts to adapt to a task and adjust their expectations of a signal being present, resulting in improved perceptual sensitivity.17,48 This may then explain, at least in part, why earlier work focusing on mammography screening, which involved only experienced radiologists, found no evidence of performance deterioration.18
In another study on vigilance decrement during mammographic reading sessions, Cowley and Gale33 found a significant decrease in sensitivity after an hour of interpreting a mammographic film set and suggested that a typical breast screening session should last no more than 70 to 80 min. Vigilance decrement appeared to have begun in that study at around the same time as it did in our study, since an hour would roughly correspond to the time taken to read 40 cases.43 It should be noted, however, that Cowley and Gale made no distinction between more and less experienced radiologists and that their study used film mammograms while ours used digital mammograms. Comparisons should therefore be drawn with care, although the tasks were roughly similar despite the differences in technology.
Fatigue is the other factor that might explain the decline in sensitivity and lesion sensitivity observed only for less experienced radiologists11,12,20 — radiological studies consistently demonstrate that less experienced radiologists are more impacted by fatigue than experienced radiologists.38,49 Krupinski et al38 found that junior radiologists are considerably more susceptible to fatigue-related problems and experience sharper declines in diagnostic accuracy than their senior counterparts. With experience, radiologists gradually develop coping strategies to manage fatigue, such as the use of holistic processing, through which valuable information can be obtained from a quick scan of a 2D image.49 This technique is only valuable in 2D images, such as in the mammograms used in our study, as 3D images (e.g. computed tomography or digital breast tomosynthesis) do not have all the information necessary for holistic processing in a single slice. However, while it may be argued that endogenous characteristics associated with fatigue had an impact on our results, exogenous factors induced by the task (which was the focus of this work), such as reduced attention span, boredom, and recall threshold shift, cannot be excluded.27,50 In addition, we tried to minimize the impact of endogenous features associated with fatigue by scheduling the readings at various times during normal workday hours and on days when participants had no strenuous clinical commitments, as they were attending a conference. The fact that both groups of observers started at a similar level of performance, but then diverged, would suggest that the results are indeed primarily impacted by the task.
This study has certain limitations. Firstly, it was based on test set interactions, which do not accurately reflect clinical practice, since disease prevalence is higher, and decisions have no real-life consequences.51 However, previous studies have shown a high correlation between test set results and actual clinical performance.44,52 Also, it can be argued that more studies with larger numbers of radiologists are needed to confirm our results; we would agree, but the clear changes in performance, even between the two groups of five radiologists shown here, would suggest that a real issue exists that warrants further attention. Therefore, based on our findings, junior radiologists should be informed that long reading sessions (i.e. perhaps longer than 1 h) might potentially affect their performance, and they should decide if they want to take earlier breaks after a certain period of continuous reading. It may be worth noting that relevant guidelines are in place in other domains, such as the aviation industry, where air traffic controllers’ operational duty times are strictly regulated to ensure aircraft passenger and personnel safety.5 It is also interesting to note that in other fields, multiple breaks during a work day have been shown to reduce adverse incidents53 without necessarily reducing productivity.54
In conclusion, our findings suggest that in the course of a mammographic reading session, radiologists’ performance and behavior may change significantly, and these changes may be a function of experience. Observed decreases in the sensitivity and lesion sensitivity of less experienced readers as they progressed through the reading, in contrast to more experienced readers, suggest that introducing strategies around case batch length may help to optimize performance. Although further research is needed to fully understand the role of experience, in the modern context of high-volume radiologic reporting and shortages of breast radiologists, radiologists, especially less experienced ones, should be made aware of such decreases in sensitivity and lesion sensitivity as they progress through the reading and should act to minimize this possible deterioration during long reading sessions (i.e. perhaps longer than 1 h) without a break.
Footnotes
Funding: This work was funded by a PhD scholarship from the Ministry of Higher Education in the Kingdom of Saudi Arabia.
Contributor Information
Abdulaziz S Alshabibi, Email: aals3276@uni.sydney.edu.au.
Moayyad E Suleiman, Email: moe.suleiman@sydney.edu.au.
Salman M Albeshan, Email: Salbeshan@ksu.edu.sa.
Robert Heard, Email: rob.heard@sydney.edu.au.
Patrick C Brennan, Email: patrick.brennan@sydney.edu.au.
REFERENCES
- 1.Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021; 71: 209–49. doi: 10.3322/caac.21660 [DOI] [PubMed] [Google Scholar]
- 2.Dibden A, Offman J, Duffy SW, Gabe R. Worldwide review and meta-analysis of cohort studies measuring the effect of mammography screening programmes on incidence-based breast cancer mortality. Cancers 2020; 12: 976. doi: 10.3390/cancers12040976 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Maiz C, Silva F, Domínguez F, Galindo H, Camus M, León A, et al. Mammography correlates to better survival rates in breast cancer patients: a 20-year experience in a university health institution. Ecancermedicalscience 2020; 14: 1005. doi: 10.3332/ecancer.2020.1005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Gulland A. Staff shortages are putting UK breast cancer screening "at risk," survey finds. BMJ 2016; 353: i2350. doi: 10.1136/bmj.i2350 [DOI] [PubMed] [Google Scholar]
- 5.Reicher J, Currie S, Birchall D. Safety of working patterns among UK neuroradiologists: what can we learn from the aviation industry and cognitive science? Br J Radiol 2018; 91: 20170284. doi: 10.1259/bjr.20170284 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Debono JC, Poulos AE, Houssami N, Turner RM, Boyages J. Evaluation of radiographers' mammography screen-reading accuracy in Australia. J Med Radiat Sci 2015; 62: 15–22. doi: 10.1002/jmrs.59 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Wing P, Langelier MH. Workforce shortages in breast imaging: impact on mammography utilization. AJR Am J Roentgenol 2009; 192: 370–8. doi: 10.2214/AJR.08.1665 [DOI] [PubMed] [Google Scholar]
- 8.Taylor-Phillips S, Stinton C, Krupinski EA. Ergonomics 2.0: fatigue in medical imaging. In: Samei E, Krupinski E. A, eds.The handbook of medical image perception and techniques. Cambridge, UK: Cambridge University Press; 2018. pp. 483–94. [Google Scholar]
- 9. Taylor-Phillips S, Elze MC, Krupinski EA, Dennick K, Gale AG, Clarke A, et al. Retrospective review of the drop in observer detection performance over time in lesion-enriched experimental studies. J Digit Imaging 2015; 28: 32–40. doi: 10.1007/s10278-014-9717-9
- 10. Taylor-Phillips S, Stinton C. Fatigue in radiology: a fertile area for future research. Br J Radiol 2019; 92: 20190043. doi: 10.1259/bjr.20190043
- 11. Stec N, Arje D, Moody AR, Krupinski EA, Tyrrell PN. A systematic review of fatigue in radiology: is it a problem? AJR Am J Roentgenol 2018; 210: 799–806. doi: 10.2214/AJR.17.18613
- 12. Nihashi T, Ishigaki T, Satake H, Ito S, Kaii O, Mori Y, et al. Monitoring of fatigue in radiologists during prolonged image interpretation using fNIRS. Jpn J Radiol 2019; 37: 437–48. doi: 10.1007/s11604-019-00826-2
- 13. Mackworth NH. Researches on the measurement of human performance (Med. Res. Council, Special Rep. Ser. No. 268). Oxford, UK: His Majesty’s Stationery Office; 1950. pp. 156.
- 14. Hancock PA. In search of vigilance: the problem of iatrogenically created psychological phenomena. Am Psychol 2013; 68: 97–109. doi: 10.1037/a0030214
- 15. Flanagan J, Nathan-Roberts D. Theories of vigilance and the prospect of cognitive restoration. Proc Hum Factors Ergon Soc Annu Meet 2019; 63: 1639–43. doi: 10.1177/1071181319631506
- 16. See JE, Howe SR, Warm JS, Dember WN. Meta-analysis of the sensitivity decrement in vigilance. Psychol Bull 1995; 117: 230–49. doi: 10.1037/0033-2909.117.2.230
- 17. Donald F, Donald C, Thatcher A. Work exposure and vigilance decrements in closed circuit television surveillance. Appl Ergon 2015; 47: 220–8. doi: 10.1016/j.apergo.2014.10.001
- 18. Taylor-Phillips S, Wallis MG, Jenkinson D, Adekanmbi V, Parsons H, Dunn J, et al. Effect of using the same vs different order for second readings of screening mammograms on rates of breast cancer detection: a randomized clinical trial. JAMA 2016; 315: 1956–65. doi: 10.1001/jama.2016.5257
- 19. Warm JS, Parasuraman R, Matthews G. Vigilance requires hard mental work and is stressful. Hum Factors 2008; 50: 433–41. doi: 10.1518/001872008X312152
- 20. Krupinski E, Reiner BI. Real-time occupational stress and fatigue measurement in medical imaging practice. J Digit Imaging 2012; 25: 319–24. doi: 10.1007/s10278-011-9439-1
- 21. May JF, Baldwin CL. Driver fatigue: the importance of identifying causal factors of fatigue when considering detection and countermeasure technologies. Transportation Research Part F: Traffic Psychology and Behaviour 2009; 12: 218–24. doi: 10.1016/j.trf.2008.11.005
- 22. Alshabibi AS, Suleiman ME, Tapia KA, Heard R, Brennan PC. Impact of time of day on radiology image interpretations. Clin Radiol 2020; 75: 746–56. doi: 10.1016/j.crad.2020.05.004
- 23. Gale AG, Murray D, Millar K, Worthington BS. Circadian variation in radiology. Adv Psychol 1984; 22: 313–21.
- 24. Alshabibi AS, Suleiman ME, Tapia KA, Heard R, Brennan PC. Impact of hours awake and hours slept at night on radiologists' mammogram interpretations. J Am Coll Radiol 2021; 18: 730–8. doi: 10.1016/j.jacr.2020.12.023
- 25. Patel AG, Pizzitola VJ, Johnson CD, Zhang N, Patel MD. Radiologists make more errors interpreting off-hours body CT studies during overnight assignments as compared with daytime assignments. Radiology 2020; 297: 374–9. doi: 10.1148/radiol.2020201558
- 26. Maeda E, Yoshikawa T, Hayashi N, Akai H, Hanaoka S, Sasaki H, et al. Radiology reading-caused fatigue and measurement of eye strain with critical flicker fusion frequency. Jpn J Radiol 2011; 29: 483–7. doi: 10.1007/s11604-011-0585-7
- 27. Pattyn N, Neyt X, Henderickx D, Soetens E. Psychophysiological investigation of vigilance decrement: boredom or cognitive fatigue? Physiol Behav 2008; 93(1–2): 369–78. doi: 10.1016/j.physbeh.2007.09.016
- 28. Körber M, Cingel A, Zimmermann M, Bengler K. Vigilance decrement and passive fatigue caused by monotony in automated driving. Procedia Manuf 2015; 3: 2403–9. doi: 10.1016/j.promfg.2015.07.499
- 29. Smit AS, Eling PATM, Coenen AML. Mental effort causes vigilance decrease due to resource depletion. Acta Psychol 2004; 115: 35–42. doi: 10.1016/j.actpsy.2003.11.001
- 30. McWilliams T, Ward N. Underload on the road: measuring vigilance decrements during partially automated driving. Front Psychol 2021; 12: 631364. doi: 10.3389/fpsyg.2021.631364
- 31. Hu X, Lodewijks G. Exploration of the effects of task-related fatigue on eye-motion features and its value in improving driver fatigue-related technology. Transportation Research Part F: Traffic Psychology and Behaviour 2021; 80(3–4): 150–71. doi: 10.1016/j.trf.2021.03.014
- 32. Grier RA, Warm JS, Dember WN, Matthews G, Galinsky TL, Parasuraman R. The vigilance decrement reflects limitations in effortful attention, not mindlessness. Hum Factors 2003; 45: 349–59. doi: 10.1518/hfes.45.3.349.27253
- 33. Cowley HC, Gale AG. Time-of-day effects on mammographic film reading performance. In: Kundel HL, ed. Medical imaging 1997: image perception. Proceedings of the SPIE 3036 conference series in medical imaging, 1997 April 16, Newport Beach, CA, USA. US: Society of Photo-Optical Instrumentation Engineers; 1997.
- 34. Taylor-Phillips S, Wallis MG, Duncan A, Gale AG. Use of prior mammograms in the transition to digital mammography: a performance and cost analysis. Eur J Radiol 2012; 81: 60–5. doi: 10.1016/j.ejrad.2010.10.025
- 35. Mello-Thoms C. The holistic grail: possible implications of an initial mistake in the reading of digital mammograms. In: Proceedings of the SPIE 7263, medical imaging 2009: image perception, observer performance, and technology assessment, 2009 March 12. Lake Buena Vista, FL, USA; 2009.
- 36. Mello-Thoms C. How much agreement is there in the visual search strategy of experts reading mammograms? In: Proceedings of the SPIE 6917, medical imaging 2008: image perception, observer performance, and technology assessment, 2008 March 6. San Diego, CA, USA; 2008.
- 37. Krupinski EA, Berbaum KS, Caldwell RT, Schartz KM, Kim J. Long radiology workdays reduce detection and accommodation accuracy. J Am Coll Radiol 2010; 7: 698–704. doi: 10.1016/j.jacr.2010.03.004
- 38. Krupinski EA, Berbaum KS, Caldwell RT, Schartz KM, Madsen MT, Kramer DJ. Do long radiology workdays affect nodule detection in dynamic CT interpretation? J Am Coll Radiol 2012; 9: 191–8. doi: 10.1016/j.jacr.2011.11.013
- 39. Backmann HA, Larsen M, Danielsen AS, Hofvind S. Does it matter for the radiologists' performance whether they read short or long batches in organized mammographic screening? Eur Radiol 2021; 10 Jun 2021. doi: 10.1007/s00330-021-08010-9
- 40. Brennan P, Warwick L, Tapia K. Breast screen reader assessment strategy (BREAST): a research infrastructure with a translational objective. In: Samei E, Krupinski EA, eds. The handbook of medical image perception and techniques. Cambridge, UK: Cambridge University Press; 2018. pp. 343–56.
- 41. Suleiman WI, McEntee MF, Lewis SJ, Rawashdeh MA, Georgian-Smith D, Heard R, et al. In the digital era, architectural distortion remains a challenging radiological task. Clin Radiol 2016; 71: e35–40. doi: 10.1016/j.crad.2015.10.009
- 42. Mazurowski MA. Difficulty of mammographic cases in the context of resident training: preliminary experimental data. In: Proceedings of the SPIE 8673, medical imaging 2013: image perception, observer performance, and technology assessment, 2013 March 28. Lake Buena Vista, FL, USA; 2013.
- 43. Tapia KA, Rickard MT, McEntee MF, Garvey G, Lydiard L, Brennan PC. Impact of breast density on cancer detection: observations from digital mammography test sets. IJRRT 2020; 7: 36–41. doi: 10.15406/ijrrt.2020.07.00261
- 44. Soh BP, Lee WB, Mello-Thoms C, Tapia K, Ryan J, Hung WT, et al. Certain performance values arising from mammographic test set readings correlate well with clinical audit. J Med Imaging Radiat Oncol 2015; 59: 403–10. doi: 10.1111/1754-9485.12301
- 45. Shaw TH, Satterfield K, Ramirez R, Finomore V. Using cerebral hemovelocity to measure workload during a spatialised auditory vigilance task in novice and experienced observers. Ergonomics 2013; 56: 1251–63. doi: 10.1080/00140139.2013.809154
- 46. Shaw TH, Harwood AE, Satterfield K, Finomore V. Chapter 6 - Transcranial Doppler sonography in neuroergonomics. In: Ayaz H, Dehais F, eds. Neuroergonomics. New York: Academic Press; 2019. pp. 35–42.
- 47. Itri JN, Patel SH. Heuristics and cognitive error in medical imaging. AJR Am J Roentgenol 2018; 210: 1097–105. doi: 10.2214/AJR.17.18907
- 48. Smith SL, Helton WS, Matthews G, Funke GJ. Performance, hemodynamics, and stress in a two-day vigilance task: practical and theoretical implications. Hum Factors 2021: 187208211011333.
- 49. Waite S, Grigorian A, Alexander RG, Macknik SL, Carrasco M, Heeger DJ, et al. Analysis of perceptual expertise in radiology - current knowledge and a new perspective. Front Hum Neurosci 2019; 13: 213. doi: 10.3389/fnhum.2019.00213
- 50. Thomson DR, Besner D, Smilek D. A critical examination of the evidence for sensitivity loss in modern vigilance tasks. Psychol Rev 2016; 123: 70–83. doi: 10.1037/rev0000021
- 51. Gur D, Rockette HE. Performance assessments of diagnostic systems under the FROC paradigm: experimental, analytical, and results interpretation issues. Acad Radiol 2008; 15: 1312–5. doi: 10.1016/j.acra.2008.05.006
- 52. Soh BP, Lee W, McEntee MF, Kench PL, Reed WM, Heard R, et al. Screening mammography: test set data can reasonably describe actual clinical reporting. Radiology 2013; 268: 46–53. doi: 10.1148/radiol.13122399
- 53. Chen C, Xie Y. The impacts of multiple rest-break periods on commercial truck driver's crash risk. J Safety Res 2014; 48: 87–93. doi: 10.1016/j.jsr.2013.12.003
- 54. Dababneh AJ, Swanson N, Shell RL. Impact of added rest breaks on the productivity and well being of workers. Ergonomics 2001; 44: 164–74. doi: 10.1080/00140130121538

