Abstract
Objectives
The quest to measure and improve diagnosis has proven challenging; new approaches are needed to better understand and measure key elements of the diagnostic process in clinical encounters. The aim of this study was to develop a tool for assessing key elements of the diagnostic assessment process and to apply it to a series of diagnostic encounters, examining clinical notes and the recorded transcripts of those encounters. Additionally, we aimed to correlate and contextualise these findings with measures of encounter time and physician burnout.
Design
We audio-recorded encounters, reviewed their transcripts, linked them to the corresponding clinical notes and correlated the findings with concurrent Mini Z Worklife measures and physician burnout.
Setting
Three primary urgent-care settings.
Participants
We conducted in-depth evaluations of 28 clinical encounters delivered by seven physicians.
Results
Comparing encounter transcripts with clinical notes, 24 of 28 (86%) showed high note/transcript concordance for the diagnostic elements on our tool. Reliably included elements were red flags (92% of notes/encounters), aetiologies (88%), likelihood/uncertainties (71%) and follow-up contingencies (71%), whereas psychosocial/contextual information (35%) and mention of common pitfalls (7%) were often missing. In 22% of encounters, follow-up contingencies were in the note but absent from the recorded encounter. Physicians with higher burnout scores tended to be less likely to address key diagnostic items, such as psychosocial history/context.
Conclusions
A new tool shows promise as a means of assessing key elements of diagnostic quality in clinical encounters. Work conditions and physician reactions appear to correlate with diagnostic behaviours. Future research should continue to assess relationships between time pressure and diagnostic quality.
Keywords: PRIMARY CARE, QUALITATIVE RESEARCH, Quality in health care
Strengths and limitations of this study
- This study examined communication of verbal and written diagnostic processes in actual clinical encounters using a novel assessment tool.
- Data were triangulated from audio-recorded encounters for acute care problems, clinical documentation and measures of work conditions and clinician burnout.
- This study has a limited sample size due to COVID-19 constraints.
- The findings may not be generalisable to other healthcare settings or other types of encounters.
Introduction
Improving diagnosis is a top priority for medical quality and patient safety.1 Based on data from patient surveys, malpractice claims and polling of safety professionals,2–5 reinforced by a report from the National Academy of Medicine (NAM),6 diagnostic errors and the imperative for diagnosis quality improvement have been recognised as important needs.7 While there have been multiple reports on epidemiology and magnitude of diagnostic errors,8–10 there has been a paucity of in-depth qualitative studies examining diagnostic processes to clarify what actually occurs during clinical encounters, as well as clinicians’ decision-making as represented in their clinical notes.11 12 Further, most qualitative studies examining clinician behaviour are paper scenario-based exercises rather than examining the more complex realm of real-world diagnostic encounters.13 14
A central challenge in improving diagnosis has been the issue of measurement: how to reliably, efficiently and effectively measure diagnosis quality.15–17 Unlike other areas of quality and safety, creating a standardised method for measuring diagnostic errors has proven challenging.18 19 In this regard, NAM and the National Quality Forum cautioned that the field of diagnostic metrics is still ‘emerging’ and ‘needing further exploration and development’, before any implementation.6 20 Measuring simpler, narrower aspects of the diagnostic process, such as what per cent of ordered tests are performed and/or followed-up when abnormal, is easier to conceptualise and measure than crafting metrics evaluating clinicians’ diagnostic reasoning or verbal or written communication, including evaluating quality of clinical notes in the electronic health record.15 21–24 In this study, we aimed to examine these more difficult-to-measure elements of diagnostic process by designing and applying a tool to evaluate physicians’ diagnostic assessments during urgent care encounters as well as in their clinical notes.25 26
Because diagnosis often occurs in time-constrained, stressful and chaotic settings, we sought to contextualise our findings with a more complete understanding of conditions where clinicians performed their diagnostic work.16 27 28 In addition to applying our tool to assess elements of the diagnostic process, we collected data on time physicians took for encounters as well as perceptions of time adequacy and work conditions by querying them regarding the time they felt was required for specific patient encounters. We concurrently measured these physicians’ level of stress and burnout using previously validated standardised tools.29
Thus, our study aimed to:
- Iteratively develop, refine and apply a tool to assess key elements of diagnostic assessment in actual clinical encounters;
- Apply the tool to evaluate a series of diagnostic encounters by examining audiotaped encounter transcripts and the associated clinical notes; and
- Correlate and contextualise findings with measures of encounter time and clinician perceptions of work conditions and burnout.
Methods
Study design
The MD-SOS study (Medical Diagnosis: Safety or Stress) was designed to better understand diagnosis quality and its relationship to clinician stress and burnout. We iteratively developed and refined a tool, based on a previously piloted instrument, to examine selected elements of clinical notes and that visit’s audiotaped encounter. We conceptualised and built the tool based on key diagnostic elements suggested in the literature and prior work of the investigator team around diagnosis quality (table 1).
Table 1.
Inclusion of diagnostic elements assessment tool items of the encounter and note
| Assessment method | Clinical notes | Transcripts of audiotaped encounters |
| --- | --- | --- |
| Scoring | 5-point scale: 0—absent, 1—minimal, 2—less than good, 3—neutral, 4—good, 5—excellent | Present, absent |
| Items assessed | 20 items | 15 items |
| Comparing between transcripts and corresponding notes | 1, 2—absent; 3, 4, 5—present | Present, absent |
| Diagnostic assessment items | 5 items | 6 items |
| Addresses chief issue | + | + |
| Includes differential diagnosis | + | + |
| Mentions aetiologies | + | + |
| Delving into aetiologies | – | + |
| Contextual/psychosocial info mentioned | + | + |
| Comment on likelihood, uncertainties | + | + |
| Diagnostic follow-up plan items mentioned/noted | 4 items | 5 items |
| Diagnostic test | + | + |
| Explanation of test/rationale | – | + |
| Contingencies: what to watch for | + | + |
| Expected time frames (eg, for sx improvement) | + | + |
| Rationale | + | + |
| Situational awareness and ‘safety net’ items | 3 items | 3 items |
| Red flags | + | + |
| Do not miss diagnosis | + | + |
| Pitfalls | + | + |
| Global assessment of documentation quality | 5 items | 1 item |
| Succinctness | + | – |
| Clinician readability | + | – |
| Pt readability | + | – |
| Pejorative comments* | + | + |
| Excessive templating/copy-paste | + | – |
| Global assessment of diagnostic assessment in note | 3 items | – |
| Quality of diagnosis | + | – |
| Needed tests ordered | + | – |
| Avoids over-testing | + | – |

*Excluded from analysis because no instances of pejorative language were found in any note or transcript.

pt, patient; sx, symptom.
These included elements such as explicitly addressing chief concern(s), listing a differential diagnosis, commenting on probabilities (degree of certainty and associated uncertainties), mention of presence (or absence) of diagnostic red flags, consideration of ‘don’t miss’ diagnoses,30 31 mention of relevant pitfalls (eg, test limitations or atypical presentations) to consider,32 diagnostic follow-up testing, plans and time frames, mention of specific parameters for patients to monitor, with expected time frames33 34 and avoidance of pejorative language.35 36 The tool, Inclusion of Diagnostic Elements Assessment (IDEA) was designed to evaluate diagnostic assessment and documentation and consisted of five categories with 15 items for assessing diagnostic quality: (1) completeness and quality of assessment, (2) diagnostic situational awareness with potential red flags and don’t miss diagnoses, (3) adequacy of diagnostic follow-up plan, (4) overall note quality (organisation, succinctness) and (5) global subjective rating of documentation diagnosis quality. Data collection took place between November 2019 and March 2020. (The study was terminated early due to institutional COVID-19 restrictions imposed on in-person visits and research).
We examined the tool’s construct validity, including content, response process, internal structure, pairwise correlation, relations to other variables and consequences.37 Our analysis suggested good content validity, given that the items were based on our previous work and the extensive experience of experts in the field. The items of the tool were rated on a standard 5-point Likert scale, and the scoring process was carefully documented, with all raters keeping track of their comments and questions. Internal consistency/reliability was satisfactory (inter-rater kappa=0.8), and the consequences aspect of validity (clinical utility) showed a favourable ability to categorise charts by diagnostic quality. We could not assess relations to other variables (external validity) since our tool is a novel instrument. We also examined pairwise correlations across the 12 items that were used to assess both the clinical notes and transcribed encounters. The correlations ranged from −0.18 (differential diagnosis and chief issue) to 0.69 (psychosocial context and differential diagnosis), with a mean absolute correlation across all items of 0.26. Given the low pairwise correlations (r<0.8), we assume low collinearity between items.
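The pairwise correlation analysis described above can be reproduced with standard tools. The sketch below is illustrative only: the item names are shorthand for IDEA items and the scores are randomly generated rather than study data. It shows how pairwise correlations, the mean absolute correlation and a simple collinearity screen (|r|≥0.8) might be computed.

```python
# Illustrative sketch (not the study's analysis code): pairwise item correlations,
# mean absolute correlation and a collinearity screen for IDEA-style items.
# Item names are shorthand and scores are randomly generated for demonstration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
items = ["chief_issue", "differential_dx", "aetiologies", "psychosocial", "uncertainty", "red_flags"]
# Rows = encounters, columns = items (0/1 presence codes here, purely for illustration)
scores = pd.DataFrame(rng.integers(0, 2, size=(28, len(items))), columns=items)

corr = scores.corr(method="pearson")                     # pairwise item correlations
off_diag = corr.where(~np.eye(len(items), dtype=bool))   # blank out the diagonal
mean_abs_corr = off_diag.abs().stack().mean()            # mean absolute pairwise correlation

print(corr.round(2))
print(f"Mean |r| across item pairs: {mean_abs_corr:.2f}")
# Screen for collinearity using the paper's r < 0.8 rule of thumb
print("Item pairs with |r| >= 0.8:", int((off_diag.abs() >= 0.8).to_numpy().sum() // 2))
```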
Study setting, participants and recruitment
Study participants were primary care physicians and their patients, recruited from Brigham and Women’s Hospital at three different urgent care settings. Two sites were affiliated with primary care clinics, and one was a walk-in urgent care centre affiliated with the emergency department.
Through a convenience sample, physicians were approached by the principal investigator (PI) (GS) via an email explaining the study and inviting them to participate. Of the 10 physicians invited, 7 agreed to participate. They completed a written consent form which explained the study and permitted them to raise and resolve any questions or concerns prior to patient recruitment. After obtaining consent, two researchers (MK and JR) attended sessions in the physician’s clinic where they approached patients in the waiting room prior to encounters to request permission to record their encounter. The researchers explained to patients that the study was about quality improvement, and they reviewed with them protections to ensure confidentiality. Demographics about patients were not collected, to fully anonymise the audiotaped encounters. To mitigate any bias, the researchers mentioned their specific role in the study without providing additional information about personal views. Participating patients and physicians were provided US$25 and US$75 honoraria, respectively.
Procedures
For consenting patients, we audio-recorded clinical encounters and collected accompanying visit notes from the electronic health record, or EHR (Epic, Verona, Wisconsin, USA). A research assistant started the recorder and timed the visit length but was not in the room during the encounter. At the end of each session, digital speech files from the encounters were collected and stored in secure, password-protected file areas. Digital audio recordings were transcribed using a specially designed interface with Amazon Web Services that redacts HIPAA (Health Insurance Portability and Accountability Act)-protected health information, such as patient and physician names. A trained research assistant (JR) manually edited each transcript for readability and any speech recognition errors. Only encounters with English-speaking patients were included.
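The study describes a specially designed interface with Amazon Web Services but does not publish its implementation. As a hedged illustration only, AWS Transcribe's built-in PII redaction could be invoked roughly as below; the bucket names, file key, job name and region are hypothetical and not taken from the study.

```python
# Hypothetical sketch: the study's actual AWS interface is not described in detail.
# AWS Transcribe supports automatic PII redaction, which can strip names and other
# identifiers from the transcript it produces.
import boto3

transcribe = boto3.client("transcribe", region_name="us-east-1")  # region is an assumption

transcribe.start_transcription_job(
    TranscriptionJobName="encounter-001",                             # hypothetical job name
    Media={"MediaFileUri": "s3://example-bucket/encounter-001.mp3"},  # hypothetical audio file
    MediaFormat="mp3",
    LanguageCode="en-US",
    OutputBucketName="example-redacted-transcripts",                  # hypothetical output bucket
    ContentRedaction={
        "RedactionType": "PII",
        "RedactionOutput": "redacted",   # keep only the redacted transcript
        "PiiEntityTypes": ["NAME"],      # eg, redact patient and physician names
    },
)
```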
To explore connections between work conditions, stress and burnout and diagnostic quality (element inclusion), physicians completed a Worklife and Wellness Mini Z survey form27 28 shortly after the session where their visits were recorded. At the end of each encounter, researchers asked physicians about the perceived time needed for the encounter.
Data analysis
The study included two groups of coders. Clinical notes were analysed by the first group, comprising two coders (a trained male research assistant (JR) and a general internist, the PI (GS)), who reviewed the written notes. Encounter transcripts were independently coded by two PhD-level social science female researchers (MK and EES) using a modified version of the IDEA tool, along with additional qualitative coding using NVivo V.12 software.
Clinical notes
We adapted the IDEA chart review tool from our previous study (Assessment of The Assessment)38 and pilot-tested the coding agreement and clarified operational definitions by having five team members each analyse three notes and then resolve discrepancies. The two coders then coded five additional notes applying the tool’s 15 items. Disagreements were resolved to ensure consistent coding. The research assistant coded the remaining notes, conferring with the PI (GS) to resolve uncertainties, difficult judgements or questions. Issues arising in the tool refinement and coding of notes were further discussed by the entire team (three general internists and three PhD qualitative researchers). Ten charts were independently re-reviewed by one study physician to assess inter-rater reliability (kappa=0.8).
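As a minimal sketch of how the reported inter-rater reliability could be computed, Cohen's kappa is available in scikit-learn; the ratings below are hypothetical, not the study's data.

```python
# Illustrative sketch: Cohen's kappa for one dichotomised IDEA item across the
# 10 re-reviewed charts. Ratings are hypothetical (1 = element present, 0 = absent).
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # primary coder
rater_b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]  # study physician re-review

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")  # the study reported kappa = 0.8 overall
```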
Encounter transcripts
Transcripts were entered into NVivo and deductively and inductively analysed.39–41 The researchers met after independently coding five transcripts to compare findings and reconcile discrepancies. The PhD-level researchers met bi-weekly with the larger research team to revise the coding scheme, ensure consistent coding application and reach consensus on disagreements. The clinical dialogue transcripts were qualitatively analysed by two PhD researchers to (1) identify diagnostic utterances in the conversation, noting their content and when they occurred (using the transcript time stamps) and in what general phase of the encounter (eg, initial history taking, physical examination, wrap-up assessment and plan), and (2) evaluate the presence or absence of elements from the IDEA tool. Given the difficulty of rating the conversations more precisely with the tool’s 5-point Likert scale (which was designed for EHR note review), for the verbal conversations we instead dichotomised applicable IDEA items as present versus absent. Hence, if an item was present anywhere in the conversation, it was marked as present. Two researchers independently coded each transcript, then met to reconcile coding decisions. The researchers reached 100% agreement after each review and reconciliation meeting. IDEA items applicable only to the written note (eg, documentation completeness, succinctness, readability) were analysed separately from the transcribed conversations.
Comparing clinical notes and encounter transcripts
After analysis was completed, two coders met to cross-reference coded ratings between notes and transcripts to compare similarities and discrepancies.
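A minimal sketch of this comparison, assuming the table 1 convention that note ratings of 1–2 count as absent and 3–5 as present, is shown below; the item names and scores are hypothetical.

```python
# Illustrative sketch: dichotomise note ratings (1-2 absent, 3-5 present, per table 1)
# and compare them with transcript presence codes for the items shared by both.
def note_present(likert_score: int) -> bool:
    """Dichotomise a 5-point note rating: 1-2 -> absent, 3-5 -> present."""
    return likert_score >= 3

def concordance(note_scores: dict, transcript_present: dict) -> float:
    """Fraction of shared items coded the same way in the note and the transcript."""
    shared = note_scores.keys() & transcript_present.keys()
    agree = sum(note_present(note_scores[item]) == transcript_present[item] for item in shared)
    return agree / len(shared)

# Hypothetical single encounter
note = {"chief_issue": 4, "red_flags": 5, "psychosocial": 1, "contingencies": 3}
transcript = {"chief_issue": True, "red_flags": True, "psychosocial": False, "contingencies": False}
print(f"Concordance: {concordance(note, transcript):.0%}")  # 75% in this example
```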
Comparison with encounter time and Mini Z and Worklife responses
To evaluate relationships between note and encounter quality, time taken, perceived adequacy of time allotted and physician burnout, we analysed these data for each physician and encounter and compared results to determine (1) the extent to which diagnostic process elements were present in verbal encounter transcripts and the accompanying written notes (concordance), (2) how actual time taken and perceived time needed related to each other and (3) the strength of relationships between inclusion of diagnostic process elements (using a summary measure incorporating both verbal and written note quality) and physician burnout, measured by the well-validated Mini Z Worklife survey instrument (whose single-item burnout measure correlated most strongly, in prior studies, with emotional exhaustion).42 We rated physicians as highly stressed if they answered agree or strongly agree to ‘I feel a great deal of stress because of my job’ and as burned out if they answered affirmatively to one or more of the following items: ‘I am definitely burning out and have one or more symptoms of burnout’, ‘The symptoms of burnout that I’m experiencing won’t go away, I think about work frustrations a lot’ or ‘I feel completely burned out, I am at the point where I may need to seek help’.
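The stress and burnout classification rules above translate directly into a simple decision function; the sketch below follows the item wording in the text, with hypothetical responses.

```python
# Illustrative sketch of the high-stress and burnout classification rules described above.
BURNOUT_ITEMS = [
    "I am definitely burning out and have one or more symptoms of burnout",
    "The symptoms of burnout that I'm experiencing won't go away, I think about work frustrations a lot",
    "I feel completely burned out, I am at the point where I may need to seek help",
]

def is_highly_stressed(stress_response: str) -> bool:
    """'I feel a great deal of stress because of my job' answered agree or strongly agree."""
    return stress_response in {"agree", "strongly agree"}

def is_burned_out(responses: dict) -> bool:
    """Affirmative answer to one or more of the three burnout items."""
    return any(responses.get(item, False) for item in BURNOUT_ITEMS)

# Hypothetical physician endorsing the first burnout item
example = {BURNOUT_ITEMS[0]: True}
print(is_highly_stressed("agree"), is_burned_out(example))  # True True
```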
Because diagnostic reasoning is linked to individual physicians, data were aggregated at the physician level, which limited the power and scope of any statistical analysis. Hence, all results presented here are descriptive and trend-oriented.
Diagnostic elements concordance
The IDEA tool was applied to transcripts and notes to provide insight regarding which items were frequently communicated by physicians only verbally during the encounter versus only in the clinical note, versus present in both. If items were present in more than two-thirds of notes and encounters, we considered them ‘reliably included’, whereas items that were present in less than two-thirds of encounters were referred to as ‘often missed’. Items that were almost never present were deemed ‘widely missed’.
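These categories can be expressed as a small thresholding function. The two-thirds cut-off follows the text; the paper does not quantify ‘almost never present’, so the 10% cut-off below is an assumption for illustration.

```python
# Illustrative sketch of the inclusion-rate categories described above.
def categorise(present_count: int, total_encounters: int, rare_cutoff: float = 0.10) -> str:
    rate = present_count / total_encounters
    if rate >= 2 / 3:
        return "reliably included"
    if rate <= rare_cutoff:   # assumed threshold for "almost never present"
        return "widely missed"
    return "often missed"

print(categorise(26, 28))  # eg, red flags -> 'reliably included'
print(categorise(2, 28))   # eg, pitfalls -> 'widely missed'
```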
Patient and public involvement
None.
Results
This study was originally designed to include 100 encounters by audiotaping 5 encounters for each of 20 physicians, in order to detect a correlation coefficient of 0.3 at a significance level of 0.05.43 Due to COVID-19 constraints, we collected data on 28 patient encounters among seven physicians before March 2020, after which data collection was halted. A total of 42 patients were approached and 28 (67%) agreed to participate. For each physician, between one and five patients were consented. Five physicians practiced in general medicine and two in emergency medicine. Table 2 shows diagnostic process elements captured in notes and transcripts for each physician as compared with the average across the seven physicians. In figure 1, physicians reporting high stress and/or burnout are represented with red outlines and tended to demonstrate different documentation profiles. For example, physician 2, compared with physician 3, verbally communicated and documented in their notes fewer psychosocial and contextual elements, mentions of diagnostic uncertainty and comments on the time frame for expected improvement.
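As a rough check on the planned sample size, the standard Fisher-z approximation for detecting a correlation coefficient (as in Hulley et al) can be applied. The study specifies r=0.3 and alpha=0.05; the assumed power is not reported, so the 80% and 90% values below are assumptions for illustration, bracketing the planned 100 encounters.

```python
# Illustrative sample-size check for detecting a correlation, using the Fisher-z
# approximation. Power values are assumptions; the paper reports only r and alpha.
from math import ceil, log

from scipy.stats import norm

def n_for_correlation(r: float, alpha: float, power: float) -> int:
    c = 0.5 * log((1 + r) / (1 - r))    # Fisher z-transform of the target correlation
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    return ceil(((z_alpha + z_beta) / c) ** 2 + 3)

print(n_for_correlation(0.3, 0.05, 0.80))   # ~85 encounters
print(n_for_correlation(0.3, 0.05, 0.90))   # ~113 encounters
```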
Table 2.
Diagnostic process elements present in both the clinical note and corresponding transcribed audiotaped encounter, by physician
| Physician ID | P1 | P2 | P3 | P4 | P5 | P6 | P7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of patient visits | 3 | 4 | 5 | 5 | 1 | 5 | 5 |
| Self-reported burnout status (1=yes, 0=no) | 1 | 1 | 0 | 0 | 1 | 1 | 0 |
| Diagnostic assessment | | | | | | | |
| Addresses chief issue (element present in visits/total no. of visits) | 2/3 | 4/4 | 5/5 | 5/5 | 1/1 | 5/5 | 5/5 |
| Includes differential diagnosis related to chief issue | 1/3 | 2/4 | 4/5 | 4/5 | | 5/5 | 3/5 |
| Addresses/notes contextual and/or psychosocial information | 0 | 0 | 3/5 | 4/5 | 0 | 0 | 3/5 |
| Discusses possible aetiologies of the issues | 3/3 | 3/4 | 5/5 | 5/5 | 1/1 | 4/5 | |
| Addresses degree of certainty/uncertainty | 1/3 | 2/4 | 4/5 | 5/5 | 1/1 | 0 | 5/5 |
| Diagnostic follow-up plan | | | | | | | |
| Mentions diagnostic tests (laboratory, imaging) | 1/3 | 3/4 | 2/5 | 4/5 | 1/1 | 2/5 | 3/5 |
| Contingencies discussed | 0 | 2/4 | 5/5 | 4/5 | 1/1 | 3/5 | 5/5 |
| Time frames discussed | 1/3 | 3/4 | 5/5 | 3/5 | 0 | 0 | 5/5 |
| Includes rationale | 0 | 3/4 | 3/5 | 5/5 | 1/1 | 3/5 | 5/5 |
| Situational awareness/safety nets | | | | | | | |
| Red flags | 3/3 | 3/4 | 5/5 | 5/5 | 1/1 | 4/5 | 5/5 |
| ‘Don’t miss diagnoses’ (worst case scenarios) considered/noted | 0 | 2/4 | 3/5 | 4/5 | 1/1 | 3/5 | 3/5 |
| Pitfalls considered, noted, commented | 0 | 1/4 | 1/5 | 0 | 0 | 0 | 0 |
| Global assessment of quality | | | | | | | |
| Avoids legal liability/pejorative language | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Figure 1.
Diagnostic process elements present in transcript and note per physician.
Elements reliably included or rarely missed
Addressing the patient’s chief issue was present in all but one encounter and was reliably included across physicians (75%–100% of visits). ‘Red flags’ were frequently included in both the note and encounter across physicians. Another prominently featured item was exploration of aetiology, as well as how clinical issues/symptoms influenced patients’ daily routines. Importantly, addressing likely diagnoses along with degree of certainty or uncertainty was noted in both the transcript and notes in 75% of encounters. Mention of contingencies (what to watch for if symptoms worsen) was present in 70% of encounters and corresponding notes. However, in 22% of encounters, such contingencies were present in the note but absent from the verbal encounter conversation (and in 6 of 28 notes the contingency was simply a generic sentence ‘to return if worse’, which was often not communicated verbally to the patient during the visit).
Elements sometimes missing
Four items were frequently absent from encounter conversations as well as the notes. The ‘time frames’ item (the expected duration of symptoms and the desired time frame within which the patient should follow up) was present only 50% of the time across physicians and visits. Two additional less frequently noted items were ‘don’t miss diagnoses’ and ‘differential diagnosis’ (means of 58%). The item ‘discussing diagnostic testing’ (commenting on tests performed at this visit or on tests that the provider was ordering as part of the plan) was noted in transcripts and notes on average 60% of the time across physicians and encounters.
Often missed items
Two items were absent in most cases. ‘Psychosocial/contextual information’, which we defined broadly as any mention of these issues (including financial context, mention of mental health, relationships and emotional context, or simply how the symptom/illness impacted day-to-day functioning), was present in only 29% of visits. Less stressed physicians addressed psychosocial context more frequently (almost 67% of encounters) than burned out/highly stressed physicians (for whom psychosocial information was not recorded in any encounters). Diagnostic pitfalls (eg, test limitations, atypical presentation) were noted in only 6% of encounters.
Work conditions
In analysing work conditions, including time pressure, we included five physicians, as two physicians practiced in urgent care embedded in emergency departments and did not have prespecified visit lengths.
Figure 2 shows the relationship between actual time taken per visit (face-to-face time with the physician) and perceived time needed for the visit. Of 18 visits, 11 took between 15 and 25 min, and 7 were longer (longest visit 38 min). For most of the longer visits, the actual time taken matched the physician’s perception that additional time was needed, but this was not uniform. Burnout status (reporting high stress or burnout) did not appear to be systematically related to either perceived or actual time taken for the visit.
Figure 2.
Analysis of work condition: relation between perceived time needed and time taken for each visit per physician.
Additional qualitative review of encounter transcripts: physicians’ diagnostic engagement during the clinical visit
Transcripts were analysed inductively to better understand physicians’ diagnostic engagement with patients, especially the ways they inquired about the patient’s clinical problem. This analysis focused only on the transcripts, examining the conversation and the extent to which physicians solicited and communicated diagnostic information with patients. We examined three aspects: (1) depth of delving into aetiologies (scored present if physicians asked more than one question about an aetiology, or absent if no inquiries were made); (2) acknowledging psychosocial elements (scored present if physicians acknowledged patients’ psychosocial expressions, or absent if not acknowledged); (3) whether red flag symptoms were queried during the encounter (dichotomised as present if 4 or 5, or absent if less than 3). As shown in table 3, non-burned out/low stress physicians (white columns) were more inclined to inquire about and acknowledge what patients shared than burned out/highly stressed physicians (grey columns).
Table 3.
Shortcomings in physician diagnostic conversations from transcribed encounters
| Additional elements | P1 | P2 | P3 | P4 | P5 | P6 | P7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Delving into aetiologies | 2/3 | 1/4 | 4/5 | 5/5 | 0 | 2/5 | 5/5 |
| Acknowledging psychosocial elements | 0 | 0 | 2/5 | 4/5 | 0 | 0 | 1/5 |
| Number of red flags | 2/3 | 0 | 5/5 | 5/5 | 1/1 | 2/5 | 5/5 |
| Average of three elements | 55% | 8% | 73% | 93% | 33% | 27% | 73% |
White columns represent physicians who were not burned out; physicians in grey columns were burned out.
Discussion
Defining diagnosis quality is challenging for multiple reasons, including the highly subjective and variable nature of patients’ symptoms, clinical interactions and disease manifestations.44–46 However, key elements of a good diagnostic process have been recommended by experts and can serve as the basis for designing a tool to measure conformance with these recommended qualities, at least in physicians’ notes. Further, we reassuringly found that most (86%) of the recorded clinical encounters were highly concordant with their corresponding notes in terms of inclusion of key diagnostic elements as assessed by the tool we developed. This is reassuring for health services and quality research, which often must rely solely on chart review.47 48
Our findings from both note and transcript review suggest that clinicians could do a better job in paying attention to several recommended items that were omitted in half or more of the encounters. These include attention to psychosocial history and context, consideration of ‘don’t miss diagnoses’ and documenting and communicating follow-up contingencies.31–34 In addition, despite their oft-lauded importance, there is room to improve consideration of ‘don’t miss’ diagnoses and development of a differential diagnosis.
This study is novel in examining diagnostic elements present in a clinical note while triangulating data from recorded encounters and measures of work conditions and clinician reactions (eg, length of encounter, burnout and stress). Although limited by our small sample size, the findings suggest a trend toward higher quality, in both the note and what is communicated to the patient during the encounter, when physicians have and take the time they feel is needed. When perceived time required more closely matched actual time taken (though not necessarily allotted), IDEA scores were higher. The findings also indicate a trend for low stress/non-burned out physicians to more often document diagnostic tests, delve into aetiologies and, most importantly, address psychosocial elements, suggesting there may be measurable differences in diagnosis quality between non-burned out/low stress and burned out/highly stressed clinicians, particularly around communicating key elements with their patients, verbally and in their notes. It is noteworthy that this study is the first to measure burnout in real time. While stress is a factor that clearly varies day-to-day and from patient-to-patient, burnout is considered a longer-term stress reaction; thus, the impact of acute stresses on burnout is uncertain and has not been studied.
These worklife findings should serve as a signal for concern, adding to other warning signs related to stresses in primary care.49–53 Diagnosis is foundational for high-quality care, and linking lower diagnosis quality with burnout is an important hypothesis generated by this work. Equally important is the interaction between burnout and diagnosis documentation quality, as well as the vexing interaction between documentation itself and burnout. This two-way interaction, in which burnout may affect the quality of diagnosis documentation on one hand, and documentation burdens themselves may contribute to burnout on the other, calls for more transformative thinking about the EHR and workflow redesign.54 55
Limitations
The study has several limitations. First, it was an exploratory qualitative study based on a small convenience sample of physicians, which precluded a multilevel (clustered) statistical analysis. However, we thought it valuable to share our timely findings even at this stage, for the benefit of other researchers and to set the stage for future work. Second, physicians and patients were aware of being recorded during the encounter, which might have altered clinicians’ conversations and documentation behaviours (eg, more thorough conversations or documentation). However, physicians were not aware of the specific quality elements being examined. Anecdotally, several physicians commented that having the recorder in the room was not noticeable and that at times they forgot the conversation was being recorded. The IDEA tool, while iteratively developed by a team of physicians, health services researchers and communication PhDs, has not undergone extensive psychometric testing. It focuses only on selected diagnostic process elements, rather than the actual diagnoses made by physicians or patients’ clinical outcomes. In focusing on key diagnostic elements, we did not examine other important nuances of the encounter, such as non-verbal cues or power dynamics in patient-physician relations, which could have influenced trust as well as the sharing and communication of diagnostic information.56–58
Conclusions
A new tool to assess diagnostic process elements permitted us to conceptualise and measure aspects of diagnostic assessment in clinical encounters for acute problems, measure concordance between encounter transcripts and written documentation and correlate these findings with clinician burnout and stress. Our findings show reassuring concordance between notes and transcribed encounters and highlight areas that may be fruitful for future work in improving diagnosis, including better documenting psychosocial information, explaining proposed diagnostic tests and considering potential diagnostic pitfalls. Finally, the findings suggest time pressures and clinician burnout may correlate with poorer diagnostic performance, pointing to a need for further research and support for primary care clinicians and practices. Given the paucity of metrics to assess diagnosis quality, the tool offers a template to model good diagnostic assessments and a metric for offices and clinics to use and adapt for quality improvement and feedback.
Footnotes
Contributors: MK was a major contributor in gathering, analysing and interpreting the data and writing the manuscript. EES analysed and interpreted the data and was a major contributor in writing the manuscript. SA analysed and interpreted the data and contributed to writing the manuscript. JR was a contributor to obtaining and analysing the data. MM was the project manager and helped in obtaining and interpreting the data. ML was a consultant and contributed to interpreting the data and writing the manuscript. GS was the principal investigator, contributed to analysing and interpreting the data and writing the manuscript, and is the guarantor. AO was a consultant and contributed to analysing and interpreting the data and writing the manuscript. All authors read and approved the final manuscript.
Funding: This work was supported by CRICO (Harvard Risk Management Foundation—grant number 072). The funders had no role in planning the design of the study, the data collection, management, analysis and interpretation of data. They had no part in the writing of the manuscript and no influence on the decision to choose a journal for publication.
Competing interests: ML declares support through his place of employment (Hennepin Healthcare) by the American Medical Association (AMA), American College of Physicians (ACP), the Optum Office for Provider Advancement (OPA), Essentia Health Systems, Gillette Children’s Hospital, the California Area Health Education Centers (AHEC), the Institute for Healthcare Improvement (IHI) and the American Board of Internal Medicine Foundation (ABIMF) for burnout prevention research, projects and training. He is also supported for scholarly work by the NIH and the US Federal Agency for Healthcare Quality and Research. The other authors declare that there are no competing interests.
Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review: Not commissioned; externally peer reviewed.
Data availability statement
Data are available upon reasonable request. The data sets generated and/or analysed during the current study are available from the corresponding author on reasonable request.
Ethics statements
Patient consent for publication
Consent obtained directly from patient(s).
Ethics approval
This study involves human participants and was approved by the Institutional Review Board at Mass General Brigham (IRB approval number: 2018P000466). Participants gave informed consent to participate in the study before taking part.
References
- 1. The Leapfrog Group. Recognizing excellence in diagnosis: recommended practices for hospitals. Washington, DC: The Leapfrog Group, July 2022. Available: https://www.leapfroggroup.org/recognizing-excellence-diagnosis-recommended-practices-hospitals
- 2. NORC at the University of Chicago and IHI/NPSF Lucian Leape Institute. Americans’ experiences with medical errors and views on patient safety. Cambridge, MA: Institute for Healthcare Improvement and NORC at the University of Chicago, 2017. Available: http://www.ihi.org/about/news/Documents/IHI_NPSF_NORC_Patient_Safety_Survey_2017_Final_Report.pdf
- 3. Newman-Toker DE, Schaffer AC, Yu-Moe CW, et al. Serious misdiagnosis-related harms in malpractice claims: the “big three” – vascular events, infections, and cancers. Diagnosis (Berl) 2019;6:227–40. 10.1515/dx-2019-0019
- 4. ECRI Institute. Top 10 patient safety concerns 2022. 2022. Available: https://www.ecri.org/top-10-patient-safety-concerns-2022
- 5. Schiff GD, Puopolo AL, Huben-Kearney A, et al. Primary care closed claims experience of Massachusetts malpractice insurers. JAMA Intern Med 2013;173:2063. 10.1001/jamainternmed.2013.11070
- 6. National Academies of Sciences, Engineering, and Medicine. Improving diagnosis in health care. Washington, DC: The National Academies Press, 2015.
- 7. Yang D, Fineberg HV, Cosby K. Diagnostic excellence. JAMA 2021;326:1905. 10.1001/jama.2021.19493
- 8. Singh H, Schiff GD, Graber ML, et al. The global burden of diagnostic errors in primary care. BMJ Qual Saf 2017;26:484–94. 10.1136/bmjqs-2016-005401
- 9. Gunderson CG, Bilan VP, Holleck JL, et al. Prevalence of harmful diagnostic errors in hospitalised adults: a systematic review and meta-analysis. BMJ Qual Saf 2020;29:1008–18. 10.1136/bmjqs-2019-010822
- 10. Graber ML. The incidence of diagnostic error in medicine. BMJ Qual Saf 2013;22(Suppl 2):ii21–7. 10.1136/bmjqs-2012-001615
- 11. Henriksen K, Dymek C, Harrison MI, et al. Challenges and opportunities from the Agency for Healthcare Research and Quality (AHRQ) research summit on improving diagnosis: a proceedings review. Diagnosis (Berl) 2017;4:57–66. 10.1515/dx-2017-0016
- 12. Schiff GD, Tharayil MJ. Electronic clinical documentation. In: Key Advances in Clinical Informatics. Elsevier, 2017:51–68.
- 13. Jayasinghe S. Describing complex clinical scenarios at the bed-side: is a systems science approach useful? Exploring a novel diagrammatic approach to facilitate clinical reasoning. BMC Med Educ 2016;16:1–6. 10.1186/s12909-016-0787-x
- 14. Cooper N, Bartlett M, Gay S, et al. Consensus statement on the content of clinical reasoning curricula in undergraduate medical education. Med Teach 2021;43:152–9. 10.1080/0142159X.2020.1842343
- 15. Schiff GD, Ruan EL. The elusive and illusive quest for diagnostic safety metrics. J Gen Intern Med 2018;33:983–5. 10.1007/s11606-018-4454-2
- 16. Olson APJ, Linzer M, Schiff GD. Measuring and improving diagnostic safety in primary care: addressing the “twin” pandemics of diagnostic error and clinician burnout. J Gen Intern Med 2021;36:1404–6. 10.1007/s11606-021-06611-0
- 17. Singh H, Bradford A, Goeschel C. Operational measurement of diagnostic safety: state of the science. Diagnosis (Berl) 2021;8:51–65. 10.1515/dx-2020-0045
- 18. El-Kareh R. Making clinical diagnoses: how measurable is the process? In: The National Quality Measures Clearinghouse, 2014.
- 19. Singh H, Graber ML, Hofer TP. Measures to improve diagnostic safety in clinical practice. J Patient Saf 2019;15:311–6. 10.1097/PTS.0000000000000338
- 20. National Quality Forum. Improving diagnostic quality and safety: final report. National Quality Forum, 2017. Available: https://www.qualityforum.org/Publications/2017/09/Improving_Diagnostic_Quality_and_Safety_Final_Report.aspx
- 21. Stetson PD, Bakken S, Wrenn JO, et al. Assessing electronic note quality using the Physician Documentation Quality Instrument (PDQI-9). Appl Clin Inform 2012;3:164–74. 10.4338/aci-2011-11-ra-0070
- 22. Edwards ST, Neri PM, Volk LA, et al. Association of note quality and quality of care: a cross-sectional study. BMJ Qual Saf 2014;23:406–13. 10.1136/bmjqs-2013-002194
- 23. Martin SA, Sinsky CA. The map is not the territory: medical records and 21st century practice. Lancet 2016;388:2053–6. 10.1016/S0140-6736(16)00338-X
- 24. Prater L, Sanchez A, Modan G, et al. Electronic health record documentation patterns of recorded primary care visits focused on complex communication: a qualitative study. Appl Clin Inform 2019;10:247–53. 10.1055/s-0039-1683986
- 25. Khazen M, Sullivan EE, Ramos J, et al. Anatomy of diagnosis in a clinical encounter: how clinicians discuss uncertainty with patients. BMC Prim Care 2022;23:153. 10.1186/s12875-022-01767-y
- 26. Weiner SJ, Wang S, Kelly B, et al. How accurate is the medical record? A comparison of the physician’s note with a concealed audio recording in unannounced standardized patient encounters. J Am Med Inform Assoc 2020;27:770–5. 10.1093/jamia/ocaa027
- 27. Linzer M, Sullivan EE, Olson APJ, et al. Improving diagnosis: adding context to cognition. Diagnosis (Berl) 2023;10:4–8. 10.1515/dx-2022-0058
- 28. Linzer M, Poplau S, Brown R, et al. Do work condition interventions affect quality and errors in primary care? Results from the Healthy Work Place study. J Gen Intern Med 2017;32:56–61. 10.1007/s11606-016-3856-2
- 29. Linzer M, McLoughlin C, Poplau S, et al. The Mini Z Worklife and burnout reduction instrument: psychometrics and clinical implications. J Gen Intern Med 2022;37:2876–8. 10.1007/s11606-021-07278-3
- 30. Loscalzo J, Fauci A, Kasper D, et al., eds. Harrison’s Principles of Internal Medicine. 21st edn. New York: McGraw Hill, 2022.
- 31. Ely JW, Graber ML. Preventing diagnostic errors in primary care. Am Fam Physician 2016;94:426–32.
- 32. Schiff GD, Volodarskaya M, Ruan E, et al. Characteristics of disease-specific and generic diagnostic pitfalls. JAMA Netw Open 2022;5:e2144531. 10.1001/jamanetworkopen.2021.44531
- 33. Friedemann Smith C, Lunn H, Wong G, et al. Optimising GPs’ communication of advice to facilitate patients’ self-care and prompt follow-up when the diagnosis is uncertain: a realist review of “safety-netting” in primary care. BMJ Qual Saf 2022;31:541–54. 10.1136/bmjqs-2021-014529
- 34. Almond S, Mant D, Thompson M. Diagnostic safety-netting. Br J Gen Pract 2009;59:872–4. 10.3399/bjgp09X472971
- 35. P Goddu A, O’Conor KJ, Lanzkron S, et al. Do words matter? Stigmatizing language and the transmission of bias in the medical record. J Gen Intern Med 2018;33:685–91. 10.1007/s11606-017-4289-2
- 36. Fernández L, Fossa A, Dong Z, et al. Words matter: what do patients find judgmental or offensive in outpatient notes? J Gen Intern Med 2021;36:2571–8. 10.1007/s11606-020-06432-7
- 37. Cook DA, Beckman TJ. Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 2006;119:166. 10.1016/j.amjmed.2005.10.036
- 38. Mirica M, Khazen M, Hussein S, et al. Assessing the assessment – developing and deploying a novel tool for evaluating clinical notes’ diagnostic assessment quality. J Gen Intern Med.
- 39. Sidnell J, Stivers T, eds. The Handbook of Conversation Analysis. John Wiley & Sons, 2012. 10.1002/9781118325001
- 40. Creswell JW, Creswell JD. Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. Sage Publications, 2017.
- 41. Miles MB, Huberman AM, Saldana J. Qualitative Data Analysis: A Methods Sourcebook. 3rd edn, 2014.
- 42. Rohland BM, Kruse GR, Rohrer JE. Validation of a single-item measure of burnout against the Maslach Burnout Inventory among physicians. Stress Health 2004;20:75–9. 10.1002/smi.1002
- 43. Hulley S, Cummings S, Browner W, et al. Designing Clinical Research: An Epidemiologic Approach. 4th edn. Philadelphia, PA: Lippincott Williams & Wilkins, 2013.
- 44. Kroenke K. Studying symptoms: sampling and measurement issues. Ann Intern Med 2001;134(9 Pt 2):844–53. 10.7326/0003-4819-134-9_part_2-200105011-00008
- 45. Kroenke K. A practical and evidence-based approach to common symptoms: a narrative review. Ann Intern Med 2014;161:579–86. 10.7326/M14-0461
- 46. Zulman DM, Verghese A. Virtual care, telemedicine visits, and real connection in the era of COVID-19: unforeseen opportunity in the face of adversity. JAMA 2021;325:437–8. 10.1001/jama.2020.27304
- 47. Vassar M, Holzmann M. The retrospective chart review: important methodological considerations. J Educ Eval Health Prof 2013;10:12. 10.3352/jeehp.2013.10.12
- 48. Sarkar S, Seshadri D. Conducting record review studies in clinical practice. J Clin Diagn Res 2014;8:JG01–4. 10.7860/JCDR/2014/8301.4806
- 49. US Department of Health and Human Services. New Surgeon General advisory sounds alarm on health worker burnout and resignation. 2022. Available: https://www.hhs.gov/about/news/2022/05/23/new-surgeon-general-advisory-sounds-alarm-on-health-worker-burnout-and-resignation.html
- 50. Prasad K, Poplau S, et al., for the Healthy Work Place (HWP) Investigators. Time pressure during primary care office visits: a prospective evaluation of data from the Healthy Work Place study. J Gen Intern Med 2020;35:465–72. 10.1007/s11606-019-05343-6
- 51. Linzer M. Clinician burnout and the quality of care. JAMA Intern Med 2018;178:1331–2. 10.1001/jamainternmed.2018.3708
- 52. Linzer M, Smith CD, Hingle S, et al. Evaluation of work satisfaction, stress, and burnout among US internal medicine physicians and trainees. JAMA Netw Open 2020;3:e2018758. 10.1001/jamanetworkopen.2020.18758
- 53. Rotenstein LS, Sinsky C, Cassel CK. How to measure progress in addressing physician well-being. JAMA 2021;326:2129. 10.1001/jama.2021.20175
- 54. Rotenstein LS, Holmgren AJ, Healey MJ, et al. Association between electronic health record time and quality of care metrics in primary care. JAMA Netw Open 2022;5:e2237086. 10.1001/jamanetworkopen.2022.37086
- 55. Eschenroeder HC, Manzione LC, Adler-Milstein J, et al. Associations of physician burnout with organizational electronic health record support and after-hours charting. J Am Med Inform Assoc 2021;28:960–6. 10.1093/jamia/ocab053
- 56. Waitzkin H. The Politics of Medical Encounters: How Patients and Doctors Deal with Social Problems. Yale University Press, 1991.
- 57. Street RL, Gordon HS, Ward MM, et al. Patient participation in medical consultations: why some patients are more involved than others. Med Care 2005;43:960–9. 10.1097/01.mlr.0000178172.40344.70
- 58. Dahm MR, Williams M, Crock C. “More than words” – interpersonal communication, cognitive bias and diagnostic errors. Patient Educ Couns 2022;105:252–6. 10.1016/j.pec.2021.05.012