Journal of the American Medical Informatics Association: JAMIA. 2024 Jan 12;31(3):714–719. doi: 10.1093/jamia/ocad261

Structured and unstructured social risk factor documentation in the electronic health record underestimates patients’ self-reported risks

Bradley E Iott 1,2, Samantha Rivas 3, Laura M Gottlieb 4,5,6, Julia Adler-Milstein 7,8, Matthew S Pantell 9,10
PMCID: PMC10873825  PMID: 38216127

Abstract

Objectives

National attention has focused on increasing clinicians’ responsiveness to the social determinants of health, for example, food security. A key step toward designing responsive interventions includes ensuring that information about patients’ social circumstances is captured in the electronic health record (EHR). While prior work has assessed levels of EHR “social risk” documentation, the extent to which documentation represents the true prevalence of social risk is unknown. While no gold standard exists to definitively characterize social risks in clinical populations, here we used the best available proxy: social risks reported by patient survey.

Materials and Methods

We compared survey results to respondents’ EHR social risk documentation (clinical free-text notes and International Statistical Classification of Diseases and Related Health Problems [ICD-10] codes).

Results

Surveys indicated much higher rates of social risk (8.2%-40.9%) than found in structured (0%-0.2%) or unstructured (0%-2.0%) documentation.

Discussion

Ideally, new care standards that include incentives to screen for social risk will increase the use of documentation tools and clinical teams’ awareness of and interventions related to social adversity, while balancing potential screening and documentation burden on clinicians and patients.

Conclusion

EHR documentation of social risk factors currently underestimates their prevalence.

Keywords: social determinants of health, electronic health records, documentation, ICD-10 Z codes, clinical free-text notes

Introduction

The strong evidence that the social determinants/drivers of health (SDH) influence healthcare access and outcomes has motivated many healthcare organizations to institute more systematic screening for patient-level social risk factors.1,2 Ensuring that clinicians and population health specialists are aware of patients’ social risks increases opportunities to implement social care interventions, including adjusting medical care in response to patients’ social circumstances and when possible, addressing needs by providing referrals to social services.1 Documentation of social risk information in electronic health records (EHRs) can facilitate social care activities, both by presenting relevant social information directly to clinicians and by enabling health systems to estimate the population prevalence of social risk factors and thereby design responsive interventions. However, efforts to encourage social risk documentation must account for the potential to increase documentation burden on clinicians and screening fatigue on patients.

Clinicians can document social information in EHRs using structured EHR fields (eg, by using medical codes) or unstructured data fields (eg, free-text-based clinical narrative). Prior work investigating social risk factor documentation using International Statistical Classification of Diseases and Related Health Problems (ICD-10) codes, specifically, has suggested that these codes are underutilized,3,4 and 1 study found that ICD codes were used less frequently to document social risk factors when compared to note narratives (unstructured data).5 But this literature is hampered by a lack of data about the true prevalence of social risks in the clinical populations being studied.6–9 In this study, we leveraged the unique availability of data from a patient survey on social risks to examine the relative prevalence of EHR documentation on social risks.

Methods

Study population

We leveraged data from a completed randomized clinical trial based in a pediatric urgent care center at an urban safety net clinic. Trial methods have been described in a previous publication.10 In brief, participating patients and their caregivers were screened for trial eligibility during urgent care encounters. Eligibility criteria included the caregiver speaking English or Spanish, the patient having an accompanying caregiver or legal guardian at the visit, the patient being aged 0-17 years, the caregiver age ≥18 years, the caregiver residing in San Francisco County, and the patient not participating in a similar intervention 6 months before the date of enrollment. As part of the trial, all participating caregivers completed an 18-item survey of social risk factors about which they were currently concerned during or directly after the child’s clinical encounter. The survey instrument was sourced from a previous social risk factor screening trial and is shown in Appendix SA.11,12 Participants were given the option to complete the survey themselves or to work with a research assistant to do so.

Measures of structured and unstructured social risk documentation

We examined EHR documentation of the same 8 social risks from 2 sources. First, we extracted structured ICD-10 Z codes (housing insecurity [Z59.0], housing conditions [Z59.1 and Z59.2], food insecurity [Z59.4], financial resource strain [Z59.8 and Z59.9], unemployment [Z56.0], job conditions [Z56.89 and Z56.9], access to education/after school activities [Z55.0, Z55.3, Z55.8, and Z55.9], and legal issues [Z62.21, Z62.810, and Z65.3]) from the EHR problem list. These codes were documented as part of the enrollment encounter or in the following 2 days. Structured documentation reflected usual medical care and physicians’ use of ICD-10 Z codes was not a component of the trial interventions, as a separate researcher administered the survey and provided resource referrals.
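As an illustration of the structured extraction step, the code-to-domain mapping listed above can be written as a simple lookup. This is a minimal sketch, not the study's actual extraction code (the analyses were run in Stata): the function name `risks_from_problem_list` and the input format (a list of code strings from a problem list) are assumptions made for the example.

```python
# ICD-10 Z codes named in the text, mapped to the 8 social risk domains.
Z_CODE_TO_RISK = {
    "Z59.0": "housing insecurity",
    "Z59.1": "housing conditions",
    "Z59.2": "housing conditions",
    "Z59.4": "food insecurity",
    "Z59.8": "financial resource strain",
    "Z59.9": "financial resource strain",
    "Z56.0": "unemployment",
    "Z56.89": "job conditions",
    "Z56.9": "job conditions",
    "Z55.0": "access to education/after school activities",
    "Z55.3": "access to education/after school activities",
    "Z55.8": "access to education/after school activities",
    "Z55.9": "access to education/after school activities",
    "Z62.21": "legal issues",
    "Z62.810": "legal issues",
    "Z65.3": "legal issues",
}


def risks_from_problem_list(codes):
    """Return the set of social risk domains documented by a list of ICD-10 codes.

    Hypothetical helper: codes outside the mapping are ignored.
    """
    normalized = (code.strip().upper() for code in codes)
    return {Z_CODE_TO_RISK[code] for code in normalized if code in Z_CODE_TO_RISK}


# Illustrative problem list (not study data); J45.909 stands in for any non-social code.
example = risks_from_problem_list(["J45.909", "z59.0", "Z59.8"])
print(sorted(example))  # ['financial resource strain', 'housing insecurity']
```

Mapping several codes to one domain mirrors the grouping in the text, so a patient with either Z59.8 or Z59.9 counts once toward financial resource strain.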

Second, we extracted documentation about patients’ social risks from free-text clinical narrative notes associated with the enrollment encounter. A medical student (S.R.) extracted data from 597 notes, and members of the study team (B.E.I., S.R., and M.S.P.) met to reconcile extracted data, review instances of ambiguous documentation, and make decisions about inclusion criteria for each type of social risk factor. Decisions about inclusion criteria were kept in an evolving document shared with the study team to guide ongoing data extraction. Inclusion criteria for social risks to be extracted from notes included explicit free-text documentation of social risk factors endorsed by the patient as currently relevant at the time of the visit. For example, if a note mentioned that the patient’s parent was employed in the past and was currently assessed as having unemployment concerns, we recorded 1 social risk-relevant instance: unemployment present. We modeled definitions for each social risk factor from the patient survey questions (Appendix SA) and decisions for inclusion or exclusion were consensus-based. Similarly, unstructured documentation reflected usual medical care and physicians’ use of clinical free-text notes was not a component of the trial interventions, as a separate researcher administered all study activities. We compared EHR documentation rates (whether in clinical narratives or ICD codes) across both trial arms and found no significant differences between intervention groups (data not shown).

Data analysis

Our sample included 597 caregivers with no missing data on any included variable. We tabulated sample demographics and characterized the relative frequency of patients with survey-endorsed social risks, as well as ICD-10 Z code and free-text note documentation of each of the 8 included social risks in our sample overall. As a sensitivity analysis, we characterized the frequency of social risk factor documentation prior to and including the enrollment encounter, including Z codes documented during prior encounters and clinical note documentation of any prior social risk mentioned during the enrollment encounter. All analyses were conducted in Stata 17.0 (StataCorp, College Station, TX). This study was approved by the University of California San Francisco Institutional Review Board.

Results

Sample characteristics

Table 1 presents sample characteristics. Most caregiver respondents were aged 25 years or older (87.4%), identified as female (91.3%), and identified as Hispanic (82.4%). Approximately half of respondents did not complete high school (51.3%) and 42.0% reported a household income of $15 000 per year or less. At least 1 social risk was endorsed on the survey by 85.6% of respondents; an average of 2.5 social risks were reported per survey.

Table 1. Sample demographics (n = 597).

Characteristic No. %
Caregiver age
 18-24 years 75 12.6
 25-34 years 233 39.0
 35-44 years 223 37.4
 45-74 years 66 11.1
Caregiver gender
 Female 545 91.3
 Male 52 8.7
Caregiver race
 Hispanic 492 82.4
 Non-Hispanic Black 51 8.5
 Non-Hispanic White 18 3.0
 Non-Hispanic Asian 12 2.0
 Non-Hispanic Pacific Islander/Hawaiian 7 1.2
 Non-Hispanic American Indian/Alaskan Native 2 0.3
 Multi/mixed/other 15 2.5
Caregiver education level
 8th grade or less 162 27.1
 Some high school but did not graduate 144 24.1
 High school graduate or General Educational Diploma 169 28.3
 Some college, college graduate, or more than a 4-year college degree 122 20.4
Household income
 $0-5000 118 19.8
 $5001-15 000 133 22.3
 $15 001-25 000 131 22.0
 $25 001 or more 155 26.0
 Declined to state 60 10.1
Had at least 1 survey self-endorsed social need 511 85.6
Average number of survey self-endorsed social needs 2.5
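The summary percentages quoted from Table 1 can be re-derived from its counts. The short check below is only a worked arithmetic example using the counts printed in the table; the helper name `pct` is an assumption, and rounding to one decimal place is assumed to match the table's convention.

```python
N = 597  # sample size reported in Table 1


def pct(count, total=N):
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts below are copied from Table 1.
at_least_one_need = pct(511)            # 85.6
did_not_complete_hs = pct(162 + 144)    # 51.3
income_15k_or_less = pct(118 + 133)     # 42.0
aged_25_or_older = pct(233 + 223 + 66)  # 87.4

print(at_least_one_need, did_not_complete_hs, income_15k_or_less, aged_25_or_older)
```

Summing counts before dividing avoids the small drift that comes from adding already-rounded percentages (19.8 + 22.3 would give 42.1 rather than 42.0).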

Figure 1 describes the prevalence of both social risk endorsement and EHR documentation in our sample. The survey-derived prevalence of individual social risks ranged from 8.2% (job conditions) to 40.9% (housing insecurity; Figure 1). Of patients who endorsed at least 1 social risk on the survey, 5.5% also had either Z code or free-text note documentation of that risk in the EHR during the study enrollment visit. The proportion of patients with ICD-10 Z code documentation of any single social risk ranged from 0% (housing conditions, food insecurity, unemployment, job conditions, access to education/after school activities, and legal issues) to 0.2% (housing insecurity and financial resource strain). Free-text note documentation rates ranged from 0% (job conditions) to 2.0% (legal issues). We observed higher levels of EHR-based social risk factor documentation when using a combined measure featuring structured or unstructured data (0%-2.0%). Free-text note documentation about social risks was more common than Z code documentation for most social risk factors. Financial resource strain was documented at equal rates in Z codes and notes. We observed no documentation about job conditions in Z codes or free-text notes, though 8.2% of our sample endorsed concerns about job conditions. No patients had concordant documentation with both Z codes and clinical free-text notes.

Figure 1. Prevalence of social risk endorsement and documentation (n = 597).
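The per-source, combined, and concordant measures reported above differ only in how the two documentation flags are joined: the combined measure counts a patient if either source documents the risk, and the concordant measure requires both. The sketch below illustrates this on made-up data for a single risk; the `RiskRecord` structure, the `summarize` function, and the 10 toy patients are assumptions for illustration and do not reproduce study data.

```python
from dataclasses import dataclass


@dataclass
class RiskRecord:
    survey: bool  # caregiver endorsed the risk on the survey
    z_code: bool  # risk documented with an ICD-10 Z code
    note: bool    # risk documented in the free-text clinical note


def summarize(records):
    """Return percentage prevalence of one risk by source, combined, and concordant."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")

    def rate(predicate):
        return 100 * sum(1 for r in records if predicate(r)) / n

    return {
        "survey": rate(lambda r: r.survey),
        "z_code": rate(lambda r: r.z_code),
        "note": rate(lambda r: r.note),
        "combined": rate(lambda r: r.z_code or r.note),     # either source
        "concordant": rate(lambda r: r.z_code and r.note),  # both sources
    }


# Hypothetical toy data: 10 patients, 5 of whom endorsed the risk on the survey.
records = (
    [RiskRecord(True, False, False)] * 3
    + [RiskRecord(True, False, True)]
    + [RiskRecord(True, True, False)]
    + [RiskRecord(False, False, False)] * 5
)
summary = summarize(records)
print(summary)
```

In this toy sample the combined rate (20%) exceeds either single-source rate (10% each) while the concordant rate is 0%, the same qualitative pattern as in the results: combining sources raises sensitivity only when the sources document different patients.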

In sensitivity analyses, we included Z codes documented during prior encounters and clinical note documentation of any prior social risk mentioned during the enrollment encounter. This yielded a higher prevalence of documented social risks, with 15.4% of patients having at least 1 instance of structured or unstructured social risk documentation (Appendix SB).
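For structured codes, the difference between the primary and sensitivity analyses comes down to the date window applied to each documented code. The following is a rough sketch under stated assumptions: the primary window (enrollment date through the following 2 days) comes from the Methods, but the unbounded lookback, the inclusion of the 2 follow-up days in the sensitivity window, the function names, and all dates are hypothetical.

```python
from datetime import date, timedelta

FOLLOW_UP_DAYS = 2  # primary analysis: enrollment encounter or the following 2 days


def in_primary_window(documented_on, enrolled_on):
    """True if a code was documented at enrollment or within the follow-up days."""
    return enrolled_on <= documented_on <= enrolled_on + timedelta(days=FOLLOW_UP_DAYS)


def in_sensitivity_window(documented_on, enrolled_on):
    """True if a code was documented at any time up to the end of the primary window.

    Assumption: the lookback to prior encounters has no lower bound.
    """
    return documented_on <= enrolled_on + timedelta(days=FOLLOW_UP_DAYS)


# Hypothetical enrollment date and problem-list entries (not study data).
enrolled = date(2020, 1, 15)
codes = [
    ("Z59.0", date(2019, 6, 3)),   # prior encounter
    ("Z59.4", date(2020, 1, 16)),  # day after enrollment
    ("Z56.0", date(2020, 3, 1)),   # well after enrollment
]

primary = [code for code, day in codes if in_primary_window(day, enrolled)]
sensitivity = [code for code, day in codes if in_sensitivity_window(day, enrolled)]
print(primary)      # ['Z59.4']
print(sensitivity)  # ['Z59.0', 'Z59.4']
```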

Discussion

This is the first study of which we are aware to directly compare rates of both structured and unstructured social risk factor EHR documentation with patient-endorsed social risks collected by survey. Comparing the prevalence of social risks in clinician EHR documentation with that reported on a patient survey (which we believe is the best available approximation of actual prevalence), we find that EHR SDH data underestimate the prevalence of patients’ social risks. Because survey data may also underestimate the true prevalence of social risks, the real deficits in EHR-based documentation are likely even greater than those we observed. Additionally, our sensitivity analysis demonstrated a higher prevalence of social risks when considering previously documented Z codes and free-text mentions of prior social risks. The ability to identify prior social risks through Z codes and free-text notes suggests the need to consider how best to document social risks in the future, including the relevance of historical social risk factors to present day clinical practice and social care delivery. While repeated screenings may provide a comprehensive understanding of patients’ social risks over time, the frequency of screening must be balanced with concerns about burden to clinicians and patients.13 Indeed, efforts to increase social risk documentation must be met with additional resources to support social care delivery. Alternatively, the searchable nature of structured data may aid clinicians in understanding patients’ longitudinal experience of social risks, and development of future Z codes may consider the creation of codes that capture the history of prior social risks. Free-text notes may similarly offer much nuance about prior social risks. Knowledge of prior risk may be valuable in the absence of recent social risk screening, though this value must be balanced with the potential for documentation burden on clinicians, privacy risks for patients, and the scarcity of resources available to address social needs.

Despite the growth in interest in social risk and new drivers incentivizing social risk data documentation in EHRs, our findings suggest that EHR data may not yet be an accurate source of information on the prevalence of patient social risks. This may simply reflect gaps in SDH data collection in clinical encounters. For instance, patients may not be asked about social risks, or even if asked, some patients may decline to participate in SDH screening.14 In some cases, clinicians may not see SDH information as relevant to patient care, or may see it as more burdensome than beneficial for clinicians,15 as SDH interventions may not be perceived as a part of standard healthcare practice and many clinicians may lack training related to SDH.16 The observed EHR documentation gaps may alternatively reflect gaps in documentation practice. For instance, even in cases where clinical teams do screen for social risks, the availability of structured EHR SDH documentation tools, including ICD-10 Z codes, does not guarantee their use. In this study, for most included social risk factors, we observed unstructured SDH documentation to be more common than structured documentation. Z code descriptions may fail to capture nuanced details about patients’ experiences of social risk. Further qualitative research is necessary to explain differences in use of structured vs unstructured SDH documentation tools. Clinicians who lack resources to assist with patients’ social risks might choose not to document them, even if they were endorsed by patients during clinical encounters.17 Alternatively, clinicians may not be aware of structured SDH documentation tools,18 and prior work has shown that free-text notes were more commonly used for SDH documentation.5

Information about social risks may be harnessed by clinical teams to improve care in ways that will shape health and health equity. But to do this, clinicians will need accurate point-of-care information about patients’ social risks, and our results suggest that EHR structured and unstructured data, individually or in combination, are currently not necessarily reliable sources of information about patients’ social risks. Ensuring that social risk factor data in EHRs more accurately represent the prevalence of patients’ social risks is important, given the potential for these data to be used in social care interventions. To ensure that social risk factor documentation is a core component of clinical care, it will be necessary to develop strategies to overcome existing challenges. To begin, aggregate measures of documentation that combine structured and unstructured SDH data, such as extracting information about social risks from clinical free-text notes via natural language processing, may increase sensitivity when estimating social risks.19–26 Indeed, when combining social risk data from Z codes and free-text notes in this study, we observed higher rates of documentation.5 Rates of concordant documentation with both Z codes and clinical free-text notes were low in our sample (0%-0.5%), which may be driven by the burden of documenting the same social risk information in 2 different places in the EHR (“double documenting”). Future work should explore whether EHR automation tools can be used to alleviate some of this burden and subsequently increase structured and unstructured data concordance, though such efforts may require validation, as distinct sources of SDH data may be generated by different clinical stakeholders.

New social care quality measures will require reporting on the prevalence of screening for social risks.27–34 Other state and federal policies encourage both SDH screening and the use of structured documentation tools, which may in turn provide computable forms of SDH data that could be used in clinical decision support algorithms and community resource referral tools.1,35,36 However, given the low levels of documentation observed in this study, quality measures that rely on SDH data documented in the EHR must consider the extent to which SDH may be underdocumented.

Limitations

Our study has several key limitations. First, while we did our best to match survey questions with analogous structured data constructs, they do not perfectly align, as Z codes often attempt to represent broad social risk concepts, rather than specific domains of social risks. There are national efforts underway to provide more granular structured data capture tools, which may allow the prevalence of social risk factors to be assessed more accurately in the future.37 Additionally, the extent to which our findings generalize to other pediatric clinics or clinical settings is not clear. For example, social care interventions may be most relevant in primary care settings, where longitudinal patient-provider relationships may facilitate the identification of social needs and create opportunities to provide assistance. However, there is increasing evidence of the importance of providing social care even in acute emergency settings, suggesting opportunity for social care interventions in urgent care clinics.38,39 Indeed, in this study, patient surveys suggested that this sample population had a high prevalence of social needs, and despite this, documentation of these needs was limited. Additionally, while trial intervention arms were carried out by separate researchers, we cannot be sure that clinician documentation behavior was not influenced by the trial. Finally, most of our sample of caregivers identified as Hispanic and female, and findings from this sample may not generalize to certain patient populations. However, while this study offers the perspective of 1 clinic, we believe that the pattern of underdocumentation of social risks is similar in other settings. Replication in settings with larger, more diverse patient populations may aid in characterizing the prevalence of EHR SDH documentation.

Conclusion

Our study found that EHR documentation of social risk factors currently underestimates their prevalence. Upcoming quality measures, policies, and incentives may increase the use of documentation tools to close this gap.

Acknowledgments

We acknowledge the help of Holly Wing for this study.

Contributor Information

Bradley E Iott, Center for Clinical Informatics and Improvement Research, University of California, San Francisco, San Francisco, CA, United States; Social Interventions Research and Evaluation Network, University of California, San Francisco, San Francisco, CA, United States.

Samantha Rivas, Social Interventions Research and Evaluation Network, University of California, San Francisco, San Francisco, CA, United States.

Laura M Gottlieb, Social Interventions Research and Evaluation Network, University of California, San Francisco, San Francisco, CA, United States; Center for Health and Community, University of California, San Francisco, San Francisco, CA, United States; Department of Family and Community Medicine, University of California, San Francisco, San Francisco, CA, United States.

Julia Adler-Milstein, Center for Clinical Informatics and Improvement Research, University of California, San Francisco, San Francisco, CA, United States; Department of Medicine, University of California, San Francisco, San Francisco, CA, United States.

Matthew S Pantell, Center for Health and Community, University of California, San Francisco, San Francisco, CA, United States; Department of Pediatrics, University of California, San Francisco, San Francisco, CA, United States.

Author contributions

B.E.I. contributed to the conception and design of this study, data collection, data analysis and interpretation, drafting the article, critical revision of the article, and final approval of the version to be published. He agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. S.R. contributed to the data collection, data analysis and interpretation, and final approval of the version to be published. She agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. L.M.G. contributed to the conception and design of this study, data interpretation, drafting the article, critical revision of the article, and final approval of the version to be published. She agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. J.A.-M. contributed to the conception and design of this study, data interpretation, drafting the article, critical revision of the article, and final approval of the version to be published. She agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. M.S.P. contributed to the conception and design of this study, data interpretation, drafting the article, critical revision of the article, and final approval of the version to be published. He agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Supplementary material

Supplementary material is available at Journal of the American Medical Informatics Association online.

Funding

None declared.

Conflicts of interest

None declared.

Data availability

The data underlying this article cannot be shared publicly in order to protect the privacy of individuals represented in the dataset.

Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
