Abstract
Deficiencies in data sharing capabilities limit Social Determinants of Health (SDoH) analysis as part of COVID-19 research. The National COVID Cohort Collaborative (N3C) is an example of an Electronic Health Record (EHR) database of patients tested for COVID-19 that could benefit from an SDoH elements framework that captures various screening instruments in EHR data warehouse systems. This paper uses the University of Washington Enterprise Data Warehouse (a data contributor to N3C) to demonstrate how SDoH can be represented and managed so that they are available within an OMOP common data model. We found that these data varied by type of SDoH, by where and when they were collected, and by how they were represented.
Introduction
The COVID-19 pandemic has highlighted the urgency of existing informatics needs around clinical data sharing and the collection and integration of high-quality Social Determinants of Health (SDoH) data.1,2 While different organizations (Johns Hopkins, NYTimes) aggregated case data to generate publicly available reports, data on patients and health care system response were limited. This gap persisted even though most institutions stored these data electronically and the potential value of sharing them was greater than it had ever been before. For some of these data, we are still trying to understand their significance and determine the best approach to sharing such knowledge.
To address data sharing needs, the National COVID Cohort Collaborative (N3C) was created as an extension of the National Center for Data to Health (CD2H) project.3 With funding from the National Institutes of Health (NIH) and support from the National Center for Advancing Translational Sciences (NCATS), as of March 2021, N3C has collected data from electronic health records (EHRs) from 45 health systems across the country on 3.75 million patients, over 903,000 of whom have been diagnosed with COVID-19. Participating organizations provide data on biweekly to monthly schedules, based on extraction rules from the EHRs for COVID cases and matched controls. Each organization transforms data (e.g., patient demographics, visit history, diagnoses, medications, laboratory test results, procedures, vital signs) into one of four common data models: the Accrual to Clinical Trials (ACT) Network, National Patient-Centered Clinical Research Network (PCORnet), Observational Medical Outcomes Partnership (OMOP), and TriNetX.3 These data are then submitted to N3C for storage in a secure enclave managed by NIH and harmonized into the OMOP common data model, which can be made accessible to researchers requesting data for analyses related to the COVID-19 pandemic.3,4 N3C significantly advances EHR data sharing capability as the first public national data sharing initiative with centralized data extracted directly from EHRs in this way.
Despite the fact that SDoH have been associated with COVID-19 incidence and outcomes, data elements required to study the role of SDoH are thought to be under-collected due to historic inattention to patient SDoH.1,5–7 SDoH are the non-clinical covariates of how people live, grow, learn and age as they relate to how people can manage stressors or prevent worsening health outcomes.8,9 As data representations for SDoH are not native to clinical informatics, mapping of detailed information bears a risk of information loss. Patient SDoH may be predominantly captured as diagnosis codes, narrative reports, and structured survey tools.10–12 Diagnosis coding, such as Z-codes in ICD-10-CM representing certain assessments, may be more easily modeled and shared across institutions, but will be limited to those conditions where this coding procedure is available and reliably implemented.5,6 Narrative text representations may be the easiest for clinicians to document and therefore could have the highest level of completeness and contextual information, but extraction from narrative notes may be imperfect and inconsistent for sharing in common data models.10–13 Structured forms can be more readily disseminated and queried electronically;5,14 however, variations in the research instrument used, instrument version, and coverage of SDoH questions and answer options present challenges for comparisons across institutions.15 Each of these approaches is further limited by whether or not SDoH are assessed as part of the standard clinical workflow.16 Efforts are needed to define how the data are included and how they can be used for N3C research.
Clinical screening tools for SDoH were introduced as research instruments to help care providers identify and address individual-level SDoH and social needs.9,17–19 These structured screening tools can provide a snapshot of non-medical conditions in patients’ lives, which are broadly associated with patients’ ability to manage stressors and/or avoid worse health outcomes.14,15,17,20 Reliable collection of SDoH at the point of care can help clinicians provide personalized recommendations, promoting health and well-being for the overall population served by the health system.21,22
To make SDoH from EHRs usable in research and other analyses, they must be made computable, which requires knowledge of how these data are being collected and represented. Social and environmental determinants of health continue to have a central importance in studying disparities in COVID-19 outcomes and predisposed vulnerabilities.23–25 These factors have been used to guide the US vaccine deployment strategy to protect the most vulnerable and at-risk individuals.26 Information collected from SDoH structured screening tools may be stored within EHRs as FlowSheets measurements, comparable to how depression is screened.5,13,14 FlowSheets offer a tractable tabular data-structure to examine related data-points for analytics purposes. However, FlowSheet measurements are encoded independently. Between health systems, the same measurements (questions/variables) and values (answers provided) will likely contain locally unique encoding and nuances. To facilitate cross-site analyses, sites must first harmonize their FlowSheet information into standard categories while maintaining attributions for source data provenance. With few exceptions, encoding of concepts in these forms is not captured in standard terminologies such as LOINC that enable interoperable analyses.
In this study, we characterize how SDoH data are represented in an EHR to assess their use, limitations, and what recommendations are needed to address challenges in using them for research. This was done using data from UW Medicine and University of Washington School of Medicine, which was an early contributor of data to N3C. The intention of this work is to contribute towards advancing data maturity for patient SDoH representation, data utility for research on social needs and health services research, and ongoing N3C SDoH research on COVID-19 efforts.
Methods
Study design
This case study focused on transforming information from an Epic Clarity FlowSheet structure into an OMOP database schema for secondary use. We developed a process for triaging FlowSheet measurements for patient-level SDoH and then mapping observational data elements for Extract-Transform-Load (ETL) operations. We focused primarily on SDoH information represented in clinical observations captured in FlowSheet tables, which include survey responses and patient-provided occupation information. A post hoc analysis measured data density and completeness to identify next steps in data engineering efforts.
Data source
Data were extracted from the UW Medicine Health Network Enterprise Data Warehouse (EDW), which collects data from four hospitals/medical centers (Harborview, Northwest, UWMC-Montlake, Valley) and various outpatient clinics in the Greater Seattle King County metropolitan area; patients may also be received from other geographic areas in Washington, Wyoming, Alaska, Montana, and Idaho (WWAMI) and neighboring regions. The cohort criteria include patients who were tested with a SARS-CoV-2 RT-PCR viral presence test, were tested with a SARS-CoV-2 IgG antigen titer assay, and/or received a condition diagnosis of COVID-19 from Feb 1, 2020 through Feb 20, 2021. Although UW employees and staff had increased COVID testing protocols, individuals who were UW employees or staff at the time of testing were excluded from this dataset due to institutional policies protecting employee privacy. The extracted data elements include patient demographics, visit history, diagnoses, medications, laboratory test results, procedures, vital signs and other observations that have been extracted from the medical records into the OMOP common data model.27 The choice of the OMOP schema was informed by the N3C31 and the OHDSI CHARYBDIS28–30 design and rationale for creating shareable COVID-19 data cohorts for research.
Assessment for FlowSheet measurement records
The general workflow can be represented within Figure 1. The lead author (JP) first explored the schema of the Epic Clarity FlowSheet tables to identify necessary entities. Next, the lead author (JP) performed manual review to triage the FlowSheet measurements to identify data elements relevant to SDoH and generate initial concept mappings. We used the UCSF SIREN project social needs categories as reference taxonomic labels for qualitative coding of Epic FlowSheet measurement names and display names.31 We initiated the Epic FlowSheet measurements coding as inquiries of ‘Housing Insecurity / Instability / Homelessness’, ‘Food insecurity’, ‘Employment’, ‘Education’, ‘Health Care / Medicine Access & Affordability’, and ‘Immigration / Migrant Status / Refugee Status’. Of note, we separated questions/inquiries of ‘Household size’ into its own category and we lumped inquiries related to ‘Annual Household Income’ as a measure under the ‘Employment’ category.
Harmonization to OMOP
Once it had been established that SDoH information was captured within the FlowSheet structure, we examined the unique measurement values related to each FlowSheet measurement for value data types (e.g., string, numeric) and concept representation. Concept representations were triangulated with the FlowSheet measurement comments, which are clinician memos. FlowSheet measurements with overwhelmingly singleton values were considered free-text and flagged for separate extraction approaches. FlowSheet measurements and their standardized unique values were searched for within the ATHENA vocabularies repository.32 Using ATHENA, we identified and viewed the OMOP version 5.3.1 vocabulary, which contains standardized concept relationships from biomedical vocabularies such as the Logical Observation Identifiers Names and Codes (LOINC) database and the Systematized Nomenclature of Medicine -- Clinical Terms (SNOMED-CT) vocabulary. We prioritized OMOP standard concepts from the LOINC vocabulary, which has incorporated survey tools with defined ‘has answer’ relationships that we used to verify the question-answer relationships. OMOP concepts were reviewed and assigned on the basis of the question and answer having adequate semantic representation. For example, ‘3-years’ can be represented with ‘36 months’; ‘how long have you been homeless’ cannot be represented with ‘Current housing status,’ or vice versa. Where LOINC concepts were not available for the question or answer, we broadened the concept representation to the SNOMED-CT and AllOfUs_Columbia vocabularies. The lead author (JP) conducted qualitative member checking with subject matter experts in OMOP data engineering (AW) and concept mapping using LOINC and SNOMED (DM) to ensure accuracy and validity of the mapping interpretations. Mapping discrepancies were reconciled to an agreed upon set of corrections.
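The free-text flagging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the sample values are assumptions made for illustration, since the text only states that measurements with overwhelmingly singleton values were treated as free-text.

```python
from collections import Counter


def singleton_share(values):
    """Fraction of records whose normalized value occurs exactly once."""
    counts = Counter(v.strip().lower() for v in values if v and v.strip())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


def is_free_text(values, threshold=0.5):
    # threshold is a hypothetical cut-off; the paper does not state one
    return singleton_share(values) > threshold


# Illustrative (made-up) value sets for two measurements
coded = ["Yes", "No", "yes", "No", "Yes", "Unknown", "no"]
notes = ["staying w/ sister", "shelter on 3rd ave", "car", "car", "lost lease in May"]

if __name__ == "__main__":
    print(is_free_text(coded), is_free_text(notes))
```

A measurement flagged this way would be routed to a separate extraction approach rather than to value-level concept mapping.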
The triage results in a mapping table that associates each ‘Flowsheet Measurement ID’ with their OMOP representation of concept ID and source concept ID (Table 1). FlowSheet measurements marked for exclusion would have RUN_SET set to NULL.
Table 1: Features from the FlowSheet Observations table.
Flowsheet observation features | Example values |
SIREN Social Needs category | Housing Insecurity / Instability / Homelessness |
FLOWSHEET MEASUREMENT ID | 1234567890 |
MEASUREMENT NAME | UWM R HCHN HOW LONG HOMELESS UD |
DISPLAY NAME | How Long Homeless |
NORMALIZED NAME (optional) | Length of time homeless |
CONCEPT_ID | 40482660 |
SOURCE_CONCEPT_ID | 40482660 |
IS_NUMERIC | FALSE |
RUN_SET | SDOH |
Answer options (value_as_concept_id) | |
Once the mapping table was established, we refined ETL scripts to generate new Observation records. Regular expressions apply logic between 1) the FlowSheet measurement IDs that were triaged and 2) each measurement’s unique value set to decide on 3) the corresponding OMOP value_as_concept_id. For numeric values, like ‘Annual household income’, we retain the raw value in value_as_number and then discretize it into value ranges, which are encoded as value_as_concept_ids. After the ETL was implemented, we conducted patient-level random walks to review for accurate concept representations.
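A minimal sketch of this mapping logic follows, under stated assumptions. The measurement ID 1234567890 and concept ID 40482660 are the example values from Table 1; every other ID (all in the 2000000000 range), the regular expressions, the income cut-points, and the function and field names are hypothetical placeholders rather than real OMOP concepts or the authors' ETL code.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mapping:
    concept_id: int
    is_numeric: bool
    run_set: Optional[str]  # None mirrors RUN_SET = NULL (marked for exclusion)


# Mapping table keyed by FlowSheet measurement ID (first row follows Table 1)
MAPPINGS = {
    "1234567890": Mapping(concept_id=40482660, is_numeric=False, run_set="SDOH"),
    "1111111111": Mapping(concept_id=2000000001, is_numeric=True, run_set="SDOH"),
    "2222222222": Mapping(concept_id=2000000002, is_numeric=False, run_set=None),
}

# Hypothetical regex rules: (pattern, value_as_concept_id) per measurement
VALUE_RULES = {
    "1234567890": [
        (re.compile(r"^\s*(<|less than)\s*1\s*(yr|year)", re.I), 2000000101),
        (re.compile(r"^\s*\d+\s*(yrs?|years?)", re.I), 2000000102),
    ],
}

# Hypothetical income ranges: (exclusive upper bound, value_as_concept_id)
INCOME_BINS = [(25000, 2000000201), (50000, 2000000202), (float("inf"), 2000000203)]


def to_observation(measure_id, raw_value):
    """Build one Observation-like record, or None if the measurement is excluded."""
    mapping = MAPPINGS.get(measure_id)
    if mapping is None or mapping.run_set is None:
        return None
    record = {
        "observation_concept_id": mapping.concept_id,
        "observation_source_value": raw_value,  # kept for provenance
        "value_as_number": None,
        "value_as_concept_id": None,
    }
    if mapping.is_numeric:
        try:
            number = float(raw_value.replace(",", "").replace("$", ""))
        except ValueError:
            return record  # unparseable: keep only the source value
        record["value_as_number"] = number  # raw value retained
        record["value_as_concept_id"] = next(
            (cid for upper, cid in INCOME_BINS if number < upper), None
        )
        return record
    for pattern, cid in VALUE_RULES.get(measure_id, []):
        if pattern.search(raw_value):
            record["value_as_concept_id"] = cid
            break
    return record
```

Keeping the source value on every record, even when no rule matches, leaves unmapped answers visible during the kind of patient-level review described above.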
Assessment of employment information
During patient encounters, patients may provide or update their occupation titles (e.g., ‘nurse’) and employment status (e.g., ‘full time’), which may be separately collected as free-text and time-stamped within patient demographics tables. Occupation and employment circumstances may be expressed in a variety of ways. We referred to the 2018 US Census Bureau Occupation Code list (as of Sept 26, 2019)33 Level 2 categories to form regular-expression rules for mapping occupations. These Level 2 categories are also represented with OMOP standard concepts, originating from the LOINC National Trauma Data Standard vocabulary. We used these rules to map extracted occupation titles to OMOP concepts. For example, with an occupation label of ‘icu nurse’, an observation record would be mapped to OMOP observation_concept_id 36203487 for “Occupation [Type]” and value_as_concept_id 36308137 (corresponding to “Healthcare practitioners and technical occupations (3000-3550)”), and the source string would be retained for data provenance in the value_as_string and the observation_source_value. We repeatedly reviewed the occupations and occupation category assignments until saturation was achieved and all interpretable values were accounted for. To prevent uniquely identifying information, we required that each occupation label occur for at least 10 patients; otherwise, the record was excluded as an unacceptable re-identification risk.
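The occupation mapping and the minimum-count safeguard can be sketched as below. The two concept IDs and the threshold of 10 patients come from the text; the regular expression, the function names, the record layout, and the choice to skip unmappable labels are assumptions made for illustration only.

```python
import re
from collections import defaultdict

OCCUPATION_TYPE_CONCEPT_ID = 36203487  # "Occupation [Type]", as stated in the text
MIN_PATIENTS = 10                      # minimum patients per label, as stated in the text

# Hypothetical rule list. Only the healthcare concept ID (36308137) comes from
# the text; the pattern is illustrative and not the authors' rule.
OCCUPATION_RULES = [
    (re.compile(r"\b(nurse|physician|pharmacist)\b"), 36308137),
]


def normalize(label):
    """Lowercase and collapse whitespace so equivalent labels are counted together."""
    return " ".join(label.lower().split())


def map_occupations(rows):
    """rows: iterable of (patient_id, occupation_label) pairs."""
    rows = list(rows)
    patients_by_label = defaultdict(set)
    for patient_id, label in rows:
        patients_by_label[normalize(label)].add(patient_id)

    observations = []
    for patient_id, label in rows:
        key = normalize(label)
        if len(patients_by_label[key]) < MIN_PATIENTS:
            continue  # label too rare: excluded to limit re-identification risk
        concept = next(
            (cid for pattern, cid in OCCUPATION_RULES if pattern.search(key)), None
        )
        if concept is None:
            continue  # unmappable label (assumption: skipped in this sketch)
        observations.append({
            "person_id": patient_id,
            "observation_concept_id": OCCUPATION_TYPE_CONCEPT_ID,
            "value_as_concept_id": concept,
            "value_as_string": label,           # source string kept for provenance
            "observation_source_value": label,
        })
    return observations
```

Counting distinct patients per normalized label, rather than raw rows, keeps one patient with many repeated entries from pushing a rare label over the threshold.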
Results
Out of 4200 FlowSheet measurements reviewed, initial triage detected 35 FlowSheet measurements of relevance to understanding patient-level SDoH. Assessment of FlowSheet measurement records resulted in 21 FlowSheet measurements that had interpretable concept representations available in OMOP v5.3.1; however, these 21 FlowSheet measurements were orphan questions without obvious linkage to survey templates, so the source screening tool and version could not be established. The remaining 14 FlowSheet measurements had measurement values that were non-interpretable or not amenable to secondary use and, therefore, were removed from further extraction and analysis. These 14 removed FlowSheet measurements accounted for fewer than 1000 observation records. The 21 mappable FlowSheet measurements accounted for the vast majority of SDoH Observation records (n=445222).
As of Feb 20, 2021, the UW COVID-19 OMOP Limited dataset includes 133833 patients who have been assessed for COVID-19. Of the patients assessed for COVID, 83535 patients had pre-COVID medical conditions, observations, vitals, and medical procedure information from the timeframe of Jan 1, 2010 through Dec 31, 2019 (Table 2). In contrast, 124021 patients who were assessed for COVID had at least 1 medical record generated since COVID-19 reached epidemic scales, from Jan 1, 2020 through Feb 19, 2021. The share of patients with SDoH observations ranged from 11.7% of patients (n=9795) in the pre-COVID period to 82% of patients (n=101717) since COVID-19 reached epidemic scales.
Table 2: Data availability for each SDoH.
Category | Patients (% of total patients), pre-COVID (Jan 1, 2010 through Dec 31, 2019) | Patients (% of total patients), since COVID (Jan 1, 2020 through Feb 19, 2021) | Observations (% of total observations), pre-COVID (Jan 1, 2010 through Dec 31, 2019) | Observations (% of total observations), since COVID (Jan 1, 2020 through Feb 19, 2021) |
UW COVID OMOP limited dataset (totals) | 83535 | 124021 | 21.7 M | 13.1 M |
SDoH observations | 9795 (11.7) | 101717 (82.00) | 231651 (1.00) | 213571 (1.60) |
Employment | 1878 (2.24) | 101552 (81.88) | 35095 (0.16) | 124131 (0.94) |
Housing insecurity / Instability / Homelessness | 5178 (6.19) | 4472 (3.60) | 117935 (0.54) | 48148 (0.36) |
Education | 5309 (6.35) | 3776 (3.04) | 32545 (0.15) | 21419 (0.16) |
Household size | 1149 (1.37) | 551 (0.44) | 27983 (0.12) | 11494 (0.08) |
Immigration / migrant status / Refugee status | 1674 (2.00) | 911 (0.73) | 17163 (0.07) | 7456 (0.05) |
Food insecurity | 179 (0.21) | 184 (0.14) | 930 (0.00) | 887 (0.00) |
For the 133833 patients, approximately 34.8 million patient observations were extracted into the UW COVID OMOP limited dataset as of Feb 19, 2021, of which 1.2% were SDoH observations (n=445222 total; n=231651 pre-COVID; n=213571 since COVID reached epidemic scales). Since Jan 1, 2020, 81.88% of patients (n=101552) assessed for COVID-19 have provided updates to their ‘Employment’ information. Pre-COVID documentation of patient SDoH was sparse and most frequently captured information on ‘Housing insecurity / Instability / Homelessness’, followed by ‘Employment’, ‘Education’ and ‘Household size’. The earliest indications of FlowSheet adoption for SDoH observations started in Aug 2010 for ‘Housing insecurity / Instability / Homelessness’. FlowSheet documentation for patient ‘Employment’, ‘Education’, ‘Immigration’, and ‘Household size’ began in Aug 2014 and later included ‘Food insecurity’ in Apr 2018. In contrast, since COVID, the Patients-to-SDoH-observations ratio indicates that knowledge about patient SDoH has increased drastically, especially for Employment information; other SDoH observations have been collected at a slower pace and indicate repeated data collection for a small pool of patients.
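The patients-to-observations comparison can be reproduced from the counts in Table 2 with a short calculation. The sketch below is illustrative only: the counts are copied from Table 2, while the dictionary layout and function names are assumptions.

```python
# (patients, observations) per period, copied from Table 2
TABLE_2 = {
    "Employment": {"pre": (1878, 35095), "since": (101552, 124131)},
    "Housing insecurity / Instability / Homelessness": {"pre": (5178, 117935), "since": (4472, 48148)},
    "Education": {"pre": (5309, 32545), "since": (3776, 21419)},
    "Household size": {"pre": (1149, 27983), "since": (551, 11494)},
    "Immigration / migrant status / Refugee status": {"pre": (1674, 17163), "since": (911, 7456)},
    "Food insecurity": {"pre": (179, 930), "since": (184, 887)},
}
TOTAL_PATIENTS = {"pre": 83535, "since": 124021}  # dataset totals from Table 2


def observations_per_patient(patients, observations):
    """Average number of SDoH observations recorded per patient."""
    return observations / patients


def patient_share(patients, period):
    """Percentage of the period's patients with at least one such observation."""
    return 100 * patients / TOTAL_PATIENTS[period]


def summarize():
    rows = []
    for category, periods in TABLE_2.items():
        for period, (patients, observations) in periods.items():
            rows.append((
                category,
                period,
                round(patient_share(patients, period), 2),
                round(observations_per_patient(patients, observations), 2),
            ))
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

With these counts, Employment falls from roughly 19 observations per patient before COVID to about 1.2 since COVID, which is consistent with broad one-time collection, whereas Housing stays above 10 observations per patient, which is consistent with repeated collection for a small pool of patients.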
Since COVID, of the 101552 patients who provided Employment information, approximately 101497 patients described their ‘Employment: current occupation status’. Based on the most recent ‘current occupation status’ information, 34% indicated ‘Full-time’ employment, 27% ‘unemployed’, 18% ‘retired’ and 17% ‘undetermined’, ‘Part-time’, ‘self-employed’ or a ‘student’ status. Only 9.1% of the patients who provided Employment information (n=9344) provided their occupation title. We were able to map 90% of the occupation titles: 46% were ‘retired’, ‘disabled’, ‘unemployed’, ‘self-employed’, or a ‘student’ status, and 44% of the occupation titles (n=3737) were successfully mapped to the 23 National Trauma Data Standard occupation concepts [https://athena.ohdsi.org/search-terms/terms/44786930], leaving 10% of occupations unmappable. The discrepancy between 9344 patients and 101552 patients indicates that 91% of patients did not provide their occupation titles. Figure 2 depicts the collection of SDoH information for patients within this dataset and the shift in recent months. The recent spike in ‘Employment’ information coincides closely with the phased COVID-19 vaccine deployment (Figure 2b). Collection of SDoH information in every category since the start of the pandemic has exceeded, or come within one order of magnitude of, the amount collected over the past 10 years.
Discussion
In this study, we characterized SDoH data collected in EHRs at a healthcare delivery institution and stored in the electronic data warehouse. This characterization allowed for queries to extract these data into a common data model that was used for sharing data with a national cohort study. Data were primarily extracted from Epic flowsheet data, which represented assessments in ambulatory care of SDoH. We found that these data varied by type of SDoH data and where it was collected, in the time period that it was collected, and in how it was represented.
The data we collected from flowsheet data involved a process of manually reviewing the data elements that were being used, by ranking them according to frequency and then assessing whether they were SDoH-related. Many of the concepts were available as social history concepts, and standard documentation structures exist in many EHRs in similar ways. However, we found many concepts that were not part of this single approach for collecting data and that often represented multiple assessment tools that may have overlap in content. This is important to recognize with research data sets that may not have specifically identified and queried SDoH data. Unlike more common data elements that are consistently recorded and extracted (e.g., diagnosis and laboratory results), SDoH assessments will be incomplete without focused queries.
Even when attempting to be comprehensive in gathering structured SDoH data, we found that many of the data elements were stored on a minority of patient records. Business and operational concerns, such as subjective screening for program eligibility and standardized, department-specific clinical intake workflows, drive data collection. SDoH observations may be stored on fewer than 10% of patients seen in health settings for any specific measure in earlier years,5,6 which may be explained by either new programs and policies or by migration from paper to electronic processes. Espinoza et al. provide some understanding of how this occurs with their development of a maturity model for SDoH data.34 Briefly, the proposed model describes five domains of institutional capacity (data collection policies, data collection methods, technology platforms, analytics capabilities, and operational and strategic impact) across seven detailed levels of maturity adapted for each domain. When SDoH measures are only stored for specific populations conditional on services provided or location treated, it can reflect a combinatorial data completeness problem of assessment, data collection and data accessibility on the different measures.
The change in data availability over time demonstrates these data collection issues and how they reflect the potential use of those data. Employment information as SDoH data is generally pulled from intake forms and demographic information. Until the COVID-19 pandemic, it had been collected for approximately one in 40 patients. During the pandemic, it increased significantly, possibly due to interest in how employment as “essential workers” may affect individuals’ infection risk, employment-linked vaccine eligibility screening, or perhaps a wider concern about employment loss during the accompanying economic effects. There was then a large spike where the majority of patients had employment information documented, coinciding with vaccine deployment. This demonstrated how a targeted need and workflow for collecting data could dramatically change completeness. This was not surprising, but the magnitude of the change was impressive. This variation also highlights the ways in which standardization into a common data model may lose contextual information related to operational drivers of data collection that may need to be recovered with metadata related to organization, policy, physician practice patterns, and care setting.
Collecting, harmonizing, and tracking SDoH will likely prove highly beneficial, as addressing SDoH is now promoted to improve population health outcomes and cost savings.35 An increased understanding of SDoH helps providers connect patients with relevant social services and target vulnerable populations with health-improving social policies and programs.16 Lofters et al. found that using self-reported SDoH allows primary care centers to identify cancer disparities.36 Another study screened children for SDoH during well-care checkups, identified that 25% had unmet needs, and provided relevant services.37 Garg et al. conducted a randomized controlled trial that systematically screened all patients using a validated SDoH instrument and saw an increase in the use of community services within their patient population.38 The work presented in this article provides the basis for ongoing procedure development to address patient SDoH data collection and use for continued benefit.
There remain important limitations to this work. First, this study was done at a single institution in a single region of the country, studying SDoH during a period of time when there was increased focus on SDoH. UW Medicine employees and staff were excluded as their COVID testing and prior medical information were considered protected from research use. Data likely look different in different organizations, though how they differ may reflect the maturity of SDoH initiatives at the organizations more than regional variables. Methods for automated text analysis and taxonomy development may streamline future efforts by analyzing not only text content but also incorporating priors from structured and contextual information from FlowSheets. We did not quantify inter-annotator agreement as a measure of validity; instead, we checked the concept mappings with subject matter experts in EHR data engineering and SDoH data collection. In addition, we studied only data that were collected in structured forms and did not evaluate data completeness compared to information in narrative text reports. The FlowSheet measurements that were mapped did not have survey or version documentation available. Concept mappings and concept coverage are limited to the vocabularies available within OMOP v5.3.1. Other researchers have demonstrated how SDoH variables can be extracted from narrative text and that doing so significantly increases data completeness.10,11 Knowing the extent of the increase could be helpful in determining strategies for increasing SDoH data. Finally, it is unknown whether the changes in SDoH data collection reflect only the COVID pandemic and whether they will be sustained. Continued monitoring of these metrics will be useful to identify how the data are changing long-term.
Conclusion
Reliable SDoH data are vital for understanding COVID-19 risks and outcomes, as well as for prioritizing medical resources. While there are large amounts of SDoH data available within most EHRs, these data do not conform to common data models, making them difficult to analyze at large scales. There are a multitude of methods for documenting SDoH data. Structured forms in EHR systems can replicate standard assessment instruments, but these forms can vary within and across institutions and may change over time. To enable semantic interoperability, we developed a workflow for triaging EHRs for patient-level SDoH and then mapping observational data elements into the OMOP common data model. We found significant data completeness issues, though we identified increases in the collection of specific SDoH elements. Our work demonstrates the feasibility of making SDoH data elements readily available in OMOP data warehouses.
Acknowledgements
This work was partially funded by the National Center for Data to Health (CD2H) grant [NIH/NCATS U24TR002306], supplemental funding from the National COVID Cohort Collaborative [NIH/NCATS U24TR002306-04S3], and grant funding by the Bill and Melinda Gates Foundation [BMGF INV-016910,“COVID-19: Data Analytics on Cases in the Pacific Northwest”].
References
- 1.Rollston R, Galea S. COVID-19 and the Social Determinants of Health. Am J Health Promot [Internet] 2020 [cited 2021 Jun 6];34:687–9. doi: 10.1177/0890117120930536b. Available at: http://journals.sagepub.com/doi/10.1177/0890117120930536b. [DOI] [PubMed] [Google Scholar]
- 2.Ramírez IJ, Lee J. COVID-19 Emergence and Social and Health Determinants in Colorado: A Rapid Spatial Analysis. International Journal of Environmental Research and Public Health [Internet] 2020[cited 2020 Nov 9];17:3856. doi: 10.3390/ijerph17113856. Available at: [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Haendel MA, Chute CG, Gersing K. The National COVID Cohort Collaborative (N3C): Rationale, Design, Infrastructure, and Deployment. Journal of the American Medical Informatics Association [Internet] 2020. [cited 2020 Nov 9]; Available at: [DOI] [PMC free article] [PubMed]
- 4.Bennett TD, Moffitt RA, Hajagos JG, Amor B, Anand A, Bissell MM, et al. The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction [Internet] Health Informatics. 2021. Jan [cited 2021 Jul 6]. Available at: http://medrxiv.org/lookup/doi/10.1101/2021.01.12.21249511.
- 5.Hatef E, Rouhizadeh M, Tia I, Lasser E, Hill-Briggs F, Marsteller J, et al. Assessing the Availability of Data on Social and Behavioral Determinants in Structured and Unstructured Electronic Health Records: A Retrospective Analysis of a Multilevel Health Care System. JMIR Med Inform [Internet] 2019[cited 2021 Feb 18];7:e13802. doi: 10.2196/13802. Available at: http://medinform.jmir.org/2019/3/e13802/ [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Truong HP, Luke AA, Hammond G, Wadhera RK, Reidhead M, Joynt Maddox KE. Utilization of Social Determinants of Health ICD-10 Z-Codes Among Hospitalized Patients in the United States, 2016-2017. Medical Care [Internet] 2020 [cited 2021 Feb 18];58:1037–43. doi: 10.1097/MLR.0000000000001418. Available at: https://journals.lww.com/10.1097/MLR.0000000000001418. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Moscrop A, Ziebland S, Bloch G, Iraola JR. If social determinants of health are so important, shouldn’t we ask patients about them? BMJ [Internet] 2020. [cited 2021 Feb 18];m4150. Available at: https://www.bmj.com/lookup/doi/10.1136/bmj.m4150. [DOI] [PubMed]
- 8.Giroir BP. Healthy People 2030: A Call to Action to Lead America to Healthier Lives. Journal of Public Health Management and Practice [Internet] 2020. [cited 2021 Jun 6];Publish Ahead of Print. Available at: https://journals.lww.com/10.1097/PHH.0000000000001266. [DOI] [PMC free article] [PubMed]
- 9.Samuels‐Kalow ME, Ciccolo GE, Lin MP, Schoenfeld EM, Camargo CA. The terminology of social emergency medicine: Measuring social determinants of health, social risk, and social need. Journal of the American College of Emergency Physicians Open [Internet] 2020 [cited 2021 Feb 18];1:852–6. doi: 10.1002/emp2.12191. Available at: https://onlinelibrary.wiley.com/doi/10.1002/emp2.12191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Dorr D, Bejan CA, Pizzimenti C, Singh S, Storer M, Quinones A. Identifying Patients with Significant Problems Related to Social Determinants of Health with Natural Language Processing. 264:1456–7. doi: 10.3233/SHTI190482. [DOI] [PubMed] [Google Scholar]
- 11.Lybarger K, Ostendorf M, Yetisgen M. Annotating Social Determinants of Health Using Active Learning, and Characterizing Determinants Using Neural Event Extraction. Available at: https://arxiv.org/abs/2004.05438. [DOI] [PMC free article] [PubMed]
- 12.Teng A, Wilcox A. A Simplified Framework to Extract Social Determinants: A Data Science Approach. Submitted for publication.
- 13.Winden TJ, Chen ES, Monsen KA, Melton GB. Evaluation of Flowsheet Documentation in the Electronic Health Record for Residence, Living Situation, and Living Conditions. 2018:236–45. [PMC free article] [PubMed] [Google Scholar]
- 14.Gold R, Bunce A, Cowburn S, Dambrun K, Dearing M, Middendorf M, et al. Adoption of Social Determinants of Health EHR Tools by Community Health Centers. Ann Fam Med [Internet] 2018 [cited 2021 Feb 18];16:399–407. doi: 10.1370/afm.2275. Available at: http://www.annfammed.org/lookup/doi/10.1370/afm.2275. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Cottrell EK, Dambrun K, Cowburn S, Mossman N, Bunce AE, Marino M, et al. Variation in Electronic Health Record Documentation of Social Determinants of Health Across a National Network of Community Health Centers. American Journal of Preventive Medicine [Internet] 2019. [cited 2020 Dec 17];57:S65-73. Available at: https://linkinghub.elsevier.com/retrieve/pii/S0749379719303228.
- 16.Garg A, Boynton-Jarrett R, Dworkin PH. Avoiding the Unintended Consequences of Screening for Social Determinants of Health. JAMA [Internet] 2016. [cited 2021 Jul 6];316:813. Available at: http://jama.jamanetwork.com/article.aspx?doi=10.1001/jama.2016.9282.
- 17.Page-Reeves J, Kaufman W, Bleecker M, Norris J, McCalmont K, Ianakieva V, et al. Addressing Social Determinants of Health in a Clinic Setting: The WellRx Pilot in Albuquerque, New Mexico. The Journal of the American Board of Family Medicine [Internet] 2016 [cited 2021 Jul 6];29:414–8. doi: 10.3122/jabfm.2016.03.150272. Available at: http://www.jabfm.org/cgi/doi/10.3122/jabfm.2016.03.150272.
- 18.Gottlieb L. Uses and Misuses of Patient- and Neighborhood-level Social Determinants of Health Data. The Permanente Journal [Internet] 2018 [cited 2020 Dec 17].
- 19.Henrikson NB, Blasi PR, Dorsey CN, Mettert KD, Nguyen MB, Walsh-Bailey C, et al. Psychometric and Pragmatic Properties of Social Risk Screening Tools: A Systematic Review. American Journal of Preventive Medicine [Internet] 2019 [cited 2020 Dec 17];57:S13–24. doi: 10.1016/j.amepre.2019.07.012.
- 20.Gold R, Bunce A, Cottrell E, Marino M, Middendorf M, Cowburn S, et al. Study protocol: a pragmatic, stepped-wedge trial of tailored support for implementing social determinants of health documentation/action in community health centers, with realist evaluation. Implementation Science. 2019. [cited 2020 Dec 17];14.
- 21.Kindig DA. A Population Health Framework for Setting National and State Health Goals. JAMA [Internet] 2008 [cited 2019 Dec 20];299:2081. doi: 10.1001/jama.299.17.2081.
- 22.Kharrazi H, Gamache R, Weiner J. Role of Informatics in Bridging Public and Population Health. In: Magnuson JA, Dixon BE, editors. Public Health Informatics and Information Systems [Internet]. Cham: Springer International Publishing; 2020 [cited 2020 Nov 9]. pp. 59–79. (Health Informatics). Available at: http://link.springer.com/10.1007/978-3-030-41215-9_5.
- 23.Kim HN, Lan KF, Nkyekyer E, Neme S, Pierre-Louis M, Chew L, et al. Assessment of Disparities in COVID-19 Testing and Infection Across Language Groups in Seattle, Washington. JAMA Network Open [Internet] 2020 [cited 2020 Oct 2];3:1–4. doi: 10.1001/jamanetworkopen.2020.21213.
- 24.Schmitt-Grohé S, Teoh K, Uribe M. Covid-19: Testing Inequality in New York City [Internet]. Cambridge, MA: National Bureau of Economic Research; 2020 Apr [cited 2020 Nov 9]. Report No.: w27019. Available at: http://www.nber.org/papers/w27019.
- 25.Seto E, Min E, Ingram C, Cummings B, Farquhar SA. Community-Level Factors Associated with COVID-19 Cases and Testing Equity in King County, Washington. International Journal of Environmental Research and Public Health [Internet] 2020. [cited 2021 Jan 6];17:9516. Available at: https://www.mdpi.com/1660-4601/17/24/9516.
- 26.McClung N, Chamberland M, Kinlaw K, Matthew DB, Wallace M, Bell B, et al. The Advisory Committee on Immunization Practices’ Ethical Principles for Allocating Initial Supplies of COVID-19 Vaccine — United States, 2020 [Internet] U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health. 2012 Oct [cited 2021 Jan 21]. (Morbidity and Mortality Weekly Report). Available at: https://www.cdc.gov/niosh/docs/2012-161/
- 27.Voss EA, Makadia R, Matcho A, Ma Q, Knoll C, Schuemie M, et al. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases. Journal of the American Medical Informatics Association [Internet] 2015 [cited 2020 May 28];22:553–64. doi: 10.1093/jamia/ocu023.
- 28.Recalde M, Roel E, Pistillo A, Sena AG, Prats-Uribe A, Ahmed W-U-R, et al. Characteristics and outcomes of 627 044 COVID-19 patients with and without obesity in the United States, Spain, and the United Kingdom [Internet] Epidemiology. 2020 Sep [cited 2021 Jul 6]. Available at: http://medrxiv.org/lookup/doi/10.1101/2020.09.02.20185173.
- 29.Prieto-Alhambra D, Kostka K, Duarte-Salles T, Prats-Uribe A, Sena A, Pistillo A, et al. Unraveling COVID-19: a large-scale characterization of 4.5 million COVID-19 cases using CHARYBDIS [Internet]. In Review; 2021 Mar [cited 2021 Jul 6]. Available at: https://www.researchsquare.com/article/rs-279400/v1.
- 30.Burn E, You SC, Sena A, Kostka K, Abedtash H, Abrahao MTF, et al. Deep phenotyping of 34,128 patients hospitalised with COVID-19 and a comparison with 81,596 influenza patients in America, Europe and Asia: an international network study. 2020. [cited 2020 Nov 9]; Available at: http://medrxiv.org/lookup/doi/10.1101/2020.04.22.20074336.
- 31.UCSF Social Interventions Research & Evaluation Network (SIREN). Social Needs Screening Tool Comparison Table [Internet] 2020. [cited 2021 Mar 10]. Available at: https://sirenetwork.ucsf.edu/SocialNeedsScreeningToolComparisonTable.
- 32.OHDSI ATHENA standardized vocabularies [Internet]. [cited 2021 Jul 2] Available at: https://athena.ohdsi.org/
- 33.2018 Census Occupation Code List with Crosswalk. US Census Bureau. 2019. [cited 2020 Aug 3]. Available at: https://www2.census.gov/programs-surveys/demo/guidance/industry-occupation/2018-occupation-code-list-and-crosswalk.xlsx.
- 34.Espinoza J, Meeker D, Bahroos N. Social and Environmental Determinants of Health (SEDoH) in Clinical Informatics. Presented at: CD2H Informatics Maturity And Best Practices Community Meeting; 2020 May 21.
- 35.Thornton RLJ, Glover CM, Cené CW, Glik DC, Henderson JA, Williams DR. Evaluating Strategies For Reducing Health Disparities By Addressing The Social Determinants Of Health. Health Affairs [Internet] 2016 [cited 2021 Jul 6];35:1416–23. doi: 10.1377/hlthaff.2015.1357. Available at: http://www.healthaffairs.org/doi/10.1377/hlthaff.2015.1357.
- 36.Lofters AK, Schuler A, Slater M, Baxter NN, Persaud N, Pinto AD, et al. Using self-reported data on the social determinants of health in primary care to identify cancer screening disparities: opportunities and challenges. BMC Fam Pract [Internet] 2017. [cited 2021 Jul 6];18:31. Available at: http://bmcfampract.biomedcentral.com/articles/10.1186/s12875-017-0599-z.
- 37.Higginbotham K, Crutcher TD, Karp SM. Screening for Social Determinants of Health at Well-Child Appointments: A Quality Improvement Project. Nursing Clinics of North America [Internet] 2019;54:141–8. doi: 10.1016/j.cnur.2018.10.009.
- 38.Garg A, Toy S, Tripodis Y, Silverstein M, Freeman E. Addressing Social Determinants of Health at Well Child Care Visits: A Cluster RCT. PEDIATRICS [Internet] 2015. [cited 2021 Jul 6];135:e296-304. Available at: http://pediatrics.aappublications.org/cgi/doi/10.1542/peds.2014-2888.