Paediatr Perinat Epidemiol. 2022 Jan 25;36(4):508–517. doi: 10.1111/ppe.12863

Performance of diagnostic coding and laboratory testing results to measure COVID‐19 during pregnancy and associations with pregnancy outcomes

Annette K Regan 1,2,3, Onyebuchi A Arah 2,4, Sheena G Sullivan 2,5
PMCID: PMC9303575  PMID: 35077581

Abstract

Background

Large‐scale evaluation of COVID‐19 is likely to rely on the quality of ICD coding. However, little is known about the validity of ICD‐coded COVID‐19 diagnoses.

Objectives

To evaluate the performance of diagnostic codes in detecting COVID‐19 during pregnancy.

Methods

We used data from a national cohort of 78,283 individuals with a pregnancy ending between 11 March 2020 and 31 January 2021 in the OptumLabs® Data Warehouse (OLDW). OLDW is a longitudinal, real‐world data asset with de‐identified administrative claims and electronic health record data. We identified all services with an ICD‐10‐CM diagnostic code of U07.1 and all laboratory claims records for COVID‐19 diagnostic testing. We compared ICD‐coded diagnoses to testing results to estimate positive and negative predictive values (PPV and NPV). To evaluate impact on risk estimation, we estimated risk of adverse pregnancy outcomes by source of exposure information.

Results

Of 78,283 pregnancies, 5644 had a laboratory test result for COVID‐19. Testing was most common among older individuals, Hispanic individuals, those with higher socioeconomic status and those with a diagnosed medical condition or pregnancy complication; 52% of COVID‐19 cases were identified through ICD‐coded diagnosis alone, 19% from laboratory test results alone and 29% from both sources. Agreement between ICD‐coded diagnosis and laboratory testing records was high (91%; 95% confidence interval [CI] 90, 92). However, the PPV of ICD‐coded diagnosis was low (36%; 95% CI 33, 39). We observed up to a 50% difference in risk estimates of adverse pregnancy outcomes when exposure was based on laboratory testing results or diagnostic coding alone.

Conclusions

More than one‐in‐five COVID‐19 cases would be missed by using ICD‐coded diagnoses alone to identify COVID‐19 during pregnancy. Epidemiological studies exclusively relying on diagnostic coding or laboratory testing results are likely to be affected by exposure misclassification. Research and surveillance should draw upon multiple sources of COVID‐19 diagnostic information.

Keywords: COVID‐19, diagnostic coding, pregnancy, SARS‐CoV‐2, validity


Synopsis

Study question

How well do diagnostic codes measure COVID‐19 during pregnancy?

What’s already known

Many studies examining the perinatal health impacts of COVID‐19 have utilised diagnostic coding from medical records to identify infections. However, the validity of these codes among pregnant patients has not yet been evaluated.

What this study adds

In a large cohort of pregnancies, we found that diagnostic coding without laboratory testing information would miss nearly 20% of COVID‐19 cases, resulting in exposure misclassification. Reliance on either diagnostic coding or laboratory testing data exclusively can bias risk estimates for adverse pregnancy outcomes by up to 50%. To avoid misclassification, epidemiology studies should draw from multiple sources of COVID‐19 information.

1. BACKGROUND

Large‐scale epidemiological studies often draw from existing medical records. These studies commonly use diagnostic coding systems, such as the International Classification of Diseases (ICD) system, to detect conditions of interest. In the context of the ongoing pandemic, large studies are likely to rely on ICD‐coded diagnoses for documenting COVID‐19. Given growing interest in evaluating the effects of COVID‐19 on maternal and infant health, 1 , 2 pregnant persons are one such group where large‐scale evaluation utilising diagnostic coding is likely. However, the validity and reliability of these codes for measuring COVID‐19, particularly among pregnant persons, remain uncertain.

ICD‐10‐CM codes were released in March 2020 for emergency use in order to clinically document diagnoses of COVID‐19. 3 These codes include U07.1 (COVID‐19 diagnosis with laboratory confirmation) and U07.2 (COVID‐19 diagnosis without laboratory confirmation). While many countries have adopted both codes, in the United States, only U07.1 was adopted. Guidance on the use of U07.1 is to record only a confirmed diagnosis as (a) documented by the provider, (b) documented through a positive COVID‐19 test result or (c) documented through a presumptive positive COVID‐19 test result. 3 Analysis of national commercial insurance claims data presents an opportunity to evaluate the accuracy of ICD‐10‐CM codes for two reasons: (1) commercially insured cohorts are closed with extensive information on contact with the medical health system; and (2) since all individuals are insured, financial access to laboratory testing may have less influence on results.

Using national insurance claims data, we aimed to evaluate (a) the ability of ICD‐coded clinical diagnoses to detect COVID‐19 when compared to laboratory testing results; and (b) how reliance on different methods of COVID‐19 detection could impact measurement of associations between COVID‐19 and pregnancy outcomes.

2. METHODS

2.1. Cohort selection

We conducted a claims‐based cohort study using de‐identified administrative claims and electronic health record (EHR) data from the OptumLabs® Data Warehouse (OLDW). 4 The database includes longitudinal health information on patients and enrollees across the United States. Claims data in OLDW include medical and pharmacy claims, laboratory results and enrolment records for commercial enrollees. Pregnancies, pregnancy outcomes and gestational age were identified and estimated from medical claims data using a validated algorithm based on ICD‐9, Current Procedural Terminology (CPT) codes, Healthcare Common Procedure Coding System (HCPCS) codes and ICD‐9 procedure codes, which was modified for use with ICD‐10‐CM codes (Table S1). 5 All pregnant individuals with a date of delivery after 11 March 2020 (ie the declaration of the pandemic) were selected for analysis. We excluded ectopic and molar pregnancies, as well as pregnancies where the gestational age was inconsistent with the pregnancy outcome (eg spontaneous abortion at 32 weeks). To avoid right truncation in the cohort and ensure complete capture of medical events during the pregnancy period, we restricted analysis to those with an estimated date of conception before 30 April 2020. COVID‐19 information and pregnancy outcomes were complete in the dataset.
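The eligibility rules in this paragraph can be sketched as a small filter. This is an illustration only, not the validated claims algorithm cited above: the `Pregnancy` record, the outcome labels and the 20‐week plausibility limit are assumptions introduced for the example, while the two calendar dates come from the text.

```python
from dataclasses import dataclass
from datetime import date

PANDEMIC_START = date(2020, 3, 11)     # pandemic declaration date (from the text)
CONCEPTION_CUTOFF = date(2020, 4, 30)  # right-truncation cutoff (from the text)
# Assumption: illustrative plausibility limit. The text only gives the example
# of a spontaneous abortion recorded at 32 weeks as an inconsistent record.
MAX_SAB_WEEKS = 20


@dataclass
class Pregnancy:
    """Hypothetical record layout, not the OLDW schema."""
    conception: date
    end: date
    outcome: str      # e.g. "live_birth", "spontaneous_abortion", "ectopic"
    gest_weeks: int


def is_eligible(p: Pregnancy) -> bool:
    """Apply the exclusions and date restrictions described in the text."""
    if p.outcome in {"ectopic", "molar"}:
        return False
    if p.outcome == "spontaneous_abortion" and p.gest_weeks >= MAX_SAB_WEEKS:
        return False  # gestational age inconsistent with the outcome
    return p.end > PANDEMIC_START and p.conception < CONCEPTION_CUTOFF


example = Pregnancy(date(2019, 8, 1), date(2020, 5, 1), "live_birth", 39)
print(is_eligible(example))
```

The conception cutoff is what guards against right truncation: without it, pregnancies conceived late in the window could only appear in the data if they ended early.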

2.2. Variable measurement

We extracted physician, facility and laboratory claims records and EHR data for one year preceding and 30 days following the date of delivery. We identified COVID‐19 diagnoses from diagnosis codes (U07.1) in the physician or facility medical claims records. The presence of the diagnostic code indicated the individual had a COVID‐19 diagnosis, and we considered the absence of such a code as no COVID‐19 diagnosis. COVID‐19 laboratory tests and test results were identified using LOINC codes (Table S2) and free text information in the extracted laboratory claims record. We considered a positive COVID‐19 test result as an indication of COVID‐19 and a negative test result as an indication of no COVID‐19. All types of laboratory tests were considered, including serology, as we assumed a priori that infections at any time during pregnancy were potentially influential; however, we acknowledge that some infections detected through serology may have occurred prior to pregnancy, especially if testing was performed early in pregnancy. An encounter with a date of service within three days of the laboratory test date was considered to be the same medical event. For pregnancies with more than one unique laboratory test result, we included any positive test result. For those with consecutive negative test results, we selected any test result coinciding with a COVID‐19 diagnosis in the medical record. For those with consecutive negative laboratory test results and no clinical diagnosis, we selected one negative result at random.
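The rules for choosing one laboratory result per pregnancy can be written as a short function. This is a minimal sketch under stated assumptions: the tuple layout, the function name `select_result` and the choice of the earliest qualifying result (the text says only "any") are illustrative and not taken from the study code.

```python
import random
from datetime import date, timedelta

WINDOW = timedelta(days=3)  # same-medical-event window stated in the text


def select_result(tests, diagnosis_dates, rng=None):
    """Pick one (test_date, is_positive) tuple for a pregnancy.

    tests: list of (date, bool) laboratory results for the pregnancy.
    diagnosis_dates: dates of encounters carrying a U07.1 code.
    Returns None when the pregnancy has no laboratory result.
    """
    if not tests:
        return None
    rng = rng or random.Random(0)  # seeded so the random pick is reproducible

    # Rule 1: any positive result marks the pregnancy as test-positive.
    positives = [t for t in tests if t[1]]
    if positives:
        return min(positives)  # assumption: keep the earliest positive

    # Rule 2: all negative, so prefer a result within 3 days of a diagnosis.
    near_dx = [t for t in tests
               if any(abs(t[0] - d) <= WINDOW for d in diagnosis_dates)]
    if near_dx:
        return min(near_dx)  # assumption: keep the earliest such result

    # Rule 3: all negative and no diagnosis nearby, so pick one at random.
    return rng.choice(tests)


negatives = [(date(2020, 6, 1), False), (date(2020, 8, 1), False)]
print(select_result(negatives, [date(2020, 8, 3)]))
```

Because rule 1 takes precedence, a single positive result outweighs any number of negative results for the same pregnancy.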

Information on maternal age, race/ethnicity, residence, educational attainment and household income was derived from enrolment data and linked data supplied by an external vendor. Race/ethnicity was defined as Black, Hispanic, Asian or White and was assigned by an external vendor based on a structured, rule‐based system that combines analysis of first names, middle names, surnames and surname prefixes and suffixes with geographic reference files. Values were then categorised to comply with data de‐identification requirements. Education was estimated based on the average level of education achieved among residents within the census block group. Household income was derived using public and private consumer data for the street address of the enrollee. We identified pre‐existing health conditions using medical claims records (Table S3).

2.3. Statistical analysis

We estimated the per cent of individuals tested during pregnancy for COVID‐19, overall and among those with a COVID‐19 diagnosis. We compared the rate of testing by sociodemographic and health characteristics using log‐binomial regression models which controlled for calendar week of conception (as a cubic spline term) and estimated gestational age at pregnancy end.

Among those tested for COVID‐19, we estimated the per cent agreement between ICD‐coded diagnoses and laboratory testing results using Cohen's kappa coefficient. We additionally estimated the positive predictive value and negative predictive value of the U07.1 COVID‐19 diagnostic code. We estimated these values overall and by subgroups. To evaluate the influence of increased testing availability and changes in disease prevalence over time, we estimated values during three time periods: March 2020 to May 2020, June 2020 to September 2020 and October 2020 to January 2021. We additionally estimated values by method of laboratory detection (polymerase chain reaction vs. rapid antigen or serology), gestational age at infection (first, second or third trimester) and presence of medical risk factors including chronic medical conditions, pregnancy complications and advanced maternal age.
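For reference, the measures named in this paragraph have standard definitions on a 2×2 table. The cell labels below are notation introduced for this sketch, not the authors': a = diagnosis and positive test, b = diagnosis and negative test, c = no diagnosis and positive test, d = no diagnosis and negative test, with n = a + b + c + d and the laboratory result treated as the reference.

```latex
\begin{align*}
\text{Agreement } (p_o) &= \frac{a + d}{n}, &
\text{PPV} &= \frac{a}{a + b}, &
\text{NPV} &= \frac{d}{c + d}, \\
\text{Sensitivity} &= \frac{a}{a + c}, &
\text{Specificity} &= \frac{d}{b + d}, \\
p_e &= \frac{(a + b)(a + c) + (c + d)(b + d)}{n^2}, &
\kappa &= \frac{p_o - p_e}{1 - p_e}.
\end{align*}
```

Per cent agreement and kappa answer different questions: the first counts matching classifications, while kappa discounts the agreement expected by chance, which is large when most individuals are negative on both measures.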

To evaluate how different sources of information could influence measures of association, we used Cox proportional hazard models treating COVID‐19 as a time‐varying exposure to estimate the risk of adverse birth outcomes associated with prenatal COVID‐19, using three different methods of identifying COVID‐19: laboratory testing only, diagnostic coding only or a combination of both. Models adjusted for maternal age, race/ethnicity, household income, pre‐existing medical conditions and week of conception (cubic spline).
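Treating COVID‐19 as a time‐varying exposure usually means splitting each pregnancy's follow‐up into an unexposed interval before infection and an exposed interval after it (the counting‐process layout that Cox software typically expects). The sketch below shows only that data step. The time scale (gestational weeks) and the function name are assumptions, since the paper does not describe its data layout.

```python
def split_followup(end_week, event, infection_week=None):
    """Return (start, stop, exposed, event) rows for one pregnancy.

    end_week: gestational week at pregnancy end (end of follow-up).
    event: True if the outcome of interest occurred at end_week.
    infection_week: week of the first COVID-19 record, or None if never infected.
    """
    # Never infected, or infection recorded at/after the end of follow-up:
    # the whole interval counts as unexposed person-time.
    if infection_week is None or infection_week >= end_week:
        return [(0, end_week, 0, int(event))]
    # Infection recorded at or before the start of follow-up: all exposed.
    if infection_week <= 0:
        return [(0, end_week, 1, int(event))]
    # Otherwise split at the infection week; the event belongs to the last row.
    return [
        (0, infection_week, 0, 0),
        (infection_week, end_week, 1, int(event)),
    ]


# A preterm birth at 36 weeks after an infection recorded at 30 weeks.
print(split_followup(36, True, infection_week=30))
```

Splitting this way keeps the weeks before infection in the unexposed risk set, which avoids crediting pre‐infection survival time to the exposed group.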

2.3.1. Missing data

Missing covariate information was imputed using multiple imputation by bootstrapping and expectation‐maximization algorithm with 50 sets of imputations.

2.4. Ethics approval

Because this study involved analysis of pre‐existing, de‐identified data, it was considered exempt from Institutional Review Board approval.

3. RESULTS

Of the 86,111 pregnancies identified, 78,283 pregnancies had a date of conception before 30 April 2020 and were eligible for inclusion in the analysis (Figure 1); 5644 (7.2%, 95% CI 7.0%, 7.4%) pregnancies had a laboratory test result for COVID‐19. Laboratory testing peaked between June and July 2020 (Figure 2). The majority of laboratory tests (75.4%; n = 4253) were performed using RT‐PCR; 1391 (24.6%) laboratory tests were performed by serology or rapid antigen testing. COVID‐19 testing rates increased with maternal age (RR ≥40 years vs. <24 years: 1.37, 95% CI 1.23, 1.51) and were higher among pregnancies with an asthma diagnosis (RR 1.26, 95% CI 1.18, 1.35), a diagnosed pregnancy complication (RR 1.09, 95% CI 1.04, 1.14) and caesarean delivery (RR 1.23, 95% CI 1.17, 1.29) (Table 1). Testing was less common among non‐Hispanic Black pregnant individuals (RR 0.91, 95% CI 0.83, 0.99) and those residing in rural compared with metropolitan areas (RR 0.58, 95% CI 0.42, 0.73). Among those with a clinical diagnosis of COVID‐19, similar factors were associated with COVID‐19 testing, with the exception of caesarean delivery, age and race/ethnicity.

FIGURE 1. Flow diagram for selection of pregnancies into the study cohort

FIGURE 2. Proportion of pregnancies tested for COVID‐19, OptumLabs Data Warehouse, United States, 11 March 2020 to 31 January 2021. Y‐axis values are suppressed to comply with requirements for release of OptumLabs data

TABLE 1.

Rate ratios comparing the rate of testing during pregnancy—OptumLabs Data Warehouse, United States, 1 March to 31 December 2020

Columns, left to right: characteristic; all pregnancies (number of pregnancies; tested, n (%); aRR a (95% CI)); pregnancies diagnosed with COVID‐19 (number of pregnancies; tested, n (%); aRR a (95% CI))
Total 78,283 5644 (7.2) 2515 596 (23.7)
Maternal age (years)
<24 5684 345 (6.1) 1.00 (Reference) 277 62 (22.4) 1.00 (Reference)
25–29 19,496 1325 (6.8) 1.14 (1.03, 1.26) 773 195 (25.2) 1.14 (0.89, 1.39)
30–34 29,207 2157 (7.4) 1.27 (1.16, 1.38) 834 198 (23.7) 1.11 (0.87, 1.36)
35–39 18,348 1409 (7.7) 1.38 (1.27, 1.49) 481 116 (24.1) 1.15 (0.89, 1.42)
≥40 5548 408 (7.3) 1.37 (1.23, 1.51) 150 25 (16.7) 0.79 (0.38, 1.21)
Race/ethnicity
White, non‐Hispanic 48,969 3509 (7.2) 1.00 (Reference) 1346 309 (22.9) 1.00 (Reference)
Black, non‐Hispanic 9205 603 (6.5) 0.91 (0.83, 0.99) 380 83 (21.8) 0.97 (0.76, 1.18)
Hispanic 13,013 1043 (8.0) 1.11 (1.04, 1.17) 601 154 (25.6) 1.15 (0.98, 1.32)
Asian 7096 489 (6.9) 0.97 (0.88, 1.06) 188 50 (26.6) 1.13 (0.88, 1.39)
Education
≤High school graduate 16,771 1747 (10.4) 1.00 (Reference) 699 167 (23.9) 1.00 (Reference)
Some college 40,135 2714 (6.8) 0.84 (0.77, 0.91) 1222 284 (23.2) 0.96 (0.77, 1.15)
≥College graduate 21,377 1183 (5.5) 0.82 (0.76, 0.88) 594 145 (24.4) 0.93 (0.76, 1.11)
Residence
Metropolitan 70,452 5266 (7.5) 1.00 (Reference) 2215 549 (24.8) 1.00 (Reference)
Micropolitan 4297 224 (5.2) 0.69 (0.56, 0.82) 164 29 (17.7) 0.66 (0.31, 0.99)
Small town/Rural 3534 154 (4.4) 0.58 (0.42, 0.73) 136 18 (13.2) 0.51 (0.07, 0.95)
Household income
<$40,000 15,564 1079 (6.9) 1.00 (Reference) 568 120 (21.1) 1.00 (Reference)
$40–74,000 17,340 1183 (6.8) 0.99 (0.91, 1.07) 546 116 (21.2) 0.99 (0.77, 1.22)
$75–124,000 21,575 1449 (6.7) 0.97 (0.89, 1.04) 681 182 (26.7) 1.27 (1.07, 1.47)
$125–199,000 13,811 1067 (7.7) 1.11 (1.03, 1.19) 417 102 (24.5) 1.15 (0.91, 1.38)
≥$200,000 9993 866 (8.7) 1.27 (1.18, 1.35) 303 76 (25.1) 1.20 (0.95, 1.44)
Medical conditions
Any medical condition 9972 829 (8.3) 1.16 (1.09, 1.23) 366 96 (26.2) 1.13 (0.94, 1.32)
Asthma 5294 480 (9.1) 1.26 (1.18, 1.35) 192 56 (29.2) 1.27 (1.03, 1.50)
Hypertension 3652 267 (7.3) 1.01 (0.89, 1.12) 138 32 (23.2) 0.96 (0.64, 1.27)
Immune disorder 160 14 (8.7) 1.24 (0.75, 1.74) 13 <11
Neurological disorder 237 15 (6.3) 0.91 (0.42, 1.40) <11 <11
No medical condition 68,311 4815 (7.0) 1.00 (Reference) 2149 500 (23.3) 1.00 (Reference)
Pregnancy complications
Any complication 23,971 1801 (7.5) 1.09 (1.04, 1.14) 820 196 (23.9) 1.02 (0.87, 1.17)
Pre‐eclampsia 3443 269 (7.8) 1.03 (0.91, 1.15) 117 30 (25.6) 1.03 (0.71, 1.36)
Gestational diabetes 8098 639 (7.9) 1.01 (0.93, 1.09) 279 59 (21.1) 0.88 (0.64, 1.12)
Bleeding in pregnancy 4264 325 (7.6) 1.10 (0.99, 1.20) 149 53 (35.6) 1.60 (1.37, 1.83)
Haemorrhage in early pregnancy 13,061 951 (7.3) 1.13 (1.07, 1.20) 442 107 (24.2) 1.03 (0.85, 1.21)
Hyperemesis 3478 269 (7.7) 1.04 (0.92, 1.15) 118 30 (25.4) 1.07 (0.74, 1.39)
Antepartum haemorrhage 4883 339 (6.9) 1.01 (0.91, 1.11) 190 60 (31.6) 1.40 (1.18, 1.62)
Postpartum haemorrhage 3835 285 (7.4) 0.96 (0.85, 1.07) 150 31 (20.7) 0.83 (0.50, 1.15)
No complications 54,312 3843 (7.1) 1.00 (Reference) 1695 400 (23.6) 1.00 (Reference)
Birth outcome
Miscarriage 7366 320 (4.3) 1.35 (1.16, 1.55) 80 <11
Medical termination 1187 22 (1.9) 0.49 (0.06, 0.93) 15 <11
Stillbirth 401 24 (6.0) 0.87 (0.49, 1.26) 14 <11
Preterm birth 4630 300 (6.5) 0.88 (0.76, 1.00) 190 41 (21.6) 1.05 (0.80, 1.30)
Spontaneous 3526 224 (6.3) 0.85 (0.72, 0.99) 134 26 (19.4) 0.78 (0.43, 1.13)
Clinician‐initiated 1104 76 (6.9) 0.96 (0.73, 1.18) 56 25 (44.6) 1.87 (1.53, 2.22)
Premature rupture of membranes 8716 587 (6.7) 0.88 (0.79, 0.96) 277 60 (21.7) 0.87 (0.64, 1.11)
Caesarean delivery 12,286 1087 (8.8) 1.23 (1.17, 1.29) 405 98 (24.2) 0.99 (0.80, 1.18)
a Relative rate of testing, controlling for timing of conception (cubic spline) and estimated gestational age at pregnancy end.

Of the 5644 pregnant individuals with information on COVID‐19 testing, 736 had a record of COVID‐19: 380 (51.6%) had only a COVID‐19 diagnosis, 140 (19.0%) had only a positive laboratory test, and 216 (29.3%) had both a positive test and a COVID‐19 diagnosis code (Figure 3). The remaining 4908 pregnant individuals tested negative for COVID‐19 and had no clinical diagnosis of COVID‐19. Agreement between laboratory testing and diagnostic coding was high (90.8%; 95% CI 90.0, 91.5). When compared to laboratory testing information, clinical diagnoses of COVID‐19 had a positive predictive value of 36.2% (95% CI 33.3, 39.3) and a negative predictive value of 97.2% (95% CI 96.9, 97.5) (Table 2). Positive predictive values increased after May 2020 and were highest for infections occurring during third trimester (42.1%, 95% CI 38.1, 46.3). We observed no differences in positive predictive values by medical risk factors.
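The four counts in this paragraph form a complete 2×2 table, so the reported point estimates can be recomputed directly. The sketch below does this in plain Python. The function name is illustrative, confidence intervals are omitted because the paper does not state which interval method it used, and the kappa value is derived here rather than reported by the authors.

```python
def diagnostic_metrics(both, dx_only, test_only, neither):
    """Accuracy measures for diagnosis codes, with the lab test as reference."""
    n = both + dx_only + test_only + neither
    dx_pos = both + dx_only        # individuals with a U07.1 diagnosis
    test_pos = both + test_only    # individuals with a positive lab result
    p_obs = (both + neither) / n   # observed agreement
    # Agreement expected by chance from the marginal totals (for kappa).
    p_exp = (dx_pos * test_pos + (n - dx_pos) * (n - test_pos)) / n**2
    return {
        "agreement": p_obs,
        "sensitivity": both / test_pos,
        "specificity": neither / (n - test_pos),
        "ppv": both / dx_pos,
        "npv": neither / (n - dx_pos),
        "kappa": (p_obs - p_exp) / (1 - p_exp),
    }


# Counts taken from the text: 216 both, 380 diagnosis only,
# 140 positive test only, 4908 neither (total 5644).
metrics = diagnostic_metrics(both=216, dx_only=380, test_only=140, neither=4908)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Run on these counts, the function reproduces the overall row of Table 2 to one decimal place (agreement 90.8%, sensitivity 60.7%, specificity 92.8%, PPV 36.2%, NPV 97.2%). The chance‐corrected kappa comes out near 0.41, noticeably lower than the raw agreement because most tested individuals were negative on both measures.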

FIGURE 3. Source of COVID‐19 identification among pregnant women, OptumLabs Data Warehouse, United States, March 2020–January 2021. Y‐axis values are suppressed to comply with requirements for release of OptumLabs data

TABLE 2.

Comparison of clinical diagnosis of COVID‐19 and COVID‐19 testing results for pregnant women (n = 5644)—OptumLabs Data Warehouse, United States, March 2020–January 2021

Columns, left to right: subgroup; tested positive (n = 356), %; diagnosis (n = 596), %; agreement (95% CI); sensitivity (95% CI); specificity (95% CI); PPV (95% CI); NPV (95% CI)
Overall 6.6 10.5 90.8 (90.0, 91.5) 60.7 (55.4, 65.8) 92.8 (92.1, 93.5) 36.2 (33.3, 39.3) 97.2 (96.9, 97.5)
By month of laboratory test
Mar 2020–May 2020 <3.1 a 10.0 89.6 (87.5, 91.5) 40.0 (19.1, 63.9) 90.7 (88.6, 92.5) 8.3 (4.9, 13.9) 98.6 (98.0, 99.0)
Jun 2020–Sept 2020 7.1 9.9 91.3 (90.4, 92.2) 58.7 (52.5, 64.8) 93.8 (93.0, 94.6) 41.9 (37.9, 46.1) 96.7 (96.3, 97.2)
Oct 2020–Jan 2021 7.6 13.4 89.9 (87.9, 91.7) 72.1 (60.9, 81.7) 91.4 (89.4, 93.1) 40.7 (34.9, 46.8) 97.6 (96.5, 98.3)
By laboratory testing method
RT‐PCR 8.1 11.0 90.8 (89.9, 91.7) 61.2 (55.8, 66.4) 93.4 (92.6, 94.2) 44.9 (41.3, 48.5) 96.5 (96.0, 96.9)
Rapid antigen or serology <3.1 a 9.2 90.7 (89.1, 92.2) 46.1 (19.2, 74.9) 91.1 (89.5, 92.6) 4.7 (2.6, 8.3) 99.5 (99.1, 99.7)
By gestation at time of laboratory test
First trimester 7.8 14.6 86.9 (83.8, 89.7) 59.5 (43.3, 74.4) 89.3 (86.2, 91.8) 32.1 (24.8, 40.3) 96.3 (94.7, 97.4)
Second trimester 6.2 10.8 89.1 (87.5, 90.5) 49.1 (39.5, 58.7) 91.7 (90.3, 93.0) 28.2 (23.5, 33.5) 96.5 (95.8, 97.0)
Third trimester 6.1 10.1 92.3 (91.4, 93.2) 67.3 (60.4, 73.7) 94.0 (93.1, 94.8) 42.1 (38.1, 46.3) 97.8 (97.3, 98.2)
By health and medical factors
High‐risk medical condition
Yes 6.4 11.6 89.0 (86.7, 91.1) 54.7 (40.5, 68.4) 91.4 (89.2, 93.3) 30.2 (23.6, 37.7) 96.7 (95.6, 97.5)
No 6.3 10.4 91.1 (90.3, 91.9) 61.7 (56.0, 67.2) 93.1 (92.3, 93.8) 37.4 (34.2, 40.7) 97.3 (96.9, 97.7)
Diagnosis of a pregnancy complication
Yes 5.6 10.9 91.1 (89.7, 92.3) 67.3 (57.3, 76.3) 92.5 (91.1, 93.7) 34.7 (30.0, 39.7) 97.9 (97.3, 98.4)
No 6.6 10.4 90.7 (89.7, 91.6) 58.0 (51.7, 64.2) 93.0 (92.1, 93.8) 37.0 (33.4, 40.8) 96.9 (96.4, 97.3)
Advanced maternal age
Yes 4.5 7.8 93.1 (91.8, 94.2) 59.3 (47.8, 70.1) 94.6 (93.5, 95.7) 34.0 (28.3, 40.3) 98.0 (97.5, 98.5)
No 7.2 11.9 89.7 (88.7, 90.7) 61.1 (55.1, 66.9) 91.9 (91.0, 92.8) 36.9 (33.6, 40.4) 96.8 (96.3, 97.3)

Abbreviations: CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value. a Exact percentages are suppressed to comply with requirements for data release.


Based on the two sources of information, although the incidence proportion of COVID‐19 would be higher if relying on laboratory testing alone (6.3%, 95% CI 5.7%, 7.0%) compared with diagnostic coding alone (3.2%, 95% CI 3.1%, 3.3%), the proportionate severity would be similar (Table 3). For measures of association, the direction was the same for the outcomes considered; however, the strength of association was not. For example, there was a weaker association between COVID‐19 and caesarean delivery when exposure was measured by laboratory testing alone (aHR 1.32, 95% CI 0.95, 1.84) compared with diagnostic coding alone (aHR 2.04, 95% CI 1.74, 2.39) (Table 3). We observed a stronger association between COVID‐19 and preterm birth when exposure was measured by laboratory testing alone (aHR 2.41, 95% CI 1.53, 3.79) compared with diagnostic coding alone (aHR 1.87, 95% CI 1.41, 2.47). Effect estimates for all other outcomes were within 0.23 units of each other.
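The incidence proportions quoted here follow from the case counts and denominators in Table 3. The short sketch below recomputes them and shows how the combined case count arises: the 2515 coded diagnoses plus the 140 individuals who had a positive test without a diagnosis code. Variable names are illustrative.

```python
# (cases, denominator) per source; counts taken from Table 3 and the Results.
sources = {
    "laboratory test result": (356, 5644),
    "diagnostic coding": (2515, 78283),
}

# Cases found by either source: every coded diagnosis plus the test-positive
# individuals who never received a U07.1 code.
test_positive_without_code = 140
sources["either source"] = (2515 + test_positive_without_code, 78283)

incidence = {name: 100 * k / n for name, (k, n) in sources.items()}
for name, pct in incidence.items():
    print(f"{name}: {pct:.1f}%")
```

The laboratory‐only figure uses the 5644 tested pregnancies as its denominator rather than the full cohort, which is one reason it is roughly double the other two.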

TABLE 3.

Incidence of COVID‐19 and measure of association between COVID‐19 and adverse pregnancy outcomes—OptumLabs Data Warehouse, United States, March 2020–January 2021

Source of COVID‐19 information, columns left to right: laboratory test result (n = 5644); diagnostic coding (n = 78,283); laboratory testing or diagnostic coding (n = 78,283)
Number of COVID−19 cases 356 2515 2655
Incidence proportion, % (95% CI) 6.3 (5.7, 7.0) 3.2 (3.1, 3.3) 3.4 (3.3, 3.5)
Proportionate severity, % (95% CI) a 3.2 (1.9, 5.5) 3.6 (2.9, 4.4) 3.4 (2.8, 4.2)
Association with birth outcome (aHR b [95% CI])
Preterm birth 2.41 (1.53, 3.79) 1.87 (1.41, 2.47) 2.07 (1.65, 2.61)
Premature rupture of membranes 1.36 (0.84, 2.20) 1.59 (1.30, 1.95) 1.54 (1.27, 1.87)
Spontaneous preterm labour 1.97 (1.15, 3.38) 1.87 (1.41, 2.47) 1.80 (1.38, 2.35)
Caesarean delivery 1.32 (0.95, 1.84) 2.04 (1.74, 2.39) 1.99 (1.71, 2.31)
Foetal growth restriction 2.13 (1.41, 3.24) 2.15 (1.80, 2.57) 2.04 (1.72, 2.43)

Abbreviation: CI, confidence interval.

a Proportionate severity indicates the proportion of COVID‐19 cases that were considered severe (ie required admission to intensive care unit, mechanical ventilation or extracorporeal membrane oxygenation, or diagnosis of acute respiratory distress syndrome).

b Hazard ratios accounting for COVID‐19 infection as a time‐varying exposure and adjusting for maternal age, race/ethnicity, household income, pre‐existing medical conditions and week of conception (cubic spline).

4. COMMENT

4.1. Principal findings

This large, national study comparing clinical diagnoses to COVID‐19 laboratory testing results provides important insights into the quality of existing data for measuring COVID‐19 infections during pregnancy. In this commercially insured cohort, we found that the rate of COVID‐19 testing did not appear to be random and varied by sociodemographic and health factors. More than one‐in‐five COVID‐19 cases would be missed by ICD‐coded diagnoses alone, and 64% of those with a COVID‐19 diagnosis had a negative laboratory result. These results imply that reliance on diagnostic coding or laboratory testing results alone from medical records will result in misclassification of disease status and, depending on which measurement is applied, could change measures of association with perinatal health outcomes. Epidemiological studies relying on existing health data should be aware of these considerations and how they may influence their study findings.

4.2. Strengths of the study

Our study had several key strengths. This study drew from a large national cohort of commercially insured US pregnant individuals with comprehensive health and medical information. A commercially insured population gave us the opportunity to study a group in which access to clinical testing was not limited. Furthermore, in this population, clinical information from multiple sources can be readily pooled.

4.3. Limitations of the data

Despite these strengths, our study had several limitations to consider. First, the reliance on commercial insurance claims records, while useful, meant that our findings may not necessarily generalise to uninsured or publicly insured populations. Second, similar to evaluations of influenza, we opted to use laboratory testing results as the ‘gold standard’ for our comparisons. However, clinical guidance allows for diagnosis of COVID‐19 in the absence of a positive laboratory test if there are clinical observations or epidemiological reasoning to indicate COVID‐19 infection. As a result, it is possible that several of our test‐negative individuals were false negatives, which would have biased our estimates of positive predictive values.

4.4. Interpretation

COVID‐19 diagnoses are relatively novel, and few studies have evaluated their validity. A recent Danish study conducted among 710 diagnosed patients with COVID‐19 between February and May 2020 showed that compared with medical record review, ICD‐coded COVID‐19 diagnoses had a positive predictive value of 99%. 6 An important distinction between this study and ours is that for our comparison of diagnoses and laboratory testing information, we restricted our sample to tested individuals to allow comparison. Medical chart review of those who test negative but have a clinical diagnosis would be helpful for elucidating reasons for diagnosis, but was not possible in the current study. A cohort study of 2201 individuals tested for COVID‐19 at the University of Utah Medical Center showed that ICD codes for fever, cough and dyspnoea had low sensitivity and negative predictive values. 7 However, this study did not evaluate the validity of COVID‐19 diagnostic codes. A single‐centre validation study among 3905 paediatric patients indicated that the U07.1 diagnosis had a sensitivity, specificity, positive predictive value and negative predictive value exceeding 90%. 8 Our findings do not align with these previous studies, which could be due to differences in patient population, clinical behaviours or a combination of both.

Our findings suggest existing data may be prone to several biases. First, in this insured cohort of pregnant individuals, although individuals would have had access to clinical testing, the rate of testing was not independent of sociodemographic and health factors. Given these factors have also been associated with more serious COVID‐19 infection, 9 this could indicate that detection of COVID‐19 is more common among those more prone to severe infection. We believe these results show that observational studies should consider the sociodemographic factors correlated with testing when using large observational databases to perform COVID‐19 research. Second, 64% of those with a clinical diagnosis had a negative COVID‐19 test. Clinical guidance is consistent with giving a diagnosis in the absence of a positive COVID‐19 test if clinical symptoms indicate. Furthermore, false‐negative rates range from 21% to 67%, depending on the duration of illness at the time of testing. 10 , 11 , 12 As a result, without medical chart review, it is difficult to say whether these diagnoses reflect true COVID‐19 cases. Since we relied on clinical records, our study would not have detected COVID‐19 among asymptomatic individuals who did not present for medical care or routine screening. As a result, it is possible that we underestimated the true incidence of COVID‐19 among pregnant persons.

Thirty‐nine per cent of COVID‐19 cases detected through laboratory testing had no diagnosis of COVID‐19, indicating that reliance on coded illnesses alone would miss these infections, including asymptomatic infections. Our analyses of how different sources of COVID‐19 information may influence effect estimates highlight that the use of laboratory testing information only or clinical diagnostic information only could result in somewhat different effect estimates for several key outcomes. We hypothesise, therefore, that differences in the measurement of the outcome in published studies 9  may be somewhat attributable to methods of measurement of the outcome. However, our results did not show consistently weaker or stronger effect estimates when clinical diagnosis was the sole means of identifying infection, suggesting that additional factors may contribute to variation in these estimates.

5. CONCLUSIONS

We believe our results highlight the importance of considering the source of information on COVID‐19 infection among pregnant individuals and how different sources of information may influence effect estimations. Regardless, utilisation of one data source alone is likely to miss a substantial portion of cases, resulting in measurement error and misclassification bias. It is important for epidemiological evaluations to consider these limitations when conducting research and interpreting findings.

CONFLICT OF INTERESTS

The authors have no conflicts of interest to disclose.

AUTHOR CONTRIBUTIONS

AKR extracted and prepared data and oversaw all aspects of project implementation. OA and SS contributed to the study design. OA and SS contributed to the development of the analytic plan. AKR performed the statistical analyses and prepared study results. AKR, OA and SS contributed to the interpretation of findings. AKR led the drafting of the report, and all co‐authors contributed to revising of the report and approved the final version.

Supporting information

Table S1‐S3

ACKNOWLEDGEMENTS

The authors would like to acknowledge Nina Veeravalli, Christine Kha, and Rick Little at OptumLabs for their technical assistance and guidance in working with the data from the OptumLabs Data Warehouse. Open access publishing facilitated by Curtin University, as part of the Wiley ‐ Curtin University agreement via the Council of Australian University Librarians.

Regan AK, Arah OA, Sullivan SG. Performance of diagnostic coding and laboratory testing results to measure COVID‐19 during pregnancy and associations with pregnancy outcomes. Paediatr Perinat Epidemiol. 2022;36:508–517. doi: 10.1111/ppe.12863

Funding information

Access to study data was financially supported by OptumLabs® and University of California Los Angeles. The WHO Collaborating Centre for Reference and Research on Influenza is supported by the Australian Government Department of Health. The funder had no influence in the design or implementation of the study or the decision to publish the study findings.

DATA AVAILABILITY STATEMENT

During the conduct of the study, the first author (AKR) had full access to the study data and takes responsibility for the integrity of the data and the accuracy of the data analysis; however, the authors do not have ongoing access to the data analysed in this study, nor do they have permission to share the study data with other researchers.

REFERENCES

  • 1. Zambrano LD, Ellington S, Strid P, et al. Update: characteristics of symptomatic women of reproductive age with laboratory‐confirmed SARS‐CoV‐2 infection by pregnancy status ‐ United States, January 22‐October 3, 2020. MMWR Morb Mortal Wkly Rep. 2020;69:1641‐1647.
  • 2. Villar J, Ariff S, Gunier RB, et al. Maternal and neonatal morbidity and mortality among pregnant women with and without COVID‐19 infection: the INTERCOVID multinational cohort study. JAMA Pediatr. 2021;175(8):817.
  • 3. CDC. International Classification of Diseases, Tenth Revision, Clinical Modification (ICD‐10‐CM). 2021. [cited 2021 July 28]. https://www.cdc.gov/nchs/icd/icd10cm.htm
  • 4. OptumLabs. OptumLabs and OptumLabs Data Warehouse (OLDW) Descriptions and Citation. July 2020.
  • 5. Ailes EC, Simeone RM, Dawson AL, Petersen EE, Gilboa SM. Using insurance claims data to identify and estimate critical periods in pregnancy: an application to antidepressants. Birth Defects Res A Clin Mol Teratol. 2016;106:927‐934.
  • 6. Bodilsen J, Leth S, Nielsen SL, Holler JG, Benfield T, Omland LH. Positive predictive value of ICD‐10 diagnosis codes for COVID‐19. Clin Epidemiol. 2021;13:367‐372.
  • 7. Crabb BT, Lyons A, Bale M, et al. Comparison of international classification of diseases and related health problems, tenth revision codes with electronic medical records among patients with symptoms of coronavirus disease 2019. JAMA Netw Open. 2020;3:e2017703.
  • 8. Blatz AM, David MZ, Otto WR, Luan X, Gerber JS. Validation of international classification of disease‐10 code for identifying children hospitalized with coronavirus disease‐2019. J Pediatr Infect Dis. 2021;10:547‐548.
  • 9. Allotey J, Stallings E, Bonet M, et al. Clinical manifestations, risk factors, and maternal and perinatal outcomes of coronavirus disease 2019 in pregnancy: living systematic review and meta‐analysis. BMJ. 2020;370:m3320.
  • 10. Kucirka LM, Lauer SA, Laeyendecker O, Boon D, Lessler J. Variation in false‐negative rate of reverse transcriptase polymerase chain reaction‐based SARS‐CoV‐2 tests by time since exposure. Ann Intern Med. 2020;173:262‐267.
  • 11. Woloshin S, Patel N, Kesselheim AS. False negative tests for SARS‐CoV‐2 infection: challenges and implications. New Engl J Med. 2020;383:e38.
  • 12. Arevalo‐Rodriguez I, Buitrago‐Garcia D, Simancas‐Racines D, et al. False‐negative results of initial RT‐PCR assays for COVID‐19: a systematic review. PLoS One. 2020;15:e0242958.


Articles from Paediatric and Perinatal Epidemiology are provided here courtesy of Wiley
