Author manuscript; available in PMC: 2026 Jan 28.
Published in final edited form as: Birth Defects Res. 2023 Jan 14;115(4):498–509. doi: 10.1002/bdr2.2151

Agreement Between Maternal Report and Medical Records on Use of Medications During Early Pregnancy in New York

Meredith M Howley 1, Sarah C Fisher 1, Margueritta A Fuentes 1, Martha M Werler 2, Melissa Tracy 3, Marilyn L Browne 1,3; the Birth Defects Study To Evaluate Pregnancy exposureS
PMCID: PMC12836282  NIHMSID: NIHMS2136968  PMID: 36640121

Abstract

Background:

Studies evaluating associations between medication use in pregnancy and birth outcomes rely on various sources of exposure information. We sought to assess agreement between self-reported use of medications during early pregnancy and medication information in prenatal medical records to understand the reliability of each of these information sources.

Methods:

We compared self-reported prescription medication use in early pregnancy to medical records from 184 New York women with deliveries in 2018 who participated in the Birth Defects Study To Evaluate Pregnancy exposureS. We assessed medications used chronically and episodically, and medications within 12 therapeutic groups. We calculated agreement using kappa (κ) coefficients, sensitivity, and specificity. We assessed differences by case/control status, maternal age, education, time to interview, and interview language.

Results:

Medications used chronically showed substantial agreement between self-report and medical records (κ=0.75, 0.61–0.88), with agreement for therapeutic groups used chronically ranging from κ=0.61 for antidiabetics to κ=1.00 for antihypertensives. Prescription medications used episodically showed worse agreement (κ=0.40, 0.25–0.54), with the lowest agreement for opioid analgesics (κ=0.20) and anti-infectives (κ=0.33). Agreement did not differ by the characteristics examined, although we observed potential differences by interview language.

Conclusions:

Among our sample, we observed good agreement between self-report and medical records for medications used chronically and substantially less agreement for medications used episodically. Differences by source may be due to poor recall in self-reports, non-adherence with prescribed medications and lack of complete prescription information within medical records. Limitations should be considered when assessing prescription medication exposures during early pregnancy in epidemiologic studies.

Keywords: prescription medications, pregnancy, self-report, exposure misclassification, agreement, Kappa, validity

BACKGROUND

Prescription medication use is common in pregnancy and increasing (Mitchell et al., 2011; Tinker et al., 2015). Many studies seek to understand the risks and safety of medication use in pregnancy. Studies investigating associations between medications and the risk of birth defects and other pregnancy outcomes often rely on maternal self-reported medication information. The accuracy of recalling medication use may be influenced by many factors, including questionnaire design, potential stigma, time between use and interview, age, education level, and presence and severity of comorbidities (Mitchell et al., 1986; Sarangarm et al., 2012; van Gelder et al., 2013). Past studies have observed that the frequency and duration of medication use affect the accuracy of reporting, with medications taken regularly having higher agreement than medications taken episodically (Cheung et al., 2017; Evandt et al., 2019; Olesen et al., 2001; Pisa et al., 2015; Sarangarm et al., 2012; Skurtveit, 2014; Stephansson et al., 2011; Sundermann et al., 2017; van Gelder et al., 2013). Importantly, no source may perfectly capture medication use, as medical records or claims databases may not reflect actual use due to potential nonadherence to prescribed medications and lack of information on over-the-counter medications.

Existing studies comparing self-reported medication use during pregnancy to prescription drug databases and pharmacy dispensing records have most often been conducted in countries outside the United States (Cheung et al., 2017; Evandt et al., 2019; Olesen et al., 2001; Pisa et al., 2015; Skurtveit, 2014; Stephansson et al., 2011; Sundermann et al., 2017; van Gelder et al., 2013), although two studies in the US compared self-reported and medical record information (Palmsten et al., 2018; Sarangarm et al., 2012). Observational studies among pregnant women would benefit from estimates of the accuracy and validity of self-reported medication information, which can inform potential exposure misclassification bias analyses that seek to refine estimates of the effects of early pregnancy medication use on the risk of birth defects or other pregnancy outcomes.

Our objective was to assess agreement between self-reported medication use in early pregnancy collected as part of the Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS) and information in prenatal medical records. Our study adds to the literature on this topic and provides comparisons to medical records using recent data from a US population-based study.

METHODS

BD-STEPS is an ongoing population-based case-control study of birth defects. At present, data are available for pregnancies ending on or after 01/01/2014 through 12/31/2018 (Tinker et al., 2015). Pregnancies affected by one or more of 17 major structural birth defects (“cases”) are eligible, excluding those attributed to known chromosomal or single-gene abnormalities. Clinical information is obtained from birth defects surveillance programs in seven states (Arkansas, California, Georgia, Iowa, Massachusetts, New York, and North Carolina). Liveborn infants without major birth defects (“controls”) are randomly selected from vital or hospital records in the same time period and geographic area as the cases. Women with eligible pregnancies are invited to participate in a computer-assisted telephone interview between 6 weeks and 18 months after the estimated date of delivery (EDD). BD-STEPS and this study were approved by the Centers for Disease Control and Prevention institutional review board. Our validation study included BD-STEPS participants from New York with EDDs in 2018. Among eligible NY women with 2018 EDDs, 58% of cases and 48% of controls consented to and participated in the interview.

BD-STEPS collects self-reported information on a range of exposures in early pregnancy (defined as the month before through the third month of pregnancy). The interview is structured so that women are asked about medication use in several ways: use for specific chronic and infectious diseases, use of specific medication groups (e.g., antibiotics) and medications (e.g., isotretinoin), and a “catch all” question about any other medications. For each reported medication, women are asked the indication, start and stop dates of use, frequency of use, and dosage. To assist with recall, women are sent a calendar covering the period of their pregnancy and a chart to help summarize their medication information. The chart is designed for women to use to prepare for the interview and has places for them to document names, dosage, and frequency of medication use.

We developed a database to collect medication information from medical records (including medication name, indication, dates of use, frequency, and dosage). An experienced BD-STEPS medical record abstractor pursued all prenatal and delivery records from hospitals and obstetric providers, including paper records, electronic records, and regional health information organizations. Of the 191 women eligible, we abstracted medical records for 184 (96%) women and could not locate records for seven women.

We abstracted information on any medications documented in the medical record as being used or prescribed during early pregnancy. We classified all medications as available by prescription only, over-the-counter only, or both. We limited our analysis to medications only available by prescription as we expected these to be most reliably collected in the medical record. We grouped prescription medications into 12 therapeutic classes: 1) antidiabetics; 2) antiemetics; 3) antiepileptics; 4) antihypertensives; 5) anti-infectives, including antibiotics, antifungals, and antivirals; 6) asthma medications; 7) fertility-related medications; 8) migraine medications; 9) opioid analgesics; 10) psychotherapeutics, including antidepressants, anxiolytics, and antipsychotics; 11) sleep aid medications; and 12) thyroid medications. Our main analysis focused on prescription medications by type of use: medications used chronically (antidiabetics, antiepileptics, antihypertensives, psychotherapeutics, and thyroid medications) and medications used episodically (antiemetics, anti-infectives, migraine medications, opioid analgesics, and sleep aid medications). Asthma and fertility-related medications were not included in either category given the variations in their recommended use. Eight women reported a prescription medication but did not recall the specific name (e.g., “insulin, not otherwise specified”). We included these reports within the relevant groups and analyses. All medication classifications were reviewed by a research pharmacist.

We described self-reported characteristics of the included women and compared them to the overall BD-STEPS sample. We calculated the prevalence of early pregnancy medication use by medication use group and therapeutic class. We analyzed concordance between the two data sources by calculating the percent total agreement and Cohen’s kappa (κ) statistic and 95% confidence intervals (CI). We used commonly defined categories to summarize agreement: 0–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1 as almost perfect (Landis & Koch, 1977). In addition, we calculated the validity of self-reported medication use using medical records as the reference standard. We calculated the sensitivity (probability medication was self-reported given it was in the medical record) and specificity (probability medication was not self-reported given it was not in the medical record), as well as the positive predictive value (PPV) and the negative predictive value (NPV). To assess differences in agreement based on maternal characteristics, we compared agreement metrics across levels of case/control status, age at delivery (<30 years, ≥30 years), educational attainment (high school or less, more than high school), time between delivery and interview (≤6 months, >6 months), and interview language (English, Spanish). We conducted four sub-analyses. First, we used logistic regression to calculate adjusted odds ratios (OR) and 95% CIs for the association between concordant reports (medication reports in both sources) and maternal and interview characteristics, including case/control status, time between delivery and interview, interview language, age, educational attainment, occupation status, health insurance during early pregnancy, birth outcome, parity, and singleton birth. Additionally, among women with concordant responses, we assessed agreement on specific reported medications. 
Third, we excluded women who reported medication use “as needed” or on a “varied schedule” to assess whether such use accounted for our findings. Lastly, we excluded women for whom we did not obtain a prenatal record and those whose prenatal record was minimal. We conducted all analyses in SAS (version 9.4; SAS Institute Inc., Cary, NC).
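The agreement statistics described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' SAS code: the class and function names are hypothetical, the counts are the "Both", "Self-reported only", and "Medical record only" cells reported in Table 2, and the category labels follow the Landis and Koch cut points listed in the Methods. Confidence intervals are omitted because the exact variance formula used in the analysis is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgreementTable:
    """2x2 cross-classification of one medication group in two sources.

    Hypothetical helper for illustration; field names are not from the study.
    """
    both: int              # use documented in self-report and medical record
    self_report_only: int  # use documented in self-report only
    record_only: int       # use documented in medical record only
    total: int             # all women, including those with no use in either source

    @property
    def neither(self) -> int:
        return self.total - self.both - self.self_report_only - self.record_only


def percent_agreement(t: AgreementTable) -> float:
    """Percent of women classified the same way by both sources."""
    return 100.0 * (t.both + t.neither) / t.total


def cohen_kappa(t: AgreementTable) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_observed = (t.both + t.neither) / t.total
    p_self = (t.both + t.self_report_only) / t.total
    p_record = (t.both + t.record_only) / t.total
    p_chance = p_self * p_record + (1 - p_self) * (1 - p_record)
    return (p_observed - p_chance) / (1 - p_chance)


def landis_koch(kappa: float) -> str:
    """Label kappa (rounded to two decimals) with the categories in the Methods."""
    k = round(kappa, 2)
    if k < 0:
        # Negative values fall outside the categories listed in the Methods.
        return "below chance agreement"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if k <= upper:
            return label
    return "almost perfect"


# Counts taken from Table 2 (n=184).
episodic = AgreementTable(both=29, self_report_only=17, record_only=28, total=184)
chronic = AgreementTable(both=24, self_report_only=10, record_only=3, total=184)

for name, table in (("episodic", episodic), ("chronic", chronic)):
    k = cohen_kappa(table)
    print(f"{name}: {percent_agreement(table):.1f}% agreement, "
          f"kappa={k:.2f} ({landis_koch(k)})")
```

With the Table 2 counts, this reproduces the reported percent agreement and kappa point estimates for the two medication use groups.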

RESULTS

Our analysis included 184 women from NY with 2018 EDDs, which represents 6% of BD-STEPS participants (184/3,232). Women in our sample were generally similar to the rest of the women in BD-STEPS with a few exceptions (data not shown). Women in our sample more frequently reported being born outside the United States, more frequently completed the interview in Spanish, and were more frequently interviewed within 6 months of delivery than women in the rest of BD-STEPS. Additionally, the women in our sample less frequently reported having more than a high school education compared to all BD-STEPS women. The mean age of women in our sample was 28.7 years for cases and 31.4 years for controls (Table 1). Over 50% of women reported more than a high school education, although this was more frequently reported among controls than cases. The vast majority reported having some form of health insurance during early pregnancy and working outside the home. Additionally, 56% of women (61% of cases, 50% of controls) reported a history of at least one chronic condition; the most common self-reported conditions were obesity (22%) and depression or anxiety (22%).

Table 1.

Characteristics of the NY validation population, Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS), 2018 (n=184).

NY 2018 Validation Subgroup
Characteristics Total n=184 Cases n=100 Controls n=84
mean (SD) mean (SD) mean (SD)
Age at delivery, years 30.0 (6.0) 28.7 (5.7) 31.4 (6.0)
n (%) n (%) n (%)
Educational attainment
 Less than high school 24 (13.0) 15 (14.6) 9 (10.7)
 High school 50 (27.2) 36 (35.9) 14 (16.7)
 More than high school 98 (53.3) 41 (41.8) 57 (67.9)
 Missing 12 (6.5) 8 (7.8) 4 (4.8)
Nativity
 US-born 102 (55.4) 56 (56.0) 46 (54.8)
 Foreign-born 70 (38.0) 36 (36.0) 34 (40.5)
 Missing 12 (6.5) 8 (8.0) 4 (4.7)
Occupation outside the home
 No 43 (23.4) 25 (25.0) 18 (21.4)
 Yes 129 (70.1) 67 (67.0) 62 (73.8)
 Missing 12 (6.5) 8 (8.0) 4 (4.8)
Health insurance in early pregnancy
 No 23 (12.5) 15 (15.0) 8 (9.5)
 Yes 149 (81.0) 77 (77.0) 72 (85.7)
 Missing 12 (6.5) 8 (8.0) 4 (4.8)
Parity
 None 60 (32.6) 42 (42.0) 18 (21.4)
 One 72 (39.1) 32 (32.0) 40 (47.6)
 Two or more 48 (26.1) 25 (25.0) 23 (27.4)
 Missing 4 (2.2) 1 (1.0) 3 (3.6)
Singleton gestation
 No 8 (4.4) 6 (6.0) 2 (2.4)
 Yes 176 (95.7) 94 (94.0) 82 (97.6)
Birth outcome
 Live birth 177 (96.2) 93 (93.0) 84 (100)
 Non-livebirth 7 (3.8) 7 (7.0) --
History of a chronic condition 103 (56.0) 61 (61.0) 42 (50.0)
 Obesity 41 (22.3) 25 (25.0) 16 (19.1)
 Depression/Anxiety 40 (21.7) 25 (25.0) 15 (17.9)
 Asthma 29 (15.8) 17 (17.0) 12 (14.3)
 Autoimmune disease 15 (8.2) 7 (7.0) 8 (9.5)
 Thyroid disease 15 (8.2) 9 (9.0) 6 (7.1)
 Attention deficit hyperactivity disorder 6 (3.3) 4 (4.0) 2 (2.4)
 Bipolar disorder 6 (3.3) 4 (4.0) 2 (2.4)
 Chronic hypertension 5 (2.7) 3 (3.0) 2 (2.4)
 Pre-existing diabetes 3 (1.6) 2 (2.0) 1 (1.2)
 Heart problems since birth 3 (1.6) 3 (3.0) 0 (0)
Interview time
 6 months or less 74 (40.2) 55 (55.0) 19 (22.6)
 7–12 months 102 (55.4) 41 (41.0) 61 (72.6)
 13–18 months 8 (4.4) 4 (4.0) 4 (4.8)
Interview language
 English 132 (71.7) 71 (71.0) 61 (72.6)
 Spanish 52 (28.3) 29 (29.0) 23 (27.4)
Prenatal record available
 No § 8 (4.3) 5 (5.0) 3 (3.6)
 Yes, but incomplete § 5 (2.7) 2 (2.0) 3 (3.6)
 Yes 171 (93.0) 93 (93.0) 78 (92.9)

History of a chronic condition included the conditions listed as well as cancer and epilepsy. The counts by case/control status for cancer and epilepsy were suppressed due to small sizes.

Obesity defined as a pre-pregnancy body mass index (weight in kg divided by height in meters squared) of 30 or more.

§ For these women, we relied on the delivery record or records from later in pregnancy to obtain information on early pregnancy medication use.

In both sources, medications used episodically were more prevalent than medications used chronically (Table 2). For medications used episodically, the prevalence from medical records (31.0%) was higher than from self-report (25.0%). Anti-infectives were the most prevalent therapeutic class in both sources and had the largest difference in prevalence between the sources (medical records=19.6%, self-report=13.0%). Medications used episodically had only fair agreement between the two sources [κ=0.40 (0.25–0.54)], with the poorest agreement for opioid analgesics [κ=0.20 (0.00–0.57)] and anti-infectives [κ=0.33 (0.15–0.50)]. Considering medical records as the reference standard, medications used episodically had low sensitivity (50.9%; Table 3). The sensitivity for therapeutic groups used episodically ranged from 25.0% for opioid analgesics to 100% for sleep aid medications, and was 50% or lower for antiemetics, anti-infectives, and opioid analgesics. Medications used episodically had relatively high specificity (86.6%), with class-level specificity ranging from 92.6% for anti-infectives to 100% for sleep aid medications.

Table 2.

Agreement between self-reported medication use and prenatal medical record, Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS), 2018 (n=184).

| Medication group or class | Both (n) | Self-reported only (n) | Medical record only (n) | Self-reported prevalence, n (%) | Medical record prevalence, n (%) | % agreement | Kappa coefficient (95% CI) | Interpretation |
|---|---|---|---|---|---|---|---|---|
| Medication Use Groups | | | | | | | | |
| Used episodically | 29 | 17 | 28 | 46 (25.0) | 57 (31.0) | 75.5 | 0.40 (0.25, 0.54) | Fair |
| Used chronically | 24 | 10 | 3 | 34 (18.5) | 27 (14.7) | 92.9 | 0.75 (0.61, 0.88) | Substantial |
| Therapeutic Class | | | | | | | | |
| Antidiabetics | 5 | 3 | 3 | 8 (4.3) | 8 (4.3) | 96.7 | 0.61 (0.32, 0.90) | Substantial |
| Antiemetics | 11 | 2 | 11 | 13 (7.1) | 22 (12.0) | 92.9 | 0.59 (0.39, 0.79) | Moderate |
| Antiepileptics | 2 | 1 | 1 | 3 (1.6) | 3 (1.6) | 98.9 | 0.66 (0.22, 1.00) | Substantial |
| Antihypertensives | 3 | 0 | 0 | 3 (1.6) | 3 (1.6) | 100.0 | 1.00 (1.00, 1.00) | Perfect |
| Anti-infectives | 13 | 11 | 23 | 24 (13.0) | 36 (19.6) | 81.5 | 0.33 (0.15, 0.50) | Fair |
| Asthma medications | 2 | 4 | 5 | 6 (3.3) | 7 (3.8) | 95.1 | 0.28 (0.00, 0.62) | Fair |
| Fertility-related medications | 9 | 8 | 9 | 17 (9.3) | 18 (9.8) | 90.8 | 0.46 (0.25, 0.68) | Moderate |
| Migraine medications | 2 | 4 | 1 | 6 (3.3) | 3 (1.6) | 97.3 | 0.43 (0.02, 0.84) | Moderate |
| Opioid analgesics | 1 | 4 | 3 | 5 (2.7) | 4 (2.2) | 96.2 | 0.20 (0.00, 0.57) | Slight |
| Psychotherapeutics | 9 | 3 | 0 | 12 (6.5) | 9 (4.9) | 98.4 | 0.85 (0.68, 1.00) | Almost perfect |
| Sleep aid medications | 2 | 0 | 0 | 2 (1.1) | 2 (1.1) | 100.0 | 1.00 (1.00, 1.00) | Perfect |
| Thyroid medications | 8 | 1 | 1 | 9 (4.9) | 9 (4.9) | 98.9 | 0.88 (0.72, 1.00) | Almost perfect |

Abbreviations. MR= medical record; CI=confidence interval.

Agreement interpretation categories outlined by Landis and Koch.

Episodic use includes antiemetics, anti-infectives, migraine medications, opioid analgesics, sleep aid medications, and other medications not in a therapeutic group that are used for a short duration (for example, prednisone for an allergic reaction). Chronic use includes antidiabetics, antiepileptics, antihypertensives, psychotherapeutics, thyroid medications, and other medications not in a therapeutic group that are used chronically (for example, simvastatin).

Table 3.

Validity of self-report using medical records as reference standard, Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS), 2018 (n=184)

| Medication group or class | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV(a), % (95% CI) | NPV(b), % (95% CI) |
|---|---|---|---|---|
| Medication Use Groups(c) | | | | |
| Used episodically | 50.9 (38, 64) | 86.6 (81, 93) | 63.0 (49, 77) | 79.7 (73, 86) |
| Used chronically | 88.9 (77, 100) | 93.6 (90, 97) | 70.6 (55, 86) | 98.0 (96, 100) |
| Therapeutic Class | | | | |
| Antidiabetics | 62.5 (29, 96) | 98.3 (96, 100) | 62.5 (29, 96) | 98.3 (96, 100) |
| Antiemetics | 50.0 (29, 71) | 98.8 (97, 100) | 84.6 (65, 100) | 93.6 (90, 97) |
| Antiepileptics | 66.7 (13, 100) | 99.5 (98, 100) | 66.7 (13, 100) | 99.5 (98, 100) |
| Antihypertensives | 100.0 (--) | 100.0 (--) | 100.0 (--) | 100.0 (--) |
| Anti-infectives | 36.1 (20, 52) | 92.6 (88, 97) | 54.2 (34, 74) | 85.6 (80, 91) |
| Asthma medications | 28.6 (0, 62) | 97.7 (96, 100) | 33.3 (0, 71) | 97.2 (95, 100) |
| Fertility-related medications | 50.0 (27, 73) | 95.2 (92, 98) | 52.9 (29, 77) | 94.6 (92, 98) |
| Migraine medications | 66.7 (13, 100) | 97.8 (96, 100) | 33.3 (0, 72) | 99.4 (98, 100) |
| Opioid analgesics | 25.0 (0, 67) | 97.8 (96, 100) | 20.0 (0, 55) | 98.3 (96, 100) |
| Psychotherapeutics | 100.0 (--) | 98.3 (96, 100) | 75.0 (51, 100) | 100.0 (--) |
| Sleep aid medications | 100.0 (--) | 100.0 (--) | 100.0 (--) | 100.0 (--) |
| Thyroid medications | 88.9 (68, 100) | 99.4 (98, 100) | 88.9 (68, 100) | 99.4 (98, 100) |

CI=confidence interval; PPV=positive predictive value; NPV=negative predictive value

a. Equivalent to the sensitivity of medical records using self-report as the reference standard.

b. Equivalent to the specificity of medical records using self-report as the reference standard.

c. Episodic use includes antiemetics, anti-infectives, migraine medications, opioid analgesics, sleep aid medications, and other medications not in a therapeutic group that are used for a short duration (for example, prednisone for an allergic reaction). Chronic use includes antidiabetics, antiepileptics, antihypertensives, psychotherapeutics, thyroid medications, and other medications not in a therapeutic group that are used chronically (for example, simvastatin).
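As a worked illustration of the validity measures in Table 3, the sketch below recomputes sensitivity, specificity, PPV, and NPV for medications used episodically from the Table 2 counts, and checks the equivalence stated in footnotes a and b by swapping which source is treated as the reference. The function and type names are hypothetical. The Wald (normal approximation) interval is an assumption: it matches the rounded Table 3 intervals for this row, but the manuscript does not state which interval method was used.

```python
import math
from typing import Dict, NamedTuple


class Estimate(NamedTuple):
    estimate: float
    lower: float
    upper: float


def wald_proportion(successes: int, trials: int, z: float = 1.96) -> Estimate:
    """Proportion with a Wald 95% CI, truncated to [0, 1] (assumed method)."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return Estimate(p, max(0.0, p - half_width), min(1.0, p + half_width))


def validity_measures(both: int, index_only: int, reference_only: int,
                      total: int) -> Dict[str, Estimate]:
    """Validity of an index source against a reference source.

    index_only: use documented only in the source being evaluated
    reference_only: use documented only in the reference standard
    """
    neither = total - both - index_only - reference_only
    return {
        "sensitivity": wald_proportion(both, both + reference_only),
        "specificity": wald_proportion(neither, neither + index_only),
        "ppv": wald_proportion(both, both + index_only),
        "npv": wald_proportion(neither, neither + reference_only),
    }


# Medications used episodically (Table 2): self-report evaluated against records.
self_report_vs_record = validity_measures(
    both=29, index_only=17, reference_only=28, total=184)

# Roles swapped: medical records evaluated against self-report.
record_vs_self_report = validity_measures(
    both=29, index_only=28, reference_only=17, total=184)

for name, est in self_report_vs_record.items():
    print(f"{name}: {100 * est.estimate:.1f} "
          f"({100 * est.lower:.0f}, {100 * est.upper:.0f})")
```

Swapping the roles of the two sources leaves the 2x2 table unchanged and only relabels its margins, which is why the PPV of self-report equals the sensitivity of the medical record and the NPV equals its specificity.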

For medications used chronically, the prevalence from self-reports (18.5%) was higher than the prevalence from medical records (14.7%; Table 2). Medications used chronically showed substantial agreement between the two sources [κ=0.75 (0.61–0.88)], and antihypertensives, psychotherapeutics, and thyroid medications had almost perfect or perfect agreement. Considering medical records as the reference standard, medications used chronically had high sensitivity (88.9%), ranging from 62.5% for antidiabetics to 100% for antihypertensives and psychotherapeutics (Table 3). Medications used chronically had high specificity (93.6%), and specificity was greater than 98.0% for all therapeutic groups used chronically.

Agreement between the two sources was largely similar across the characteristics investigated for medications used episodically and those used chronically (Table 4). Potential differences were observed by interview language for medications used episodically, where women interviewed in Spanish had lower agreement [κ=0.13 (0–0.40)] than women interviewed in English [κ=0.49 (0.33–0.65)]. Yet, in our multivariable analysis, no examined characteristics were predictive of concordant reports for medications used episodically or chronically (Table 5). The numbers within these analyses were small, particularly for medications used chronically, thus estimates are unstable. We explored agreement on specific medication products among those who agreed at the therapeutic class level (Table 6). Antiepileptics, asthma medications, psychotherapeutics, and thyroid medications had the highest levels of agreement between both sources on specific medications, while sleep aid medications, migraine medications, and anti-infectives had poor agreement on specific medications. After excluding eight women who self-reported using a prescription medication “as needed,” our results were unchanged (Supplemental Table 1). Lastly, after excluding 13 women for whom we lacked a complete prenatal record, our results were unchanged (Supplemental Table 2).

Table 4.

Variation in prevalence and agreement by maternal characteristics, Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS), 2018 (n=184).

Characteristic Medications used episodically Medications used chronically
Self-reported Prev n (%) Medical record Prev n (%) % agree K (CI) Sens (CI) Spec (CI) Self-reported Prev n (%) Medical record Prev n (%) % agree K (CI) Sens (CI) Spec (CI)
Case/control status
 Case 29 (29.0) 34 (34.0) 73.0 0.38 (0.18, 0.57) 52.9 (36, 70) 83.3 (74, 92) 21 (21.0) 19 (19.0) 92.0 0.75 (0.59, 0.91) 84.2 (68, 100) 93.8 (89, 99)
 Control 17 (20.2) 23 (27.4) 78.6 0.41 (0.19, 0.64) 47.8 (27, 68) 90.2 (83, 98) 13 (15.5) 8 (9.5) 94.0 0.73 (0.51, 0.95) 100.0 -- 93.4 (88, 99)
Age at delivery
 < 30 years 21 (24.1) 25 (28.7) 74.7 0.35 (0.13, 0.57) 48.0 (28, 68) 85.5 (77, 94) 13 (14.9) 11 (12.6) 95.4 0.81 (0.62, 0.99) 90.9 (74, 100) 96.1 (92, 100)
 ≥ 30 years 25 (25.8) 32 (33.0) 76.3 0.43 (0.24, 0.63) 53.1 (36, 70) 87.7 (90, 96) 21 (21.6) 16 (16.5) 90.7 0.70 (0.52, 0.88) 87.5 (71, 100) 91.4 (85, 97)
Educational attainment
 High school or less 15 (20.3) 21 (28.4) 75.7 0.35 (0.11, 0.58) 54.6 (38, 72) 84.6 (76, 93) 9 (12.2) 7 (9.5) 94.6 0.72 (0.46, 0.98) 87.5 (71, 100) 91.5 (85, 98)
 More than high school 28 (28.6) 33 (33.7) 74.5 0.41 (0.21, 0.60) 42.9 (22, 64) 88.7 (80, 97) 21 (21.5) 16 (16.3) 90.8 0.70 (0.52, 0.88) 85.7 (60, 100) 95.5 (91, 100)
Time from delivery to interview
 ≤ 6 months 20 (27.0) 28 (37.8) 70.3 0.33 (0.11, 0.55) 46.4 (28, 65) 84.8 (74, 95) 14 (18.9) 15 (20.3) 93.2 0.79 (0.61, 0.97) 80.0 (60, 100) 96.6 (92, 100)
 > 6 months 26 (23.6) 29 (26.4) 79.1 0.44 (0.25, 0.64) 55.2 (37, 73) 87.7 (80, 95) 20 (18.2) 12 (10.9) 92.7 0.71 (0.53, 0.90) 100.0 -- 91.8 (86, 97)
Interview Language
 English 37 (28.0) 41 (31.1) 78.8 0.49 (0.33, 0.65) 61.0 (46, 76) 86.8 (80, 94) 28 (21.2) 22 (16.7) 92.4 0.75 (0.61, 0.90) 90.9 (79, 100) 92.7 (88, 98)
 Spanish 9 (17.3) 16 (30.8) 67.3 0.13 (0.00, 0.40) 25.0 (4, 46) 86.1 (75, 97) 6 (11.5) 5 (9.6) 94.2 0.70 (0.37, 1.00) 80.0 (45, 100) 95.7 (90, 100)

Prev=prevalence; K=kappa coefficient; CI=95% confidence interval, Sens=sensitivity, Spec=specificity

Sensitivity and specificity were calculated using medical records as reference standard.

Table 5.

Characteristics associated with concordance between self-reports and medical records, Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS), 2018 (n=184).

Medications used episodically Medications used chronically
Concordant n=139 Discordant n=45 Univariate analysis Multivariable analysis Concordant n=171 Discordant n=13 Univariate analysis Multivariable analysis
n n OR (95%CI) OR (95%CI) n n OR (95%CI) OR (95%CI)
Case/control status
 Case 73 27 ref ref 92 8 ref ref
 Control 66 18 1.35 (0.70, 2.70) 1.47 (0.66, 3.33) 79 5 1.33 (0.44, 4.35) 2.56 (0.66, 11.11)
Interview time
 6 months or less 52 22 ref ref 69 5 ref ref
 More than 6 months 87 23 1.60 (0.81, 3.13) 1.32 (0.62, 2.80) 102 8 0.95 (0.29, 2.97) 0.69 (0.18, 2.47)
Interview language
 English 95 26 ref ref 111 10 ref ref
 Spanish 34 17 0.55 (0.27, 1.13) 0.59 (0.20, 1.68) 48 3 1.31 (0.40, 5.35) 0.29 (0.02, 3.19)
Nativity
 US-born 84 21 ref ref 92 10 ref ref
 Foreign-born 48 22 0.57 (0.28, 1.13) 0.55 (0.20, 1.60) 67 3 2.19 (0.68, 8.94) 2.70 (0.30, 48.90)
Educational attainment
 High school or less 56 18 1.06 (0.53, 2.14) 1.10 (0.49, 2.73) 70 4 1.66 (0.54, 5.89) 1.74 (0.41, 9.16)
 More than high school 73 25 ref ref 89 9 ref ref
Maternal Occupational Status
 No 33 10 ref ref 41 2 ref
 Yes 96 33 0.90 (0.39, 1.96) 0.81 (0.33, 1.89) 118 11 0.62 (0.12, 2.22) --
Health insurance in early pregnancy
 No 19 4 ref ref 21 2 ref
 Yes 110 39 0.65 (0.19, 1.78) 0.31 (0.08, 1.04) 138 11 1.40 (0.26, 5.20) --
Parity
 None 42 18 ref ref 57 3 ref ref
 One or more 93 27 1.48 (0.73, 2.95) 1.54 (0.68, 3.48) 112 8 0.81 (0.19, 2.73) 1.02 (0.23, 3.88)
mean (SD) mean (SD) mean (SD) mean (SD)
Maternal age at delivery 29.9 (6.0) 30.1 (6.1) 1.00 (0.94, 1.05) 0.97 (0.91, 1.04) 29.8 (6.0) 32.5 (5.6) 0.93 (0.84, 1.02) 0.89 (0.77, 1.01)

SD= standard deviation

The multivariable models included variables for case/control status, time between delivery and interview, interview language, age, educational attainment, occupation status, health insurance during early pregnancy, and parity. Singleton birth and birth outcome were not included in this analysis due to small numbers. Additionally, health insurance in early pregnancy and maternal occupational status were not included in the multivariable model for medications used chronically.

Table 6.

Agreement of medication details among those who reported use in both sources (n=50).

| Therapeutic Class | Women who reported in both | Number of specific medications in both sources | Number of specific medications not in both sources | % agreement on specific medications |
|---|---|---|---|---|
| Antidiabetics | 5 | 9 | 5 | 64.3% |
| Antiemetics | 11 | 13 | 4 | 76.5% |
| Antiepileptics | 2 | 2 | 0 | 100.0% |
| Antihypertensives | 3 | 3 | 2 | 60.0% |
| Anti-infectives | 13 | 12 | 15 | 44.0% |
| Asthma medications | 2 | 2 | 0 | 100.0% |
| Fertility-related medications | 9 | 10 | 8 | 55.6% |
| Migraine medications | 2 | 1 | 2 | 33.3% |
| Opioid analgesics | 1 | 1 | 1 | 50.0% |
| Psychotherapeutics | 9 | 15 | 2 | 88.2% |
| Sleep aid medications | 2 | 0 | 4 | 0.0% |
| Thyroid medications | 8 | 8 | 0 | 100.0% |

The total number of specific medications in both sources and the number of medications not in both sources may not equal the number of women who reported both medications as women could report more than one medication within each therapeutic class.

CONCLUSIONS

We found agreement between self-reports and medical records varied greatly by type of prescription medication. Medications used chronically, and the specific therapeutic classes that constituted that group (antidiabetics, antiepileptics, antihypertensives, psychotherapeutics, and thyroid medications), tended to have substantial or almost perfect agreement. This is in line with results from other studies that have also observed higher validity and agreement among medications used chronically, particularly for antihypertensives, thyroid medications and antidepressants (Cheung et al., 2017; Olesen et al., 2001; Pisa et al., 2015; Sarangarm et al., 2012; Skurtveit, 2014; Sundermann et al., 2017; van Gelder et al., 2013). Medications used episodically, however, had much lower agreement both as an overall group and within specific therapeutic classes (antiemetics, anti-infectives, migraine medications, opioid analgesics, and sleep aid medications).

We observed that medications used episodically had much lower sensitivity, but relatively high specificity, with medical records as the reference standard. Agreement was poorest for opioid analgesics and anti-infectives, which has been reported elsewhere (Nielsen et al., 2008; Olesen et al., 2001; Sarangarm et al., 2012). When looking at agreement by specific medication, we found similar patterns where agreement on the specific medication name was generally higher for medications taken chronically (notably, antiepileptics and thyroid medications had perfect agreement). Agreement on specific medication names of medications taken episodically was lower, although the agreement on medication names among antiemetics was relatively high (76.5%).

Medications taken episodically are likely taken for short periods or on an “as needed” basis and are thus subject to poorer recall than medications taken more consistently. We excluded women with self-reported “as needed” use and our results were unchanged. Our findings may be explained by non-adherence with prescribed treatments, use of medications prescribed previously, or medication sharing, particularly for acute conditions. Medications taken episodically may be more likely to be given during in-patient hospital stays, and administration while in the hospital may impact the ability to know and/or recall medication use. We identified 10 women for whom the medical record indicated that medications were administered during an in-patient stay in early pregnancy. These 10 women had documentation of receiving 14 medications; none of the medications were self-reported. Of these 14 medications, 13 were medications used episodically (7 anti-infectives, 4 antiemetics, 2 opioid analgesics) and one was an asthma medication. This may explain the difference observed in anti-infective prevalence between the sources.

For most characteristics investigated, including case-control status, we did not find that agreement differed across strata for medications used chronically or episodically, with one potential exception. There were potential differences in our stratified analysis by interview language, for which we observed lower agreement for medications used episodically among women interviewed in Spanish compared to women interviewed in English. Yet language was not associated with concordant reports of episodic medication use in our multivariable analysis when adjusting for other maternal characteristics. The small number of women interviewed in Spanish (n=52), few of whom had episodic medication use documented in either source, limited our precision.

Our results differ from previous findings in a few ways. Others have reported substantial agreement for asthma medications (Nielsen et al., 2008; Olesen et al., 2001; Palmsten et al., 2018; Sarangarm et al., 2012), but poor recall for sleep aids (Skurtveit et al., 2014). In our study, agreement for asthma medications was poor (κ=0.28) and worse than all other therapeutic classes except opioid analgesics. Yet in our sub-analysis of specific medications among women whose use of an asthma medication was reported in both sources, the two sources had perfect agreement for specific asthma medications. While we found high agreement for sleep aid medications at the therapeutic class level, there was no agreement when we looked by specific medication. Reports of asthma medication and sleep aid medication use were infrequent in our sample.

Women included in our sample were generally similar to other women who participated in BD-STEPS with a few exceptions: women in our sample more frequently reported being born outside the United States, being interviewed in Spanish, and completing the interview within 6 months of delivery, but less frequently reported having more than a high school education. Yet we did not find any differences in agreement by any of these factors in our multivariable analysis.

A strength of this study was that the interview asks about a range of medications and is structured so that medication use is linked to health conditions. Women are asked about specific medications and medication groups, likely improving recall (West et al., 2012). We were able to explore agreement for a wide range of medications and restrict to those available only by prescription. BD-STEPS collected information on other variables that may impact agreement, allowing us to evaluate agreement across levels of key variables and to examine predictors of agreement.

We presented sensitivity, specificity, PPV, and NPV with medical records as the reference standard. We did this so these estimates could be applied to future quantitative bias analyses of exposure misclassification (Lash et al., 2009). Given that both medication sources have limitations, there is likely no true reference standard. The PPV and NPV are equivalent to the sensitivity and specificity of the medical records using self-report as the reference standard.
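The equivalence noted above can be checked with a short sketch. It computes the four measures from a 2×2 table and then swaps which source is treated as the reference. The counts are hypothetical and chosen only for illustration; they are not taken from the validation sample, and the class and method names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of a 'test' source against a 'reference' source.

    a: test+ / reference+    b: test+ / reference-
    c: test- / reference+    d: test- / reference-
    """
    a: int
    b: int
    c: int
    d: int

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def ppv(self) -> float:
        return self.a / (self.a + self.b)

    def npv(self) -> float:
        return self.d / (self.c + self.d)

    def swap_reference(self) -> "TwoByTwo":
        # Treating the other source as the reference transposes the table,
        # so the two discordant cells (b and c) trade places.
        return TwoByTwo(self.a, self.c, self.b, self.d)


# Hypothetical counts (NOT study data): self-report is the test source,
# the medical record is the reference.
self_report_vs_record = TwoByTwo(a=20, b=5, c=15, d=160)
record_vs_self_report = self_report_vs_record.swap_reference()

print("PPV of self-report:        ", self_report_vs_record.ppv())
print("Sensitivity of the record: ", record_vs_self_report.sensitivity())
print("NPV of self-report:        ", self_report_vs_record.npv())
print("Specificity of the record: ", record_vs_self_report.specificity())
```

Each pair of printed values matches because both are computed from the same cells of the table; only the label of the reference source changes.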

Our study had several limitations. First, our validation sample included data from NY participants over one year; medication information was validated for only 184 women. Since these women are largely similar to women from the rest of BD-STEPS, we expect our findings to apply to the rest of BD-STEPS; however, precision was low due to the small sample size. While the main analysis focused on medications taken episodically and chronically, the number of women who used prescription medications in either source was relatively small. Thus, our estimates are unstable for the smallest therapeutic classes and even for some of the main analyses. Second, we used medical records to verify accuracy of maternal self-reports. Medical records may not reflect actual medication use for several reasons, including non-adherence and medication sharing. Additionally, medical records may not capture actual use of medications, especially for medications used intermittently. Our study cannot determine which data source better reflects true medication exposure. The medical records we used varied in terms of quality and detail. Some were deficient in their recording of medication use or in their completeness, or were fragmented when care was sought from multiple sources. To combat this, we obtained medical records from multiple obstetric providers, used paper and electronic records, and had all records abstracted by one investigator for consistency (abstracts were reviewed by a second investigator to resolve questions). We did not obtain medical records from specialists or urgent care clinics that may have been the primary prescriber, which may have impacted the completeness of the information. For eight women, we could not obtain a prenatal record and relied on medication use documented in their delivery record. For another five women, the prenatal record either started after pregnancy month three or was very minimal, and early pregnancy medication use was insufficiently captured. For these women, we relied on early pregnancy medication use documented in later pregnancy and delivery records. Excluding them did not alter our results. Next, while we measured agreement based on exact medication in Table 6, the majority of our analyses explored agreement within therapeutic classes. Lastly, we did not explore agreement for medications available over-the-counter, given the lack of information within medical records.

Our study indicates that agreement of prescription data between self-report and medical records varies by whether the medication is used chronically or episodically. While self-reported medication information for medications used chronically was found to have substantial to near-perfect agreement and high sensitivity and specificity, agreement and sensitivity for medications used episodically were lower. Even for these medications, however, the specificity was quite high. For rare exposures like these, the strength of the bias from exposure misclassification is driven by the specificity. Therefore, we would not expect a strong bias given the relatively high specificity observed in this study, despite low sensitivity (De Smedt et al., 2018; Lash et al., 2009). While ideally we would measure exposure without error, future analyses exploring medications used episodically would benefit from conducting a quantitative bias analysis to estimate the effects of exposure misclassification. Using the estimates from our validation sample could help quantify the impacts of the potential exposure misclassification bias on the estimates of the effects of prescription medication use in early pregnancy on birth defects and other pregnancy outcomes.
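The following sketch illustrates why specificity matters more than sensitivity for a rare exposure, and how a simple bias analysis back-calculates expected true counts. It is a sketch under stated assumptions, not the analysis used in this study: misclassification is assumed non-differential (the same sensitivity and specificity for cases and controls), all counts and classification values are hypothetical, and the function names are illustrative.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio from a case-control 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def misclassify(exposed, unexposed, se, sp):
    """Expected observed counts, given true counts and sensitivity/specificity."""
    obs_exposed = se * exposed + (1 - sp) * unexposed
    obs_unexposed = (1 - se) * exposed + sp * unexposed
    return obs_exposed, obs_unexposed


def correct(obs_exposed, obs_unexposed, se, sp):
    """Back-calculate expected true counts from observed counts.

    Requires se + sp > 1; otherwise the classification is no better than chance.
    """
    total = obs_exposed + obs_unexposed
    exposed = (obs_exposed - (1 - sp) * total) / (se + sp - 1)
    return exposed, total - exposed


# Hypothetical true counts (NOT study data): exposure is rare (about 2% of
# controls) and the true odds ratio is 2.0.
CASES = (40, 960)       # (exposed, unexposed)
CONTROLS = (200, 9600)  # (exposed, unexposed)


def observed_or(se, sp):
    """Odds ratio expected after non-differential exposure misclassification."""
    return odds_ratio(*misclassify(*CASES, se, sp), *misclassify(*CONTROLS, se, sp))


true_or = odds_ratio(*CASES, *CONTROLS)
or_low_sensitivity = observed_or(se=0.50, sp=1.00)  # half of true users missed
or_low_specificity = observed_or(se=1.00, sp=0.95)  # 5% of non-users misclassified

print(f"true OR:                     {true_or:.2f}")
print(f"observed OR, Se=0.50 Sp=1.00: {or_low_sensitivity:.2f}")
print(f"observed OR, Se=1.00 Sp=0.95: {or_low_specificity:.2f}")

# Round trip: correcting the misclassified case counts recovers the true counts.
recovered_cases = correct(*misclassify(*CASES, 0.50, 0.95), 0.50, 0.95)
```

In this hypothetical setting, missing half of the truly exposed women moves the odds ratio only slightly toward the null, whereas a small loss of specificity attenuates it substantially, because false positives drawn from the large unexposed group swamp the small number of truly exposed women.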

Supplementary Material

1

ACKNOWLEDGEMENTS

We thank the participating families, scientists, and staff from the BD-STEPS sites. This project was supported through Centers for Disease Control and Prevention (CDC) cooperative agreements under PA #96043, PA #02081, FOA #DD09-001, FOA #DD13-003, and NOFO #DD18-001 to the Centers for Birth Defects Research and Prevention participating in the National Birth Defects Prevention Study (NBDPS) and/or the Birth Defects Study To Evaluate Pregnancy exposureS (BD-STEPS). We also thank Jada Scott for replicating the analysis.

Footnotes

CONFLICT OF INTEREST STATEMENT

The authors do not have any conflicts of interest to disclose.

REFERENCES

  1. Cheung K, El Marroun H, Elfrink ME, Jaddoe VWV, Visser LE, & Stricker BHC (2017). The concordance between self-reported medication use and pharmacy records in pregnant women. Pharmacoepidemiology and Drug Safety, 26, 1119–1125. 10.1002/pds.4264
  2. De Smedt T, Merrall E, Macina D, Perez-Vilar S, Andrews N, & Bollaerts K (2018). Bias due to differential and non-differential disease- and exposure misclassification in studies of vaccine effectiveness. PloS One, 13, e0199180. 10.1371/journal.pone.0199180
  3. Evandt J, Skurtveit S, Oftedal B, Krog NH, Nafstad P, Skovlund E, … Aasvang GM (2019). Agreement between self-reported and registry-based use of sleep medications and tranquilizers. Pharmacoepidemiology and Drug Safety, 28, 1336–1343. 10.1002/pds.4854
  4. Landis JR, & Koch GG (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
  5. Lash TL, Fox MP, & Fink AK (2009). Applying Quantitative Bias Analysis to Epidemiologic Data. Springer-Verlag; New York. 10.1007/978-0-387-87959-8
  6. Mitchell AA, Cottler LB, & Shapiro S (1986). Effect of questionnaire design on recall of drug exposure in pregnancy. American Journal of Epidemiology, 123, 670–676. 10.1093/oxfordjournals.aje.a114286
  7. Mitchell AA, Gilboa SM, Werler MM, Kelley KE, Louik C, & Hernández-Díaz S (2011). Medication use during pregnancy, with particular focus on prescription drugs: 1976–2008. American Journal of Obstetrics and Gynecology, 205, 51.e51–58. 10.1016/j.ajog.2011.02.029
  8. Nielsen MW, Søndergaard B, Kjøller M, & Hansen EH (2008). Agreement between self-reported data on medicine use and prescription records vary according to method of analysis and therapeutic group. Journal of Clinical Epidemiology, 61, 919–924. 10.1016/j.jclinepi.2007.10.021
  9. Olesen C, Søndergaard C, Thrane N, Nielsen GL, de Jong-van den Berg L, & Olsen J (2001). Do pregnant women report use of dispensed medications? Epidemiology, 12, 497–501. 10.1097/00001648-200109000-00006
  10. Palmsten K, Hulugalle A, Bandoli G, Kuo GM, Ansari S, Xu R, & Chambers CD (2018). Agreement Between Maternal Report and Medical Records During Pregnancy: Medications for Rheumatoid Arthritis and Asthma. Paediatric and Perinatal Epidemiology, 32, 68–77. 10.1111/ppe.12415
  11. Pisa FE, Casetta A, Clagnan E, Michelesio E, Vecchi Brumatti L, & Barbone F (2015). Medication use during pregnancy, gestational age and date of delivery: agreement between maternal self-reports and health database information in a cohort. BMC Pregnancy and Childbirth, 15, 310. 10.1186/s12884-015-0745-3
  12. Sarangarm P, Young B, Rayburn W, Jaiswal P, Dodd M, Phelan S, & Bakhireva L (2012). Agreement between self-report and prescription data in medical records for pregnant women. Birth Defects Research. Part A: Clinical and Molecular Teratology, 94, 153–161. 10.1002/bdra.22888
  13. Skurtveit S, Selmer R, Odsbu I, & Handal M (2014). Self-reported data on medicine use in the Norwegian Mother and Child cohort study compared to data from the Norwegian Prescription Database. Norsk Epidemiologi, 24(1–2). 10.5324/nje.v24i1-2.1824
  14. Stephansson O, Granath F, Svensson T, Haglund B, Ekbom A, & Kieler H (2011). Drug use during pregnancy in Sweden - assessed by the Prescribed Drug Register and the Medical Birth Register. Clinical Epidemiology, 3, 43–50. 10.2147/clep.s16305
  15. Sundermann AC, Hartmann KE, Jones SH, Torstenson ES, & Velez Edwards DR (2017). Validation of maternal recall of early pregnancy medication exposure using prospective diary data. Annals of Epidemiology, 27, 135–139.e132. 10.1016/j.annepidem.2016.11.015
  16. Tinker SC, Broussard CS, Frey MT, & Gilboa SM (2015). Prevalence of prescription medication use among non-pregnant women of childbearing age and pregnant women in the United States: NHANES, 1999–2006. Maternal and Child Health Journal, 19, 1097–1106. 10.1007/s10995-014-1611-z
  17. Tinker SC, Carmichael SL, Anderka M, Browne ML, Caspers Conway KM, Meyer RE, … Reefhuis J (2015). Next steps for birth defects research and prevention: The birth defects study to evaluate pregnancy exposures (BD-STEPS). Birth Defects Research. Part A: Clinical and Molecular Teratology, 103, 733–740. 10.1002/bdra.23373
  18. van Gelder MM, van Rooij IA, de Walle HE, Roeleveld N, & Bakker MK (2013). Maternal recall of prescription medication use during pregnancy using a paper-based questionnaire: a validation study in the Netherlands. Drug Safety, 36, 43–54. 10.1007/s40264-012-0004-8
  19. West SL, Ritchey ME, & Poole C (2012). Validity of Pharmacoepidemiologic Drug and Diagnosis Data. In Strom BL, Kimmel SE, Hennessy S (Eds.), Pharmacoepidemiology, Fifth Edition (pp. 757–794). Wiley-Blackwell. 10.1002/9781119959946.ch41
