Abstract
Effective prevention of cardiac malformations, a leading cause of infant morbidity, is constrained by limited understanding of etiology. The study objective was to screen for associations between maternal and paternal characteristics and cardiac malformations. We selected 720,381 pregnancies linked to live-born infants (n=9,076 cardiac malformations) in 2011–2021 MarketScan US insurance claims data. Odds ratios were estimated with clinical diagnostic and medication codes using logistic regression. Screening of 2,000 associations selected 81 associated codes at the 5% false discovery rate. Grouping of selected codes, using latent semantic analysis and the Apriori-SD algorithm, identified elevated risk with known risk factors, including maternal diabetes and chronic hypertension. Less recognized potential signals included maternal fingolimod or azathioprine use. Signals identified might be explained by confounding, measurement error, and selection bias and warrant further investigation. The screening methods employed identified known risk factors, suggesting potential utility for identifying novel risk factors for other pregnancy outcomes.
Introduction
Cardiac malformations are one of the most common congenital malformations and a major cause of infant mortality and lifelong morbidity.1,2,3 Effective prevention, which could reduce this burden, is hampered by limited understanding of the etiology of these malformations outside of known genetic causes.3
Several non-genetic potential causes of cardiac malformations have been identified, including maternal clinical conditions and medications, such as diabetes and oral retinoids.4 Previous studies have primarily, though not exclusively,5 focused on single exposures or have been limited by the data collection instruments to a restricted set of pre-defined risk factors.6,7 Occasionally, clinical observation of unusual clusters might provide clues (e.g., Ebstein’s anomaly after lithium exposure); however, it can also trigger false alarms.8–10 Furthermore, while there has been some suggestion that paternal exposures preceding conception may also lead to congenital malformations, there is little evidence to date, including for cardiac malformations.11,12 Although cardiac malformations are the most common malformations, their estimated prevalence is only ~1% among live births and their etiology is multifactorial, so it may be helpful to complement passive clinical surveillance and ad hoc studies of specific exposures with screening of large healthcare utilization databases.
The aim of this study was to identify and characterize associations of both maternal and paternal clinical conditions and medications with cardiac malformations among infants using statistical and machine learning methods in a large cohort of pregnancies linked to live-born infants in insurance claims data, Merative MarketScan Commercial Claims and Encounters (CCAE), from the United States (US).
Results
We selected a cohort of 720,381 pregnancies linked to live births occurring among 647,711 mothers for the analysis of maternal associations. Multiple gestation occurred in 24,631 (3.4%) of pregnancies. For the analysis of paternal associations, we linked 507,442 (70.4%) of the 720,381 pregnancies to one of 457,096 fathers. The median age at pregnancy for mothers was 31 years (interquartile range [IQR] 28–34) and for fathers 33 years (IQR 30–37).
Among the 720,381 pregnancies, there were 9,076 pregnancies in which one or more infants had a cardiac malformation (1.3%). The most common cardiac malformations (see Supplementary Table 1) were ventricular septal defects (n = 5,349, 0.7%), left-sided defects (n = 1,316, 0.2%), and atrial septal defects (n = 940, 0.1%).
Screening for associations
Screening of associations with derived covariates selected, at the 5% FDR threshold, 49 associations with maternal diagnoses, 5 associations with paternal diagnoses, 27 associations with maternal medications, and no association with paternal medications (Fig. 1 and Box 1). Of the 81 selected associations, 1 variable had reduced odds and 80 increased odds of cardiac malformations. The adjusted odds ratio was > 2 for 22 associations. At a 1% FDR 56 associations were selected and at 10% FDR more associations (n = 101) were selected.
Box 1: Manual grouping of codes identified at 5% FDR (maternal codes unless otherwise specified).
* The ICD-10 code (O99.411) “Diseases of the circulatory system complicating pregnancy, first trimester” maps in the Centers for Medicare & Medicaid Services General Equivalence Mappings to 61701 and 61703, which pertain to cerebrovascular disorders in the puerperium.
Characterization of associations
Manual categorization of selected codes highlighted groups of codes relating to maternal diabetes, maternal obesity, other maternal cardiometabolic conditions, cardiac conditions, fertility treatment and multiple gestation, prenatal diagnoses of malformations and malformations in the parent, and obstetric conditions (Box 1). Automated identification of groups using hierarchical clustering after latent semantic analysis similarly identified groups relating to diabetes, dyslipidemia, chronic hypertension, fertility treatment and prenatally diagnosed/parental malformations (Fig. 2 and Table 1). Latent semantic analysis highlighted links between codes that are not immediately apparent from inspection, such as the relation between doxycycline, methylprednisolone, diazepam, and leuprolide acetate, which are all used in assisted reproductive technology (ART). Elevated maternal age was associated with increased risk of cardiac malformations (Fig. 3).
Table 1.
Code terms | Number of code terms in group |
---|---|
Diabetes mellitus; Insulin - Human; Insulin Aspart; Insulin Degludec; Insulin Detemir; Insulin Glargine; Insulin Lispro | 7 |
Disorders of lipoid metabolism; Atorvastatin & Comb. | 2 |
Other retinal disorders; Chorioretinal inflammations, scars, and other disorders of choroid | 2 |
Diseases of other endocardial structures; Other diseases of endocardium; Cardiomyopathy; Bulbus cordis anomalies and anomalies of cardiac septal closure; Other congenital anomalies of heart | 5 |
Essential hypertension; Lisinopril & Comb. | 2 |
Known or suspected fetal abnormality affecting management of mother; Chromosomal anomalies | 2 |
Diffuse diseases of connective tissue; Hydroxychloroquine | 2 |
Leuprolide Acetate; Diazepam; Methylprednisolone; Doxycycline | 4 |
Chorionic Gonadotropin; Menotropins; Follitropins & Comb. | 3 |
Identification of high-risk subgroups
Application of the Apriori-SD algorithm selected subgroups at elevated risk pertaining to the maternal characteristics of pregestational diabetes, chronic hypertension, multiple gestation, and prenatally diagnosed known or suspected malformations (Table 2). Identified subgroups at elevated risk were defined by the conjunction of two or fewer codes, which may reflect limited power to identify smaller subgroups defined by three or more codes.
Table 2.
Rule | Prevalence (%) | Risk ratio (95% CI) |
---|---|---|
Supervision of high-risk pregnancy | 25.4 | 1.29 (1.24–1.35) |
Diabetes mellitus | 1.4 | 2.20 (1.96–2.47) |
Multiple gestation | 2.5 | 2.11 (1.93–2.32) |
Essential hypertension | 2.8 | 1.52 (1.38–1.69) |
Known or suspected fetal abnormality affecting management of mother; Other nonspecific abnormal findings | 0.3 | 2.82 (2.28–3.48) |
Other conditions or status of the mother complicating pregnancy, childbirth, or the puerperium; Age 35+ | 2.9 | 1.39 (1.25–1.55) |
Secondary analyses
In secondary analyses of specific cardiac malformations (Supplementary Tables 2–9), associations selected at the 5% FDR included those of diabetes and antidiabetic medications with increased atrial septal, conotruncal, left-sided, and ventricular septal defects. Other selected associations were between clonazepam and increased left-sided defects, between screening for viral or chlamydial infections and increased defects of the great cardiac veins, and between vitamin B deficiency and increased other cardiac malformations. No associations were identified with single ventricle defects, atrioventricular septal defects, patent ductus arteriosus (PDA), or persistent pulmonary hypertension of the new-born (PPHN). However, the numbers of these outcomes were small, limiting power to detect associations.
Sensitivity analyses
Exclusion of pregnancies with chromosomal abnormalities led to the identification of associations similar to those in the main analysis, though a few additional associations were selected, giving 90 selected associations at a 5% FDR. Associations with hypertension and hyperlipidemia attenuated after restricting to women without diabetes (Supplementary Table 10), though an association with essential hypertension remained (aOR 1.26, 95% CI 1.11–1.44). Associations with fertility treatments attenuated after restricting to singletons (Supplementary Table 10). Associations with many other medications attenuated after restricting to singleton pregnancies among women without diabetes, highlighting the use of these medications in fertility treatment and diabetes (e.g., methylprednisolone, cabergoline, diazepam, and doxycycline in assisted reproductive technology; aspirin prophylaxis to prevent preeclampsia in pregnancies with multiple gestation or diabetes; diabetes concomitant with hypertension treated with lisinopril). After further restricting to pregnancies without a malformation recorded in the parent prior to last menstrual period (LMP), associations with parental cardiac conditions attenuated (Supplementary Table 10). After restriction, strong associations with medications (aOR > 2) remained for azathioprine and fingolimod. Requiring two or more codes to define presence led to the additional selection of the following associations (Supplementary Table 10): other known or suspected fetal and placental problems affecting management of mother, number of fetuses, tramadol, disorders of sweat glands, varenicline, and vitamin D deficiency. Restricting the assessment window for clinical covariates to capture acute exposures (first trimester among mothers, 90 days prior to LMP for fathers) led to the selection of many of the same associations, with the addition of disorders of mineral metabolism and paternal gingival and periodontal diseases.
Fewer, but similar, associations to the main analysis were selected when using 3- and 4-digit ICD-9 and ICD-10 codes in analyses stratified by ICD code period (Supplementary Tables 11–14).
Discussion
In this study we applied statistical and machine learning methods to screen for associations between maternal and paternal characteristics and cardiac malformations. Screening identified both known risk factors, including pregestational diabetes, chronic hypertension, obesity, and parental malformations, and less well-recognized potential signals, including maternal use of fingolimod and azathioprine. Increased risk of cardiac malformations in the offspring of women with pregestational diabetes has long been recognized and has been found to increase with HbA1c levels.6,13–15 Patients with advanced or uncontrolled diabetes, such as women with type 2 diabetes requiring insulin or other second line therapies, have the highest risks.16 Suggested mechanisms include hyperglycemia-induced oxidative stress, altered signaling pathways, and epigenetic modifications.17 Although finding an association is not surprising, the prominent role of diabetes-related factors in the risk of cardiac malformations highlights the need for public health action to prevent incidence and progression of diabetes before pregnancy.
Chronic hypertension was associated with an increased risk of cardiac malformations even after restricting to pregnancies among mothers without recorded pregestational diabetes. A previous study using nationwide Medicaid data found an association between both treated and untreated chronic hypertension and cardiac malformations after adjustment for potential confounders (treated hypertension OR 1.6, 95% CI 1.4–1.9 and untreated hypertension OR 1.5, 95% CI 1.3–1.7).18 The mechanism of increased risk is not yet fully understood.18,19
Maternal obesity was found to be associated with cardiac malformations including after restricting pregnancies to mothers without recorded pregestational diabetes. Higher maternal BMIs have been consistently associated with increased malformations, including cardiac malformations.20,21 Potential mechanisms include elevated glycemic levels without diagnosed diabetes22, nutritional deficiencies23, and reduced prenatal detection among obese women and hence reduced termination of pregnancy due to fetal anomaly.24,25
The finding of increased risk with both paternal and maternal recording of cardiac malformations may relate to familial inheritance through genetic or shared environmental risk factors.26 Another explanation for the association may be the prenatal diagnosis of syndromic defects such as Down’s syndrome early in pregnancy coded in maternal claims records.27 Although prenatal screening of anomalies is typically conducted at 15 weeks and we considered diagnosis in the first 12 weeks, we might still be including early diagnoses if our LMP was slightly miscalculated in some instances. The association with paternal malformations suggests that the association is not exclusively due to prenatal diagnosis of the outcome.
While fertility treatments were associated with increased risk of cardiac malformations in our study, this may be an artefact of counting outcomes by presence per pregnancy. In MarketScan CCAE data it is difficult to distinguish individual infant outcomes within a multiple gestation pregnancy and as such we counted presence of malformations at the pregnancy-level. Counting at the pregnancy-level can lead to up to 2-times higher risk in twin pregnancies even if the individual fetal risk is not elevated in twins.28 After restricting to singleton gestation pregnancies, associations with fertility treatments were attenuated and close to the null.
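The arithmetic behind the "up to 2-times" figure can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes that each fetus carries the same risk and that outcomes within a pregnancy are independent, which is a simplification for twins. The function name and the 1.3% input are illustrative only.

```python
def pregnancy_level_risk(per_fetus_risk: float, n_fetuses: int) -> float:
    """Probability that at least one fetus is affected.

    Assumes identical and independent per-fetus risk (a simplification).
    """
    return 1.0 - (1.0 - per_fetus_risk) ** n_fetuses


# Illustrative per-fetus risk; not an estimate taken from the study.
singleton = pregnancy_level_risk(0.013, 1)
twin = pregnancy_level_risk(0.013, 2)
ratio = twin / singleton  # algebraically 2 - p, so just under 2 for rare outcomes

print(f"singleton: {singleton:.4f}, twin: {twin:.4f}, ratio: {ratio:.3f}")
```

Under these assumptions the ratio equals 2 minus the per-fetus risk, so for a rare outcome the pregnancy-level risk in twins is close to double even when the per-fetus risk is unchanged.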
The identification of known risk factors using these statistical and machine learning methods indicates the potential value of these methods for identifying novel associations. A challenge to the application of machine learning methods in insurance claims and electronic health record data is the sparsity of the data (i.e., a large fraction of values are zero), the high degree of correlation between variables, and the interpretability of results.29,30 Interpretable unsupervised machine learning methods, for example the dimensionality reduction technique of latent semantic analysis and the rule-based machine learning algorithm Apriori-SD, can handle sparse correlated data and aid in the interpretation of findings, such as in this study by highlighting the use in common of doxycycline, methylprednisolone, and diazepam in ART. There can be many explanations for identified associations, including latent characteristics of those with the code (e.g., diabetes among those with a code for insulin), therefore characterizing the relation between codes can aid in interpretation.
Less recognized signals identified included maternal use of fingolimod and azathioprine. These associations may be due to the characteristics of women prescribed these medications, but nevertheless highlight the need for further evaluation of the safety in pregnancy of the different medications for the indications for which these medications are prescribed (i.e., multiple sclerosis, autoimmune disease). Both fingolimod and azathioprine have been found in animal studies involving rats and rabbits to be teratogenic, including at doses equivalent to the human dose.31,32
An association was selected between clonazepam and left-sided defects. The association between benzodiazepines and congenital malformations has been widely investigated, but findings have been conflicting, and the association may have been subject to residual confounding and recall bias.33–35 In our study, the association with diazepam, which is used as a uterine relaxant in ART, attenuated after restricting to singletons. Nonetheless, the association between specific benzodiazepines and cardiac malformations, specifically left-sided defects, deserves further evaluation.
There was a relative absence of associations with paternal exposures. Few associations with paternal non-genetic exposures have been previously identified. A recent exception is a cohort study that reported an association between paternal metformin use and genital defects.11
The strengths of this study include the large cohort size, validated outcome (with PPV of 78% in a validation study36), examination of risk with subtypes of cardiac malformation, linkage to fathers, and careful control for multiple testing.
While the cohort is relatively large, associations with rare exposures may not be detected, though these will be arguably of less clinical significance given their infrequency. The study is limited to exposures that can be captured in insurance claims data, which includes both prescription medications and clinical diagnoses, but excludes genetic variables, lab measurements, dietary exposures, and other lifestyle factors. Known or suspected teratogens such as oral retinoids are unlikely to be dispensed in pregnancy and, therefore, unlikely to be detected. Furthermore, pregnancies exposed to known teratogens are more often terminated. The study was restricted to live births, introducing the possibility of selection bias, for example due to differential termination of pregnancy for fetal anomaly.37 Given that our aim was not to determine causality, but rather to identify associations for further investigation, the identified associations should not be interpreted as causal. Both the detection and the absence of specific signals can be explained by random and systematic errors such as confounding, information and selection biases.
In conclusion, through screening of associations aided by unsupervised machine learning methods, we identified a number of characteristics associated with cardiac malformations. Some associations were known, and some represent potential signals. The ability of the screening methods employed to detect both known and suspected risk factors for cardiac malformations suggests potential utility of these methods in identifying novel risk factors for other malformations and other adverse birth outcomes.
Methods
Data sources
Data were obtained for years 2011–2021 of Merative MarketScan CCAE, a US commercial insurance claims dataset. MarketScan CCAE is one of the largest US nationwide datasets of commercial health beneficiaries and contains deidentified inpatient and outpatient healthcare utilization data including diagnoses, procedures, and all outpatient pharmacy dispensed prescription medications.38
Study population
We have previously defined a cohort of 2.7 million pregnancies linked to infants in MarketScan CCAE. Algorithms developed to identify pregnancies, link to infants, assign a pregnancy outcome, and estimate the date of LMP have previously been described in detail.39–42
For the present study we included all pregnancies among mothers aged 12–55 years linked to live-born infants meeting the following eligibility requirements: First, in order to ascertain exposures and pregnancy outcomes, all mothers were required to be continuously enrolled from at least 180 days before LMP to end of pregnancy plus 30 days. Second, mothers were required to have full medication benefits from LMP minus 180 days to end of pregnancy in order to characterize medication exposures. Third, to ascertain cardiac malformations, linked infants were required to have enrolment to at least 90 days following the end of pregnancy except in cases of infant death.
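The three enrolment requirements can be expressed as date-window checks. The sketch below is illustrative only: it assumes each enrolment or benefit period is a single continuous span given as a (start, end) pair, and the function and argument names are hypothetical rather than taken from the study code.

```python
from datetime import date, timedelta


def covers(span_start, span_end, need_start, need_end):
    """True if one continuous span fully covers the required window."""
    return span_start <= need_start and span_end >= need_end


def is_eligible(lmp, preg_end, mother_enrol, mother_rx,
                infant_enrol_end, infant_died=False):
    """Apply the three eligibility windows described in the text."""
    lookback = lmp - timedelta(days=180)
    # 1) Mother enrolled from LMP - 180 days to end of pregnancy + 30 days.
    mother_ok = covers(*mother_enrol, lookback, preg_end + timedelta(days=30))
    # 2) Medication benefits from LMP - 180 days to end of pregnancy.
    rx_ok = covers(*mother_rx, lookback, preg_end)
    # 3) Infant enrolled for at least 90 days after end of pregnancy,
    #    except in cases of infant death.
    infant_ok = infant_died or infant_enrol_end >= preg_end + timedelta(days=90)
    return mother_ok and rx_ok and infant_ok


# Hypothetical pregnancy used only to exercise the checks.
lmp = date(2018, 1, 1)
preg_end = date(2018, 10, 8)
span = (date(2017, 1, 1), date(2018, 12, 31))
eligible = is_eligible(lmp, preg_end, span, span, date(2019, 3, 1))
short_follow_up = is_eligible(lmp, preg_end, span, span, date(2018, 11, 1))
print(eligible, short_follow_up)
```

In real claims data, enrolment is usually recorded as several monthly or periodic segments, so a gap check across segments would be needed before applying a single-span test like this one.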
To identify associations with paternal exposures we identified a cohort of fathers linked to both pregnant women and to live-born infants through family case number and year of delivery. Fathers were required to have continuous enrolment and medication benefits from the mother’s LMP minus 180 days to LMP in order to ascertain exposures during spermatogenesis.
Exposures
Indicator variables for maternal and paternal clinical conditions and medications were derived based on an adaptation of methodology employed in the high-dimensional propensity score algorithm.43
For maternal medication exposure, we created indicator variables for dispensation of one or more prescriptions, categorized by RED BOOK ™ therapeutic detail code, within the first trimester (LMP to LMP plus 90 days), a critical period during embryogenesis for the development of structural abnormalities of the heart.44 Therapeutic detail codes categorize medications at the level of the generic ingredient. For paternal exposure, indicator variables were created for medication dispensation, categorized by therapeutic detail code, during spermatogenesis (LMP minus 90 days to LMP).
For maternal clinical diagnoses, indicator variables were created based on International Statistical Classification of Diseases and Related Health Problems (ICD), 9th revision (ICD-9; before October 2015) and 10th revision (ICD-10; October 2015 onwards), codes. ICD-9 codes were categorized by three-digit ICD-9 section code, a categorization that has proven useful in the high-dimensional propensity score.43 ICD-10 codes were mapped to the respective ICD-9 three-digit section code using General Equivalence Mappings provided by the Centers for Medicare & Medicaid Services (see Supplement for details).45 Code occurrence was assessed between LMP minus 180 days and LMP plus 90 days in mothers and between LMP minus 180 days and LMP among fathers.
Separately among mothers and fathers, we selected the 500 most common medications dispensed and 500 most common ICD-9 section codes occurring within the exposure assessment windows.
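The derivation of indicator variables and the selection of the most common codes can be sketched as below. This is a simplified illustration under stated assumptions: claims are given as (pregnancy identifier, code, day relative to LMP) tuples, "most common" is taken to mean the number of pregnancies with at least one occurrence, and window bounds are inclusive. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_indicators(claims, window, k):
    """Return the k most prevalent codes and 0/1 indicators per pregnancy."""
    start, end = window
    codes_by_pregnancy = defaultdict(set)
    for pregnancy_id, code, day in claims:
        if start <= day <= end:  # keep only codes inside the assessment window
            codes_by_pregnancy[pregnancy_id].add(code)
    # Count pregnancies (not claims) carrying each code.
    prevalence = Counter(
        code for codes in codes_by_pregnancy.values() for code in codes
    )
    kept = [code for code, _ in prevalence.most_common(k)]
    indicators = {
        pid: {code: int(code in codes) for code in kept}
        for pid, codes in codes_by_pregnancy.items()
    }
    return kept, indicators


# Toy claims: days are relative to LMP (negative means before LMP).
claims = [
    ("p1", "250", -30), ("p1", "401", 10),
    ("p2", "250", 5), ("p2", "V22", 200),   # day 200 falls outside the window
    ("p3", "250", 60), ("p3", "401", -100), ("p3", "278", 0),
]
kept, indicators = build_indicators(claims, window=(-180, 90), k=2)
print(kept, indicators["p2"])
```

Pregnancies with no code inside the window do not appear in the returned dictionary here; in a full analysis they would be carried as rows of zeros.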
Outcomes
The primary outcome was cardiac malformation in the live-born infant defined by ICD-9 and ICD-10 codes using code lists and algorithms previously validated by chart review (see Supplement for details of algorithm).36,46 Secondary outcomes were the following specific cardiac defects: conotruncal, single ventricle, ventricular septal, atrial septal, atrioventricular septal, right-sided, left-sided, patent ductus arteriosus (PDA), persistent pulmonary hypertension of the new-born (PPHN), great cardiac veins, and other cardiac defect (see Supplement for code lists).2
Statistical and machine learning analyses
Screening for associations
To identify associations, logistic regression models were fitted separately for each code, in the maternal and paternal data separately, adjusting for maternal age using restricted cubic splines. We adjusted for maternal age in order to identify associations independent of this known risk factor.47 P-values were calculated using likelihood ratio tests.20 Associations were identified at 1%, 5%, and 10% false discovery rate (FDR) using the Benjamini-Hochberg procedure to account for multiple testing.48 We applied a bootstrap-based bias-correction method to correct adjusted odds ratios for the winner’s curse effect, in which effect estimates selected by screening are artificially elevated because the associations with the smallest p-values are the ones selected.49 Log-10 p-values were plotted using a variant of the Manhattan plot, in which the log-transformed p-value is multiplied by the sign of the log odds ratio to distinguish positive from negative associations.
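For readers unfamiliar with the selection step, the Benjamini-Hochberg step-up rule and the signed log p-value used for plotting can be sketched as follows. This is a generic textbook version, not the study code; the function names and example p-values are illustrative, and the use of the negative log10 p-value as the plotted magnitude is an assumption about the plot.

```python
import math


def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of hypotheses selected at the given FDR (step-up rule)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        # Find the largest rank whose p-value is below rank * fdr / m.
        if p_values[i] <= rank * fdr / m:
            largest_rank = rank
    # Everything ranked at or below that point is selected.
    return sorted(order[:largest_rank])


def signed_log10_p(p_value, log_odds_ratio):
    """-log10(p) carrying the sign of the log odds ratio."""
    return math.copysign(-math.log10(p_value), log_odds_ratio)


selected = benjamini_hochberg([0.01, 0.035, 0.036, 0.5], fdr=0.05)
print(selected)
print(signed_log10_p(0.001, -0.4))
```

In the toy example the second p-value misses its own threshold but is still selected, because the step-up rule keeps every hypothesis ranked below the largest passing rank.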
Characterization of associations
To characterize identified associations, in order to help generate hypotheses of underlying causes, we used latent semantic analysis.50 Latent semantic analysis is an unsupervised machine learning method from natural language processing, in which a lower dimensional representation of the data is derived assuming the observed high-dimensional data is a function of a smaller number of underlying latent factors.51–53 In this lower dimensional representation, the vector representation of terms that occur in similar contexts will be close in distance. For example, diabetes medications, diabetes diagnoses, and sequelae of diabetes, such as diabetic retinopathy, are all indicators of a latent state of diabetes.
To perform latent semantic analysis, truncated singular value decomposition (SVD) was performed on the n×d matrix for the n individuals and d dimensions for all mapped ICD-9 section codes and therapeutic detail codes.52 Per individual counts for each code were log transformed before SVD. After SVD, the 500 dimensions with the largest singular values were retained. To characterize the relationship between variables identified through screening, the similarity of the vector representation of these identified codes in lower dimensional space was calculated using cosine similarity.52 For visualization and grouping, agglomerative hierarchical clustering was used to cluster codes into groups of similar variables based on angular distance between vector representations.54 In addition to this data-driven grouping of codes, we conducted a separate manual categorization of codes into groups with clinical similarity (e.g., diabetes medications and diagnoses).
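The similarity step can be illustrated on small vectors. The sketch below assumes that code embeddings have already been obtained (for example from a truncated SVD) and defines angular distance as the arccosine of the cosine similarity divided by pi, which is one common convention and may differ from the exact definition used in the study. The three-dimensional vectors are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_distance(u, v):
    """Angle between vectors, scaled to [0, 1] (assumed convention)."""
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))  # guard rounding error
    return math.acos(cos) / math.pi


# Hypothetical low-dimensional embeddings of three codes.
insulin = [0.9, 0.1, 0.0]
diabetes = [0.8, 0.2, 0.1]
leuprolide = [0.0, 0.1, 0.9]

print(angular_distance(insulin, diabetes))
print(angular_distance(insulin, leuprolide))
```

Codes that tend to occur in the same individuals end up with a small angular distance, so agglomerative clustering on this distance groups them together.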
To characterize the association between maternal age and cardiac malformations we plotted the predicted values from a logistic regression model fitted using a restricted cubic spline for maternal age.
Identification of high-risk subgroups
To identify high-risk subgroups of pregnancies characterized by the conjunction of multiple conditions, the Apriori-SD (subgroup discovery) rule-based machine learning algorithm was applied.55 This algorithm has two components. First, the Apriori algorithm exhaustively generates all rules of the form {X, Z} => Y (e.g., {chest pain, faint, shortness of breath} => myocardial infarction) of length (i.e., number of variables in the left-hand side of the rule) less than a specified maximum, with a minimum support (i.e., proportion of the population with X, Y, and Z), and with a minimum confidence (i.e., proportion of those with X and Z who have Y).54,56 Second, those rules with the desired target in the right-hand side (i.e., cardiac malformation) are processed to identify distinctive subgroups in the population at high risk of the outcome.
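The definitions of support and confidence given above can be made concrete with a short sketch. It evaluates a single rule rather than running the full Apriori search, and the function name, item labels, and toy records are illustrative only.

```python
def rule_support_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent.

    transactions: list of sets of items, one set per individual.
    """
    antecedent = set(antecedent)
    n_total = len(transactions)
    # Individuals with every item on the left-hand side.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    # Individuals with the left-hand side and the outcome.
    n_both = sum(
        1 for t in transactions if antecedent <= t and consequent in t
    )
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy records; "malformation" stands for the outcome on the right-hand side.
records = [
    {"diabetes", "hypertension", "malformation"},
    {"diabetes", "hypertension"},
    {"diabetes", "malformation"},
    {"hypertension"},
    {"diabetes", "hypertension", "malformation"},
    set(),
    {"malformation"},
    {"diabetes"},
]
support, confidence = rule_support_confidence(
    records, {"diabetes", "hypertension"}, "malformation"
)
print(support, confidence)
```

Here two of eight records have both conditions and the outcome (support 0.25), and two of the three records with both conditions have the outcome (confidence of about 0.67).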
The Apriori-SD algorithm was applied to the binary maternal data as defined in the Exposures section for all mapped ICD-9 section codes and therapeutic detail codes. Random down-sampling of non-cases was applied to ensure class balance (i.e., 50% with the outcome and 50% without) and improve algorithm performance.52 To generate rules with support in the data, the minimum support was set at 0.1%, minimum confidence at 55%, and the maximum length of the left-hand side of the rule at 10 variables. Age was categorized for this analysis into < 35 years and ≥ 35 years. The Westfall-Young permutation testing procedure was applied to select rules while controlling the family-wise error rate at 5%.29 Prevalence of subgroups and risk ratios between subgroup membership and the outcome were calculated in the entire dataset without undersampling.
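The risk ratio for a selected subgroup in the full dataset can be computed from a two-by-two table. The sketch below uses a Wald interval on the log scale, which is an assumption, since the text does not state how the confidence intervals were obtained; the counts are invented for illustration.

```python
import math


def risk_ratio(cases_in, n_in, cases_out, n_out, z=1.96):
    """Risk ratio (subgroup vs. rest) with a log-scale Wald interval."""
    rr = (cases_in / n_in) / (cases_out / n_out)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(
        1 / cases_in - 1 / n_in + 1 / cases_out - 1 / n_out
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 30 cases among 1,000 subgroup members,
# 120 cases among 9,000 pregnancies outside the subgroup.
rr, lower, upper = risk_ratio(30, 1000, 120, 9000)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Computing these quantities in the full data rather than in the down-sampled data matters because class balancing changes the outcome prevalence and would distort both the subgroup prevalence and the risks.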
Secondary and sensitivity analyses
As a secondary analysis, screening of associations was conducted separately for the secondary outcomes.
Sensitivity analyses for screening of associations were: 1) restriction to pregnancies without chromosomal abnormalities, given these have predefined definite causes; 2) restriction to singletons, since pregnancies with more than one fetus would by definition have a higher probability of having at least one fetus with a diagnosis; 3) restriction to singleton pregnancies among mothers without diabetes to ascertain associations independent of diabetes (a known risk factor for cardiac malformations) or multiple gestation; 4) restriction to singleton pregnancies among mothers without diabetes and parents without a congenital malformation recorded in the 6 months prior to LMP to ascertain associations independent of maternal diabetes, multiple gestation, or parental malformation; 5) conducting analyses separately in data pre-October 2015 using ICD-9 section codes and from October 2015 onwards using 3-digit ICD-10 categories to assess the sensitivity of results to mapping of ICD-10 codes to ICD-9 sections; 6) conducting analyses separately in data pre-October 2015 using 4-digit ICD-9 codes and from October 2015 onwards using 4-digit ICD-10 codes; 7) definition of indicator variables by two or more occurrences within the assessment window to increase specificity of exposure classification; and 8) narrowing the assessment window for clinical covariates, to LMP to LMP plus 90 days among mothers and to LMP minus 90 days to LMP among fathers, to capture acute clinical exposures (e.g., infections) with more specificity.
Acknowledgements:
JPB was supported by NIH grant R01 HD097778. LS is supported by a Training Grant from the National Institute of Child Health and Human Development (T32HD040128).
Footnotes
Competing interests: KFH is an investigator on grants to her institution from UCB and Takeda, unrelated to this work. SHD reports being an investigator on research grants to her institution from Takeda and consulting for Moderna, UCB and Jansen; all unrelated to the present study. All other authors report no conflicts of interest.
Ethics: This study was deemed exempt from review by the Harvard T.H. Chan School of Public Health Institutional Review Board.
Contributor Information
Jeremy Brown, Harvard T.H. Chan School of Public Health.
Krista Huybrechts, Brigham and Women’s Hospital.
Loreen Straub, Brigham and Women’s Hospital.
Dominik Heider, Heinrich-Heine-University of Düsseldorf.
Brian Bateman, Brigham and Women’s Hospital.
Sonia Hernandez-Diaz, Harvard University.
References
- 1. Ely DM, Driscoll AK. Infant Mortality in the United States, 2020: Data From the Period Linked Birth/Infant Death File. National Vital Statistics Reports. 2022;71(5):1–18.
- 2. Reller MD, Strickland MJ, Riehle-Colarusso T, Mahle WT, Correa A. Prevalence of Congenital Heart Defects in Metropolitan Atlanta, 1998–2005. J Pediatrics. 2008;153(6):807–813. doi: 10.1016/j.jpeds.2008.05.059
- 3. Patel SS, Burns TL. Nongenetic Risk Factors and Congenital Heart Defects. Pediatr Cardiol. 2013;34(7):1535–1555. doi: 10.1007/s00246-013-0775-4
- 4. Jenkins KJ, Correa A, Feinstein JA, et al. Noninherited Risk Factors and Congenital Cardiovascular Defects: Current Knowledge. Circulation. 2007;115(23):2995–3014. doi: 10.1161/circulationaha.106.183216
- 5. Liu S, Joseph KS, Lisonkova S, et al. Association Between Maternal Chronic Conditions and Congenital Heart Defects. Circulation. 2013;128(6):583–589. doi: 10.1161/circulationaha.112.001054
- 6. Mitchell SC, Sellmann AH, Westphal MC, Park J. Etiologic correlates in a study of congenital heart disease in 56,109 births. Am J Cardiol. 1971;28(6):653–657. doi: 10.1016/0002-9149(71)90053-1
- 7. Kučienė R, Dulskienė V. Selected environmental risk factors and congenital heart defects. Medicina. 2008;44(11):827. doi: 10.3390/medicina44110104
- 8. Weinstein MR, Goldfield M. Lithium carbonate treatment during pregnancy; report of a case. Dis Nerv Syst. 1969;30(12):828–832.
- 9. Patorno E, Huybrechts KF, Bateman BT, et al. Lithium Use in Pregnancy and the Risk of Cardiac Malformations. New Engl J Med. 2017;376(23):2245–2254. doi: 10.1056/nejmoa1612222
- 10.Mitchell AA, Rosenberg L, Shapiro S, Slone D. Birth Defects Related to Bendectin Use in Pregnancy: I. Oral Clefts and Cardiac Defects. Jama. 1981;245(22):2311–2314. doi: 10.1001/jama.1981.03310470025020 [DOI] [PubMed] [Google Scholar]
- 11.Wensink MJ, Lu Y, Tian L, et al. Preconception Antidiabetic Drugs in Men and Birth Defects in Offspring: A Nationwide Cohort Study. Ann Intern Med. 2022;175(5):665–673. doi: 10.7326/m21-4389
- 12.Braun JM, Messerlian C, Hauser R. Fathers Matter: Why It’s Time to Consider the Impact of Paternal Environmental Exposures on Children’s Health. Curr Epidemiology Reports. 2017;4(1):46–55. doi: 10.1007/s40471-017-0098-8
- 13.Arendt LH, Pedersen LH, Pedersen L, et al. Glycemic Control in Pregnancies Complicated by Pre-Existing Diabetes Mellitus and Congenital Malformations: A Danish Population-Based Study. Clin Epidemiology. 2021;13:615–626. doi: 10.2147/clep.s298748
- 14.Greene MF, Hare JW, Cloherty JP, Benacerraf BR, Soeldner JS. First-trimester hemoglobin A1 and risk for major malformation and spontaneous abortion in diabetic pregnancy. Teratology. 1989;39(3):225–231. doi: 10.1002/tera.1420390303
- 15.Ludvigsson JF, Neovius M, Söderling J, et al. Periconception glycaemic control in women with type 1 diabetes and risk of major birth defects: population based cohort study in Sweden. BMJ. 2018;362:k2638. doi: 10.1136/bmj.k2638
- 16.Cesta CE, Rotem R, Bateman BT, et al. Safety of GLP-1 Receptor Agonists and Other Second-Line Antidiabetics in Early Pregnancy. JAMA Intern Med. 2024;184(2). doi: 10.1001/jamainternmed.2023.6663
- 17.Basu M, Garg V. Maternal hyperglycemia and fetal cardiac development: Clinical impact and underlying mechanisms. Birth Defects Res. 2018;110(20):1504–1516. doi: 10.1002/bdr2.1435
- 18.Bateman BT, Huybrechts KF, Fischer MA, et al. Chronic hypertension in pregnancy and the risk of congenital malformations: a cohort study. Am J Obstet Gynecol. 2015;212(3):337.e1–337.e14. doi: 10.1016/j.ajog.2014.09.031
- 19.Caton AR, Bell EM, Druschel CM, et al. Antihypertensive Medication Use During Pregnancy and the Risk of Cardiovascular Malformations. Hypertension. 2009;54(1):63–70. doi: 10.1161/hypertensionaha.109.129098
- 20.Stothard KJ, Tennant PWG, Bell R, Rankin J. Maternal Overweight and Obesity and the Risk of Congenital Anomalies: A Systematic Review and Meta-analysis. JAMA. 2009;301(6):636–650. doi: 10.1001/jama.2009.113
- 21.Persson M, Cnattingius S, Villamor E, et al. Risk of major congenital malformations in relation to maternal overweight and obesity severity: cohort study of 1.2 million singletons. BMJ. 2017;357:j2563. doi: 10.1136/bmj.j2563
- 22.Rosella LC, Lebenbaum M, Fitzpatrick T, Zuk A, Booth GL. Prevalence of Prediabetes and Undiagnosed Diabetes in Canada (2007–2011) According to Fasting Plasma Glucose and HbA1c Screening Criteria. Diabetes Care. 2015;38(7):1299–1305. doi: 10.2337/dc14-2474
- 23.Astrup A, Bügel S. Overfed but undernourished: recognizing nutritional inadequacies/deficiencies in patients with overweight or obesity. Int J Obes. 2019;43(2):219–232. doi: 10.1038/s41366-018-0143-9
- 24.Wolfe HM, Sokol RJ, Martier SM, Zador IE. Maternal obesity: a potential source of error in sonographic prenatal diagnosis. Obstet Gynecol. 1990;76(3 Pt 1):339–342.
- 25.Racusin D, Stevens B, Campbell G, Aagaard KM. Obesity and the Risk and Detection of Fetal Malformations. Semin Perinatol. 2012;36(3):213–221. doi: 10.1053/j.semperi.2012.05.001
- 26.Pierpont ME, Brueckner M, Chung WK, et al. Genetic Basis for Congenital Heart Disease: Revisited: A Scientific Statement From the American Heart Association. Circulation. 2018;138(21):1. doi: 10.1161/cir.0000000000000606
- 27.Roizen NJ, Patterson D. Down’s syndrome. Lancet. 2003;361(9365):1281–1289. doi: 10.1016/s0140-6736(03)12987-x
- 28.Brown JP, Yland JJ, Williams PL, Huybrechts KF, Hernández-Díaz S. Accounting for Twins and Other Multiple Births in Perinatal Studies Conducted Using Healthcare Administration Data. medRxiv. Published online 2024:2024.01.23.24301685. doi: 10.1101/2024.01.23.24301685
- 29.Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 2018;19(6):1236–1246. doi: 10.1093/bib/bbx044
- 30.Harpaz R, DuMouchel W, Shah NH, Madigan D, Ryan P, Friedman C. Novel Data-Mining Methodologies for Adverse Drug Event Discovery and Analysis. Clin Pharmacol Ther. 2012;91(6):1010–1021. doi: 10.1038/clpt.2012.50
- 31.Prometheus Laboratories Inc. US Product Information: Imuran. 2011.
- 32.Novartis. United States Prescribing Information: Gilenya. 2023. https://www.novartis.com/us-en/sites/novartis_us/files/gilenya.pdf
- 33.Noh Y, Lee H, Choi A, et al. First-trimester exposure to benzodiazepines and risk of congenital malformations in offspring: A population-based cohort study in South Korea. PLoS Med. 2022;19(3):e1003945. doi: 10.1371/journal.pmed.1003945
- 34.Bellantuono C, Tofani S, Sciascio GD, Santone G. Benzodiazepine exposure in pregnancy and risk of major malformations: a critical overview. Gen Hosp Psychiatry. 2013;35(1):3–8. doi: 10.1016/j.genhosppsych.2012.09.003
- 35.Pradat P. A case-control study of major congenital heart defects in Sweden — 1981–1986. Eur J Epidemiology. 1992;8(6):789–796. doi: 10.1007/bf00145321
- 36.Palmsten K, Huybrechts KF, Kowal MK, Mogun H, Hernández-Díaz S. Validity of maternal and infant outcomes within nationwide Medicaid data. Pharmacoepidem Drug Safe. 2014;23(6):646–655. doi: 10.1002/pds.3627
- 37.Hernán MA, Hernández-Díaz S, Robins JM. A Structural Approach to Selection Bias. Epidemiology. 2004;15(5):615–625. doi: 10.1097/01.ede.0000135174.63482.43
- 38.Butler AM, Nickel KB, Overman RA, Brookhart MA. Databases for Pharmacoepidemiological Research. Springer Ser Epidemiology Public Heal. Published online 2021:243–251. doi: 10.1007/978-3-030-51455-6_20
- 39.Zhu Y, Thai TN, Hernandez-Diaz S, et al. Development and Validation of Algorithms to Estimate Live Birth Gestational Age in Medicaid Analytic eXtract Data. Epidemiology. 2023;34(1):69–79. doi: 10.1097/ede.0000000000001559
- 40.Margulis AV, Setoguchi S, Mittleman MA, Glynn RJ, Dormuth CR, Hernández-Díaz S. Algorithms to estimate the beginning of pregnancy in administrative databases. Pharmacoepidem Drug Safe. 2013;22(1):16–24. doi: 10.1002/pds.3284
- 41.Palmsten K, Huybrechts KF, Mogun H, et al. Harnessing the Medicaid Analytic eXtract (MAX) to Evaluate Medications in Pregnancy: Design Considerations. PLoS ONE. 2013;8(6):e67405. doi: 10.1371/journal.pone.0067405
- 42.MacDonald SC, Cohen JM, Panchaud A, McElrath TF, Huybrechts KF, Hernández-Díaz S. Identifying pregnancies in insurance claims data: Methods and application to retinoid teratogenic surveillance. Pharmacoepidem Drug Safe. 2019;28(9):1211–1221. doi: 10.1002/pds.4794
- 43.Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional Propensity Score Adjustment in Studies of Treatment Effects Using Health Care Claims Data. Epidemiology. 2009;20(4):512–522. doi: 10.1097/ede.0b013e3181a663cc
- 44.Srivastava D. GENETIC ASSEMBLY OF THE HEART: Implications for Congenital Heart Disease. Annu Rev Physiol. 2001;63(1):451–469. doi: 10.1146/annurev.physiol.63.1.451
- 45.Butler R. ICD-10 General Equivalence Mappings. Bridging the translation gap from ICD-9. J Ahima. 2007;78(9):84–85.
- 46.Cooper WO, Hernandez-Diaz S, Gideon P, et al. Positive predictive value of computerized records for major congenital malformations. Pharmacoepidem Drug Safe. 2008;17(5):455–460. doi: 10.1002/pds.1534
- 47.Hollier LM, Leveno KJ, Kelly MA, McIntire DD, Cunningham FG. Maternal age and malformations in singleton births. Obstet Gynecol. 2000;96(5):701–706. doi: 10.1016/s0029-7844(00)01019-x
- 48.Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J Royal Statistical Soc Ser B Methodol. 1995;57(1):289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x
- 49.Faye LL, Sun L, Dimitromanolakis A, Bull SB. A flexible genome-wide bootstrap method that accounts for ranking and threshold-selection bias in GWAS interpretation and replication study design. Statist Med. 2011;30(15):1898–1912. doi: 10.1002/sim.4228
- 50.Altszyler E, Sigman M, Ribeiro S, Slezak DF. Comparative study of LSA vs Word2vec embeddings in small corpora: a case study in dreams database. arXiv. Published online 2016. doi: 10.48550/arxiv.1610.01520
- 51.Landauer TK, Foltz PW, Laham D. An introduction to latent semantic analysis. Discourse Process. 1998;25(2–3):259–284. doi: 10.1080/01638539809545028
- 52.Murphy KP. Probabilistic Machine Learning: An Introduction. MIT Press; 2022.
- 53.Landauer TK, McNamara DS, Dennis S, Kintsch W. Handbook of Latent Semantic Analysis. Psychology Press; 2013.
- 54.Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer; 2009.
- 55.Kavšek B, Lavrač N. APRIORI-SD: Adapting Association Rule Learning to Subgroup Discovery. Appl Artif Intell. 2006;20(7):543–583. doi: 10.1080/08839510600779688
- 56.Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules. doi: 10.5555/645920.672836