Author manuscript; available in PMC: 2015 Feb 27.
Published in final edited form as: Matern Child Health J. 2014 Nov;18(9):2167–2178. doi: 10.1007/s10995-014-1465-4

The MOSART Database: Linking the SART CORS Clinical Database to the Population-Based Massachusetts PELL Reproductive Public Health Data System

Milton Kotelchuck 1, Lan Hoang 2, Judy E Stern 3, Hafsatou Diop 4, Candice Belanoff 5, Eugene Declercq 6
PMCID: PMC4342616  NIHMSID: NIHMS660947  PMID: 24623195

Abstract

Although Assisted Reproductive Technology (ART) births make up 1.6 % of births in the US, the impact of ART on subsequent infant and maternal health is not well understood. Clinical ART treatment records linked to population data would be a powerful tool to study long term outcomes among those treated or not by ART. This paper describes the development of a database intended to accomplish this task. We constructed the Massachusetts Outcomes Study of Assisted Reproductive Technology (MOSART) database by linking the Society of Assisted Reproductive Technologies Clinical Outcomes Reporting System (SART CORS) and the Massachusetts (MA) Pregnancy to Early Life Longitudinal (PELL) data systems for children born to MA resident women at MA hospitals between July 2004 and December 2008. PELL data representing 282,971 individual women and their 334,152 deliveries and 342,035 total births were linked with 48,578 cycles of ART treatment in SART CORS delivered to MA residents or women receiving treatment in MA clinics, representing 18,439 eligible women of whom 9,326 had 10,138 deliveries in this time period. A deterministic five phase linkage algorithm methodology was employed. Linkage results, accuracy, and concordance analyses were examined. We linked 9,092 (89.7 %) SART CORS outcome records to PELL delivery records overall, including 95.0 % among known MA residents treated in MA clinics; 70.8 % with full exact matches. There were minimal differences between matched and unmatched delivery records, except for unknown residency and out-of-state ART site. There was very low concordance of reported use of ART treatment between SART CORS and PELL (birth certificate) data. A total of 3.4 % of MA children (11,729) were identified from ART assisted pregnancies (6,556 singletons; 5,173 multiples). 
The MOSART linked database provides a strong basis for further longitudinal ART outcomes studies and supports the continued development of potentially powerful linked clinical-public health databases.

Keywords: Assisted Reproductive Technologies, SART CORS, PELL, Data linkage, Adverse birth outcomes

Introduction

Assisted Reproductive Technology, or ART, is the overall term encompassing procedures ranging from ovulation induction and intrauterine insemination to in vitro fertilization (IVF), where both the egg and sperm are handled in a clinical laboratory to achieve fertilization. In this paper, we use the term ART to mean IVF exclusively. The use of ART/IVF has grown rapidly over the past three decades, with 163,038 ART cycles performed in 2011 resulting in 1.6 % of children born in the United States [1]. From the outset, clinicians and researchers have sought to understand whether adverse reproductive health outcomes are associated with these ART cycles, including preterm birth [2], low birth weight [3], multiple gestations [4–6], and congenital malformations [7–9]; and if so, whether they are the direct result of ART or a function of the underlying health and subfertility conditions that led to ART use.

ART research has been limited by small sample sizes, the lack of population-based databases with accurate ART attribution, the absence of appropriate comparison groups of mothers with non-ART treated infertility, the inability to track outcomes beyond birth, and limited statistical power to assess rare outcomes [10]. The Massachusetts Outcomes Study of Assisted Reproductive Technology (MOSART) Collaborative was developed to address these research gaps. The project is based on a collaboration of researchers from the Society for Assisted Reproductive Technology (SART), the Boston University School of Public Health, the Massachusetts Department of Public Health (MDPH), and the Centers for Disease Control and Prevention (CDC). The collaboration provided for the linking of two databases: the Society for Assisted Reproductive Technology Clinic Outcome Reporting System (SART CORS), which contains clinical records of >95 % of IVF Assisted Reproductive Technology treatment cycles in the U.S.; and the MA Pregnancy to Early Life Longitudinal (PELL) Data System, a state-wide, longitudinal population database which links birth certificates of children born in the state to their own and their mothers' hospital discharge records as well as an array of other administrative, public health and vital statistics datasets [11]. The goal of the MOSART Collaborative was to address the major shortcomings of prior research by developing a large multi-year statewide population based dataset that includes ART clinical data and measures of socio-demographic and health variables for mothers and their children—a model clinical-public health linked database.

Few US studies have linked ART outcomes data with vital statistics data. Sunderam et al. [12] probabilistically linked 1997–1998 treatment data from 11 clinics in MA and RI in the National ART Surveillance System (NASS) with birth certificates from the MA and RI vital statistics registries, using the dates of birth of the mother and infant as the only available linkage variables; they achieved an 80 % linkage rate. For a subsample of records augmented with name identifiers provided by some of the clinics, they obtained an 89.5 % linkage rate. NASS is the CDC national database into which SART CORS data are reported under the federal Fertility Clinic Success Rate and Certification Act of 1992 (Public Law 102-493). Building on that study, CDC expanded the NASS data linkage activities to include two additional states, Florida and Michigan, and established the current States Monitoring Assisted Reproductive Technology (SMART) Collaborative. Using a similar probabilistic methodology (now augmented by plurality status, gender, residential ZIP code and gravidity), the SMART Collaborative has achieved a linkage rate of 87.8 % for MA only data [13], and an overall 90.2 % during 1997–2006 [14]. These latter two studies' linkage efforts are limited by the absence of personal identifiers, such as mothers' names, the use of probabilistic rather than deterministic linkages, and the use of treatment cycle-based rather than women-based linkage methodologies.

Study Aims

This paper reports on (1) the construction of the linked MOSART database; (2) the methodological assessment of the linkage; and (3) the clinical, epidemiologic and public health practice importance of the linked MOSART database.

Methods

IRB Approved Data Sharing Procedures

IRB approval for the MOSART linkage was obtained from the MDPH, CDC, Boston University Medical Center, Dartmouth College and Massachusetts General Hospital. Collaborators signed a Memorandum of Understanding covering the confidentiality of the data and data use responsibilities.

We developed a secure, multi-step data sharing algorithm that began with the transfer of files containing limited Protected Health Information variables used for linkage, and ultimately permitted the construction of a full de-identified MOSART database, with the original source databases being unable to be linked to the MOSART database or vice versa. Prior to any data sharing activities all potential linkage variables were reformatted to common data element structures. A temporary linkage key is maintained at the MDPH, on a restricted server, to facilitate the linkage of additional years of data. All data sharing linkage and de-identification occurred on restricted-access servers at the MDPH.

Data Systems and Preparation

The SART CORS Database

The SART CORS Online database is a technically advanced, cycle-based, healthcare outcome monitoring system. The database contains comprehensive information on patient demographics, diagnosis, treatment, and pregnancy outcome collected in compliance with the Fertility Clinic Success Rate and Certification Act of 1992 [15] that requires reporting of every IVF cycle of ART to CDC. Data are collected into this system from all SART associated ART clinics representing more than 90 % of all clinics and >97 % of ART cycles (2009 data) reporting data to the CDC under this law. All seven MA ART clinics operating during the study period reported to SART CORS. Information in this database undergoes validation through methods developed jointly by SART and CDC. Data are also monitored for impossible values at data entry into the system. Delivery resulting from a specific ART treatment cycle is recorded in SART CORS as the presence of both an outcome date and a field indicating a live birth or fetal death. Early loss of less than 20 week gestation is indicated by the presence of an outcome date but no indication of a live birth or stillbirth. Information on birth outcomes includes a calculated gestational age and up to six child fields (for plural births) with information on whether the babies were born alive or dead, male or female, and their birth weight. Complicating the potential linkage, most SART CORS identifiers are collected at the cycle start, 9 months before delivery and thus, some identifiers, such as surnames (married, single), zip codes, and partners, could have changed by the time of delivery.

The MA PELL Data System

The MA Pregnancy to Early Life Longitudinal (PELL) data system was created through a public–private partnership among the Boston University School of Public Health, the MDPH and the CDC. The population-based PELL data system, which has been described elsewhere [16, 17], includes vital records (births, fetal deaths and deaths of mothers and infants/children), Birth Defects Registry and Cancer Registry information; hospital utilization data (inpatient, observational stay, emergency department visits); and Early Intervention and the Women Infants and Children program participant data. The PELL data system annually updates and identifies the number of unique women, deliveries, and children, as part of its core linkage procedure, and then longitudinally links their data over time. PELL currently contains information on over 1,025,000 MA births and fetal deaths (1998–2009), and 550,000 mothers, including over 240,000 women with more than one delivery. Over 99 % of births are linked to their hospital discharge delivery records annually. The PELL hospital discharge datasets are also linked longitudinally facilitating assessment of mothers' health prior to birth and in the years following birth for both mothers and children. PELL birth and hospital data can be used to create algorithms to identify MA women of reproductive age with fertility problems [23], but it contains limited information about specific clinical treatments for infertility.

Study Subjects

The initial set of subjects for this linkage was 342,035 infants born to 282,971 individual MA resident women with 334,152 deliveries in the PELL data system who were live born or fetal deaths in a Massachusetts hospital or birthing center between July 1, 2004 and December 31, 2008 (Fig. 1). SART CORS data were obtained for all ART cycles from January 2004 through December 2008 in which the patient was either a MA resident or in which treatment took place at a MA ART clinic. This included 3,094 women (30.5 %) with unknown residency treated at MA clinics, which was important since a large proportion of patients treated at several major MA clinics did not have state of residency reported. These initial selection criteria yielded 48,578 ART cycle records potentially eligible for linkage to our study population.

Fig. 1. PELL–SART CORS linkage overview (July 1, 2004–December 31, 2008)

Strategy for Linkage of SART CORS to PELL

Beyond cycle level data, SART CORS could be linked with PELL at three levels—women, deliveries (whether singleton or plural) or births (individual infants). As this current database was developed to focus on child outcomes, we chose to initiate our linkage at the level of deliveries rather than individual infants, since plural infants from the same pregnancy share the same exposure parameters from a cycle treatment record. Moreover, in the case of plural deliveries, infant characteristics are not recorded in SART CORS in a uniform manner relative to birth order making them sub-optimal for linkage purposes.

Determination of Deliveries in SART CORS Eligible for Linkage to MOSART

We began with the 48,578 ART cycles, attributed to 21,102 unique women based on the SART CORS Woman ID, which allows identification of women across clinics in the SART database [18–20], and independently verified by the MOSART database manager. We then eliminated ART cycles and women that could not link to PELL, including 5,339 cycle records belonging to 2,437 women treated at MA clinics who had established non-MA residencies and 590 cycles associated with 226 women who were gestational carriers. The latter were omitted because SART CORS records the intended mother's name and date of birth, not the carrier's, and a linkage to a PELL birth record would not have been possible. This resulted in a final SART CORS sample of 18,439 women with 42,649 ART treatment cycles in the 2004–2008 study period eligible for potential linkage to the PELL database.

The 18,439 ART treated women were further subdivided into three groups according to birth outcome information: 10,733 women with a delivery (live birth or fetal death greater than 20 weeks gestation) associated with one of their ART treatment cycles; 1,285 women with a clinical intrauterine gestation but subsequent early loss prior to 20 weeks gestation; and 6,421 women with no reported conception or delivery. The latter two groups of women were eliminated from this current linkage analysis, as were the 1,450 women whose deliveries occurred after December 31, 2008. The resulting 9,283 women had 10,086 deliveries during the study period.

The pregnancy and/or birth outcome information in a proportion of ART cycles reported to SART CORS, however, may be unknown, incomplete or an approximation, due to loss to follow-up or to inexact outcome information supplied by patient, social service or provider reports nine months or more after the ART cycle takes place (i.e., not based on formal medical records verification). We therefore additionally assessed all cycles of ART regardless of whether an outcome was listed in SART CORS. An additional 43 women with 52 deliveries (15 deliveries associated with an intrauterine gestation and early loss, and 37 deliveries with no conception recorded) were identified in the PELL/birth certificate database in the study time period and added back into the SART CORS database.

The final linkage database consisted of 10,138 deliveries eligible for matching from SART CORS with 334,152 deliveries eligible for matching from MA PELL.
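The cohort derivation above can be checked with simple arithmetic. The sketch below only restates counts quoted in the text; the variable names are illustrative and are not field names from SART CORS or PELL.

```python
# Counts quoted in the text; variable names are illustrative only.
cycles_initial = 48_578
women_initial = 21_102

# Exclusions: established non-MA residents and gestational carriers.
cycles_eligible = cycles_initial - 5_339 - 590
women_eligible = women_initial - 2_437 - 226

# The three outcome groups should partition the eligible women.
women_delivery, women_early_loss, women_no_conception = 10_733, 1_285, 6_421
assert women_delivery + women_early_loss + women_no_conception == women_eligible

# Drop women delivering after December 31, 2008, then add back the
# deliveries found in PELL that had no delivery outcome in SART CORS.
women_in_period = women_delivery - 1_450
women_final = women_in_period + 43
deliveries_final = 10_086 + 52

print(cycles_eligible, women_eligible, women_in_period, women_final, deliveries_final)
# prints: 42649 18439 9283 9326 10138
```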

Linkage Procedures

Steps for Linkage of SART CORS to PELL

The linkage efforts utilized a multistep deterministic methodology, which started with the most stringent matching criteria and then gradually and systematically relaxed the matching criteria until no more secure matches could be obtained. Linkage was performed with Link Pro, a publicly available, SAS-based program for data linkage [21]. The five primary linkage variables from PELL/SART CORS were baby's date of birth (BDOB), mother's date of birth (MDOB), mother's first name (MFN), mother's last name (MLN), and father/partner's last name (FLN). Secondary linkage variables that were potentially available to break multiple linkage ties or identify possibly overlooked linkages included multiple birth status, mother's middle name, father's first name, sex of baby, and baby's birth weight.

There were five broad phases of the linkage algorithm (involving 23 specific deterministic linkage steps). Phase 1 linked records with perfect full matches on all five matching variables: MDOB, BDOB, MFN, MLN, and FLN. Included in perfect matches were those with minor spelling variations across the two databases, which were evaluated using the SAS Soundex and Spellex [22], or the first three letters of a name. A missing father's last name was considered a legitimate value in this step.
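As a rough illustration of how "minor spelling variations" might be tolerated, the sketch below implements the common American Soundex code in Python together with a first-three-letters check. It is an assumption-laden stand-in: the study used the SAS Soundex and Spellex routines inside Link Pro, whose exact behavior may differ, and the `names_agree` rule is hypothetical.

```python
# Illustrative only: standard American Soundex, not the SAS routines used in the study.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character American Soundex code, or '' for an empty name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":            # H and W do not separate repeated codes
            continue
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit              # a vowel resets prev, so a repeat after it is kept
    return (code + "000")[:4]


def names_agree(a, b):
    """Hypothetical rule: exact, Soundex, or first-three-letter agreement."""
    a, b = a.strip().upper(), b.strip().upper()
    if a == b:
        return True               # two missing values also agree (as for FLN in phase 1)
    if not a or not b:
        return False
    return soundex(a) == soundex(b) or a[:3] == b[:3]


print(soundex("Smith"), soundex("Smyth"), names_agree("Katherine", "Kathryn"))
# prints: S530 S530 True
```

Treating two empty strings as agreeing mirrors the statement that a missing father's last name was considered a legitimate value in phase 1.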

The second phase found perfect matches on all primary linkage variables (MDOB, BDOB, MFN, MLN) except father's last name under the assumption that father's information in either database could be inaccurate, inconsistently missing or changed over the course of the pregnancy.

The third phase linked on name variation matches, but with matching mother and baby dates of birth (two-factor criteria matches). There were numerous name variations, such as marital last name changes or complex foreign spelling variations and inversions, which obscured otherwise good matches; these were discovered at this step.

The fourth phase explored variations in baby's date of birth, matching on MDOB and on MLN, MFN or FLN (e.g., using two-factor criteria matches). Inconsistencies in BDOB were observed, as well as month/day date transpositions. To explore BDOB variations, we examined all potential matches for births within ±61 days (to allow for typographical errors in the month and day fields) and month/day inversions. We also looked for other minor date errors and additional typographical errors.
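The two BDOB checks described for this phase (a ±61 day window and month/day inversions) can be sketched as a small predicate. This is a minimal illustration, not the Link Pro logic; the function names and the returned labels are hypothetical.

```python
from datetime import date


def swap_month_day(d):
    """Return d with month and day exchanged, or None if that is not a valid date."""
    try:
        return date(d.year, d.day, d.month)
    except ValueError:
        return None


def bdob_variation(sart_bdob, pell_bdob, window_days=61):
    """Classify how two baby dates of birth relate (labels are illustrative)."""
    if sart_bdob == pell_bdob:
        return "exact"
    if swap_month_day(sart_bdob) == pell_bdob:
        return "month_day_swap"
    if abs((sart_bdob - pell_bdob).days) <= window_days:
        return "within_window"
    return None


print(bdob_variation(date(2006, 1, 10), date(2006, 10, 1)))  # prints: month_day_swap
print(bdob_variation(date(2006, 5, 1), date(2006, 6, 15)))   # prints: within_window
```

In the study such a candidate pair would still have to agree on MDOB and a name variable before being accepted as a link.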

The fifth phase explored visually and by hand the remaining records with only one primary variable match. Some added linkages resulted from scrutinizing multiple last names and last-first name inversions. In addition, we explored all residual fetal death records, which often contain substantial incomplete information.
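The overall design, starting with the strictest key and relaxing it only for records that remain unlinked, can be sketched as follows. This is a simplified illustration of the first two phases under stated assumptions: records are plain dictionaries with hypothetical field names, names are already normalized, and a link is accepted only when exactly one unused PELL candidate shares the key. It does not reproduce the 23 specific steps or the manual review of phase 5.

```python
from collections import defaultdict

# Each phase is (label, key fields); later phases use weaker keys.
PHASES = [
    ("phase1", ("mdob", "bdob", "mfn", "mln", "fln")),
    ("phase2", ("mdob", "bdob", "mfn", "mln")),
]


def link(sart_records, pell_records, phases=PHASES):
    """Return {sart_id: (pell_id, phase)} using a strict-to-loose cascade."""
    links = {}
    used_pell = set()
    for phase_name, keys in phases:
        # Index the PELL deliveries that are still unlinked by the phase key.
        index = defaultdict(list)
        for p in pell_records:
            if p["id"] not in used_pell:
                index[tuple(p[k] for k in keys)].append(p["id"])
        for s in sart_records:
            if s["id"] in links:
                continue
            candidates = index.get(tuple(s[k] for k in keys), [])
            # Accept only unambiguous matches; ties are left for later review.
            if len(candidates) == 1 and candidates[0] not in used_pell:
                links[s["id"]] = (candidates[0], phase_name)
                used_pell.add(candidates[0])
    return links


# Made-up records for demonstration only.
sart = [
    {"id": "S1", "mdob": "1970-01-02", "bdob": "2005-03-04", "mfn": "ANNA", "mln": "LEE", "fln": "LEE"},
    {"id": "S2", "mdob": "1972-05-06", "bdob": "2006-07-08", "mfn": "MARY", "mln": "KIM", "fln": "KIM"},
    {"id": "S3", "mdob": "1975-09-10", "bdob": "2007-11-12", "mfn": "JANE", "mln": "ROE", "fln": ""},
]
pell = [
    {"id": "P1", "mdob": "1970-01-02", "bdob": "2005-03-04", "mfn": "ANNA", "mln": "LEE", "fln": "LEE"},
    {"id": "P2", "mdob": "1972-05-06", "bdob": "2006-07-08", "mfn": "MARY", "mln": "KIM", "fln": "PARK"},
]
links = link(sart, pell)
print(links)
# prints: {'S1': ('P1', 'phase1'), 'S2': ('P2', 'phase2')}
```

Removing linked records before each looser pass is what keeps a weak key from overriding a match already secured by a stronger one.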

Analysis of Linkage Rates and Assessment of Linkage Quality

SART CORS to PELL Data Linkage Rates

Delivery linkage rates were calculated as total linked SART CORS-PELL birth delivery records divided by total potential deliveries in the SART CORS records. These linkage rates were further examined by maternal residence (MA vs. Unknown) and by clinic location (Massachusetts clinic vs. non-Massachusetts clinic).

Analysis of Linked and Unlinked Records

We compared linked to unlinked SART CORS delivery records on select maternal and infant demographic and clinical markers found in the SART CORS database, using Chi square tests, to assess any systematic differences between these two groups. We further compared characteristics of linked ART treatment records that had recorded birth outcomes to those that did not to assess for systematic bias in records without birth outcome follow up data.

Concordance of SART CORS and PELL Databases

To document the importance and advantages of linking the SART CORS to the PELL data systems, and to validate the best measures of key exposures and outcomes, we examined the concordance of two sets of key measures: ART treatment and birth outcomes, including plurality, number of children, and live birth/fetal death status. SART CORS was considered the gold standard for measuring presence of ART treatment; and because it draws on vital statistics records, PELL served as the gold standard for birth outcomes. Concordance was measured using the Kappa statistic, and the validity of ART reported on the birth certificate/PELL, and birth outcomes reported on the SART CORS record, were measured using sensitivity and positive predictive value.

Results

Matched Deliveries Using the Five Phase Linkage Algorithm

The linkage of 334,152 deliveries occurring in Massachusetts between July 1, 2004 and December 31, 2008 to 42,649 cycles of ART treatment from SART CORS having cycle starts between January 1, 2004 and December 31, 2008 yielded a total of 9,092 deliveries attributable to ART treatment, and 325,060 deliveries not attributable to ART treatment.

Table 1 shows a summary of the sequential five-phase deterministic SART CORS-PELL linkage for the 10,138 ART associated deliveries from SART CORS and the 334,152 deliveries from MA PELL. In phase 1, perfect matches, 70.8 % of all of the matched deliveries (6,439) were linked. Of these, 64.6 % were true perfect matches, and those with minor name variations (e.g., Soundex) accounted for the remaining matches. At each additional step, fewer matches were found. Phase 2, which omitted father's name information, resulted in 15.5 % or 1,408 additional matches. As with phase 1, the great majority were true perfect matches. Phase 3, which evaluated two out of the three name concurrences and MDOB and BDOB agreements, yielded an additional 3.6 % or 328 matches. The principal name variant that provided the most new linkages (3.0 %) was a change in mother's last name between time of conception and birth. Phase 4, BDOB variations, was an important source of 7.9 % or 722 additional matches. Inconsistencies in dates of birth between SART CORS clinical records and PELL records, as well as typographical errors, were not uncommon. Finally, phase 5, which linked on a single primary matching variable, yielded only 2.1 % or 195 additional linkages, including 35 new fetal death record linkages.

Table 1. Five phase linkage algorithm for 10,138 SART known deliveries (July 1, 2004–December 31, 2008).

| Matching phase | Type of match | Number linked | Cumulative number linked | Cumulative percentage linked (%) | Remaining number unlinked |
|---|---|---|---|---|---|
| 1 | MDOB, BDOB, MFN, MLN, FLN [a] (all matching variables) | 6,439 | 6,439 | 70.8 | 3,699 |
| 2 | MDOB, BDOB, MFN, MLN (all variables except father names) | 1,408 | 7,847 | 86.3 | 2,291 |
| 3 | Two matched name variations plus MDOB and BDOB | 328 | 8,175 | 90.0 | 1,963 |
| 4 | BDOB variations | 722 | 8,897 | 97.9 | 1,241 |
| 5 | Other miscellaneous linkages | 195 | 9,092 | 100 | 1,046 |

[a] MDOB mother's date of birth, BDOB baby's date of birth, MFN mother's first name, MLN mother's last name, FLN father's last name

Two other linkage notes are important to mention. Both are essentially variants of the Phase 4 BDOB analyses in which, after gradually loosening the BDOB criteria, we eliminated BDOB restrictions altogether. First, we reviewed all 18,439 SART CORS women regardless of any reported birth outcome, using MDOB, MFN, MLN, and/or FLN and the same matching algorithms as above. As noted earlier, this identified 52 additional deliveries within 5–9 months of an ART cycle: 37 deliveries from 37 women with no birth outcome recorded in SART CORS, and 15 deliveries from 6 women with early pregnancy loss cycles. Second, we detected 1,571 additional deliveries to SART CORS women (i.e., maternal linkages) in the PELL database where the births were not related to a current ART treatment cycle (e.g., more than 9 months + 62 days from an ART cycle start date). That is, we detected other non-ART treatment deliveries to women who at some point in 2004–2008 had an ART cycle, potentially representing births to women with fertility issues but no ART treatment delivery [23].

Overall Matching/Linkage rates

Table 2 presents the distribution of matched and unmatched deliveries among the 10,138 potential SART CORS delivery outcomes eligible for linkage with the PELL data system. Overall, 89.7 % of the ART deliveries were matched. There are marked variations in linkage rates by Massachusetts residency status and by site of ART treatment. The majority of delivery records, 6,606 (65.2 %), were births to MA resident women receiving ART treatment in MA clinics; among these women the linkage rate was 95.0 %. There were 3,094 delivery records of women treated in MA clinics but with unknown residency in SART CORS; among these, 83.5 % linked to a delivery in PELL. Among Massachusetts residents at conception who received ART services outside Massachusetts, only 52.7 % of deliveries matched.

Table 2. SART CORS to PELL linkage rates, overall and by residency and ART site.

| | MA clinic, MA resident | Non-MA clinic, MA resident | MA clinic, residency unknown | Total |
|---|---|---|---|---|
| All records (N) | 6,606 | 438 | 3,094 | 10,138 |
| Linked records (N) | 6,276 | 231 | 2,585 | 9,092 |
| Un-linked records (N) | 330 | 207 | 509 | 1,046 |
| Linked records (%) | 95.0 % | 52.7 % | 83.5 % | 89.7 % |
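The rates in Table 2 follow directly from the linked and total counts. The short sketch below recomputes them, along with the pooled rate for all MA residents (both clinic locations combined); the dictionary layout and names are illustrative.

```python
# (linked, total) delivery records per group, copied from Table 2.
groups = {
    "MA clinic, MA resident": (6_276, 6_606),
    "Non-MA clinic, MA resident": (231, 438),
    "MA clinic, residency unknown": (2_585, 3_094),
}


def rate(linked, total):
    """Linkage rate as a percentage, rounded to one decimal place."""
    return round(100 * linked / total, 1)


rates = {name: rate(linked, total) for name, (linked, total) in groups.items()}
overall = rate(sum(l for l, _ in groups.values()), sum(t for _, t in groups.values()))

# Pool the two MA-resident groups regardless of clinic location.
ma_resident = rate(6_276 + 231, 6_606 + 438)

print(rates)
print(overall, ma_resident)
# prints: 89.7 92.4
```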

Analysis of Linked and Unlinked Records

In Table 3, we examined possible differences between the 9,092 SART CORS/PELL linked and 1,046 unlinked deliveries. Based on available SART CORS data, there were no significant differences in maternal age or race, though data on race was missing for the majority of the SART CORS records. Infant outcomes did not differ for plurality or live birth, although there was a slight increase in fetal deaths among unmatched records. Clinically, there were some statistically significant, though numerically small, differences between linked and unlinked records in the type of ART cycles (e.g., more autologous fresh and frozen cycles in linked records); and reasons for treatment (e.g., more male factor, ovulation disorders and uterine factors). Unlinked records, consistent with the prior matching data, had disproportionately more missing state residency and more treatment from non-Massachusetts clinics.

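Table 3. Linked versus unlinked SART CORS delivery records.

| Domain | Linked (9,092) % | Unlinked (1,046) % | p Value |
|---|---|---|---|
| Maternal age <35 years | 49 | 51 | <.10 |
| Maternal race/ethnicity: Unknown | 68 | 81 | <.001 |
| Maternal race/ethnicity: White (of known records) | 84 | 81 | <.25 |
| Mother residency: unknown | 28 | 49 | <.001 |
| Clinic site: non-MA | 3 | 20 | <.001 |
| Plurality: multiples | 28 | 30 | <.13 |
| Gestational age <37 weeks | 38 | 41 | <.001 |
| Fetal deaths | 0.7 | 2.3 | <.001 |
| Type of ART: Autologous fresh | 79 | 76 | <.02 |
| Type of ART: Donor fresh | 8 | 10 | <.03 |
| Type of ART: Autologous frozen | 11 | 10 | <.49 |
| Type of ART: Donor frozen | 2 | 4 | <.001 |
| Reasons for ART [a]: Endometriosis | 9 | 7 | <.07 |
| Reasons for ART [a]: Male factor | 33 | 29 | <.02 |
| Reasons for ART [a]: Ovulation disorders | 13 | 11 | <.03 |
| Reasons for ART [a]: Diminished ovarian reserve | 9 | 8 | <.18 |
| Reasons for ART [a]: Tubal factors | 27 | 28 | <.65 |
| Reasons for ART [a]: Uterine factors | 3 | 2 | <.04 |
| Reasons for ART [a]: Other factors | 17 | 19 | <.03 |
| Reasons for ART [a]: Unexplained factors | 22 | 22 | <.90 |

[a] Adds to over 100 % if multiple reasons indicated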
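To illustrate the kind of chi-square comparison used for Table 3, the sketch below tests the "mother residency unknown" row, for which exact counts can be recovered from Table 2 (2,585 of 9,092 linked and 509 of 1,046 unlinked records). It is a plain Pearson chi-square for a 2×2 table without continuity correction; whether the authors applied a correction, and which software they used, is not stated, so the statistic is indicative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


linked_unknown, linked_total = 2_585, 9_092
unlinked_unknown, unlinked_total = 509, 1_046

chi2, p_value = chi_square_2x2(
    linked_unknown, linked_total - linked_unknown,
    unlinked_unknown, unlinked_total - unlinked_unknown,
)
print(f"chi2 = {chi2:.1f}, p < .001: {p_value < 0.001}")
```

The very small p-value agrees with the "<.001" reported for this row.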

Compared to the 9,040 deliveries with known birth outcomes in the SART CORS data system, deliveries with unknown birth outcomes in the SART CORS data system (n = 52) were more likely to be to Black (19 % vs. 3 %) or Hispanic (10 % vs. 3 %) mothers, more likely to have an unknown residency (37 % vs. 20 %), less likely to require ART for tubal factors (28 % vs. 50 %), and more likely to experience fetal deaths (25 % vs. 0.4 %) (data not shown).

SART CORS—PELL Data System Concordance

ART Treatment Concordance

We found limited agreement between ART treatment exposure from SART CORS records and report of ART treatment on the MA birth certificate or fetal death record (Table 4), with ART treatment substantially underreported on Massachusetts birth certificates. Among the 9,092 MOSART linked delivery records, 3,356 (37 %) were recorded as having ART on the PELL birth records; among the 5,364 birth certificate delivery records indicating ART fertility treatment, only 3,356 were confirmed from the SART CORS, for a 63 % positive predictive value. Specificity was over 99 %, reflecting the large number of deliveries without an ART treatment, but the overall Kappa was only 0.45. Complete data for all concordance analyses are presented in the electronic appendices.

Table 4. Concordance of PELL and SART CORS data items.

| Data item | Kappa [a] | Sensitivity (%) [a] | Positive predictive value (%) [a] |
|---|---|---|---|
| ART status | .453 | 36.9 | 62.6 |
| Plurality at delivery | .973 | 98.9 | 99.5 |
| Fetal loss/live birth associated delivery | .457 | 47.4 | 69.2 |

[a] Concordance statistics; complete data for concordance tables in Electronic Appendices 1–3
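The ART status row of Table 4 can be reproduced from counts given in the text, which makes the definitions concrete. The sketch below assumes the 2×2 table is built over all 334,152 PELL deliveries, with SART CORS as the reference standard: 9,092 linked ART deliveries, 5,364 deliveries flagged as ART on the birth certificate, and 3,356 flagged in both. The function and variable names are illustrative.

```python
def concordance(tp, fp, fn, tn):
    """Sensitivity, PPV, specificity and Cohen's kappa for a 2x2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


total_deliveries = 334_152   # assumed denominator: all PELL deliveries
sart_art = 9_092             # ART per SART CORS (reference standard)
certificate_art = 5_364      # ART per birth certificate/PELL
both = 3_356

tp = both
fp = certificate_art - both
fn = sart_art - both
tn = total_deliveries - tp - fp - fn

stats = concordance(tp, fp, fn, tn)
print({name: round(value, 3) for name, value in stats.items()})
```

Under these assumptions the output matches the reported sensitivity (36.9 %), positive predictive value (62.6 %) and Kappa (.453), and gives a specificity above 99 %. The low Kappa despite high specificity reflects how rare ART is among all deliveries.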

MOSART Plurality and Children Concordance

We compared the plurality status at delivery recorded in SART CORS treatment records with that identified from the PELL delivery records. Table 4 documents the very high degree of concordance for plurality across the two data systems, with an overall Kappa of 0.97. We further explored the number of children born across the two data systems. Among the 9,092 matched deliveries, the MA PELL data system recorded 11,729 children (deriving from 6,556 singleton births, 4,872 twin births, and 301 high order multiple births), while SART CORS recorded 11,635 children (deriving from 6,544 singleton births, 4,796 births involving twins, and 295 births involving high order multiples; data not shown).

Fetal Death and Live Births Concordance

Table 4 also documents the only moderate concordance of fetal death associated deliveries between SART CORS and PELL, with an overall Kappa of 0.46. Among the ART-conceived infants in the MOSART study population, there were a total of 60 fetal deaths among 57 deliveries, or 5.1 (60/11,729) deaths per 1,000 births and fetal deaths. SART CORS recorded 52 total fetal losses, again with only moderate overlap with PELL at the individual infant level: 46.7 % of PELL fetal deaths were noted in SART CORS and 59.6 % of SART CORS fetal deaths were noted in PELL (data not shown).

Final MOSART Population 2004–2008

The final MOSART child study population for 2004–2008 is shown in Fig. 2. It includes: 342,035 children born to 282,971 MA resident women who delivered a child in MA hospitals between July 1, 2004 and Dec 31, 2008. Among these women, there were 334,152 deliveries, including 9,092 ART associated deliveries and 325,060 non-associated deliveries. The 9,092 ART associated deliveries resulted in 11,729 infants (11,669 live births and 60 fetal deaths); including 6,556 singletons and 5,173 multiples.

Fig. 2. Final MOSART sample (2004–2008)

According to this analysis, 2.7 % (9,092/334,152) of Massachusetts deliveries to Massachusetts residents, and 3.4 % (11,729/342,035) of Massachusetts children, were conceived by ART treatment during the period July 1, 2004–December 31, 2008, including 2.0 % of all Massachusetts singletons, 32.8 % of twins, and 44.8 % of triplets+.

Discussion

ART-assisted births are increasing nationally and in MA, but little is known about the long term health outcomes for these children. This paper reports on development of a MA population-based database, the MOSART database, formed from the linkage of the SART CORS and PELL data systems. MOSART is unique in differentiating ART and non-ART treatment status for all children born in MA hospitals or birthing centers to MA-residing women who received ART at MA-based or non-MA-based ART clinics, and in linking that data to a population based dataset that can be used to follow children's health status after their birth. The SART CORS serves as the gold standard for identifying ART treatment at conception and the MA PELL data system serves as the gold standard for birth outcomes at delivery. This paper demonstrates that robust linkage between these data systems is feasible, with 95 % of deliveries to MA women treated in MA clinics and 89.7 % of all potential SART CORS delivery records linked to the PELL data system with high linkage accuracy.

The MOSART database is one of the first US population-based linked clinical-public health databases to incorporate both detailed treatment data and subsequent longitudinal information on post-birth child health status, health service utilization, and public health program participation. Ultimately this database can be used to create algorithms that identify and compare births deriving from ART treatment, those with an indication of maternal subfertility but lacking ART treatment, and a residual presumed fertile control population [23]. The MOSART database currently, however, does not systematically identify women who have had ovulation induction or intrauterine insemination, and may not be able to identify them in the fertile or subfertile populations. The MOSART database can be used to research issues that relate ART clinical data and underlying infertility to longitudinal health outcomes for children. The MOSART child database can and will be augmented annually with additional years of data.

The current linkage analysis is consistent with, and strengthens, prior efforts to link vital records with ART births on a population level. The MA SART CORS-PELL deterministic database linkage rate of 89.7 % is comparable to those of the few other population-based probabilistic linkage efforts: higher than the 80 % rate obtained from the basic initial linkage by Sunderam et al. [12] and the 87.7 % rate from Zhang et al. [13], and slightly lower than the overall 90.2 % rate from Mneimneh et al. [14] using the recent improved SMART probabilistic methodology (though still without name identifiers). The rates are not, however, fully comparable, as each effort makes slightly different assumptions about which records are eligible for inclusion in its linkage numerators and denominators. We believe, however, that the current MOSART linkage is ultimately more accurate, as there is an increased probability of misclassification in any algorithm that relies heavily on MDOB, which we documented is inaccurate in 7.9 % of SART CORS/NASS records. Zhang et al. [24] detected 3.6 % linkage inaccuracies (primarily false positives) when they compared the SMART probabilistic methods with the deterministic subsample from Sunderam et al. [12]. A brief examination of the MOSART database shows that we would have found ∼249 (3.0 %) additional false links using only MDOB, BDOB and plurality status as linkage variables. The availability of names for deterministic linkages clearly strengthens the certainty that the SART CORS-PELL linkages are accurate.

Our broad inclusion criteria for ART treatment cycles obtained from SART CORS were likely an important factor in the overall linkage rates we obtained, since the “true” linkage rates depend in part on the breadth of the denominator used to determine the rate. The highest linkage rates (95 %) were among women who were MA residents receiving ART treatment at MA clinics. Unlinked records fall into two broad categories: linkage failures (false negatives), for example from inadequate or imprecise information in the linking records; and records ineligible for linkage, for example where the ART treatment or state of birth was not actually associated with a MA resident. Several ineligibility scenarios could exist. First, current study-eligible SART CORS women may reside in and deliver in non-MA states, and hence they would not be identifiable among MA births in PELL. Indeed, linkage rates were much lower for deliveries in the unknown residency group (83.5 %) than for MA residents (92.4 %; see Table 2). Zhang et al. [13] further reported that 2.5 % of MA ART-treated women delivered in Rhode Island and other nearby states. Second, since there are up to 9 months of gestation between the cycle start date and the delivery date, some MA women may have moved out of state after the ART treatment, and hence their deliveries would not be in the MA resident birth file. Consistent with this, we had the lowest linkage rates (52.7 %) for non-MA clinic treatment. Additionally, 2.7 % of MA resident births occur in neighboring states [25]. Given that the majority of the 1,046 unlinked records involved either deliveries where residence was unknown (509) or MA residents treated in a non-MA clinic (207), and that we have no way of determining whether those deliveries actually occurred in Massachusetts, it is likely that the true overall SART CORS-PELL linkage rate was substantially higher than 89.7 %. The true linkage rate, based on the SART CORS information, ultimately depends on knowledge of delivery state, not the state of residence or site of treatment at time of ART conception treatment.
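The dependence of the linkage rate on the denominator can be illustrated with the counts reported in this paper. The following is a minimal sketch and not part of the MOSART methodology: it recomputes the overall rate and then an illustrative ceiling under the hypothetical, unverifiable assumption that every unlinked record with unknown residence or non-MA clinic treatment was in fact ineligible. The variable names are ours.

```python
# Counts reported in the paper.
eligible_deliveries = 10_138               # SART CORS deliveries considered for linkage
linked = 9_092                             # deliveries linked to a PELL delivery record
unlinked_unknown_residence = 509           # unlinked, residence unknown
unlinked_ma_resident_non_ma_clinic = 207   # unlinked, MA resident treated out of state

unlinked = eligible_deliveries - linked
overall_rate = linked / eligible_deliveries

# Hypothetical ceiling (an assumption, not a reported result): drop both groups
# of unlinked records from the denominator as if none had delivered in MA.
possibly_ineligible = unlinked_unknown_residence + unlinked_ma_resident_non_ma_clinic
ceiling_rate = linked / (eligible_deliveries - possibly_ineligible)

print(f"unlinked records:      {unlinked}")
print(f"overall linkage rate:  {overall_rate:.1%}")
print(f"illustrative ceiling:  {ceiling_rate:.1%}")
```

The ceiling is deliberately extreme, since some of those records were presumably eligible and simply failed to link; the true rate would lie somewhere between the reported 89.7 % and this figure.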

The concordance analyses further demonstrate the importance of linking the two databases. The resulting clinical public health database has improved accuracy relative to either database alone, which allows more accurate study of the impact of ART on reproductive health, in addition to the obvious advantage of combining the information that each database contains.

SART CORS, by definition, is the gold standard for ART treatment. As has been noted by Zhang et al. [24] and Cohen et al. [26], we too demonstrated very poor concordance between SART CORS-reported ART treatment and the ART identifier(s) on the MA/national standard birth certificate, with ART treatment both substantially over-reported (38 %) and under-reported (63 %) on the birth certificate and with limited positive predictive value. These discrepancies may reflect the fact that the MA birth certificate allows ART to be confused with intrauterine insemination in the same assisted reproduction response category. These figures are discouraging and strongly suggest that birth certificates are very poor measures of ART treatment status and that they should not be used for public health research or surveillance. Conversely, the public health birth certificate data have more accurate and extensive birth outcome information, especially for less common birth outcomes like fetal deaths, and greater accuracy of infant date of birth for ascertaining gestational age.
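Concordance measures of this kind can be computed from a 2 × 2 cross-tabulation of the reference source against the source being evaluated. The sketch below is illustrative only. This section does not spell out how the over- and under-reporting percentages were derived, so the definitions used here (under-reporting as 1 − sensitivity, over-reporting as 1 − positive predictive value, with SART CORS as the reference) are an assumption, and the counts are invented toy values rather than MOSART data.

```python
def concordance(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Agreement of a test source (e.g. birth certificate ART flag)
    with a reference source (e.g. SART CORS), from 2x2 cell counts.

    tp: flagged in both sources       fn: reference only
    fp: test source only              tn: flagged in neither
    """
    sensitivity = tp / (tp + fn)   # share of reference cases the test source captures
    specificity = tn / (tn + fp)   # share of reference non-cases left unflagged
    ppv = tp / (tp + fp)           # share of test-source flags confirmed by the reference
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "under_reported": 1 - sensitivity,  # assumed definition
        "over_reported": 1 - ppv,           # assumed definition
    }

# Invented toy counts, chosen only to make the arithmetic easy to follow.
toy = concordance(tp=40, fn=60, fp=20, tn=880)
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```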

The nature of the final linked database is important for the breadth and accuracy of research on the sequelae of ART treatment. Currently, studies of ART birth outcomes in the United States are either clinic-based [9, 27] or utilize either the SART CORS [28–31] or the NASS [2, 3] database and thus have no capacity to evaluate health risks in a longitudinal manner, or are based on vital statistics data in which ART patients may not be accurately identified. The MOSART database allows for cycle, mother, delivery, child and family-based analyses, and further allows for longitudinal investigation.

Public Health & Epidemiologic Implications for Linkage/Surveillance

Our data linkage effort provides insights to improve future MCH epidemiologic research, especially for ART-vital statistics data linkage methodology, or for any name/date-based clinical-public health linkage effort. We find that ART outcome studies should be cautious about using the exact baby's date of birth from SART CORS or NASS as the algorithmic starting point for linking ART databases or for assessing ART and birth outcome characteristics: 7.9 % of the linked birth records demonstrated inaccurate dates of birth. Additionally, we discovered a small number of birth outcomes (0.5 %) that were not recorded in the SART CORS system, and those had distinctive characteristics, more often involving minority women and deliveries with fetal deaths.

We documented substantial variations in key linker name variables, especially for women's names across a pregnancy period. Our practical experience suggested that Spellex was stronger than Soundex (which created too many false positives); in the end, we came to prefer matching on the first three letters of a name. Father's name, not surprisingly, was the most inconsistent, followed by mother's last name and mother's first name.
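The trade-off between phonetic coding and prefix matching can be sketched as follows. This is an illustrative example under our own assumptions, not the code used to build MOSART: `soundex` is a plain implementation of standard American Soundex, `first3_match` is a hypothetical helper for a first-three-letter comparison, and the names are invented. Spellex is a commercial product and is not reproduced here.

```python
def soundex(name: str) -> str:
    """Standard American Soundex: first letter plus three digits."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue               # H and W do not separate repeated codes
        code = codes.get(ch, "")   # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                # a vowel resets the previous code
    return (result + "000")[:4]


def first3_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare the first three letters, case-insensitively."""
    return a.strip().lower()[:3] == b.strip().lower()[:3]


# Invented name pairs: a spelling variant and two different names.
for a, b in (("Katherine", "Kathryn"), ("Robert", "Rupert")):
    print(a, b, soundex(a), soundex(b), first3_match(a, b))
```

In this toy case both methods accept the spelling variant, but Soundex also equates two different names, which is the kind of false positive described above; the prefix rule rejects that pair.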

We utilized a deterministic rather than a probabilistic approach for creating the linkages. Each approach has its advantages and disadvantages. Probabilistic linkages are much less time consuming and may be more practical for ongoing public health surveillance; but we were developing a detailed research database and were concerned with establishing ART attribution as accurately as possible.
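A deterministic linkage of this general kind can be sketched as a cascade of exact-key passes, each stricter pass running before a more relaxed one. The example below is a simplified two-phase illustration under our own assumptions and does not reproduce the five-phase MOSART algorithm: the field names (`last`, `first`, `mdob`, `bdob`), the two key functions, and the records are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
KeyFn = Callable[[Record], tuple]


def exact_key(r: Record) -> tuple:
    """Phase 1 (hypothetical): full names plus both dates of birth."""
    return (r["last"].lower(), r["first"].lower(), r["mdob"], r["bdob"])


def relaxed_key(r: Record) -> tuple:
    """Phase 2 (hypothetical): first three letters of names, mother's DOB only."""
    return (r["last"].lower()[:3], r["first"].lower()[:3], r["mdob"])


def deterministic_link(clinic: List[Record], births: List[Record],
                       phases: List[KeyFn]) -> Dict[int, Tuple[int, int]]:
    """Return {clinic index: (birth index, phase number)}."""
    links: Dict[int, Tuple[int, int]] = {}
    used = set()  # birth records already linked in an earlier step
    for phase_no, key in enumerate(phases, start=1):
        index: Dict[tuple, List[int]] = {}
        for bi, b in enumerate(births):
            if bi not in used:
                index.setdefault(key(b), []).append(bi)
        for ci, c in enumerate(clinic):
            if ci in links:
                continue
            candidates = [bi for bi in index.get(key(c), []) if bi not in used]
            if len(candidates) == 1:  # accept only unambiguous matches
                links[ci] = (candidates[0], phase_no)
                used.add(candidates[0])
    return links


# Invented records: one exact match, one with a name variant and a date discrepancy.
clinic = [
    {"last": "Smith", "first": "Ann", "mdob": "1975-01-02", "bdob": "2006-05-01"},
    {"last": "Jones", "first": "Maria", "mdob": "1978-03-04", "bdob": "2007-07-08"},
]
births = [
    {"last": "Jones-Lee", "first": "Marie", "mdob": "1978-03-04", "bdob": "2007-07-09"},
    {"last": "Smith", "first": "Ann", "mdob": "1975-01-02", "bdob": "2006-05-01"},
]
links = deterministic_link(clinic, births, [exact_key, relaxed_key])
print(links)
```

Recording the phase at which each link was made allows later analyses to be restricted to the strictest matches. A production system would also need to handle ties on the clinic side and manual review of ambiguous candidates, which this sketch omits.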

In linking clinical and public health databases, we learned the critical importance of having individuals from both databases on the same team, actively sharing their databases' unique features. Database managers from both the originating clinical and public health databases may not be familiar with linking data sets across health systems created for very different purposes. Moreover, without an active joint leadership team, errors in interpretation would be much more readily made. Clinical and public health database professionals need to collaborate in finding a common language in creating linked databases.

Finally, this MOSART linkage is representative of a new direction in MCH epidemiology, the creation of clinical-public health databases, the blending of clinical data and public health data. Such data systems can provide the longitudinal information for clinical follow up research and practice, independent of whether a person remains in the original case/clinical care practice; and they can provide needed clinical treatment information to interpret population trends. But mostly, they would allow for addressing issues that are at the nexus of clinical and public health practice—such as the impact of ART on population patterns of long term child health. They will allow for assessing the aggregation of clinical practices on population health, and the role of public policy on clinical practice and health outcomes.

The MOSART database is one of the initial reproductive clinical public health databases in the United States. This linkage represents a model of what is possible and what is likely to develop in the coming years. The public health (and clinical) worlds have been slow compared to private industry to exploit the advantages of linkage studies involving large databases to improve clinical practice, understand clinical outcomes, and advise public policy. This study has demonstrated that such linkages are possible. This methodology could be used for similar linkages of other clinical (and programmatic) data to Vital Statistics Registry databases.

Conclusion

In sum, this paper reports on the construction of the linked MOSART database and evaluates its linkage rates and accuracy, and expands our epidemiologic knowledge of methods for linking public health and clinical databases on a population basis. The MOSART database derives from a robust linkage between SART CORS and PELL databases, and provides a unique, innovative, and crucial platform for further studies of the impact of ART on subsequent maternal and infant health outcomes.

Supplementary Material

Supplemental

Acknowledgments

This project was supported by NICHD Award Numbers R01 HD064595 and R01 HD067270. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Child Health and Human Development or the National Institutes of Health. Other/former MOSART staff: Howard Cabral; Bruce Cohen; Katrina Plummer; Mark Hornstein; Barbara Luke; Qi Yu; Daksha Gopal; Judith Weiss; and Maurizio Macaluso. SART wishes to thank all of its members for providing clinical information to the SART CORS database for use by patients and researchers. Without the efforts of SART members, this research would not have been possible.

Abbreviations

US: United States

MA: Massachusetts

ART: Assisted Reproductive Technologies

PELL: Pregnancy to Early Life Longitudinal data system

SART: Society for Assisted Reproductive Technologies

CORS: Clinical Outcome Reporting System

Footnotes

Electronic supplementary material The online version of this article (doi:10.1007/s10995-014-1465-4) contains supplementary material, which is available to authorized users.

Conflict of interest: The authors have no conflicts of interest to disclose. The findings and conclusions in this paper are those of the authors and do not necessarily represent the official position of the Society of Assisted Reproductive Technologies or the Massachusetts Department of Public Health.

Contributor Information

Milton Kotelchuck, Email: mkotelchuck@mgh.harvard.edu, MGH Center for Child and Adolescent Health Research and Policy, MassGeneral Hospital for Children, 100 Cambridge Street, 15-1545, Boston, MA 02114, USA.

Lan Hoang, Community Health Sciences, Boston University School of Public Health, Boston, MA, USA.

Judy E. Stern, Obstetrics and Gynecology and Pathology, Geisel School of Medicine at Dartmouth, Lebanon, NH, USA.

Hafsatou Diop, Massachusetts Department of Public Health, Boston, MA, USA.

Candice Belanoff, Community Health Sciences, Boston University School of Public Health, Boston, MA, USA.

Eugene Declercq, Community Health Sciences, Boston University School of Public Health, Boston, MA, USA.

References

  • 1. Centers for Disease Control and Prevention, American Society for Reproductive Medicine, Society for Assisted Reproductive Technology. Washington, DC: 2011. [Accessed 8/5/2013]. Preliminary Assisted Reproductive Technology Success Rates: National Summary and Fertility Clinic Reports 2013. http://www.cdc.gov/art/
  • 2. Schieve LA, Ferre C, Peterson HB, Macaluso M, Reynolds MA, Wright VC. Perinatal outcome among singleton infants conceived through assisted reproductive technology in the United States. Obstetrics and Gynecology. 2004;103(6):1144–1153. doi: 10.1097/01.AOG.0000127037.12652.76.
  • 3. Schieve LA, Meikle SF, Ferre C, Peterson HB, Jeng G, Wilcox LS. Low and very low birth weight in infants conceived with use of assisted reproductive technology. New England Journal of Medicine. 2002;346(10):731–737. doi: 10.1056/NEJMoa010806.
  • 4. Reddy UM, Wapner RJ, Rebar RW, Tasca RJ. Infertility, assisted reproductive technology, and adverse pregnancy outcomes: Executive summary of a National Institute of Child Health and Human Development workshop. Obstetrics and Gynecology. 2007;109(4):967–977. doi: 10.1097/01.AOG.0000259316.04136.30.
  • 5. Kissin D, Schieve L, Reynolds M. Multiple-birth risk associated with IVF and extended embryo culture: USA, 2001. Human Reproduction. 2005;20(8):2215–2223. doi: 10.1093/humrep/dei025.
  • 6. Schieve LA, Rasmussen SA, Buck GM, Schendel DE, Reynolds MA, Wright VC. Are children born after assisted reproductive technology at increased risk for adverse health outcomes? Obstetrics and Gynecology. 2004;103(6):1154–1163. doi: 10.1097/01.AOG.0000124571.04890.67.
  • 7. Hansen M, Kurinczuk JJ, Bower C, Webb S. The risk of major birth defects after intracytoplasmic sperm injection and in vitro fertilization. New England Journal of Medicine. 2002;346(10):725–730. doi: 10.1056/NEJMoa010035.
  • 8. Hansen M, Bower C, Milne E, de Klerk N, Kurinczuk JJ. Assisted reproductive technologies and the risk of birth defects—A systematic review. Human Reproduction. 2005;20(2):328–338. doi: 10.1093/humrep/deh593.
  • 9. Olson CK, Keppler-Noreuil KM, Romitti PA, et al. In vitro fertilization is associated with an increase in major birth defects. Fertility and Sterility. 2005;84(5):1308–1315. doi: 10.1016/j.fertnstert.2005.03.086.
  • 10. Louis GB, Schisterman E, Dukic V, Schieve L. Research hurdles complicating the analysis of infertility treatment and child health. Human Reproduction. 2005;20(1):12–18. doi: 10.1093/humrep/deh542.
  • 11. Declercq E, Barger M, Cabral HJ, et al. Maternal outcomes associated with planned primary cesarean births compared with planned vaginal births. Obstetrics and Gynecology. 2007;109(3):669–677. doi: 10.1097/01.AOG.0000255668.20639.40.
  • 12. Sunderam S, Schieve LA, Cohen B, et al. Linking birth and infant death records with assisted reproductive technology data: Massachusetts, 1997–1998. Maternal and Child Health Journal. 2006;10(2):115–125. doi: 10.1007/s10995-005-0013-7.
  • 13. Zhang Y, Cohen B, Macaluso M, Zhang Z, Durant T, Nannini A. Probabilistic linkage of assisted reproductive technology information with vital records, Massachusetts 1997–2000. Maternal and Child Health Journal. 2012;16(8):1703–1708. doi: 10.1007/s10995-011-0877-7.
  • 14. Mneimneh AS, Boulet SL, Sunderam S, et al. States monitoring assisted reproductive technology (SMART) collaborative: Data collection, linkage, dissemination, and use. Journal of Women's Health. 2013;22(7):571–577. doi: 10.1089/jwh.2013.4452.
  • 15. United States Legislative Act. United States Fertility Clinic Success Rate and Certification Act of 1992: Public Law 102-493. US Statutes. 1992;106:3146–3152.
  • 16. Shapiro-Mendoza CK, Tomashek KM, Kotelchuck M, et al. Effect of late-preterm birth and maternal medical conditions on newborn morbidity risk. Pediatrics. 2008;121(2):e223–e232. doi: 10.1542/peds.2006-3629.
  • 17. Tomashek KM, Shapiro-Mendoza CK, Davidoff MJ, Petrini JR. Differences in mortality between late-preterm and term singleton infants in the United States, 1995–2002. The Journal of Pediatrics. 2007;151(5):450–456. doi: 10.1016/j.jpeds.2007.05.002.
  • 18. Stern JE, Brown MB, Luke B, et al. Calculating cumulative live-birth rates from linked cycles of assisted reproductive technology (ART): Data from the Massachusetts SART CORS. Fertility and Sterility. 2010;94(4):1334–1340. doi: 10.1016/j.fertnstert.2009.05.052.
  • 19. Luke B, Brown MB, Wantman E, et al. Cumulative birth rates with linked assisted reproductive technology cycles. New England Journal of Medicine. 2012;366(26):2483–2491. doi: 10.1056/NEJMoa1110238.
  • 20. Luke B, Brown M, Wantman E, Lederman A, Gibbons W, Stern J. National assisted reproductive technology (ART) cycle linkage. Fertility and Sterility. 2010;94(4):S82. doi: 10.1016/j.fertnstert.2009.05.052.
  • 21. INFOSoft, Inc. Winnipeg, Manitoba, Canada.
  • 22. SAS 9.2, 2011, SAS Institute Inc., Cary, NC, USA.
  • 23. Declercq ER, Belanoff C, Diop H, et al. Identifying women with indicators of subfertility in a statewide population database: Operationalizing the missing link in ART research. Fertility and Sterility. 2014;101(2):463–471. doi: 10.1016/j.fertnstert.2013.10.028.
  • 24. Zhang Z, Macaluso M, Cohen B, et al. Accuracy of assisted reproductive technology information on the Massachusetts birth certificate, 1997–2000. Fertility and Sterility. 2010;94(5):1657–1661. doi: 10.1016/j.fertnstert.2009.10.059.
  • 25. Massachusetts Department of Public Health. Birth. 2010
  • 26. Cohen B, Bernson D, Sappenfield W, et al. Accuracy of assisted reproductive technology information on birth certificates: Florida and Massachusetts, 2004–06. Paediatric and Perinatal Epidemiology. 2014. doi: 10.1111/ppe.12110.
  • 27. Shevell T, Malone FD, Vidaver J, et al. Assisted reproductive technology and pregnancy outcome. Obstetrics and Gynecology. 2005;106:1039–1045. doi: 10.1097/01.AOG.0000183593.24583.7c.
  • 28. Fujimoto VY, Luke B, Brown MB, et al. Racial and ethnic disparities in assisted reproductive technology outcomes in the United States. Fertility and Sterility. 2010;93(2):382–390. doi: 10.1016/j.fertnstert.2008.10.061.
  • 29. Kalra SK, Ratcliffe SJ, Coutifaris C, Molinaro T, Barnhart KT. Ovarian stimulation and low birth weight in infants conceived through in vitro fertilization. Obstetrics and Gynecology. 2011;118(4):863. doi: 10.1097/AOG.0b013e31822be65f.
  • 30. Luke B, Brown MB, Grainger DA, Stern JE, Klein N, Cedars MI. The effect of early fetal losses on singleton assisted-conception pregnancy outcomes. Fertility and Sterility. 2009;91(6):2578–2585. doi: 10.1016/j.fertnstert.2008.03.068.
  • 31. Luke B, Brown MB, Morbeck DE, Hudson SB, Coddington CC, III, Stern JE. Factors associated with ovarian hyperstimulation syndrome (OHSS) and its effect on assisted reproductive technology (ART) treatment and outcome. Fertility and Sterility. 2010;94(4):1399–1404. doi: 10.1016/j.fertnstert.2009.05.092.
