Abstract
Objective.
Health disparities in childhood-onset systemic lupus erythematosus (cSLE) disproportionately impact marginalized populations. Socioeconomically patterned missing data can magnify existing health inequities by supporting inferences that may misrepresent populations of interest. We assessed missing data and subsequent health equity implications among participants with cSLE enrolled in a large pediatric rheumatology registry.
Methods.
We evaluated co-missingness of 12 variables representing demographics, socioeconomic position, and clinical factors (e.g., disease-related indices) using Childhood Arthritis and Rheumatology Research Alliance (CARRA) Registry cSLE enrollment data (2015–2022; N=766). We performed logistic regression to calculate odds ratios (OR) and 95% confidence intervals (95% CI) for the association of demographic and socioeconomic factors with missing disease-related indices at enrollment (SLE Disease Activity Index [SLEDAI-2K] and/or SLICC/ACR Damage Index [SDI]). We then used linear regression to assess the association between socioeconomic factors and SLEDAI-2K at enrollment using three analytic methods for missing data: 1) complete case analysis, 2) multiple imputation, and 3) non-probabilistic bias analyses, in which missing values were imputed to represent extreme low or high disadvantage.
Results.
On average, participants were missing 6.2% of data, with over 50% of participants missing at least one variable. Missing data correlated most closely with variables within the same data category (e.g., demographic). Government-assisted health insurance was associated with missing SLEDAI-2K and/or SDI compared to private health insurance (OR=2.04; 95% CI: 1.22, 3.41). The different analytic approaches resulted in varying analytic sample sizes and fundamentally conflicting estimated associations.
Conclusion.
Our results support intentional evaluation of missing data to inform effect estimate interpretation and critical assessment of causal statements that might otherwise misrepresent health inequities.
Pervasive health disparities in childhood-onset systemic lupus erythematosus (cSLE) affect the quality of life and disease outcomes across marginalized populations. Studies have found that Black and Hispanic/Latino people with cSLE disproportionately develop more severe disease manifestations (1). In addition, low socioeconomic status is associated with high disease activity and organ damage among people with SLE compared to those with high socioeconomic status (2). Despite the known disparities in disease severity and outcomes in pediatric rheumatology, the most affected populations continue to be underrepresented in research studies. In a review of randomized clinical trials for SLE in adults, individuals who identified as members of minoritized racial and ethnic groups comprised 73% of prevalent SLE cases and only 45% of clinical trial participants (3).
Because cSLE is a relatively rare disease, multi-site patient registries are essential for observational research to gain knowledge on the disease and its outcomes. The Childhood Arthritis and Rheumatology Research Alliance (CARRA) Registry is the largest cSLE registry to date and represents over 70 sites across North America and Israel. The CARRA Registry has supported multiple published studies on cSLE in the past decade (4-14). While the CARRA Registry represents a diverse cohort of patients, to ensure that knowledge gained from the Registry is generalizable and representative of the diverse patient population affected by cSLE, a better understanding of patterns of missing data in the Registry is required.
Missing data that are socioeconomically, racially, or ethnically patterned can obscure existing health inequities by underestimating the underlying associations or contributing to unsupported inferences that fail to represent the target population. Because of this, there is a growing emphasis in epidemiologic research on the use of quantitative bias analysis methods to evaluate the impact of missing data, measurement error, selection bias, and other mechanisms that can contribute to biased estimates (15, 16). A recent systematic review by Lauper et al. (2021) reported that 83% of longitudinal observational studies reported in key rheumatology journals between 2008 and 2019 failed to report missing data on covariates and almost half relied on complete case analysis (17). When looking at prior CARRA research studies, we found that while many reported missing data, only a few studies used quantitative bias analysis methods, such as sensitivity analyses and multiple imputation (12, 18-11).
No studies to date have comprehensively assessed patterns of missing data in the CARRA Registry or evaluated how missing data may impact the interpretation of study results. By assessing the socioeconomic patterning of missing data related to disease outcomes in the cSLE population, researchers can better determine whether they should consider additional methods to address missingness when using CARRA Registry data, both to minimize the potential for perpetuating existing health inequities and to improve generalizability of results. In this study, we aim to identify patterns of missing data in the CARRA Registry, assess whether missing data are more prevalent among minoritized racial and ethnic and/or lower socioeconomic groups, and examine the health equity implications that might arise when missing data are not addressed in analyses.
PATIENTS AND METHODS
CARRA Registry.
The CARRA Registry is a prospective observational registry of persons with childhood-onset rheumatic disease designed to evaluate therapeutic safety among the study population (25). Enrollment began in July 2015 and is ongoing among 74 pediatric rheumatology clinical sites across the United States (US), Canada, and Israel and includes over 12,000 participants as of February 2022.
A total of 925 CARRA Registry participants were diagnosed with childhood-onset systemic lupus erythematosus (cSLE) using the 2012 Systemic Lupus International Collaborating Clinics (SLICC) diagnostic criteria. In this proof-of-concept study of missing data, we include 766 participants who also fulfilled at least 4 of 11 American College of Rheumatology (ACR) criteria for SLE (1997) or had biopsy-proven lupus nephritis with at least two additional ACR criteria prior to 19 years of age (26). We use these generally stricter guidelines to standardize the potential impact of missing diagnosis criteria, which is more difficult to address when using the SLICC diagnostic criteria because they allow for a varying number of criteria to confirm cSLE diagnosis (i.e., 4 out of 17 criteria without lupus nephritis or lupus nephritis and ANA or dsDNA positivity). All participants were diagnosed with cSLE within 24 months of enrollment or had a new diagnosis of lupus nephritis if their original cSLE diagnosis was greater than 24 months prior. At enrollment, participants and/or parents/guardians completed self- or medical staff-administered questionnaires regarding sociodemographics, symptom onset, and patient-reported outcomes. Pediatric rheumatology staff completed corresponding questionnaires regarding clinical manifestations, including disease symptoms, laboratory values, and physician global assessment of disease activity.
CARRA Registry participants (and/or a parent or guardian when required) provided written informed consent at Registry enrollment. Our current analyses were conducted using de-identified data and were considered exempt by the Institutional Review Board at the National Institutes of Health.
Socioeconomic factors.
Data on socioeconomic factors were collected at enrollment and included self-identified race (White, Black, Asian, Middle Eastern or North African, Native American, American Indian, or Alaska Native, Native Hawaiian or other Pacific Islander, multiple races, or races not otherwise specified) and ethnicity (Hispanic, not Hispanic), which are social constructs rooted in practices of structural racism rather than biological differences between racialized groups (27-30); annual household income (in U.S. or Canadian dollars: <$25,000, $25,000-49,999, $50,000-74,999, $75,000-99,999, $100,000-150,000, above $150,000, prefer not to answer, unknown); highest level of parental/guardian education completed (elementary/middle school, some high school, graduated high school or GED, college [including junior college or technical school], graduate school, prefer not to answer); and insurance status (private, government-assisted [e.g., Medicare, Medicaid, Military Health Care, State-specific Plan (non-Medicaid), Indian Health Services], other [e.g., Non-U.S. Insurance, Other], none).
Clinical factors.
Measures of SLE disease activity and severity were collected at enrollment. Measures included the SLE Disease Activity Index (SLEDAI-2K), which captures SLE disease activity over the past 30 days (possible score: 0 to 105), and the SLICC/ACR Damage Index (SDI), which captures SLE-related damage (possible score: 0 to 43). Additional clinical variables assessed were age at symptom onset and age at first visit with a pediatric rheumatologist. Disease duration was calculated as time from symptom onset to time of enrollment (years).
Covariates.
Covariates were collected at enrollment, including participant sex assigned at birth (male, female, other) and self-identified gender (male, female, other). Models were also adjusted for participant age (continuous) and enrollment site.
Statistical analysis.
Descriptive statistics were used to describe patterns of missingness among CARRA Registry participants. Tetrachoric correlation coefficients, which are measures of the strength of correlation between two binary variables (range −1.0 [high negative correlation] to 1.0 [high positive correlation]), were estimated using PROC FREQ and the PLCORR option to evaluate patterns of co-missingness among variables (missing vs. not missing). We used logistic regression models to estimate odds ratios (OR) and 95% confidence intervals (95% CI) to evaluate the association between socioeconomic factors and the odds of missing clinical data. For this analysis, missing data for all variables were coded as a discrete category so as to not be dropped from the models.
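The co-missingness calculation described above can be illustrated with a short Python sketch. This is not the PROC FREQ/PLCORR procedure used in the analysis: it substitutes the closed-form "cos-pi" approximation to the tetrachoric correlation for the maximum likelihood estimate, and the variable names and toy values are hypothetical.

```python
import math

def missing_indicator(values):
    """Return 1 where a value is missing (None) and 0 otherwise."""
    return [1 if v is None else 0 for v in values]

def tetrachoric_cos_pi(x, y):
    """Approximate the tetrachoric correlation of two binary lists.

    Uses r = cos(pi / (1 + sqrt(a*d / (b*c)))), where a and d count
    concordant pairs and b and c count discordant pairs. This is an
    approximation, not the estimator used by SAS.
    """
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    if b * c == 0:
        # An empty discordant cell pushes the estimate to the +1 boundary;
        # if a concordant cell is also empty the value is undefined.
        return 1.0 if a * d > 0 else float("nan")
    if a * d == 0:
        return -1.0
    return math.cos(math.pi / (1 + math.sqrt(a * d / (b * c))))

# Hypothetical enrollment values; None marks a missing response.
age = [14, None, 12, 15, None, 13, 16, 11]
sledai_2k = [4, None, 8, None, None, 2, 6, 10]

r_example = tetrachoric_cos_pi(missing_indicator(age), missing_indicator(sledai_2k))
```

In this toy example every participant missing age is also missing SLEDAI-2K, so one discordant cell is empty and the approximation returns its upper bound of 1.0.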
In our comparative study, we used linear regression models to test the association between socioeconomic factors and disease activity (SLEDAI-2K) and disease damage (SDI) for youths with cSLE at time of enrollment, adjusting for age, sex assigned at birth, disease duration at enrollment, time since first pediatric rheumatology visit, and study site. First, we used complete case analysis, excluding all participants missing any data.
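As a minimal sketch of the complete case step, the following Python snippet drops any record with a missing value on a model variable. The field names and records are hypothetical and serve only to show how listwise deletion shrinks the analytic sample.

```python
def complete_cases(records, variables):
    """Keep only records with an observed value for every listed variable."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]

# Hypothetical model variables and enrollment records; None marks missing data.
MODEL_VARIABLES = ["sledai_2k", "insurance", "age", "disease_duration"]

example_records = [
    {"sledai_2k": 4, "insurance": "Private", "age": 14.0, "disease_duration": 1.2},
    {"sledai_2k": None, "insurance": "Government-assisted", "age": 12.5, "disease_duration": 0.8},
    {"sledai_2k": 10, "insurance": None, "age": 16.1, "disease_duration": 2.0},
    {"sledai_2k": 2, "insurance": "None", "age": 15.3, "disease_duration": 0.5},
]

analytic_sample = complete_cases(example_records, MODEL_VARIABLES)
n_dropped = len(example_records) - len(analytic_sample)
```

Here two of four records are excluded because a single variable is missing, even though their remaining values are observed.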
Next, we used multiple imputation by chained equations (PROC MI with the FCS statement) to obtain 20 imputed datasets, imputing values for missing socioeconomic factors, covariates, and clinical assessments. The variables included in the imputation models were: SLEDAI-2K, SDI, race and ethnicity, sex, annual household income, parental/guardian educational attainment, insurance status, disease duration, and time between symptom onset and first visit with a pediatric rheumatologist, as well as SLICC diagnostic criteria as an auxiliary variable. The multiple imputation datasets were then used to obtain pooled estimates of socioeconomic impact on disease activity and severity using PROC MIANALYZE. A detailed description of the multiple imputation process and example code are included in the Supplemental Materials (Appendix A).
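Pooling across imputed datasets follows Rubin's rules, which the sketch below implements in plain Python. It is an illustration rather than the PROC MIANALYZE code from Appendix A, and the per-imputation coefficients and variances are hypothetical (three imputations are shown for brevity, where the analysis used 20).

```python
import math
import statistics

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error, degrees of freedom).
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    within = statistics.fmean(variances)     # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between   # total variance
    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = float("inf")                    # no between-imputation variability
    return q_bar, math.sqrt(total), df

# Hypothetical per-imputation results for a single regression coefficient.
betas = [0.05, 0.10, 0.06]
squared_ses = [0.78, 0.80, 0.79]

pooled_beta, pooled_se, pooled_df = pool_rubin(betas, squared_ses)
```

A confidence interval would then use a t quantile with `pooled_df` degrees of freedom, which is omitted here because the Python standard library does not provide one.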
Finally, we conducted a non-probabilistic bias analysis to evaluate the impact of potential differential missingness related to socioeconomic factors (16). Using fixed bias parameter analysis methods, we ran separate models in which we assigned missing socioeconomic factors to represent extreme high disadvantage or extreme low disadvantage and examined how estimates of disease activity and severity changed at these extremes. In extreme high disadvantage models, missing values were imputed to non-Hispanic Black/African American (selected based on the persistent impact of racial residential segregation and marginalization in the U.S. [31]), household income less than $25,000 annually, parental educational attainment less than high school, and no health insurance. In extreme low disadvantage models, missing values were imputed to non-Hispanic White, household income $150,000 and above annually, parental educational attainment of graduate school, and private health insurance. All statistical analyses were completed using SAS version 9.4 (SAS Institute, Inc., Cary, NC).
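The two extreme scenarios can be sketched in Python as a simple fill of missing socioeconomic values. The fill categories follow the description above; the field names and example record are hypothetical, and clinical values are deliberately left untouched because only socioeconomic factors were imputed in this analysis.

```python
# Fill values for the two scenarios, taken from the description above.
HIGH_DISADVANTAGE = {
    "race_ethnicity": "Non-Hispanic Black/African American",
    "income": "<$25,000",
    "education": "<High school",
    "insurance": "None",
}
LOW_DISADVANTAGE = {
    "race_ethnicity": "Non-Hispanic White",
    "income": ">$150,000",
    "education": "Graduate school",
    "insurance": "Private",
}

def impute_extreme(records, fill_values):
    """Return copies of the records with missing socioeconomic fields filled.

    Only keys present in fill_values are imputed; any other missing field
    (for example a clinical score) stays missing.
    """
    return [
        {k: (fill_values[k] if v is None and k in fill_values else v)
         for k, v in record.items()}
        for record in records
    ]

# Hypothetical participant with missing income, insurance, and SLEDAI-2K.
example = [{"race_ethnicity": "Hispanic", "income": None, "education": "College",
            "insurance": None, "sledai_2k": None}]

high_scenario = impute_extreme(example, HIGH_DISADVANTAGE)
low_scenario = impute_extreme(example, LOW_DISADVANTAGE)
```

Fitting the same regression model to each scenario gives a pair of estimates under the two extreme assumptions about who is missing data.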
RESULTS
Overall, we included 766 participants with cSLE in the analyses. Characteristics of the study cohort are described in Table 1. The mean age at enrollment into the CARRA Registry was 14.2 ± 3.0 years and approximately 86% were assigned female sex at birth. Furthermore, nearly 74% (n = 555) did not identify as non-Hispanic White, 35% of participants reported an annual household income of less than $50,000, and 56% of participants with known insurance status had non-private insurance or no insurance. There were 37 unique patterns of missing data with greater than 50% of participants missing data on at least one variable (Supplemental Figure 1). On average, participants were missing approximately 6.2% (range 0 – 70%) of the variables we investigated and 10% of participants (n = 78) were missing greater than 10% of included variables (Table 1). Missingness was observed across demographic, socioeconomic, and clinical variables with socioeconomic variables missing at the highest frequency (Figure 1). The variable with the greatest level of missing data was household income, with 35% of participants (n = 269) either selecting “Prefer not to answer” or not providing a response. The distributions of participants with known data across variables of interest were similar for most variables, except for household income, when comparing those who were missing 10% or fewer variables and those missing greater than 10% of variables (Table 1).
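The descriptive quantities reported above (percentage of variables missing per participant, share of participants missing at least one variable, and number of unique missingness patterns) can be computed as in the following Python sketch. The records and field names are hypothetical, and treating "Prefer not to answer" as missing is an assumption that mirrors how household income was counted in this paragraph.

```python
from collections import Counter

VARIABLES = ["income", "education", "insurance", "sledai_2k"]  # hypothetical subset

def is_missing(value):
    """Treat None and 'Prefer not to answer' as missing (an assumption)."""
    return value is None or value == "Prefer not to answer"

def percent_missing(record, variables):
    """Percentage of the listed variables that are missing for one participant."""
    return 100 * sum(is_missing(record.get(v)) for v in variables) / len(variables)

def missing_patterns(records, variables):
    """Count participants sharing each missingness pattern (1 = missing)."""
    return Counter(tuple(int(is_missing(r.get(v))) for v in variables) for r in records)

toy_records = [
    {"income": "<$25,000", "education": "College", "insurance": "Private", "sledai_2k": 4},
    {"income": "Prefer not to answer", "education": None, "insurance": "Private", "sledai_2k": 6},
    {"income": None, "education": None, "insurance": None, "sledai_2k": None},
    {"income": "$50,000-74,999", "education": "College", "insurance": "Government-assisted", "sledai_2k": 2},
]

per_person = [percent_missing(r, VARIABLES) for r in toy_records]
mean_percent_missing = sum(per_person) / len(per_person)
share_any_missing = sum(p > 0 for p in per_person) / len(per_person)
patterns = missing_patterns(toy_records, VARIABLES)
```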
Table 1.
Characteristics of CARRA Registry participants diagnosed with childhood-onset systemic lupus erythematosus by level of individual data missingness (N = 766)
| Variable | Overall, N (%) | ≤10% Missing (N = 688), N (%) | >10% Missing (N = 78), N (%) |
|---|---|---|---|
| Demographics | |||
| Sex assigned at birth | |||
| Male | 106 (14) | 97 (14) | 9 (11) |
| Female | 660 (86) | 591 (86) | 69 (89) |
| Gender | |||
| Male | 107 (14) | 98 (14) | 9 (12) |
| Female | 650 (86) | 586 (86) | 64 (88) |
| Other | 1 (0) | 1 (0) | 0 (0) |
| Missing | 8 | 3 | 5 |
| Age, mean (SD) | 14.2 (3.0) | 14.2 (3.0) | 14.2 (3.0) |
| Socioeconomic Factors | |||
| Race and ethnicity | |||
| Asian, Native Hawaiian, or Pacific Islander | 87 (12) | 80 (12) | 7 (11) |
| Hispanic | 196 (26) | 178 (26) | 18 (28) |
| Middle Eastern or North African | 9 (1) | 7 (1) | 2 (3) |
| Native American, American Indian, or Alaska Native | 7 (1) | 6 (1) | 1 (2) |
| Non-Hispanic Black, African American, African, or Afro-Caribbean | 210 (28) | 196 (29) | 14 (22) |
| Non-Hispanic White | 195 (26) | 180 (26) | 15 (24) |
| Race not previously mentioned | 13 (2) | 13 (2) | 0 (0) |
| Multiracial | 33 (4) | 27 (4) | 6 (10) |
| Missing | 16 | 1 | 15 |
| Household income | |||
| <$25,000 | 104 (16) | 98 (16) | 6 (11) |
| $25,000-49,999 | 122 (19) | 119 (20) | 3 (5) |
| $50,000-74,999 | 78 (12) | 78 (13) | 0 (0) |
| $75,000-99,999 | 59 (9) | 59 (10) | 0 (0) |
| $100,000-150,000 | 68 (10) | 67 (11) | 1 (2) |
| >$150,000 | 66 (10) | 61 (10) | 5 (9) |
| Prefer not to answer | 154 (24) | 114 (19) | 40 (73) |
| Missing | 115 | 92 | 23 |
| Parent/Guardian educational attainment | |||
| Less than high school | 68 (11) | 63 (11) | 5 (10) |
| High school or GED | 172 (27) | 159 (28) | 13 (26) |
| College | 279 (44) | 259 (45) | 20 (40) |
| Graduate School | 108 (17) | 96 (17) | 12 (24) |
| Missing | 139 | 111 | 28 |
| Insurance status | |||
| Private | 316 (44) | 293 (44) | 23 (41) |
| Government-assisted<sup>a</sup> | 318 (44) | 294 (44) | 24 (44) |
| Other, including non-US insurance | 72 (10) | 67 (10) | 5 (9) |
| No Insurance | 20 (3) | 18 (3) | 2 (4) |
| Missing | 40 | 16 | 24 |
| Clinical Factors | |||
| SLEDAI-2K, mean (SD) | 7.1 (7.3) | 7.0 (7.3) | 8.1 (7.7) |
| SDI, mean (SD) | 0.3 (0.8) | 0.3 (0.8) | 0.4 (1.0) |
| Lupus nephritis | 390 (51) | 353 (51) | 37 (48) |
| Age at symptom onset, mean (SD) | 12.7 (3.1) | 12.7 (3.1) | 12.8 (3.0) |
| Age at first pediatric rheumatology visit, mean (SD) | 13.0 (3.0) | 13.0 (3.0) | 13.0 (3.0) |
| Disease duration from symptom onset<sup>b</sup>, mean (SD) | 1.5 (1.8) | 1.5 (1.9) | 1.3 (1.5) |
Missing not otherwise noted: age (n=8); SLEDAI-2K (n=90); SDI (n=12); age at symptom onset (n=8); age at first pediatric rheumatology visit (n=8)
<sup>a</sup> Medicare, Medicaid, Military Health Care, State-specific Plan (non-Medicaid), Indian Health Services
<sup>b</sup> Disease duration defined as time from symptom onset to enrollment
GED: General Educational Development; SD: Standard deviation; SDI: SLICC damage index; SLEDAI-2K: Systemic lupus erythematosus disease activity index
Figure 1.

Patterns of missing data in the CARRA Registry, tetrachoric correlations (missing vs. not missing).
PR: Pediatric rheumatologist; SDI: SLICC damage index; SLEDAI-2K: Systemic lupus erythematosus disease activity index
Missing data were most highly correlated with variables of the same type (i.e., demographic, socioeconomic, or clinical). Figure 1 represents the correlation of missing data between variables. Missingness of SLEDAI-2K was perfectly correlated with missingness of age, age at symptom onset, and age at first visit with a pediatric rheumatologist (tetrachoric correlation coefficients = 1.0). Missingness of gender was also highly correlated with missingness of parental highest educational attainment (tetrachoric correlation coefficient = 0.82).
We next evaluated the characteristics associated with missing a SLEDAI-2K and/or SDI score. Reporting government-assisted health insurance was associated with approximately 2.0 times the odds of missing a clinical score compared to reporting private health insurance (OR = 2.04; 95% CI: 1.22, 3.41; Table 2). Furthermore, each one-year increase in age was associated with an 8% decrease in the odds of missing a clinical score (OR = 0.92; 95% CI: 0.85, 0.99). Race and ethnicity, household income, and parental educational attainment were not statistically significantly associated with missing SLEDAI-2K and/or SDI.
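To make the interpretation of these odds ratios explicit, the following Python sketch converts an odds ratio to a percent change in odds and compounds a per-year odds ratio over several years. The function names are illustrative, and the five-year comparison is a hypothetical extrapolation that assumes the per-year odds ratio is constant across ages.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0

def odds_ratio_over_units(odds_ratio_per_unit, units):
    """Odds ratio implied by a multi-unit difference in a continuous predictor."""
    return odds_ratio_per_unit ** units

age_change = percent_change_in_odds(0.92)        # per one-year increase in age
insurance_change = percent_change_in_odds(2.04)  # government-assisted vs. private
five_year_or = odds_ratio_over_units(0.92, 5)    # hypothetical five-year age difference
```

An odds ratio of 0.92 corresponds to roughly 8% lower odds per year of age, and an odds ratio of 2.04 to roughly 104% higher odds, that is, about twice the odds.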
Table 2.
Association between demographic and socioeconomic factors and the odds of missing Systemic Lupus Erythematosus Disease Activity Index (SLEDAI-2K) and/or SLICC Damage Index (SDI) at enrollment among participants with childhood-onset SLE enrolled in the CARRA Registry (N = 758)<sup>a</sup>
| Variable | OR (95% CI)<sup>b</sup> |
|---|---|
| Sex assigned at birth | |
| Male | Reference |
| Female | 1.98 (0.89, 4.42) |
| Gender | |
| Male | Reference |
| Female | 2.03 (0.91, 4.53) |
| Other | -- |
| Missing | -- |
| Age | 0.92 (0.85, 0.99) |
| Race and ethnicity | |
| Asian/NHPI | 0.52 (0.22, 1.26) |
| Hispanic | 0.86 (0.47, 1.55) |
| Non-Hispanic Black/African American | 0.63 (0.34, 1.18) |
| Non-Hispanic Other (NA/AN, Middle Eastern, North African, race not previously mentioned, multiracial) | 1.01 (0.44, 2.30) |
| Non-Hispanic White | Reference |
| Missing | 0.91 (0.19, 4.22) |
| Household income | |
| <$25,000 | 2.57 (0.91, 7.31) |
| $25,000-49,999 | 1.77 (0.61, 5.11) |
| $50,000-74,999 | 0.76 (0.21, 2.78) |
| $75,000-99,999 | 0.41 (0.08, 2.22) |
| $100,000-150,000 | 1.98 (0.63, 6.21) |
| >$150,000 | Reference |
| Prefer not to answer | 1.92 (0.69, 5.36) |
| Missing | 1.24 (0.40, 3.83) |
| Parental educational attainment | |
| <High school | 0.93 (0.41, 2.11) |
| High school graduate or GED | 0.62 (0.31, 1.23) |
| College | 0.57 (0.31, 1.08) |
| Graduate school | Reference |
| Missing | 0.40 (0.18, 0.91) |
| Insurance status | |
| Private | Reference |
| Government-assisted | 2.04 (1.22, 3.41) |
| Other, including non-US insurance | 0.88 (0.32, 2.39) |
| None | 1.19 (0.26, 5.45) |
| Missing | 2.53 (1.01, 6.34) |
<sup>a</sup> Excludes 8 participants missing age-related variables.
<sup>b</sup> Models run separately for each demographic or socioeconomic factor and adjusted for age (except age model), disease duration at enrollment (continuous), and time since first seen by a pediatric rheumatologist (continuous).
CI: Confidence interval; GED: General Educational Development; NA/AN: Native American or Alaska Native; NHPI: Native Hawaiian or Pacific Islander; OR: Odds ratio; SLE: Systemic lupus erythematosus
The sample distributions of characteristics for each method used to address missing data are described in Supplemental Table 1. As expected, the greatest variation in study samples was observed between the two samples used in the non-probabilistic bias analysis, in which missing values were imputed to the most and least disadvantaged groups. Results from the three statistical methods used to address missing data and the association between socioeconomic factors and the SLEDAI-2K at enrollment are reported in Table 3. In complete case analysis, 385 participants were dropped from analysis due to missing data, leaving an analytic sample of 381. Due to the nature of multiple imputation, each imputed dataset reflected data from all 766 participants. Finally, in non-probabilistic bias analyses, 90 participants were dropped due to missing data on model covariates (age, disease duration, and time since first seen by a pediatric rheumatologist) and SLEDAI-2K since only the socioeconomic factors were imputed for the purpose of this study.
Table 3.
Associations<sup>a</sup> between socioeconomic factors and SLEDAI-2K score at enrollment among participants with childhood-onset systemic lupus erythematosus enrolled in the CARRA Registry using different analytic methods to address missing data
| Variable | Complete case analysis (N = 381), β (95% CI) | Multiple imputation (20 imputations) (N = 766), mean β (95% CI) | Non-probabilistic bias analysis (N = 676): imputed most disadvantaged/marginalized group, β (95% CI) | Non-probabilistic bias analysis (N = 676): imputed least disadvantaged/marginalized group, β (95% CI) |
|---|---|---|---|---|
| Race and ethnicity | ||||
| Asian/NHPI | 2.33 (−0.80, 5.45) | 1.31 (−0.79, 2.90) | 1.79 (−0.25, 3.83) | 1.58 (−0.42, 3.59) |
| Hispanic | 0.60 (−1.79, 2.99) | 1.22 (−0.46, 2.90) | 1.48 (−0.18, 3.13) | 1.07 (−0.56, 2.69) |
| Non-Hispanic Black/African American | −0.58 (−2.80, 1.65) | 0.18 (−1.42, 1.78) | 0.25 (−1.28, 1.79) | 0.00 (−1.53, 1.54) |
| Non-Hispanic Other (NA/AN, Middle Eastern, North African, race not previously mentioned, multiracial) | 0.51 (−2.56, 3.58) | 2.00 (−0.31, 4.32) | 1.87 (−0.36, 4.10) | 1.91 (−0.31, 4.12) |
| Non-Hispanic White | Reference | Reference | Reference | Reference |
| Household income | ||||
| <$25,000 | 1.89 (−1.79, 5.56) | −0.50 (−3.46, 2.46) | −1.36 (−3.60, 0.89) | 0.88 (−0.93, 2.69) |
| $25,000-49,999 | 1.06 (−2.24, 4.37) | −0.29 (−2.97, 2.40) | −0.25 (−2.78, 2.27) | 1.08 (−0.62, 2.79) |
| $50,000-74,999 | 0.66 (−2.50, 3.82) | −0.62 (−3.11, 1.87) | −0.60 (−3.18, 1.97) | 0.58 (−1.31, 2.46) |
| $75,000-99,999 | 1.46 (−1.71, 4.63) | 0.14 (−2.46, 2.74) | 0.39 (−2.29, 3.06) | 1.37 (−0.73, 3.47) |
| $100,000-150,000 | 0.12 (−3.03, 3.27) | 0.06 (−2.49, 2.62) | 0.11 (−2.52, 2.74) | 1.13 (−0.93, 3.19) |
| >$150,000 | Reference | Reference | Reference | Reference |
| Parental educational attainment | ||||
| <High school | 1.03 (−2.75, 4.81) | 0.70 (−1.95, 3.34) | 0.46 (−1.59, 2.51) | 1.10 (−1.05, 3.25) |
| High school grad or GED | 1.28 (−1.45, 4.01) | 1.18 (−1.04, 3.40) | 1.06 (−0.98, 3.10) | 0.85 (−0.74, 2.45) |
| College | 0.95 (−1.41, 3.30) | 0.90 (−1.02, 2.82) | 0.89 (−0.95, 2.73) | 0.76 (−0.67, 2.18) |
| Graduate school | Reference | Reference | Reference | Reference |
| Insurance status | ||||
| Private | Reference | Reference | Reference | Reference |
| Government-assisted | −1.60 (−3.76, 0.56) | 0.07 (−1.66, 1.81) | 0.41 (−0.99, 1.81) | −0.22 (−1.58, 1.14) |
| Other, including non-US insurance | −1.09 (−5.81, 3.64) | −1.71 (−4.70, 1.27) | −1.59 (−4.49, 1.30) | −1.94 (−4.78, 0.90) |
| None | 4.91 (0.13, 9.70) | 3.81 (0.05, 7.57) | 2.69 (0.47, 4.92) | 3.58 (0.13, 7.04) |
<sup>a</sup> Models adjusted for age (continuous), sex assigned at birth (male, female, other), disease duration at enrollment (continuous), time since first seen by a pediatric rheumatologist (continuous), and study site (categorical).
GED: General Educational Development; NA/AN: Native American or Alaska Native; NHPI: Native Hawaiian or Pacific Islander; SDI: SLICC damage index; SE: Standard error; SLEDAI-2K: Systemic lupus erythematosus disease activity index
The estimated associations between socioeconomic factors and SLEDAI-2K scores differed across methods of addressing missing data, variously suggesting inverse, positive, or no association. For example, when compared to private health insurance, government-assisted health insurance was suggestive of an inverse association with SLEDAI-2K scores in complete case analysis (β = −1.60; 95% CI: −3.76, 0.56) after controlling for potential confounding due to age, sex assigned at birth, disease duration, time since first seen by a pediatric rheumatologist, and study site. Further, in similarly adjusted models using multiple imputation, there was no apparent association between government-assisted health insurance, compared to private health insurance, and SLEDAI-2K scores (β = 0.07; 95% CI: −1.66, 1.81). Finally, in non-probabilistic bias analyses, estimated SLEDAI-2K scores were slightly elevated when missing data were imputed to the most disadvantaged group (no health insurance; β = 0.41; 95% CI: −0.99, 1.81) and slightly decreased when missing data were imputed to the least disadvantaged group (private health insurance; β = −0.22; 95% CI: −1.58, 1.14). Conversely, we observed elevated SLEDAI-2K scores associated with having no health insurance, compared to private insurance, across all methods used to address missing data, ranging from 2.69 points higher (95% CI: 0.47, 4.92) when missing data were imputed to the most disadvantaged group to 4.91 points higher (95% CI: 0.13, 9.70) in complete case analysis. Similar levels of variation in estimates, either in direction or magnitude, were observed across the three methods to address missing data when assessing the association between socioeconomic factors and SDI at enrollment (Supplemental Table 2).
DISCUSSION
To our knowledge, this is the first study to systematically assess the impact of missing data on the interpretation of findings in a pediatric rheumatology research setting. Although missing data were most highly correlated among variables of the same type (i.e., demographic, socioeconomic, or clinical factors), missingness of clinical data was socioeconomically patterned such that the odds of missing SLEDAI-2K or SDI scores at enrollment were approximately twice as high among participants who reported government-assisted insurance compared to private insurance. Using multiple statistical methods to address missing data (i.e., complete case analysis, multiple imputation, and non-probabilistic bias analyses) and to evaluate the association between socioeconomic factors and enrollment SLEDAI-2K scores, we observed estimates that varied across methods in both direction and magnitude and that suggest conflicting conclusions: decreased disease activity, no difference in disease activity, or elevated disease activity at enrollment in relation to lower socioeconomic position. This variation illustrates how missing data and the methodology used to address missingness can influence results and the conclusions subsequently drawn from observational studies. Differences in estimated effects would not have been captured or considered in the interpretation of results if a single analytic method had been selected. These results support the use of multiple methods to evaluate the role of missing data in quantitative analyses and further sensitivity analyses to better identify the underlying relationship between the exposure(s) and outcome(s) of interest.
Similar to other registry studies, the CARRA Registry relies on longitudinal observational data collection, which makes it vulnerable to missing data due to loss to follow-up, missed study visits, and participant failure to complete visit questionnaires (32). Proactive measures that reduce the amount of missing data during the data collection phase can reduce the dependence on post hoc solutions in the analytic phase (33). For example, the prevalence of missing data in patient registries may be attributed to several factors, including training and staffing at registry sites. To ensure the completion of self-administered questionnaires, the medical staff may need to routinely check on and assist participants during data collection. The identification of missing data after a questionnaire is submitted may require the medical or registry staff to take additional steps to complete the missing variables (34), which may not be feasible at understaffed sites due to lack of time, resources, or training to monitor missing data. However, individuals who are experienced with data collection associated with patient registries may be more inclined to assist participants while filling out questionnaires, assess for missing data after the study visit, and subsequently follow-up with participants for questionnaire completion (34, 35).
To mitigate the impact of these concerns on data completeness in the CARRA Registry, the clinical and data coordinating center provides thorough training for Registry site staff, including best practices for administering questionnaires and capturing critical Registry data. This training is repeated periodically to account for ongoing site staff turnover. Additionally, the Registry’s database includes automated validation checks that flag fields with missing data for the site to review. These validation checks are especially useful for clinical data that may be readily available for extraction from the medical record. Despite these efforts, institutional barriers that contribute to incomplete data may persist or evolve over time, and as such, regular evaluation of ongoing data collection efforts is needed to limit data missingness.
In addition to institutional barriers that may hinder data collection, missing data can also occur when participants avoid questions that address sensitive topics or experience respondent fatigue and fail to complete the full questionnaire (36). Of the variables with missing data across the CARRA Registry, household income had the highest level of missingness. Participants may feel uncomfortable disclosing personal income information if they consider it sensitive (34). Further, accurately describing income may be challenging for participants with unstable income or employment, who may opt to defer or leave the item blank. It may be beneficial to inform patients about how their identity will be protected when their data are used and why sensitive information, such as income, is important to the research question. In lieu of individually reported income data, area-level data, such as the area deprivation index or the social vulnerability index, can be used to supplement missing socioeconomic data or corroborate reported values (37, 38).
When missing data persist despite these preventative measures, strategies for dealing with missing data vary, and best practices depend on study design, the distribution of missingness, and underlying assumptions about the structure of the data and the reasons contributing to data missingness (39, 40). Being intentional about the presence of missing data and using multiple methods to assess missing data can help to elucidate the impact of selection bias, confounding, and exposure misclassification on the interpretation of results (41) and facilitate critical assessment of potential causal statements that would otherwise have the potential to under- or overestimate existing health inequities.
Although the use of multiple imputation to address missing data in medical research and clinical trials has gained traction (42, 43), these methods have yet to become standard practice in epidemiologic studies in pediatric rheumatology. When looking at prior CARRA Registry research studies, we found that while many reported missing data, most used complete case analysis methods (44-7), and only a few conducted sensitivity analyses or used multiple imputation to address missing data and assess its impact on study results (11, 20). These trends are similar to those observed in studies published in key rheumatology journals (17). However, when data are not missing at random, complete case analysis can bias study results, and when missing data are socially patterned, they can introduce systematic biases with subsequent health equity implications. Many pediatric rheumatology researchers may not be aware of how different patterns of missing data may impact study results and their interpretation. We therefore recommend that 1) attempts be made to reduce the potential for missing data during the study planning and implementation phases, 2) trained statistical personnel be included in multidisciplinary research teams to help identify and address persisting issues related to missing data, 3) authors be transparent about their missing data by reporting missing values and the methods used to address the missingness, and 4) results be thoughtfully interpreted after considering the impact of any missing data and the methods used to mitigate their effect.
A primary limitation of this study relates to the creation of the analytic sample. We excluded participants classified as cSLE per SLICC diagnostic criteria who did not fulfill at least four ACR criteria for SLE or did not have renal involvement and fulfill two additional criteria (n=159) in order to support generalizability among youth diagnosed with cSLE. As a result, we do not capture potential missing data related to cSLE diagnosis. Furthermore, whereas the CARRA Registry includes thousands of variables, as well as additional complications related to the collection of prospective longitudinal data, in this proof-of-concept study we include a limited number of pertinent variables collected at enrollment to demonstrate the potential health equity implications of missing data. Additional analyses may be warranted to fully assess the impact of missing data across more variables. Further, the inclusion of additional variables that are correlated with missingness of our variables of interest could potentially improve the quality of the imputation models (45).
The methods presented in this study address missing data within the analytic sample, which is also limited to participants who consent to participate in the CARRA Registry. They do not address the overall representativeness of the sample with respect to youths with cSLE who are not captured in the CARRA Registry because they are not approached for research, are not treated at a CARRA Registry site, or are not seen by a pediatric rheumatologist due to workforce shortages. Although there are methods to help address potential selection bias (e.g., probability weights), dedicated research is needed to elucidate which populations may be underrepresented relative to the underlying population of interest.
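As a rough illustration of how probability weights operate, the following Python sketch weights each participant by the inverse of an assumed probability of being included in the sample, so that groups less likely to be captured count proportionally more. The inclusion probabilities, group labels, and outcome values are invented; in practice these probabilities are unknown and would have to be estimated from external data, which is precisely the gap noted above.

```python
def weighted_mean(values, weights):
    """Mean of values, each weighted by the corresponding weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Invented example: (group, outcome) for six sampled participants.
sample = [("A", 2), ("A", 4), ("A", 2), ("A", 4), ("B", 8), ("B", 10)]

# Assumed probability that a member of each group is captured in the sample.
# Group B is half as likely to be included as group A (hypothetical values).
inclusion_probability = {"A": 0.5, "B": 0.25}

outcomes = [outcome for _, outcome in sample]
weights = [1 / inclusion_probability[group] for group, _ in sample]

unweighted = sum(outcomes) / len(outcomes)   # 5.0
weighted = weighted_mean(outcomes, weights)  # 6.0

print(unweighted, weighted)
```

Here the weighted mean moves toward the values of the under-captured group, showing how an unweighted summary can understate outcomes in populations that are less likely to be enrolled.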
Finally, the reason for data missingness is unknown (e.g., data missing at random, differences in research workflows between sites, or participant refusal to answer specific questions). Further assessment of these reasons could help to characterize reporting biases and their subsequent health equity implications. Targeted interventions to improve reporting among groups where missing data are demonstrably higher may be warranted. Despite these limitations, this study sheds light on the potential impacts of missing data in pediatric rheumatology research and their role in how results may be interpreted.
Every health registry and corresponding database has missing data. Previous studies have relied heavily on complete case analysis to test study hypotheses; however, these practices can misrepresent existing biases. In this study, we demonstrate the wide variation in results that may occur when using different statistical methods to address missing data. Our findings highlight the need in pediatric rheumatology research for 1) careful review of study data to identify patterns of missingness and 2) selection of appropriate methodology to handle missing data in order to ensure that misleading findings do not exacerbate existing health inequities.
Supplementary Material
Significance and Innovations.
Health disparities in SLE contribute to poor outcomes in historically marginalized populations. Patient registries may have data that are missing in patterns that can unintentionally obscure important findings related to health disparities in childhood-onset SLE (cSLE).
Our study is the first to use multiple analytic methods to assess the effects of missing data on disease measures in epidemiologic research in cSLE.
Using data from the Childhood Arthritis and Rheumatology Research Alliance (CARRA) Registry, we demonstrate how socioeconomically-patterned missing data and the use of varying analytic methods to handle missing data can impact sample size and effect estimates; researchers should integrate these practices in the interpretation of study results to ensure that findings across demographics are not misrepresented.
ACKNOWLEDGEMENTS
This work could not have been accomplished without the aid of the following organizations: The NIH’s National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) & the Arthritis Foundation. We would like to thank Drs. Paige Bommarito and Peter Grayson for review of this manuscript and Yolanda L. Jones, NIH Library, for manuscript editing assistance. Finally, the authors thank all the CARRA Registry participants, without whom this work would not be possible, as well as the following CARRA Registry site principal investigators, sub-investigators and research coordinators:
N. Abel, K. Abulaban, A. Adams, M. Adams, R. Agbayani, J. Aiello, S. Akoghlanian, C. Alejandro, E. Allenspach, R. Alperin, M. Alpizar, G. Amarilyo, W. Ambler, E. Anderson, S. Ardoin, S. Armendariz, E. Baker, I. Balboni, S. Balevic, L. Ballenger, S. Ballinger, N. Balmuri, F. Barbar-Smiley, L. Barillas-Arias, M. Basiaga, K. Baszis, M. Becker, H. Bell-Brunson, E. Beltz, H. Benham, S. Benseler, W. Bernal, T. Beukelman, T. Bigley, B. Binstadt, C. Black, M. Blakley, J. Bohnsack, J. Boland, A. Boneparth, S. Bowman, C. Bracaglia, E. Brooks, M. Brothers, A. Brown, H. Brunner, M. Buckley, M. Buckley, H. Bukulmez, D. Bullock, B. Cameron, S. Canna, L. Cannon, P. Carper, V. Cartwright, E. Cassidy, L. Cerracchio, E. Chalom, J. Chang, A. Chang-Hoftman, V. Chauhan, P. Chira, T. Chinn, K. Chundru, H. Clairman, D. Co, A. Confair, H. Conlon, R. Connor, A. Cooper, J. Cooper, S. Cooper, C. Correll, R. Corvalan, D. Costanzo, R. Cron, L. Curiel-Duran, T. Curington, M. Curry, A. Dalrymple, A. Davis, C. Davis, C. Davis, T. Davis, F. De Benedetti, D. De Ranieri, J. Dean, F. Dedeoglu, M. DeGuzman, N. Delnay, V. Dempsey, E. DeSantis, T. Dickson, J. Dingle, B. Donaldson, E. Dorsey, S. Dover, J. Dowling, J. Drew, K. Driest, Q. Du, K. Duarte, D. Durkee, E. Duverger, J. Dvergsten, A. Eberhard, M. Eckert, K. Ede, B. Edelheit, C. Edens, C. Edens, Y. Edgerly, M. Elder, B. Ervin, S. Fadrhonc, C. Failing, D. Fair, M. Falcon, L. Favier, S. Federici, B. Feldman, J. Fennell, I. Ferguson, P. Ferguson, B. Ferreira, R. Ferrucho, K. Fields, T. Finkel, M. Fitzgerald, C. Fleming, O. Flynn, L. Fogel, E. Fox, M. Fox, L. Franco, M. Freeman, K. Fritz, S. Froese, R. Fuhlbrigge, J. Fuller, N. George, K. Gerhold, D. Gerstbacher, M. Gilbert, M. Gillispie-Taylor, E. Giverc, C. Godiwala, I. Goh, H. Goheer, D. Goldsmith, E. Gotschlich, A. Gotte, B. Gottlieb, C. Gracia, T. Graham, S. Grevich, T. Griffin, J. Griswold, A. Grom, M. Guevara, P. Guittar, M. Guzman, M. Hager, T. Hahn, O. Halyabar, E. Hammelev, M. Hance, A. Hanson, L. Harel, S. Haro, J. Harris, O. Harry, E. Hartigan, J. Hausmann, A. Hay, K. Hayward, J. Heiart, K. Hekl, L. Henderson, M. Henrickson, A. Hersh, K. Hickey, P. Hill, S. Hillyer, L. Hiraki, M. Hiskey, P. Hobday, C. Hoffart, M. Holland, M. Hollander, S. Hong, M. Horwitz, J. Hsu, A. Huber, J. Huggins, J. Hui-Yuen, C. Hung, J. Huntington, A. Huttenlocher, M. Ibarra, L. Imundo, C. Inman, A. Insalaco, A. Jackson, S. Jackson, K. James, G. Janow, J. Jaquith, S. Jared, N. Johnson, J. Jones, J. Jones, J. Jones, K. Jones, S. Jones, S. Joshi, L. Jung, C. Justice, A. Justiniano, N. Karan, K. Kaufman, A. Kemp, E. Kessler, U. Khalsa, B. Kienzle, S. Kim, Y. Kimura, D. Kingsbury, M. Kitcharoensakkul, T. Klausmeier, K. Klein, M. Klein-Gitelman, B. Kompelien, A. Kosikowski, L. Kovalick, J. Kracker, S. Kramer, C. Kremer, J. Lai, J. Lam, B. Lang, S. Lapidus, B. Lapin, A. Lasky, D. Latham, E. Lawson, R. Laxer, P. Lee, P. Lee, T. Lee, L. Lentini, M. Lerman, D. Levy, S. Li, S. Lieberman, L. Lim, C. Lin, N. Ling, M. Lingis, M. Lo, D. Lovell, D. Lowman, N. Luca, S. Lvovich, C. Madison, J. Madison, S. Magni Manzoni, B. Malla, J. Maller, M. Malloy, M. Mannion, C. Manos, L. Marques, A. Martyniuk, T. Mason, S. Mathus, L. McAllister, K. McCarthy, K. McConnell, E. McCormick, D. McCurdy, P. McCurdy Stokes, S. McGuire, I. McHale, A. McMonagle, C. McMullen-Jackson, E. Meidan, E. Mellins, E. Mendoza, R. Mercado, A. Merritt, L. Michalowski, P. Miettunen, M. Miller, D. Milojevic, E. Mirizio, E. Misajon, M. Mitchell, R. Modica, S. Mohan, K. Moore, L. Moorthy, S. Morgan, E. Morgan Dewitt, C. Moss, T. Moussa, V. Mruk, A. Murphy, E. Muscal, R. Nadler, B. Nahal, K. Nanda, N. Nasah, L. Nassi, S. Nativ, M. Natter, J. Neely, B. Nelson, L. Newhall, L. Ng, J. Nicholas, R. Nicolai, P. Nigrovic, J. Nocton, B. Nolan, E. Oberle, B. Obispo, B. O'Brien, T. O'Brien, O. Okeke, M. Oliver, J. Olson, K. O'Neil, K. Onel, A. Orandi, M. Orlando, S. Osei-Onomah, R. Oz, E. Pagano, A. Paller, N. Pan, S. Panupattanapong, M. Pardeo, J. Paredes, A. Parsons, J. Patel, K. Pentakota, P. Pepmueller, T. Pfeiffer, K. Phillippi, D. Pires Marafon, K. Phillippi, L. Ponder, R. Pooni, S. Prahalad, S. Pratt, S. Protopapas, B. Puplava, J. Quach, M. Quinlan-Waters, C. Rabinovich, S. Radhakrishna, J. Rafko, J. Raisian, A. Rakestraw, C. Ramirez, E. Ramsay, S. Ramsey, R. Randell, A. Reed, A. Reed, A. Reed, H. Reid, K. Remmel, A. Repp, A. Reyes, A. Richmond, M. Riebschleger, S. Ringold, M. Riordan, M. Riskalla, M. Ritter, R. Rivas-Chacon, A. Robinson, E. Rodela, M. Rodriquez, K. Rojas, T. Ronis, M. Rosenkranz, B. Rosolowski, H. Rothermel, D. Rothman, E. Roth-Wojcicki, K. Rouster-Stevens, T. Rubinstein, N. Ruth, N. Saad, S. Sabbagh, E. Sacco, R. Sadun, C. Sandborg, A. Sanni, L. Santiago, A. Sarkissian, S. Savani, L. Scalzi, L. Schanberg, S. Scharnhorst, K. Schikler, A. Schlefman, H. Schmeling, K. Schmidt, E. Schmitt, R. Schneider, K. Schollaert-Fitch, G. Schulert, T. Seay, C. Seper, J. Shalen, R. Sheets, A. Shelly, S. Shenoi, K. Shergill, J. Shirley, M. Shishov, C. Shivers, E. Silverman, N. Singer, V. Sivaraman, J. Sletten, A. Smith, C. Smith, J. Smith, J. Smith, E. Smitherman, J. Soep, M. Son, S. Spence, L. Spiegel, J. Spitznagle, R. Sran, H. Srinivasalu, H. Stapp, K. Steigerwald, Y. Sterba Rakovchik, S. Stern, A. Stevens, B. Stevens, R. Stevenson, K. Stewart, C. Stingl, J. Stokes, M. Stoll, E. Stringer, S. Sule, J. Sumner, R. Sundel, M. Sutter, R. Syed, G. Syverson, A. Szymanski, S. Taber, R. Tal, A. Tambralli, A. Taneja, T. Tanner, S. Tapani, G. Tarshish, S. Tarvin, L. Tate, A. Taxter, J. Taylor, M. Terry, M. Tesher, A. Thatayatikom, B. Thomas, K. Tiffany, T. Ting, A. Tipp, D. Toib, K. Torok, C. Toruner, H. Tory, M. Toth, S. Tse, V. Tubwell, M. Twilt, S. Uriguen, T. Valcarcel, H. Van Mater, L. Vannoy, C. Varghese, N. Vasquez, K. Vazzana, R. Vehe, K. Veiga, J. Velez, J. Verbsky, G. Vilar, N. Volpe, E. von Scheven, S. Vora, J. Wagner, L. Wagner-Weiner, D. Wahezi, H. Waite, J. Walker, H. Walters, T. Wampler Muskardin, L. Waqar, M. Waterfield, M. Watson, A. Watts, P. Weiser, J. Weiss, P. Weiss, E. Wershba, A. White, C. Williams, A. Wise, J. Woo, L. Woolnough, T. Wright, E. Wu, A. Yalcindag, M. Yee, E. Yen, R. Yeung, K. Yomogida, Q. Yu, R. Zapata, A. Zartoshti, A. Zeft, R. Zeft, Y. Zhang, Y. Zhao, A. Zhu, C. Zic.
Footnotes
Financial Support: This work was supported in part by the Intramural Research Programs at the National Institutes of Health (NIH), National Institute of Environmental Health Sciences and the NIH, National Institute of Arthritis and Musculoskeletal and Skin Diseases.
REFERENCES
- 1. Hiraki LT, Benseler SM, Tyrrell PN, Harvey E, Hebert D, Silverman ED. Ethnic differences in pediatric systemic lupus erythematosus. J Rheumatol. 2009;36(11):2539–46.
- 2. Carter EE, Barr SG, Clarke AE. The global burden of SLE: prevalence, health disparities and socioeconomic impact. Nat Rev Rheumatol. 2016;12(10):605–20.
- 3. Falasinnu T, Chaichian Y, Bass MB, Simard JF. The Representation of Gender and Race/Ethnic Groups in Randomized Clinical Trials of Individuals with Systemic Lupus Erythematosus. Curr Rheumatol Rep. 2018;20(4):20.
- 4. AlE'ed A, Vega-Fernandez P, Muscal E, Hinze CH, Tucker LB, Appenzeller S, et al. Challenges of Diagnosing Cognitive Dysfunction With Neuropsychiatric Systemic Lupus Erythematosus in Childhood. Arthritis Care Res (Hoboken). 2017;69(10):1449–59.
- 5. Ardoin SP, Daly RP, Merzoug L, Tse K, Ardalan K, Arkin L, et al. Research priorities in childhood-onset lupus: results of a multidisciplinary prioritization exercise. Pediatr Rheumatol Online J. 2019;17(1):32.
- 6. Chang JC, Mandell DS, Knight AM. High Health Care Utilization Preceding Diagnosis of Systemic Lupus Erythematosus in Youth. Arthritis Care Res (Hoboken). 2018;70(9):1303–11.
- 7. Driest KD, Sturm MS, O'Brien SH, Spencer CH, Stanek JR, Ardoin SP. Factors associated with thrombosis in pediatric patients with systemic lupus erythematosus. Lupus. 2016;25(7):749–53.
- 8. Hersh AO, Case SM, Son MB. Predictors of disability in a childhood-onset systemic lupus erythematosus cohort: results from the CARRA Legacy Registry. Lupus. 2018;27(3):494–500.
- 9. Knight AM, Trupin L, Katz P, Yelin E, Lawson EF. Depression Risk in Young Adults With Juvenile- and Adult-Onset Lupus: Twelve Years of Followup. Arthritis Care Res (Hoboken). 2018;70(3):475–80.
- 10. Knight AM, Vickery ME, Muscal E, Davis AM, Harris JG, Soybilgic A, et al. Identifying Targets for Improving Mental Healthcare of Adolescents with Systemic Lupus Erythematosus: Perspectives from Pediatric Rheumatology Clinicians in the United States and Canada. J Rheumatol. 2016;43(6):1136–45.
- 11. Rubinstein TB, Mowrey WB, Ilowite NT, Wahezi DM. Delays to Care in Pediatric Lupus Patients: Data From the Childhood Arthritis and Rheumatology Research Alliance Legacy Registry. Arthritis Care Res (Hoboken). 2018;70(3):420–7.
- 12. Schanberg LE, Sandborg C, Barnhart HX, Ardoin SP, Yow E, Evans GW, et al. Use of atorvastatin in systemic lupus erythematosus in children and adolescents. Arthritis Rheum. 2012;64(1):285–96.
- 13. Wahezi DM, Ilowite NT, Wu XX, Pelkmans L, Laat B, Schanberg LE, et al. Annexin A5 anticoagulant activity in children with systemic lupus erythematosus and the association with antibodies to domain I of β2-glycoprotein I. Lupus. 2013;22(7):702–11.
- 14. Yen EY, Shaheen M, Woo JMP, Mercer N, Li N, McCurdy DK, et al. 46-Year Trends in Systemic Lupus Erythematosus Mortality in the United States, 1968 to 2013: A Nationwide Population-Based Study. Ann Intern Med. 2017;167(11):777–85.
- 15. Johnson CY, Howards PP, Strickland MJ, Waller DK, Flanders WD. Multiple bias analysis using logistic regression: an example from the National Birth Defects Prevention Study. Ann Epidemiol. 2018;28(8):510–4.
- 16. Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data: Springer; 2009.
- 17. Lauper K, Kedra J, De Wit M, Fautrel B, Frisell T, Hyrich KL, et al. Analysing and reporting of observational data: a systematic review informing the EULAR points to consider when analysing and reporting comparative effectiveness research with observational data in rheumatology. RMD Open. 2021;7(3):e001818.
- 18. Balmuri N, Soulsby WD, Cooley V, Gerber L, Lawson E, Goodman S, et al. Community poverty level influences time to first pediatric rheumatology appointment in Polyarticular Juvenile Idiopathic Arthritis. Pediatr Rheumatol Online J. 2021;19(1):122.
- 19. Ilowite NT, Prather K, Lokhnygina Y, Schanberg LE, Elder M, Milojevic D, et al. Randomized, double-blind, placebo-controlled trial of the efficacy and safety of rilonacept in the treatment of systemic juvenile idiopathic arthritis. Arthritis Rheumatol. 2014;66(9):2570–9.
- 20. Kimura Y, Schanberg LE, Tomlinson GA, Riordan ME, Dennos AC, Del Gaizo V, et al. Optimizing the Start Time of Biologics in Polyarticular Juvenile Idiopathic Arthritis: A Comparative Effectiveness Study of Childhood Arthritis and Rheumatology Research Alliance Consensus Treatment Plans. Arthritis Rheumatol. 2021;73(10):1898–909.
- 21. Liu K, Tomlinson G, Reed AM, Huber AM, Saarela O, Bout-Tabaku SM, et al. Pilot Study of the Juvenile Dermatomyositis Consensus Treatment Plans: A CARRA Registry Study. J Rheumatol. 2021;48(1):114–22.
- 22. Ong MS, Ringold S, Kimura Y, Schanberg LE, Tomlinson GA, Natter MD. Improved Disease Course Associated With Early Initiation of Biologics in Polyarticular Juvenile Idiopathic Arthritis: Trajectory Analysis of a Childhood Arthritis and Rheumatology Research Alliance Consensus Treatment Plans Study. Arthritis Rheumatol. 2021;73(10):1910–20.
- 23. Phillippi K, Hoeltzel M, Byun Robinson A, Kim S. Race, Income, and Disease Outcomes in Juvenile Dermatomyositis. J Pediatr. 2017;184:38–44.e1.
- 24. Weitzman ER, Wisk LE, Salimian PK, Magane KM, Dedeoglu F, Hersh AO, et al. Adding patient-reported outcomes to a multisite registry to quantify quality of life and experiences of disease and treatment for youth with juvenile idiopathic arthritis. J Patient Rep Outcomes. 2018;2.
- 25. Beukelman T, Kimura Y, Ilowite NT, Mieszkalski K, Natter MD, Burrell G, et al. The new Childhood Arthritis and Rheumatology Research Alliance (CARRA) registry: design, rationale, and characteristics of patients enrolled in the first 12 months. Pediatr Rheumatol Online J. 2017;15(1):30.
- 26. Tan EM, Cohen AS, Fries JF, Masi AT, McShane DJ, Rothfield NF, et al. The 1982 revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1982;25(11):1271–7.
- 27. Adkins-Jackson PB, Chantarat T, Bailey ZD, Ponce NA. Measuring structural racism: a guide for epidemiologists and other health researchers. American Journal of Epidemiology. 2021.
- 28. Bonilla-Silva E. Rethinking racism: Toward a structural interpretation. American Sociological Review. 1997:465–80.
- 29. Smedley A, Smedley BD. Race as biology is fiction, racism as a social problem is real: Anthropological and historical perspectives on the social construction of race. American Psychologist. 2005;60(1):16.
- 30. Williams DR, Priest N, Anderson NB. Understanding associations among race, socioeconomic status, and health: Patterns and prospects. Health Psychol. 2016;35(4):407–11.
- 31. Williams DR, Collins C. Racial residential segregation: a fundamental cause of racial disparities in health. Public Health Rep. 2001;116(5):404–16.
- 32. Ayilara OF, Zhang L, Sajobi TT, Sawatzky R, Bohm E, Lix LM. Impact of missing data on bias and precision when estimating change in patient-reported outcomes from a clinical registry. Health and Quality of Life Outcomes. 2019;17(1).
- 33. Scharfstein DO, Hogan J, Herman A. On the prevention and analysis of missing data in randomized clinical trials: the state of the art. J Bone Joint Surg Am. 2012;94 Suppl 1(Suppl 1):80–4.
- 34. Kyte D, Ives J, Draper H, Calvert M. Current practices in patient-reported outcome (PRO) data collection in clinical trials: a cross-sectional survey of UK trial staff and management. BMJ Open. 2016;6(10):e012281.
- 35. Palmer MJ, Krupa T, Richardson H, Brundage MD. Clinical research associates experience with missing patient reported outcomes data in cancer randomized controlled trials. Cancer Med. 2021;10(9):3026–34.
- 36. Olson K, Smyth JD, Ganshert A. The Effects of Respondent and Question Characteristics on Respondent Answering Behaviors in Telephone Interviews. Journal of Survey Statistics and Methodology. 2018;7(2):275–308.
- 37. Kind AJH, Buckingham WR. Making Neighborhood-Disadvantage Metrics Accessible — The Neighborhood Atlas. New England Journal of Medicine. 2018;378(26):2456–8.
- 38. Flanagan BE, Gregory EW, Hallisey EJ, Heitgerd JL, Lewis B. A social vulnerability index for disaster management. Journal of Homeland Security and Emergency Management. 2011;8(1).
- 39. Graham JW. Missing data analysis: making it work in the real world. Annu Rev Psychol. 2009;60:549–76.
- 40. Raghunathan TE. What do we do with missing data? Some options for analysis of incomplete data. Annu Rev Public Health. 2004;25:99–117.
- 41. van Smeden M, Penning de Vries BBL, Nab L, Groenwold RHH. Approaches to addressing missing values, measurement error, and confounding in epidemiologic studies. J Clin Epidemiol. 2021;131:89–100.
- 42. Mackinnon A. The use and reporting of multiple imputation in medical research - a review. J Intern Med. 2010;268(6):586–93.
- 43. Alsaber A, Al-Herz A, Pan J, Al-Sultan AT, Mishra D. Handling missing data in a rheumatoid arthritis registry using random forest approach. International Journal of Rheumatic Diseases. 2021;24(10):1282–93.
- 44. Vazzana KM, Daga A, Goilav B, Ogbu EA, Okamura DM, Park C, et al. Principles of pediatric lupus nephritis in a prospective contemporary multi-center cohort. Lupus. 2021;30(10):1660–70.
- 45. Enders CK. Applied missing data analysis: Guilford Press; 2010.
