2023 Jan 6;6(1):e2249804. doi: 10.1001/jamanetworkopen.2022.49804

Table. Associations Between 30 Socioeconomic and Demographic Features and Claims Database Sampling Fraction at the Zip Code Level, Accounting for State-Level Variation in Sampling Fraction in 2 Models.

Characteristic	%		P value
Characteristic	Partial correlation coefficient^a	Multivariable regression coefficient (SD)^b	P value
Population
Total population (millions)	0.01	−0.0005 (0.00018)	<.001
Log₁₀ pop density (1/square mile)	0.14	0.67 (0.06)	<.001
Female sex	0.12	0.050 (0.010)	<.001
Race and ethnicity
Asian (non-Hispanic)	0.16	−0.010 (0.004)	<.001
Black (non-Hispanic)	−0.15	−0.008 (0.002)	<.001
Hispanic	−0.19	0.001 (0.004)	.64
White (non-Hispanic)	0.20	[Reference]	NA
Other^c	−0.08	−0.026 (0.008)	<.001
Age, y
<18	−0.09	[Reference]	NA
18-40	−0.21	−0.019 (0.010)	<.001
40-60	0.32	0.084 (0.014)	<.001
60-80	0.14	0.002 (0.012)	.71
>80	0.12	0.055 (0.024)	<.001
Household income, $
<15 000	−0.34	[Reference]	NA
15 000-30 000	−0.40	−0.016 (0.014)	.01
30 000-45 000	−0.37	−0.009 (0.012)	.16
45 000-60 000	−0.23	0.002 (0.014)	.78
60 000-100 000	0.10	0.007 (0.010)	.15
100 000-125 000	0.33	0.033 (0.018)	<.001
125 000-200 000	0.43	0.016 (0.012)	.01
>200 000	0.42	0.071 (0.014)	<.001
Work and insurance
Unemployed	−0.29	−0.001 (0.014)	.89
No health insurance	0.31	−0.030 (0.008)	<.001
Education
Less than high school	−0.33	[Reference]	NA
High school	−0.27	0.038 (0.01)	<.001
Some college	−0.09	−0.010 (0.008)	.01
College	0.41	0.071 (0.010)	<.001
Graduate	0.30	−0.021 (0.010)	<.001
Housing
Houses that are owner occupied	0.25	−0.003 (0.004)	.14
Median house price (millions), $	0.36	−0.00001 (0.00003)	.43

Abbreviation: NA, not applicable.

^a The first set of models considers each covariate of interest separately, along with state-level fixed effects. Partial correlation coefficients derived from this model are presented; positive correlations indicate that zip codes with higher values of the covariate of interest are associated with higher zip-code level sampling in the claims database, even after adjusting for state-level clustering in sampling.

^b The second model is a full multivariable model that includes all 29 covariates of interest in addition to state-level fixed effects. For example, for a 10-percentage point increase in a zip code’s fraction of households earning greater than $200 000, the model suggests the claims database sampling fraction will increase by 0.6 percentage points, on average.

^c Other race and ethnicity includes persons identifying as non-Hispanic American Indian and/or Alaska Native, non-Hispanic Native Hawaiian and Other Pacific Islander, non-Hispanic other races, and 2 or more races.
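
The two model types described in footnotes a and b can be illustrated with a minimal sketch. This is not the authors' code: it assumes that state-level fixed effects are absorbed by subtracting each state's mean from every variable (numerically equivalent to including one indicator per state), and it runs on synthetic data. All function names, variable names, and the effect sizes used to generate the data are hypothetical and chosen only for illustration.

```python
import numpy as np


def demean_by_group(values, groups):
    """Subtract each group's mean, which absorbs group (state) fixed effects."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out


def partial_correlation(y, x, groups):
    """Correlation of y and x after removing group means from both (footnote a)."""
    y_res = demean_by_group(y, groups)
    x_res = demean_by_group(x, groups)
    return float(np.corrcoef(x_res, y_res)[0, 1])


def fixed_effects_coefficients(y, X, groups):
    """OLS slopes of y on all columns of X with group fixed effects (footnote b)."""
    y_res = demean_by_group(y, groups)
    X_res = demean_by_group(X, groups)
    coef, *_ = np.linalg.lstsq(X_res, y_res, rcond=None)
    return coef


# Synthetic zip code-level data (made up; not the study data).
rng = np.random.default_rng(0)
n_states, zips_per_state = 5, 200
state = np.repeat(np.arange(n_states), zips_per_state)
state_effect = rng.normal(0, 5, n_states)[state]      # state-level variation
pct_high_income = rng.uniform(0, 30, state.size)      # % of households, hypothetical
pct_uninsured = rng.uniform(0, 25, state.size)        # % of residents, hypothetical
noise = rng.normal(0, 0.5, state.size)
sampling_fraction = (
    10 + state_effect + 0.07 * pct_high_income - 0.03 * pct_uninsured + noise
)

# Model type 1: one covariate at a time, adjusted for state.
r = partial_correlation(sampling_fraction, pct_high_income, state)

# Model type 2: all covariates jointly, adjusted for state.
X = np.column_stack([pct_high_income, pct_uninsured])
beta = fixed_effects_coefficients(sampling_fraction, X, state)

print(f"partial correlation (high income): {r:.2f}")
print(f"multivariable coefficients: {np.round(beta, 3)}")
```

Under this setup, a coefficient is read as the expected change in sampling fraction (in percentage points) per 1-percentage point change in the covariate, holding the other covariates and the state constant.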