Abstract
Neighborhood socioeconomic deprivation has been associated with health behaviors and outcomes. However, neighborhood socioeconomic status has been measured inconsistently across studies. It remains unclear whether appropriate socioeconomic indicators vary across geographic areas and geographic levels. The aim of this study is to compare a composite socioeconomic index with six socioeconomic indicators reflecting different aspects of the socioeconomic environment, by both geographic area and level. Using 2000 U.S. Census data, we performed a multivariate common factor analysis to identify significant socioeconomic resources and constructed 12 composite indexes at the county, census tract, and block group levels for the nation and for three states. We assessed the agreement between the composite indexes and single socioeconomic variables. The components of the composite index varied across geographic areas. Within a given geographic region, the components of the composite index were similar at the census tract and block group levels but differed from those at the county level. The percentage of the population below the federal poverty line was a significant contributor to the composite index regardless of geographic area and level. Compared with non-component socioeconomic indicators, component variables agreed more closely with the composite index. Based on these findings, we conclude that a composite index is a better measure of neighborhood socioeconomic deprivation than a single indicator, and that it should be constructed on an area- and unit-specific basis to accurately identify and quantify small-area socioeconomic inequalities over a specific study region.
Keywords: assessment, neighborhood, socioeconomic, deprivation, spatial epidemiology
INTRODUCTION
Health-related behaviors and outcomes display significant geographic variations. Neighborhood socioeconomic status (SES) has been associated with health-related behaviors [1–4], the incidence [5–7] and poor prognosis [8] of diseases, and premature mortality [5, 9–12]. Population-based data sources from local and federal governments (e.g., the U.S. Census) provide a number of SES-related data elements and are commonly used to assess the role of neighborhood SES in health behaviors and outcomes. However, there is no consensus on which neighborhood measures, at which geographic level, should be used to examine socioeconomic disparities in health behaviors and outcomes. Neighborhood SES has been defined inconsistently across studies, which may contribute to inconsistent findings regarding the relationships between neighborhood SES and health behaviors and outcomes [13]. Various single SES indicators at different geographic levels (e.g., county, census tract, block group) have been used as neighborhood SES measures. It remains unclear which SES indicators are appropriate for a specific geographic region at a specific geographic level.
Neighborhood SES is a complex concept consisting of multiple aspects of socioeconomic resources. A variety of single-variable measures makes it possible to develop a composite index to comprehensively assess neighborhood SES environment. We propose that, compared with single-variable measures, a composite index can more accurately reflect neighborhood deprivation by capturing more dimensions of socioeconomic resources.
In this study, we use 2000 U.S. Census data to identify individual socioeconomic variables that significantly reflect socioeconomic deprivation across four geographic areas at three geographic levels. We compare the resulting composite indexes with six socioeconomic indicators reflecting different aspects of socioeconomic deprivation.
METHODS
Data source
U.S. Census data have been widely applied to assess neighborhood socioeconomic context. For the 2000 census and before, the Census Bureau collected population and housing data from all households and socioeconomic data from about one in six households every ten years at a single point in time. Since 2006, this information has been collected continuously by the American Community Survey (ACS), which samples households each year; only the cumulative five-year ACS approximates the sample proportion achieved by the decennial census. Considering the ACS margins of error for small areas, we used 2000 U.S. Census data for the socioeconomic information of geographic areas. Ethical review was not required because only public-use, area-level Census data were used.
Single SES variables
To capture broad aspects of socioeconomic deprivation, we selected 21 Census variables at three geographic levels (county, census tract, and block group) based on the literature [5, 10, 14–16] (Table 1). These variables reflect neighborhood socioeconomic deprivation in six domains: 1) education (the percentage of the population without a high school education); 2) occupation (the percentage of the population in the working class, the percentage of the civilian labor force unemployed); 3) housing conditions (the percentage of households that rent, the percentage of vacant households, the percentage of households with at least one person per room, the percentage of female-headed households with dependent children, the percentage of households on public assistance, the percentage of households with no car, the percentage of households with no phone, the percentage of occupied households with incomplete plumbing, the percentage of households with no kitchen); 4) income and poverty (income disparity, the percentage of households with low income, the percentage of households below the federal poverty line, the percentage of the population below the federal poverty line); 5) racial composition (the percentage of non-Hispanic African Americans, the percentage of the Hispanic population, the percentage of the population that is foreign-born); and 6) residential stability (the percentage of residents aged 65 or older, the percentage of persons living in the same house for at least five years). To examine the influence of geographic size, we performed the analysis for the nation and for three states (California, Georgia, and Louisiana) that have different socioeconomic characteristics and participate in the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute.
Table 1.
Variable | County I a | County II b | County III c | County IV d | Tract I | Tract II | Tract III | Tract IV | Block Group I | Block Group II | Block Group III | Block Group IV |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Education | ||||||||||||
% Population with less than high school | * | * | * | |||||||||
Occupation | ||||||||||||
% Population with working class | * | * | * | * | * | * | * | * | * | |||
% Civilian labor force unemployed | * | * | * | * | * | * | * | * | * | |||
Housing Conditions | ||||||||||||
% HH e rent | * | * | * | |||||||||
% vacant HH | * | * | ||||||||||
% HH with >=1 person/room | * | * | * | * | ||||||||
% female headed HH with dependent children | * | * | * | * | * | * | ||||||
% HH on public assistance | * | * | * | * | * | * | * | * | * | |||
% HH with no car | * | * | * | * | * | * | ||||||
% HH with no phone | * | * | * | * | ||||||||
% Occupied HH with incomplete plumbing | ||||||||||||
% HH with no kitchen | ||||||||||||
Income & Poverty | ||||||||||||
Income disparity | * | * | * | * | * | * | * | * | * | * | * | |
% HH with low income | * | * | * | * | * | * | * | * | * | * | * | |
% HH below federal poverty line | * | * | * | * | * | * | * | * | * | * | * | |
% Population below federal poverty line | * | * | * | * | * | * | * | * | * | * | * | * |
Racial Composition | ||||||||||||
% non-Hispanic African Americans | * | * | * | * | * | * | ||||||
% Hispanic population | * | * | ||||||||||
% foreign born | * | * | ||||||||||
Residential Stability | ||||||||||||
% residents aged 65 or older | ||||||||||||
% persons with the same house >=5 years | ||||||||||||
% total variance explained | 33.0 | 31.6 | 47.8 | 43.6 | 40.7 | 47.2 | 44.6 | 47.1 | 36.5 | 41.4 | 39.8 | 40.4 |
Cronbach’s Alpha | 0.94 | 0.95 | 0.92 | 0.95 | 0.91 | 0.96 | 0.96 | 0.96 | 0.88 | 0.94 | 0.94 | 0.94 |
a I: the nation;
b II: California;
c III: Georgia;
d IV: Louisiana;
e HH: household;
*: variables selected for constructing the composite index.
Statistical Analysis
Development of neighborhood socioeconomic deprivation index
Using a multivariate common factor analysis with varimax rotation, we examined the internal structure of the Census variables and identified their importance. We selected the common factor that accounted for the largest share of the total variance of all variables. A variable was selected to construct a composite index if its factor loading on the selected common factor was 1) no less than 0.5; 2) the largest among its factor loadings across all common factors; and 3) at least 0.1 larger than its second-largest factor loading across all common factors. A composite index was constructed by summing all selected variables after they were standardized and weighted by their factor scoring coefficients. Cronbach’s alpha was used to evaluate the internal consistency of the selected variables, with larger values indicating greater internal consistency. A total of 12 composite indexes were developed independently, one for each of the four geographic areas at each of the three geographic levels.
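The selection rules and index construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SAS procedure used in the study: the function names, the dictionary-based data layout, the comparison of raw (signed) loadings, and standardization by the population standard deviation are all hypothetical choices, and the loadings and scoring coefficients are assumed to come from a factor model fitted elsewhere.

```python
from statistics import mean, pstdev, variance


def select_variables(loadings, factor=0, min_loading=0.5, min_gap=0.1):
    """Return variables that meet the three loading criteria on `factor`.

    loadings: dict mapping variable name -> list of loadings, one per factor.
    Assumption: raw loadings are compared; absolute values are a common variant.
    """
    selected = []
    for name, row in loadings.items():
        target = row[factor]
        others = [v for j, v in enumerate(row) if j != factor]
        largest_other = max(others) if others else float("-inf")
        if (target >= min_loading                      # criterion 1
                and target > largest_other             # criterion 2
                and target - largest_other >= min_gap):  # criterion 3
            selected.append(name)
    return selected


def composite_index(data, weights):
    """Sum of z-standardized variables weighted by scoring coefficients.

    data: dict name -> list of area-level values (one per area).
    weights: dict name -> factor scoring coefficient (hypothetical input).
    Assumption: each variable varies across areas (non-zero standard deviation).
    """
    n_areas = len(next(iter(data.values())))
    index = [0.0] * n_areas
    for name, weight in weights.items():
        values = data[name]
        mu, sd = mean(values), pstdev(values)
        for i, x in enumerate(values):
            index[i] += weight * (x - mu) / sd
    return index


def cronbach_alpha(columns):
    """Cronbach's alpha for a list of item columns of equal length."""
    k = len(columns)
    totals = [sum(values) for values in zip(*columns)]
    item_variance = sum(variance(col) for col in columns)
    return k / (k - 1) * (1 - item_variance / variance(totals))
```

In this sketch a variable whose loading on the chosen factor is high but only marginally above its loading on another factor is excluded by the third criterion, which is what keeps cross-loading variables out of the index.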
Examination of the agreements
To compare a composite index with single socioeconomic indicators, we selected six commonly used variables, one from each of the six domains described above: the percentage of the population without a high school education, the percentage of the civilian labor force unemployed, the percentage of households on public assistance, the percentage of the population below the federal poverty line, the percentage of non-Hispanic African Americans, and the percentage of residents aged 65 or older. Because Census variables may have skewed distributions, we categorized the composite index and the six single indicators into quintiles according to their distributions. This categorization is widely applied in epidemiological studies to assess the effects of environmental exposures on health behaviors and outcomes. We examined agreement among these seven variables by computing a weighted kappa coefficient for each pair [17]. Based on previous literature [18], the degree of agreement was classified into six categories: 0 (no agreement, κ<0), 1 (slight agreement, κ=0.01–0.20), 2 (fair agreement, κ=0.21–0.40), 3 (moderate agreement, κ=0.41–0.60), 4 (substantial agreement, κ=0.61–0.80), and 5 (perfect agreement, κ>0.80). Data management and analysis were performed in SAS (version 9.3, SAS Institute Inc., Cary, North Carolina).
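The agreement procedure described above (quintile categorization, a weighted kappa for each pair, and mapping to an agreement category) can be sketched as follows. This is an illustrative sketch rather than the SAS code used in the study. Several details are assumptions because the text does not state them: linear (absolute-difference) agreement weights, rank-based quintiles with ties broken by input order, and the treatment of κ values that fall between the published category ranges (for example, exactly 0). All function names are hypothetical.

```python
def quintile(values):
    """Assign each value a quintile from 1 to 5 by rank.

    Assumption: ties are broken by input order, which may differ from
    the tie handling of a statistical package.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0] * n
    for rank, i in enumerate(order):
        result[i] = rank * 5 // n + 1
    return result


def weighted_kappa(a, b, k=5):
    """Weighted kappa for two lists of categories coded 1..k.

    Assumption: linear agreement weights w_ij = 1 - |i - j| / (k - 1).
    """
    n = len(a)
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x - 1][y - 1] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    p_obs = p_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = 1 - abs(i - j) / (k - 1)
            p_obs += w * observed[i][j]
            p_exp += w * row[i] * col[j]
    return (p_obs - p_exp) / (1 - p_exp)


def agreement_category(kappa):
    """Map kappa to the 0-5 agreement categories used in the text.

    Assumption: each category includes its upper bound, and kappa <= 0
    is treated as category 0 (no agreement).
    """
    for category, upper in enumerate((0.0, 0.20, 0.40, 0.60, 0.80)):
        if kappa <= upper:
            return category
    return 5
```

Applied to two area-level variables, the pipeline would be `agreement_category(weighted_kappa(quintile(x), quintile(y)))`, which yields one cell of a pairwise agreement table.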
RESULTS
Table 1 shows the component structure of the 12 geographic area- and level-specific composite SES indexes. The components of the composite index varied across the geographic areas examined. The component variables selected for each of the 12 composite indexes accounted for a large proportion of the overall variance of all Census variables (ranging from 31.6% to 47.8%) and had high internal consistency (Cronbach’s alpha ranging from 0.88 to 0.96). Within a given geographic region, the components of the composite index were similar at the census tract and block group levels but differed from those at the county level. The percentage of the population below the federal poverty line was consistently selected for the composite index, regardless of geographic area and level. In contrast, the residential stability domain did not contribute significantly to the composite index in any geographic area or at any level.
The percentage of the population without a high school education and the percentage of households on public assistance were components of the composite index for each of the three states, regardless of geographic level, but not for the nation. The percentage of non-Hispanic African Americans was a significant contributor to the composite index in Georgia and Louisiana, states with a relatively high proportion of African American residents.
At the census tract level, the composite indexes showed moderate-to-substantial agreement with their components and no-to-moderate agreement with non-component variables (Table 2). Across the nation, the composite index showed substantial agreement (κ category 4) with its component variable (the percentage of the population below the federal poverty line) and slight-to-moderate agreement (κ categories 1 to 3) with non-component variables. This difference in agreement with component versus non-component variables was also observed in the three states. The percentage of the population below the federal poverty line showed no-to-substantial agreement with the other socioeconomic indicators (κ categories 0 to 4).
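Table 2.
Variable | Area | PNH a | PNE b | PPA c | PPV d | PAA e | POD f |
---|---|---|---|---|---|---|---|
Socioeconomic Deprivation Index | Nation | 3 | 3 | 3 | 4 | 1 | 1 |
Socioeconomic Deprivation Index | California | 4 | 3 | 4 | 4 | 1 | 0 |
Socioeconomic Deprivation Index | Georgia | 3 | 4 | 4 | 4 | 4 | 1 |
Socioeconomic Deprivation Index | Louisiana | 2 | 4 | 4 | 4 | 4 | 1 |
% Population with less than high school | Nation | | 2 | 3 | 3 | 1 | 1 |
% Population with less than high school | California | | 3 | 3 | 3 | 1 | 0 |
% Population with less than high school | Georgia | | 2 | 3 | 3 | 1 | 2 |
% Population with less than high school | Louisiana | | 2 | 2 | 3 | 2 | 1 |
% Civilian labor force unemployed | Nation | | | 3 | 3 | 2 | 0 |
% Civilian labor force unemployed | California | | | 3 | 3 | 1 | 0 |
% Civilian labor force unemployed | Georgia | | | 3 | 3 | 3 | 1 |
% Civilian labor force unemployed | Louisiana | | | 3 | 4 | 3 | 1 |
% Household on public assistance | Nation | | | | 3 | 1 | 0 |
% Household on public assistance | California | | | | 3 | 2 | 0 |
% Household on public assistance | Georgia | | | | 3 | 3 | 1 |
% Household on public assistance | Louisiana | | | | 3 | 3 | 1 |
% Population below federal poverty line | Nation | | | | | 2 | 0 |
% Population below federal poverty line | California | | | | | 1 | 0 |
% Population below federal poverty line | Georgia | | | | | 3 | 2 |
% Population below federal poverty line | Louisiana | | | | | 3 | 1 |
% non-Hispanic African Americans | Nation | | | | | | 0 |
% non-Hispanic African Americans | California | | | | | | 0 |
% non-Hispanic African Americans | Georgia | | | | | | 1 |
% non-Hispanic African Americans | Louisiana | | | | | | 0 |
a PNH: % Population with less than high school;
b PNE: % Civilian labor force unemployed;
c PPA: % household on public assistance;
d PPV: % Population below federal poverty line;
e PAA: % non-Hispanic African Americans;
f POD: % residents aged 65 or older.
Agreement categories: 0: no agreement; 1: slight agreement; 2: fair agreement; 3: moderate agreement; 4: substantial agreement; and 5: perfect agreement.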
DISCUSSION
Neighborhood SES has been widely used to assess socioeconomic gradients and inequalities in a variety of health behaviors and outcomes [1–12]. However, there is no consensus on the definition of neighborhood SES, and thus various socioeconomic variables have been used across studies. This may explain, at least in part, the inconsistent results on the role of neighborhood SES in health behaviors and outcomes [13].
Using a uniform set of U.S. Census variables, we compared a composite index with six commonly used socioeconomic indicators from different socioeconomic deprivation domains. The results showed that the socioeconomic resources contributing substantially to neighborhood SES varied across target regions and geographic units. A composite index was not identical to single SES indicators and was more representative of neighborhood SES because it captured broad dimensions of SES resources.
Therefore, geographic area- and level-specific SES indicators should be used to define SES for the study area. In studies examining the role of general neighborhood SES in health behaviors and outcomes, a composite index is a better measure of neighborhood SES than single SES indicators. When assessing the role of a specific SES indicator, such as poverty, it is necessary to examine whether that indicator substantially reflects the overall SES environment of the studied geographic region at the chosen geographic level. Otherwise, the selected SES indicator may not be generalizable to the overall neighborhood SES environment. In this study, we compared the composite SES index only with six commonly used Census variables from different socioeconomic domains. Further research may be necessary to compare the neighborhood SES deprivation index with other variables or indexes of interest. Nevertheless, our findings suggest that the method used to assess the neighborhood SES environment deserves more attention. Researchers should examine the specific characteristics of the SES environment in their own study regions to design an appropriate strategy for assessing neighborhood SES, instead of simply selecting SES variables used in previous literature.
Because of the margins of error in the ACS data, we used 2000 Census data, which may be of limited use for recently initiated studies. However, historical data sources can be useful for prospective studies initiated at an earlier time point. The history of neighborhood exposures and their changes over time should be integrated into advanced statistical modeling to control for spatial uncertainty due to time-varying exposures and confounders, so that neighborhood effects on health behaviors and outcomes can be estimated without bias. In addition, the main purpose of this study is to address the strategy for assessing the small-area neighborhood socioeconomic environment by comparing different socioeconomic variables with a composite index and examining their degree of agreement using a uniform and reliable data source. A previous study indicated that selecting different socioeconomic indicators can lead to inconsistent findings [13]; therefore, researchers need to select an appropriate approach to accurately assess the neighborhood SES environment.
In conclusion, geographic area- and unit-specific SES measures should be applied to identify and quantify socioeconomic inequalities in health behaviors and outcomes. A multivariate factor analysis with an appropriate rotation method is a useful approach for identifying region- and geographic unit-specific SES indicators and constructing a composite index. The SES resources of the specific geographic area, along with the research question, should be taken into account when selecting a composite index or single indicators as an SES measure.
Acknowledgments
This work was supported in part by a career development award (K07 CA178331) and a research award (R21 CA169807) from the National Cancer Institute at the National Institutes of Health, and a research award (R01 AA021492) from the National Institute on Alcohol Abuse and Alcoholism at the National Institutes of Health. In addition, Y. L. is supported by the Barnes-Jewish Hospital Foundation, St. Louis, Missouri and the Breast Cancer Research Foundation. We also acknowledge the use of the Health Behavior, Communication and Outreach Core, part of a cancer center grant (P30 CA091842) funded by the National Cancer Institute at the National Institutes of Health.
Footnotes
No conflicts of interest were declared.
References
1. Dailey AB, Kasl SV, Holford TR, Calvocoressi L, Jones BA. Neighborhood-level socioeconomic predictors of nonadherence to mammography screening guidelines. Cancer Epidemiol Biomarkers Prev. 2007;16:2293–303. doi: 10.1158/1055-9965.epi-06-1076. Epub 2007/11/17.
2. Shishehbor MH, Gordon-Larsen P, Kiefe CI, Litaker D. Association of neighborhood socioeconomic status with physical fitness in healthy young adults: the Coronary Artery Risk Development in Young Adults (CARDIA) study. Am Heart J. 2008;155:699–705. doi: 10.1016/j.ahj.2007.07.055. Epub 2008/03/29.
3. Mathur C, Erickson DJ, Stigler MH, Forster JL, Finnegan JR Jr. Individual and neighborhood socioeconomic status effects on adolescent smoking: a multilevel cohort-sequential latent growth analysis. Am J Public Health. 2013;103:543–8. doi: 10.2105/ajph.2012.300830. Epub 2013/01/19.
4. Cohen SS, Sonderman JS, Mumma MT, Signorello LB, Blot WJ. Individual and neighborhood-level socioeconomic characteristics in relation to smoking prevalence among black and white adults in the Southeastern United States: a cross-sectional study. BMC Public Health. 2011;11:877. doi: 10.1186/1471-2458-11-877. Epub 2011/11/23.
5. Krieger N, Chen JT, Waterman PD, Soobader MJ, Subramanian SV, Carson R. Geocoding and monitoring of US socioeconomic inequalities in mortality and cancer incidence: does the choice of area-based measure and geographic level matter?: the Public Health Disparities Geocoding Project. Am J Epidemiol. 2002;156:471–82. doi: 10.1093/aje/kwf068.
6. Kim D, Masyn KE, Kawachi I, Laden F, Colditz GA. Neighborhood socioeconomic status and behavioral pathways to risks of colon and rectal cancer in women. Cancer. 2010;116:4187–96. doi: 10.1002/cncr.25195. Epub 2010/06/15.
7. Palmer JR, Boggs DA, Wise LA, Adams-Campbell LL, Rosenberg L. Individual and neighborhood socioeconomic status in relation to breast cancer incidence in African-American women. Am J Epidemiol. 2012;176:1141–6. doi: 10.1093/aje/kws211. Epub 2012/11/23.
8. Gerber Y, Koton S, Goldbourt U, Myers V, Benyamini Y, Tanne D, et al. Poor neighborhood socioeconomic status and risk of ischemic stroke after myocardial infarction. Epidemiology. 2011;22:162–9. doi: 10.1097/EDE.0b013e31820463a3. Epub 2010/12/07.
9. Bosma H, van de Mheen HD, Borsboom GJ, Mackenbach JP. Neighborhood socioeconomic status and all-cause mortality. Am J Epidemiol. 2001;153:363–71. doi: 10.1093/aje/153.4.363. Epub 2001/02/24.
10. Singh GK. Area deprivation and widening inequalities in US mortality, 1969–1998. Am J Public Health. 2003;93:1137–43. doi: 10.2105/ajph.93.7.1137.
11. Doubeni CA, Schootman M, Major JM, Stone RA, Laiyemo AO, Park Y, et al. Health status, neighborhood socioeconomic context, and premature mortality in the United States: The National Institutes of Health-AARP Diet and Health Study. Am J Public Health. 2012;102:680–8. doi: 10.2105/ajph.2011.300158. Epub 2011/08/20.
12. Foraker RE, Patel MD, Whitsel EA, Suchindran CM, Heiss G, Rose KM. Neighborhood socioeconomic disparities and 1-year case fatality after incident myocardial infarction: the Atherosclerosis Risk in Communities (ARIC) Community Surveillance 1992–2002. Am Heart J. 2013;165:102–7. doi: 10.1016/j.ahj.2012.10.022. Epub 2012/12/15.
13. Zhang-Salomons J, Qian H, Holowaty E, Mackillop WJ. Associations between socioeconomic status and cancer survival: choice of SES indicator may affect results. Ann Epidemiol. 2006;16:521–8. doi: 10.1016/j.annepidem.2005.10.002. Epub 2006/01/03.
14. Diez-Roux AV, Kiefe CI, Jacobs DR Jr, Haan M, Jackson SA, Nieto FJ, et al. Area characteristics and individual-level socioeconomic position indicators in three population-based epidemiologic studies. Ann Epidemiol. 2001;11:395–405. doi: 10.1016/s1047-2797(01)00221-6.
15. Messer LC, Laraia BA, Kaufman JS, Eyster J, Holzman C, Culhane J, et al. The development of a standardized neighborhood deprivation index. J Urban Health. 2006;83:1041–62. doi: 10.1007/s11524-006-9094-x.
16. Lian M, Schootman M, Doubeni CA, Park Y, Major JM, Torres Stone RA, et al. Geographic Variation in Colorectal Cancer Survival and the Role of Small-Area Socioeconomic Deprivation: A Multilevel Survival Analysis of the NIH-AARP Diet and Health Study Cohort. Am J Epidemiol. 2011;174:828–38. doi: 10.1093/aje/kwr162.
17. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990;43:543–9. doi: 10.1016/0895-4356(90)90158-l.
18. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.