Abstract
Research on neighborhoods and health increasingly acknowledges the need to conceptualize, measure, and model spatial features of social and physical environments. In ignoring underlying spatial dynamics, we run the risk of biased statistical inference and misleading results. In this paper, we propose an integrated multilevel-spatial approach for Poisson models of discrete responses. In an empirical example of child mortality in 1880 Newark, New Jersey, we compare this multilevel-spatial approach with the more typical aspatial multilevel approach. Results indicate that spatially-defined egocentric neighborhoods, or distance-based measures, outperform administrative areal units, such as census units. In addition, although results did not vary by specific definitions of egocentric neighborhoods, they were sensitive to geographic scale and modeling strategy. Overall, our findings confirm that adopting a spatial-multilevel approach enhances our ability to disentangle the effect of space from that of place, and point to the need for more careful spatial thinking in population research on neighborhoods and health.
Keywords: spatial, multilevel, egocentric neighborhood, child mortality, neighborhood effects
Introduction
Interest in the role of place and space in shaping people’s health experiences and outcomes has grown steadily since the 1990s. Sociologists, demographers, and epidemiologists have conceptualized place as a source of area-level health access or exposure (Arcaya et al. 2012) such as neighborhood socioeconomic resources (Montgomery and Hewett 2005), social capital (Sampson et al. 1999), environmental hazard (Downey 2003), food quality (Xu et al. 2012), and health and family planning policy (Short et al. 2000; Xu and Short 2011). This “place” perspective often lacks careful “spatial thinking” (Logan 2012) and ignores both the important role of spatial process in affecting health and the spatial pattern of individual-level risks. Methodologically speaking, the place approach typically employs multilevel regression models that take into account within-neighborhood correlations but fail to adjust for between-neighborhood correlations resulting from, for example, spatial diffusion or contagion. The place approach also often adopts administratively defined areal units as proxies for neighborhoods, and thus is subject to potentially serious measurement errors (Flowerdew et al. 2008; Guo and Bhat 2007) and the notorious modifiable areal unit problem (Openshaw 1984).
In contrast, social geographers and spatial epidemiologists have highlighted the spatial pattern of health risks and inequalities (Chaix 2009; Elliott and Wartenberg 2004; Thornton and Olson 2011). The “space” perspective emphasizes the roles of relative locations and geographic proximity in structuring contagious processes of diseases and health-related knowledge and behaviors. The space approach pays close attention to incorporating geographic information to improve measurements of health-related environmental exposures (Frank et al. 2004). Taking advantage of new developments in spatial statistics, it allows researchers to infer spatial patterns of health. Nevertheless, a pure spatial approach runs the risk of ignoring place effects – the social reality and health impacts of living in shared neighborhoods (Arcaya et al. 2012). Unfortunately, researchers have rarely combined the strengths from these two approaches to investigate place and space effects on health (for exceptions see Chaix et al. 2005a; Chaix et al. 2005b).
The aspatial nature of previous neighborhood research has derived in part from the lack of appropriate spatial data. With the increasing availability of micro-level spatial data, and the concurrent advancement of geographic information system (GIS) and spatial statistical techniques, researchers are better equipped to consider spatial dimensions in the investigation of space and place effects, making new lines of inquiry possible. Despite progress in addressing concepts like scale, proximity, and neighborhood boundaries in the recent literature, inquiry is limited by lack of systematic knowledge about the relative strengths and weaknesses of an integrated multilevel-spatial approach as compared to a standard aspatial multilevel approach. It is further complicated by the challenges of implementing a multilevel-spatial strategy in empirical research.
We propose and illustrate empirically an integrated approach to study neighborhood and spatial variations in individuals’ health. This approach combines distance-based egocentric measures of health exposures with multilevel-spatial regression models. We simultaneously model neighborhood effects and spatial dynamics that transcend neighborhood boundaries. We also extend the multilevel-spatial model to Poisson models of discrete responses. We compare this multilevel-spatial approach with a conventional aspatial multilevel approach based on administrative boundaries by presenting an empirical example. Drawing on high-resolution data geocoded to the home address for all residents living in 1880 Newark, New Jersey, we investigate the relationship between ethnic and class composition of neighborhoods and child mortality. Our systematic evaluation demonstrates the advantage of a spatially informed approach in conceptualization, measurement, and statistical modeling, and shows how a spatially informed analysis adds insight to studies of neighborhood and health.
Neighborhood Representation and Measures
The Modifiable Areal Unit Problem (MAUP)
Due to easy access to data, existing studies of neighborhood effects on health often rely heavily on administratively defined units, such as census geography, electoral districts, and postal sectors and ZIP codes, to measure neighborhood characteristics. Because administrative units are far from suitable approximations of real life neighborhoods, this approach has been criticized for ignoring important spatial features of underlying neighborhood processes (Lee et al. 2008; Reardon and O’Sullivan 2004) and hence fails to handle, among others, the MAUP (Openshaw 1984).
The MAUP arises when researchers measure contextual characteristics by aggregating data from a small geographic scale to represent a unit of analysis at a larger scale. Analytic results are sensitive to the choice of data aggregation level because the place effects under investigation may occur and change on multiple scales. In a study of residential choice, Guo and Bhat (2007) detected different racial clusters when the level of data aggregation was shifted from census blocks to block-groups, and then to tracts. In contrast, consistent clusters were identified when using more spatially informed neighborhood representations that did not rely on census geographies. This difference may be partly attributed to the fact that multiple driving forces behind the same demographic phenomenon such as residential segregation operate at different scales, ranging from local to national and even global (Kaplan and Holloway 2001). Spielman and Yoo (2009) simulated an artificial city populated by individuals who were exposed to person-specific contextual effects aggregated from neighbors living within a circular buffer area centered at each focal individual’s location (i.e. egocentric neighborhoods). They found that when the measured scale of neighborhood environment did not match the true scale of spatial process, the estimates of neighborhood effects were not only biased but also less efficient.
Researchers are often aware of the MAUP but unable to address it because they lack appropriate data and better techniques. For instance, when analyzing the relation between residential segregation and infant mortality, Hearst et al. (2008) were skeptical that city-level segregation aggregated from census tract-level data could accurately reflect individual experience and provide valid regression estimations. Lee et al. (2008) found that some metropolitan areas were more segregated at small geographic scales (e.g., within a 500-meter radius) than at larger scales (e.g., within a 4000-meter radius), while others exhibited a reverse pattern. Accordingly, they advocated constructing scale-dependent contextual measures and applying caution in matching the scale of underlying social process with that of measurement.
Neighborhood Boundary
The MAUP can be exacerbated when administratively defined areal boundaries do not coincide with those of everyday life experience. The provision of health resources may be restricted within the boundary of public health areal units (Boyle and Willms 1999), but everyday social and physical interactions among residents and the resulting contextual effects on health outcomes are rarely confined within the census tracts or ZIP codes where they are geocoded (Fang et al. 1998; Jacquez and Greiling 2003). Not knowing where the real-life neighborhood boundaries lie, researchers rely heavily on administrative boundaries, which may in turn hinder discoveries of underlying neighborhood effects. For example, only after aggregating multiple ZIP codes to a single more racially coherent unit did Fang et al. (1998) find that residential segregation accounted for about 70% of the total variation in mortality risks in New York City.
Fang et al.’s (1998) study also illustrates the so-called grid problem (Taeuber and Taeuber 1965). The grid problem is that values of contextual measures can vary by how a study area is divided into subareas (e.g. ZIP codes or aggregation of multiple ZIP codes), especially when such measures are made aspatially without any adjustment for the geographic arrangements of the areal units. Whenever the boundaries of artificial subareas shift, contextual measures will produce different results.
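The grid problem can be made concrete with a toy calculation. The sketch below is illustrative only: the eight residents, their group labels, and the two ways of cutting the street into subareas are hypothetical, and a two-group normalized entropy stands in for whatever contextual measure a study might use. The same residents yield a different value as soon as the subarea boundaries shift.

```python
import math
from collections import Counter

def entropy(groups, n_groups=2):
    """Normalized entropy (0 to 1) of the group labels in one subarea."""
    counts = Counter(groups)
    n = len(groups)
    return -sum((c / n) * math.log(c / n, n_groups) for c in counts.values())

def weighted_mean_entropy(residents, cuts):
    """Population-weighted mean entropy over subareas defined by cut points."""
    edges = [0] + list(cuts) + [len(residents)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        block = residents[lo:hi]
        total += len(block) * entropy(block)
    return total / len(residents)

# Hypothetical street: four residents of group A followed by four of group B.
street = ["A", "A", "A", "A", "B", "B", "B", "B"]

grid_1 = weighted_mean_entropy(street, cuts=[4])     # boundary falls between the groups
grid_2 = weighted_mean_entropy(street, cuts=[2, 6])  # boundaries shifted by two residents

print(grid_1, grid_2)
```

Under the first grid every subarea is homogeneous and the measure is zero; under the second, the middle subarea mixes both groups and the measure rises to one half, although no resident has moved.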
Spatial Proximity
Spatial proximity is also inadequately measured when researchers rely on administrative units. Contextual measures based on administrative units implicitly assume fixed boundaries within which human interactions are contained, irrespective of actual social boundaries. Socioeconomic or demographic processes in one neighborhood are likely to spill over across the boundary into adjacent neighborhoods. Thus, spatial variations may exist as a function of the distance between two neighborhoods as stated in Tobler’s First Law of Geography (1970: 236), “Everything is related to everything else, but near things are more related than distant things.” Proximity-based spatial variation in neighborhood effects has been noted in several studies. For example, Sampson et al. (1999) found that not only did children gain from living in a neighborhood (operationalized as a cluster of census tracts) with a high level of collective efficacy, but they also benefited from living close to such a neighborhood. Using the same data, Morenoff (2003) found that proximity to a neighborhood with a high degree of social exchange and voluntarism was protective against low birth weight, net of the social environment in one’s own neighborhood. Crowder and South (2008) found that racial conditions in areas surrounding a householder’s neighborhood of residence (approximated by census tracts) had an independent impact on the likelihood of out-migration among whites.
Egocentric Neighborhood
For health outcomes, it is theoretically important to consider individuals’ proximity to one another in local space (Chaix 2009; Frank et al. 2004). Researchers have proposed the concept of egocentric neighborhood to measure contextual characteristics that are specific to individuals at different locations (Kramer et al. 2010; Matthews 2011; Reardon and O’Sullivan 2004). Under this framework, an individual’s exposure to the local environment reflects a proximity-weighted average composition of each surrounding point within a certain distance bandwidth centered at his or her residence. Local environment is particularly important for considering children’s health (as in this study) because children have limited daily activity space around the home and hence are exposed to mortality risks at a small geographic scale (Sampson et al. 2002).
One way to define egocentric neighborhoods is to draw concentric circular buffers around individuals’ residences at a certain range or at several ranges. For every individual, the people residing within this buffer constitute his/her neighbors, and proximity is measured by the Euclidean distances between their locations. The coverage of an egocentric neighborhood can transcend administrative boundaries since it only depends on one’s location and the bandwidth. The definition of egocentric neighborhood is also scale-sensitive, which facilitates the examination of MAUP. Nevertheless, a limitation of this method is that it ignores street networks and landscape barriers.
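A circular-buffer egocentric neighborhood can be sketched in a few lines. This is a minimal illustration, assuming planar coordinates in meters; the function name and the sample coordinates are hypothetical and not drawn from the Newark data.

```python
import math

def euclidean_neighbors(focal_index, points, radius):
    """Return (index, distance) pairs for all points within `radius`
    of the focal point, excluding the focal point itself."""
    fx, fy = points[focal_index]
    neighbors = []
    for k, (x, y) in enumerate(points):
        if k == focal_index:
            continue
        d = math.hypot(x - fx, y - fy)  # straight-line (Euclidean) distance
        if d <= radius:
            neighbors.append((k, d))
    return neighbors

# Hypothetical household locations (meters).
households = [(0, 0), (30, 40), (100, 0), (300, 400)]

buffer_100 = euclidean_neighbors(0, households, radius=100)
buffer_50 = euclidean_neighbors(0, households, radius=50)
print(buffer_100)
print(buffer_50)
```

Because membership depends only on the focal location and the bandwidth, rerunning the function with several radii gives the nested, scale-sensitive neighborhoods described above.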
Street Network
The structure of street networks affects where people interact with each other in daily life and hence develop their conceptions of the neighborhoods (Anderson 1992). For instance, Rabin (1987) found that residential streets helped separate blacks and whites in American cities. Grannis (1998) demonstrated that pedestrian-oriented streets had greater influence than mere geographic proximity in connecting people and forming racially homogeneous communities, and this effect existed not only at the local level, but also at the city and metropolitan levels.
A network-based egocentric neighborhood consists of street lines extending from the focal point to the points where the distance along the street network reaches the specified radius, and the people living along these street lines are counted as the focal person’s neighbors. In this study, we compare results from models in which egocentric neighborhoods are based on circular buffers and those based on street networks defined at multiple distance radii. Street networks may be particularly important for studying children’s mortality risks because pedestrian-oriented streets can be closely tied to patterns of interaction that involve children and families.
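The network-based definition can be sketched as a shortest-path search that stops at the chosen radius. The example below is a simplification under stated assumptions: households are snapped to street-network nodes, the graph and segment lengths are hypothetical, and partial street segments beyond the last reachable node are ignored. It uses Dijkstra's algorithm as a generic stand-in and is not necessarily the procedure used in this study.

```python
import heapq

def network_neighbors(graph, source, radius):
    """Shortest street-network distance from `source` to every node
    reachable within `radius`. `graph` maps node -> [(neighbor, length)]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, length in graph.get(node, []):
            nd = d + length
            if nd <= radius and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def undirected(edges):
    """Build an adjacency list from (node_a, node_b, length) triples."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    return graph

# Hypothetical street segments (meters).
streets = undirected([
    ("home", "a", 50.0),
    ("a", "b", 40.0),
    ("b", "c", 30.0),
    ("home", "c", 200.0),
])

within_100 = network_neighbors(streets, "home", radius=100.0)
print(within_100)
```

Node `c` is 120 meters away along the shortest street path and therefore falls outside the 100-meter network neighborhood, even if it were closer than that in a straight line.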
Multilevel-Spatial Modeling
Multilevel regression models have been widely used in studies of neighborhood and health. A standard multilevel model adjusts for within-group correlations among residents in the same neighborhood. Let yij denote the outcome for an individual i in neighborhood j. Using an appropriate link function, a standard multilevel model links the outcome to linear predictors as follows,
$$ g\left(E[y_{ij}]\right) = \alpha_0 + X_{ij}\beta + Z_j\gamma + u_j \qquad (1) $$
where g(·) is the link function, α0 is the intercept, Xijβ is the product of individual-level predictors and the corresponding unknown parameters, and Zjγ is the product of neighborhood-level predictors and the associated parameters. Within-neighborhood correlation is captured by uj, which is usually assumed to be a normally distributed random intercept with mean 0 and variance $\sigma_u^2$.
This model ignores potential correlations that transcend neighborhood boundaries as a result of, for example, spatial proximity and diffusion among people living in adjacent neighborhoods. Consequently, estimates of standard errors for the coefficients of contextual variables may be biased downwards, leading to overestimated statistical significance levels (Chaix et al. 2005a; Chaix et al. 2005b). In addition, spatial variation in the outcome variables is obscured, despite its theoretical and substantive importance for understanding “spatial embeddedness” (Sampson et al. 1999).
Some scholars proposed a spatial error model which treats spatial correlation as a statistical nuisance that needs to be adjusted to achieve unbiased estimates of standard errors (Anselin 1988). Others applied a spatial lag model in which weighted characteristics of adjacent neighborhoods are added as an autoregressive-like predictor (Morenoff 2003; Sampson et al. 1999). Both the error and lag models are implicitly spatial in the sense that they rely on creating a lattice-like structure linking neighboring units but do not specify a distance-based parametric function. Therefore, these two types of models cannot directly assess the geographic scale at which variations in health outcomes operate unless different neighboring structures are specified and tested separately. The spatial error and lag models can also be considered “pure spatial” models in the sense that they account for spatial dependence but discard within-neighborhood correlations.
In this study, we adopt an integrated multilevel-spatial modeling approach that has a spatial dependence structure in addition to the more typical within-neighborhood correlation. This model takes the form of a linear combination of a non-spatially structured within-neighborhood correlation and a spatially structured between-neighborhood correlation. The generic form of this multilevel-spatial model can be expressed as follows,
$$ g\left(E[y_{ij}]\right) = \alpha_0 + X_{ij}\beta + Z_j\gamma + u_j + s_j \qquad (2) $$
where sj denotes a random effects term that captures spatial correlations in an implicit or explicit way, and everything else is the same as in Eq. (1).
Our approach builds on that of Arcaya et al. (2012), who imposed a lattice-like neighboring structure and combined a conditional autoregressive (CAR) model with a standard multilevel model to simultaneously capture place and spatial variations in county-level life expectancy where states are a higher level of hierarchy. We extend this approach by modeling spatial correlation as a parametric distance-decay function to avoid results that are sensitive to the choice of lattice-like neighboring structure (Browne and Goldstein 2010; Diggle et al. 1998).
To our knowledge, we are among the first to extend the application of the generalized linear multilevel-spatial models for discrete responses by using Poisson models. By modeling within-neighborhood and spatial correlations simultaneously, this strategy will help us identify distinct spatial patterns of health risks and avoid erroneous inferences about the relative importance of place and space in shaping individuals’ health status. In combination with egocentric neighborhood measures, an explicitly spatial model will also permit us to identify scale-dependent variation in the underlying spatial dynamics of health risks.
Data Sources
Household-Level Geocoded Data
To demonstrate the merits and shortcomings of the integrated multilevel-spatial approach, we develop an empirical example, drawing on household-level geocoded data from the Urban Transition Historical GIS Project. We focus on 1880 Newark, New Jersey, one of the largest industrial cities in the U.S. in the late nineteenth century (Galishoff 1988). Newark had been an important migration destination that attracted large streams of Irish and German immigrants since the early eighteenth century, resulting in complex inter-ethnic relationships and residential patterns. Meanwhile, rapid industrialization, urbanization and tides of immigration increased congestion and environmental pollution, making Newark one of the unhealthiest cities at that time (Galishoff 1988). We use this city as representative of urban conditions relevant to contextual health risk factors at a time when the U.S. was becoming a predominantly urban, industrial, and ethnically segregated nation.
All residents in Newark were geocoded based on their household addresses from the full transcription of the 1880 Census of Population made accessible through the North Atlantic Population Project at the Minnesota Population Center (MPC). The success rate of geocoding reached nearly 98%. Details on the geocoding procedure have been described elsewhere (Logan et al. 2011). Figure 1 illustrates the geocoded households, as well as selected attributes of individual members, along the street network in part of the city. Multiple households are located at one address in a multi-family building. This fine-grained geocoding allows us to calculate distance between any two households along the street network.
Fig. 1.
Illustration of Street Network and Household Addresses with Individual Attributes in 1880 Newark, NJ
We focus on three ethnic groups, Irish, Germans and Yankees. Individual ethnicity is determined by combining race, place of birth, and parents’ places of birth. Irish and Germans include both first- and second-generation immigrants. Yankees are native-born whites whose parents are also native born. By these definitions, there are 31,362 Irish, 42,481 Germans, and 37,967 Yankees, together accounting for about 82% of Newark’s total population (136,508) in 1880.
Child Death Records
Throughout the nineteenth century, Newark residents were vulnerable to epidemics of cholera, typhoid fever, and dysentery, which were transmitted by fecally contaminated water sources, as well as outbreaks of influenza, smallpox and diphtheria, which were transmitted via person-to-person contact; the leading causes of death were tuberculosis, pneumonia, and the diarrheal diseases (Galishoff 1988). Many of these epidemics were facilitated by a deteriorated neighborhood environment, since the streets were filled with filth due to lack of garbage removal, and the Passaic River, a main water source, was polluted by industrial sewage, animal waste, and human excrement. The poor underclass living in crowded tenements and children under age two, regardless of their family wealth, were most vulnerable to disease. Half of all deaths in Newark in 1870 were among children two years old or younger (Cunningham 1966: 230). Unfortunately, the City Hospital was not opened until 1882, and was viewed more as a charity than a scientific institute (Cunningham 1966).
The dependent variable in this analysis, child mortality, is defined as death by age five. It is measured using vital records between June 1878 and June 1885, including death certificates, burial, reburial, transit, and disinterment permits as recorded by the New Jersey Department of Health. New Jersey is well known for its accurate and complete reporting of vital statistics in the late nineteenth century. Among all the 1880 U.S. Census death registration areas, New Jersey was one of the only two states that provided reasonably accurate data, with more than 90% complete registration of deaths (Galishoff 1988). These death records are linked to the 1880 Census data by children’s first and last names and year of birth.
We identified a total of 501 death records by June 1885 among 6,762 individuals who were infants (aged 0–1 year) in 1880 Newark. These numbers translate to 74 deaths per 1000 children during a roughly 5-year period, or crudely 15 deaths per 1000 on average in a single year. This number is quite close to the officially published death rate of Newark in 1880 (18.7 deaths per 1000 population; Galishoff 1988:96). Another 55 death records were not linked to the census data because of dubious name or age mismatch, yielding a match rate of roughly 90.1%. In this analysis, we focus on the 438 death records among 5,767 infants who were Irish, Germans or Yankees from 5,558 households. The match rate is highest for Yankees (92.5%) and lowest for Germans (85.0%), but ethnic differences in matching are not statistically significant (p>.05).
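The rates quoted above follow from simple arithmetic, reproduced here as a check. The sketch assumes that the 501 linked records and the 55 unlinked records together make up the pool of candidate death records; that reading is an assumption, although it reproduces the reported match rate.

```python
linked_deaths = 501    # death records linked to the 1880 census
unlinked_deaths = 55   # records left unlinked (dubious name or age mismatch)
infants = 6762         # infants aged 0-1 in 1880 Newark
years = 5              # approximate length of the follow-up period

per_1000 = 1000 * linked_deaths / infants
per_1000_per_year = per_1000 / years
match_rate = linked_deaths / (linked_deaths + unlinked_deaths)

print(round(per_1000, 1))           # 74.1
print(round(per_1000_per_year, 1))  # 14.8
print(round(100 * match_rate, 1))   # 90.1
```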
It is possible that some of these infants moved out of Newark with their parents before age 5 and died elsewhere. However, industrial prosperity made Newark a population magnet. There was only one occasion of large-scale outmigration in the second half of the nineteenth century. That is, the city population dropped from 71,941 in 1860 to 68,999 in 1863 during the Civil War (Cunningham 1966:159). Thus, it is unlikely that a large number of children under age 5 migrated during 1880–1885, and empirical results from this study should be informative, albeit interpreted cautiously.
Another data limitation is that children who were not named at the time of death were not linked in these data. These children were likely infants, and it is therefore possible that the infant mortality captured here (about one fourth of child deaths occurred under age 1) is an underestimate. Unfortunately, without additional historical information, we are unable to assess the impact of this missing data on our analysis. We also do not have data on cause of death, which prevents further exploration of the sources of child mortality.
Measures
The individual-level variables include child’s age, ethnicity, and gender. The household-level variables include number of children in the household, and head’s age and socioeconomic status. Age is measured in years for both children and household heads. Child’s ethnicity and gender are coded as dummy variables. Number of children is treated as a continuous variable. Socioeconomic status is measured by a socioeconomic index (SEI) score based on average education and earnings in each occupation as measured in 1950 and standardized to be a continuous value bounded between 0 and 100 with 0 indicating unemployed. Such a coding strategy is robust with respect to the historical context (Sobek 1996).
The neighborhood-level variables include population density, socioeconomic condition, and ethnic diversity. High population density is related to overcrowded housing conditions and unsanitary living environments which in turn may increase the risk of low birth weight (Roberts 1997), post-neonatal mortality (Reid 2002), and later life mortality (Coggon et al. 1993). Neighborhood socioeconomic conditions have been associated with individual health outcomes. A lack of socioeconomic resources can lead to deteriorated physical environment, deprived access to social and public health services, and increased exposure to noxious waste and pollutants, all of which negatively impact residents’ health (Collins and Williams 1999; Diez-Roux 2000; Sampson et al. 2002).
Ethnic diversity may increase the likelihood of inter-group interactions which are likely to facilitate transmission of infectious diseases, the leading causes of death in 1880 Newark (Galishoff 1988), by increasing heterogeneity in one’s social network and expanding the range of possible risk factors. Contemporary research has found evidence for elevated transmission dynamics of infectious diseases due to inter-group interactions (Acevedo-Garcia 2001; Rothenberg et al. 2005; Rothenberg and Potterat 1988). For example, in a study of tuberculosis (TB) rates in 1985–1992 New Jersey, high contact with immigrants was an important risk factor for facilitating TB transmission among African Americans and Hispanics (Acevedo-Garcia 2001). Living in ethnically diverse neighborhoods may also facilitate exposure to a greater range of health risks. For example, immigrants may carry infectious diseases to which natives lack antibodies. A salient example is the 1875 measles epidemic in Fiji that occurred when the disease was introduced by Australian voyagers and killed over one quarter of the Fijian population (Hays 2005). Intensive quarantine turned out to be the only effective way to prevent recurrence of the tragedy between 1879 and 1920, when Indian immigrants continued to carry measles to Fiji (Cliff and Haggett 2004). On the other hand, immigrants are also susceptible to the endemic diseases present in their destination cities. Many Irish and German immigrants who were not exposed to yellow fever in their homelands died during the 1853 outbreak in New Orleans (Hays 2005). We therefore expect that living in ethnically diverse neighborhoods with high levels of inter-group interaction may contribute to increased exposure to infectious diseases and hence elevated child mortality risks in 1880 Newark.
We represent neighborhoods in three ways, and compare the relative strengths and weaknesses of these representations. First, similar to many contemporary studies, we use enumeration districts (EDs), a type of historical census geography, as proxies for neighborhoods. An ED is a geographic area assigned to an individual enumerator to gather census data from door to door. EDs in cities such as Newark are comparable in population size to modern census tracts, and an ED is the smallest unit for which census data can be readily tabulated in the nineteenth century (Logan et al. 2011). Population density is measured by dividing an ED’s total population by its area in square kilometers (km²). A local area’s socioeconomic condition is the average SEI across all adults with a valid SEI score residing in the same ED. Ethnic diversity in an ED is measured by the entropy-based diversity index and calculated as a weighted (by population size) average of the entropy in each street block as (Theil 1972):
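$$ E_i = \sum_{j} \frac{t_{i,j}}{T_i} \left( -\sum_{m=1}^{K} P_{i,j,m} \log_K P_{i,j,m} \right) \qquad (3) $$
where Pi,j,m is the proportion of ethnic group m in the jth street block of the ith ED, ti,j is the population of that street block, Ti is the total population of the ED, and K is the number of ethnic groups (with 0 log 0 taken as 0). The value of the entropy ethnic diversity index is bounded between 0 (indicating the least ethnic diversity, i.e. a single group dominates the neighborhood) and 1 (indicating greatest ethnic diversity, i.e. all groups appear in the neighborhood with equal proportions).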
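The entropy index of Eq. (3) can be illustrated with a short calculation. This is a sketch under assumptions: the entropy is normalized by the logarithm of the number of groups so that it is bounded by 1, terms with zero proportion contribute nothing, and the block counts are hypothetical rather than taken from the Newark data.

```python
import math

GROUPS = ("Irish", "German", "Yankee")

def block_entropy(counts):
    """Normalized entropy (0 to 1) of one street block.
    `counts` maps group name -> number of residents."""
    total = sum(counts.get(g, 0) for g in GROUPS)
    h = 0.0
    for g in GROUPS:
        p = counts.get(g, 0) / total
        if p > 0:  # convention: 0 * log(0) = 0
            h -= p * math.log(p)
    return h / math.log(len(GROUPS))

def ed_diversity(blocks):
    """Population-weighted average of block entropies within one ED."""
    pops = [sum(b.get(g, 0) for g in GROUPS) for b in blocks]
    weighted = sum(n * block_entropy(b) for n, b in zip(pops, blocks))
    return weighted / sum(pops)

# Hypothetical ED with one evenly mixed block and one single-group block.
ed_blocks = [
    {"Irish": 10, "German": 10, "Yankee": 10},
    {"Irish": 30, "German": 0, "Yankee": 0},
]

diversity = ed_diversity(ed_blocks)
print(round(diversity, 3))
```

The evenly mixed block has entropy 1 and the single-group block has entropy 0; because the two blocks are equally populous, the ED-level index is their simple average.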
The second and third representations are egocentric neighborhoods based on Euclidean and street network distances, respectively, at varying radii from 100 to 2,000 meters to approximate the relatively small scales of children’s limited activity space in 1880 Newark – a pedestrian city. Following Reardon and O’Sullivan (2004), we compute neighborhood measures based on proximity-weighted functions. Specifically, we use the classic kernel density estimation (KDE) for circular buffering neighborhoods (Gatrell et al. 1996), and a recently developed approach for network-based neighborhoods (Xie and Yan 2008). A Euclidean KDE reflects the proximity-weighted population count within an individual’s local environment (i.e. spatially weighted number of people per unit area to one’s egocentric neighborhood). A network KDE provides an estimate of population density over a linear unit and hence is well-suited for constructing street network-based egocentric neighborhoods. The SEI of one’s egocentric neighborhood is calculated in a manner similar to the KDE; it is a spatially weighted average SEI of all adults with a valid SEI score in the neighborhood.
Following Reardon and O’Sullivan (2004), the scale-dependent spatial entropy diversity for a child living at location i can be computed in a similar fashion as in Eq. (3) but with the aspatial Pi,j,m replaced by the KDE-based Pi,m,r in the following way:
$$ P_{i,m,r} = \frac{\mathrm{KDE}_{i,m,r}}{\sum_{m'} \mathrm{KDE}_{i,m',r}} \qquad (4) $$
where KDEi,m,r is the Euclidean or street network KDE for group m within an egocentric neighborhood centered at location i with radius r. The value of the spatial diversity index is also bounded between 0 and 1. All continuous variables at individual-, household-, and neighborhood-level are standardized into z-scores to facilitate the Bayesian statistical computation.
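The scale-dependent spatial entropy described above can be sketched for the Euclidean case. Several choices here are assumptions made for illustration: a quartic (biweight) kernel serves as the proximity weight, the entropy is normalized by the logarithm of the number of groups, an empty neighborhood is assigned a value of 0, and the coordinates are hypothetical.

```python
import math

GROUPS = ("Irish", "German", "Yankee")

def quartic_weight(d, r):
    """Assumed proximity weight: quartic kernel, zero at and beyond radius r."""
    return (1 - (d / r) ** 2) ** 2 if d < r else 0.0

def spatial_entropy(focal, people, r):
    """Normalized entropy of kernel-weighted group proportions around `focal`.
    `people` is a list of (x, y, group) tuples."""
    kde = {g: 0.0 for g in GROUPS}
    for x, y, g in people:
        d = math.hypot(x - focal[0], y - focal[1])
        kde[g] += quartic_weight(d, r)
    total = sum(kde.values())
    if total == 0:
        return 0.0  # assumption: no neighbors within r, so no measured diversity
    props = [v / total for v in kde.values() if v > 0]
    return -sum(p * math.log(p) for p in props) / math.log(len(GROUPS))

# Hypothetical neighbors of a child living at the origin (meters).
neighbors = [(50, 0, "Irish"), (0, 50, "German"), (-50, 0, "Yankee"), (0, 400, "Irish")]

e_100 = spatial_entropy((0, 0), neighbors, r=100)  # three equidistant groups
e_500 = spatial_entropy((0, 0), neighbors, r=500)  # distant Irish neighbor now counts
print(round(e_100, 3), round(e_500, 3))
```

At the 100-meter radius the three groups receive equal weight and the index is at its maximum; widening the radius brings in an additional Irish neighbor, tilting the proportions and lowering the index, which is the kind of scale dependence the measure is meant to reveal.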
Method
We employ multilevel-spatial Poisson models that incorporate both the standard neighborhood random effects and the spatially correlated random effects and compare the results with those from standard multilevel models. A limitation of a multilevel-spatial Poisson model is its high computational cost when spatial correlations are estimated across a considerable number of individual locations; applications usually involve no more than 100 locations (Banerjee et al. 2004; Gelfand et al. 2006). Therefore, we use the population-weighted centroid of an ED as the location for each subject living in that ED to estimate inter-neighborhood spatial correlations. Two types of location information are employed in the statistical analyses. First, the location of each child is used to construct egocentric neighborhoods to measure his/her very local environmental conditions. This location information hence serves as a basis for describing the egocentric environment for the regression predictors denoted by Z in Eq. (7) below, and thus contributes to the estimation of the effects of measured covariates (as opposed to unmeasured random effects, explained below) on child mortality.
Second, the ED centroid information attached to each child serves, loosely speaking, as a structural instrument to capture the unmeasured factors that lead to the correlation between child mortality in one ED and its nearby EDs. Specifically, this location information is used to compute the distance between any two EDs, which in turn is used to estimate the average strength of inter-ED correlation in child mortality for any two EDs separated by one unit of distance, and how this strength declines as the distance between two EDs increases. In short, the ED centroid information contributes to the estimation of unmeasured spatial correlation among EDs.
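The two steps described here, locating each ED at its population-weighted centroid and measuring distances between centroids, can be sketched as follows. The household coordinates, household sizes, and ED labels are hypothetical, and planar coordinates in meters are assumed.

```python
import math

def weighted_centroid(households):
    """Population-weighted centroid of one ED.
    `households` is a list of (x, y, n_persons) tuples."""
    total = sum(n for _, _, n in households)
    cx = sum(x * n for x, _, n in households) / total
    cy = sum(y * n for _, y, n in households) / total
    return cx, cy

def centroid_distances(eds):
    """Euclidean distance between the centroids of every pair of EDs."""
    cents = {name: weighted_centroid(h) for name, h in eds.items()}
    names = sorted(cents)
    out = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (ax, ay), (bx, by) = cents[a], cents[b]
            out[(a, b)] = math.hypot(ax - bx, ay - by)
    return out

# Hypothetical EDs with household coordinates (meters) and household sizes.
eds = {
    "ED1": [(0.0, 0.0, 1), (10.0, 0.0, 3)],
    "ED2": [(7.5, 40.0, 2)],
}

distances = centroid_distances(eds)
print(weighted_centroid(eds["ED1"]), distances)
```

The resulting pairwise distances are the only spatial input needed by a distance-decay correlation function, which keeps the number of locations equal to the number of EDs rather than the number of children.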
Assuming the number of child deaths follows a Poisson distribution, we can model the rate of child mortality with a Poisson model. For each child, this is equivalent to modeling the occurrence of death with an off-set term to adjust for the duration of survival (Holford 1980). For a child i in ED j, the model takes the following form:
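| $y_{ij} \sim \mathrm{Poisson}(\mu_{ij})$ | (5) |
| $\mu_{ij} = M_{ij}\,\theta_{ij}$ | (6) |
| $\log(\theta_{ij}) = \beta_0 + X_{ij}\beta + Z_j\gamma + u_j + s_j$ | (7) |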
where the number of deaths, yij, for each child takes a value of 1 if he or she had died by age 5 and a value of 0 otherwise; Mij is an off-set that adjusts for variation in age at death for those who had died by age 5 and in the 5-year exposure period for those who had not died by age 5. The death rate, θ, relative to the exposure, M, is modeled as a combination of linear predictors, in which Xij is a vector of individual- and household-level characteristics, Zj is a vector of neighborhood-level variables, uj denotes the usual aspatial neighborhood random effects at the ED level, and sj denotes the spatial random effects, which take more similar values for EDs close to each other than for those farther apart. Following previous research (Chaix et al. 2005a; Chaix et al. 2005b), we assume:
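| $u_j \sim N(0, \sigma_u^2)$ | (8) |
| $s \sim \mathrm{MVN}(0, \Sigma), \quad \Sigma_{ij} = \sigma_s^2 \exp(-\phi d_{ij})$ | (9) |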
where σu² denotes the variance of the aspatial neighborhood random effects, σs² denotes the variance of the spatial random effects, and exp(−ϕdij) is an exponential correlation function with a decay parameter ϕ, with dij indicating the distance between two locations (i.e., the centroids of two EDs). In other words, we assume the spatial correlation between two locations declines exponentially as the distance between the two increases, and that the rate of decline is controlled by parameter ϕ. The value of 3/ϕ in this case is known as the practical range, that is, the distance at which spatial correlation drops to 5% and thus can be considered as “having disappeared”.
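The exponential decay assumption and the practical range can be illustrated numerically. The following is a minimal Python sketch, not the code used in this study: the centroid coordinates are made-up values in kilometers, the function names are hypothetical, and ϕ = 5.42 is simply the posterior mean reported for the ED-based multilevel-spatial model in Table 2.

```python
import numpy as np

def exponential_correlation(coords_km, phi):
    """Correlation matrix exp(-phi * d) built from pairwise Euclidean distances."""
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-phi * dist)

def practical_range(phi):
    """Distance at which the correlation has fallen to exp(-3), roughly 5%."""
    return 3.0 / phi

# Hypothetical ED centroids (x, y) in kilometers, for illustration only.
centroids = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.4), (1.0, 1.0)]
phi = 5.42  # posterior mean of the decay parameter (Table 2, ED model)

R = exponential_correlation(centroids, phi)
r = practical_range(phi)
print(np.round(R, 3))
print(f"practical range: {r:.2f} km; correlation at that distance: {np.exp(-phi * r):.3f}")
```

Because exp(−ϕ · 3/ϕ) = exp(−3) whatever the value of ϕ, the correlation at the practical range is always about 5%; only the distance at which that threshold is reached changes with ϕ.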
We choose a Poisson model for its simplicity and wide application in modeling survival time data (Powers and Xie 2008). The random effects terms in our models also help alleviate the problems of overdispersion and excess zeros often encountered in conventional Poisson models (Rabe-Hesketh and Skrondal 2012). As a robustness check, we have repeated our regression analyses by fitting negative binomial models, and the results are substantively the same.
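To make the role of the off-set concrete, the sketch below evaluates a Poisson log-likelihood in which the expected count is assumed to equal exposure times the death rate, so that log(exposure) enters the linear predictor with a coefficient fixed at 1. It is an illustration under that assumption only: the function name, the toy data, and the common log rate of −4.5 (chosen to be near the intercepts in Table 2) are hypothetical and are not taken from the estimation code.

```python
import numpy as np

def poisson_loglik(y, exposure, eta):
    """Poisson log-likelihood with log(exposure) as an off-set.

    y        : 0/1 death indicators (so the log(y!) term is zero)
    exposure : years at risk, M
    eta      : linear predictor for the log death rate, log(theta)
    """
    y = np.asarray(y, dtype=float)
    log_mu = np.log(np.asarray(exposure, dtype=float)) + np.asarray(eta, dtype=float)
    return float(np.sum(y * log_mu - np.exp(log_mu)))

# Toy data: three children, one of whom died at age 1.5 (illustrative values only).
y = [0, 1, 0]
exposure = [5.0, 1.5, 5.0]
eta = [-4.5, -4.5, -4.5]

print(poisson_loglik(y, exposure, eta))
```

A child who died early contributes less exposure time, so the same death rate implies a smaller expected count for that child; this is how the off-set adjusts for differences in the duration of survival.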
We performed Bayesian estimation using the OpenBUGS 3.2.2 program (Lunn et al. 2009) and chose non-informative priors for the parameters because we had no strong prior beliefs about what the parameter values should be (Gelman and Hill 2007). To ensure convergence, all models were run with three Markov chain Monte Carlo (MCMC) chains, each for 120,000 iterations with the first half discarded as burn-in. Model convergence was monitored by graphically examining the trace plots of MCMC chains and computing the Gelman-Rubin statistic (Gelman and Rubin 1992).
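For readers unfamiliar with the convergence diagnostic, the following Python sketch computes the basic form of the Gelman-Rubin statistic (the potential scale reduction factor) for a single parameter. It is an illustration under simplifying assumptions: it omits the degrees-of-freedom correction, the chains are simulated rather than drawn from the models in this paper, and software implementations may differ in detail.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains : array of shape (m, n), m chains with n post-burn-in draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # average within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled estimate of the posterior variance
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(0)
# Three simulated chains of 60,000 retained draws each, mimicking 120,000
# iterations with the first half discarded as burn-in.
mixed = rng.normal(0.0, 1.0, size=(3, 60_000))
stuck = mixed + np.array([[0.0], [1.0], [2.0]])  # chains centered at different values

print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the between-chain and within-chain variability agree, as for the well-mixed chains; values well above 1, as for the chains centered at different values, signal that the chains have not converged to a common distribution.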
Results
Descriptive Statistics
Table 1 shows descriptive statistics for crude death rates and independent variables. The average death rate was highest for Irish children (0.11) and lowest for German children (0.05), and similar for boys and girls. More than half of the children were Yankees, about one fifth were Irish, and the rest were Germans. On average, Yankee children lived in the wealthiest households, whereas Irish children lived in the poorest households. Yankee children also tended to have fewer siblings compared with Irish and German children.
Table 1.
Descriptive Statistics
| | Total (N=5,767) | | Irish (N=1,071) | | German (N=1,484) | | Yankee (N=3,212) | |
|---|---|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
| Died by age 5 | 0.08 | 0.26 | 0.11 | 0.32 | 0.05 | 0.21 | 0.08 | 0.27 |
| Exposure (years) | 4.76 | 0.95 | 4.65 | 1.09 | 4.85 | 0.77 | 4.75 | 0.97 |
| Individual-Level | | | | | | | | |
| Sex (male = 1, female = 0) | 0.49 | 0.50 | 0.46 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 |
| Household-Level | | | | | | | | |
| Head’s age | 34.99 | 8.92 | 35.75 | 7.31 | 36.47 | 7.88 | 34.06 | 9.71 |
| Head’s SEI | 27.57 | 20.01 | 20.76 | 16.64 | 27.15 | 18.64 | 30.03 | 21.09 |
| Number of children | 3.12 | 1.84 | 3.46 | 1.85 | 3.78 | 1.94 | 2.69 | 1.66 |
| Neighborhood-Level | | | | | | | | |
| Enumeration district [a] | | | | | | | | |
| Diversity | 0.78 | 0.19 | 0.83 | 0.16 | 0.70 | 0.28 | 0.80 | 0.16 |
| Population density (persons/km²) | 10980 | 6960 | 6871 | 5552 | 13095 | 9403 | 11004 | 6075 |
| SEI | 32.54 | 6.66 | 29.87 | 6.86 | 31.26 | 6.66 | 33.38 | 6.58 |
| Euclidean distance (200m) [b] | | | | | | | | |
| Diversity | 0.78 | 0.17 | 0.83 | 0.13 | 0.68 | 0.20 | 0.81 | 0.16 |
| Population density (persons/km²) | 4210 | 2245 | 3896 | 1793 | 4815 | 2744 | 4035 | 2066 |
| SEI | 26.67 | 4.44 | 24.89 | 3.71 | 25.50 | 3.26 | 27.81 | 4.79 |
| Street network distance (200m) [b] | | | | | | | | |
| Diversity | 0.77 | 0.18 | 0.81 | 0.14 | 0.66 | 0.21 | 0.80 | 0.17 |
| Population density (persons/m) | 1.42 | 0.84 | 1.31 | 0.69 | 1.64 | 1.00 | 1.35 | 0.77 |
| SEI | 26.74 | 4.72 | 24.75 | 3.81 | 25.57 | 3.48 | 27.94 | 5.11 |
[a] Summarized at enumeration district level.
[b] Summarized at individual level.
Neighborhood characteristics are reported for EDs as well as for egocentric neighborhoods measured via Euclidean and street network distances. To conserve space, the egocentric neighborhood-level measures shown in Table 1 refer only to those with a radius of 200 meters. Irrespective of measurement strategy, the average value of ethnic diversity was quite high, about 0.8. The average neighborhood-level SEI was also about the same for the two egocentric neighborhood representations (~27), but slightly higher for the ED representation (~33). The average population density at ED-level was more than twice as large as that in egocentric neighborhoods defined via Euclidean distance (10,980 versus 4,210 persons/km²). The similarities between different definitions of egocentric neighborhoods (via Euclidean or street network distances) were probably due to the relatively small radius. In fact, as the radius increased from 200 to 300 meters or more, both population density and ethnic diversity, but not average SEI, began to vary notably by definition of egocentric neighborhoods (results not shown). These variations imply that egocentric neighborhoods based on street network distance may capture very different features of local environment compared to those based on Euclidean distance, and therefore estimated neighborhood-level effects may vary substantially as well.
Do spatial effects matter for child mortality risks in 1880 Newark? The KDE of child death occurrences is plotted with contour lines in Fig. 2, and suggests a strong spatial pattern of mortality risks. Overall, the household locations of child deaths were largely confined within the central region of Newark. There existed two hot zones with relatively high death intensity. The more westerly one roughly corresponds to an area populated by Irish and Germans, whereas the more easterly one represents an area predominantly occupied by Irish and Yankees. The level of child mortality risk declined isotropically as distance from the two hotspots increased.
Fig. 2.
Kernel Density Estimation of Child Deaths
Regression Results
General Findings
For each radius used to delimit egocentric neighborhoods, a standard multilevel model and a multilevel-spatial model were estimated. The coefficient estimates for individual- and household-level variables are similar between the models using the two egocentric neighborhood representations with varying radii. Thus, to conserve space, Table 2 presents the means of Bayesian posterior distributions of the regression parameters from only four out of all the estimated models (34 in total). Statistical significance is determined by the credible intervals (CIs) of the posterior estimates.
Table 2.
Bayesian Posterior Estimates from Multilevel and Multilevel-Spatial Poisson Models
| | Enumeration District | | | | Euclidean Egocentric (200 Meters) | | | |
|---|---|---|---|---|---|---|---|---|
| | Multilevel | | Multilevel-Spatial | | Multilevel | | Multilevel-Spatial | |
| | β | SE | β | SE | β | SE | β | SE |
| Intercept | −4.52 | 0.45 *** | −4.50 | 0.44 *** | −4.30 | 0.10 *** | −4.29 | 0.10 *** |
| Individual-Level | | | | | | | | |
| Sex (ref: female) | 0.20 | 0.09 * | 0.20 | 0.10 * | 0.19 | 0.10 † | 0.19 | 0.09 * |
| Ethnicity (ref: Yankee) | | | | | | | | |
| Irish | 0.30 | 0.12 ** | 0.29 | 0.12 * | 0.28 | 0.12 * | 0.27 | 0.12 * |
| German | −0.47 | 0.15 *** | −0.50 | 0.15 *** | −0.47 | 0.14 *** | −0.47 | 0.15 *** |
| Household-Level | | | | | | | | |
| Head’s age | 0.00 | 0.01 | 0.00 | 0.01 | −0.02 | 0.05 | −0.02 | 0.05 |
| Head’s SEI | −0.01 | 0.00 ** | −0.01 | 0.00 ** | −0.14 | 0.05 * | −0.14 | 0.06 * |
| Number of children | 0.01 | 0.03 | 0.01 | 0.03 | 0.01 | 0.05 | 0.02 | 0.05 |
| Neighborhood-Level [a] | | | | | | | | |
| Diversity | 0.69 | 0.36 † | 0.66 | 0.36 † | 0.18 | 0.07 * | 0.17 | 0.07 ** |
| Population density | 0.08 | 0.07 | 0.08 | 0.07 | 0.08 | 0.06 | 0.08 | 0.07 |
| SEI | 0.00 | 0.01 | 0.00 | 0.01 | −0.08 | 0.06 | −0.08 | 0.07 |
| Random Effects | | | | | | | | |
| σu² | 0.10 | 0.05 | 0.06 | 0.05 | 0.08 | 0.04 | 0.05 | 0.04 |
| σs² | — | | 0.05 | 0.07 | — | | 0.05 | 0.05 |
| Spatial Correlation | | | | | | | | |
| ϕ | — | | 5.42 | 2.87 | — | | 5.40 | 2.65 |
| 3/ϕ (practical range in kilometers) | — | | 0.55 | | — | | 0.56 | |
| DIC | 3985 | | 3967 | | 3976 | | 3956 | |
† p < 0.1; * p < 0.05; ** p < 0.01; *** p < 0.001 (based on credible intervals of Bayesian posterior estimates).
[a] The radius is 200 meters for egocentric neighborhoods.
Regardless of neighborhood representation and modeling strategy, boys had a higher mortality risk compared with girls, a finding consistent with the literature (for a review, see Read and Gorman 2010). Irish children experienced a greater death rate, and German children had a lower rate, compared with Yankee children. At the household-level, head’s SEI exerted a protective effect against child mortality risk, but the effect size was much greater for the egocentric (−0.14) compared to the ED-level (−0.01) measure. Consistent with our expectation, living in an ethnically diversified neighborhood was associated with an elevated mortality risk. However, this relationship was only marginally significant for the ED-level measure, though the effect size was much larger than when using the egocentric measure.
Using the same neighborhood representation, either ED or egocentric, the coefficient estimates for the fixed effects are about the same across the standard multilevel model and the multilevel-spatial model. It is the estimate of random effects that differs. Regardless of neighborhood representation, the estimate of the aspatial neighborhood random effects is about 60% larger from the multilevel model than from the multilevel-spatial model. Within the multilevel-spatial models, the estimate for σs² (0.05) is similar in size to that for σu², although it is not straightforward to calculate intra-class correlation for Poisson models (Rabe-Hesketh and Skrondal 2012). In other words, about half of the seemingly large aspatial neighborhood random effects estimated from the multilevel model are likely to be driven by the spatial correlation between nearby neighborhoods. These results suggest that the standard multilevel model may run the risk of overestimating the strength of within-neighborhood correlation as a result of ignoring between-neighborhood correlation.
To better differentiate spatial effects from aspatial neighborhood effects, we calculated the posterior means of sj and uj for each ED as estimated from an empty multilevel-spatial model without covariates. We then mapped the mortality risks contributed by aspatial (uj) and spatial neighborhood effects (sj), respectively (Fig. 3). The child mortality risks due to spatial effects (Fig. 3 top) exhibited a clear spatially smoothed pattern similar to that observed on the KDE map (Fig. 2). The pattern suggests a spatial diffusion-like process with mortality risks transitioning gradually from high to low across the city. Clusters of high-risk EDs are surrounded by modest-risk clusters, which in turn are bordered by low-risk clusters. By contrast, the risks due to aspatial neighborhood effects displayed a relatively random distribution over the city (Fig. 3 bottom). Low-risk EDs were scattered among modest- and high-risk EDs, and there was no clear trend of mortality risks radiating from certain EDs and decaying over distance to the rest.
Fig. 3.
Estimates of ED-level variations in the child death rates, split into a spatially structured component (top) and an unstructured component (bottom), from the empty multilevel-spatial Poisson model.
The Impact of Geographic Scale
Figure 4 plots the mean coefficient estimates and the associated 95% CIs for the ethnic diversity index measured from egocentric neighborhoods with different radii. The coefficient is uniformly positive regardless of estimation strategy when egocentric neighborhoods are defined with different radii. However, the statistical significance of the estimated coefficients across different modeling strategies and neighborhood representations varies. In general, the estimates from the multilevel-spatial models are less likely to be significant than those from the standard multilevel models. In addition, there are notable variations in the coefficient estimates for random effects σu and σs (see Fig. 5) as well as for the spatial decay parameter ϕ (see Fig. 6) from multilevel-spatial models when defining egocentric neighborhoods using different radii, although a clear pattern is not evident. These results suggest that when the underlying mechanism is likely to be scale-dependent, we require theory to guide measurement and statistical modeling and we must be cautious about making scale-sensitive statistical inference.
Fig. 4.
Bayesian posterior estimates (mean and 95% credible interval) of diversity indices measured at different radii.
Fig. 5.
Mean Bayesian posterior estimates of non-spatial (σu) and spatial (σs) random effects from multilevel-spatial models.
Fig. 6.
Mean Bayesian posterior estimates of practical range (3/ϕ) from multilevel-spatial models.
At what geographic scale does spatial correlation matter? Figure 6 shows that after controlling for individual-, household-, and neighborhood-level covariates, the estimated practical range from the multilevel-spatial models varied between 0.55 and 0.72 kilometers, depending upon the choice of neighborhood representation and the distance radius used to define egocentric neighborhoods. That is, the spatial correlation between two locations in child mortality risks did not fully disappear (drop to 5%) until the distance stretched beyond 550–720 meters, a scale that may still exceed the distance between the centroids of two adjacent EDs in the inner city (refer to the map and the associated scale as shown in Fig. 3).
Choice of Neighborhood Representation
Besides the differences in the coefficient estimates for household head’s SEI and ethnic diversity, the estimates of random effects also varied across neighborhood representations. It can be seen in Fig. 5 that the coefficient estimate for σu from the multilevel-spatial models was consistently larger when using EDs to approximate neighborhoods than when using egocentric neighborhoods, regardless of how distance was measured and which radius was used. A similar pattern holds for the coefficient estimate for σu from the standard multilevel models (results not shown). By contrast, the estimate for σs from the multilevel-spatial models was generally smaller when using EDs as proxies for neighborhoods than when using egocentric neighborhoods. In other words, using ED-based measures led to overestimated aspatial effects but underestimated spatial effects compared to using egocentric measures.
Further, Fig. 6 suggests that using an ED-based neighborhood representation is likely to underestimate the practical range for the between-neighborhood spatial correlation compared to using egocentric neighborhoods, regardless of distance radius. On the other hand, the choice of how to measure distance in defining egocentric neighborhoods does not seem to have any substantial or consistent impact on the results, as evidenced in Figs. 4–6.
Choice of Modeling Strategy
Comparison of the multilevel-spatial model to the standard multilevel model yields two insights. First, as mentioned above, the multilevel-spatial model allows us to distinguish the spatial effects that transcend administrative boundaries from the aspatial effects that are confined within such boundaries, and thereby better capture the underlying spatial process and avoid inflated inference about within-neighborhood correlations (see Table 2 and Fig. 3). Second, the coefficient estimate for ethnic diversity was more likely to reach statistical significance in the standard multilevel models than in the multilevel-spatial models, as shown in Fig. 4 (at the radii of 400–1,500 meters). This may imply that adjusting for within-neighborhood correlation alone but ignoring the presence of spatial correlation could result in inflated statistical significance for contextual effects.
Overall Goodness-of-Fit
The deviance information criterion (DIC) was applied to assess the relative goodness-of-fit across models (Spiegelhalter et al. 2002). DIC is a hierarchical modeling generalization of the Akaike information criterion (AIC) widely used to compare non-nested regression models (Akaike 1974). A smaller value of the DIC indicates a better model fit to the data. Figure 7 plots the values of the DIC for all the fitted models and several conclusions can be drawn from visual comparisons. First, the multilevel-spatial models outperform the standard multilevel models, regardless of neighborhood representation. Second, using egocentric neighborhoods generally increases the goodness-of-fit compared to approximating neighborhoods by EDs, regardless of modeling strategy. Third, the choice of how to calculate distance in defining egocentric neighborhoods makes little difference when fitting the standard multilevel models. However, in the case of the multilevel-spatial models, Euclidean distance generally outperforms street network distance, regardless of radius. One possible explanation is that unlike later modern cities, most residential buildings in 1880 Newark were low-rise, facilitating communication with back-yard neighbors. Calculating distance along streets would then underestimate actual social access.
Fig. 7.
Comparison of deviance information criterion (DIC) values between multilevel and multilevel-spatial models using different neighborhood-level measures.
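As a reminder of what the DIC measures, it can be written as the posterior mean deviance plus an effective number of parameters, pD, where pD is the posterior mean deviance minus the deviance evaluated at the posterior means (Spiegelhalter et al. 2002). The Python sketch below illustrates this calculation for a Poisson likelihood with 0/1 outcomes. It is a simplified illustration, not the OpenBUGS computation: the function names and the toy posterior draws are hypothetical, and the plug-in deviance is evaluated at the posterior mean of the expected counts, which is one of several possible parameterizations.

```python
import numpy as np

def deviance(y, mu):
    """Poisson deviance, -2 * log-likelihood, for 0/1 outcomes (log(y!) = 0)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(-2.0 * np.sum(y * np.log(mu) - mu))

def dic(y, mu_draws):
    """Return (DIC, pD) from posterior draws of the expected counts.

    mu_draws : array of shape (draws, observations).
    """
    mu_draws = np.asarray(mu_draws, dtype=float)
    d_bar = float(np.mean([deviance(y, mu) for mu in mu_draws]))  # posterior mean deviance
    d_hat = deviance(y, mu_draws.mean(axis=0))                    # deviance at posterior mean
    p_d = d_bar - d_hat                                           # effective number of parameters
    return d_bar + p_d, p_d

# Toy example: five observations and 1,000 made-up posterior draws.
rng = np.random.default_rng(1)
y = np.array([0, 1, 0, 0, 1])
mu_draws = rng.gamma(shape=2.0, scale=0.1, size=(1000, 5))

dic_value, p_d = dic(y, mu_draws)
print(f"DIC = {dic_value:.2f}, pD = {p_d:.2f}")
```

Because the penalty pD grows with model complexity, a more flexible model achieves a lower DIC only if its improvement in fit outweighs the added complexity, which is the sense in which the differences in Table 2 (for example, 3985 versus 3967) favor the multilevel-spatial specification.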
Discussion
Drawing upon a unique spatial dataset, we propose an integrated multilevel-spatial approach with respect to neighborhood representation, contextual measures, and modeling strategy, and compare this approach to the standard multilevel approach in a study of child mortality in 1880 Newark. We demonstrate how this multilevel-spatial approach helps to shed light on the multifaceted spatial dynamics that shape individuals’ health. Several findings from this study have implications for future empirical research.
First, consistent with studies using OLS and logit models (Chaix et al. 2005a; Chaix et al. 2005b; Flowerdew et al. 2008), the choice of neighborhood representation affects empirical results from Poisson models, regardless of the choice of modeling strategy (spatial or aspatial). We found notable differences in not only the significance level but also effect size with respect to household- and neighborhood-level covariates when using administrative areal units compared to spatially-defined egocentric neighborhoods. Confirming previous researchers’ speculation (Boyle and Willms 1999; Chaix et al. 2009; Dietz 2002; Flowerdew et al. 2008; Riva et al. 2009), we found it inadequate to approximate neighborhoods by administratively defined geographic units, since this approach yields little insight into the underlying spatial dynamics of health outcomes and may result in incomplete or erroneous conclusions. Our results suggest that using administratively defined geographic units in a standard multilevel modeling framework that ignores spatial correlations can result in overestimated neighborhood effects.
Meanwhile, our results were not especially sensitive to different definitions of egocentric neighborhoods. In contrast to a previous study (Guo and Bhat 2007), we found little evidence of the superiority of defining egocentric neighborhoods along street networks in this study, even though such a neighborhood representation may be theoretically more appealing than drawing circular buffers (Grannis 1998). Possibly the choice of an egocentric neighborhood representation itself is robust against a variety of detailed geographic specifications.
Second, we found strong evidence of scale-dependent results. The statistical significance of ethnic diversity varies by the distance radius used to delimit the boundaries of egocentric neighborhoods. Only at the very local (200 meters) and large (2,000 meters) scales did we find consistently significant effects of ethnic diversity. Furthermore, the estimates of the spatial random effects from the multilevel-spatial models fluctuate as the distance radius of egocentric neighborhoods changes, although we are unable to discern any clear pattern. These findings suggest that neighborhood and spatial associations are likely to be driven by different forces at different geographic scales and hence only become manifest at certain scales. Future research is needed to uncover the theoretical mechanisms underlying scale-sensitive relations between neighborhood context and health.
Third, we extended multilevel-spatial models to Poisson models of discrete responses and demonstrated their applicability and advantages over standard multilevel models in the presence of spatial dynamics. We found that the multilevel-spatial model allowed us to: (1) disentangle spatial effects from aspatial neighborhood effects by revealing an underlying spatial distribution of child mortality risks across the city; (2) identify the geographic scale within which the spatial correlation could transcend administrative boundaries, thereby highlighting the need for prudent choice of neighborhood representation; (3) recognize the relative importance of spatial and aspatial neighborhood effects in explaining residual health outcomes, and hence reduce the risk of exaggerating within-neighborhood correlations; and (4) avoid overestimating contextual effects resulting from ignoring between-neighborhood correlations. Together, these findings provide strong support for employing a multilevel-spatial modeling strategy whenever possible to achieve less biased results.
Certainly, difficulties in gaining access to (or producing) high-resolution spatial data are an obstacle to exploring spatial dynamics. However, the available resources are constantly growing. Geocoding historical data faces few confidentiality issues and has gained momentum among historical demographers (DeBats 2011; Logan et al. 2011). The release of 100% microdata from the 1940 census will create an important and more contemporary source that can be combined with subsequent survey or documentary records of individuals under proper supervision. Despite increased costs to protect data confidentiality, researchers are also gaining experience in geocoding respondents’ address information in contemporary demographic surveys to study health behaviors and outcomes (Siqueira-Junior et al. 2008). Multiple sources of contextual data other than the population census, such as satellite imagery or remote sensing, can then be linked to individual observations (Frankenberg et al. 2005; Weeks et al. 2013). Even with traditional census data on small areas, new techniques are emerging to approximate effective contextual measures at fine-grained spatial scales (Reardon 2008). By demonstrating the availability and usefulness of new spatial methods, we hope to stimulate new efforts in these directions.
While we draw strong methodological conclusions, we interpret the substantive results from the empirical example cautiously. Similar to other cross-sectional studies, this analysis is susceptible to the well-known endogeneity problem (Dietz 2002). Without longitudinal data, it is difficult to adjust for the neighborhood sorting process in which people are self-selected into certain neighborhoods with specific contextual characteristics. Furthermore, since all predictors were measured in 1880, residential changes that occurred afterwards could result in measurement errors and biased estimates. However, among the children who died by age 5, nearly half of them died by one year old, and over 80% by three years old, which helps reduce the time window of residential changes.
The biggest challenge remains to conceptualize and explicitly test the role of space in shaping individuals’ health. Does spatial distance directly affect disease transmission through physical proximity, or is it merely a proxy for social and cultural distance? Is residential environment or activity space a more important context affecting health? It is known that different ethnic groups organized social and religious activities at different locations (e.g. attending different churches and visiting different bars) in 1880 Newark (Cunningham 1966). Therefore, our residence-based neighborhoods do not fully reflect children’s risk environment when, for example, their mothers exchanged experiences about infant feeding and periodic medical checkups after Sunday morning’s services at a church. Future research can benefit from incorporating locational information of social institutions in constructing neighborhood contexts to address these challenges. Similarly, future research that extends to adult populations can benefit from mapping locations of workplace since occupation segregation by ethnicity (e.g. German immigrants were concentrated in leather and brewing industries) shaped adult workers’ daytime activity space.
Even with these limitations in mind, this study contributes to emerging efforts in integrating spatial queries into neighborhood research. We are optimistic that our spatial treatment of neighborhood representations, contextual measurements, and analytic strategies may be useful to future researchers who face similar conceptual and methodological challenges in demographic research that features neighborhood contexts and their implications for individual outcomes.
Acknowledgments
The authors thank the research initiative on Spatial Structures in the Social Sciences at Brown University for providing the historical GIS data used in this study. The historical GIS data collection used in this work was supported by the National Science Foundation [grant number 0647584] and the National Institutes of Health [grant number 1R01HD049493–01A2]. The authors also thank participants at the 2011 Annual Meeting of the Population Association of America for helpful comments on an earlier draft.
Contributor Information
Hongwei Xu, Email: xuhongw@umich.edu, Institute for Social Research, University of Michigan, 426 Thompson St, 216 NU ISR Bldg, Ann Arbor, MI 48106, Phone: (734) 615-3552, Fax: (734) 763-1428.
John R. Logan, Department of Sociology, Brown University
Susan E. Short, Department of Sociology, Brown University
References
- Acevedo-Garcia D. Zip Code-level Risk Factors for Tuberculosis: Neighborhood Environment and Residential Segregation in New Jersey, 1985–1992. American Journal of Public Health. 2001;91:734–741. doi: 10.2105/ajph.91.5.734. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control. 1974;19(6):716–723. [Google Scholar]
- Anderson E. Streetwise: Race, Class, and Changes in an Urban Community. Chicago: University of Chicago Press; 1992. [Google Scholar]
- Anselin L. Spatial Econometrics: Methods and Models. Dordrecht, Netherlands: Kluwer Academic; 1988. [Google Scholar]
- Arcaya M, Brewster M, Zigler CM, Subramanian SV. Area variations in health: A spatial multilevel modeling approach. Health & Place. 2012;18(4):824–831. doi: 10.1016/j.healthplace.2012.03.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Banerjee S, Gelfand AE, Carlin BP. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton, FL: Chapman & Hall/CRC Press; 2004. [Google Scholar]
- Boyle MH, Willms JD. Place effects for areas defined by administrative boundaries. American Journal of Epidemiology. 1999;149:577–585. doi: 10.1093/oxfordjournals.aje.a009855. [DOI] [PubMed] [Google Scholar]
- Browne W, Goldstein H. MCMC Sampling for a Multilevel Model With Nonindependent Residuals Within and Between Cluster Units. Journal of Educational and Behavioral Statistics. 2010;35(4):453–473. [Google Scholar]
- Chaix B. Geographic life environments and coronary heart disease: a literature review, theoretical contributions, methodological updates, and a research agenda. Annual Review of Public Health. 2009;30:81–105. doi: 10.1146/annurev.publhealth.031308.100158. [DOI] [PubMed] [Google Scholar]
- Chaix B, Merlo J, Chauvin P. Comparison of a spatial approach with the multilevel approach for investigating place effects on health: the example of healthcare utilization in France. Journal of Epidemiology & Community Health. 2005a;59:517–526. doi: 10.1136/jech.2004.025478. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chaix B, Merlo J, Evans D, Leal C, Havard S. Neighbourhoods in eco-epidemiologic research: delimiting personal exposure areas. A response to Riva, Gauvin, Apparicio and Brodeur. Social Science & Medicine. 2009;69:1306–1310. doi: 10.1016/j.socscimed.2009.07.018. [DOI] [PubMed] [Google Scholar]
- Chaix B, Merlo J, Subramanian SV, Lynch J, Chauvin P. Comparison of a spatial approach with the multilevel analytical approach in neighborhood studies: the case of mental and behavioral disorders due to psychoactive substance use in Malmö, Sweden, 2001. American Journal of Epidemiology. 2005b;162(2):171–182. doi: 10.1093/aje/kwi175. [DOI] [PubMed] [Google Scholar]
- Cliff A, Haggett P. Time, travel and infection. British Medical Bulletin. 2004;69:87–99. doi: 10.1093/bmb/ldh011. [DOI] [PubMed] [Google Scholar]
- Coggon D, Barker DJP, Inskip H, Wield G. Housing in early life and later mortality. Journal of Epidemiology & Community Health. 1993;47:345–348. doi: 10.1136/jech.47.5.345. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Collins C, Williams DR. Segregation and mortality: the deadly effects of racism? Sociological Forum. 1999;14(3):495–523. [Google Scholar]
- Crowder K, South SJ. Spatial Dynamics of White Flight: The Effects of Local and Extralocal Racial Conditions on Neighborhood Out-Migration. American Sociological Review. 2008;73:792–812. doi: 10.1177/000312240807300505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cunningham JT. Newark. Newark, NJ: New Jersey Historical Society; 1966. [Google Scholar]
- DeBats DA. Political Consequences of Spatial Organization: Contrasting Patterns in Two Nineteenth-Century Small Cities. Social Science History. 2011;35(4):505–541. [Google Scholar]
- Dietz RD. The estimation of neighborhood effects in the social sciences: an interdisciplinary approach. Social Science Research. 2002;31:539–575. [Google Scholar]
- Diez-Roux AV. Multilevel Analysis in Public Health Research. Annual Review of Public Health. 2000;21(1):171–192. doi: 10.1146/annurev.publhealth.21.1.171. [DOI] [PubMed] [Google Scholar]
- Diggle PJ, Tawn JA, Moyeed RA. Model-Based Geostatistics. Applied Statistics. 1998;47(3):299–350.
- Downey L. Spatial measurement, geography, and urban racial inequality. Social Forces. 2003;81(3):937–952.
- Elliott P, Wartenberg D. Spatial Epidemiology: Current Approaches and Future Challenges. Environmental Health Perspectives. 2004;112(9):998–1006. doi: 10.1289/ehp.6735.
- Fang J, Madhavan S, Bosworth W, Alderman MH. Residential segregation and mortality in New York City. Social Science & Medicine. 1998;47(4):469–474. doi: 10.1016/s0277-9536(98)00128-2.
- Flowerdew R, Manley DJ, Sabel CE. Neighbourhood effects on health: does it matter where you draw the boundaries? Social Science & Medicine. 2008;66:1241–1255. doi: 10.1016/j.socscimed.2007.11.042.
- Frank LD, Andresen MA, Schmid TL. Obesity relationships with community design, physical activity, and time spent in cars. American Journal of Preventive Medicine. 2004;27(2):87–96. doi: 10.1016/j.amepre.2004.04.011.
- Frankenberg E, McKee D, Thomas D. Health Consequences of Forest Fires in Indonesia. Demography. 2005;42(1):109–129. doi: 10.1353/dem.2005.0004.
- Galishoff S. Newark: The Nation’s Unhealthiest City, 1832–1895. New Brunswick, NJ: Rutgers University Press; 1988.
- Gatrell AC, Bailey TC, Diggle PJ, Rowlingson BS. Spatial Point Pattern Analysis and Its Application in Geographical Epidemiology. Transactions of the Institute of British Geographers. 1996;21(1):256–274.
- Gelfand AE, Latimer A, Wu S, Silander JA Jr. Building statistical models to analyze species distributions. In: Clark JS, Gelfand AE, editors. Hierarchical Modelling for the Environmental Sciences: Statistical Methods and Applications. New York: Oxford University Press; 2006. pp. 77–97.
- Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press; 2007.
- Gelman A, Rubin DB. Inference from Iterative Simulation Using Multiple Sequences. Statistical Science. 1992;7(4):457–472.
- Grannis R. The importance of trivial streets: residential streets and residential segregation. The American Journal of Sociology. 1998;103(6):1530–1564.
- Guo JY, Bhat CR. Operationalizing the concept of neighborhood: application to residential location choice analysis. Journal of Transport Geography. 2007;15:31–45.
- Hays JN. Epidemics and Pandemics: Their Impacts on Human History. Santa Barbara, California: ABC-CLIO, Inc; 2005.
- Hearst MO, Oakes JM, Johnson PJ. The effect of racial residential segregation on black infant mortality. American Journal of Epidemiology. 2008;168(11):1247–1254. doi: 10.1093/aje/kwn291.
- Holford TR. The Analysis of Rates and of Survivorship Using Log-Linear Models. Biometrics. 1980;36(2):299–305.
- Jacquez GM, Greiling DA. Local clustering in breast, lung and colorectal cancer in Long Island, New York. International Journal of Health Geographics. 2003;2:3. doi: 10.1186/1476-072X-2-3.
- Kaplan DH, Holloway SR. Scaling ethnic segregation: causal processes and contingent outcomes in Chinese residential patterns. GeoJournal. 2001;53:59–70.
- Kramer MR, Cooper HL, Drews-Botsch CD, Waller LA, Hogue CR. Do measures matter? Comparing surface-density-derived and census-tract-derived measures of racial residential segregation. International Journal of Health Geographics. 2010;9:29. doi: 10.1186/1476-072X-9-29.
- Lee BA, Reardon SF, Firebaugh G, Farrell CR, Matthews SA, O’Sullivan D. Beyond the census tract: patterns and determinants of racial segregation at multiple geographic scales. American Sociological Review. 2008;73:766–791. doi: 10.1177/000312240807300504.
- Logan JR. Making a Place for Space: Spatial Thinking in Social Science. Annual Review of Sociology. 2012;38:507–524. doi: 10.1146/annurev-soc-071811-145531.
- Logan JR, Jindrich J, Shin H, Zhang W. Mapping America in 1880: The Urban Transition Historical GIS Project. Historical Methods. 2011;44(1):49–60. doi: 10.1080/01615440.2010.517509.
- Lunn DJ, Spiegelhalter DJ, Thomas A, Best N. The BUGS project: Evolution, critique and future directions. Statistics in Medicine. 2009;28(25):3049–3067. doi: 10.1002/sim.3680.
- Matthews SA. Spatial Polygamy and the Heterogeneity of Place: Studying People and Place via Egocentric Methods. In: Burton LM, Kemp SP, Leung M, Matthews SA, Takeuchi DT, editors. Communities, Neighborhoods, and Health: Expanding the Boundaries of Place. New York: Springer; 2011. pp. 35–55.
- Montgomery MR, Hewett PC. Urban poverty and health in developing countries: Household and neighborhood effects. Demography. 2005;42(3):397–425. doi: 10.1353/dem.2005.0020.
- Morenoff JD. Neighborhood Mechanisms and the Spatial Dynamics of Birth Weight. The American Journal of Sociology. 2003;108(5):976–1017. doi: 10.1086/374405.
- Openshaw S. The Modifiable Areal Unit Problem. Norwich: Geo Books; 1984.
- Powers DA, Xie Y. Statistical methods for categorical data analysis. Bingley, UK: Emerald; 2008.
- Rabe-Hesketh S, Skrondal A. Multilevel and Longitudinal Modeling Using Stata. College Station, TX: Stata Press; 2012.
- Rabin Y. The roots of segregation in the eighties: the role of local government actions. In: Tobin GA, editor. Divided Neighborhoods: Changing Patterns of Racial Segregation. Newbury Park, CA: Sage Publications; 1987. pp. 208–226.
- Read JG, Gorman BK. Gender and health inequality. Annual Review of Sociology. 2010;36:371–386.
- Reardon SF, O’Sullivan D. Measures of spatial segregation. Sociological Methodology. 2004;34:351–364.
- Reardon SF, Matthews SA, O’Sullivan D, Lee BA, Firebaugh G, Farrell CR, Bischoff K. The geographic scale of metropolitan racial segregation. Demography. 2008;45(3):489–515. doi: 10.1353/dem.0.0019.
- Reid A. Infant feeding and post-neonatal mortality in Derbyshire, England, in the early twentieth century. Population Studies. 2002;56:151–166. doi: 10.1080/00324720215926.
- Riva M, Gauvin L, Apparicio P, Brodeur JM. Disentangling the relative influence of built and socioeconomic environments on walking: the contribution of areas homogenous along exposures of interest. Social Science & Medicine. 2009;69:1296–1305. doi: 10.1016/j.socscimed.2009.07.019.
- Roberts EM. Neighborhood social environments and the distribution of low birth weight in Chicago. American Journal of Public Health. 1997;87:597–603. doi: 10.2105/ajph.87.4.597.
- Rothenberg R, Muth SQ, Malone S, Potterat JJ, Woodhouse DE. Social and geographic distance in HIV risk. Sexually Transmitted Diseases. 2005;32(8):506–512. doi: 10.1097/01.olq.0000161191.12026.ca.
- Rothenberg RB, Potterat JJ. Temporal and social aspects of gonorrhea transmission: the force of infectivity. Sexually Transmitted Diseases. 1988;15(2):88–92. doi: 10.1097/00007435-198804000-00004.
- Sampson RJ, Morenoff JD, Earls F. Beyond social capital: spatial dynamics of collective efficacy for children. American Sociological Review. 1999;64(5):633–660.
- Sampson RJ, Morenoff JD, Gannon-Rowley T. Assessing neighborhood effects: social processes and new directions in research. Annual Review of Sociology. 2002;28:443–478.
- Short SE, Linmao M, Wentao Y. Birth Planning and Sterilization in China. Population Studies. 2000;54(3):279–291. doi: 10.1080/713779090.
- Siqueira-Junior J, Maciel I, Barcellos C, Souza W, Carvalho M, Nascimento N, Oliveira R, Morais-Neto O, Martelli C. Spatial point analysis based on dengue surveys at household level in central Brazil. BMC Public Health. 2008;8(1):1–9. doi: 10.1186/1471-2458-8-361.
- Sobek M. Work, status, and income: men in the American occupational structure since the late nineteenth century. Social Science History. 1996;20:169–207.
- Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian Measures of Model Complexity and Fit. Journal of the Royal Statistical Society, Series B. 2002;64(4):583–639.
- Spielman S, Yoo E-h. The spatial dimensions of neighborhood effects. Social Science & Medicine. 2009;68:1098–1105. doi: 10.1016/j.socscimed.2008.12.048.
- Taeuber KE, Taeuber AF. Negroes in Cities. Chicago: Aldine; 1965.
- Theil H. Statistical Decomposition Analysis. Amsterdam: North-Holland Publishing Company; 1972.
- Thornton P, Olson S. Mortality in late nineteenth-century Montreal: Geographic pathways of contagion. Population Studies. 2011;65(2):157–181. doi: 10.1080/00324728.2011.571385.
- Tobler WR. A Computer Movie Simulating Urban Growth in the Detroit Region. Economic Geography. 1970;46(2):234–240.
- Weeks J, Stoler J, Hill A, Zvoleff A. Fertility in Context: Exploring Egocentric Neighborhoods in Accra. In: Weeks JR, Hill AG, Stoler J, editors. Spatial Inequalities. Springer Netherlands; 2013. pp. 159–177.
- Xie Z, Yan J. Kernel density estimation of traffic accidents in a network space. Computers, Environment and Urban Systems. 2008;32:396–406.
- Xu H, Short SE. Health Insurance Coverage Rates In 9 Provinces In China Doubled From 1997 To 2006, With A Dramatic Rural Upswing. Health Affairs. 2011;30(12):2419–2426. doi: 10.1377/hlthaff.2010.1291.
- Xu H, Short SE, Liu T. Dynamic relations between fast-food restaurant and body weight status: a longitudinal and multilevel analysis of Chinese adults. Journal of Epidemiology & Community Health. 2012. doi: 10.1136/jech-2012-201157.







