Author manuscript; available in PMC: 2017 Jul 31.
Published in final edited form as: Stat Med. 2013 Aug 29;32(30):5241–5259. doi: 10.1002/sim.5944

A multilevel model for spatially correlated binary data in the presence of misclassification: an application in oral health research

Timothy Mutsvari a, Dipankar Bandyopadhyay b, Dominique Declerck c, Emmanuel Lesaffre a,d,*
PMCID: PMC5535814  NIHMSID: NIHMS881238  PMID: 23996301

Abstract

Dental caries is a highly prevalent disease affecting the tooth’s hard tissues by acid-forming bacteria. The past and present caries status of a tooth is characterized by a response called caries experience (CE). Several epidemiological studies have explored risk factors for CE. However, the detection of CE is prone to misclassification because some cases are neither clearly carious nor noncarious, and this needs to be incorporated into the epidemiological models for CE data. From a dentist’s point of view, it is most appealing to analyze CE on the tooth’s surface, implying that the multilevel structure of the data (surface–tooth–mouth) needs to be taken into account. In addition, CE data are spatially referenced, that is, an active lesion on one surface may impact the decay process of the neighboring surfaces, and that might also influence the process of scoring CE. In this paper, we investigate two hypotheses: that is, (i) CE outcomes recorded at surface level are spatially associated; and (ii) the dental examiners exhibit some spatial behavior while scoring CE at surface level, by using a spatially referenced multilevel autologistic model, corrected for misclassification. These hypotheses were tested on the well-known Signal Tandmobiel® study on dental caries, and simulation studies were conducted to assess the effect of misclassification and strength of spatial dependence on the autologistic model parameters. Our results indicate a substantial spatial dependency in the examiners’ scoring behavior and also in the prevalence of CE at surface level.

Keywords: binary, autologistic, multilevel, spatial

1. Introduction

Oral health has improved tremendously over the last century, but the prevalence of dental caries in children remains a significant clinical problem that needs attention. Many epidemiological and clinical studies have explored the risk factors that influence dental caries. The most commonly used outcome that quantifies the present and past caries status of a tooth or tooth’s surface is caries experience (CE). However, the process of detecting CE is challenging, especially during the early stages of the disease. This leads to misclassified outcomes, resulting in possibly poor quality data. A lot of effort has been devoted to improving CE detection. For example, guidelines for conducting a dental examination were developed by the World Health Organization (WHO) [1] and the British Association for the Study of Community Dentistry (BASCD) [2], and more recently the International Caries Detection and Assessment System (ICDAS) [3] was proposed. All of these underline the need for training the examiners and measuring the reliability of the obtained scores. This is achieved by comparing the scores of dental examiners with the gold standard (or a benchmark scorer in the absence of a gold standard). However, for various reasons (see, for example, [4] and [5]), the process of CE detection remains prone to misclassification error even after adhering to these standardized criteria. Therefore, apart from the assessment of CE prevalence, the scoring process of CE needs further investigation and should be incorporated into epidemiological models.

In a dental study, data are recorded at surface level but can be analyzed at mouth, tooth, or surface level. However, from a dentist’s point of view, the tooth’s surface is most appealing, implying that the multilevel structure of the data (surface–tooth–mouth) needs to be taken into account. Indeed, the CE status of surfaces from the same tooth is more correlated than the CE status of surfaces from different teeth. Moreover, caries on one tooth surface may cause caries on the neighboring teeth, that is, there is spatial dependency in the CE status. Spatial modeling in dental caries has been studied by, for example, Zhang et al. [6] and Bandyopadhyay et al. [7]. In addition, the anatomical structure of a tooth or tooth surface (for example, pit, fissured, proximal, or smooth surfaces) influences the development of dental caries. In this paper, we wish to investigate the factors that influence the prevalence of CE while taking care of the spatial dependency of CE. We also investigate whether the scoring behavior of examiners is spatially induced, that is, whether the scoring is more similar for surfaces on the same tooth or adjacent surfaces than for surfaces further apart. To our knowledge, the spatial dependency of the scoring behavior of dental examiners has never been investigated before, nor has the scoring of multilevel CE data been explored jointly with spatial dependency. Analyzing this type of data, with its complex correlation structure, poses a serious statistical challenge for researchers in the dental field.

There is a vast amount of literature that addresses the analysis of spatial associations [8–11]. In a typical disease mapping framework, the unit of spatial association is the geographical entity, say county, and subjects are clustered within a particular county. In contrast, the spatial topology within a mouth is unique, because here the spatial association is manifested at the tooth or site level, clustered within a mouth/subject, and subjects are independent. Although spatial models are widely used with geographical data, their use in dental applications for binary data is scarce; see, for example, Bandyopadhyay et al. [7]. Under a Markov random field framework (where the conditional distribution of the spatial unit depends only on the outcomes of the specified set of neighbors), one of the most common approaches to deal with spatial data is the conditional autoregressive (CAR) model [11]. The CAR model has desirable properties; namely, one can implement the estimation by using full likelihood [10]. An alternative and plausible approach to deal with spatially correlated binary data (as in our case) is via the autologistic regression model; see, for example, Besag [8]. The autologistic model extends the usual logistic regression to account for spatial correlation. Here, one computes the sum of the outcomes for the neighbors of each site and includes the sum as a covariate in the model to capture the spatial effect. However, the major drawback of this approach is that the standard maximum likelihood estimation methods are untenable because of the intractable normalizing constant, except for small data sets [10]. We address this problem by using a pseudo-likelihood (PL) estimation technique [12], where only the autocovariate term is included in the regression model without the normalizing constant. This is a more flexible but approximate approach, which allows the inclusion of multilevel random effects. Also, standard software can be used to carry out the estimation. The PL approach leads to unbiased and consistent estimates that are asymptotically normal [11, 13]; however, the efficiency of the PL estimate decreases with increasing spatial dependence.

In this paper, we examine two spatial hypotheses on CE outcomes recorded at surface level. First, we explore the spatial dependence of CE on proximal surfaces (adjacent surfaces), and compare it with the spatial dependence of the nonproximal surfaces (on the same tooth). Next, we explore whether there is evidence of spatial referencing for scoring CE by dental examiners. More specifically, we compare the spatial dependency in the misclassification process for proximal and nonproximal surfaces. For both hypotheses, we adopt a Bayesian framework for our autologistic model where the PL is used as an approximation to the full likelihood. Our analysis considers the surface level of the tooth, thereby taking the full multilevel structure of CE data into account. For the first hypothesis, we fit the main model to the main cross-sectional data. For the second hypothesis, we adjust the autologistic model for misclassification and fit the misclassification model to the validation data. The main and validation data sets are obtained from the well-known Signal Tandmobiel® (ST) study, whereby children in the main study were examined by 16 dental examiners and also by a benchmark scorer in the validation study.

This paper is organized as follows. Section 2 provides a general description of the motivating data set and the study population. Statistical models for the main data (error-free) and validation data are outlined in Section 3. In Section 4, we present a model for the main data adjusted for misclassification. Details on the PL estimation technique appear in Section 5. Section 6 applies the methodology to the ST data set, while Section 7 presents a limited simulation study to investigate the impact of misclassification and spatial association on model parameters. Section 8 concludes with some further discussion.

2. Motivating data set

2.1. Main data set

The ST study is a longitudinal (1996–2001) oral health intervention project that took place in Flanders (north of Belgium). At the first examination, the average age of the children was 7.1 years (SD = 0.4). Sixteen trained dentists (examiners) conducted annual oral health examinations on 4468 children (2315 boys and 2153 girls) from 179 primary schools, after parental consent was obtained. Data on oral hygiene and dietary habits were obtained through structured questionnaires completed by the parents. The children received a clinical examination based on the diagnostic criteria for caries prevalence surveys published by the BASCD [2]. The clinical examinations took place in a mobile dental clinic, with a standard dental chair and artificial dental light. Detection was performed using a visual–tactile method by using a mouth mirror and a World Health Organization/Community Periodontal Index of Treatment Needs type E probe (Prima Instruments, Gloucester, UK). No radiographs were taken. For the analysis of the main data, we consider only the data collected at the last visit, that is, year 2001. The outcome of interest is binary CE (CE = 1 if the surface shows CE, and 0 otherwise) subject to misclassification.

2.2. Validation data set

Training sessions for scoring CE were organized and the scoring behavior measured by sensitivity (SE) and specificity (SP), of each of the 16 dental examiners was compared with that of the benchmark scorer, that is, the third author (Dominique Declerck) who was trained in BASCD scoring (with regular updates) and has extensive experience in scoring CE in children. Because scoring was carried out at d3 level (irreversible caries), her scores were assumed to be close to perfect. Nevertheless, we realize that a benchmark scorer is never without error, but we had to ignore this in our analysis in order not to further increase the complexity of our model. SE and SP measure the capability of the examiners in detecting CE or its absence, respectively. During the study period (1996–2001), three calibration exercises for scoring CE (1996, 1998, and 2000) involving 92, 32, and 24 children respectively were organized, vis-à-vis the benchmark scorer. In the present work, data from the three calibration exercises were combined into one validation data set. More details of the ST study are given in Vanobbergen et al. [14].

To analyze the spatial scoring behavior of the examiners, we defined a binary misclassification score W, with W = 1 if the benchmark scorer and the examiner disagree in the CE score and W = 0 otherwise. A statistical model is fitted later to determine the factors that influence this outcome while accounting for spatial dependency.

3. Statistical models for error-free responses

3.1. Spatial dependency of caries experience data

In this section, we introduce the metric for the spatial dependency structure that will be investigated in this paper. Our main interest for the main and validation data is to investigate the spatial dependency at surface level. Note first that a tooth consists of four to five surfaces. Four surfaces are found on incisors and canines: (i) buccal surface, which faces the front of the mouth; (ii) lingual surface, which faces the tongue side; and (iii) the in-between teeth surfaces, mesial and distal. The mesial side faces the middle of the mouth while the distal side faces the back. The molars and premolars have an extra occlusal surface, which is the surface that makes contact with the vertically opposing tooth. (Table I)

Table I. Tooth surfaces neighbor structure.

Neighbor surfaces for five-surfaced teeth
Surface Buccal Lingual Distal Mesial Occlusal Mesial(1) Distal(2)
Buccal
Lingual
Distal
Mesial
Occlusal

Neighbor surfaces for four-surfaced teeth
Surface Buccal Lingual Distal Mesial Mesial(1) Distal(2)
Buccal
Lingual
Distal
Mesial

✓ Indicates a neighbor.
(1) Indicates that the mesial surface on the next tooth is a neighbor.
(2) Indicates that the distal surface on the next tooth is a neighbor.
Indicates that a surface cannot be a neighbor to itself.

Motivated by the work of Leroy et al. [15], we distinguish two types of spatial dependency: one for proximal surfaces (distal and mesial) and the other for nonproximal surfaces (buccal, lingual, and occlusal). Figure 1 shows these two types of spatial dependency. In both models, we let γ1 represent the spatial dependency for the nonproximal surfaces belonging to the same tooth and γ2 represent the spatial dependency for the proximal surfaces, that is, adjacent mesial and distal surfaces.

Figure 1. Representation of the two types of spatial dependency for caries experience data.

3.2. The multilevel autologistic model for main data

The binary outcome variable $Y_{M,mts}$ equals one if the $s$th surface ($s = 1, \ldots, S_{tm}$) in tooth $t = 1, \ldots, T_m$ of subject $m = 1, \ldots, M$ has CE, and $Y_{M,mts} = 0$ otherwise. The subscript $M$ stands for ‘main data’. Note that, following [8], our proposition takes the most natural route to model spatially correlated binary data. We consider a generalized linear mixed model framework to model the probability of CE and incorporate the fixed effects, heterogeneous random effects and spatial effects via the autologistic specification, that is,

$$\mathrm{logit}(\pi_{mts}) = x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \gamma_{M,1}\sum_{s':\,s'\sim s} y_{mts'} + \gamma_{M,2}\sum_{\tilde{s}:\,\tilde{s}\simeq s} y_{mt\tilde{s}}, \qquad (1)$$

where $y_{mts'}$ is the observed outcome of surface $s'$, considered a neighbor to surface $s$ on the same tooth; $\sim$ refers to neighboring nonproximal surfaces and $\simeq$ refers to neighboring proximal surfaces; $y_{mt\tilde{s}}$ is the observed outcome of surface $\tilde{s}$, a neighbor of surface $s$ but from an adjacent tooth; $x_{mts}$ is a $d$-dimensional vector of covariates for surface $s$ within tooth $t$ in mouth $m$, with $\beta_M$ the corresponding vector of regression coefficients; and $\mathbf{u}_m = (u_m, u_{tm})'$ is the random effects vector, where $u_m$ and $u_{tm}$ are random intercepts at mouth and tooth (nested in mouth) level, distributed independently with mean zero and variances $\sigma_m^2$ and $\sigma_{tm}^2$, respectively, that is, $\mathbf{u}_m \sim N(0, D_M)$ where $D_M = \mathrm{diag}(\sigma_m^2, \sigma_{tm}^2)$, and $z_{mts}$ is the corresponding $q$-dimensional design matrix.

The probability of a surface $s$ belonging to tooth $t$ and mouth $m$ having CE is given by $\pi_{mts} = \Pr(Y_{M,mts} = 1 \mid y_{mts'}, y_{mt\tilde{s}}, \mathbf{u}_m, x_{mts}, \beta_M, \gamma_{M,1}, \gamma_{M,2})$. In this model, we specified two spatial parameters, $\gamma_{M,1}$ and $\gamma_{M,2}$, to determine the degree or strength of spatial correlation, that is, they explain how the CE status of a surface is influenced by the CE of neighboring surfaces. Specifically, $\gamma_{M,1}$ controls for the spatial association of surfaces within the same tooth, while $\gamma_{M,2}$ controls for the spatial dependency of surfaces on adjacent teeth on the same jaw.

The interpretation of the autologistic spatial parameters $\gamma_{M,1}$ and $\gamma_{M,2}$ is not straightforward. In the presence of positive spatial dependence, it can be shown that under the traditional autologistic parameterization the conditional expectation of $Y_{M,mts}$ given its neighbors always increases as long as neighbors have nonzero outcomes and never decreases. This is unreasonable if most of the neighbors are zeros and could bias the realizations toward 1. Hence, the interpretation of parameters is difficult across varying levels of statistical dependence. Because of this problem, Caragea and Kaiser [16] proposed a reparametrization of the autologistic regression in order to have interpretable parameters. This necessitates an adjustment, whereby $\sum_{s':\,s'\sim s} y_{mts'}$ and $\sum_{\tilde{s}:\,\tilde{s}\simeq s} y_{mt\tilde{s}}$ are substituted by $\sum_{s':\,s'\sim s} (y_{mts'} - \lambda_{mts'})$ and $\sum_{\tilde{s}:\,\tilde{s}\simeq s} (y_{mt\tilde{s}} - \lambda_{mt\tilde{s}})$, respectively, with $\lambda_{mts} = \exp(x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m) / (1 + \exp(x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m))$. Given this parameterization, the odds of $Y_{M,mts} = 1$ increase if the number of positive neighbor surfaces is greater than the number expected under the independence model and decrease if it is less than the number expected under the independence model.
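The difference between the traditional and the centered autocovariate can be illustrated with a minimal sketch. The function names, neighbor outcomes, and independence-model probabilities below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def plain_autocovariate(neighbor_outcomes):
    """Traditional autocovariate: the sum of the neighbors' binary outcomes."""
    return sum(neighbor_outcomes)


def centered_autocovariate(neighbor_outcomes, neighbor_lambdas):
    """Centered autocovariate: sum of (y - lambda) over the neighbors, where
    lambda is each neighbor's CE probability under the independence model."""
    return sum(y - lam for y, lam in zip(neighbor_outcomes, neighbor_lambdas))


# Hypothetical example: three same-tooth neighbors, one of which shows CE,
# each with an assumed independence-model probability of 0.5.
y_neighbors = [1, 0, 0]
lam_neighbors = [0.5, 0.5, 0.5]

plain = plain_autocovariate(y_neighbors)
centered = centered_autocovariate(y_neighbors, lam_neighbors)
```

In this toy case one positive neighbor is observed where 1.5 would be expected under independence, so the centered term is negative and a positive spatial parameter lowers the odds, whereas the plain sum can only raise them.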

3.3. The multilevel autologistic model for validation data

Let $Y_{V,mts}$ and $Y^*_{V,mts}$ be the binary CE outcome variables in the validation data for surface $s = 1, \ldots, S_{tm}$ in tooth $t = 1, \ldots, T_m$ of subject $m = 1, \ldots, M$ according to the examiner and the benchmark scorer, respectively. The subscript $V$ stands for ‘validation data’. Let $W_{mts}$ be the binary outcome variable for the same surface, with $W_{mts} = 1$ if the examiner and the benchmark scorer score the surface differently for CE and $W_{mts} = 0$ otherwise, that is, $W_{mts} = 1$ if ($Y_{V,mts} = 0$ and $Y^*_{V,mts} = 1$) or ($Y_{V,mts} = 1$ and $Y^*_{V,mts} = 0$), and 0 otherwise. Then, the autologistic model that relates the discrepancy in scoring between the examiner and the benchmark scorer to determinants is given by

$$\mathrm{logit}(\mu_{mts}) = x'_{mts}\beta_V + z'_{mts}\mathbf{u}_m + \gamma_{V,1}\sum_{s':\,s'\sim s} w_{mts'} + \gamma_{V,2}\sum_{\tilde{s}:\,\tilde{s}\simeq s} w_{mt\tilde{s}},$$

where $\mu_{mts} = \Pr(W_{mts} = 1)$ is the probability of a discrepancy (between examiner and benchmark scorer) in scoring CE for the $m$th mouth, $t$th tooth, and $s$th surface, $w_{mts'}$ is the discrepancy outcome on the neighboring surface $s'$ (on the same tooth), and $w_{mt\tilde{s}}$ is the outcome of surface $\tilde{s}$ (on the adjacent tooth). The vector $\beta_V$ contains the fixed effects regression coefficients, $\gamma_{V,1}$ measures the degree of spatial dependency in scoring the CE status of surfaces of the same tooth, and $\gamma_{V,2}$ measures the degree of spatial dependency in misclassifying the CE status of a proximal surface, that is, a mesial (distal) surface and the adjacent distal (mesial) surface. As in the main model, we denote the covariance matrix of the random effects $\mathbf{u}_m$ by $D_V$.

4. Correction for misclassification

In Section 3.2, we presented a statistical model relating the score of the gold standard (or benchmark scorer) to covariates, taking into account the spatial structure of the data. However, in the main data set the dental examiners scored the children, and their scores are likely to be prone to error. Ignoring in the statistical analysis that the binary CE in the main data is prone to misclassification might lead to wrong conclusions. In this section, we present an approach for correcting the autologistic regression model for misclassification error. We denote by $Y_{M,mts}$ the binary CE score obtained from the examiner(s) for surface $s = 1, \ldots, S_{tm}$ in tooth $t = 1, \ldots, T_m$ of subject $m = 1, \ldots, M$, with $Y_{M,mts} = 1$ if the dental examiner noticed CE and 0 otherwise. In the main data set, the true score $Y^*_{M,mts}$ (from the benchmark scorer) is not available. However, the SE and SP values obtained from the validation data can be used to correct for misclassification. Note that, to reduce the complexity of the model, we assumed in this paper that all dental examiners scored similarly. However, there is no theoretical difficulty (only a computational burden in view of the large data set) in extending the model to take into account 16 different SE and SP estimates. Given $Y^*_{V,mts}$, the true (benchmark) CE score from the validation data, we estimate the correction terms, that is, $\tau_{11} = \Pr(Y_{V,mts} = 1 \mid Y^*_{V,mts} = 1)$ and $\tau_{00} = \Pr(Y_{V,mts} = 0 \mid Y^*_{V,mts} = 0)$, which are the SE and SP, respectively. The corrected autologistic regression model is then given by

$$\Pr(Y_{M,mts} = 1 \mid \ldots) = \sum_{y} \Pr(Y_{M,mts} = 1 \mid Y^*_{M,mts} = y)\,\Pr(Y^*_{M,mts} = y \mid \ldots) = (1 - \tau_{00}) + (\tau_{11} + \tau_{00} - 1)\,g^{-1}(\eta), \qquad (2)$$

where $g$ is a link function, $\eta = x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \gamma_{M,1}\sum_{s':\,s'\sim s} y_{mts'} + \gamma_{M,2}\sum_{\tilde{s}:\,\tilde{s}\simeq s} y_{mt\tilde{s}}$, and ‘…’ represents $\beta_M$, $x_{mts}$, $D_M$, $\tau_{00}$, $\tau_{11}$, $\gamma_{M,1}$ and $\gamma_{M,2}$. Note that we did not use a misclassification spatial model for correction because it is not compatible with the correction terms needed in the main model. In the next section, we give the details of the PL estimation approach.
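The correction in Equation (2) can be sketched numerically as follows. This is a minimal illustration assuming a logit link; the function names and the values of the linear predictor, SE, and SP are hypothetical.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def observed_ce_probability(eta, se, sp):
    """Probability that the examiner scores CE, given the linear predictor eta,
    sensitivity se (tau_11) and specificity sp (tau_00), under a logit link."""
    p_true = inv_logit(eta)
    return (1.0 - sp) + (se + sp - 1.0) * p_true


# Hypothetical values, for illustration only.
p_obs = observed_ce_probability(eta=0.0, se=0.8, sp=0.9)
```

With perfect scoring (SE = SP = 1) the expression reduces to the uncorrected probability, which is a quick sanity check on the formula.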

5. The pseudo-likelihood approach

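The full likelihood for model (1), that is, the joint probability distribution for all binary outcomes $y_{mts}$, is known only up to a normalization constant [17]. The contribution to the likelihood of the $m$th mouth and $t$th tooth is given by

$$L(\beta_M, \gamma_M, \mathbf{u}_m \mid y_{mt}) = \frac{\exp\left(\sum_{s} y_{mts}\left[x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \frac{\gamma_{M,1}}{2}\sum_{s'} y_{mts'} + \frac{\gamma_{M,2}}{2}\sum_{\tilde{s}} y_{mt\tilde{s}}\right]\right)}{k(\beta_M, \gamma_M, \mathbf{u}_m)},$$

where $y_{mt}$ is the vector of responses for the $t$th tooth in the $m$th mouth, and $\gamma'_M = (\gamma_{M,1}, \gamma_{M,2})$ is the vector of the spatial parameters. The constant

$$k(\beta_M, \gamma_M, \mathbf{u}_m) = \sum_{\{y_{mts}\}} \exp\left(\sum_{s} y_{mts}\left[x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \frac{\gamma_{M,1}}{2}\sum_{s'} y_{mts'} + \frac{\gamma_{M,2}}{2}\sum_{\tilde{s}} y_{mt\tilde{s}}\right]\right)$$

involves $\sum_{\{y_{mts}\}}$, which is a sum over all possible realizations of the binary outcomes. This is the source of difficulty in fitting the autologistic model, because it is an analytically intractable function of the regression parameters for large data sets. For a data set with 1000 observations, the full likelihood evaluation would require summing over $2^{1000} \approx 1.0715 \times 10^{301}$ possible cases.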
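To make the combinatorial cost concrete, the normalizing constant can be computed by brute force for a very small example. The sketch below assumes a single spatial parameter, a hypothetical ring-shaped neighbor structure of four surfaces (not the structure of Table I), and folds the fixed and random effects into a per-surface term `alphas`; all names are illustrative.

```python
import itertools
import math


def normalizing_constant(alphas, neighbors, gamma):
    """Sum exp(sum_s y_s * (alpha_s + gamma/2 * sum of neighbor outcomes))
    over all 2**n binary configurations of n surfaces.
    Returns the constant and the number of configurations visited."""
    n = len(alphas)
    total = 0.0
    n_configs = 0
    for y in itertools.product((0, 1), repeat=n):
        exponent = sum(
            y[s] * (alphas[s] + 0.5 * gamma * sum(y[j] for j in neighbors[s]))
            for s in range(n)
        )
        total += math.exp(exponent)
        n_configs += 1
    return total, n_configs


# Hypothetical toy tooth: 4 surfaces in a ring, each with two neighbors.
alphas = [0.0, 0.0, 0.0, 0.0]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
k, n_configs = normalizing_constant(alphas, neighbors, gamma=0.0)
```

Four surfaces already require 16 terms, and each additional surface doubles the count, which is why this enumeration is only feasible for very small data sets.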

Certainly, a CAR specification [18] of the spatial terms under Markov random field assumptions would lead to a tractable likelihood. The CAR model is ideal and popular in a disease mapping [19] context but is based on Gaussian assumptions that can be unrealistic in the context of its latent variable formulations to handle categorical data. As mentioned earlier in Section 3.2, our autologistic proposition is in the spirit of the celebrated 1972 paper by Besag [8] for discrete (binary) spatial data and directly addresses spatial association by estimating how much the response variable at any one site reflects response values at surrounding sites [20]. In addition, initial attempts to fit the CAR model were unsuccessful in the sense that convergence was not achieved for some of the model parameters, for example, for the variance of the spatial random effect. Hence, we proceeded with our analysis by using the autologistic specification.

Several techniques have been suggested to address the issue of intractable normalizing constants; see, for example, the Huffer and Wu approach [17], path sampling [21], the Møller et al. algorithm [22], and so on. However, adapting these techniques to a multilevel data structure with various types of spatial associations manifested via $\gamma_{M,1}$ and $\gamma_{M,2}$, and a large number of observed data points as in our case, would involve formidable computations. In addition, implementing such techniques in standard Bayesian statistical software such as WinBUGS is infeasible. In contrast, the Bayesian approach based on PL is straightforward, and does not rely on the normal approximations or nonlinear maximization steps involved in the previously suggested methods. Besag [9] proposed the PL approximation in which the autocovariate terms ($\sum_{s'} y_{mts'}$ and $\sum_{\tilde{s}} y_{mt\tilde{s}}$) are included in the regression, but the likelihood function is constructed as if the responses were independent. Hereby, standard software can be used. The PL for the CE responses is then given by

$$PL(\beta_M, \gamma_M, \mathbf{u}_M \mid y_M) = \prod_{m,t,s} \Pr(y_{mts} = 1 \mid y_{mts'}, y_{mt\tilde{s}}, \mathbf{u}_m, x_{mts}, \beta_M, \gamma_M) = \prod_{m=1}^{M}\prod_{t=1}^{T_m}\prod_{s=1}^{S_{tm}} \frac{\exp\left(x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \gamma_{M,1}\sum_{s'} y_{mts'} + \gamma_{M,2}\sum_{\tilde{s}} y_{mt\tilde{s}}\right)}{1 + \exp\left(x'_{mts}\beta_M + z'_{mts}\mathbf{u}_m + \gamma_{M,1}\sum_{s'} y_{mts'} + \gamma_{M,2}\sum_{\tilde{s}} y_{mt\tilde{s}}\right)}, \qquad (3)$$

with yM representing all CE outcomes. The evaluation of the marginal PL for the mth mouth involves integrating out the random effects um, and this is accomplished via MCMC techniques [23].
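A minimal sketch of a log pseudo-likelihood evaluation is given below. It assumes the usual Bernoulli form, in which a surface with outcome y contributes exp(yη)/(1 + exp(η)), so that the term displayed in Equation (3) corresponds to y = 1. The fixed and random effects are folded into a per-surface term `alpha`, and the neighbor dictionaries and data are hypothetical.

```python
import math


def log_pseudo_likelihood(y, alpha, nb_same, nb_adjacent, gamma1, gamma2):
    """Sum of conditional Bernoulli log-probabilities, one per surface.

    y           : list of 0/1 outcomes
    alpha       : per-surface linear predictor without the spatial terms
    nb_same     : dict, surface index -> same-tooth neighbor indices
    nb_adjacent : dict, surface index -> adjacent-tooth neighbor indices
    """
    total = 0.0
    for s, y_s in enumerate(y):
        eta = (alpha[s]
               + gamma1 * sum(y[j] for j in nb_same.get(s, []))
               + gamma2 * sum(y[j] for j in nb_adjacent.get(s, [])))
        # log of exp(y * eta) / (1 + exp(eta))
        total += y_s * eta - math.log1p(math.exp(eta))
    return total


# Hypothetical data: surfaces 0 and 1 share a tooth; surface 2 sits on the
# adjacent tooth, next to surface 1.
y = [1, 1, 0]
alpha = [0.0, 0.0, 0.0]
nb_same = {0: [1], 1: [0]}
nb_adjacent = {1: [2], 2: [1]}
lpl = log_pseudo_likelihood(y, alpha, nb_same, nb_adjacent, gamma1=0.0, gamma2=0.0)
```

Each evaluation costs one pass over the surfaces, in contrast to the exponential cost of the full likelihood, which is what makes the approach usable in standard software.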

5.1. Bayesian inference

We adopt a Bayesian paradigm for parameter estimation in the models for validation data and the main data. By using this approach, we can incorporate prior knowledge about the parameters combined with the observed data to yield a posterior distribution of the parameters by using the MCMC approach [23]. For the main model corrected for misclassification, we opted to process the main data and the validation data simultaneously, that is, at each iteration of the Markov chain of the validation data, we obtained estimates of τ11 and τ00, and these estimates were imputed into the Markov chain pertaining to the main data as priors. This presents a simple way to take into account the variability with which τ11 and τ00 are estimated from the validation data. In the next subsections, we provide further details on the choice of our prior distributions and the related posterior distribution.

5.1.1. Prior distributions

For both the main and misclassification models, we chose non-informative priors for the model parameters. For the fixed effects $\beta_M$ and $\beta_V$, vague independent normal priors were assumed, that is, $\beta_i \sim N(0, 10^6)$, $i = 0, \ldots, p$. Independent Uniform(0, 100) priors were assumed for the standard deviations $\sigma_m$ and $\sigma_{tm}$ of the random effects [24]. Note that the priors of $D_M$ and $D_V$ are determined by the priors of $\sigma_m$ and $\sigma_{tm}$. For the spatial effects $\gamma_{M,1}$ and $\gamma_{M,2}$, a prior belief of positive (spatial) association led us to take U(0, 3) priors. Independent Beta(1, 1) priors were taken for the misclassification probabilities $\tau_{11}$ and $\tau_{00}$.

5.1.2. Posterior distribution

Denoting the prior distributions on $\beta_M$, $\gamma_M$, $D_M$ (the variance–covariance matrix) by $p(\beta_M)$, $p(\gamma_M)$ and $p(D_M)$, respectively, the joint posterior distribution for the main model (assuming error-free responses) is $p(\beta_M, \gamma_M, D_M \mid Y_M) \propto PL(\beta_M, \gamma_M, D_M \mid Y_M)\,p(\beta_M)\,p(\gamma_M)\,p(D_M)$. For the misclassification model ($M$ replaced by $V$), the joint posterior distribution becomes $p(\beta_V, \gamma_V, D_V \mid Y_V, Y^*_V) \propto PL(\beta_V, \gamma_V, D_V \mid Y_V, Y^*_V)\,p(\beta_V)\,p(\gamma_V)\,p(D_V)$, where $Y_V$ and $Y^*_V$ are vectors of misclassification responses in the validation data obtained from the examiner and benchmark scorer, respectively.

We now look at the posterior distribution for the main corrected model (2). Note that we assumed that all dental examiners score similarly. This is an assumption that can be relaxed, but given the complexity of the suggested model, we preferred to make this simplifying assumption. For given values of SE and SP, the analysis of the cross-sectional main data yields $p(\beta_M, D_M, \gamma_M \mid Y_M, Y_V, Y^*_V, \tau_{00}, \tau_{11})$, that is, the posterior estimates obtained are conditioned on the imputed $\tau_{00}$ and $\tau_{11}$, where $Y_M$ is a vector of error-prone CE responses (from examiners) in the main data. Specifically, the prior distributions of $\tau_{00}$ and $\tau_{11}$, that is, $p(\tau_{00})$ and $p(\tau_{11})$, yield the following posterior distributions:

$$\tau_{10} \sim \mathrm{Beta}(n_{10} + 1,\, n_{00} + 1), \qquad \tau_{01} \sim \mathrm{Beta}(n_{01} + 1,\, n_{11} + 1),$$

where $\tau_{00} = 1 - \tau_{10}$ and $\tau_{11} = 1 - \tau_{01}$, and $n_{ij}$ is the frequency of scoring ‘$i$’ for CE in the validation data by the examiner when the benchmark scorer scores ‘$j$’. This posterior distribution is imputed into the model for the main data to correct for misclassification. Given the posterior estimates of the misclassification probabilities $\tau = (\tau_{10}, \tau_{01})$ from the validation data, the pseudo-likelihood Equation (3) and the prior information for the parameters in the main corrected model, the posterior distribution of the main model is given by

$$p(\beta_M, \gamma_M, D_M \mid Y_M, Y_V, Y^*_V, \tau_{00}, \tau_{11}) \propto PL(\beta_M, \gamma_M, D_M \mid Y_M, Y_V, Y^*_V)\,p(\beta_M)\,p(\gamma_M)\,p(D_M)\,p(\tau).$$

Note that the posterior estimates of misclassification probabilities are used as priors in the main corrected model and the process is carried out simultaneously.
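The Beta posteriors for the misclassification probabilities can be sampled directly, as in the following sketch. The validation counts are hypothetical and serve only to illustrate the mechanics; in the paper the draws are fed into the main-data chain within WinBUGS rather than generated separately.

```python
import random


def misclassification_posterior_draws(n00, n01, n10, n11, n_draws, seed=0):
    """Draw (sensitivity, specificity) pairs from the Beta posteriors implied
    by Beta(1, 1) priors; n_ij counts examiner score i when the benchmark
    scorer scored j."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        tau10 = rng.betavariate(n10 + 1, n00 + 1)  # false-positive probability
        tau01 = rng.betavariate(n01 + 1, n11 + 1)  # false-negative probability
        draws.append((1.0 - tau01, 1.0 - tau10))   # (tau_11 = SE, tau_00 = SP)
    return draws


# Hypothetical validation counts, for illustration only.
draws = misclassification_posterior_draws(n00=900, n01=20, n10=30, n11=50, n_draws=2000)
mean_se = sum(se for se, _ in draws) / len(draws)
mean_sp = sum(sp for _, sp in draws) / len(draws)
```

Passing each draw of SE and SP on to the main model, rather than a single point estimate, is what carries the uncertainty of the validation data into the corrected analysis.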

5.2. Model selection and assessment

The Bayesian toolbox offers various model selection criteria. One of the most popular is the deviance information criterion (DIC) [25], which is based on the deviance $D(\Theta) = -2\log p(\text{data} \mid \Theta)$, where $\Theta$ is the vector of model parameters. DIC is defined as $\mathrm{DIC} = \bar{D} + p_D$, that is, the posterior mean deviance $\bar{D}$ plus the ‘effective number of parameters’ $p_D = \bar{D} - D(\bar{\Theta})$, whereby $D(\bar{\Theta})$ is the deviance evaluated at the posterior mean of the parameters. Models with smaller DIC are better supported by the data. Note that, in our proposition, the PL approximates the true likelihood; hence, the reported DIC will be an approximate/pseudo-DIC. Because of our PL route, we wanted to avoid using Bayes factors [26], which are fundamentally based on full-likelihood computations.
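The DIC computation can be sketched as follows, using the standard definition (posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean). The deviance draws are hypothetical; under the PL approach they would be pseudo-deviances.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD, mean deviance) from MCMC deviance draws."""
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d, d_bar


# Hypothetical deviance draws, for illustration only.
value, p_d, d_bar = dic([102.0, 98.0, 100.0], deviance_at_posterior_mean=96.0)
```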

To strengthen our model comparisons, we also explored some cross-validatory approaches based solely on the predictive performance of competing models. If $y^{pr}$ denotes the predictive data vector and $y$ denotes the observed data vector, then the posterior predictive distribution is given as $p(y^{pr} \mid y) = \int p(y^{pr} \mid \Theta)\,p(\Theta \mid y)\,d\Theta$. One can obtain predictive data easily from a converged posterior sample, and samples from the posterior predictive distribution are replicates of the observed data generated by the model. Cross-validation algorithms [27] involve fitting the model to a randomly selected subset of the data, predicting the remaining data, and finally comparing the error in this prediction (misclassification rates for our binary response) across alternative models. The model with the lowest estimated predictive error is chosen. In our example, our cross-validated measure randomly removes 10% of the data and monitors the total predictive error of incorrectly classified deleted binary responses. This is given as $CV_1 = \sum\{y(1 - y^{pr})\}$ and $CV_0 = \sum\{(1 - y)\,y^{pr}\}$, where $y$ is the observed response, $y^{pr}$ is the predicted response, and the summations are taken over the respective deleted observations. Under a Bayesian paradigm, this measure is obtained as a posterior mean of the total predictive errors, computed as $E(CV_0)$ and $E(CV_1)$, where the expectation is over all other parameters and data.
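The two cross-validation error counts can be computed directly from observed and predicted binary responses, as in this minimal sketch with hypothetical held-out data.

```python
def cv_errors(y_obs, y_pred):
    """CV1 counts observed 1s predicted as 0; CV0 counts observed 0s predicted as 1."""
    cv1 = sum(y * (1 - yp) for y, yp in zip(y_obs, y_pred))
    cv0 = sum((1 - y) * yp for y, yp in zip(y_obs, y_pred))
    return cv0, cv1


# Hypothetical held-out responses and one set of posterior predictive draws.
y_obs = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0]
cv0, cv1 = cv_errors(y_obs, y_pred)
```

In a Bayesian fit these counts would be evaluated at every MCMC iteration and averaged to give the posterior means of the two error counts.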

We note, however, that the previous model comparison measures do not make sense for the evaluation of the corrected (for misclassification) models, because these corrected models provide parameter estimates on the basis of the (unobserved) true responses. Evidently, the fit of these models to the observed (but possibly corrupted) responses must be worse than that of models that aim to predict the observed responses.

After model comparison, the goodness-of-fit of the best model is assessed with a criterion based on the posterior predictive distribution. The discrepancy between model and data is summarized by a statistic [28] that uses the model parameters and the data, defined as $T(y, \Theta) = \sum \frac{(y - E\{y \mid \Theta\})^2}{\mathrm{Var}\{y \mid \Theta\}}$, where the summation is taken over all observations. The Bayesian p-value/posterior predictive p-value [28] is then defined as the proportion of the $L$ simulated draws in which $T(y^{pr}, \Theta)$ exceeds $T(y, \Theta)$. A very large p-value (> 0.95) or a very small one (< 0.05) signals model misspecification, that is, the observed pattern would be unlikely to be seen in replications of the data under the true model [28]. In a similar spirit as explained previously, we computed the Bayesian p-value for the best (uncorrected) model.
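A minimal sketch of the posterior predictive check for binary data follows. It assumes Bernoulli responses, so that the mean is p and the variance is p(1 − p); the function names, observed data, and fitted probabilities are hypothetical.

```python
import random


def discrepancy(y, p):
    """Chi-square-type discrepancy: sum of (y - E[y])^2 / Var[y] for Bernoulli y."""
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, p))


def bayesian_p_value(y_obs, prob_draws, seed=0):
    """Fraction of posterior draws in which the replicated discrepancy exceeds
    the observed one. prob_draws holds one list of fitted probabilities per
    posterior draw."""
    rng = random.Random(seed)
    exceed = 0
    for p in prob_draws:
        y_rep = [1 if rng.random() < pi else 0 for pi in p]
        if discrepancy(y_rep, p) > discrepancy(y_obs, p):
            exceed += 1
    return exceed / len(prob_draws)


# Hypothetical observed data and constant fitted probabilities over 500 draws.
y_obs = [0, 0, 1, 0, 1, 0, 0, 0]
prob_draws = [[0.3] * len(y_obs) for _ in range(500)]
p_val = bayesian_p_value(y_obs, prob_draws)
```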

6. Application to the Signal Tandmobiel® study

In this section, we summarize the results obtained from the ST main and validation study. The analysis was carried out in WinBUGS 1.4.3 [29]. For both models, we ran three MCMC chains, each for 10,000 iterations with a burn-in period of 2000. The MCMC chains showed adequate mixing and convergence, as revealed by the trace plots, ACF plots and the Gelman–Rubin potential scale-reduction factor ($\hat{R}$) [28], which was less than 1.1 for all the parameters, with a Monte Carlo error less than 5% of the posterior standard deviation (see, e.g., [23]).
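The potential scale-reduction factor used as a convergence check can be sketched as follows. This is the basic between-chain versus within-chain variance form for a single parameter, not necessarily the exact variant reported by WinBUGS, and the chains shown are hypothetical.

```python
def potential_scale_reduction(chains):
    """Gelman-Rubin R-hat for one parameter from several equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance of the chain means, scaled by chain length.
    between = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    # Average within-chain sample variance.
    within = sum(
        sum((x - cm) ** 2 for x in c) / (n - 1)
        for c, cm in zip(chains, chain_means)
    ) / m
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


# Hypothetical chains: two that disagree strongly, and three identical ones.
r_bad = potential_scale_reduction([[0, 1, 0, 1], [10, 11, 10, 11]])
r_same = potential_scale_reduction([[1, 2, 3, 4]] * 3)
```

Chains that explore different regions give a value well above the 1.1 threshold, while chains that agree give a value close to (or slightly below) 1.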

Table II presents the comparison among the three competing models fitted for the main data by using the DIC, CV0, and CV1. All the model selection tools (DIC, CV0, and CV1) favored the multilevel autologistic as the best model. This motivated the extension of the multilevel autologistic model to correct for misclassification, and henceforth we discuss parameter estimates on the basis of this model. The Bayesian p-value for the multilevel autologistic model was 0.574, and so we do not suspect any inadequacy of model fit.

Table II.

Model comparison by using deviance information criterion (DIC), cross-validation and L-measure for the main data.

| Model | DIC | CV0 | CV1 |
|---|---|---|---|
| Logistic | 2402.800 | 29.827 | 31.518 |
| Multilevel logistic | 2517.012 | 27.534 | 28.071 |
| Multilevel autologistic | 2017.613 | 24.783 | 18.214 |
| Corrected multilevel autologistic | 1979.312 | 24.551 | 28.915 |

6.1. Analysis of the main data

Four models were fitted to the main data: that is, (i) logistic regression model that assumes that the data are independent; (ii) multilevel model that assumes that the data are dependent without considering the spatial random effects; (iii) multilevel autologistic model that incorporates unstructured random effects as well as spatial random effects; and (iv) the multilevel autologistic model in (iii) but corrected for misclassification.

For all the previously mentioned models, the following covariates were considered: gender (girls versus boys), dentition type (permanent versus deciduous), tooth type (canine versus incisor, molar or premolar versus incisor), jaw (mandible versus maxilla), quadrant (left versus right), age of the child, and geographical location (represented by the standardized (x, y) coordinates of the municipality of the school to which the child belongs). The latter covariate was included because it turned out to be a significant predictor in previous analyses; see, for example, Mwalili et al. [30]. The reason for its importance is, however, still not clear. Table III shows the posterior means, standard deviations, and 95% equal tail credible intervals (CIs) for the fixed effects (in log-odds), random effects, and spatial parameters obtained for models (i), (ii), and (iii). Generally, the three models led to similar conclusions regarding statistical significance.

Table III.

Parameter estimates, standard deviation and 95% equal tail credible intervals (CIs) for standard logistic, multilevel, and multilevel autologistic models for the main data.

| Parameter | Standard logistic: estimate (SD) | Standard logistic: 95% CI | Multilevel: estimate (SD) | Multilevel: 95% CI | Multilevel autologistic: estimate (SD) | Multilevel autologistic: 95% CI |
|---|---|---|---|---|---|---|
| FIXED EFFECTS | | | | | | |
| Intercept | −2.519 (0.136) | [−2.797; −2.259] | −3.997 (0.417) | [−4.827; −3.182] | −3.541 (0.249) | [−4.047; −3.073] |
| Gender: girls (reference: boys) | 0.099 (0.124) | [−0.141; 0.341] | 0.412 (0.431) | [−0.408; 1.276] | 0.208 (0.253) | [−0.274; 0.723] |
| Age | −0.191 (0.067) | [−0.317; −0.058] | −0.249 (0.218) | [−0.691; 0.176] | −0.151 (0.130) | [−0.413; 0.101] |
| Dentition type: permanent (reference: deciduous) | −1.823 (0.125) | [−2.073; −1.577] | −2.611 (0.240) | [−3.101; −2.155] | −1.474 (0.172) | [−1.817; −1.145] |
| Jaw: upper (reference: lower) | 0.089 (0.126) | [−0.153; 0.338] | 0.239 (0.190) | [−0.121; 0.617] | 0.170 (0.136) | [−0.095; 0.435] |
| Quadrant: right (reference: left) | 0.429 (0.120) | [0.197; 0.663] | 0.594 (0.191) | [0.222; 0.974] | 0.335 (0.135) | [0.072; 0.600] |
| Geographical location: x-coordinate | −0.010 (0.062) | [−0.131; 0.111] | 0.114 (0.229) | [−0.327; 0.573] | 0.054 (0.133) | [−0.202; 0.328] |
| Geographical location: y-coordinate | −0.150 (0.062) | [−0.271; −0.028] | −0.178 (0.206) | [−0.578; 0.230] | −0.123 (0.125) | [−0.369; 0.127] |
| SPATIAL EFFECTS | | | | | | |
| γ1 | | | | | 1.100 (0.106) | [0.892; 1.302] |
| γ2 | | | | | 1.202 (0.323) | [0.553; 1.821] |
| RANDOM EFFECTS | | | | | | |
| σm² | | | 2.811 (0.811) | [1.585; 4.679] | 0.815 (0.324) | [0.328; 1.583] |
| σtm² | | | 2.312 (0.466) | [1.485; 3.308] | 0.024 (0.065) | [0.001; 0.138] |
| σe² | | | 0.099 (0.352) | [0.000; 1.157] | 0.066 (0.130) | [0.000; 0.371] |

6.2. Analysis of the validation data

Three models were fitted to the validation data: (i) a logistic regression model that assumes that the data are independent; (ii) a multilevel model that assumes that the data are dependent without considering the spatial random effects; and (iii) a multilevel autologistic model that incorporates unstructured random effects as well as spatial random effects. The covariates considered were gender (girls versus boys), dentition type (permanent versus deciduous), tooth type (canine versus incisor, molar or premolar versus incisor), jaw (mandible versus maxilla), and quadrant (left versus right). Table IV shows the posterior means, standard deviations, and 95% equal tail CIs for the fixed effects (in log-odds), variance of random effects, and spatial parameters obtained from the validation data.

Table IV.

Parameter estimates, standard deviation, and 95% equal tail credible intervals (CIs) for standard logistic, multilevel, and multilevel autologistic models for the validation data.

| Parameter | Standard logistic: estimate (SD) | Standard logistic: 95% CI | Multilevel: estimate (SD) | Multilevel: 95% CI | Multilevel autologistic: estimate (SD) | Multilevel autologistic: 95% CI |
|---|---|---|---|---|---|---|
| FIXED EFFECTS | | | | | | |
| Intercept | −4.604 (0.229) | [−5.089; −4.172] | −7.595 (0.577) | [−8.783; −6.562] | −7.381 (0.515) | [−8.410; −6.409] |
| Gender: girls (reference: boys) | −0.549 (0.087) | [−0.720; −0.373] | −0.762 (0.328) | [−1.412; −0.130] | −0.723 (0.294) | [−1.312; −0.150] |
| Dentition type: permanent (reference: deciduous) | −0.769 (0.093) | [−0.951; −0.592] | −1.036 (0.235) | [−1.486; −0.582] | −0.933 (0.228) | [−1.383; −0.491] |
| Tooth type: canine (reference: incisor) | 0.960 (0.253) | [0.478; 1.476] | 1.371 (0.494) | [0.448; 2.403] | 1.292 (0.460) | [0.409; 2.207] |
| Tooth type: pre(molar) (reference: incisor) | 2.017 (0.215) | [1.625; 2.461] | 3.219 (0.429) | [2.449; 4.148] | 2.879 (0.390) | [2.124; 3.669] |
| Jaw: upper (reference: lower) | −0.182 (0.086) | [−0.341; −0.010] | −0.308 (0.202) | [−0.701; 0.074] | −0.241 (0.192) | [−0.618; 0.136] |
| Quadrant: right (reference: left) | −0.246 (0.100) | [−0.435; −0.051] | −0.267 (0.237) | [−0.730; 0.185] | −0.239 (0.230) | [−0.692; 0.213] |
| RANDOM EFFECTS | | | | | | |
| σm² | | | 1.665 (0.471) | [0.902; 2.722] | 1.218 (0.386) | [0.589; 2.092] |
| σtm² | | | 4.538 (0.605) | [3.466; 5.859] | 4.103 (0.551) | [3.114; 5.284] |
| DIC | 5148.213 | | 4229.123 | | 3892.617 | |

DIC, deviance information criterion.

There was a consensus for the three models regarding the influence of gender, that is, a lower probability of misclassification for girls compared to boys. The probability of misclassifying a surface belonging to permanent teeth was lower compared with that of deciduous teeth in all the models and higher for canine and pre(molar) teeth compared with incisors. The position of the surface in the mouth also has an impact on the probability of misclassifying CE, but this was only observed in the standard logistic regression model. Finally, for all models misclassifying CE on surfaces on the right quadrant happens more than with surfaces on the left quadrant.

Because the 95% equal tail CI of the difference γ1 − γ2 does not contain zero (−0.735; −0.050), we concluded that proximal spatial dependency is significantly greater than nonproximal spatial dependency. This motivated the extension of our model in order to investigate examiner-specific spatial dependency on proximal and nonproximal surfaces. The results of the two types of spatial dependencies for each examiner are shown in Figure 2. The left panel shows the examiner-specific spatial dependency estimates on proximal surfaces and the right panel shows the estimates on nonproximal surfaces. A high estimate of γ1 means that the examiner scores adjacent distal and mesial surfaces either both right or both wrong. This situation is observed for examiner 14. Similarly, a high estimate of γ2 indicates an examiner who scores the surfaces on one tooth (excluding distal and mesial) all wrong or all right, which was noticed for examiners 6, 12, and 14. An obvious conclusion from the analysis of the validation data is that a flaw in scoring CE of a surface is more likely to be repeated on the neighboring surface.
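The comparison of the two spatial parameters rests on an equal tail credible interval for their difference, which can be obtained from the MCMC output by differencing the draws iteration by iteration and taking quantiles. The following is a minimal sketch; the function name is hypothetical and the simulated draws are made-up stand-ins, not the posterior samples of the study.

```python
import numpy as np

def equal_tail_ci(draws, level=0.95):
    """Equal tail credible interval from a 1-D array of posterior draws."""
    alpha = 1.0 - level
    lower, upper = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lower, upper

# Made-up draws standing in for the posterior samples of two spatial parameters.
rng = np.random.default_rng(2)
gamma_a = rng.normal(1.0, 0.1, size=5000)
gamma_b = rng.normal(1.5, 0.1, size=5000)

# Difference the draws per iteration, then summarize the difference.
diff_lower, diff_upper = equal_tail_ci(gamma_a - gamma_b)
excludes_zero = not (diff_lower <= 0.0 <= diff_upper)
```

When the interval excludes zero, the two dependencies are judged to differ, which is the criterion applied in the text.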

Figure 2.

Plots of examiner-specific spatial correlations and 95% equal tail credible intervals (CIs) for proximal surfaces (γ1) and nonproximal surfaces (γ2) from the validation data.

6.3. Analysis of the main data (corrected for misclassification)

The results for the corrected main model (2), that is, after correcting for misclassification, are shown in Table V. By using the SE and SP from the validation data, we proceeded to correct the main model for possible misclassification. Substantial changes were observed in the parameter estimates of the fixed effects and standard deviations. Also, the spatial effects parameters and variance of the random effects increased in the corrected model. However, the conclusions did not change qualitatively. The parameter estimates obtained with differential correction are generally higher in absolute value compared to the models without correction. Also, the 95% equal tail CIs for differential correction are often wider than when no correction or nondifferential correction was applied.

Table V.

Parameter estimates, standard deviation and 95% equal tail credible intervals (CIs) for multilevel autologistic model for the main data corrected for misclassification.

| Parameter | Corrected multilevel autologistic: estimate (SD) | Corrected multilevel autologistic: 95% CI |
|---|---|---|
| FIXED EFFECTS | | |
| Intercept | −3.616 (0.345) | [−4.325; −2.961] |
| Gender: girls (reference: boys) | −0.107 (0.386) | [−0.893; 0.644] |
| Age | −0.138 (0.200) | [−0.555; 0.252] |
| Dentition type: permanent (reference: deciduous) | −2.188 (0.290) | [−2.783; −1.663] |
| Jaw: upper (reference: lower) | 0.203 (0.203) | [−0.203; 0.587] |
| Quadrant: right (reference: left) | 0.476 (0.199) | [0.101; 0.867] |
| Geographical location: x-coordinate | 0.027 (0.205) | [−0.346; 0.443] |
| Geographical location: y-coordinate | −0.196 (0.202) | [−0.619; 0.174] |
| SPATIAL EFFECTS | | |
| γ1 | 1.406 (0.161) | [1.106; 1.739] |
| γ2 | 1.350 (0.411) | [0.527; 2.144] |
| RANDOM EFFECTS | | |
| σm² | 1.636 (0.795) | [0.576; 3.423] |
| σtm² | 0.061 (0.054) | [0.002; 0.190] |
| σe² | 0.152 (0.239) | [0.000; 0.837] |

The probability of a surface having CE decreased as age increased, which is a biologically awkward result. Also, the probability of CE of a surface of a permanent tooth was lower compared with that of a deciduous tooth. Further, there was a higher probability of CE on surfaces in the right quadrant than on surfaces in the left quadrant (from the patient’s perspective). There was a remarkably high proximal and nonproximal spatial dependency of CE. The comparison of γ1 and γ2 revealed a nonsignificant (95% equal tail CI = [−0.729; 0.538]) difference between the two spatial dependencies. Although we reported the DIC (1979.31) for the corrected autologistic model, we cannot compare this model with the other models since in this case we are modeling the latent (unobserved) response.

6.4. Sensitivity analysis

We performed a limited sensitivity analysis in order to investigate the stability of our model with respect to the prior distribution choices for β, γ1, γ2, σm², and σtm². More specifically, we changed the Uniform(0, 100) prior distributions of σm and σtm to IG(10⁻⁴, 10⁻⁴) prior distributions for σm² and σtm². Further, we changed the prior distribution of the regression coefficients from a normal to a t-distribution with 4 degrees of freedom. A dflat() prior was also tested for the regression coefficients. We also chose a Uniform(0, 100) prior and a N(0, 10⁶) prior for the spatial parameters. Although slight changes were observed in the parameter estimates (fixed effects, spatial parameters, and variance components), the results were robust overall, and the changes did not alter any conclusions regarding the strength of the spatial parameters or the inference on the fixed effects.

7. Simulation study

In this section, we present a simulation study to investigate the performance of the corrected autologistic regression model. In order to assess the performance, we also included the logistic regression model, the corrected logistic model, and the uncorrected autologistic model in the simulation. Triggered by the criticism expressed in the literature about biased parameter estimates for the autologistic specification as compared with a non-spatial logistic regression [31], we wished to illustrate here the effect of misclassification and the impact of the strength of spatial dependency on parameter estimation.

7.1. Generating the simulation data sets

We set the true parameters for β0, β1, β2, and γ to generate spatial data on a lattice of size 20 × 20, that is, 400 observations by using the autologistic model given by

$$\mathrm{logit}(\pi_j) = \beta_0 + \beta_1 x_j + \beta_2 y_j + \gamma \sum_{j' : j' \sim j} y_{j'},$$

where x and y are the coordinates on the lattice and the sum runs over the binary outcomes at the sites neighboring site j. Note that here we did not use γ1 and γ2 in order to avoid unnecessary complexity. The adjacency matrix is defined by the four nearest neighboring cells; sites on an edge have three neighbors and sites on a corner have two. Because in the autologistic model the outcome is also a covariate, we used a Gibbs sampler to generate the equilibrium lattice or map. For more technical details on generating spatially correlated data, we refer to Huffer and Wu [17] and Hughes et al. [32]. Initial binary values yk,l, where k, l = 1, …, 20 index the rows and columns of the lattice (sites j = 1, …, 400), were arbitrarily assigned. After n completed sweeps of updating the map, it was assumed that an equilibrium state had been reached. The last update was then taken as the fixed observed binary data for map Y. Examples of generated maps for different values of the spatial parameter are shown in Figure 3. The Gibbs sampling algorithm to generate the data for iteration i is as follows:

Figure 3.

Adjacent matrices for different values of spatial dependency; □ represents a disease-free site and ■ represents the presence of the disease.

  1. Pick arbitrary values of yk,l to create map Y0(i).

  2. Given map Y0(i) and parameters (β0, β1, β2, γ), generate the next map Y1(i), where each observation in map Y1(i) is obtained by sampling from the autologistic model (7.1). In general, map Y(t+1)(i) is created from map Y(t)(i).

  3. Generate sufficiently many maps (t → ∞) until an equilibrium state is attained, and keep the last updated map.
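The three steps above can be sketched as a single-site Gibbs sampler in Python. This is a hedged illustration rather than the authors' implementation: the function names, the parameter values in the usage line, the number of sweeps, and the rescaling of the lattice coordinates to [0, 1] are all assumptions. With four-nearest-neighbor adjacency, interior sites have four neighbors, edge sites three, and corner sites two.

```python
import numpy as np

def neighbor_sum(Y, k, l):
    """Sum of the binary outcomes at the nearest neighbors (up, down, left, right) of site (k, l)."""
    n_rows, n_cols = Y.shape
    total = 0
    if k > 0:
        total += Y[k - 1, l]
    if k < n_rows - 1:
        total += Y[k + 1, l]
    if l > 0:
        total += Y[k, l - 1]
    if l < n_cols - 1:
        total += Y[k, l + 1]
    return int(total)

def gibbs_autologistic_map(beta0, beta1, beta2, gamma, size=20, n_sweeps=100, seed=0):
    """Generate one binary map by repeated sweeps of single-site Gibbs updates."""
    rng = np.random.default_rng(seed)
    # Assumption: the coordinates x and y are row/column positions rescaled to [0, 1].
    coord = np.linspace(0.0, 1.0, size)
    Y = rng.integers(0, 2, size=(size, size))       # step 1: arbitrary starting map
    for _ in range(n_sweeps):                       # steps 2-3: sweep until (assumed) equilibrium
        for k in range(size):
            for l in range(size):
                eta = (beta0 + beta1 * coord[k] + beta2 * coord[l]
                       + gamma * neighbor_sum(Y, k, l))
                prob = 1.0 / (1.0 + np.exp(-eta))   # inverse logit
                Y[k, l] = rng.random() < prob       # draw the site given its current neighbors
    return Y

# Hypothetical parameter values, chosen only to make the example run.
Y_map = gibbs_autologistic_map(beta0=-1.0, beta1=0.5, beta2=0.5, gamma=0.6)
```

Sites are updated in place, one at a time, so each draw conditions on the most recent values of its neighbors; the map after the last sweep plays the role of the retained equilibrium map.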

7.2. Simulation set-up

We generated 10,000 equilibrium maps. For each generated equilibrium map, we applied the misclassification process in order to generate the corrupted outcome. Let Y be the true outcome generated from the previous algorithm and let τ00 and τ11 be the SP and SE, respectively. We generated the corrupted outcome Y* by using the following rule:

If Y = 1, then generate Y* ~ Bin(1, τ11); else if Y = 0, then generate Y* ~ Bin(1, 1 − τ00).
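The misclassification rule can be written as a short vectorized function. The sketch below is illustrative only; the function name `misclassify` and the toy inputs are assumptions, with `se` and `sp` standing for τ11 and τ00.

```python
import numpy as np

def misclassify(Y, se, sp, rng):
    """Corrupt a binary outcome: P(Y* = 1 | Y = 1) = se and P(Y* = 1 | Y = 0) = 1 - sp."""
    Y = np.asarray(Y)
    prob_positive = np.where(Y == 1, se, 1.0 - sp)  # per-site probability of scoring 1
    return rng.binomial(1, prob_positive)

# Toy usage on a made-up true outcome with SE = SP = 0.8.
rng = np.random.default_rng(3)
Y_true = rng.integers(0, 2, size=(20, 20))
Y_star = misclassify(Y_true, se=0.8, sp=0.8, rng=rng)
```

Setting `se = sp = 1` returns the true map unchanged, which is a convenient sanity check before running the full simulation.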

Next, by using Y* as the outcome and x and y as covariates, we fitted the following models: (i) logistic model; (ii) corrected logistic model; (iii) autologistic model without correction; and (iv) corrected autologistic model. We then calculated the bias and 95% coverage rates for the parameters β0, β1, β2, and γ from each model.
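The two performance measures can be computed from the per-replicate output as sketched below. The function name and the toy numbers are hypothetical, and the sketch assumes that bias is the mean estimate minus the true value (the tables may report it in absolute value) and that coverage is the percentage of 95% equal tail CIs containing the true value.

```python
import numpy as np

def bias_and_coverage(estimates, lower, upper, true_value):
    """Bias of the point estimates and percentage of intervals covering the true value."""
    estimates = np.asarray(estimates, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    bias = estimates.mean() - true_value
    covered = (lower <= true_value) & (true_value <= upper)
    return bias, 100.0 * covered.mean()

# Toy usage: three made-up replicates for a parameter whose true value is 1.0.
bias, coverage = bias_and_coverage(
    estimates=[0.9, 1.1, 1.3],
    lower=[0.5, 0.9, 1.1],
    upper=[1.2, 1.4, 1.5],
    true_value=1.0,
)
```

In the simulation, one such pair would be computed per parameter and per fitted model across all generated maps.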

7.3. Simulation results

The results obtained from the previous simulation study are shown in Table VI. The data were generated for three different values of γ and two different settings of SE and SP: (i) SE = 0.7, SP = 0.7, γ = 0.1; (ii) SE = 0.7, SP = 0.7, γ = 0.6; (iii) SE = 0.7, SP = 0.7, γ = 1.0; (iv) SE = 0.8, SP = 0.8, γ = 0.1; (v) SE = 0.8, SP = 0.8, γ = 0.6; and (vi) SE = 0.8, SP = 0.8, γ = 1.0. The corresponding generated values are plotted in Figure 4. Plots for the other scenarios are given in Figures 5 and 6. From Table VI, it is clear that the logistic regression and autologistic regression produce biased estimates in the presence of misclassification. One of the important findings is that the bias increases with increasing misclassification. This is also confirmed by the coverage rates of the 95% equal tail CIs. Also, as the spatial dependency increases, the bias in the parameter estimation increases.

Table VI.

Simulation results: bias (95% coverage) of logistic, autologistic, and corrected autologistic models for different sensitivity (SE), specificity (SP), and spatial parameters.

| True γ | Model | β0 (SE = SP = 0.8) | β1 (SE = SP = 0.8) | β2 (SE = SP = 0.8) | γ (SE = SP = 0.8) | β0 (SE = SP = 0.7) | β1 (SE = SP = 0.7) | β2 (SE = SP = 0.7) | γ (SE = SP = 0.7) |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | Logistic | 0.207 (67.22) | 0.447 (55.71) | 0.447 (55.38) | | 0.193 (65.27) | 0.632 (29.08) | 0.627 (29.89) | |
| 0.1 | Corrected logistic | 0.147 (91.73) | 0.066 (98.14) | 0.071 (97.91) | | 0.128 (94.89) | 0.106 (97.58) | 0.117 (97.28) | |
| 0.1 | Autologistic | 0.127 (91.41) | 0.475 (52.91) | 0.475 (52.64) | 0.035 (89.75) | 0.141 (84.26) | 0.650 (28.50) | 0.645 (28.71) | 0.048 (80.08) |
| 0.1 | Corrected autologistic | 0.006 (98.41) | 0.013 (98.40) | 0.019 (98.28) | 0.003 (98.37) | 0.007 (97.86) | 0.071 (97.49) | 0.083 (97.27) | 0.005 (97.61) |
| 0.6 | Logistic | 1.136 (0.24) | 0.093 (95.17) | 0.088 (95.19) | | 1.141 (0.42) | 0.275 (77.52) | 0.282 (76.05) | |
| 0.6 | Corrected logistic | 1.060 (16.44) | 0.385 (91.73) | 0.397 (91.30) | | 1.057 (48.77) | 0.410 (95.73) | 0.399 (95.48) | |
| 0.6 | Autologistic | 0.576 (32.09) | 0.294 (79.69) | 0.291 (80.30) | 0.261 (7.98) | 0.7799 (14.03) | 0.405 (61.43) | 0.412 (59.49) | 0.364 (1.65) |
| 0.6 | Corrected autologistic | 0.015 (98.24) | 0.000 (98.41) | 0.009 (98.25) | 0.008 (98.06) | 0.0570 (97.83) | 0.012 (97.66) | 0.005 (97.32) | 0.032 (97.59) |
| 1.0 | Logistic | 1.762 (0.00) | 0.193 (87.85) | 0.189 (88.52) | | 1.841 (0.00) | 0.025 (95.12) | 0.021 (94.94) | |
| 1.0 | Corrected logistic | 1.584 (1.60) | 0.567 (72.32) | 0.559 (72.94) | | 1.587 (15.52) | 0.570 (85.69) | 0.567 (87.07) | |
| 1.0 | Autologistic | 0.968 (1.29) | 0.151 (93.06) | 0.153 (93.51) | 0.461 (0.00) | 1.329 (0.00) | 0.197 (87.53) | 0.205 (86.61) | 0.638 (0.00) |
| 1.0 | Corrected autologistic | 0.0419 (97.81) | 0.0030 (98.39) | 0.0017 (98.35) | 0.0218 (0.9781) | 0.126 (97.53) | 0.018 (97.59) | 0.007 (96.55) | 0.058 (97.07) |

Figure 4.

Density plots of simulated results from logistic, autologistic, and corrected autologistic models (SE = 0.8, SP = 0.8, γ = 0.1).

Figure 5.

Density plots of simulated results from logistic, autologistic, and corrected autologistic models (SE = 0.8, SP = 0.8, γ = 0.6).

Figure 6.

Density plots of simulated results from logistic, autologistic, and corrected autologistic models (SE = 0.8, SP = 0.8, γ = 1.0).

The simulation study confirmed that ignoring the spatial structure and ignoring misclassification lead to biased estimates of the regression coefficients. This limited simulation study, however, does not confirm the large bias in the case of a strong spatial dependence. An advantage of using the PL approach is that it is computationally efficient even for data with high spatial dependency, such as that shown in Figure 3. We also repeated the simulation by using SE, SP, and γ values from the Signal Tandmobiel® study (Table VII). From this simulation, we also concluded that the corrected autologistic model gave essentially unbiased estimates compared with all the other models.

Table VII.

Simulation results: bias (95% coverage) of logistic, autologistic and corrected autologistic models using sensitivity (SE), specificity (SP), and spatial parameters from the Signal Tandmobiel® study.

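| Model | β0 (SE = 0.79, SP = 0.99, γ = 1.35) | β1 (SE = 0.79, SP = 0.99, γ = 1.35) | β2 (SE = 0.79, SP = 0.99, γ = 1.35) | γ (SE = 0.79, SP = 0.99, γ = 1.35) | β0 (SE = 0.79, SP = 0.99, γ = 1.40) | β1 (SE = 0.79, SP = 0.99, γ = 1.40) | β2 (SE = 0.79, SP = 0.99, γ = 1.40) | γ (SE = 0.79, SP = 0.99, γ = 1.40) |
|---|---|---|---|---|---|---|---|---|
| Logistic | 1.057 (31.90) | 0.226 (76.05) | 0.200 (76.25) | | 1.012 (37.30) | 0.138 (80.60) | 0.117 (81.05) | |
| Corrected logistic | 1.267 (28.40) | 0.266 (75.40) | 0.237 (75.75) | | 1.190 (34.25) | 0.172 (80.15) | 0.148 (80.70) | |
| Autologistic | 0.040 (99.20) | 0.028 (99.55) | 0.020 (99.65) | 0.272 (55.50) | 0.041 (99.30) | 0.020 (99.75) | 0.03269 (99.75) | 0.261 (67.00) |
| Corrected autologistic | 0.036 (99.80) | 0.010 (99.65) | 0.028 (99.90) | 0.003 (99.30) | 0.0275 (99.65) | 0.002 (99.90) | 0.020 (99.75) | 0.033 (99.65) |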

8. Discussion

In this paper, we proposed an autologistic regression framework for the multilevel validation and main caries experience data. For the validation data, our aim was to model the probability of misclassifying the CE status of a surface as a function of covariates. By using this model, we have addressed the issue of spatial dependency in examiners’ scoring behavior. For the main data, we were interested in investigating the factors that influence the prevalence of CE. Also, we investigated whether the prevalence of CE is spatially distributed in the mouth. Previous research has confirmed the existence of spatial dependency in CE data [6, 7]. An analysis closely related to our current work is by Leroy et al. [15], where the authors concluded that cavity formation in permanent first molars was clearly influenced by the status of the adjacent primary molars, confirming the findings of this study. Our analysis is unique in the sense that the correction for misclassification of the outcome within a multilevel autologistic framework has never been attempted before. Note that the spatial structure within a mouth is unique and must be carefully chosen.

When correcting for misclassification, a fundamental assumption in a validation study is that true scores are recorded by a gold standard, that is, an approach (instrument or examiner) that is 100% error free. However, in practice, the scores are often generated by a benchmark scorer, that is, an experienced examiner who is assumed to be error free or nearly so. A benchmark scorer is also referred to in the literature as an ‘alloyed gold standard’. The question of how ‘alloyed’ a benchmark can be in order to maintain valid statistical inferences is subjective. More discussion on this matter is given by Wacholder et al. [33]. Indeed, it is possible that methods correcting for the effect of misclassification that make use of information gathered by comparing with a benchmark scorer might introduce more bias than they are correcting. The effect of the violation of the assumption on the corrected estimate depends on the magnitudes of the errors in the two measures and on their correlation. In particular, when the errors are negatively correlated, independent, or weakly positively correlated, the corrected estimate will tend to overcorrect beyond the true value. In this work, we are more confident because the benchmark scorer had vast experience in CE surveys and followed a special training.

Although the autologistic framework is widely used to model spatially correlated binary data, the statistical inference is fraught with complications due to the intractable normalizing constant. Hence, likelihood approaches cannot be directly implemented. The consequence becomes severe for large data sets [11, 17, 34]. By using the straightforward PL framework, we avoided the immense computational complexities encountered in the various proposals that try to handle the normalizing constant. Because the PL is an approximation to the full likelihood, the methodology may not be considered a full Bayes approach, but a ‘working/pseudo’ Bayes approach, providing us with a computationally feasible scheme in this multilevel framework. The relevant computations can also be carried out in JAGS or WinBUGS software and can be easily tuned to adapt to other data sets by any practitioner. Extensions to include temporal/longitudinal patterns of data are also feasible with some modifications.

Results from this analysis advance the subject area knowledge in various directions. We showed that there is spatial referencing among CE statuses of surfaces of the same tooth and for surfaces from adjacent teeth. Another interesting finding is that the scoring behavior of examiners also exhibits some degree of spatial dependency. This information could be useful in training dental examiners to promote independent examination of CE for clustered multilevel data.

Acknowledgments

This investigation was supported by research grant OT/05/60, KU Leuven; data collection was supported by Unilever, Belgium. The Signal Tandmobiel project has the following partners: D. Declerck (Department of Oral Health Sciences, KU Leuven), L. Martens (Dental School, University Ghent), J. Vanobbergen (Dental School, University Ghent), P. Bottenberg (Dental School, University Brussels), E. Lesaffre (Department of Biostatistics, Erasmus Medical Center, Rotterdam, the Netherlands and L-Biostat, KU Leuven), and K. Hoppenbrouwers (Youth Health Department, KU Leuven (University of Leuven); and Flemish Association for Youth Health Care). Bandyopadhyay acknowledges funding support from grants P20RR017696-06, R03-DE020114, and R03-DE021762 from the US National Institutes of Health.

References

  • 1. World Health Organization. Oral health surveys. Basic methods. World Health Organization; Geneva: 1997.
  • 2. Pine CM, Pitts NB, Nugent ZJ. British Association for the Study of Community Dentistry (BASCD) guidance on the statistical aspects of training and calibration of examiners for surveys of child dental health. A BASCD coordinated dental epidemiology programme quality standard. Community Dent Health. 1997;14(Suppl 1):18–29.
  • 3. International Caries Detection and Assessment System Coordinating Committee. Criteria manual - International Caries Detection and Assessment System (ICDAS II). 2005.
  • 4. Assaf AV, Meneghim MC, Zanin L, Mialhe FL, Pereira AC, Ambrosano GM. Assessment of different methods for diagnosing dental caries in epidemiological surveys. Community Dent Oral Epidemiology. 2004;32:418–425. doi: 10.1111/j.1600-0528.2004.00180.x.
  • 5. Heifetz SB, Brunelle JA, Horowitz HS, Leske GS. Examiner consistency and group balance at baseline of a caries clinical trial. Community Dent Oral Epidemiology. 1985;13:82–85. doi: 10.1111/j.1600-0528.1985.tb01682.x.
  • 6. Zhang Y, Todem D, Kim K, Lesaffre E. Bayesian latent variable models for spatially correlated tooth-level binary data in caries research. Statistical Modelling. 2011;11(1):25–47. doi: 10.1177/1471082X1001100103.
  • 7. Bandyopadhyay D, Reich BJ, Slate EH. Bayesian modeling of multivariate spatial binary data with applications to dental caries. Statistics in Medicine. 2009;28:3492–3508. doi: 10.1002/sim.3647.
  • 8. Besag JE. Nearest-neighbor systems and autologistic model for binary data. Journal of Royal Statistical Society, Series B. 1972;34:75–83.
  • 9. Besag J. Spatial interaction and the statistical analysis of lattice systems. Journal of Royal Statistical Society, Series B. 1975;2:192–236.
  • 10. Besag J, Kooperberg C. On conditional and intrinsic autoregressions. Biometrika. 1995;82(4):733–746.
  • 11. Cressie NA. Statistics for Spatial Data. John Wiley & Sons; New York: 1993.
  • 12. Besag J, Tantrum J. Likelihood analysis of binary data in space and time. Oxford Statistical Science Series. 2003;9A:289–294.
  • 13. Guyon X. Random Fields on a Network. Springer; Berlin: 1995.
  • 14. Vanobbergen J, Martens L, Lesaffre E, Declerck D. The Signal Tandmobiel® project, a longitudinal intervention health promotion study in Flanders (Belgium): Baseline and first year results. European Journal of Paediatric Dentistry. 2000;2:87–96.
  • 15. Leroy R, Bogaerts K, Lesaffre E, Declerck D. Effect of caries experience in primary molars on cavity formation in the adjacent permanent first molar. Caries Research. 2005;39:342–349. doi: 10.1159/000086839.
  • 16. Caragea CP, Kaiser MS. Autologistic models with interpretable parameters. Journal of Agricultural, Biological, and Environmental Statistics. 2009;14(3):281–300.
  • 17. Huffer WF, Wu H. Markov chain Monte Carlo for autologistic regression models with application to the distribution of plant species. Biometrics. 1998;54:509–524.
  • 18. Banerjee S, Gelfand AE, Carlin BP. Hierarchical Modeling and Analysis for Spatial Data. Vol. 101. Chapman and Hall/CRC; London/CRC: 2003.
  • 19. Lawson AB. Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology. Vol. 32. Chapman & Hall; Boca Raton, FL: 2009.
  • 20. Dormann FC, McPherson MJ, Araújo BM, Bivand R, Bolliger J, Carl G, Davies GR, Hirzel A, Jetz W, Daniel Kissling W, et al. Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography. 2007;30(5):609–628.
  • 21. Gelman A, Meng X. Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Statistical Science. 1998;13(2):163–185.
  • 22. Møller J, Pettitt A, Reeves R, Berthelsen K. An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika. 2006;93(2):451–458.
  • 23. Lesaffre E, Lawson AB. Bayesian Biostatistics. John Wiley & Sons; New York: 2012.
  • 24. Gelman A. Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis. 2006;1(3):515–534.
  • 25. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2002;64(4):583–639.
  • 26. Kass RE, Raftery AE. Bayes factors. Journal of the American Statistical Association. 1995;90(430):773–795.
  • 27. Vehtari A, Lampinen J. Bayesian model assessment and comparison using cross-validation predictive densities. Neural Computation. 2002;14(10):2439–2468. doi: 10.1162/08997660260293292.
  • 28. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. 2. Chapman & Hall/CRC; New York: 2004.
  • 29. Plummer M. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In: Hornik K, Leisch F, Zeileis A, editors. Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC, 2003); Vienna, Austria: Technische Universitaet Wien; 2003. http://www.ci.tuwien.ac.at/Conferences/DSC.html.
  • 30. Mwalili S, Lesaffre E, Declerck D. A Bayesian ordinal logistic regression model to correct for inter-observer measurement error in a geographical oral health study. Journal of the Royal Statistical Society, Series C. 2005;54:77–93.
  • 31. Dormann C. Assessing the validity of autologistic regression. Ecological Modelling. 2007;207:234–242.
  • 32. Hughes J, Haran M, Caragea CP. Autologistic models for binary data on a lattice. Environmetrics. 2011;22(7):857–871.
  • 33. Wacholder S, Armstrong B, Hartge P. Validation studies using an alloyed gold standard. American Journal of Epidemiology. 1993;137:1251–1258. doi: 10.1093/oxfordjournals.aje.a116627.
  • 34. Weir I, Pettitt A. Alternative to the autologistic model using a hidden conditional autoregressive Gaussian process. Proceedings 28th Symposium on the Interface, SISC-96.(a)(b); Citeseer. 1996.
