Web-based Supporting Materials for "Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland" by Overstall, King, Bird, Hutchinson and Hay

1. Equivalence of posterior inference under Poisson and multinomial models

Forster [1] shows that for complete contingency tables (i.e. when $N$ is known), the joint posterior distribution for $\bar{\theta}_m$ and $m$ is identical under the Poisson and multinomial formulations when $\pi(\beta_0) \propto 1$, assuming the same prior distribution on the remaining parameters and over the model space. We extend this result here to the case of incomplete contingency tables (i.e. when $N$ is unknown) and in the presence of (uninformative) left censoring. In particular, the posterior distributions are identical under the different model formulations under the prior specifications $\pi(\beta_0) \propto 1$ and $\pi(N) \propto N^{-1}$ (assuming identical priors on all other parameters and over model space).

First we consider the Poisson formulation given in equation (2.1) of the main manuscript. The full set of model parameters, under model $m$, is denoted by $\theta_m = \{\beta_m, \sigma^2\}$. We let $\bar{\theta}_m = \{\bar{\beta}_m, \sigma^2\}$ (i.e. the set of parameters excluding the intercept term). In addition, for model $m$ (dropping the subscript notation for simplicity) we let $h_i = \eta_i - \beta_0$ (i.e. the linear predictor for cell $i$, minus the intercept term), so that $\mu_i = \exp(\beta_0 + h_i)$, and we write $N = \sum_{i=1}^n y_i$. From equation (2.5) of the main manuscript, and integrating out the intercept term, the (marginal) posterior distribution of $\bar{\theta}_m$, $m$, $y_C$ and $y_U$ is given by

$$
\begin{aligned}
\pi(\bar{\theta}_m, m, y_C, y_U \mid y_O, z_C)
&\propto \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m) \int_{\mathbb{R}} \pi(y \mid \theta_m, m)\,\pi(\beta_0)\,d\beta_0 \\
&\propto \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m) \int_{\mathbb{R}} \frac{\prod_{i=1}^n \exp(-\mu_i)\,\mu_i^{y_i}}{\prod_{i=1}^n y_i!}\,d\beta_0 \qquad (\text{recalling that } \pi(\beta_0) \propto 1) \\
&= \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m)\,\frac{\prod_{i=1}^n \exp(h_i)^{y_i}}{\prod_{i=1}^n y_i!} \int_{\mathbb{R}} \exp(\beta_0)^{N} \exp\!\left(-\exp(\beta_0) \sum_{i=1}^n \exp(h_i)\right) d\beta_0 .
\end{aligned}
$$

Using the substitution $u = \exp(\beta_0)$ and simplifying the expression, we obtain

$$
\pi(\bar{\theta}_m, m, y_C, y_U \mid y_O, z_C) \propto \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m)\,\frac{\prod_{i=1}^n \exp(h_i)^{y_i}\,(N-1)!}{\prod_{i=1}^n y_i!\,\left(\sum_{i=1}^n \exp(h_i)\right)^{N}} . \tag{1}
$$

Next we consider the alternative multinomial formulation.
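Before turning to the multinomial case, the integration over the intercept that produces equation (1) can be checked numerically. The following Python sketch is an illustration only and not part of the original analysis: the cell counts `y` and offsets `h` are made-up values, the function names are hypothetical, and the censoring and prior factors, which are common to both sides, are omitted.

```python
import math
import numpy as np

def poisson_marginal(y, h, lo=-30.0, hi=10.0, num=200_001):
    """Numerically integrate the product of Poisson pmfs over the intercept b0
    (flat prior on b0), using the trapezoidal rule on a fine grid."""
    y = np.asarray(y, dtype=float)
    h = np.asarray(h, dtype=float)
    b0 = np.linspace(lo, hi, num)
    eta = b0[:, None] + h[None, :]                 # linear predictor b0 + h_i
    log_fact = sum(math.lgamma(v + 1.0) for v in y)
    log_f = (y * eta - np.exp(eta)).sum(axis=1) - log_fact
    f = np.exp(log_f)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(b0)))

def closed_form(y, h):
    """Closed form from equation (1), without the prior and censoring factors:
    prod exp(h_i)^y_i * (N-1)! / (prod y_i! * (sum exp(h_i))^N)."""
    y = np.asarray(y, dtype=float)
    h = np.asarray(h, dtype=float)
    N = y.sum()
    log_val = ((y * h).sum() + math.lgamma(N)
               - sum(math.lgamma(v + 1.0) for v in y)
               - N * math.log(np.exp(h).sum()))
    return math.exp(log_val)

# Made-up example values (illustrative only).
y = [3, 0, 2, 5]
h = [0.2, -1.0, 0.5, 1.1]
numeric = poisson_marginal(y, h)
exact = closed_form(y, h)
print(numeric, exact)
```

The two numbers should agree to within the accuracy of the quadrature grid, which is what the substitution $u = \exp(\beta_0)$ establishes analytically via the gamma integral.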
The posterior distribution of $\bar{\theta}_m$, $m$, $y_C$ and $y_U$ is given by

$$
\begin{aligned}
\pi(\bar{\theta}_m, m, y_C, y_U \mid y_O, z_C)
&\propto \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m)\,\pi(y \mid N, \bar{\theta}_m, m)\,\pi(N) \\
&\propto \pi(z_C \mid y_C)\,\pi(\bar{\theta}_m, m)\,\frac{N!}{\prod_{i=1}^n y_i!} \prod_{i=1}^n p_i^{y_i}\,\frac{1}{N},
\end{aligned}
$$

substituting the probability mass function for the multinomial distribution. The result immediately follows (i.e. the posterior distribution is identical to equation (1) in this document) by noting that

$$
p_i = \frac{\exp(h_i)}{\sum_{j=1}^n \exp(h_j)} .
$$

2. Weighted least squares implementation of the Metropolis-Hastings algorithm

In this section we describe the weighted least squares implementation of the Metropolis-Hastings algorithm for GLMs [2] as applied to log-linear models. Let the current parameter values be denoted by $\beta_m$ and define $W(\beta_m) = \mathrm{diag}\{\mu_i\}$ and $\xi_i(\beta_m) = (y_i - \mu_i)/\mu_i$, where $\mu_i$ is evaluated at the current log-linear parameters, $\beta_m$. Furthermore, we let

$$
\begin{aligned}
\tilde{y}(\beta_m) &= X_m \beta_m + \xi(\beta_m), \\
C(\beta_m) &= \left( \frac{1}{\sigma^2} I + X_m^T W(\beta_m) X_m \right)^{-1}, \\
\mathbf{m}(\beta_m) &= C(\beta_m)\, X_m^T W(\beta_m)\, \tilde{y}(\beta_m).
\end{aligned}
$$

The Metropolis-Hastings proposal parameters, $\beta_m'$, are simulated from

$$
\beta_m' \sim N\left( \mathbf{m}(\beta_m), C(\beta_m) \right),
$$

and accepted with the standard acceptance probability, $\min(1, A)$, where

$$
A = \frac{\pi(\beta_m' \mid y, \sigma^2, m)\, q(\beta_m \mid \beta_m')}{\pi(\beta_m \mid y, \sigma^2, m)\, q(\beta_m' \mid \beta_m)},
$$

in which $q(\beta_m' \mid \beta_m)$ denotes the multivariate normal proposal density for the proposal values, given the current parameter values (and vice versa).

3. Reversible jump algorithm

Here we consider the reversible jump algorithm [3] to update the model within the MCMC algorithm. We let $m$ denote the current model with associated vector of log-linear parameters $\beta_m$ and design matrix $X_m$. Let $\tilde{m}$ denote the maximal model, i.e. the most complex model we are prepared to consider, and $\tilde{\beta}_{\tilde{m}}$ the corresponding posterior mode of the log-linear parameters under the maximal model fitted to the observed cell counts, $y_O$. Note that, for the examples we consider, the maximal model corresponds to the model with all main effects and two-way interactions present. We set $\tilde{\eta} = X_{\tilde{m}} \tilde{\beta}_{\tilde{m}}$ and define the $(n \times n)$ matrix $\tilde{W} = \mathrm{diag}\{\exp(\tilde{\eta})\}$.
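The weighted least squares quantities of Section 2 can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' implementation: it assumes a $N(0, \sigma^2 I)$ prior on all log-linear parameters (which is what the form of the proposal covariance suggests, but the main manuscript should be consulted for the exact prior), treats $\sigma^2$ as fixed, ignores censoring, and uses made-up counts with hypothetical function names. Under that assumption, repeatedly replacing the current value by the proposal mean is a Fisher scoring iteration, which is one possible way of locating a posterior mode such as the one used to build $\tilde{W}$ above.

```python
import numpy as np

def wls_moments(beta, X, y, sigma2):
    """Proposal mean and covariance for a Poisson log-linear model."""
    eta = X @ beta
    mu = np.exp(eta)                          # diagonal of W(beta)
    y_work = eta + (y - mu) / mu              # working response
    XtW = X.T * mu                            # X^T W(beta)
    C = np.linalg.inv(np.eye(X.shape[1]) / sigma2 + XtW @ X)
    C = 0.5 * (C + C.T)                       # remove rounding asymmetry
    return C @ (XtW @ y_work), C

def log_post(beta, X, y, sigma2):
    """Log posterior up to a constant, ASSUMING a N(0, sigma2 * I) prior."""
    eta = X @ beta
    return float(np.sum(y * eta - np.exp(eta)) - beta @ beta / (2.0 * sigma2))

def log_q(x, mean, C):
    """Multivariate normal log density, dropping the 2*pi constant."""
    d = x - mean
    _, logdet = np.linalg.slogdet(C)
    return float(-0.5 * (d @ np.linalg.solve(C, d) + logdet))

def mh_step(beta, X, y, sigma2, rng):
    """One Metropolis-Hastings update with the weighted least squares proposal."""
    mean_f, C_f = wls_moments(beta, X, y, sigma2)
    prop = rng.multivariate_normal(mean_f, C_f)
    mean_b, C_b = wls_moments(prop, X, y, sigma2)
    log_A = (log_post(prop, X, y, sigma2) + log_q(beta, mean_b, C_b)
             - log_post(beta, X, y, sigma2) - log_q(prop, mean_f, C_f))
    return prop if np.log(rng.uniform()) < log_A else beta

def posterior_mode(X, y, sigma2, n_iter=200, tol=1e-10):
    """Iterate beta <- proposal mean; Fisher scoring under the assumed prior."""
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]   # rough start
    for _ in range(n_iter):
        new, _ = wls_moments(beta, X, y, sigma2)
        if np.max(np.abs(new - beta)) < tol:
            return new
        beta = new
    return beta

# Hypothetical 2 x 2 x 2 table with main effects only (sum-to-zero coding).
f = np.array([[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)],
             dtype=float)
X = np.column_stack([np.ones(8), f])
y = np.array([12.0, 7.0, 9.0, 4.0, 15.0, 6.0, 8.0, 3.0])
sigma2 = 10.0

mode = posterior_mode(X, y, sigma2)
rng = np.random.default_rng(1)
beta = mode.copy()
for _ in range(50):
    beta = mh_step(beta, X, y, sigma2, rng)
```

Because the proposal mean and covariance depend on the current value, both the forward and the reverse proposal densities are evaluated in `mh_step`, matching the ratio $A$ of Section 2.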
We propose to move to a model that differs from the current model $m$ by only a single interaction, and choose each of these models with equal probability. Suppose that we propose to move to model $k$, which involves adding an interaction term (i.e. a "birth" move). We let the associated design matrix for model $k$ be $X_k$. We can write $X_k = (X_m, S)$ where $S$ is the column vector of the design matrix corresponding to the interaction term that is added to the current model (note that for such moves we also re-order the terms accordingly). We define

$$
\begin{aligned}
P_m &= X_m \left( X_m^T \tilde{W} X_m \right)^{-1} X_m^T \tilde{W}, \\
C_k &= \left( S^T \tilde{W} (I - P_m) S \right)^{-1}, \\
m_k &= C_k\, S^T \tilde{W} (I - P_m)\, \tilde{\eta}.
\end{aligned}
$$

Here $P_m$ projects onto the column space of the current design matrix $X_m$, so that $(I - P_m) S$ is the component of $S$ not explained by the terms already in model $m$. We simulate $u \sim N(m_k, C_k)$ and set the proposed model parameters $\beta_k'$ such that

$$
\beta_k' = \begin{pmatrix} \beta_{(1)}' \\ \beta_{(2)}' \end{pmatrix}
= \begin{pmatrix} I & -\left( X_m^T \tilde{W} X_m \right)^{-1} X_m^T \tilde{W} S \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \beta_m \\ u \end{pmatrix}.
$$

Note that $\beta_{(1)}'$ denotes the proposed parameter values for the log-linear terms present in model $m$ and $\beta_{(2)}'$ the parameter value for the interaction term that is proposed to be added. The move is accepted with probability $\min(1, A)$, where

$$
A = \frac{\pi(\beta_k', k \mid y, z_C, \sigma^2)}{\pi(\beta_m, m \mid y, z_C, \sigma^2)\, q(u)},
$$

such that $q$ denotes the proposal normal density function with mean $m_k$ and variance $C_k$. The Jacobian term is simply equal to one and the probabilities of moving between models $m$ and $k$ cancel, so these terms are omitted from the acceptance probability.

We now consider the case where we move from model $k$ with parameters $\beta_k'$ to model $m$ with parameters $\beta_m$, which involves removing a single interaction term from the model (i.e. a "death" move). The corresponding log-linear parameters in the proposed model are deterministically given by

$$
\beta_m = \beta_{(1)}' + \left( X_m^T \tilde{W} X_m \right)^{-1} X_m^T \tilde{W} S\, \beta_{(2)}',
$$

and $u = \beta_{(2)}'$. Recall that $\beta_{(1)}'$ is the vector of current elements of $\beta_k'$ corresponding to the log-linear parameters present in model $m$ and $\beta_{(2)}'$ is the current value of the log-linear parameter that is removed from the model. This move is accepted with probability $\min(1, A^{-1})$, where $A$ is given above.
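The deterministic mapping used in the birth and death moves can be sketched as follows. This Python sketch is an illustration under stated assumptions rather than the authors' implementation: the projection is taken onto the column space of the current design matrix $X_m$, the vector $\tilde{\eta}$ is filled with made-up values instead of being computed from the posterior mode of the maximal model, and all function names are hypothetical. The acceptance probability itself is not computed.

```python
import numpy as np

def birth_setup(X_m, S, eta_tilde):
    """Shift vector, proposal mean and proposal variance for adding column S."""
    w = np.exp(eta_tilde)                          # diagonal of W~
    XtW = X_m.T * w                                # X_m^T W~
    G = XtW @ X_m                                  # X_m^T W~ X_m
    shift = np.linalg.solve(G, XtW @ S)            # (X_m^T W~ X_m)^{-1} X_m^T W~ S
    resid_S = S - X_m @ shift                      # (I - P_m) S
    resid_eta = eta_tilde - X_m @ np.linalg.solve(G, XtW @ eta_tilde)
    C_k = 1.0 / float((S * w) @ resid_S)           # scalar proposal variance
    m_k = C_k * float((S * w) @ resid_eta)         # scalar proposal mean
    return shift, m_k, C_k

def birth(beta_m, u, shift):
    """Map (beta_m, u) to the parameters of the larger model k."""
    return np.append(beta_m - shift * u, u)

def death(beta_k, shift):
    """Inverse map: drop the last term and return (beta_m, u)."""
    u = beta_k[-1]
    return beta_k[:-1] + shift * u, u

# Hypothetical 2 x 2 x 2 table: main effects in X_m, one interaction column S.
f = np.array([[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)],
             dtype=float)
X_m = np.column_stack([np.ones(8), f])
S = f[:, 0] * f[:, 1]
# Made-up stand-in for eta~ (NOT a fitted posterior mode).
eta_tilde = X_m @ np.array([2.0, 0.3, -0.2, 0.1]) + 0.4 * S

shift, m_k, C_k = birth_setup(X_m, S, eta_tilde)
rng = np.random.default_rng(0)
u = rng.normal(m_k, np.sqrt(C_k))                  # u ~ N(m_k, C_k)
beta_m = np.array([1.9, 0.25, -0.15, 0.05])        # made-up current values
beta_k = birth(beta_m, u, shift)
beta_back, u_back = death(beta_k, shift)

# Matrix form of the birth transformation and its determinant.
p = X_m.shape[1]
T = np.block([[np.eye(p), -shift[:, None]],
              [np.zeros((1, p)), np.ones((1, 1))]])
jacobian = np.linalg.det(T)
```

Applying `death` to the output of `birth` recovers the original pair, and the transformation matrix is upper triangular with ones on the diagonal, in line with the Jacobian term being equal to one. In this toy setting the proposal mean `m_k` equals the coefficient of `S` used to construct `eta_tilde`.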
Finally, note that in both types of model moves (adding or removing an interaction parameter), the hyperparameter $\sigma^2$ is not updated within the model move.

4. Additional output

Web Table 1 shows the posterior means of the interaction terms for each year and for the INC-C, REM-C and IGN-C methods. This table acts as a complement to Table 5 (showing posterior probabilities) in the main manuscript.

Web Table 2 shows the posterior mean and 95% HPDIs for the total population size for each year under four different specifications of the prior hyperparameters, $a$ and $b$, under the proposed INC-C method. The values in this table should be compared against the corresponding values in the first two columns of Table 4 in the main manuscript, where the prior hyperparameters are $a = 0.001$ and $b = 0.001$.

Web Table 1: The marginal posterior means for each two-way log-linear interaction term for the INC-C, REM-C and IGN-C methods. The data-sources are labelled as S1 - social enquiry reports; S2 - hospital records; S3 - Scottish Drug Misuse Database (SDMD) and S4 - HCV diagnosis data-source. An NA indicates that this interaction cannot be identified with the REM-C method.
                      2003                   2006                   2009
Interaction    INC-C  REM-C  IGN-C    INC-C  REM-C  IGN-C    INC-C  REM-C  IGN-C
S1 × S2         0.00   0.00   0.13    -0.01  -0.00   0.00     0.00   0.01   0.19
S1 × S3        -0.08  -0.09   0.08     0.12   0.14   0.19     0.07   0.09   0.26
S1 × S4        -0.00     NA  -0.01     0.01     NA  -0.00     0.04     NA   0.01
S2 × S3         0.02   0.02   0.19    -0.01   0.00   0.06    -0.04  -0.02   0.17
S2 × S4         0.31     NA   0.27     0.27     NA   0.21     0.18     NA   0.07
S3 × S4         0.01     NA  -0.01     0.01     NA  -0.00     0.01     NA  -0.01
S1 × Age        0.21   0.21   0.25    -0.17  -0.16  -0.21     0.05   0.05   0.19
S2 × Age       -0.04  -0.04  -0.00     0.13   0.13   0.08    -0.24  -0.24  -0.11
S3 × Age        0.09   0.09   0.15    -0.13  -0.12  -0.18     0.01   0.01   0.16
S4 × Age       -0.00     NA  -0.01     0.00     NA   0.01     0.03     NA  -0.01
S1 × Sex        0.09   0.09   0.09    -0.00  -0.00  -0.01     0.00   0.00   0.01
S2 × Sex       -0.00  -0.00  -0.00     0.12   0.12   0.10    -0.13  -0.13  -0.12
S3 × Sex        0.00   0.00   0.00     0.01   0.01   0.00    -0.00  -0.00   0.00
S4 × Sex       -0.01     NA  -0.00     0.00     NA   0.01    -0.00     NA  -0.00
S1 × Region     0.06   0.06   0.07     0.01   0.01   0.04     0.00   0.00   0.01
S2 × Region    -0.16  -0.17  -0.15     0.00   0.00   0.03    -0.00  -0.00   0.01
S3 × Region    -0.00  -0.00   0.00     0.21   0.21   0.24     0.12   0.12   0.14
S4 × Region    -0.14     NA  -0.19    -0.00     NA  -0.10    -0.01     NA  -0.25
Age × Sex      -0.12  -0.12  -0.12    -0.15  -0.15  -0.14    -0.16  -0.16  -0.14
Age × Region    0.19   0.19   0.18    -0.14  -0.14  -0.13     0.13   0.13   0.14
Sex × Region   -0.00  -0.00  -0.00     0.00   0.00   0.00    -0.00  -0.00  -0.00

Web Table 2: Posterior mean (95% HPDI) for the total population size under the INC-C method for each year, for different values of the prior hyperparameters, $a$ and $b$. The analysis presented in the main manuscript corresponds to $a = b = 0.001$ and the posterior mean (95% HPDI) for the total population size under this analysis (from Table 4) is also shown here for comparison.
Year   a = 0.001        a = 0.001        a = 0.001        a = 0.001        Gelman
       b = 0.004        b = 0.002        b = 0.001        b = 0.0005       prior
2003   16300            16500            16700            16700            16500
       (14200, 20500)   (14300, 20800)   (14300, 20900)   (14300, 20900)   (14300, 20700)
2006   22800            23200            22900            23000            22900
       (15700, 26600)   (19800, 27000)   (16300, 27000)   (19300, 27600)   (18700, 27800)
2009   14600            14600            15600            15200            14600
       (11400, 18300)   (11500, 18300)   (11500, 18600)   (11700, 18700)   (11500, 18400)

References

[1] Forster, J. J. (2010). Bayesian inference for Poisson and multinomial log-linear models. Statistical Methodology, 7, 210–224.

[2] Gamerman, D. (1997). Sampling from the posterior distribution in generalised linear mixed models. Statistics and Computing, 7, 57–68.

[3] Forster, J. J., Gill, R. C. & Overstall, A. M. (2012). Reversible jump methods for generalised linear models and generalised linear mixed models. Statistics and Computing, 22, 107–120.