Author manuscript; available in PMC: 2020 Dec 5.
Published in final edited form as: Epidemiology. 2020 Mar;31(2):238–247. doi: 10.1097/EDE.0000000000001143

Transmission modeling with regression adjustment for analyzing household-based studies of infectious disease: application to tuberculosis

Forrest W Crawford 1, Florian M Marx 2, Jon Zelner 3, Ted Cohen 4
PMCID: PMC7718772  NIHMSID: NIHMS1644993  PMID: 31764276

Abstract

Background:

Household contacts of people infected with a transmissible disease may be at risk due to this proximate exposure, or from other unobserved sources. Understanding sources of variation in infection risk is essential for effectively targeting interventions.

Methods:

We develop an analytical approach to estimate household and exogenous forces of infection, while accounting for individual-level characteristics that may affect susceptibility to disease, and factors that may affect transmissibility. We apply this approach to a longitudinal cohort study conducted in Lima, Peru during 2009-2012 of 18,544 subjects in 4,500 households with at least one active tuberculosis (TB) case, and compare the results to those obtained by Poisson and logistic regression.

Results:

We estimate that HIV-coinfected (susceptibility hazard ratio, SHR=3.80, 1.56-9.29), child (SHR=1.72, 1.32-2.23) and teenage (SHR=2.00, 1.49-2.68) household contacts of TB cases experience a higher hazard of TB as a result of such household contact than do adult contacts. Isoniazid preventive therapy (SHR=0.30, 0.21-0.42) and previous BCG vaccination (SHR=0.66, 0.51-0.86) substantially reduce the risk of disease among household contacts. TB cases that were not microbiologically confirmed exert a smaller hazard of causing TB among their close contacts compared with cases that were smear or culture positive (excess HR=0.88, 0.82-0.93 for HIV− cases and 0.82, 0.57-0.94 for HIV+ cases). We estimated the extra-household force of infection results in 0.01 (95% CI: 0.004,0.028) TB cases per susceptible household contact per year, and the rate of transmission between a microbiologically confirmed TB case and susceptible household contact at 0.08 (95% CI: 0.045,0.129) TB cases per pair, per year.

Conclusions:

Accounting for exposure to infected household contacts permits estimation of risk factors for disease susceptibility/transmissibility, and comparison of within-household and exogenous forces of infection.


Much of what is known about TB transmissibility comes from prospective household-based cohort studies focused on risk of infection or disease among household contacts of a newly diagnosed “index” patient [1–4]. This study design is appealing because in addition to characterizing index cases and their co-prevalent household contacts, it may also reveal the time dynamics of secondary infections and yield information helpful for designing better TB interventions. From prospective household-based cohort studies of TB infection, three levels of epidemiologic inference are needed in order to inform targeted TB testing, prevention, and treatment resources.

First, knowledge of risk factors affecting susceptibility to TB infection and progression to disease after infection would help identify contacts who are at greatest risk of TB disease [5–7]. Young people may be especially susceptible to infection and severe forms of disease [8–13], and HIV+ individuals are more likely to progress to disease after infection. BCG vaccination and isoniazid preventive therapy (IPT) are associated with a decreased risk of TB disease [14–17].

Second, knowledge of factors affecting the risk of transmission by an infectious individual would help identify cases who are most likely to infect their susceptible contacts [18–20]. While HIV+ individuals are at higher risk for TB disease after infection, HIV+ people may transmit TB less often given the association between immunosuppression and extra-pulmonary TB and, in some settings, shorter durations of infectiousness [21–27]. Pulmonary disease with positive sputum smear status at the time of diagnosis is associated with higher risk of TB transmission to household contacts [12, 28–30].

Third, knowledge of the relative importance of community and household forces of infection could guide case-finding and intervention strategies at the community and household levels [7, 31–33].

Analytical approaches for estimating transmission risk from household-based data generally fall into two categories. First, researchers have developed transmission models for infectious disease dynamics in small groups [32, 34–40]. But fitting these models to individual-level data in small populations, i.e. households, can be challenging because likelihoods may not be tractable. Furthermore, most transmission models do not accommodate multiple covariates, making it difficult to perform regression-style adjustment for individual-level characteristics.

Second, researchers routinely employ traditional models for regression adjustment to clustered infectious disease data. When disease outcomes are binary, generalized linear models are often used: Poisson, logistic, and log-binomial regression approaches are popular [16, 31, 41, 42], often with household-level random intercept terms to account for unobserved heterogeneity. These methods permit estimation of risk or odds ratios, but may not deliver estimates of important epidemiologic features of disease transmission: the extra-household force of infection, and factors associated with susceptibility to disease, versus the infectiousness of an infected individual. Some researchers have attempted to adjust for measures of infection risk by assuming that an initial “index” case is the source of all infection risk experienced by household members [12, 27]. However, epidemiologists have warned that traditional approaches to estimation and regression adjustment for infectious disease outcomes may result in biased estimates because they cannot account for disease transmission [35, 43–47].

More recently, hybrid transmission models that combine features of mechanistic and regression approaches have been introduced, permitting regression-style adjustment for individual-level correlates of susceptibility and infectiousness [39, 48–56].

In this study, we adapt a hybrid regression/transmission model to estimate household and exogenous forces of TB infection, while adjusting for individual-level covariates associated with individual infectiousness and susceptibility to disease. We apply this approach to analyze outcomes among household contacts of TB cases in a prospective cohort study conducted in Lima, Peru from 2009-2012 [12, 16, 27, 57].

METHODS

Data and design of the household TB cohort study

We analyze data from a prospective cohort study of household contacts of adults (age ≥ 15 years) treated for active TB in Lima, Peru. Households were identified when a member (called the index case) was diagnosed with pulmonary TB at one of 106 participating health centers between September 2009 and August 2012. All household contacts of index cases were eligible for this study and were recruited during home visits within two weeks following the index case diagnosis.

Baseline, six- and twelve-month follow-up home visits were conducted by trained study nurses. During these visits, symptoms consistent with active TB were assessed among contacts via questionnaire; individuals with symptoms were then referred to a health center for clinical and bacteriological evaluation. A detailed description of the methods for this cohort study has previously been published [12, 16, 27, 57].

We analyze TB outcomes for household contacts who were TB disease-free at the time of enrollment. Contacts with active disease during the baseline visit (i.e. co-prevalent cases) were not included as incident cases in our analyses. We group index and co-prevalent cases together as “primary cases” and restrict attention to the 3,446 households with at least one primary TB case, and at least one disease-free household contact at enrollment. This leaves a total of 17,490 subjects, of whom 3,673 (21%) were primary TB cases and 13,817 were disease-free at enrollment. A total follow-up time of one year was considered. Our outcome of interest was incident bacteriologically-confirmed TB among household contacts. We refer to cases of incident active TB occurring among these subjects as secondary household TB.

For the 13,817 contacts free of active TB at enrollment, age, sex, HIV status, IPT status, and BCG vaccination status were recorded. From September 2009 to August 2011, HIV status was assessed using an enzyme immunoassay (EIA), with positive or ambiguous samples confirmed by immunofluorescence assay (IFA). From August 2011 to August 2012, HIV status was first assessed by rapid screening test, with positive or ambiguous samples confirmed by EIA and IFA [58]. 36/3,673=1% of primary cases and 163/13,817=1% of at-risk subjects were missing HIV test results; we assumed these subjects had negative HIV status. Prior BCG vaccination in at-risk subjects was assessed by self-report (and by inspection for characteristic scarring [59]); 4/13,817=0.03% were missing BCG information, and these subjects were assumed to have not received BCG. No at-risk subjects were missing IPT status. Every primary case had laboratory sputum smear or culture results immediately following diagnosis. A composite variable for microbiological confirmation (MC+), denoting either smear-positive or culture-positive TB disease, was used.

While the exact date of onset of active TB for these secondary cases is not observed, the last home visit before diagnosis and the date of diagnosis provide a time interval for each secondary TB case during which onset of active TB occurred. Section 3 of the Supplementary Material describes the temporal pattern of active disease onset intervals for secondary cases.

Transmission model

Hybrid transmission modeling and regression adjustment approaches can permit valid epidemiologic inference by accounting for both the structure of infectious disease transmission within households and covariates that may influence susceptibility and infectiousness [35, 36, 39, 51–53, 55, 56, 60, 61]. We develop a general model for infectious disease transmission that describes the instantaneous risk of infection for each susceptible household contact. The model permits simultaneous estimation of the exogenous and household forces of infection, while adjusting for covariates associated with progression to active disease, and infectiousness. For a collection of $N$ households, where household $i$ has $n_i$ members, let $x_{ij}$ be a $p \times 1$ vector of susceptibility-related baseline covariates for subject $j$ in household $i$. Let $z_{ij}$ be a $q \times 1$ vector of possibly different covariates that are related to infectiousness, given that $j$ has active (transmissible) infection. Define $y_{ij}(t)$ to be the indicator that subject $j$ in household $i$ is infected just before time $t$, and define $t_{ij}$ to be the time at which subject $j$ in household $i$ becomes infectious, or the end of the household follow-up window $t_i$, whichever comes first. Both times $t_i$ and $t_{ij}$ are measured relative to enrollment of household $i$. Define $t_{ij}^{dx} > t_{ij}$ to be the date on which the incident case $j$ was diagnosed with infection.

We construct a hazard model of infection by considering different sources of infectiousness to which a susceptible person $j$ in household $i$ is subjected, at time $t$ following enrollment of household $i$. Assume person $j$ is subjected to a constant force of infection (hazard) $\alpha$ from outside the household, measured as infections per unit time. Suppose further that the covariates $x_{ij}$ are related to the risk of infection from outside the household per unit time by the hazard $\alpha \exp[x_{ij}'\beta]$. Assume $j$ is also subject to a distinct and independent force of infection at time $t$ from any other infectious members of household $i$. Suppose that person $k \neq j$ in household $i$ is infectious. Then $j$ is subject to a hazard of $\exp[x_{ij}'\beta] \exp[z_{ik}'\gamma]$ per unit time, where $\exp[z_{ik}'\gamma]$ is the force of infection due to person $k$. To incorporate loss of infectiousness following diagnosis and onset of treatment of infection, we parameterize the baseline hazard of infection from $k$ to a susceptible individual. Define $\omega \in [0,1]$ to parameterize the proportional loss of infectiousness following onset of treatment, so that the infectiousness of a diagnosed case is multiplied by $\omega$, and let $w_{ik}(t) = 0$ if $t < t_{ik}$, $1$ if $t \in (t_{ik}, t_{ik}^{dx})$, and $\omega$ if $t > t_{ik}^{dx}$. The total hazard to subject $j$ in household $i$ at time $t$ is therefore

$$\lambda_{ij}(t) = \exp[x_{ij}'\beta] \left( \alpha + \sum_{k=1}^{n_i} w_{ik}(t) \exp[z_{ik}'\gamma] \right)$$

We treat ω as fixed in the statistical analysis and conduct a sensitivity study of its effect on β and γ in the Supplement. Figure 1 shows a schematic illustration of the infection hazard experienced by a susceptible subject over time, as other household members become infectious. A formal mathematical construction of the hazard model is given in Section 1 of the Supplementary Appendix.
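As a rough illustration of the hazard model above, the following minimal Python sketch evaluates the total hazard for one susceptible contact. The function names, the dictionary layout for household members, and the onset and diagnosis times are hypothetical; the parameter values are the rounded point estimates from Table 3, used here only for illustration. The treatment of the boundary points t = t_ik and t = t_ik^dx is an assumption, since the model text defines the weights on open intervals.

```python
import math

def infectiousness_weight(t, t_onset, t_dx, omega):
    """w_ik(t): 0 before onset, 1 from onset to diagnosis, omega afterwards.
    Assigning the boundary points to the later regime is an assumption."""
    if t < t_onset:
        return 0.0
    if t < t_dx:
        return 1.0
    return omega

def total_hazard(t, x_j, beta, alpha, household, gamma, omega):
    """lambda_ij(t) = exp(x_j'beta) * (alpha + sum_k w_ik(t) * exp(z_k'gamma))."""
    susceptibility = math.exp(sum(x * b for x, b in zip(x_j, beta)))
    household_foi = sum(
        infectiousness_weight(t, m["t_onset"], m["t_dx"], omega)
        * math.exp(sum(z * g for z, g in zip(m["z"], gamma)))
        for m in household
    )
    return susceptibility * (alpha + household_foi)

# Illustrative values: rounded Table 3 estimates, hypothetical times (years).
alpha = 0.01                                              # exogenous hazard
gamma = [math.log(0.08), math.log(0.60), math.log(1.50)]  # z = (1, HIV+, MC+)
beta = [math.log(0.30)]                                   # x = (IPT,)
omega = 0.2
household = [{"z": [1.0, 0.0, 1.0], "t_onset": 0.0, "t_dx": 0.25}]  # HIV-, MC+

before_dx = total_hazard(0.1, [0.0], beta, alpha, household, gamma, omega)
after_dx = total_hazard(0.5, [0.0], beta, alpha, household, gamma, omega)
on_ipt = total_hazard(0.5, [1.0], beta, alpha, household, gamma, omega)
print(before_dx, after_dx, on_ipt)
```

In this sketch the hazard for a contact not on IPT drops from about 0.13 to about 0.034 per year once the household case is diagnosed, and IPT scales either value by the factor 0.30.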

Figure 1.

Schematic illustration of the TB transmission hazard model in a single household of size three.

Epidemiologic meaning of parameters

The scalar parameter α represents the exogenous force of infection experienced by a single susceptible subject. Formally, α is the adjusted baseline risk for an individual subject to no household infectiousness. In this analysis, α is assumed to be constant in time and across households, which may be appropriate in areas where TB is endemic. When the model is correctly specified, α can be interpreted as the expected number of new infections per unit time that a susceptible individual would experience in the absence of a household exposure. If the model is incorrectly specified, for example by failing to accurately capture household exposure, α acts to absorb the extra risk not accounted for through the parameter describing within-household transmission. The household-level force of infection at time t is the sum of each infectious member’s infectiousness at that time. The regression coefficient γ controls the infectiousness of person k according to zik. Likewise, the regression coefficient β controls the susceptibility of person j to exposure to disease according to their characteristics xij. The susceptibility hazard ratio comparing the overall hazard at covariate level x* with that at level x is SHR = exp[(x* − x)′β]. Likewise, the infectiousness hazard ratio comparing the hazard of infection at level z* with that at level z is IHR = exp[(z* − z)′ γ]. When the first element of z is equal to 1, the corresponding first element of the parameter vector γ, denoted γ0, is interpretable as the log baseline risk of transmission, absent other infectiousness covariates.

These parameters also have familiar epidemiological meaning in combination. The excess hazard due to a single exposure to disease at level $z$ is the ratio of the infective hazard exerted by a single person with infectiousness covariates $z$ to the overall hazard, $\mathrm{EHE} = \exp[z'\gamma] / (\alpha + \exp[z'\gamma])$. The excess hazard due to exposure can change following initiation of treatment of a person with active disease. When $\omega < 1$, the excess hazard at level $z$ following diagnosis of the infectious individual is $\mathrm{EHE} = \omega \exp[z'\gamma] / (\alpha + \omega \exp[z'\gamma])$. The excess hazard due to exposure is a more specific measure of infective hazard than the ratio of infection odds in household contacts to controls (called the “community infection ratio”) that has previously been used to quantify TB risk due to household exposure [33]. The probability of exogenous infection for a particular incident secondary case $j$ in household $i$ is the ratio of the exogenous hazard to the total hazard experienced by person $j$ at the time of infection, $C_j = \alpha / \left( \alpha + \sum_{k=1}^{n_i} w_{ik}(t_{ij}) \exp[z_{ik}'\gamma] \right)$. Section 1.4 of the Supplementary Appendix describes these composite estimands, and their definitions in terms of the hazards $\lambda_{ij}(t)$, in greater detail.
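The composite estimands defined above are simple functions of α, β, and γ. The sketch below computes hazard ratios, the excess hazard due to exposure, and the probability of exogenous infection; the function names are hypothetical, and the inputs are the rounded point estimates from Table 3, so the outputs only approximate the values reported there.

```python
import math

def hazard_ratio(coef, level_new, level_ref):
    """exp[(v* - v)'coef]: an SHR when coef = beta, an IHR when coef = gamma."""
    return math.exp(sum(c * (a - b) for c, a, b in zip(coef, level_new, level_ref)))

def excess_hazard(alpha, gamma, z, omega=1.0):
    """EHE = omega*exp(z'gamma) / (alpha + omega*exp(z'gamma))."""
    h = omega * math.exp(sum(g * v for g, v in zip(gamma, z)))
    return h / (alpha + h)

def prob_exogenous(alpha, weighted_household_hazards):
    """C_j = alpha / (alpha + sum_k w_ik(t_ij) exp(z_ik'gamma)).
    Each list entry is one member's already-weighted hazard at time t_ij."""
    return alpha / (alpha + sum(weighted_household_hazards))

alpha = 0.01
gamma = [math.log(0.08), math.log(0.60), math.log(1.50)]  # z = (1, HIV+, MC+)

ihr_mc = hazard_ratio(gamma, [1, 0, 1], [1, 0, 0])  # MC+ vs MC-, about 1.5
ehe_neg = excess_hazard(alpha, gamma, [1, 0, 0])    # HIV-/MC-, about 0.89
ehe_mc = excess_hazard(alpha, gamma, [1, 0, 1])     # HIV-/MC+, about 0.92
c_j = prob_exogenous(alpha, [0.08 * 1.5])           # one untreated HIV-/MC+ case
print(ihr_mc, ehe_neg, ehe_mc, c_j)
```

With one untreated HIV−/MC+ household case, the exogenous share of the hazard in this example is about 0.08, the complement of the corresponding excess hazard.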

Relationship to other transmission models

The hazard model employed in this study is unlike traditional statistical models (e.g. logistic, Poisson) for regression analysis of binary infection outcomes because it incorporates the cumulative exposure to infective hazard experienced by each susceptible subject during the study. In particular, the hazard model does not assume that secondary cases are solely attributable to exposure from the index case [12, 27, 57]. However, the model is closely related to several common constructs in epidemiology. First, the susceptible-infective (SI) model of infectious disease transmission in a closed population [37, 38, 62] is a special case of the household-based hazard model obtained by setting α = 0 and ignoring individual-level covariates. Second, the Wells-Riley equation [63–67] describing the occurrence of new cases of an airborne disease in an enclosed space can be derived from the expected number of infections in the household hazard model. Third, several researchers have proposed similar risk or hazard regression models for infectious disease outcomes [39, 52–56]. In particular, Zelner et al. [12] propose a generalized additive mixed logistic model to estimate age-specific risk factors for latent TB infection; predictions from this model were used to estimate the community force of infection. Kenah [51] proposes a semiparametric version of the hazard model presented above with α = 0; time-varying transmission hazards can be estimated from this model. Similar transmission modeling approaches have been used to successfully adjust for individual-level covariates in household transmission models of influenza [48–50]. Section 1.3 of the Supplementary Appendix provides formal derivations of these correspondences.

Statistical analysis

The likelihood of the observed data under the hazard model takes a convenient form that is amenable to iterative maximization with respect to the unknown parameters θ = (α, β, γ). We estimate θ by maximum likelihood, with a multiple imputation procedure [68] for dealing with unobserved infectiousness onset times, derived in Section 3 of the Supplementary Appendix. We compute SHR for age, sex, HIV status, concurrent IPT, and BCG vaccination; IHR for HIV+ and microbiological TB confirmation; the community force of infection α, the baseline infective hazard exp[γ0], EHE under different infectiousness parameters, and the probability of community transmission for each secondary case. We also compare the results to risk ratios obtained by modified Poisson regression [42].
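The exact likelihood is derived in the Supplementary Appendix and is not reproduced here. As a hedged sketch only, the code below assumes the standard survival-analysis form for a single susceptible contact with known onset times: the log-likelihood contribution is minus the cumulative hazard up to time T, plus the log of the instantaneous hazard at T if the contact became a case at T. Because the weights w_ik(t) are step functions, the cumulative hazard has a closed form. All function names and numerical inputs are hypothetical.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weight(t, t_onset, t_dx, omega):
    """Step-function infectiousness weight w_ik(t)."""
    if t < t_onset:
        return 0.0
    return 1.0 if t < t_dx else omega

def exposure_time(T, t_onset, t_dx, omega):
    """Integral of w_ik(s) over [0, T]: untreated time plus omega * treated time."""
    untreated = max(0.0, min(T, t_dx) - max(0.0, t_onset))
    treated = max(0.0, T - max(0.0, t_dx))
    return untreated + omega * treated

def log_lik_contribution(T, infected, x_j, beta, alpha, others, gamma, omega):
    """Assumed form: -Lambda_ij(T) + [infected] * log(lambda_ij(T))."""
    susceptibility = math.exp(_dot(x_j, beta))
    cumulative = alpha * T   # exogenous part of the cumulative hazard
    instantaneous = alpha    # exogenous part of the hazard at T
    for m in others:
        foi = math.exp(_dot(m["z"], gamma))
        cumulative += foi * exposure_time(T, m["t_onset"], m["t_dx"], omega)
        instantaneous += foi * weight(T, m["t_onset"], m["t_dx"], omega)
    ll = -susceptibility * cumulative
    if infected:
        ll += math.log(susceptibility * instantaneous)
    return ll

# One household case diagnosed at enrollment (t_dx = 0), infectious beforehand.
others = [{"z": [1.0], "t_onset": -1.0, "t_dx": 0.0}]
gamma = [math.log(0.08)]
ll_censored = log_lik_contribution(1.0, False, [], [], 0.01, others, gamma, 0.2)
ll_event = log_lik_contribution(0.5, True, [], [], 0.01, others, gamma, 0.2)
print(ll_censored, ll_event)
```

Summing such contributions over contacts and maximizing over (α, β, γ) would give a maximum likelihood fit under these assumptions; the interval-censored onset times in the study require the imputation step described in the text.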

RESULTS

Descriptive results

Table 1 shows descriptive characteristics of subjects, stratified by their TB status: primary cases (index and secondary co-prevalent) diagnosed at baseline, secondary cases diagnosed during the study, and subjects who were not diagnosed during the study. Primary cases at enrollment were 58% male, but the overall gender balance including all household contacts was 47% male. The average overall age was 28.3 ± 19.5 years; most primary cases were between 18 and 65, matching the enrollment criteria for index patients. HIV prevalence at baseline was 123/3,673=3% among primary cases and 52/13,817=0.4% among at-risk individuals. A microbiological confirmation of TB disease was made for 3,200/3,673=87% of primary cases, and 3,384/4,068=83% of cases overall. Table 2 shows the number of primary and secondary TB cases across household sizes. A simulation analysis, presented in the Supplement, shows that simulated TB outcomes from the transmission model match the frequency distribution of observed TB cases across household sizes in Table 2.

Table 1.

Descriptive characteristics of subjects, stratified by TB status.

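| Characteristic | Primary TB (count) | Primary TB (%) | Secondary (count) | Secondary (%) | No TB (count) | No TB (%) | All (count) | All (%) |
|---|---|---|---|---|---|---|---|---|
| Age 0-2yr | 4 | 0 | 14 | 4 | 688 | 5 | 706 | 4 |
| Age 2-12yr | 38 | 1 | 94 | 24 | 2974 | 22 | 3106 | 18 |
| Age 12-18yr | 297 | 8 | 61 | 15 | 1507 | 11 | 1865 | 11 |
| Age 18-65yr | 3091 | 84 | 210 | 53 | 7591 | 57 | 10892 | 62 |
| Age >65yr | 243 | 7 | 16 | 4 | 662 | 5 | 921 | 5 |
| Male Sex | 2113 | 58 | 200 | 51 | 5960 | 44 | 8273 | 47 |
| HIV | 123 | 3 | 5 | 1 | 47 | 0 | 175 | 1 |
| IPT | 17 | 0 | 42 | 11 | 3026 | 23 | 3085 | 18 |
| BCG | 204 | 6 | 318 | 81 | 11572 | 86 | 12094 | 69 |
| TB dx | 3673 | 100 | 395 | 100 | 0 | 0 | 4068 | 23 |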

Table 2.

Number of primary and secondary TB cases in households of different size

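Number of households by number of primary cases (rows) and household size (columns):

| Primary cases | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | >15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 489 | 651 | 624 | 473 | 335 | 227 | 134 | 92 | 64 | 48 | 30 | 21 | 12 | 7 | 37 |
| 2 | 22 | 24 | 19 | 35 | 16 | 18 | 8 | 7 | 5 | 6 | 4 | 3 | 0 | 1 | 8 |
| 3 | 0 | 2 | 5 | 3 | 3 | 1 | 3 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 2 |
| 4 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

Number of households by number of secondary cases (rows) and household size (columns):

| Secondary cases | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | >15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 490 | 635 | 608 | 469 | 312 | 205 | 128 | 81 | 58 | 44 | 30 | 20 | 11 | 7 | 30 |
| 1 | 21 | 41 | 34 | 41 | 37 | 36 | 14 | 14 | 10 | 8 | 5 | 2 | 1 | 1 | 7 |
| 2 | 0 | 1 | 6 | 2 | 5 | 3 | 3 | 2 | 1 | 2 | 0 | 2 | 0 | 0 | 6 |
| 3 | 0 | 0 | 0 | 0 | 1 | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 3 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |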

Risk factors for susceptibility and transmission

Table 3 shows regression results for the hazard model of transmission with the proportion reduction in infectiousness following diagnosis ω = 0.2, and composite estimands comparing the exogenous and household forces of infection. We set ω to reflect empirical evidence that treatment for TB is effective in reducing transmissibility very rapidly [69, 70]. The parameter ω is the subject of a sensitivity analysis presented in the Supplement. The exogenous force of infection α is estimated to be 0.01 (95% CI: 0.004, 0.03). Note that α is a raw hazard, not a hazard ratio, that incorporates features of infection and disease progression. Hazard ratios corresponding to susceptibility and infectiousness covariates (defined heuristically above and formally in Section 1.4 of the Supplementary Appendix) are given in the rows below.

Table 3.

Hazards and hazard ratios for susceptibility and infectivity, with comparisons of the community and within-household force of infection for ω = 0.2.

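| Parameter | Estimate | 2.5% | 97.5% |
|---|---|---|---|
| Exogenous FOI | 0.01 | 0.00 | 0.03 |
| Susceptibility (SHR) | | | |
| Age 0-2yr | 1.04 | 0.60 | 1.81 |
| Age 2-12yr | 1.72 | 1.32 | 2.23 |
| Age 12-18yr | 2.00 | 1.49 | 2.68 |
| Age >65yr | 0.75 | 0.44 | 1.28 |
| Male Sex | 1.17 | 0.96 | 1.43 |
| HIV+ | 3.80 | 1.56 | 9.29 |
| IPT | 0.30 | 0.21 | 0.42 |
| BCG | 0.66 | 0.51 | 0.86 |
| Infectivity (IHR) | | | |
| Intercept | 0.08 | 0.05 | 0.13 |
| HIV | 0.60 | 0.20 | 1.83 |
| MC+ | 1.50 | 0.88 | 2.58 |
| Excess HR due to exposure | | | |
| HIV−/MC− | 0.88 | 0.82 | 0.93 |
| HIV+/MC− | 0.82 | 0.57 | 0.94 |
| HIV−/MC+ | 0.92 | 0.84 | 0.96 |
| HIV+/MC+ | 0.87 | 0.64 | 0.96 |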

Adjusted estimates of TB hazard experienced by household contacts in different age groups (adults age 19-65 are the reference group) exhibit patterns that are not evident in the unadjusted prevalence rates. Relative to adults (18-65 years), the susceptibility hazards by age are: for infants (0-2 years) SHR=1.04 (0.60,1.81); for children (3-12 years) SHR=1.72 (1.32,2.23); for teens (13-18 years) SHR=2.0 (1.49,2.68); and for the elderly (>65 years) SHR=0.75 (0.44,1.28). Figure 2 shows raw TB prevalence on the left-hand vertical axis by age, with a kernel regression estimate and bootstrap pointwise confidence interval. Estimated age susceptibility hazard ratios and confidence intervals are overlaid, with the hazard ratio scale on the right-hand vertical axis. Male sex was not significantly associated with increased hazard, SHR = 1.17 (0.96,1.43). Positive HIV status (measured at baseline) is associated with increased risk, SHR=3.8 (1.56,9.29). Isoniazid preventive therapy (IPT) is associated with decreased hazard, SHR=0.30 (0.21,0.42). BCG was associated with decreased hazard, SHR=0.66 (0.51,0.86). The supplementary appendix shows a sensitivity analysis for values of ω varying from 0.2 to 1.0.

Figure 2.

Raw TB prevalence and adjusted susceptibility hazard ratios by age. Raw prevalence (left-hand vertical axis) is estimated by kernel regression and bootstrap pointwise confidence interval. Estimated age susceptibility hazard ratios and confidence intervals overlaid, with the hazard ratio scale on the right-hand vertical axis.

The estimated “baseline” infectiousness hazard (exp[γ0] in the hazard model), in the absence of smear- or culture-confirmed TB and with negative HIV status, is 0.08 (0.05,0.13). The baseline infectiousness is a raw hazard, with units of transmissions per infective-susceptible pair per year. MC+ status upon diagnosis is not significantly associated with increased infectiousness hazard compared with TB cases that were diagnosed on the basis of symptoms in the absence of microbiological confirmation, IHR=1.5 (0.88,2.58). HIV+ status was not significantly associated with infectiousness hazard, IHR=0.6 (0.2,1.83) in this analysis.

Comparison of exogenous and household forces of infection

Comparing the baseline infectiousness hazard for a microbiologically unconfirmed case (exp[γ0]) to α directly, the hazard experienced by a susceptible subject from a single microbiologically unconfirmed HIV− household case (before that individual is diagnosed and treated) is almost eight times greater than the annual exogenous force of infection: exp[γ0]/α = 7.7 (2.4, 24.2). The excess hazard due to exposure to a single household primary case (defined heuristically above and formally in Section 1.4 of the Supplementary Appendix) is useful for comparison of the hazard of infection from this household source with the hazard of infection from exogenous sources. Larger values of the excess hazard indicate that more infectiousness is due to a single infective individual with the given characteristics than due to outside sources. The excess hazard of exposure is greatest when the infectious individual is HIV−. When the infectious individual is MC+ and HIV−, EHE=0.92 (0.84,0.96); when the infectious individual is MC− and HIV−, EHE=0.88 (0.82,0.93).

Comparison to estimated risk ratios from Poisson regression

Modified Poisson regression is a standard method for estimating risk ratios for clustered binary outcomes in prospective cohort studies. Table 4 shows estimated adjusted risk ratios for all susceptibility-related parameters: age, sex, HIV status, IPT, and BCG vaccination. We employ the standard “modified” Poisson regression [42] fitting framework using generalized estimating equations and heteroscedasticity-robust standard errors under an exchangeable household working correlation structure [71]. Section 4 of the Supplementary Material gives similar results obtained by using logistic regression with the same outcome and susceptibility-related covariates. Estimates of susceptibility hazard ratios are similar to risk ratios for the same variables obtained by modified Poisson regression, with heteroscedasticity-robust standard errors, shown in Table 4. However, the proposed transmission modeling approach reveals much more about the transmission of TB than is available through traditional analysis of aggregate risk. Specifically, we report IHR estimates for transmission-related variables and the community force of infection, information that is not obtained from traditional regression analyses or existing transmission modeling approaches.

Table 4.

Results from modified Poisson regression.

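| Covariate | Risk Ratio | 2.5% | 97.5% |
|---|---|---|---|
| (Intercept) | 0.04 | 0.03 | 0.05 |
| Age 0-2yr | 1.00 | 0.58 | 1.73 |
| Age 2-12yr | 1.61 | 1.21 | 2.15 |
| Age 12-18yr | 1.93 | 1.44 | 2.57 |
| Age >65yr | 0.83 | 0.50 | 1.37 |
| Male Sex | 1.21 | 1.00 | 1.46 |
| HIV+ | 3.60 | 1.40 | 9.24 |
| IPT | 0.30 | 0.20 | 0.44 |
| BCG | 0.67 | 0.52 | 0.87 |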

DISCUSSION

Transmission modeling and regression analysis

Longitudinal household-based studies of infectious disease outcomes reveal features of susceptible subjects’ exposure to infectiousness, but this information is often ignored in statistical analyses. Hybrid transmission modeling and regression approaches have been proposed, but are not yet widely used. The approach we employed uses assessments of exposure over time to disentangle individual risk factors from exposure to infection, while simultaneously estimating the overall community force of infection. Estimates of susceptibility parameters under the transmission hazard model have some similarities with risk ratios from Poisson regression, and previous analyses of the same study [12, 16, 27, 57], but traditional regression analyses do not provide information about transmissibility or infectiousness of infected individuals.

The approach outlined in this paper is subject to several limitations. We imposed parametric modeling assumptions to ensure interpretability and identifiability of parameters: constant exogenous force of infection, conditional independence of waiting times to transmission, and a log-linear model for infection hazards. When baseline enrollment of individuals happens simultaneously, it may be possible to estimate a time-varying exogenous force of infection α(t). Because all households in this study had at least one member with active TB at baseline, α may not be interpretable as the overall community force of infection if the within-household transmission model is mis-specified. Instead, α could be interpreted as a marginal TB risk not accounted for by other covariates, conditional on at least one household member with active TB. We also employed a parametric model for the reduction in risk of transmission following onset of treatment at the time of diagnosis. Misspecification of these hazards or unmodeled heterogeneity in individual or household-level risk could bias inferences of regression coefficients. Future extensions of this model could incorporate covariates and prior information to account for sources of variation in exogenous risk, as in Bayesian hierarchical regression models.

Implications of findings for TB

SHR estimates largely agree with previously published risk and odds ratios computed using longitudinal or cross-sectional (at baseline enrollment) TB outcomes. Younger people appear to be at increased risk of active TB disease when exposed [8–10, 12, 72], highlighting the difference between the raw TB prevalence and adjusted estimates shown in Figure 2. HIV+ individuals have greater susceptibility to TB [22–26, 57, 73]. While it is known that HIV+ people are less likely to transmit TB when infected [21, 22, 26, 57], our estimated IHR for positive HIV status in this analysis is not significantly different from one; previous analysis of these data suggested that the degree of HIV-associated immunosuppression (CD4) is associated with decreased transmission [57]. Furthermore, MC+ individuals may exert a slightly larger hazard of infection on their susceptible household members [16, 28–30] than individuals who are diagnosed without microbiological confirmation, which is consistent with previous data that show a positive relationship between bacterial load in the sputum and risk of infection among contacts.

As compared to a traditional regression-based approach, our mechanistic model allows us to make sense of seemingly contradictory inferences about the relative importance of the exogenous force of TB infection experienced by these household contacts of known TB patients [74]. We estimate the exogenous force of infection to be approximately α = 0.01 infections per susceptible person-year, which is substantially higher than the average exposure risk of community members in Lima at large, where the annual TB incidence was between 150-300 cases/100,000 (0.0015-0.003) during the period of the study. This suggests that household contacts were at a higher than average risk of exogenous exposure (possibly as a result of living in a relative transmission hotspot), or that our model did not accurately capture all of the risk associated with within-home exposure. One possibility is that household contacts of TB cases may share a higher risk of progression given infection due to factors not accounted for here, such as poor nutrition. A single infectious primary case (in the absence of other infectiousness effects) is estimated to exert a hazard of exp[γ0] = 0.08 on each susceptible in their household. When the infective individual is microbiologically confirmed and does not have HIV, this hazard is increased.

Our application of this novel method is subject to some additional caveats. Outcome misclassification could occur if some subjects were TB-infected but not diagnosed at enrollment. Some subjects classified as not TB-infected at baseline may have had undetected latent TB infection. In this case, the parameter α retains an interpretation as the adjusted overall risk of progressing to active TB disease in the absence of household-level infectiousness, but may not accurately reflect the exogenous force of infection. The onset of active TB, and hence onset of infectiousness, was unobserved for secondary cases, and is known only to lie within the time interval between the last TB-negative home visit and the date of diagnosis. Since we could not rule out possible dependence of clinic visit and diagnosis dates on the date of active disease onset, we chose not to impose the parametric conditional distribution for disease onset time, conditional on diagnosis date. Instead, we implemented a multiple imputation estimator that uniformly samples onset times within these intervals, which may result in a loss of statistical efficiency.
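The imputation step described above can be sketched as follows. Onset times are drawn uniformly within each interval between the last TB-negative visit and the diagnosis date, the model is refit on each completed data set, and the results are pooled. The pooling shown here uses Rubin's rules, which is an assumption about the combining step rather than a description of the procedure in the Supplementary Appendix; `toy_fit` is a hypothetical stand-in for the actual maximum likelihood fit, and the intervals are made up.

```python
import random
import statistics

def impute_onset_times(intervals, rng):
    """Draw one onset time uniformly within each (last_negative, diagnosis) interval."""
    return [rng.uniform(lo, hi) for lo, hi in intervals]

def pool_rubin(estimates, variances):
    """Pool M completed-data estimates: mean estimate, and total variance
    = within-imputation variance + (1 + 1/M) * between-imputation variance."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    return pooled, within + (1 + 1 / m) * between

def toy_fit(onset_times):
    """Hypothetical placeholder for refitting the hazard model on one
    completed data set; returns (estimate, variance)."""
    return statistics.fmean(onset_times), 0.01

rng = random.Random(0)
intervals = [(0.0, 0.5), (0.5, 1.0), (0.0, 1.0)]  # hypothetical, in years
fits = [toy_fit(impute_onset_times(intervals, rng)) for _ in range(20)]
pooled_estimate, pooled_variance = pool_rubin([f[0] for f in fits],
                                              [f[1] for f in fits])
print(pooled_estimate, pooled_variance)
```

The between-imputation term is what carries the extra uncertainty from not observing onset times, which is one way to see the loss of efficiency mentioned above.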

This modeling framework is applicable to diseases other than TB and to questions that extend beyond household-based transmission. For example, in a public health response such as that to the Ebola outbreak in West Africa, data are often collected from clustered environments (households, villages, schools, hospitals) in the course of contact investigation. In these contexts, even when the exogenous infection rate is the primary outcome of interest, it is still necessary to account for transmission within clusters to obtain accurate estimates of community exposure that make maximal use of available data.

Supplementary Material

supplementary appendix

Acknowledgements

This work was supported by NIH grants NIAID U19 A1076217, NICHD 1DP2HD091799, and NIAID R01 AI112438. We are grateful to Chuan Chin Huang, Olga Morozova, Megan Murray, Laura F. White, and Zibiao Zhang for helpful comments on the manuscript. We also thank Mercedes Becerra, Leo Lecca, and all the members of Socios En Salud in Lima, Peru.

Contributor Information

Forrest W. Crawford, Department of Biostatistics, Yale School of Public Health; Department of Statistics & Data Science, Yale University; Department of Ecology & Evolutionary Biology, Yale University; Yale School of Management

Florian M. Marx, Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT

Jon Zelner, Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI.

Ted Cohen, Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT.

References

  • 1. Teixeira L, et al., Infection and disease among household contacts of patients with multidrug-resistant tuberculosis. The International Journal of Tuberculosis and Lung Disease, 2001. 5(4): p. 321–328.
  • 2. Guwatudde D, et al., Tuberculosis in household contacts of infectious cases in Kampala, Uganda. American journal of epidemiology, 2003. 158(9): p. 887–898.
  • 3. Morrison J, Pai M, and Hopewell PC, Tuberculosis and latent tuberculosis infection in close contacts of people with pulmonary tuberculosis in low-income and middle-income countries: a systematic review and meta-analysis. The Lancet infectious diseases, 2008. 8(6): p. 359–368.
  • 4. Van Schalkwyk C, et al., Incidence of TB and HIV in prospectively followed household contacts of TB index patients in South Africa. PloS one, 2014. 9(4): p. e95372.
  • 5. Lienhardt C, From exposure to disease: the role of environmental factors in susceptibility to and development of tuberculosis. Epidemiol Rev, 2001. 23(2).
  • 6. Tornee S, et al., Risk factors for tuberculosis infection among household contacts in Bangkok, Thailand. 2004.
  • 7. Middelkoop K, et al., Decreasing household contribution to TB transmission with age: a retrospective geographic analysis of young people in a South African township. BMC infectious diseases, 2014. 14(1): p. 1.
  • 8. Lienhardt C, et al., Risk factors for tuberculosis infection in children in contact with infectious tuberculosis cases in the Gambia, West Africa. Pediatrics, 2003. 111(5): p. e608–e614.
  • 9. Singh M, et al., Prevalence and risk factors for transmission of infection among children in household contact with adults having pulmonary tuberculosis. Archives of disease in childhood, 2005. 90(6): p. 624–628.
  • 10. Middelkoop K, et al., Childhood tuberculosis infection and disease: a spatial and temporal transmission analysis in a South African township. SAMJ: South African Medical Journal, 2009. 99(10): p. 738–743.
  • 11. Wood R, et al., Tuberculosis transmission to young children in a South African community: modeling household and community infection risks. Clinical infectious diseases, 2010. 51(4): p. 401–408.
  • 12. Zelner JL, et al., Age-specific risks of tuberculosis infection from household and community exposures and opportunities for interventions in a high-burden setting. American journal of epidemiology, 2014. 180(8): p. 853–861.
  • 13. Dodd PJ, et al., Age-and sex-specific social contact patterns and incidence of Mycobacterium tuberculosis infection. American journal of epidemiology, 2016. 183(2): p. 156–166.
  • 14. Colditz GA, et al., Efficacy of BCG vaccine in the prevention of tuberculosis: meta-analysis of the published literature. JAMA, 1994. 271(9): p. 698–702.
  • 15. Mangtani P, et al., Protection by BCG vaccine against tuberculosis: a systematic review of randomized controlled trials. Clinical infectious diseases, 2014. 58(4): p. 470–480.
  • 16. Zelner JL, et al., Bacillus Calmette-Guérin and isoniazid preventive therapy protect contacts of patients with tuberculosis. American journal of respiratory and critical care medicine, 2014. 189(7): p. 853–859.
  • 17. Nemes E, et al., Prevention of M. tuberculosis infection with H4: IC31 vaccine or BCG revaccination. New England Journal of Medicine, 2018. 379(2): p. 138–149.
  • 18. Rodrigo T, et al., Characteristics of tuberculosis patients who generate secondary cases. The International Journal of Tuberculosis and Lung Disease, 1997. 1(4): p. 352–357.
  • 19. Tostmann A, et al., Tuberculosis transmission by patients with smear-negative pulmonary tuberculosis in a large cohort in the Netherlands. Clinical Infectious Diseases, 2008. 47(9): p. 1135–1142.
  • 20. Crampin A, et al., Assessment and evaluation of contact as a risk factor for tuberculosis in rural Africa. The International Journal of Tuberculosis and Lung Disease, 2008. 12(6): p. 612–618.
  • 21. Cruciani M, et al., The impact of human immunodeficiency virus type 1 on infectiousness of tuberculosis: a meta-analysis. Clinical Infectious Diseases, 2001. 33(11): p. 1922–1930.
  • 22. Carvalho AC, et al., Transmission of Mycobacterium tuberculosis to contacts of HIV-infected tuberculosis patients. American Journal of Respiratory and Critical Care Medicine, 2001. 164(12): p. 2166–2171.
  • 23. Mohammad Z, et al., A preliminary study of the influence of HIV infection in the transmission of tuberculosis. 2002.
  • 24. Pai M, McCulloch M, and Colford JM, Meta-analysis of the impact of HIV on the infectiousness of tuberculosis: methodological concerns. Clinical Infectious Diseases, 2002. 34(9): p. 1285–1287.
  • 25. Kenyon T, et al., Risk factors for transmission of Mycobacterium tuberculosis from HIV-infected tuberculosis patients, Botswana. The International Journal of Tuberculosis and Lung Disease, 2002. 6(10): p. 843–850.
  • 26. Crampin AC, et al., Tuberculosis transmission attributable to close contacts and HIV status, Malawi. Emerging infectious diseases, 2006. 12(5): p. 729–35.
  • 27. Huang C, et al., Cigarette smoking among tuberculosis patients increases risk of transmission to child contacts. The International Journal of Tuberculosis and Lung Disease, 2014. 18(11): p. 1285–1291.
  • 28. Shaw J and Wynn-Williams N, Infectivity of pulmonary tuberculosis in relation to sputum status. American Review of Tuberculosis and Pulmonary Diseases, 1954. 69(5): p. 724–32.
  • 29. Lutong L and Bei Z, Association of prevalence of tuberculin reactions with closeness of contact among household contacts of new smear-positive pulmonary tuberculosis patients [Notes from the Field]. The International Journal of Tuberculosis and Lung Disease, 2000. 4(3): p. 275–277.
  • 30. Ferrarini M, et al., Rate of tuberculosis infection in children and adolescents with household contact with adults with active pulmonary tuberculosis as assessed by tuberculin skin test and interferon-gamma release assays. Epidemiology and infection, 2016. 144(04): p. 712–723.
  • 31. Verver S, et al., Proportion of tuberculosis transmission that takes place in households in a high-incidence area. The Lancet, 2004. 363(9404): p. 212–214.
  • 32. Brooks-Pollock E, et al., Epidemiologic inference from the distribution of tuberculosis cases in households in Lima, Peru. Journal of Infectious Diseases, 2011. 203(11): p. 1582–1589.
  • 33. Madico G, et al., Community infection ratio as an indicator for tuberculosis control. The Lancet, 1995. 345(8947): p. 416–419.
  • 34. Akhtar S, Carpenter T, and Rathi S, A chain-binomial model for intra-household spread of Mycobacterium tuberculosis in a low socio-economic setting in Pakistan. Epidemiology and infection, 2007. 135(01): p. 27–33.
  • 35. Longini IM Jr and Koopman JS, Household and community transmission parameters from final distributions of infections in households. Biometrics, 1982: p. 115–126.
  • 36. Rampey AH Jr, et al., A discrete-time model for the statistical analysis of infectious disease incidence data. Biometrics, 1992: p. 117–128.
  • 37. Anderson RM, May RM, and Anderson B, Infectious diseases of humans: dynamics and control. Vol. 28 1992: Wiley Online Library.
  • 38. Andersson H and Britton T, Stochastic epidemic models and their statistical analysis. Vol. 151 2012: Springer Science & Business Media.
  • 39. Cauchemez S, et al., A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data. Statistics in medicine, 2004. 23(22): p. 3469–3487.
  • 40. Kasaie P, et al., Timing of tuberculosis transmission and the impact of household contact tracing. An agent-based simulation model. American journal of respiratory and critical care medicine, 2014. 189(7): p. 845–852.
  • 41. McNutt L-A, et al., Estimating the relative risk in cohort studies and clinical trials of common outcomes. American journal of epidemiology, 2003. 157(10): p. 940–943.
  • 42. Zou G, A modified poisson regression approach to prospective studies with binary data. American journal of epidemiology, 2004. 159(7): p. 702–706.
  • 43. Morozova O, Cohen T, and Crawford FW, Risk ratios for contagious outcomes. Journal of The Royal Society Interface, 2018. 15(138).
  • 44. Halloran ME and Struchiner CJ, Causal inference in infectious diseases. Epidemiology, 1995: p. 142–151.
  • 45. Longini IM Jr, et al., Statistical inference for infectious diseases: risk-specific household and community transmission parameters. American Journal of Epidemiology, 1988. 128(4): p. 845–859.
  • 46. Koopman JS, et al., Assessing risk factors for transmission of infection. American Journal of Epidemiology, 1991. 133(12): p. 1199–1209.
  • 47. Eisenberg JN, et al., Bias due to secondary transmission in estimation of attributable risk from intervention trials. Epidemiology, 2003: p. 442–450.
  • 48. Tsang TK, et al., Association between antibody titers and protection against influenza virus infection within households. The Journal of infectious diseases, 2014. 210(5): p. 684–692.
  • 49. Tsang TK, et al., Individual correlates of infectivity of influenza A virus infections in households. PloS one, 2016. 11(5): p. e0154418.
  • 50. Cheung DH, et al., Association of Oseltamivir Treatment With Virus Shedding, Illness, and Household Transmission of Influenza Viruses. The Journal of Infectious Diseases, 2015. 212(3): p. 391–396.
  • 51. Kenah E, Semiparametric relative-risk regression for infectious disease transmission data. Journal of the American Statistical Association, 2015. 110(509): p. 313–325.
  • 52. Auranen K, et al., Transmission of pneumococcal carriage in families: a latent Markov process model for binary longitudinal data. Journal of the American Statistical Association, 2000. 95(452): p. 1044–1053.
  • 53. Cauchemez S, et al., Investigating heterogeneity in pneumococcal transmission: a Bayesian MCMC approach applied to a follow-up of schools. Journal of the American Statistical Association, 2006. 101(475): p. 946–958.
  • 54. Petrie JG, et al., Application of an individual-based transmission hazard model for estimation of influenza vaccine effectiveness in a household cohort. American journal of epidemiology, 2017. 186(12): p. 1380–1388.
  • 55. Rhodes PH, Halloran ME, and Longini IM Jr, Counting process models for infectious disease data: distinguishing exposure to infection from susceptibility. Journal of the Royal Statistical Society. Series B (Methodological), 1996: p. 751–762.
  • 56. Tsang TK, et al., Transmissibility of Norovirus in Urban Versus Rural Households in a Large Community Outbreak in China. Epidemiology, 2018. 29(5): p. 675–683.
  • 57. Huang C-C, et al., The effect of HIV-related immunosuppression on the risk of tuberculosis transmission to household contacts. Clinical infectious diseases, 2014. 58(6): p. 765–774.
  • 58. Galea JT, et al., Rapid home-based human immunodeficiency virus testing to reduce costs in a large tuberculosis cohort study [Short communication]. Public health action, 2013. 3(2): p. 172–174.
  • 59. Floyd S, et al., BCG scars in northern Malawi: sensitivity and repeatability of scar reading, and factors affecting scar size. The International Journal of Tuberculosis and Lung Disease, 2000. 4(12): p. 1133–1142.
  • 60. Haber M, Longini IM Jr, and Cotsonis GA, Models for the statistical analysis of infectious disease data. Biometrics, 1988: p. 163–173.
  • 61. Kenah E, Non-parametric survival analysis of infectious disease data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2013. 75(2): p. 277–303.
  • 62. Becker NG, Analysis of infectious disease data. Vol. 33 1989: CRC Press.
  • 63. Wells WF, Airborne Contagion and Air Hygiene. An Ecological Study of Droplet Infections. 1955.
  • 64. Riley RL, Airborne infection. The American journal of medicine, 1974. 57(3): p. 466–475.
  • 65. Riley E, Murphy G, and Riley R, Airborne spread of measles in a suburban elementary school. American Journal of Epidemiology, 1978. 107(5): p. 421–432.
  • 66. Beggs C, et al., The transmission of tuberculosis in confined spaces: an analytical review of alternative epidemiological models. The international journal of tuberculosis and lung disease, 2003. 7(11): p. 1015–1026.
  • 67. Noakes CJ and Sleigh PA. Applying the Wells-Riley equation to the risk of airborne infection in hospital environments: The importance of stochastic and proximity effects. in Indoor Air 2008: The 11th International Conference on Indoor Air Quality and Cl 2008. Leeds.
  • 68. Little RJ and Rubin DB, Statistical analysis with missing data. 2014: John Wiley & Sons.
  • 69. Crofton J, The contribution of treatment to the prevention of tuberculosis. Bull Int Union Tuberc, 1962. 32(2): p. 643–653.
  • 70. Rouillon A, Perdrizet S, and Parrot R, Transmission of tubercle bacilli: the effects of chemotherapy. Tubercle, 1976. 57(4): p. 275–299.
  • 71. Liang K-Y and Zeger SL, Longitudinal data analysis using generalized linear models. Biometrika, 1986: p. 13–22.
  • 72. Johnstone-Robertson SP, et al., Social mixing patterns within a South African township community: implications for respiratory disease transmission and control. Am J Epidemiol, 2011. 174(11): p. 1246–55.
  • 73. Ayles HM, et al., ZAMSTAR, The Zambia South Africa TB and HIV Reduction study: Design of a 2× 2 factorial community randomized trial. Trials, 2008. 9(1): p. 1.
  • 74. McCreesh N and White RG, An explanation for the low proportion of tuberculosis that results from transmission between household and known social contacts. Scientific reports, 2018. 8(1): p. 5382.
