Abstract
Patterns in risk-related behaviors identified using clinically deployed surveys may hold value for public health surveillance. However, because such surveys assess subjects only when subjects choose to visit clinics, clinical data are subject to variability in observation patterns that is not present in conventional longitudinal data sets in which research teams contact subjects at regular intervals. In this issue of the Journal, Wilkinson et al. (Am J Epidemiol. 2017;185(8):627–635) describe how they applied a latent transition analysis technique to surveillance data collected during clinic visits. In this commentary, I discuss the selection bias that may arise in longitudinal analysis of clinical data due to subject-specific observation patterns, with particular focus on issues that may arise from classifying successive clinical visits as waves. I suggest that quantitative bias analysis and inverse probability weighting may be useful techniques with which to assess and control bias in future latent transition analyses of clinical data.
Keywords: clinical cohort studies, latent transition analysis, selection bias, surveillance
Surveillance—monitoring of trends in health behaviors and environmental conditions that may affect disease control—is a central function of a public health system (1). Surveillance systems collect data on primary indicators of risk to set health system priorities, to plan interventions, and to evaluate intervention effectiveness (2). The distinction between surveillance and research is not well-defined; whereas surveillance is typically more descriptive and research more etiological, the two often use similar analytical strategies (3).
In this issue, Wilkinson et al. (4) apply an analytical technique developed for research to identify surveillance-relevant categories (latent classes) of behaviors related to human immunodeficiency virus risk among men who have sex with men (MSM) in the state of Victoria, Australia, as well as transitions between those categories (latent transitions). Identifying risk categories and transitions representative of the underlying population of MSM is valuable for surveillance: Risk categories may help authorities to target public health initiatives that ensure surveillance covers relevant populations or identify changes in underlying risk behavior patterns. However, the use of clinical data for surveillance is intrinsically limited, because the only opportunities to observe subjects are clinic visits, which may themselves result from risk behavior or changes in risk behavior (5). As a result, the categories and transitions identified may not represent true patterns of risk behavior and changes in risk behavior in the underlying population of MSM in Victoria.
Specifically, Wilkinson et al. use data from risk assessment questionnaires routinely administered during clinic visits that are part of the Victorian Primary Care Network for Sentinel Surveillance on BBVs and STIs (VPCNSS) (4). VPCNSS clinic visits occurred at subject-specific intervals—that is, each subject was observed only when something caused him to visit a clinic. This variability in observation, sometimes referred to as a dynamic observation plan—in contrast to the static observation plans used in conventional longitudinal studies, wherein researchers contact subjects within fixed waves of data collection (5)—may induce several forms of selection bias. While this selection bias may take the conventional form in which subjects who avoid screening cannot be accounted for in identifying behavior classes or transition patterns, it may also take a less familiar form when successive clinical observations are treated as waves of longitudinal data (as Wilkinson et al. treated them). First, the behaviors reported during periods in which subjects were observed (i.e., visited a clinic) may differ from behaviors undertaken during periods in which subjects were not observed (i.e., did not visit a clinic), resulting in misidentified transitions. Second, because the interval between observations is subject-specific and may be related to the subject's perception of risk, it is unclear how to interpret predictors of transitioning between latent classes.
To illustrate, consider why men might choose to visit VPCNSS clinics. One reason, as Wilkinson et al. suggest, is that men may be following clinical guidelines recommending that MSM be routinely screened for human immunodeficiency virus or other sexually transmitted infections (4). An alternative reason may be that men who have recently acquired a new partner or partners are concerned that they have recently been exposed to sexually transmitted infections and hence choose to visit a clinic for screening (6). A third possibility might be a reason that is inconsistent in timing but unrelated to risk behavior (e.g., inconsistent nagging from concerned relatives). Note that men visiting a clinic for the first hypothesized reason can be empirically identified in the VPCNSS data set, because men whose clinic visits are inspired by guidelines should visit clinics at roughly even time intervals. Distinguishing between the latter 2 reasons, however, would require more detailed evidence than is available in the VPCNSS survey.
If men visit clinics as a matter of routine, then behaviors reported when subjects are observed should reflect behaviors engaged in when subjects go unobserved. However, if there is some association between the underlying patterning of risk behaviors and the probability of visiting a VPCNSS clinic, latent class and latent transition analysis may be flawed in 2 ways. First, in the latent transition analysis, missing observations within subjects might result in failing to identify transitions between latent classes that actually occurred. For example, imagine a subject who had sex with several partners, using condoms, in 2007 and visited the clinic to be screened. In 2008, he became sexually committed to a single partner for a year and, perceiving his lower risk, chose not to be screened. In 2009, he acquired several new partners, again using condoms, and chose once again to be screened. Such a man would likely appear in Wilkinson et al.’s analysis to have remained in the “risk minimizer” latent class from wave 1 to wave 2, though in fact his overall behavior was more consistent with a transition from “risk minimizer” to “monogamous” and back to “risk minimizer” (Figure 1) (4). More generally, the risk categories that men embody during unobserved times may not reflect the categories they embody during observed times, leading to inaccurate portrayal of class transitions.
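The hypothetical subject above can be written out as a minimal sketch. The yearly class labels follow the example in the text; the data structures and the `transitions` helper are illustrative assumptions and are not part of Wilkinson et al.'s analysis.

```python
# Hypothetical yearly latent classes for one subject, following the example in the text.
true_classes = {
    2007: "risk minimizer",
    2008: "monogamous",
    2009: "risk minimizer",
}

# Assumption: the subject is observed only in years when he chose to be screened.
visit_years = [2007, 2009]


def transitions(sequence):
    """Return consecutive (from, to) pairs in which the class changed."""
    return [(a, b) for a, b in zip(sequence, sequence[1:]) if a != b]


# Full trajectory, including the year in which the subject went unobserved.
true_sequence = [true_classes[year] for year in sorted(true_classes)]

# Trajectory as it would appear if successive visits were treated as waves.
observed_sequence = [true_classes[year] for year in sorted(visit_years)]

true_transitions = transitions(true_sequence)
observed_transitions = transitions(observed_sequence)

print("True transitions:", true_transitions)
print("Observed transitions:", observed_transitions)
```

In this sketch the full trajectory contains 2 transitions, whereas the visit-based trajectory contains none, because the unobserved year is exactly the year in which the subject's class differed.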
Second, both the calendar time of each observation and the elapsed time between observations differ across subjects, depending on how often they visit the clinic. Figure 2 illustrates how clinical visit history for 5 hypothetical subjects would have been operationalized as waves in Wilkinson et al.’s analysis (4). One implication of this operationalization is that “waves” are properly interpreted as time elapsed until the next visit and have no cohort-wide interpretation. It is thus somewhat perplexing that the probability of transition was quite different between waves 1 and 2 as compared with waves 2 and 3; secular trends cannot explain the differences, because calendar time was different for each subject, nor can differences in time observed, because observations had no consistent baseline.
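A minimal sketch of this operationalization may make the point concrete. The visit dates below are invented for 2 hypothetical subjects, and the `to_waves` helper is an assumption for illustration; it is not the procedure used by Wilkinson et al.

```python
from datetime import date

# Invented visit histories for 2 hypothetical subjects.
visits = {
    "A": [date(2007, 1, 15), date(2007, 7, 15), date(2008, 1, 15)],
    "B": [date(2007, 3, 1), date(2009, 3, 1), date(2009, 9, 1)],
}


def to_waves(visit_dates):
    """Number a subject's visits 1, 2, 3, ... and record the gap before each visit."""
    ordered = sorted(visit_dates)
    waves = []
    for index, visit in enumerate(ordered):
        gap = (visit - ordered[index - 1]).days if index > 0 else None
        waves.append({"wave": index + 1, "date": visit, "days_since_previous": gap})
    return waves


waves_by_subject = {subject: to_waves(dates) for subject, dates in visits.items()}

for subject, waves in waves_by_subject.items():
    for wave in waves:
        print(subject, wave["wave"], wave["date"], wave["days_since_previous"])
```

Under these assumed dates, wave 2 falls in 2007 for subject A and in 2009 for subject B, and the interval between waves 1 and 2 is about 6 months for one and 2 years for the other, so the wave index carries neither a common calendar time nor a common elapsed time.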
With these concerns in mind, is latent transition analysis of clinical surveillance data appropriate? On the one hand, as the growth of the “Big Data” research paradigm continues to make clinical and other secondary data widely available (7), identification of patterns in these data, even patterns subject to moderate bias, may provide important insights for public health stewardship. On the other hand, planning interventions using flawed or misinterpreted evidence may be worse than taking no action at all. For example, if selection bias due to subjects’ choosing to visit clinics after beginning to engage in riskier behaviors results in the appearance of an increase in risky sexual behavior among MSM more broadly, then public health authorities—unaware that this appearance is artifactual—might devote resources to attempting to reverse a trend that does not exist in truth, thereby depriving other worthy public health causes of these resources.
Future latent transition analyses of clinical surveillance data might benefit from techniques to assess and control selection bias. For example, quantitative bias analysis (8) might be used to explore how strong selection bias would need to be to produce qualitative differences in key inferences. Similarly, inverse-probability-weighted analyses that account for differences in the probability of visiting a clinic in relation to underlying risk behavior patterns might improve future analyses. Such inverse-probability-weighted analyses have been shown to reduce bias in analyses of clinical cohort data (5), though to the best of my knowledge, they have not previously been used in latent transition analyses. Indeed, future surveillance surveys might productively include a question assessing the purpose for the clinic visit, acknowledging that such a question may increase respondent burden (9), to allow for more complete control of selection bias than would be possible using clinical measures alone in computing inverse probability weights.
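The logic of inverse probability weighting can be sketched with a deliberately simple example. All counts and visit probabilities below are invented, and the visit probabilities are assumed to be known for each behavior class; in practice they would have to be estimated from measured covariates. The sketch weights a simple prevalence estimate and does not implement a weighted latent transition model.

```python
# Hypothetical population: all counts and visit probabilities are assumptions.
population = {"higher risk": 300, "lower risk": 700}

# Assumed probability of visiting a clinic, by underlying behavior class.
visit_probability = {"higher risk": 0.8, "lower risk": 0.2}

# Expected number of men in each class who are observed at a clinic.
observed = {cls: population[cls] * visit_probability[cls] for cls in population}


def prevalence(counts, target):
    """Share of the total count that falls in the target class."""
    return counts[target] / sum(counts.values())


# Naive estimate: uses clinic attendees as if they represented the population.
naive_estimate = prevalence(observed, "higher risk")

# Weighted estimate: each observed man counts for 1 / P(visit | class) men.
weighted = {cls: observed[cls] / visit_probability[cls] for cls in observed}
weighted_estimate = prevalence(weighted, "higher risk")

true_prevalence = prevalence(population, "higher risk")

print("True prevalence:", round(true_prevalence, 3))
print("Naive estimate:", round(naive_estimate, 3))
print("Weighted estimate:", round(weighted_estimate, 3))
```

With these invented numbers the naive estimate overstates the prevalence of the higher-risk class (about 0.63 versus a true value of 0.30), and the weighted estimate recovers the true value. That recovery depends entirely on the visit probabilities being correct, which is why a survey question on the purpose of the visit could matter for computing the weights.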
ACKNOWLEDGMENTS
Author affiliation: Harborview Injury Prevention and Research Center, University of Washington, Seattle, Washington (Stephen J. Mooney).
S.J.M. was supported by Eunice Kennedy Shriver National Institute of Child Health and Human Development grant 5T32HD057833-07.
I thank Stephanie Shiau for her helpful comments.
Conflict of interest: none declared.
REFERENCES
1. Teutsch SM, Churchill RE. Principles and Practice of Public Health Surveillance. New York, NY: Oxford University Press; 2000.
2. Thacker SB, Berkelman RL. Public health surveillance in the United States. Epidemiol Rev. 1988;10:164–190.
3. Lussier MT, Richard C, Bennett TL, et al. Surveillance or research: what's in a name. Can Fam Physician. 2012;58(1):117.
4. Wilkinson AL, El-Hayek C, Fairley CK, et al. Measuring transitions in sexual risk among men who have sex with men: the novel use of latent class and latent transition analysis in HIV sentinel surveillance. Am J Epidemiol. 2017;185(8):627–635.
5. Hernán MA, McAdams M, McGrath N, et al. Observation plans in longitudinal studies with time-varying treatments. Stat Methods Med Res. 2008;18(1):27–52.
6. Kellerman SE, Lehman JS, Lansky A, et al. HIV testing within at-risk populations in the United States and the reasons for seeking or avoiding HIV testing. J Acquir Immune Defic Syndr. 2002;31(2):202–210.
7. Mooney SJ, Westreich DJ, El-Sayed AM. Commentary: epidemiology in the era of big data. Epidemiology. 2015;26(3):390–394.
8. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. New York, NY: Springer Science & Business Media; 2011.
9. Ulrich CM, Wallen GR, Feister A, et al. Respondent burden in clinical research: when are we asking too much of subjects. IRB. 2005;27(4):17–20.