Author manuscript; available in PMC: 2018 Jan 1.
Published in final edited form as: J Clin Epidemiol. 2016 Oct 19;81:129–139. doi: 10.1016/j.jclinepi.2016.09.012

A pseudo-random patient sampling method evaluated

Nicole L De La Mata 1,§, Mi-Young Ahn 2, Nagalingeswaran Kumarasamy 3, Penh Sun Ly 4, Oon Tek Ng 5, Kinh Van Nguyen 6, Tuti Parwati Merati 7, Thuy Thanh Pham 8, Man Po Lee 9, Nicolas Durier 10, Matthew G Law 1
PMCID: PMC5318236  NIHMSID: NIHMS824168  PMID: 27771357

Abstract

Objective

To compare two HIV cohorts to determine whether a pseudo-random sample can represent the entire study population.

Study Design and Setting

HIV-positive patients receiving care at 8 sites in 7 Asian countries. TAHOD pseudo-randomly selected a patient sample, while TAHOD-LITE included all patients. We compared patient demographics, CD4 count and HIV viral load testing for each cohort. Risk factors associated with CD4 count response, HIV viral load suppression (<400 copies/mL) and survival were determined for each cohort.

Results

There were 2318 TAHOD patients and 14714 TAHOD-LITE patients. Patient demographics, CD4 count and HIV viral load testing rates were broadly similar between the cohorts. CD4 count response and all-cause mortality were consistent among the cohorts with similar risk factors. HIV viral load response appeared to be superior in TAHOD and many risk factors differed, possibly due to viral load being tested on a subset of patients.

Conclusions

Our study provides the first empirical evidence that analyses of risk factors for completely ascertained endpoints in our pseudo-randomly selected patient sample may be generalized to our larger, complete population of HIV-positive patients. However, results can vary substantially when analysing smaller or pseudo-random samples, particularly if some patient data, such as viral load results, are not missing completely at random.

Keywords: Asia, HIV, patient sampling, cohort, selection bias, observational data

Introduction

Observational cohort studies are useful for evaluating the relationship between health-related outcomes and exposures, or when randomized controlled trials (RCTs) are not feasible or ethical to conduct [1, 2]. However, there is often little focus on the potential pitfalls of suboptimal patient sampling methods employed in observational cohorts. Selection bias is also more likely to occur in cohorts than in RCTs and may affect the validity and generalizability of the study findings [3-5].

Ideal patient sampling methods would produce a sample that is representative of the target population with respect to patient demographics and disease-related variables. Although desirable, completely random sampling is not always feasible as recruitment is often costly and inefficient. As such, alternative sampling methods are chosen to suit the given situation, including convenience sampling, quota sampling or homogeneous sampling [6, 7]. For example, observational studies in emergency departments tend to use convenience sampling, where patients presenting during “business hours” are selected because more staff are available to process recruitment data [8].

In HIV research, observational cohorts have been a key epidemiological resource with the ability to assess the natural history of HIV, antiretroviral treatment (ART) use and clinical outcomes within regions or target populations [9-12]. Early cohort studies of HIV-positive homosexual men were pivotal in identifying several HIV-related biomarkers that are still used for assessing disease progression [13]. However, HIV-positive patients require lifelong treatment, so loss to follow-up (LTFU) is a prominent concern [14]. LTFU is a major source of bias in cohort studies and, if large, can substantially affect the validity of the results [3].

Most HIV observational studies recruit either all patients seen at a clinic or a pseudo-random subset of patients. HIV observational studies in the Asia-Pacific region utilize pseudo-random sampling for patient recruitment. In 2003, the TREAT Asia HIV Observational database (TAHOD) began collecting data on HIV-positive patients presenting at clinical sites across the Asia-Pacific region. In order to minimize costs and LTFU rates but retain heterogeneity across a very diverse region, a limited number of patients with good retention in care were consecutively recruited from a number of sites [15]. Although convenient, this pseudo-random selection method can introduce another source of bias as LTFU does not occur completely at random [16]. In 2014, the TREAT Asia HIV Observational database-Low Intensity Transfer (TAHOD-LITE), a sub-study of TAHOD, was initiated, in which data were collected on all patients seen at certain clinical sites from a nominated calendar year.

These two cohorts presented an opportunity to evaluate whether pseudo-random patient sampling methods produce a representative sample and reach similar study findings to sampling of entire programs. The study objective was to compare patient demographics, pre-ART HIV biomarkers and response to ART between TAHOD and TAHOD-LITE to determine whether the TAHOD sample suitably represents all of the patients seen in TAHOD-LITE.

Methods

Data collection and Participants

The TREAT Asia HIV Observational database (TAHOD) is a collection of 20 HIV treatment centres across the Asia-Pacific region including China (1 site), Hong Kong (1 site), Taiwan (1 site), South Korea (1 site), India (2 sites), Indonesia (2 sites), Malaysia (2 sites), Philippines (1 site), Singapore (1 site), Thailand (4 sites), Japan (1 site), Cambodia (1 site) and Vietnam (2 sites). Detailed data were collected for a subset of patients who attend care at the sites. Patients were not recruited entirely at random; instead, each site consecutively selected patients who were likely to be retained in follow-up, and patients both receiving and not receiving ART were eligible. Patients were prospectively recruited from September 2003; however, retrospective data on enrolled patients were also retrieved. To date, TAHOD has recruited over 8 000 patients, with over 5 000 in active follow-up to March 2015. TAHOD protocols and methods have been described in more detail elsewhere [15].

The TREAT Asia HIV Observational database Low Intensity Transfer (TAHOD-LITE) is a sub-study of TAHOD and currently involves only 8 of the 20 TAHOD sites, from Cambodia (1 site), Hong Kong (1 site), India (1 site), Indonesia (1 site), Singapore (1 site), South Korea (1 site) and Vietnam (2 sites). In contrast to TAHOD, TAHOD-LITE included data from all patients seen at a site from a nominated calendar time point. Hence, TAHOD-LITE comprises both the patients previously recruited to TAHOD and all other patients who were not recruited to TAHOD. However, patient data were limited to a few variables. To date, TAHOD-LITE includes data on over 30 000 HIV-positive adult patients, with follow-up to May 2014.

Ethical approvals were obtained for TAHOD and TAHOD-LITE from Institutional Review Boards (IRB) at each site, the University of New South Wales, and the coordinating center at TREAT Asia/amfAR. Written informed consent for data collection was obtained only if required by the site-specific IRBs.

This analysis included all patients from TAHOD-LITE aged over 18 years who had initiated an ART regimen consisting of three or more drugs between 01 January 2003 and 31 December 2013 and who had at least one subsequent visit after the date of ART initiation. The TAHOD cohort was represented by the patients within TAHOD-LITE who were previously recruited into TAHOD. All remaining eligible patients were used to represent the TAHOD-LITE cohort.

The primary endpoints included CD4 cell count, HIV viral load and overall mortality from ART initiation. The secondary endpoints included patient demographics, CD4 cell count and HIV viral load prior to ART initiation and, CD4 and HIV viral load testing rates.

Statistical analyses

TAHOD and TAHOD-LITE were compared based on the findings of the patient response to treatment, measured as CD4 cell count change, HIV viral load suppression and survival from ART initiation. Analogous to an intention-to-treat approach, we only considered the first ART regimen initiated, with all modifications to treatment after ART initiation being ignored. Risk factors were selected a priori and included year of ART initiation, age at ART initiation, mode of HIV exposure, pre-ART HIV viral load, pre-ART CD4 cell count, first ART regimen, previous mono/dual therapy exposure, ever hepatitis B co-infection (HBV) and ever hepatitis C co-infection (HCV).

The CD4 cell count response to treatment was examined as the change in CD4 cell count, defined as the difference between the pre-ART CD4 cell count and the CD4 cell count closest to and within 90 days of the given time point from ART initiation. We examined the median change in CD4 cell count, with interquartile range (IQR), every 6 months up to 24 months from ART initiation, by cohort. We also compared the risk factors associated with the change in CD4 cell count at 12 months from ART initiation for each cohort using a linear regression model, adjusted by clinical site.
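The windowing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Stata code: the function name `cd4_change`, the approximation of a month as 30.4375 days, and the treatment of the 90-day boundary as inclusive are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional, Sequence, Tuple

WINDOW_DAYS = 90  # window width stated in the text


def cd4_change(
    art_start: date,
    pre_art_cd4: float,
    measurements: Sequence[Tuple[date, float]],
    months: int,
) -> Optional[float]:
    """Return the CD4 change at `months` from ART start, or None if no
    measurement falls inside the window around the target date."""
    # Assumption: one month is approximated as 30.4375 days.
    target = art_start + timedelta(days=round(months * 30.4375))
    # Keep (distance in days, value) for measurements inside the window
    # (assumption: the 90-day boundary is inclusive).
    in_window = [
        (abs((measured_on - target).days), value)
        for measured_on, value in measurements
        if abs((measured_on - target).days) <= WINDOW_DAYS
    ]
    if not in_window:
        return None
    # Pick the measurement closest to the target date.
    _, closest_value = min(in_window, key=lambda pair: pair[0])
    return closest_value - pre_art_cd4


# Hypothetical patient: pre-ART CD4 of 150 cells/uL, ART started 1 Jan 2010.
example = cd4_change(
    date(2010, 1, 1),
    150,
    [(date(2010, 12, 15), 320), (date(2011, 3, 1), 400), (date(2011, 6, 1), 500)],
    months=12,
)
print(example)  # 170
```

A patient with no measurement inside the window returns `None`, which mirrors the way such patients drop out of the analysis at that time point.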

The virological response to treatment was examined as HIV viral load suppression, defined as an HIV viral load <400 copies/mL at the measurement closest to, and within 90 days of, the given time point from ART initiation. We examined the proportion achieving HIV viral load suppression, with 95% binomial confidence interval (95% CI), every 6 months up to 24 months from ART initiation, by cohort. The risk factors associated with HIV viral load suppression at 12 months from ART initiation were examined for each cohort using a logistic regression model, stratified by clinical site.
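As an illustration of a 95% confidence interval for a proportion, the sketch below uses the Wilson score interval. The text does not state which binomial interval was used, so the Wilson method is an assumption chosen here because it needs only the standard library. The counts in the example are hypothetical.

```python
import math
from typing import Tuple


def wilson_ci(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default: 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 80 of 100 tested patients have a viral load <400 copies/mL.
lower, upper = wilson_ci(80, 100)
print(f"{80 / 100:.2f} ({lower:.3f}, {upper:.3f})")
```

An exact (Clopper-Pearson) interval would be slightly wider. With the cohort sizes reported here the two methods would be expected to give very similar intervals.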

Kaplan-Meier curves were used to compare survival estimates between the cohorts. A log rank test was used to determine whether survival was significantly different between the cohorts. Patient follow-up began at the start of ART and ended at the date of death or the most recent clinic visit, whichever came first. As TAHOD prospectively recruited patients, TAHOD patients were left censored at the date of study recruitment. A Cox proportional hazards model was used to evaluate the risk factors associated with mortality, within each cohort and stratified by clinical site.
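The Kaplan-Meier estimate can be sketched as below. The sketch assumes that the handling of TAHOD patients at recruitment amounts to delayed entry, meaning a patient only counts as at risk from the recruitment date onward. That reading, the function name and the four example records are assumptions for illustration and are not taken from the study data.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    records: Sequence[Tuple[float, float, bool]],
) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate allowing delayed entry.

    Each record is (entry_time, exit_time, died), with times in years
    since ART initiation. Returns (death_time, survival_probability)
    pairs at each distinct death time.
    """
    death_times = sorted({exit_t for _, exit_t, died in records if died})
    survival = 1.0
    curve = []
    for t in death_times:
        # At risk at t: already under observation and not yet exited.
        at_risk = sum(1 for entry, exit_t, _ in records if entry < t <= exit_t)
        deaths = sum(
            1 for entry, exit_t, died in records
            if died and exit_t == t and entry < t
        )
        if at_risk > 0:
            survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical records; the third patient enters observation at 1.5 years.
records = [(0.0, 1.0, True), (0.0, 2.0, False), (1.5, 3.0, True), (0.0, 4.0, False)]
for time, prob in kaplan_meier(records):
    print(f"t={time:.1f} S(t)={prob:.3f}")
```

In this example the third patient is not counted in the risk set at year 1, so the first step of the curve is 2/3 rather than 3/4.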

To assess whether risk factors differed in TAHOD and TAHOD-LITE, interaction terms between the cohort and respective risk factor were also examined in all of the models, where a Wald test was used to determine if the interaction term was significant. Descriptive statistics were used to summarize the patient demographics, by cohort. The rate of CD4 and HIV viral load testing from ART initiation was calculated for each cohort. Rates were presented as per person-year of follow up (pys), with their 95% CI.
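A testing rate per person-year and its 95% CI can be computed as sketched below. The text does not say how the intervals were derived, so the Poisson approximation on the log scale is an assumption, and the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def rate_per_person_year(
    events: int, person_years: float, z: float = 1.959964
) -> Tuple[float, float, float]:
    """Return (rate, lower, upper) for an event rate per person-year.

    Assumption: the number of events is Poisson distributed, so the
    standard error of log(rate) is 1 / sqrt(events).
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years
    se_log = 1 / math.sqrt(events)
    return rate, rate * math.exp(-z * se_log), rate * math.exp(z * se_log)


# Hypothetical example: 1 770 CD4 tests over 1 000 person-years of follow-up.
rate, lower, upper = rate_per_person_year(1770, 1000)
print(f"{rate:.2f} tests per pys (95% CI: {lower:.2f}, {upper:.2f})")
```

The interval narrows as the number of events grows, which is why cohort-wide rates based on many thousands of tests have very tight intervals.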

Data were analysed using Stata version 12 (Stata Corporation, College Station, Texas, USA).

Results

A total of 18 441 patients from TAHOD-LITE had initiated an ART regimen between 01 January 2003 and 31 December 2013 and were aged over 18 years at ART initiation. Subsequently, 1 409 patients were excluded as they did not have a further clinic visit after ART initiation. The remaining 17 032 patients were included in the analysis, where 2 318 patients were previously recruited into TAHOD and the remaining 14 714 patients were in TAHOD-LITE only.
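The patient flow above can be verified with simple arithmetic. The variable names are illustrative only.

```python
# Counts as reported in the text.
initiated_art = 18_441       # initiated ART 2003-2013, aged over 18 years
no_follow_up_visit = 1_409   # excluded: no clinic visit after ART initiation
tahod = 2_318                # previously recruited into TAHOD

included = initiated_art - no_follow_up_visit
tahod_lite_only = included - tahod

print(included)         # 17032
print(tahod_lite_only)  # 14714
```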

Patient demographics comparison

Overall, the patient demographics were relatively similar between the cohorts (Table 1). Further country comparison also showed broad similarities in the patient demographics between the cohorts (Appendix 1). The distribution for mode of HIV exposure was somewhat different between the cohorts, where 82% in TAHOD-LITE and 68% in TAHOD indicated heterosexual exposure. This mainly arose from heterogeneity between the sites rather than between cohorts as the proportion reporting heterosexual contact varied from as low as 19% to as high as 99%. Only two sites had substantial differences in the proportion reporting heterosexual contact as mode of HIV exposure (Indonesia: 84% TAHOD-LITE vs 53% TAHOD; Singapore: 49% TAHOD-LITE vs 69% TAHOD).

Table 1. Demographics of the patients by cohort.

Columns: TAHOD-LITE (n=14714): n (%) (%)1; TAHOD (n=2318): n (%) (%)1
Year of ART initiation
2003-05 2367 (16) 577 (25)
2006-09 5500 (37) 748 (32)
2010-13 6847 (47) 993 (43)
Age
≤30 3692 (25) 646 (28)
31-40 6639 (45) 1033 (45)
41-50 2922 (20) 439 (19)
51 + 1461 (10) 200 (9)
Median [IQR] 35 [30, 42] 35 [30, 41]
Sex
Male 10081 (69) 1610 (69)
Female 4619 (31) 706 (30)
Transgender 14 (<0.1) 2 (<0.1)
Mode of HIV exposure
Heterosexual 12026 (82) 1576 (68)
Homosexual 1031 (7) 275 (12)
Injecting drug user 527 (4) 255 (11)
Other/Unknown 1130 (8) 212 (9)
Hepatitis C Co-infection2
Positive 717 (5) (11) 313 (14) (17)
Negative 6029 (41) (89) 1486 (64) (83)
Not tested 7968 (54) - 519 (22) -
Hepatitis B Co-infection3
Positive 694 (5) (9) 198 (9) (10)
Negative 6824 (46) (91) 1744 (75) (90)
Not tested 7196 (49) - 376 (16) -
Pre-ART CD4 (cells/μL)
≤50 2988 (20) (24) 659 (28) (32)
51-100 1997 (14) (16) 298 (13) (14)
101-200 3367 (23) (27) 494 (21) (24)
>200 4084 (28) (33) 624 (27) (30)
Not tested 2278 (15) - 243 (10) -
Median [IQR] 138 [53, 232] 116 [35, 216]
Pre-ART viral load (copies/mL)
≤10^5 1314 (9) (47) 414 (18) (51)
>10^5 1491 (10) (53) 402 (17) (49)
Not tested 11909 (81) - 1502 (65) -
Median [IQR] 115000 [31791, 384000] 94675 [25900, 322000]
First ART regimen
NRTI+NNRTI4 13988 (95) 2105 (91)
NRTI+PI5 632 (4) 188 (8)
Other6 94 (1) 25 (1)
Previous mono/dual therapy
No 14217 (97) 2200 (95)
Yes 497 (3) 118 (5)
1 Column percentages excluding frequencies for not tested.

2 Hepatitis C antibody result where positive indicates ever positive result.

3 Hepatitis B surface antigen result where positive indicates ever positive result.

4 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and nonnucleoside reverse transcriptase inhibitor (NNRTI).

5 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and protease inhibitor (PI).

6 Any regimen combination excluding NRTIs+NNRTI or NRTIs+PI.

The percentage of patients who were ever HBV or HCV positive was higher in TAHOD (TAHOD vs TAHOD-LITE: HBV positive: 9% vs 5%; HCV positive: 14% vs 5%); however, TAHOD-LITE had a greater percentage of patients who had never been tested (TAHOD vs TAHOD-LITE: HBV not tested: 16% vs 49%; HCV not tested: 22% vs 54%).

CD4 cell count response from ART initiation

A further 243 TAHOD patients and 2 279 TAHOD-LITE patients were excluded from the CD4 cell count response analysis as they did not have a pre-ART CD4 cell count to determine the CD4 change over time. Hence, a total of 2075 TAHOD patients and 12435 TAHOD-LITE patients were used to describe the CD4 change from ART initiation. Across all countries, the median CD4 cell count change from ART initiation was similar between the cohorts (Figure 1A).

Figure 1. The response to first-line ART, by cohort.

(A) Median CD4 cell count change (cells/μL), with interquartile range, from ART initiation across all countries, by cohort. (B) Proportion of patients with HIV viral load <400 copies/mL since ART initiation across all countries, by cohort, where the shaded region represents the 95% binomial proportion confidence interval. (C) Probability of survival, as from Kaplan-Meier estimates, from ART initiation across all countries, by cohort (τ = log rank test).

A total of 9 035 patients also had a CD4 cell count at 12 months from ART initiation and were included in the linear regression model (Table 2). Of the 9 035, 7 338 patients represented TAHOD-LITE and 1 697 patients represented TAHOD. Risk factors associated with CD4 cell count change that were similar between the cohorts included age at ART initiation and pre-ART HIV viral load. Other associated risk factors included year of ART initiation, sex, mode of HIV exposure, pre-ART CD4 cell count, first-line ART regimen and previous mono/dual therapy. TAHOD-LITE identified 4 more associated risk factors than TAHOD. The main effect of cohort on the CD4 cell count change was not significant (Table 3), nor were the interaction effects with any of the risk factors (Table 2).

Table 2. Risk factors associated with CD4 change at 12 months from ART initiation, stratified by cohort.

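Columns: TAHOD-LITE (n = 7 338): N (%)2 (%)3, Mean Diff., 95% CI, p value4; TAHOD (n = 1 697): N (%)2 (%)3, Mean Diff., 95% CI, p value4; Interaction p value1. On each risk factor heading row, the values are the global p value4 for TAHOD-LITE, the global p value4 for TAHOD, and the interaction p value1; a single value is the interaction p value1.
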
Year of ART Initiation 0.475 0.074 0.534
2003-2005 948 (13) 0 415 (24) 0
2006-2009 2938 (40) −12 (−25, 2) 0.086 619 (36) −3 (−22, 16) 0.765
2010-2013 3452 (47) −1 (−15, 13) 0.898 663 (39) 24 (−0, 47) 0.050

Age at ART initiation (years) <0.001 0.038 0.585
≤30 1707 (23) 0 436 (26) 0
31-40 3213 (44) −10 (−20, 0) 0.058 748 (44) −7 (−25, 10) 0.408
41-50 1548 (21) −23 (−35, −10) <0.001 350 (21) −29 (−51, −8) 0.007
51+ 870 (12) −21 (−36, −6) 0.006 163 (10) −14 (−41, 14) 0.333

Sex 0.195
Male 4959 (68) 0 1179 (69) 0
Female 2379 (32) 20 (10, 29) <0.001 518 (31) 6 (−11, 23) 0.514

Mode of HIV Exposure 0.023 0.783 0.850
Heterosexual contact 5787 (79) 0 1184 (70) 0
Homosexual contact 698 (10) 20 (3, 37) 0.020 234 (14) 11 (−16, 37) 0.440
Injecting drug use 266 (4) −24 (−51, 4) 0.090 123 (7) −6 (−42, 30) 0.743
Other/unknown 587 (8) −2 (−17, 13) 0.768 156 (9) 11 (−17, 39) 0.447

Pre-ART viral load (copies/mL) <0.001 <0.001 0.962
≤100000 850 (12) (46) 0 356 (21) (51) 0
>100000 1011 (14) (54) 35 (19, 51) <0.001 348 (21) (49) 56 (33, 79) <0.001
Not tested 5477 (75) - 18 (3, 34) 0.017 993 (59) - 31 (3, 59) 0.030

Pre-ART CD4 (cells/μL) <0.001 0.537 0.254
≤50 1790 (24) 0 524 (31) 0
51-100 1076 (15) 10 (−3, 23) 0.140 231 (14) 3 (−20, 25) 0.829
101-200 1943 (26) 3 (−9, 14) 0.622 410 (24) 6 (−14, 26) 0.564
201+ 2529 (34) −17 (−29, −6) 0.003 532 (31) −9 (−29, 12) 0.403

First ART regimen 0.078 0.710 0.468
NRTI+NNRTI5 6934 (94) 0 1510 (89) 0
NRTI+PI6 351 (5) 1 (−19, 20) 0.960 167 (10) 12 (−18, 42) 0.440
Other7 53 (1) 55 (7, 104) 0.025 20 (1) −7 (−74, 61) 0.844

Previous mono/duo exposure 0.654
No 7099 (97) 0 1618 (95) 0
Yes 239 (3) −44 (−66, −21) <0.001 79 (5) −29 (−64, 7) 0.118

Hepatitis B co-infection 8 0.363 0.229 0.796
Negative 4328 (59) (91) 0 1336 (79) (89) 0
Positive 443 (6) (9) −9 (−26, 8) 0.282 158 (9) (11) −3 (−27, 21) 0.821
Not tested 2567 (35) - −8 (−24, 8) 0.312 203 (12) - −30 (−64, 4) 0.087

Hepatitis C co-infection 9 0.256 0.108 0.754
Negative 3924 (53) (90) 0 1192 (70) (86) 0
Positive 431 (6) (10) −15 (−37, 6) 0.162 194 (11) (14) −24 (−53, 5) 0.103
Not tested 2983 (41) - 7 (−11, 25) 0.436 311 (18) - 19 (−12, 49) 0.228
1 Wald test for the interaction term between cohort (i.e. TAHOD or TAHOD-LITE) and the respective risk factor.

2 Column percentages for the respective risk factor, by cohort.

3 Column percentages excluding frequencies for not tested in the respective risk factor, by cohort.

4 Wald test for each level and global of the respective risk factor. Global p values for year of ART initiation, age and pre-ART CD4 count are test for trend while all other global p values are test for heterogeneity.

5 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and nonnucleoside reverse transcriptase inhibitor (NNRTI).

6 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and protease inhibitor (PI).

7 Any regimen combination excluding NRTIs+NNRTI or NRTIs+PI.

8 Hepatitis B surface antigen result where positive indicates ever positive result.

9 Hepatitis C antibody result where positive indicates ever positive result.

Table 3. Summary of the main effect of cohort on the treatment response.

(A) CD4 cell count change at 12 months from ART initiation. (B) HIV viral load <400 copies/mL at 12 months from ART initiation. (C) All-cause mortality from ART initiation.

Columns: estimate, 95% CI and p value2, given first unadjusted and then adjusted1.

(A) CD4 response (Mean Diff.)
 TAHOD-LITE 0 0
 TAHOD −6 (−15, 4) 0.235 −4 (−14, 5) 0.359

(B) Viral suppression (OR)
 TAHOD-LITE 1.00 1.00
 TAHOD 2.36 (1.83, 3.03) <0.001 3.15 (2.38, 4.16) <0.001

(C) Mortality (HR)
 TAHOD-LITE 1.00 1.00
 TAHOD 0.90 (0.72, 1.13) 0.366 0.93 (0.74, 1.17) 0.556
1 Adjusted for year of ART initiation, age at ART initiation, sex, mode of HIV exposure, pre-ART HIV viral load, pre-ART CD4 cell count, first-line ART regimen, previous mono/duo therapy exposure, hepatitis B co-infection (ever) and hepatitis C co-infection (ever).

2 Wald test.

HIV viral load response from ART initiation

Over 80% of TAHOD-LITE and 60% of TAHOD patients did not have regular HIV viral load testing and were excluded from the HIV viral load response analysis. Across all countries, the proportion of patients with a HIV viral load <400 copies/mL from ART initiation followed a similar trend between the cohorts (Figure 1B). However, TAHOD patients did have a slightly higher proportion up to 24 months from ART initiation.

A total of 3 448 patients were included in the logistic regression model, where 2 574 patients represented TAHOD-LITE and 874 patients represented TAHOD (Table 4). Associated risk factors that were similar between the cohorts included year of ART initiation, while other associated risk factors included age at ART initiation, sex, first-line ART regimen, previous mono/dual therapy, and HBV and HCV co-infection. The main effect of cohort was significant, where TAHOD patients had 3.15 times (95% CI: 2.38, 4.16) higher adjusted odds of achieving an HIV viral load <400 copies/mL (Table 3). Interaction effects were also significant for many of the risk factors (Table 4).
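As a worked check on the reported cohort effect, the Wald statistic can be approximately recovered from the published odds ratio and its confidence interval. The sketch below assumes the interval is symmetric on the log scale, which is the usual construction for logistic regression output. The function name is hypothetical, and the result is only as precise as the rounded published figures.

```python
import math
from typing import Tuple


def wald_from_ci(
    estimate: float, lower: float, upper: float, z_crit: float = 1.959964
) -> Tuple[float, float, float]:
    """Recover (standard error, z statistic, two-sided p value) on the log
    scale from a ratio estimate and its 95% confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Adjusted odds ratio for TAHOD vs TAHOD-LITE from Table 3.
se, z, p = wald_from_ci(3.15, 2.38, 4.16)
print(f"SE(log OR)={se:.3f}, z={z:.2f}, p<0.001: {p < 0.001}")
```

The recovered z statistic is roughly 8, which is consistent with the reported p value of <0.001.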

Table 4. Risk factors associated with HIV viral load < 400 copies/mL at 12 months from ART initiation, stratified by cohort.

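Columns: TAHOD-LITE (n = 2 574): N, % with VL<400 copies/mL2, OR, 95% CI, p value3; TAHOD (n = 874): N, % with VL<400 copies/mL2, OR, 95% CI, p value3; Interaction p value1. On each risk factor heading row, the values are the global p value3 for TAHOD-LITE, the global p value3 for TAHOD, and the interaction p value1; a single value is the interaction p value1.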
Year of ART Initiation <0.001 <0.001 <0.001
2003-2005 187 (4) 1.00 143 (14) 1.00
2006-2009 628 (18) 3.05 (1.82, 5.09) <0.001 349 (43) 4.07 (2.13, 7.79) <0.001
2010-2013 1759 (78) 20.14 (11.96, 33.92) <0.001 382 (44) 6.37 (3.05, 13.30) <0.001

Age at ART initiation (years) 0.920 0.031 0.028
≤30 571 (22) 1.00 193 (20) 1.00
31-40 1023 (38) 1.16 (0.83, 1.63) 0.379 367 (43) 2.51 (1.40, 4.48) 0.002
41-50 569 (23) 1.20 (0.80, 1.79) 0.380 212 (25) 1.85 (0.96, 3.59) 0.067
51+ 411 (17) 0.98 (0.62, 1.56) 0.935 102 (12) 2.83 (1.13, 7.11) 0.027

Sex 0.686
Male 1995 (79) 1.00 694 (80) 1.00
Female 579 (21) 1.42 (1.03, 1.95) 0.031 180 (20) 1.05 (0.56, 1.94) 0.883

Mode of HIV Exposure 0.303 0.889 <0.001
Heterosexual contact 1550 (54) 1.00 470 (54) 1.00
Homosexual contact 581 (27) 1.53 (0.96, 2.45) 0.072 223 (27) 1.04 (0.51, 2.11) 0.921
Injecting drug use 135 (6) 1.15 (0.53, 2.52) 0.719 54 (4) 0.65 (0.21, 2.04) 0.458
Other/unknown 308 (13) 1.25 (0.82, 1.91) 0.306 127 (15) 1.06 (0.49, 2.33) 0.879

Pre-ART HIV viral load (copies/mL) 0.596 0.638 <0.001
≤100000 712 (28) 1.00 335 (41) 1.00
>100000 782 (30) 1.25 (0.82, 1.91) 0.306 312 (37) 0.79 (0.42, 1.47) 0.458
Not Tested 1080 (43) 0.85 (0.59, 1.22) 0.372 227 (21) 0.70 (0.31, 1.58) 0.391

Pre-ART CD4 cell count (cells/μL) 0.882 0.688 <0.001
≤50 583 (25) 1.00 209 (22) 1.00
51-100 282 (11) 1.03 (0.64, 1.66) 0.892 101 (11) 1.26 (0.60, 2.67) 0.544
101-200 437 (16) 1.10 (0.72, 1.69) 0.664 187 (22) 1.67 (0.84, 3.33) 0.141
201+ 1102 (41) 1.11 (0.75, 1.65) 0.591 339 (41) 1.34 (0.69, 2.60) 0.383
Not Tested 169 (6) 0.85 (0.49, 1.48) 0.576 38 (4) 1.42 (0.45, 4.44) 0.548

First-line regimen 0.021 0.667 0.583
NRTIs+NNRTI4 2252 (88) 1.00 692 (78) 1.00
NRTIs+PI5 276 (10) 0.52 (0.32, 0.83) 0.006 167 (21) 1.38 (0.59, 3.23) 0.462
Other6 46 (2) 0.58 (0.16, 2.04) 0.392 15 (2) 1.77 (0.27, 11.65) 0.551

Previous mono/duo exposure <0.001
No 2513 (98) 1.00 834 (97) 1.00
Yes 61 (2) 1.82 (0.79, 4.22) 0.162 40 (3) 0.22 (0.09, 0.55) 0.001

Hepatitis B co-infection 7 <0.001 0.922 0.050
Negative 1996 (81) 1.00 706 (82) 1.00
Positive 186 (8) 1.10 (0.64, 1.88) 0.724 63 (7) 0.88 (0.39, 2.00) 0.761
Not Tested 392 (11) 3.35 (2.17, 5.19) <0.001 105 (11) 0.87 (0.32, 2.37) 0.783

Hepatitis C co-infection 8 <0.001 0.978 <0.001
Negative 1755 (78) 1.00 590 (70) 1.00
Positive 209 (9) 0.73 (0.39, 1.38) 0.335 73 (6) 1.09 (0.36, 3.33) 0.876
Not Tested 610 (13) 0.22 (0.15, 0.34) <0.001 211 (24) 0.95 (0.38, 2.38) 0.907
1 Wald test for the interaction term between cohort (i.e. TAHOD or TAHOD-LITE) and the respective risk factor.

2 Column percentages of the patients with HIV viral load <400 copies/mL for the respective risk factor, by cohort.

3 Wald test for each level and global of the respective risk factor. Global p values for year of ART initiation, age and pre-ART CD4 count are test for trend while all other global p values are test for heterogeneity.

4 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and nonnucleoside reverse transcriptase inhibitor (NNRTI).

5 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and protease inhibitor (PI).

6 Any regimen combination excluding NRTIs+NNRTI or NRTIs+PI.

7 Hepatitis B surface antigen result where positive indicates ever positive result.

8 Hepatitis C antibody result where positive indicates ever positive result.

Mortality from ART initiation

Overall, there were 903 deaths, with 787 deaths occurring in TAHOD-LITE and 116 deaths occurring in TAHOD. Kaplan-Meier survival estimates from ART initiation indicated that the probability of survival was not significantly different between the cohorts (p-value=0.797), up to 4 years of follow-up (Figure 1C).

All eligible patients were included in the Cox proportional hazards regression model (Table 5). Risk factors associated with mortality in both cohorts included year of ART initiation, age at ART initiation and pre-ART CD4 cell count. Six other risk factors were associated with mortality in TAHOD-LITE only. The main effect of cohort on mortality was not significant (Table 3). Interaction effects were only significant for year of ART initiation and HBV co-infection (Table 5).

Table 5. Risk factors associated with mortality from ART initiation, stratified by cohort.

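Columns: TAHOD-LITE (n = 14 713): Total Patients, % of deaths2, HR, 95% CI, p value3; TAHOD (n = 2 284): Total Patients, % of deaths2, HR, 95% CI, p value3; Interaction p value1. On each risk factor heading row, the values are the global p value3 for TAHOD-LITE, the global p value3 for TAHOD, and the interaction p value1; a single value is the interaction p value1.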
Year of ART Initiation 0.044 0.048 0.015
2003-2005 2367 (27) 1.00 562 (46) 1.00
2006-2009 5500 (41) 0.83 (0.69, 1.01) 0.067 743 (27) 0.72 (0.41, 1.26) 0.246
2010-2013 6846 (32) 0.79 (0.63, 0.99) 0.044 979 (27) 0.45 (0.20, 1.01) 0.054

Age at ART initiation (years) <0.001 0.003 0.877
≤30 3692 (18) 1.00 638 (26) 1.00
31-40 6638 (41) 1.08 (0.88, 1.32) 0.458 1016 (38) 1.11 (0.68, 1.79) 0.680
41-50 2922 (21) 1.23 (0.98, 1.56) 0.077 430 (19) 1.17 (0.65, 2.11) 0.604
51+ 1461 (20) 2.23 (1.74, 2.86) <0.001 200 (17) 3.35 (1.76, 6.37) <0.001

Sex 0.461
Male 10094 (80) 1.00 1584 (80) 1.00
Female 4619 (20) 0.72 (0.60, 0.87) 0.001 700 (20) 0.82 (0.49, 1.35) 0.431

Mode of HIV Exposure <0.001 0.517 0.463
Heterosexual contact 12025 (80) 1.00 1552 (59) 1.00
Homosexual contact 1031 (4) 0.64 (0.44, 0.94) 0.025 271 (11) 1.40 (0.69, 2.84) 0.356
Injecting drug use 527 (8) 1.91 (1.29, 2.85) 0.001 250 (21) 1.50 (0.75, 3.02) 0.253
Other/unknown 1130 (8) 1.04 (0.80, 1.37) 0.760 211 (9) 1.42 (0.72, 2.82) 0.310

Pre-ART HIV viral load (copies/mL) <0.001 0.614 0.910
≤100000 1314 (6) 1.00 410 (12) 1.00
>100000 1491 (12) 1.18 (0.82, 1.70) 0.366 398 (18) 0.84 (0.41, 1.72) 0.628
Not tested 11908 (82) 1.78 (1.26, 2.52) <0.001 1476 (70) 1.16 (0.52, 2.59) 0.717

Pre-ART CD4 cell count (cells/μL) <0.001 <0.001 0.394
≤50 2987 (35) 1.00 653 (49) 1.00
51-100 1997 (22) 0.90 (0.74, 1.10) 0.309 291 (15) 0.61 (0.35, 1.06) 0.079
101-200 3367 (19) 0.48 (0.39, 0.59) <0.001 490 (21) 0.58 (0.35, 0.94) 0.028
201+ 4083 (10) 0.26 (0.20, 0.33) <0.001 618 (7) 0.17 (0.08, 0.38) <0.001
Not tested 2278 (14) 0.53 (0.42, 0.67) <0.001 232 (8) 0.40 (0.19, 0.85) 0.017

First-line regimen 0.048 0.996 0.560
NRTIs+NNRTI4 13987 (94) 1.00 2072 (95) 1.00
NRTIs+PI5 632 (6) 1.52 (1.08, 2.13) 0.015 187 (5) 1.04 (0.39, 2.77) 0.932
Other6 94 (<1) 1.34 (0.50, 3.64) 0.563 25 (-) 0.00 1.000

Previous mono/duo exposure 0.202
No 14216 (95) 1.00 2167 (86) 1.00
Yes 497 (5) 1.16 (0.84, 1.61) 0.357 117 (14) 1.72 (0.90, 3.30) 0.102

Hepatitis B co-infection 7 0.079 0.339 0.018
Negative 6823 (47) 1.00 1730 (74) 1.00
Positive 694 (7) 1.37 (1.03, 1.82) 0.029 195 (7) 0.60 (0.29, 1.24) 0.168
Not tested 7196 (46) 0.96 (0.72, 1.28) 0.774 359 (19) 0.78 (0.36, 1.69) 0.531

Hepatitis C co-infection 8 0.652 0.888 0.068
Negative 6028 (40) 1.00 1475 (59) 1.00
Positive 717 (9) 1.17 (0.80, 1.70) 0.414 310 (18) 0.92 (0.47, 1.80) 0.807
Not tested 7968 (51) 1.09 (0.78, 1.51) 0.620 499 (23) 1.15 (0.52, 2.56) 0.729
1 Wald test for the interaction term between cohort (i.e. TAHOD or TAHOD-LITE) and the respective risk factor.

2 Column percentages of the patients who have died for the respective risk factor, by cohort.

3 Wald test for each level and global of the respective risk factor. Global p values for year of ART initiation, age and pre-ART CD4 count are test for trend while all other global p values are test for heterogeneity.

4 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and nonnucleoside reverse transcriptase inhibitor (NNRTI).

5 Regimen combination consisting of nucleoside reverse transcriptase inhibitors (NRTIs) and protease inhibitor (PI).

6 Any regimen combination excluding NRTIs+NNRTI or NRTIs+PI.

7 Hepatitis B surface antigen result where positive indicates ever positive result.

8 Hepatitis C antibody result where positive indicates ever positive result.

CD4 and HIV viral load testing rates

Across all the countries, the CD4 testing rate was slightly higher in TAHOD-LITE than in TAHOD, while the HIV viral load testing rate was similar between the cohorts. The CD4 testing rate was 1.77 tests per pys (95% CI: 1.76, 1.78) in TAHOD-LITE and 1.34 tests per pys (95% CI: 1.32, 1.35) in TAHOD. The HIV viral load testing rate was 0.65 tests per pys (95% CI: 0.65, 0.66) in TAHOD-LITE and 0.65 tests per pys (95% CI: 0.64, 0.66) in TAHOD.

Discussion

Overall, 2 318 patients and 14 714 patients represented TAHOD and TAHOD-LITE, respectively. Detailed comparison of the patient demographics, by country and across all countries, showed broad similarity between the cohorts. The CD4 cell count response and all-cause mortality from ART initiation were consistent between the cohorts. Additionally, both cohorts identified similar risk factors associated with CD4 cell count change at 12 months from ART initiation and with overall survival. There was also little evidence to suggest that being in TAHOD or TAHOD-LITE interacted with any of the risk factors to produce differing CD4 cell count change or overall survival. However, the HIV viral load response from ART initiation was superior for TAHOD patients compared to TAHOD-LITE patients. The CD4 cell count and HIV viral load testing rates were also comparable between the cohorts.

Our findings did suggest that while CD4 and survival outcomes and risk factors were broadly similar between TAHOD and TAHOD-LITE, HIV viral load response differed between the cohorts, with interaction and main effects being significant. The virological suppression being superior in TAHOD patients could have arisen from differences in standard of care, where TAHOD patients may have been more engaged in care, perhaps attending the clinic more regularly or having better adherence [17, 18]. These data were not collected, so could not be thoroughly compared between the cohorts.

However, these findings may not reflect true differences in patient outcome, but rather may be a result of the lack of data. The HIV viral load response analysis relied on HIV viral load results that were missing for ~80% of TAHOD-LITE patients and ~60% of TAHOD patients. We believe the interpretation of our findings should be that analyses of endpoints that are not routinely collected in all patients are prone to yielding different results and conclusions. It is difficult to see how a statistical analysis technique could untangle the biases inherent in targeted endpoints, and we would recommend that analyses are only done for subsets of patients with routine testing performed.

There were limitations in our study. Although TAHOD-LITE collects data on a substantial number of patients, it is restricted to a small number of patient variables. This limited our ability to explore other important variables relating to our outcomes, including duration and stage of HIV infection. In addition, some variables that were not routinely collected had large proportions of missing data, such as HBV, HCV and HIV viral load, which could introduce another source of bias. In particular, viral load results have likely arisen from targeted testing, where data are not missing completely at random (MCAR). Data that are not MCAR can affect the validity of the analysis results, as the probability of data being missing depends partly on other unobserved factors [19]. Hence, findings relating to these variables should be interpreted with caution.
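
The effect of targeted testing on an observed suppression rate can be sketched with a small simulation. This is only an illustration under stated assumptions: the function `simulate_suppression`, the true suppression probability, and the testing probabilities are all hypothetical, and the direction of the bias shown here (suppression underestimated because patients who are not suppressed are tested more often) is an assumption of the sketch, not a finding of the study.

```python
import random


def simulate_suppression(n, p_suppressed, p_test_if_suppressed,
                         p_test_if_not, seed=0):
    """Return (true_rate, observed_rate) of viral suppression.

    true_rate is computed over all n simulated patients; observed_rate
    only over patients who happened to be tested. All probabilities
    are hypothetical inputs.
    """
    rng = random.Random(seed)
    total_suppressed = 0
    tested = 0
    tested_suppressed = 0
    for _ in range(n):
        suppressed = rng.random() < p_suppressed
        total_suppressed += suppressed
        # The chance of being tested may depend on the (unobserved) outcome.
        p_test = p_test_if_suppressed if suppressed else p_test_if_not
        if rng.random() < p_test:
            tested += 1
            tested_suppressed += suppressed
    return total_suppressed / n, tested_suppressed / tested


# MCAR: testing is unrelated to suppression status.
true_mcar, obs_mcar = simulate_suppression(100_000, 0.8, 0.3, 0.3)
# Targeted testing: non-suppressed patients are tested more often.
true_tgt, obs_tgt = simulate_suppression(100_000, 0.8, 0.2, 0.6)
print(f"MCAR:     true={true_mcar:.3f} observed={obs_mcar:.3f}")
print(f"Targeted: true={true_tgt:.3f} observed={obs_tgt:.3f}")
```

Under MCAR the tested subset tracks the full population closely, whereas under targeted testing the observed rate departs from the true rate even though the underlying patients are generated identically.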

Patient LTFU is expected in any study, but is especially a concern in observational studies. TAHOD-LITE did not have data from all the sites on whether patients had been transferred out to other clinics. Therefore, the overall percentage of patients LTFU could not be determined in TAHOD-LITE. However, for sites that did provide patient transfer data, the percentage of patients LTFU was quite low and comparable between the cohorts: 12% in TAHOD-LITE compared to 10% in TAHOD. Our study also had limited sites per country, with eight clinical sites used to represent seven countries. Hence, our findings cannot be generalized to the entire Asia-Pacific region or to any specific country, but rather reflect the clinical sites themselves. In addition, there was heterogeneity between the clinical sites and the results were heavily weighted by the Indian site. Nonetheless, we have accounted for the heterogeneity and unequal weighting in the multivariate models with stratification by clinical site.

Random sampling is the optimal sampling method for producing a representative sample of the study population [20]. However, in practice, it is not always feasible to utilize random sampling methods and alternative methods are often used instead [2]. We believe that our analysis is the first to evaluate whether an observational patient sample is representative of the true study population. Our study suggests that pseudo-randomly selected patient samples generally produce risk factor estimates that are similar to those obtained from the entire study population.

In summary, we found that our pseudo-random patient sample, TAHOD, is representative of our larger study population, TAHOD-LITE, and importantly produces comparable findings in an analysis of response to treatment for endpoints that are routinely ascertained. However, endpoints based on data that are not routinely collected, or that are not missing completely at random, do introduce bias that can significantly affect subsequent analyses, particularly those relating to viral load suppression. As such, our analyses should be limited to routinely collected data. Thus, our study gives the first empirical confirmation that analysis of risk factors for completely ascertained endpoints can be generalized from the pseudo-randomly selected patient sample to the larger, complete patient population in our HIV cohorts.

Supplementary Material

What is new?

Key findings

  • Our pseudo-random patient sample, TAHOD, is representative of our larger study population, TAHOD-LITE, and produced comparable findings for routinely ascertained endpoints relating to the response to antiretroviral treatment.

  • Endpoints that were not routinely collected, or for which data were not missing completely at random, such as viral load suppression, may introduce bias that can significantly affect subsequent analyses.

What this study adds to what was known?

  • Pseudo-random sampling of patients in observational cohorts is common, but can introduce bias. Our findings provide the first empirical evidence that a pseudo-random sample can produce results comparable to those seen in the larger, entire study population.

What is the implication and what should change now?

  • Pseudo-random sampling methods should not be dismissed where random sampling is impractical, as analyses relating to routinely collected data may still produce comparable results.

Acknowledgements

TAHOD-LITE (TREAT Asia HIV Observational Database Low-Intensity TransfEr) is an initiative of TREAT Asia, a program of amfAR, The Foundation for AIDS Research, with support from the U.S. National Institutes of Health’s National Institute of Allergy and Infectious Diseases, Eunice Kennedy Shriver National Institute of Child Health and Human Development, and National Cancer Institute, as part of the International Epidemiologic Databases to Evaluate AIDS (IeDEA; U01AI069907). TREAT Asia is also supported by ViiV Healthcare. The Kirby Institute is funded by the Australian Government Department of Health and Ageing, and is affiliated with the Faculty of Medicine, UNSW Australia (The University of New South Wales). The content of this publication is solely the responsibility of the authors and does not necessarily represent the official views of any of the governments or institutions mentioned above.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Competing Interests

The authors do not have any competing interests to declare.

Authors’ contributions

NLD, MYA and ML contributed to the concept development. KN, PSL, OTN, KVN, TPM, TTP, MPL and MYA contributed data for the analysis. NLD performed the statistical analysis and wrote the first draft of the manuscript. All authors commented on the draft manuscript and approved the final manuscript.

References

  • 1. Prentice RL. Design issues in cohort studies. Stat Methods Med Res. 1995 Dec;4(4):273–92. doi: 10.1177/096228029500400402.
  • 2. Soh SE, Saw SM. Cohort studies: design and pitfalls. Am J Ophthalmol. 2010 Jul;150(1):3–5. doi: 10.1016/j.ajo.2010.03.008.
  • 3. Howe CJ, Cole SR, Lau B, Napravnik S, Eron JJ Jr. Selection bias due to loss to follow up in cohort studies. Epidemiology. 2015 Oct 19; doi: 10.1097/EDE.0000000000000409.
  • 4. Pizzi C, De Stavola BL, Pearce N, Lazzarato F, Ghiotti P, Merletti F, et al. Selection bias and patterns of confounding in cohort studies: the case of the NINFEA web-based birth cohort. J Epidemiol Community Health. 2012 Nov;66(11):976–81. doi: 10.1136/jech-2011-200065.
  • 5. Schooling CM, Cowling BJ, Jones HE. Selection bias in cohorts of cases. Prev Med. 2013 Sep;57(3):247–8. doi: 10.1016/j.ypmed.2013.05.025.
  • 6. Bornstein MH, Jager J, Putnick DL. Sampling in Developmental Science: Situations, Shortcomings, Solutions, and Standards. Dev Rev. 2013 Dec;33(4):357–70. doi: 10.1016/j.dr.2013.08.003.
  • 7. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques for clinical research. Ann Indian Acad Neurol. 2011 Oct;14(4):287–90. doi: 10.4103/0972-2327.91951.
  • 8. Valley MA, Heard KJ, Ginde AA, Lezotte DC, Lowenstein SR. Observational studies of patients in the emergency department: a comparison of 4 sampling methods. Ann Emerg Med. 2012 Aug;60(2):139–45 e1. doi: 10.1016/j.annemergmed.2012.01.016.
  • 9. Hendel H, Cho YY, Gauthier N, Rappaport J, Schachter F, Zagury JF. Contribution of cohort studies in understanding HIV pathogenesis: introduction of the GRIV cohort and preliminary results. Biomed Pharmacother. 1996;50(10):480–7. doi: 10.1016/s0753-3322(97)89278-5.
  • 10. Pantazis N, Morrison C, Amornkul PN, Lewden C, Salata RA, Minga A, et al. Differences in HIV natural history among African and non-African seroconverters in Europe and seroconverters in sub-Saharan Africa. PLoS One. 2012;7(3):e32369. doi: 10.1371/journal.pone.0032369.
  • 11. Petoumenos K, Australian HIVOD. The role of observational data in monitoring trends in antiretroviral treatment and HIV disease stage: results from the Australian HIV observational database. J Clin Virol. 2003 Feb;26(2):209–22. doi: 10.1016/s1386-6532(02)00119-1.
  • 12. Samet JM, Munoz A. Evolution of the cohort study. Epidemiol Rev. 1998;20(1):1–14. doi: 10.1093/oxfordjournals.epirev.a017964.
  • 13. Polk BF, Fox R, Brookmeyer R, Kanchanaraksa S, Kaslow R, Visscher B, et al. Predictors of the acquired immunodeficiency syndrome developing in a cohort of seropositive homosexual men. N Engl J Med. 1987 Jan 8;316(2):61–6. doi: 10.1056/NEJM198701083160201.
  • 14. McManus H, Petoumenos K, Brown K, Baker D, Russell D, Read T, et al. Loss to follow-up in the Australian HIV Observational Database. Antivir Ther. 2014 Nov 7; doi: 10.3851/IMP2916.
  • 15. Zhou J, Kumarasamy N, Ditangco R, Kamarulzaman A, Lee CK, Li PC, et al. The TREAT Asia HIV Observational Database: baseline and retrospective data. J Acquir Immune Defic Syndr. 2005 Feb 1;38(2):174–9. doi: 10.1097/01.qai.0000145351.96815.d5.
  • 16. Berger VW. Conservative handling of missing information. J Clin Epidemiol. 2012 Nov;65(11):1237–8. doi: 10.1016/j.jclinepi.2012.04.017.
  • 17. El-Khatib Z, Ekstrom AM, Coovadia A, Abrams EJ, Petzold M, Katzenstein D, et al. Adherence and virologic suppression during the first 24 weeks on antiretroviral therapy among women in Johannesburg, South Africa - a prospective cohort study. BMC Public Health. 2011;11:88. doi: 10.1186/1471-2458-11-88.
  • 18. Shah B, Walshe L, Saple DG, Mehta SH, Ramnani JP, Kharkar RD, et al. Adherence to antiretroviral therapy and virologic suppression among HIV-infected persons receiving care in private clinics in Mumbai, India. Clin Infect Dis. 2007 May 1;44(9):1235–44. doi: 10.1086/513429.
  • 19. Norris CM, Ghali WA, Knudtson ML, Naylor CD, Saunders LD. Dealing with missing data in observational health care outcome analyses. J Clin Epidemiol. 2000 Apr;53(4):377–83. doi: 10.1016/s0895-4356(99)00181-x.
  • 20. Kalsbeek W, Heiss G. Building bridges between populations and samples in epidemiological studies. Annu Rev Public Health. 2000;21:147–69. doi: 10.1146/annurev.publhealth.21.1.147.
