Table 1.
Assumption | Criteria | Expected outcome
Representative data source should be a random sample of the target population | |
Check all RDS-II^a assumptions | |
Reciprocity (N/A^b) | Ask participants about their relationship to the person who gave them a study coupon; if they answer "stranger," reciprocity is not fulfilled. | Participants are more likely to be recruited by friends and acquaintances.
Sampling with replacement (N/A) | Always violated in real-life RDS^c studies unless the RDS successive sampling estimator is used. | -^d
Accurate report of personal network size (N/A) | Sensitivity analysis using different network size questions. | RDS estimates should agree with each other regardless of the network size question used.
Final sample independent of the original seeds | Assess whether seed dependence was removed using convergence plots. | Overall estimate of P converges to the final estimate of P and remains stable as additional participants are recruited.
Completely connected networked population at each site | Assess whether the FSW^e population is networked using bottleneck plots. | Estimates of P from individual seeds converge to a shared estimate.
Random recruitment | Assess whether there is an indication of nonrandom recruitment by measuring recruitment homophily. | Recruitment homophily should be approximately 1.
Two data sources combined are drawn from the same population, with the RDS data being representative of the target population | Using logistic regression, compare sociodemographic and other characteristics of RDS survey participants who report program attendance with records of program attenders for the same reference period. | No evidence of a difference between the characteristics of RDS survey participants who report program attendance within the reference period and those of program attenders in the program dataset during the reference period.
All members of the population being counted should have a chance of being included in both sources | Assess whether all RDS survey participants are familiar with the existence of the program by using chi-square tests to compare, across sites, the characteristics of individuals who had ever heard of the program with those of individuals who had not. | No evidence of a difference between individuals who had ever heard of the program and those who had not.
Data sources should have the same, clearly defined time references, age ranges, and geographic areas, and individuals should not be counted more than once in each data source. | Assess whether the time references, age ranges, and geographic areas of the RDS and program data are similar; deduplicate program data if participants visited the program several times during the reference period. | Report whether time references, age ranges, and geographic areas are similar. Deduplicated program data.
The 2 data sources should be independent of each other; that is, inclusion of individuals in 1 source should not be related to their inclusion in the other source. | Do not identify seeds, or participants in general, through the program; because seed participants might be more likely to be program attenders even if they are not selected on this basis, assess convergence of P over time for evidence of seed dependence using convergence plots. | Report how RDS participants were identified and recruited; overall estimate of P converges to the final estimate of P and remains stable as additional participants are recruited.
^a RDS-II: RDS Volz-Heckathorn estimator.
^b N/A: denotes the assumptions that could not be investigated with the data available in this study.
^c RDS: respondent-driven sampling.
^d Assumption always violated when RDS estimators other than the RDS successive sampling estimator are used.
^e FSW: female sex worker.
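
To illustrate the convergence criterion listed in Table 1, the following is a minimal Python sketch, not the authors' analysis code. It assumes that P is estimated with the RDS-II (Volz-Heckathorn) estimator, that is, a proportion weighted by the inverse of each participant's reported network size, and that participants are listed in order of recruitment. The function names (`rds_ii_estimate`, `convergence_series`, `max_tail_drift`) and the sample data are hypothetical; published analyses normally rely on dedicated RDS software.

```python
from typing import List, Sequence


def rds_ii_estimate(has_trait: Sequence[bool], degrees: Sequence[float]) -> float:
    """RDS-II (Volz-Heckathorn) proportion: inverse-degree weighted mean."""
    if len(has_trait) != len(degrees) or len(degrees) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(d <= 0 for d in degrees):
        raise ValueError("reported network sizes must be positive")
    weights = [1.0 / d for d in degrees]
    trait_weight = sum(w for w, t in zip(weights, has_trait) if t)
    return trait_weight / sum(weights)


def convergence_series(has_trait: Sequence[bool], degrees: Sequence[float]) -> List[float]:
    """Cumulative RDS-II estimate after each successive participant.

    Plotting this series against sample size gives a convergence plot.
    """
    return [
        rds_ii_estimate(has_trait[:n], degrees[:n])
        for n in range(1, len(has_trait) + 1)
    ]


def max_tail_drift(series: Sequence[float], tail: int) -> float:
    """Largest absolute gap between the final estimate and the last `tail` values.

    A small value suggests the estimate has stabilised (hypothetical summary,
    not a formal test).
    """
    final = series[-1]
    return max(abs(x - final) for x in series[-tail:])


if __name__ == "__main__":
    # Hypothetical data: trait indicator and reported network size per participant,
    # listed in order of recruitment.
    trait = [True, False, True, True, False, False]
    degree = [10, 5, 20, 10, 5, 10]
    series = convergence_series(trait, degree)
    print("final estimate of P:", series[-1])
    print("drift over last 3 participants:", max_tail_drift(series, 3))
```

The cumulative series is what a convergence plot displays. The same function applied separately to the recruitment chain of each seed would give the per-seed curves that a bottleneck plot compares.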