Abstract
Background
When an exclusionary criterion is an imperfect screen, some ineligible patients will remain in a study. Medical record review for outcome adjudication can reveal such individuals.
Objective
To ascertain the circumstances under which it is advisable to remove outcome cases first found to be ineligible on chart review.
Methods
The impact on the relative risk caused by removal of ineligible outcome cases was examined under different circumstances of confounding, prevalence, and efficacy of the screening criterion for exclusions. The result is illustrated by a hospital-based cohort study in which electronic medical record diagnosis served to exclude ineligible cases, and review of text notes for putative outcome cases revealed that the codes were only 95% sensitive. Other hypothetical scenarios provide further evidence.
Results
If a condition to be excluded is a confounder of the exposure–outcome relation, residual confounding will continue to bias a study after application of an imperfect screening criterion. Removal of ineligible outcome cases after chart review creates a new bias, distinct from residual confounding. The new bias does not depend on the magnitude of the confounder–outcome association, and will be small if the exclusion criterion has resulted in a low prevalence of the exclusionary condition. The new bias caused by removal of ineligible outcome cases is almost certain to be smaller than the confounding bias that can result if they are retained.
Conclusions
Outcome cases first discovered at chart review to be study-ineligible should be removed from the study, even when similar scrutiny is infeasible for non-cases.
Keywords: exclusions, eligibility, confounding, study design, adjudication, analysis
Background
In most epidemiologic studies, there are successive steps of data collection, over which the detail of information and the unit cost both rise. The initial implementation of exclusion and inclusion criteria is typically applied to the full population of potential study subjects during the least expensive phase of data acquisition. As more resources are invested, more detailed information becomes available on key subjects, raising the possibility that the exclusion criteria might be reapplied.
Expert review of a medical record for the verification of outcome events (case adjudication) is an example of a late-stage, expensive, and detailed phase of data collection. In some people, case adjudication will reveal the unexpected presence of an exclusionary condition. The purpose of this comment is to demonstrate that such cases should be removed, even if it is not feasible to apply the same level of scrutiny to the entire study group.
Motivating example
In a study of the possible carcinogenicity of intravenous antifungal drugs, Schneeweiss et al1 used hospital electronic medical records (EMRs) to identify drug recipients, whom they monitored after hospital discharge through the US National Death Index (NDI; ClinicalTrials.gov NCT01686607; ENCePP.eu number 2858). The outcome of interest was death from hepatocellular carcinoma (HCC).
One of the cohort eligibility criteria was the absence of a diagnosis of liver cancer or of any condition that might represent the first clinical presentation of HCC. To carry out the exclusion, Schneeweiss et al1 checked diagnosis codes in the EMR. Text notes from the EMR were not used at this stage because of the size of the population potentially eligible for the study. Of 40,485 patients who were otherwise eligible, 375 (0.9%) were excluded because of diagnosis codes in the EMR. The numbers quoted in the example are interim, grouped, and personally non-identifiable results from an ongoing study. They were provided by the principal investigator (SS) and are used with the permission of the study sponsor, Astellas Pharma Europe R&D.
The study drugs varied in their likelihood of being prescribed for the patients with HCC because the drugs differed in the degree of acute liver toxicity.2 The anticipated result was that baseline HCC would be more common in some drug groups than others.
After propensity-score matching of groups exposed to different drugs, 23,365 patients were ultimately included in the cohort. Per protocol, each NDI identification of an HCC death triggered a review of the EMR text notes to verify that the subject met all eligibility criteria. Part way through the study, 16 HCC deaths had been identified through the NDI. Eight of the HCC deaths proved on review to have had the possibility of liver cancer mentioned in the EMR text notes at the index hospitalization, even though they had none of the EMR diagnoses that had been used for the exclusion of prevalent cases. Per protocol, the eight persons with text mentions related to HCC were removed from the study.
Among the 375 patients who were excluded from the cohort because of an EMR diagnosis, there were 28 deaths from HCC. These were in addition to the 16 who had been identified among cohort members without an EMR diagnosis. The crude HCC mortality risk in those with an EMR diagnosis was, therefore, about 7.5% (28/375). In those without an EMR diagnosis, there were eight accepted deaths from HCC among 23,365 persons, for a crude risk of 0.034% (8/23,365). The crude relative risk (RR) of death from HCC in those with diagnosis codes related to HCC at the index hospitalization versus those without was therefore just over 200 (7.5/0.034). While an RR of 200 would be unusual for etiologic confounders, it is the kind of risk distortion that might readily arise with definitional exclusions, as was the case in this example.
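The arithmetic behind these crude figures can be laid out explicitly. The unrounded ratio shown last is computed here from the quoted counts and is not a figure reported in the source; it agrees with the statement that the RR is just over 200.

```latex
\begin{align*}
\text{Risk with EMR diagnosis} &= \frac{28}{375} \approx 0.0747 \;(\approx 7.5\%) \\
\text{Risk without EMR diagnosis} &= \frac{8}{23{,}365} \approx 0.000342 \;(\approx 0.034\%) \\
\text{Crude RR (rounded inputs)} &= \frac{7.5}{0.034} \approx 221 \\
\text{Crude RR (unrounded inputs)} &= \frac{28/375}{8/23{,}365} \approx 218
\end{align*}
```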
How big an effect?
Had Schneeweiss et al1 reviewed the EMR text of all the cohort members, they might have found further individuals with possible HCC at baseline. Absent a full-cohort, full-record review, the unidentified, ineligible persons were counted as part of the study population.
If ineligible subjects remain in the study cohort after screening, and if they are not removed from the outcome case series, the resulting error is residual confounding. Table 1 presents the algebra and Table 2 provides an example. In Table 1, each cell label that is presented without a formula is a value supplied by the study setting. The cells in Table 1 that contain formulas are functions of other cells in the table. The formulas take account of the sensitivity and specificity of the exclusion criterion in the two study groups, the RRs for test group versus comparator and for the confounder present versus absent, as well as the prevalence of the confounder in the test and comparator groups.
Table 1.

Performance of imperfect exclusion criterion in detecting the condition to be excluded

Group | Sensitivity | Specificity
---|---|---
Test | A | B
Comparator | C | D

Effects of test and comparator on occurrence of the outcome

Parameter | Value
---|---
Risk in the Comparator group | E
Relative Risk (Test v Comparator groups) | F
Relative Risk (condition to be excluded: present v absent) | G

Group | Number | Prevalence of the condition to be excluded | Cases | Crude risk | Cases found on chart review to have the condition to be excluded | Risk after removal of cases found on chart review to have the condition to be excluded
---|---|---|---|---|---|---
Before removal of individuals using the imperfect population exclusion criterion | | | | | |
Test | H | J | L=H*E*F*(1−J+G*J) | N=L/H | Q=H*J*E*F*G | S=(L−Q)/(H−Q)
Comparator | I | K | M=I*E*(1−K+G*K) | O=M/I | R=I*K*E*G | T=(M−R)/(I−R)
Crude RR | | | | P=N/O | RR after removal of cases among persons with the condition to be excluded | U=S/T
After removal of individuals using the imperfect population exclusion criterion | | | | | |
Test | V=H−H*[J*A+(1−J)*(1−B)] | X=H*J*(1−A)/V | Z=V*E*F*(1−X+G*X) | BB=Z/V | EE=V*X*E*F*G | GG=(Z−EE)/(V−EE)
Comparator | W=I−I*[K*C+(1−K)*(1−D)] | Y=I*K*(1−C)/W | AA=W*E*(1−Y+G*Y) | CC=AA/W | FF=W*Y*E*G | HH=(AA−FF)/(W−FF)
RR after application of the population exclusion criterion | | | | DD=BB/CC | RR after removal of cases found on chart review to have the condition to be excluded and after application of the population exclusion criterion | II=GG/HH

Notes: Letters alone in cells indicate values supplied by the problem. Letters followed by “=” are calculated from the supplied values. An Excel spreadsheet to perform the calculations above is available in Supplementary material.
Abbreviations: RR, relative risk.
Table 2.

Performance of diagnosis codes in detecting prevalent HCC at baseline

Group | Sensitivity | Specificity
---|---|---
Test | 0.95 | 0.99
Comparator | 0.95 | 0.99

Effects of test and comparator antifungal drugs on risk of death from HCC

Parameter | Value
---|---
Risk (Comparator) | 0.034%
Relative Risk (Test v Comparator) | 1.00
Relative Risk (HCC at baseline: present v absent) | 200

Group | Number | Prevalence of HCC at baseline | Deaths from HCC | HCC mortality risk | Removed deaths from HCC: persons found on chart review to have had HCC at baseline | HCC mortality risk after removal of HCC deaths found on chart review to have had HCC at baseline
---|---|---|---|---|---|---
Before exclusion of HCC at baseline using diagnosis codes | | | | | |
Test | 7,000 | 4.2% | 22 | 0.318% | 20.0 | 0.033%
Comparator | 16,000 | 1.9% | 26 | 0.163% | 20.7 | 0.033%
RR in the full population | | | | 1.96 | RR after removal of deaths found on chart review to have had HCC at baseline | 0.98
After removal of persons with HCC at baseline using diagnosis codes | | | | | |
Test | 6,654 | 0.22% | 3 | 0.049% | 1.0 | 0.034%
Comparator | 15,554 | 0.10% | 6 | 0.041% | 1.0 | 0.034%
RR after removal of persons with diagnosis codes for HCC at baseline | | | | 1.21 | RR after removal of persons with diagnosis codes for HCC at baseline and of deaths found on chart review to have had HCC at baseline | 1.00

Abbreviations: EMR, electronic medical record; HCC, hepatocellular carcinoma; RR, relative risk.
Table 2 presents hypothetical data modeled after the motivating example. The test and comparator drug groups number 7,000 and 16,000 persons. Sensitivity and specificity of the diagnosis codes have been set to 95% and 99%, respectively. The risk of HCC death in persons without HCC at baseline is 0.034%. The RR of HCC death associated with HCC at baseline is 200. For the purposes of illustration, the prevalence of HCC at baseline has been set at 4.2% in persons receiving the test drug and 1.9% in persons receiving the comparator. The RR of HCC death associated with test versus comparator drug is set at 1.00, so that any apparent change in RR in the rest of the table is due entirely to artifact.
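The calculation can be sketched in a few lines of Python. This is an illustrative re-implementation of the Table 1 algebra, not the authors' supplementary spreadsheet; the function and variable names are hypothetical, and the comparator case count after screening is written in the form symmetric to the test-group cell, W*E*(1−Y+G*Y). With the Table 2 inputs it returns the four RRs shown in Table 2.

```python
def arm_risks(n, prev, base_risk, rr_drug, rr_cond):
    """Return (crude risk, risk after removing chart-identified ineligible cases)."""
    # All outcome cases in the arm (cells L, M, Z, AA in Table 1)
    cases = n * base_risk * rr_drug * (1 - prev + rr_cond * prev)
    # Outcome cases arising in persons with the exclusionary condition (Q, R, EE, FF)
    ineligible_cases = n * prev * base_risk * rr_drug * rr_cond
    crude = cases / n
    after_removal = (cases - ineligible_cases) / (n - ineligible_cases)
    return crude, after_removal


def screen(n, prev, sens, spec):
    """Apply the imperfect population criterion; return (remaining n, residual prevalence)."""
    n_left = n - n * (prev * sens + (1 - prev) * (1 - spec))  # cells V, W
    prev_left = n * prev * (1 - sens) / n_left                # cells X, Y
    return n_left, prev_left


def relative_risks(test, comp, base_risk, rr_drug, rr_cond):
    """test and comp are dicts with keys n, prev, sens, spec (hypothetical layout)."""
    t_crude, t_chart = arm_risks(test["n"], test["prev"], base_risk, rr_drug, rr_cond)
    c_crude, c_chart = arm_risks(comp["n"], comp["prev"], base_risk, 1.0, rr_cond)
    tn, tp = screen(test["n"], test["prev"], test["sens"], test["spec"])
    cn, cp = screen(comp["n"], comp["prev"], comp["sens"], comp["spec"])
    ts_crude, ts_chart = arm_risks(tn, tp, base_risk, rr_drug, rr_cond)
    cs_crude, cs_chart = arm_risks(cn, cp, base_risk, 1.0, rr_cond)
    return {
        "crude": t_crude / c_crude,                  # P
        "chart_only": t_chart / c_chart,             # U
        "screened": ts_crude / cs_crude,             # DD
        "screened_and_chart": ts_chart / cs_chart,   # II
    }


# Inputs taken from Table 2
test = dict(n=7000, prev=0.042, sens=0.95, spec=0.99)
comp = dict(n=16000, prev=0.019, sens=0.95, spec=0.99)
rr = relative_risks(test, comp, base_risk=0.00034, rr_drug=1.0, rr_cond=200)
for name, value in rr.items():
    print(f"{name}: {value:.2f}")
# crude: 1.96
# chart_only: 0.98
# screened: 1.21
# screened_and_chart: 1.00
```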
The effect of the population screening criterion is seen on the left-hand side of Table 2. The crude RR of 1.96 (due entirely to confounding) is reduced to 1.21 through use of the screening criterion. The 1.21 value is a measure of residual confounding. The right-hand side of Table 2 shows the added effect of removal of deaths for which the chart review indicated that HCC may have been present at baseline. Without prior screening using the population criterion, removal of ineligible outcome cases eliminates the confounding and causes the RR to “overshoot” by a small amount, to a value of 0.98. This value reflects correct numerators (only eligible cases are retained) and a residual error in the denominators, since some ineligible subjects remain in the underlying study population. Carried out after preliminary screening by the population criterion, removal of ineligible cases restores an RR of 1.00.
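The size of the overshoot can be read off the Table 1 formulas. The following derivation is a sketch worked out here from those formulas, using the same cell letters; the rare-outcome approximation in the last line is an added assumption, not a statement made in the table.

```latex
\begin{align*}
L - Q &= H\,E\,F\,(1 - J), \qquad H - Q = H\,(1 - J\,E\,F\,G) \\
S &= \frac{L - Q}{H - Q} = \frac{E\,F\,(1 - J)}{1 - J\,E\,F\,G}, \qquad
T = \frac{M - R}{I - R} = \frac{E\,(1 - K)}{1 - K\,E\,G} \\
U &= \frac{S}{T} = F \cdot \frac{1 - J}{1 - K} \cdot \frac{1 - K\,E\,G}{1 - J\,E\,F\,G}
\;\approx\; F \cdot \frac{1 - J}{1 - K}
\quad \text{when } J\,E\,F\,G \ll 1 \text{ and } K\,E\,G \ll 1
\end{align*}
```

With the Table 2 prevalences, (1 − 0.042)/(1 − 0.019) ≈ 0.977, which rounds to the 0.98 shown. After population screening, J and K are replaced by the much smaller residual prevalences X and Y, so the ratio moves close to 1. Under the approximation, the overshoot depends on the prevalences of the exclusionary condition and not on G.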
Table 3 provides further examples with scenarios that are variations on the study parameters used in Table 2. The variations are described in the left-hand columns, and the resulting RRs in the columns to the right. In each scenario, the underlying RR for test versus comparator is maintained at 1.00, so that the RRs observed are measures of bias. The column labeled “Crude” shows that, within the range of study parameters examined, confounding bias rises with increasing prevalence of the confounder and with increasing RR associated with the confounder. The adjacent column shows that the residual confounding remaining after application of the population exclusion criterion also rises with decreasing sensitivity and specificity of the screening criterion. Application of the population exclusion criterion reduces confounding in every case, as does application of the chart-based exclusion of ineligible cases. The last column of Table 3 shows that the overshoot resulting from the population bias in the preceding column (outcome case removal only) is almost entirely corrected when the population exclusion criterion is applied as well, particularly when the latter has good performance.
Table 3.

Invariant factors in the scenario | Variable factor in the scenario | Level of the variable factor | No chart-based exclusion criterion: Crude | No chart-based exclusion criterion: Population exclusion criterion applied | Chart-based exclusion criterion applied: No population screening | Chart-based exclusion criterion applied: Population exclusion criterion applied
---|---|---|---|---|---|---
Sens/Spec=0.90/0.95, RR=20 | Prevalence of exclusionary condition in the Test/Comparator groups | 2%/0.5% | 1.26 | 1.03 | 0.99 | 1.00
 | | 10%/2.5% | 1.97 | 1.16 | 0.92 | 0.99
 | | 40%/10% | 2.97 | 1.84 | 0.67 | 0.95
Prevalence=10%/2.5%, RR=20 | Sensitivity/Specificity of the screening criterion | 0.95/0.99 | 1.97 | 1.08 | 0.92 | 1.00
 | | 0.90/0.95 | 1.97 | 1.16 | 0.92 | 0.99
 | | 0.85/0.90 | 1.97 | 1.24 | 0.92 | 0.99
Prevalence=10%/2.5%, Sens/Spec=0.90/0.95 | RR of the outcome associated with the exclusionary condition | 2 | 1.07 | 1.01 | 0.92 | 0.99
 | | 20 | 1.97 | 1.16 | 0.92 | 0.99
 | | 200 | 3.50 | 2.15 | 0.94 | 0.99

Abbreviations: Sens, sensitivity; Spec, specificity; RR, relative risk.
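The two columns of Table 3 without chart-based exclusion can be reproduced with a short scenario sweep. The sketch below is illustrative and uses hypothetical function names. It relies on the observation that, when the drug RR is 1.00, the crude and post-screening RRs in the Table 1 algebra reduce to ratios that do not involve the baseline risk or the group sizes. The chart-based columns are not reproduced because they depend on a baseline risk that Table 3 does not state.

```python
def residual_prevalence(prev, sens, spec):
    """Prevalence of the exclusionary condition among persons who pass the screen."""
    passed = 1 - (prev * sens + (1 - prev) * (1 - spec))
    return prev * (1 - sens) / passed


def confounded_rr(prev_test, prev_comp, rr_cond):
    """RR (test v comparator) produced by confounding alone when the drug RR is 1."""
    return (1 - prev_test + rr_cond * prev_test) / (1 - prev_comp + rr_cond * prev_comp)


def scenario(prev_test, prev_comp, sens, spec, rr_cond):
    """Return (crude RR, RR after the population exclusion criterion)."""
    crude = confounded_rr(prev_test, prev_comp, rr_cond)
    screened = confounded_rr(
        residual_prevalence(prev_test, sens, spec),
        residual_prevalence(prev_comp, sens, spec),
        rr_cond,
    )
    return crude, screened


# (prevalence test, prevalence comparator, sensitivity, specificity, RR of condition)
scenarios = [
    (0.02, 0.005, 0.90, 0.95, 20),
    (0.10, 0.025, 0.90, 0.95, 20),
    (0.40, 0.10, 0.90, 0.95, 20),
    (0.10, 0.025, 0.95, 0.99, 20),
    (0.10, 0.025, 0.85, 0.90, 20),
    (0.10, 0.025, 0.90, 0.95, 2),
    (0.10, 0.025, 0.90, 0.95, 200),
]
for s in scenarios:
    crude, screened = scenario(*s)
    print(f"{s}: crude={crude:.2f}, screened={screened:.2f}")
```

Each printed pair can be compared with the “Crude” and “Population exclusion criterion applied” entries in the corresponding row of Table 3.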
Conclusions and further implications
Ineligible outcome cases discovered at chart review ought to be removed, even when it is not possible to apply the same rigor in excluding ineligible members from the total study population.
The concern about confounding by the presence of ineligible persons also applies in a case-control study. Chart review to confirm case status can reveal study ineligibility that might not be discovered for controls. Imagine the liver cancer study above being done as a case-control study within a cohort of (presumed cancer-free) persons hospitalized for fungal infection. If the exclusionary condition is equally prevalent in exposed and non-exposed controls, the exclusion of ineligible cases has no effect on the ratio measures that a case-control study typically yields. Otherwise, the effect is proportional to the relative prevalence of eligible persons in the two compared study groups.
For a self-controlled design, the removal of ineligible cases during adjudication results in the removal of the subject altogether from the analysis, with both exposed and unexposed person-time similarly suppressed. There is no problem.
Acknowledgments
There was no financial support for the preparation of this report. Work on the motivating example was supported by Astellas Pharma Europe through a research contract with World Health Information Science Consultants (WHISCON), LLC. An earlier draft of this comment was prepared as an internal position statement by WHISCON in a research project funded by Pfizer, Inc. No products of either Astellas or Pfizer are mentioned here.
Footnotes
Disclosure
AMW is an employee of WHISCON. SS is a consultant to WHISCON. MDS and MSD are employees of Analysis Group, Inc. The authors report no other conflicts of interest in this work.
References
1. Schneeweiss S, Carver PL, Datta K, et al. Short-term risk of liver and renal injury in hospitalized patients using micafungin: a multicentre cohort study. J Antimicrob Chemother. 2016;71(10):2938–2944. doi: 10.1093/jac/dkw225.
2. Wang JL, Chang CH, Young-Xu Y, Chan KA. Systematic review and meta-analysis of the tolerability and hepatotoxicity of antifungals in empirical and definitive therapy for invasive fungal infection. Antimicrob Agents Chemother. 2010;54(6):2409–2419. doi: 10.1128/AAC.01657-09.