Author manuscript; available in PMC: 2024 Aug 1.

Published in final edited form as: Diagnosis (Berl). 2023 Apr 5;10(3):225–234. doi: 10.1515/dx-2022-0130

Table 1.

Rationale and inferences to be drawn from different comparator groups

Comparator Type	Rationale for Choice	Inferences for Different Controls
Look-back^a inter-group disease comparator (Figure 1A)	Disease discovery: Use this comparator in a look-back analysis to determine which diseases are missed presentations (or not) for a particular target symptom. Be cautious of prevalence-linked confounders for downstream adverse events.	Negative controls are used to show specificity (discriminant construct validity) of one or more disease associations in a symptom-disease pairing. Positive controls are used to demonstrate alternative-forms reliability for disease clusters linked to a symptom.
Look-back^a intra-group symptom comparator (Figure 1B)	Symptom discovery: Use this comparator in a look-back analysis to determine which symptoms are missed presentations (or not) for a particular target disease.	Negative controls are used to show specificity (discriminant construct validity) of one or more symptom associations in a symptom-disease pairing. Positive controls are used to demonstrate alternative-forms reliability for symptom clusters linked to a disease.
Look-forward^b intra-group disease comparator (Figure 1C)	Disease specificity: Use this comparator in a look-forward analysis to compare harm rates across diseases for a particular target symptom.	Negative controls are used to show specificity (discriminant construct validity) of one or more disease associations in a symptom-disease pairing. Positive controls are used to demonstrate alternative-forms reliability for disease clusters linked to a symptom.
Look-forward^b inter-group symptom comparator (Figure 1D)	Symptom specificity: Use this comparator in a look-forward analysis to compare harm rates across symptoms for a particular target disease. Be cautious of prevalence-linked confounders for downstream adverse events.	Negative controls are used to show specificity (discriminant construct validity) of one or more symptom associations in a symptom-disease pairing. Positive controls are used to demonstrate alternative-forms reliability for symptom clusters linked to a disease.

^a Look-back analyses begin with diseases as the denominator, so they largely normalize disease prevalence/risk when comparing across symptoms that might be associated. This effectively controls for disease prevalence effects. For example, patients with acute dizziness or vertigo have a roughly 3-5% prevalence of cerebrovascular causes, whereas the prevalence of cerebrovascular causes is higher among patients with new unilateral weakness. In a look-back analysis, this difference in prevalence has little impact on the result: dizziness and vertigo are found to have higher odds of missed stroke than unilateral weakness. The inference takes as a given that the patient actually has a stroke, so the result relates solely to the relative likelihood of error for the target symptoms.
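
As a rough illustration of the look-back logic, the sketch below compares the odds of a missed stroke between two presenting symptoms, using only patients who were ultimately confirmed to have a stroke. All counts, names, and the function `look_back_odds_ratio` are hypothetical and chosen for illustration; they are not data or methods from the study.

```python
# Look-back sketch: the denominator is confirmed stroke patients only,
# so stroke prevalence within each symptom group never enters the calculation.
# All counts below are HYPOTHETICAL, chosen only to illustrate the arithmetic.
stroke_cases = {
    "dizziness_vertigo": {"missed": 40, "not_missed": 60},
    "unilateral_weakness": {"missed": 5, "not_missed": 95},
}


def odds_of_miss(counts):
    """Odds that a confirmed stroke was missed at the index visit."""
    return counts["missed"] / counts["not_missed"]


def look_back_odds_ratio(cases, target, comparator):
    """Odds of a missed stroke for the target symptom relative to the comparator."""
    return odds_of_miss(cases[target]) / odds_of_miss(cases[comparator])


odds_ratio = look_back_odds_ratio(
    stroke_cases, "dizziness_vertigo", "unilateral_weakness"
)
print(f"Look-back odds ratio (dizziness/vertigo vs weakness): {odds_ratio:.2f}")
```

With these assumed counts the odds ratio is well above 1, which mirrors the direction of the result described in the footnote; the size of the ratio reflects the invented counts only.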

^b Look-forward analyses begin with symptoms as the denominator, so they do not normalize disease prevalence/risk when comparing across diseases that might be associated. Because disease prevalence effects are not controlled, analytic results admix prevalence effects and diagnostic error-related effects. As a result, look-forward analyses most closely approximate real-world harm rates. For example, patients with acute dizziness or vertigo have a roughly 3-5% prevalence of cerebrovascular causes, whereas the prevalence of cerebrovascular causes is higher among patients with new unilateral weakness. In a look-forward analysis, this difference in prevalence has a substantial impact on the result: weakness is found to have higher absolute rates/risks of missed stroke than dizziness or vertigo, even though dizziness and vertigo carry a far higher (relative) risk of misdiagnosis when the patient does have a stroke. The inference takes as a given only that the patient has one or the other symptom, so the result admixes the prevalence of stroke in those symptoms with the relative likelihood of error for the target symptoms.
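
The admixture of prevalence and error effects in a look-forward analysis can be sketched numerically. In the example below, the 4% stroke prevalence for dizziness or vertigo sits inside the 3-5% range quoted above; the 40% prevalence for unilateral weakness and both miss probabilities are hypothetical values picked for illustration, as are the names `symptom_groups` and `missed_strokes_per_1000`.

```python
# Look-forward sketch: the denominator is all visits for a symptom, so the
# absolute missed-stroke rate is (stroke prevalence) x (P(miss | stroke)).
# The weakness prevalence and both miss probabilities are HYPOTHETICAL.
symptom_groups = {
    "dizziness_vertigo": {"stroke_prevalence": 0.04, "miss_given_stroke": 0.40},
    "unilateral_weakness": {"stroke_prevalence": 0.40, "miss_given_stroke": 0.05},
}


def missed_strokes_per_1000(group):
    """Absolute missed strokes per 1,000 symptom visits (prevalence and error admixed)."""
    return 1000 * group["stroke_prevalence"] * group["miss_given_stroke"]


dizziness = symptom_groups["dizziness_vertigo"]
weakness = symptom_groups["unilateral_weakness"]

# Relative risk of a miss, conditional on the patient actually having a stroke.
relative_miss_risk = dizziness["miss_given_stroke"] / weakness["miss_given_stroke"]

for name, group in symptom_groups.items():
    print(f"{name}: {missed_strokes_per_1000(group):.1f} missed strokes per 1,000 visits")
print(f"Relative miss risk given stroke (dizziness vs weakness): {relative_miss_risk:.1f}")
```

Under these assumptions, weakness yields roughly 20 missed strokes per 1,000 visits versus roughly 16 for dizziness or vertigo, even though the conditional miss risk is about eight times higher for dizziness or vertigo. The ordering depends entirely on the assumed inputs.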