Author manuscript; available in PMC: 2019 Dec 1.
Published in final edited form as: Med Care. 2018 Dec;56(12):e83–e89. doi: 10.1097/MLR.0000000000000875

Table 3.

Algorithm versions’ strengths, weaknesses, and suggested uses.

Algorithm 1. Updated incident cancer identification, utilizing a 6-month claims data window to exclude cancer prior to the study period
Strengths: Requires only claims data, with a narrow additional time window (six months) for excluding prevalent cases.
Weaknesses: Limited time period (six months) for excluding prevalent cancer cases.
Qualitative summary of findings: Lowest specificity and comparatively low PPV and kappa point to limitations in using claims data alone to identify incident cancer cases.

Algorithm 2. Incident cancer identification, utilizing NHS data to exclude prevalent cancer at any point prior to the study period
Strengths: Claims not used to exclude prevalent cancer cases.
Weaknesses: Moderate PPV and kappa, especially for colorectal cancer.
Qualitative summary of findings: Can be applied when data on cancer history are obtained at cohort inception, to ensure that only incident cases are identified through claims.

Algorithm 3. Incident cancer identification, utilizing a 6-month window in claims data and NHS data to exclude prevalent cancer
Strengths: Makes full use of both data sources.
Weaknesses: Very close performance characteristics to Algorithm 2.
Qualitative summary of findings: Higher specificity results in a small improvement in PPV and kappa. Use of both data sources minimizes false-positive incident cancer diagnoses with minimal change in sensitivity.

Algorithm 4. Prevalent cancer identification, utilizing claims only
Strengths: Requires only claims from a two-year observation window to identify those who have ever had cancer.
Weaknesses: Cannot distinguish incident from prevalent cases.
Qualitative summary of findings: High sensitivity, specificity, PPV, NPV, and kappa for identifying ever-cancer diagnoses. Useful when the date of diagnosis is not required (eg, studies of genetic factors, or of early-life risk factors for adult cancers) or when the diagnosis date is available from other sources.
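
The performance measures named in the table (sensitivity, specificity, PPV, NPV, and kappa) all derive from a 2x2 cross-classification of an algorithm's result against a reference standard. The following is a minimal sketch using the standard textbook definitions; it is not the authors' code. The names `TwoByTwo` and `validation_metrics` and the example counts are hypothetical, chosen only for illustration, and the sketch assumes that no row or column total is zero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of algorithm result versus reference standard (hypothetical container)."""
    tp: int  # algorithm positive, reference positive
    fp: int  # algorithm positive, reference negative
    fn: int  # algorithm negative, reference positive
    tn: int  # algorithm negative, reference negative


def validation_metrics(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PPV, NPV, and Cohen's kappa.

    Assumes every row and column total is nonzero; no guard against
    division by zero is included in this sketch.
    """
    n = t.tp + t.fp + t.fn + t.tn
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    ppv = t.tp / (t.tp + t.fp)
    npv = t.tn / (t.tn + t.fn)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement is computed from the marginal totals.
    observed = (t.tp + t.tn) / n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }


# Illustrative counts only; these are NOT results from the study.
example = TwoByTwo(tp=80, fp=20, fn=10, tn=890)
for name, value in validation_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Because PPV depends on the false-positive count, an exclusion step that removes prevalent cases misclassified as incident would be expected to raise specificity and PPV while leaving sensitivity largely unchanged, which matches the pattern the table describes for Algorithm 3.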