2015 May 9;107(8):djv134. doi: 10.1093/jnci/djv134

Table 2.

Studies of health claims–based methods for identifying metastasis and/or recurrence in population-based samples*

Author | Study aim: validation of algorithm to identify metastases or recurrence in health claims by using: | Data sources/setting | Population | Study period | Sample size/# of patients with recurrence/metastasis | Method to validate metastasis or recurrence | Findings (best algorithm)
Chawla et al. (13) | ICD-9 diagnosis codes to identify presence and location of metastatic disease at the time of diagnosis for patients with breast, colorectal, or lung cancer diagnosis | Medicare inpatient, hospital outpatient, and physician claims linked to SEER cancer registry data | Medicare patients age 65+ y | Registry cancer diagnosis: 2005–2007 | # cases/# distant metastasis
BC-27 143/1642
CRC-24 216/3790
LC-28 693/13 594
Metastasis identified from diagnosis codes and stage inferred as local, regional, or distant based on metastasis location (or absence of codes); compared with registry data, used as gold standard | Distant metastasis
BC: Sensitivity-51%
  Specificity-99%
  PPV-66%
  NPV-97%
CRC: Sensitivity-73%
   Specificity-94%
   PPV-69%
   NPV-95%
LC: Sensitivity-43%
  Specificity-95%
  PPV-88%
  NPV-65%
Chubak et al. (14) | ICD-9 diagnosis and procedure codes, HCPCS codes, and pharmacy claims to identify second breast cancer (BC) events, including recurrence | Health and pharmacy claims from an integrated health care system linked to medical records and SEER cancer registry | Women age 18+ y with stage I and II BC who participated in prior cohort studies | Cancer diagnosis: 1993–2006
Evaluation of second events after surgery
# of breast cancer cases-3152
# recurrences-299
# new primaries-93
Algorithms with and without SEER variables to identify second events and recurrences; medical record used as gold standard; analysis produced several receiver operating characteristic (ROC) curves to identify algorithms with high sensitivity or high specificity | ROC for recurrence:
High sensitivity
  Sensitivity-94%
  Specificity-92%
  PPV-58%
  NPV-99%
High specificity & PPV
  Sensitivity-72%
  Specificity-97%
  PPV-75%
  NPV-97%
Cooper et al. (15) | ICD-9 diagnosis codes to identify presence and location of metastatic disease at the time of diagnosis for patients with breast, colorectal, endometrial, lung, pancreatic, or prostate cancer diagnosis | Medicare inpatient and hospital outpatient claims linked to SEER cancer registry data | Medicare patients age 65+ y | Registry cancer diagnosis: 1984–1993 | # cases/# distant metastasis
BC-60 445/3360
CRC-75 576/13 165
EN-14 157/1364
LC-71 468/28 255
PC-13 859/5857
PR-85 132/10 153
Metastasis identified from diagnosis codes within +/- 3 mo of diagnosis and stage inferred as local, regional, or distant based on metastasis location (or absence of codes); compared with registry data, used as gold standard | Performance of claims to identify distant metastasis:
BC: Sensitivity-60%
  PPV-58%
CRC: Sensitivity-79%
   PPV-63%
EN: Sensitivity-53%
  PPV-65%
LC: Sensitivity-58%
  PPV-79%
PC: Sensitivity-79%
  PPV-74%
PR: Sensitivity-58%
  PPV-65%
Hassett et al. (16) | ICD-9 diagnosis and procedure codes, HCPCS codes, DRG, and revenue center codes for secondary malignant neoplasm and chemotherapy codes for patients with breast, lung, colorectal, and prostate cancer | Cancer Care Outcomes Research and Surveillance (CanCORS) medical record data linked to Medicare claims and HMO/Cancer Research Network (HMO/CRN) data
CanCORS: Medicare patients with stage I-III lung and colorectal cancer who had definitive local therapy
HMO/CRN: Age 21 y and older with stage I-IIIA cancer treated with definitive therapy
CanCORS cases diagnosed 2003–2005 and followed for 14 mo
HMO/CRN cases diagnosed 2000–2005 and followed for 60 mo
# cases/# recurrence
CanCORS:
  LC-309/59
  CRC-620/56
HMO/CRN:
  BC-2726/212
  CRC-1088/191
  LC-333/129
  PR-1151/89
Algorithm using ICD-9 diagnosis codes for secondary malignancy and chemotherapy codes to identify recurrence at 14 and 60 mo from claims or encounter data when compared with medical record data, used as gold standard | CanCORS 14-mo data:
LC: Sensitivity-77%
  Specificity-70%
CRC: Sensitivity-81%
   Specificity-83%
CRN 60-mo data:
BC: Sensitivity-78%
  Specificity-79%
  PPV-30%
CRC: Sensitivity-83%
   Specificity-79%
   PPV-53%
LC: Sensitivity-85%
  Specificity-72%
  PPV-72%
PR: Sensitivity-19%
  Specificity-83%
  PPV-11%
Nordstrom et al. (17) | ICD-9 diagnosis and treatment codes, HCPCS codes, and NDC codes to identify presence of metastatic disease at the time of diagnosis for patients with breast, colorectal, lung, and prostate cancer | Health and pharmacy claims from a national insurer linked to electronic medical record data from oncologists | Patients of all ages with commercial (non-Medicare) insurance | Cancer diagnosed 2004–2010
BC-1385/175
CRC-727/215
LC-1036/477
PR-267/176
Metastases identified from cancer diagnoses and procedure or pharmacy claims for cancer treatment compared with EMR, used as gold standard for incidence and stage
BC: Sensitivity-62%
  Specificity-97%
  PPV-75%
  NPV-95%
CRC: Sensitivity-67%
   Specificity-93%
   PPV-80%
   NPV-87%
LC: Sensitivity-60%
  Specificity-88%
  PPV-81%
  NPV-72%
PR: Sensitivity-81%
  Specificity-75%
  PPV-86%
  NPV-67%
Warren et al. (18) | ICD-9 diagnosis and procedure codes, HCPCS codes, and NDC codes to assess the sensitivity of Medicare claims to identify cancer recurrence for patients with breast and colorectal cancer | Medicare inpatient, hospital outpatient, physician, durable medical equipment, and hospice claims linked to cancer incidence in SEER cancer registry data | Patients age 65+ y diagnosed with stage II or III BC or CRC who received definitive treatment, had a treatment-free interval, and later died from cancer | Cancer diagnoses 1994–2003
Evaluation of recurrence: 1994 until death (2008 at latest)
BC-3826
CRC-6910
All patients were assumed to have recurred because they died from cancer; following definitive treatment and 3-mo treatment-free interval, claims were reviewed until death for additional cancer therapy (surgery, chemotherapy, radiation) or hospice admission
Additional therapy as first indicator of recurrence:
BC-39%
CRC-35%
No indicator of recurrence or hospice as the only indicator of recurrence:
BC-19%
CRC-25%
Whyte et al. (19) | ICD-9 and HCPCS codes on health claims to identify patients with metastatic breast, colorectal, or lung cancer | Inpatient, outpatient, emergency room, physician, and surgery center claims from a national insurer linked to cancer reported in clinical oncology data | Patients of all ages with commercial (non-Medicare) insurance | Claims and clinical oncology data from 2007–2010; claims were reviewed for up to 1 y prior to and 3 mo following a cancer diagnosis in the clinical oncology database
BC-4631/371
CRC-2058/528
LC-2449/1204
General and cancer site-specific algorithms were evaluated using ICD-9 diagnosis codes for secondary neoplasms; algorithms varied by frequency/timing of codes and specific metastatic sites; compared with clinical oncology data as gold standard | Best algorithm:
BC: Sensitivity-66%
  Specificity-97%
  PPV-75%
  NPV-96%
CRC: Sensitivity-63%
   Specificity-88%
   PPV-71%
   NPV-83%
LC: Sensitivity-61%
  Specificity-81%
  PPV-77%
  NPV-68%

* BC = breast cancer; CanCORS = Cancer Care Outcomes Research and Surveillance; CRC = colorectal cancer; DFS = disease-free survival; DRG = diagnosis-related group; EN = endometrial cancer; HCPCS = Healthcare Common Procedure Coding System; ICD = International Classification of Diseases; LC = lung cancer; NDC = national drug code; NPV = negative predictive value; PC = pancreatic cancer; PR = prostate cancer; PPV = positive predictive value; ROC = receiver operating characteristic curve; SEER = Surveillance, Epidemiology, and End Results.
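
The four accuracy measures reported in the Findings column (sensitivity, specificity, PPV, NPV) all derive from a 2x2 comparison of the claims algorithm against the gold standard. The following is a minimal Python sketch of the standard definitions, not the code used by any of the studies above. The names `ConfusionCounts`, `accuracy_metrics`, and `predictive_values` are hypothetical, and the counts in the example are invented for illustration rather than taken from the table. The second function applies Bayes' rule to show why the same sensitivity and specificity can give very different PPVs when the share of patients with metastasis or recurrence differs between samples.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts of a claims algorithm versus the gold standard."""

    tp: int  # algorithm positive, gold standard positive
    fp: int  # algorithm positive, gold standard negative
    fn: int  # algorithm negative, gold standard positive
    tn: int  # algorithm negative, gold standard negative


def accuracy_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> tuple[float, float]:
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Hypothetical counts: 1000 patients, 100 with gold-standard metastasis.
    counts = ConfusionCounts(tp=60, fp=20, fn=40, tn=880)
    for name, value in accuracy_metrics(counts).items():
        print(f"{name}: {value:.1%}")

    # Same sensitivity and specificity, but only 2% of patients affected.
    ppv_low, npv_low = predictive_values(0.60, 880 / 900, 0.02)
    print(f"PPV at 2% prevalence: {ppv_low:.1%}")
    print(f"NPV at 2% prevalence: {npv_low:.1%}")
```

Because the published percentages are rounded, values recomputed this way from the table's sensitivity, specificity, and sample sizes should be expected to approximate, not reproduce, the reported PPV and NPV.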