Author manuscript; available in PMC: 2017 Nov 25.
Published in final edited form as: Clin Trials. 2010 Aug 20;7(6):686–695. doi: 10.1177/1740774510380953

Cancer registries: a novel alternative to long-term clinical trial follow-up based on results of a comparative study

Qian Shi a, Y Nancy You b, Heidi Nelson b, Mark S Allen c, David Winchester d, Andrew Stewart d, Tonia Young-Fadok e, Paul A Decker a, Erin M Green a, Sara J Holton f, Karla V Ballman a
PMCID: PMC5702272  NIHMSID: NIHMS920369  PMID: 20729254

Abstract

Background

Data collection and review were identified as major contributors to the cost of randomized clinical trials (RCTs).

Purpose

We proposed and assessed a novel alternative for long-term clinical trial follow-up based on the data captured through an accredited Cancer Registry (CR) that is part of the National Cancer Database (NCDB).

Methods

Patients from Mayo Clinic, Rochester, enrolled in the North Central Cancer Treatment Group N934653 (COST) trial (98 patients) and the American College of Surgeons Oncology Group Z0030 trial (55 patients) were included in the study. Demographic, treatment, and long-term outcome data were compared between the hospital-based CR and the RCTs’ databases. Concordance was used to estimate the agreement between the two databases. Kaplan–Meier curves were plotted to examine the consistency of time-to-event long-term outcomes between the CR and RCT databases.

Results

High concordances (>95%) were observed for most demographic and treatment variables between the CR data and RCT data. The vital status concordances were 100% and 94.5% between the CR and COST and Z0030 databases, respectively. Three discrepant death dates were observed, one in the COST trial and two in the Z0030 trial. The concordances of disease-free status between the CR and RCT databases were 99.0% and 87.3%, and 15 discrepant disease recurrence cases were identified: 4 for COST and 11 for Z0030.

Limitations

The analysis focused on patients from a single site, Mayo Clinic, Rochester, enrolled in two large RCTs evaluating surgical treatments. The findings need to be confirmed in a broader setting, such as a multi-center, multi-registry study that includes nonsurgical trials.

Conclusions

CR data were nearly identical to data from two randomized phase III trials in different disease types and conducted by two different cooperative groups. The NCDB Cancer Registries represent a feasible alternative for obtaining long-term follow-up data for large clinical trials.

Introduction

Level I evidence from randomized clinical trials (RCTs) has become the gold standard for establishing clinical standards and for setting practice guidelines. Unfortunately, clinical trials have become costly and resources have become scarce [1]. Several national efforts, including two Institute of Medicine Workshops and a C-Change Group investigation, have been sponsored to understand clinical trial costs and inefficiencies [2–4]. The C-Change Group found that the overall costs for subjects enrolled in phase II or phase III therapeutic trials ranged from $1966 to $6950 [2]. When clinical trial costs were further scrutinized, data collection (in terms of having to contact sites to get overdue and missing information), data quality control processes, and data discrepancy resolution were identified as the most time-consuming and labor-intensive components, consuming at least one-third of the total labor and nonlabor costs for each subject. Thus, novel alternatives for long-term data collection and discrepancy resolution should be considered where possible to reduce costs. In particular, if long-term cancer outcome data are collected routinely outside the purview of a clinical trial, considerable cost savings could be realized by using the data that are already available rather than collecting them again.

In a limited-scope study, we considered whether the National Cancer Database (NCDB) could be used as an alternative to the traditional data capture process during long-term follow-up in clinical trials. Several strengths of the NCDB support this consideration. First, the NCDB is a nationwide network of cancer registries that generate an oncology database capturing more than 70% of all new malignant cancers diagnosed and treated in the United States. There are over 1460 Commission on Cancer-accredited cancer registries in the United States [5]. Second, data coding and data submission associated with the NCDB are standardized and quality controlled. Third, key data routinely captured in clinical trials, including long-term survival and disease status, have been regularly reported to the NCDB since its inception. We propose that the existing cancer registries of the NCDB, if they could provide the same long-term data as required in the clinical trial, could reduce the need for redundant data collection and thereby reduce trial costs. To explore the feasibility of this approach, we performed a pilot study to examine the hypothesis that long-term outcome data captured through an accredited cancer registry (CR) that is part of the NCDB, the Mayo Clinic CR, would be as accurate as long-term outcome data captured in two phase III clinical trials conducted at Mayo Clinic, Rochester.

Patients and methods

Patients from Mayo Clinic, Rochester, enrolled in two large multi-institutional RCTs were used to compare the accuracy of data collection between prospective RCTs and a hospital-based CR. The statistical centers for both trials are at the Mayo Clinic, Rochester campus. Clinical data under consideration included demographics, treatment data, and outcome data. All the data fields were concurrently extracted from the RCT and CR databases. This study was Health Insurance Portability and Accountability Act (HIPAA) compliant and approved by the Mayo Institutional Review Board (IRB).

Phase III trials

Two multi-center phase III trials led by the Mayo Clinic surgeons within different cancer populations had high accrual at Mayo Clinic, Rochester. The first trial, North Central Cancer Treatment Group (NCCTG) N934653 [6,7], conducted by the Clinical Outcomes of Surgical Therapy (COST) Study Group, was a noninferiority trial comparing laparoscopically assisted and open colectomy for patients with curable colon cancer. Eight hundred seventy-two patients at 48 institutions were enrolled between August 1994 and 2001, including 103 patients from Mayo Clinic, Rochester. Five of the 103 Mayo patients were excluded due to benign disease, yielding a total of 98 patients (50 and 48 assigned to the open colectomy arm and the laparoscopically assisted colectomy arm, respectively) for analyses in this study. All 98 patients have at least 5 years of follow-up recorded in the CR compared to 97 with at least 5 years of follow-up recorded in the COST database; one patient was lost to follow-up in the COST database (Figure 1(a)).

Figure 1.

Schema: (a) COST trial; (b) Z0030 trial. COST, Clinical Outcomes of Surgical Therapy; MCR, Mayo Clinic Rochester; CR, Cancer Registry.

The second trial (Z0030) [8], led by the American College of Surgeons Oncology Group, compared mediastinal lymph node sampling to complete mediastinal lymphadenectomy during pulmonary resection in patients with N0 or N1 (non-hilar) non-small-cell lung cancer (NSCLC). Patient enrollment started in 1999 and ended in 2004, with 55 patients accrued at Mayo Clinic, Rochester, of the total 1111 on the trial. All surviving patients enrolled in Z0030 had at least 5 years of recorded follow-up in the Z0030 database and one patient was lost to follow-up in the CR database (Figure 1(b)). The primary analysis of Z0030 has not been published yet.

Demographic information and limited treatment data were collected at trial registration and subsequent to surgical intervention, respectively. The long-term follow-up outcomes of interest for both trials were vital status and disease recurrence status. Disease recurrence monitoring for COST patients was done at 3, 6, 9, and 12 months after surgery, and then every 6 months for 5 years. For the Z0030 trial, patients were monitored for local, regional, and distant recurrence until death. Follow-up visits were scheduled at 4, 8, 12, 18, and 24 months after surgery, and then yearly starting at month 36.

Mayo Clinic CR

Patients who were diagnosed and/or treated at Mayo Clinic were captured by the Mayo Clinic CR. Demographic information was directly downloaded from the Mayo patient registration system and verified against the information in the electronic medical record. Tumor characteristics and treatment data were abstracted from the entire electronic medical record, including clinical documentation, pathology reports, and surgical reports. In general, the disease and vital status of patients were updated through incidental and systematic reviews of patients’ medical records. An incidental review occurred when a patient had a disease recurrence based on pathological tissue diagnosis. The systematic review was carried out routinely every 3 months. At the end of each year, patients without a tissue diagnosis of recurrence but who had received chemotherapy and/or radiation therapy were rescreened for recurrence status.

Mayo Clinic Rochester patients enrolled on the COST or Z0030 trial were identified in the Mayo Clinic CR database by matching the patients’ institutional medical number. The data for the study variables of interest for these patients were downloaded from the CR database. According to the coding rules of the CR, events of locoregional recurrence or distant metastasis related to the cancer of interest were considered recurrences. Therefore, if a new primary cancer other than colon or lung cancer occurred, the recurrence status of colon or lung cancer would not be updated.

Statistical methods

Accuracy of the CR data compared to the RCT data was measured as concordance, which was defined as the proportion of patients for which the data of a given variable matched exactly between the RCT and CR databases. Kappa statistics with 95% confidence intervals (CIs) were used to assess the concordance of a given categorical data field between the two data collection methods. Differences in death dates between the RCT and CR databases are described in the ‘Results’ section. Since very few discrepancies in death dates were expected, no formal statistical tests were conducted. Long-term outcomes under investigation were overall survival (OS) and disease-free survival (DFS). OS was defined as the time from randomization to (all-cause) death. DFS was defined as the time from randomization to the first occurrence of confirmed tumor recurrence or death. New primary cancers were not considered events for DFS. Kaplan–Meier curves were plotted based on RCT and CR data for long-term outcomes to evaluate the consistency between the two data collection methods; hazard ratio (HR) estimates and 95% CIs obtained from a Cox proportional hazards model were used to compare the OS and DFS between the intervention groups. Additional data were abstracted from the medical records and reviewed by the study team to resolve or explain discrepancies in data between an RCT and the CR.
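The concordance measure defined above can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: the function name `concordance`, the record layout, and the toy patient records are all hypothetical. The sketch links the two databases on a shared patient identifier and counts exact matches for one variable, treating a value that is missing in either database as a discrepancy, which is how the missing race values are counted in Table 1.

```python
from typing import Dict, Optional, Tuple

Record = Dict[str, Optional[str]]


def concordance(rct: Dict[str, Record], cr: Dict[str, Record],
                variable: str) -> Tuple[int, int]:
    """Return (exact matches, patients compared) for one variable.

    Only patients present in both databases are compared. A value that is
    missing (None) in either database is counted as a mismatch.
    """
    shared_ids = sorted(set(rct) & set(cr))
    matches = 0
    for pid in shared_ids:
        rct_value = rct[pid].get(variable)
        cr_value = cr[pid].get(variable)
        if rct_value is not None and rct_value == cr_value:
            matches += 1
    return matches, len(shared_ids)


# Hypothetical toy records keyed by a patient identifier (not study data).
rct_db = {
    "001": {"gender": "F", "race": None, "dob": "1935-04-02"},
    "002": {"gender": "M", "race": "White", "dob": "1941-11-30"},
    "003": {"gender": "F", "race": "White", "dob": "1929-07-15"},
}
cr_db = {
    "001": {"gender": "F", "race": "White", "dob": "1934-04-02"},
    "002": {"gender": "M", "race": "White", "dob": "1941-11-30"},
    "003": {"gender": "F", "race": "White", "dob": "1929-07-15"},
    "004": {"gender": "M", "race": "White", "dob": "1950-01-01"},  # not on the trial
}

for var in ("gender", "race", "dob"):
    m, n = concordance(rct_db, cr_db, var)
    print(f"{var}: {100 * m / n:.1f}% ({m}/{n})")
```

Patient "004" exists only in the registry and is therefore ignored, mirroring the restriction of the comparison to trial patients identified in the CR.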

Results

Demographics and treatment data

Although the primary intent of this study was to ascertain the quality of the CR long-term outcome data, we also compared the demographic and treatment data that were in common between the clinical trials and the CR as an additional check on data quality. For demographic and treatment information, concordance was high, ranging from 92% to 100% for every variable except race in the COST trial, which had the lowest concordance (77.6%, Table 1). When the discrepant cases were reviewed in detail (Table 1), race discrepancies were primarily due to missing data, date discrepancies to entry errors, and primary tumor site discrepancies to coding differences. For example, the coding of colon cancer sites differed between the RCT and CR databases: the RCT recognized only three anatomic sites (left, right, or sigmoid), whereas the CR recognized multiple sites (ascending, descending, cecum, sigmoid, etc.) and included a code for overlapping sites. These coding differences accounted for all four discrepant primary-site cases in the COST trial.

Table 1.

Demographics and treatment information

| Trial | Category | Variable | Concordance | Description of discrepancies |
|---|---|---|---|---|
| COST | Demographics | Gender | 100% (98/98) | |
| COST | Demographics | Race | 77.6% (76/98) | Race was missing in 22 cases in the COST database, but was known in CR |
| COST | Demographics | Date of birth | 98.0% (96/98) | DOB was missing for one patient in the CR, but was known in the COST database. For another patient, year of DOB was recorded as 1935 in the COST database, but 1934 in the CR |
| COST | Treatment | Surgical date | 95.9% (94/98) | The surgical dates of four patients in the CR were 02/17/1995^a, 03/28/2000^a, 01/27/1999^a, and 02/14/2000^a, but were 02/28/1995, 02/29/2000, 02/05/1999, and 07/11/2000 in the COST database |
| COST | Treatment | Primary site | 95.9% (94/98) | Overlapping lesions were recorded in CR for all the four patients. In COST database, recorded sites were sigmoid for one patient and right for the other three patients |
| Z0030 | Demographics | Gender | 100% (55/55) | |
| Z0030 | Demographics | Race | 98.2% (54/55) | Race was missing in one case in the CR, but was known in the Z0030 database |
| Z0030 | Demographics | Date of birth | 96.4% (53/55) | For one patient, the DOB was recorded as 06/20/1926 in the Z0030 database, but 01/29/1926 in the CR. For another patient, the year of DOB was recorded as 1934 in Z0030 database, but 1924 in the CR (month and day of DOB matches) |
| Z0030 | Treatment | Surgical date | 100% (55/55) | |
| Z0030 | Treatment | Primary site | 92.3% (51/55) | For three patients, upper lobes were indicated in the CR, but upper and lower, and upper and middle were recorded in the Z0030 database for one and two patients, respectively. The fourth patient had upper lobe in the CR and left hilum in the Z0030 database |

DOB, date of birth; CR, cancer registry.

^a These dates are the dates of the biopsy. Upon review, it was discovered that a CR software conversion caused these discrepancies; in the original version of the CR, the dates matched those of the RCT.

Death and 5-year survival data

In the COST trial, there was complete agreement between the RCT and CR databases with respect to vital status (Table 2). One discrepancy in date of death was observed between the two databases; the dates differed in the day but agreed in month and year. The Kaplan–Meier survival curves based on the COST trial and CR databases were essentially identical within both surgery arms (Figure 2 (a1) and (a2)). Neither the COST nor the CR data yielded evidence of a difference between treatment arms based on HR estimates: the COST data HR = 0.74 (95% CI: 0.37–1.48) and CR data HR = 0.77 (95% CI: 0.38–1.55). The overall COST trial results showed no evidence of a difference in OS between the two treatment arms [6].
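To make the curve comparison concrete, the following is a minimal sketch of the Kaplan–Meier (product-limit) estimator applied to the same patients as recorded in two databases. It is not the analysis code used in the study; the function names `kaplan_meier` and `survival_at` and all follow-up values are hypothetical. In the toy data the two databases differ only in one death date (30 vs. 31 months), so the curves separate briefly and then coincide again.

```python
from collections import Counter
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate as a list of (event time, survival probability).

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    surv = 1.0
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the step function: survival probability at time t."""
    surv = 1.0
    for event_time, s in curve:
        if event_time <= t:
            surv = s
    return surv


# Hypothetical follow-up in months for the same six patients (not study data).
events = [1, 1, 0, 0, 1, 0]
rct_times = [12, 30, 45, 60, 60, 60]
cr_times = [12, 31, 45, 60, 60, 60]   # one death date differs between databases

rct_curve = kaplan_meier(rct_times, events)
cr_curve = kaplan_meier(cr_times, events)

for month in (24, 30, 36, 60):
    print(month, round(survival_at(rct_curve, month), 3),
          round(survival_at(cr_curve, month), 3))
```

A small shift in a single event date changes the estimate only between the two recorded dates, which is consistent with curves that look essentially identical when date discrepancies are few and minor.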

Table 2.

Long-term outcome data

| Trial | Variable | Concordance | Kappa statistic (95% CI) |
|---|---|---|---|
| COST | Death status | 100% (98/98) | 1.00 (1.00, 1.00) |
| COST | Date of death^a | 96.9% (31/32) | NA |
| COST | Disease-free status | 99.0% (97/98) | 0.98 (0.93, 1.00) |
| Z0030 | Death status | 94.5% (52/55) | 0.89 (0.77, 1.00) |
| Z0030 | Date of death^a | 92.6% (25/27) | NA |
| Z0030 | Disease-free status | 87.3% (48/55) | 0.74 (0.57, 0.92) |

^a Analyses of agreement of dates were conducted among patients whose death status agreed between CR and trial databases.

Figure 2.

Estimated probability of OS and DFS by intervention arms for the COST trial

For the Z0030 trial, the vital status concordance was 94.5% (Table 2), corresponding to a Kappa statistic of 0.89 (95% CI: 0.77, 1.00). There were three discrepancies. Two patients were recorded as dead in the Z0030 database but alive in the CR database, whereas the third patient was recorded as alive in the Z0030 database but dead in the CR database. The discrepancies arose because death information was updated at different times in the two databases. For example, the last follow-up dates in the CR database for the two patients recorded as dead in the Z0030 database were prior to their death dates. Among patients whose vital status matched between the two databases, dates of death differed between the CR and RCT for two cases. The date of death differed in the day only (6th vs. 8th) for one patient and differed by exactly 1 year for the other patient. The Kaplan–Meier survival curves based on the Z0030 trial and CR databases (Figure 3(a)) are nearly identical.
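As a worked check of the figures above, the sketch below computes the concordance and Cohen's Kappa for Z0030 vital status from a 2×2 agreement table. Two inputs are assumptions rather than values reported directly: the count of 27 patients dead in both databases is inferred from the date-of-death denominator in Table 2, and the 25 patients alive in both follow from the 52 concordant cases. The standard error uses a simple large-sample approximation, which may differ from the formula used by the statistical software in the study, and the function name `cohen_kappa_2x2` is hypothetical.

```python
import math
from typing import Tuple


def cohen_kappa_2x2(both_yes: int, rct_only: int, cr_only: int, both_no: int,
                    z: float = 1.96) -> Tuple[float, float, Tuple[float, float]]:
    """Return (observed agreement, kappa, approximate 95% CI) for a 2x2 table."""
    n = both_yes + rct_only + cr_only + both_no
    p_obs = (both_yes + both_no) / n
    rct_yes = (both_yes + rct_only) / n      # marginal: event in RCT database
    cr_yes = (both_yes + cr_only) / n        # marginal: event in CR database
    p_exp = rct_yes * cr_yes + (1 - rct_yes) * (1 - cr_yes)  # chance agreement
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Simple large-sample standard error (an assumption, see the text above).
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    lower = max(-1.0, kappa - z * se)
    upper = min(1.0, kappa + z * se)         # kappa cannot exceed 1
    return p_obs, kappa, (lower, upper)


# Z0030 vital status: 27 dead in both (inferred), 2 dead in Z0030 only,
# 1 dead in CR only, 25 alive in both (inferred), 55 patients in total.
p_obs, kappa, ci = cohen_kappa_2x2(both_yes=27, rct_only=2, cr_only=1, both_no=25)
print(f"concordance={p_obs:.1%}, kappa={kappa:.2f}, "
      f"95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
# prints: concordance=94.5%, kappa=0.89, 95% CI=(0.77, 1.00)
```

Under these assumptions the sketch agrees with the reported 94.5% concordance and Kappa of 0.89 (95% CI: 0.77, 1.00) after rounding.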

Figure 3.

Estimated probability of OS and DFS for the Z0030 trial

Disease recurrence and 5-year DFS data

In order for the cancer registries and clinical trials to have equivalent disease recurrence data, they first of all need to use the same follow-up schedule. This is usually the case for long-term follow-up, because both the clinical trial protocol and the CR typically use a standard-of-care follow-up schedule for patients. In spite of this, the observed decrease in concordances between the RCT and CR databases for DFS (Table 2) is due to the differences in disease recurrence status. The concordances of disease-free status between the two databases were 99.0% and 87.3% corresponding to a Kappa statistic of 0.98 (95% CI: 0.93, 1.00) and 0.74 (95% CI: 0.57, 0.92) for the COST and Z0030 trials, respectively. The loss of agreement in DFS was reflected in the Kaplan–Meier survival curves based on RCT and CR data in Figures 2 (b1) and (b2) and 3(b). Both the COST and CR data yielded no evidence of differences in the treatment arm based on HR estimates: the COST data HR = 0.66 (95% CI: 0.34–1.29) and CR data HR = 0.75 (95% CI: 0.38–1.47). The overall COST trial results showed no evidence of a difference in DFS between the two treatment arms [6].

Table 3 provides details of the discrepancies in tumor recurrence data between the two trials and the CR, as well as the clinician impression based on blinded medical chart review. The four disease recurrence status discrepancies between the COST and the CR databases are all of the same nature: the COST database recorded the patient as having a disease recurrence and the CR database listed the patient as recurrence free. In three cases, the medical records indicate that the patients did not have disease recurrence but had a new primary. For the remaining case, the medical record agrees with the COST database in that the patient had disease recurrence. This particular patient had liver metastases found at the time of the original operation. According to the coding rules for the COST trial, this was considered an immediate recurrence. However, since the metastatic nodule in the liver was completely resected, this was not considered a disease recurrence according to CR coding rules. Two more patients had immediate recurrences recorded in the COST database. In the CR database, disease recurrences were also recorded for these patients, but at a later time: one about 5 months after surgery and the other about 1.5 years after surgery.
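The effect of such coding differences on the DFS endpoint can be illustrated with a small sketch. It applies the DFS definition from the statistical methods (time to the first confirmed recurrence or death, with new primaries not counted as events) to one hypothetical patient as coded by each database. The function name `dfs_endpoint` and all times are illustrative assumptions, not study data.

```python
from typing import Optional, Tuple


def dfs_endpoint(followup_months: float,
                 death_months: Optional[float],
                 recurrence_months: Optional[float]) -> Tuple[float, int]:
    """Return (time, event) for disease-free survival.

    The event is the first of confirmed recurrence or death. A new primary
    cancer is never passed in as a recurrence, so it does not count as an
    event; patients without an event are censored at last follow-up.
    """
    event_times = [t for t in (recurrence_months, death_months) if t is not None]
    if event_times:
        return min(event_times), 1
    return followup_months, 0


# Hypothetical patient, alive at 60 months, with a lesion found at 18 months.
# The trial database codes the lesion as a recurrence; the registry codes it
# as a new primary, so no recurrence date is recorded there.
trial_view = dfs_endpoint(followup_months=60, death_months=None, recurrence_months=18)
registry_view = dfs_endpoint(followup_months=60, death_months=None, recurrence_months=None)

print("trial database:", trial_view)       # (18, 1): DFS event at 18 months
print("registry database:", registry_view) # (60, 0): censored at 60 months
```

The same clinical finding therefore yields an event in one database and a censored observation in the other, which is the kind of difference that separates the DFS curves even when both databases capture the underlying event.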

Table 3.

Discrepancy of tumor recurrence data

| Trial | Patient | Indication in trial database | Indication in CR database | Blinded MD review: clinical Dx | Blinded MD review: method of Dx^a | Blinded MD review: notes |
|---|---|---|---|---|---|---|
| COST | 1 | Rec | Rec-free | New primary | CT, biopsy | Squamous cell CA of lung |
| COST | 2 | Rec | Rec-free | New primary | Operative pathology | Metastatic breast CA; NED (colon CA) |
| COST | 3 | Rec | Rec-free | New primary | Operative pathology | Adeno CA, transverse colon, T1N0 |
| COST | 4 | Rec | Rec-free | Rec | Operative pathology | Liver and lung metastatic disease, both resected (10/1999)^b |
| Z0030 | 1 | Rec | Rec-free | Rec | Bronchoscopy, biopsy | Local, previous resection margin |
| Z0030 | 2 | Rec | Rec-free | Rec | Operative pathology | Distant, right cerebellum |
| Z0030 | 3 | Rec | Rec-free | Rec | PET positive | Local/regional metastases to mediastinal nodes |
| Z0030 | 4 | Rec | Rec-free | Rec | Biopsy | Local, surgical scar |
| Z0030 | 5 | Rec | Rec-free | Rec | Bronchoscopy, biopsy | Metastatic in lung |
| Z0030 | 6 | Rec | Rec-free | Rec | Biopsy | Metastatic in lung/pleura |
| Z0030 | 7 | Rec-free | Rec | Rec | CT, biopsy | Local, previous resection margin |
| Z0030 | 8 | Rec | Rec-free | NED | CT | Clinician and radiologist per re-review of CT |
| Z0030 | 9 | Rec | Rec-free | New primary | Bronchoscopy, brushing | Opposite lung |
| Z0030 | 10 | Rec | Rec-free | Rec vs new primary | Treated with radiation | Too advanced to differentiate |
| Z0030 | 11 | Rec | Rec-free | Rec vs new primary | Treated with radiation | Too advanced to differentiate |

Rec, recurrence; CA, carcinoma; NED, no evidence of disease; CT, computed tomography; PET, positron emission tomography.

^a All cases with discrepancies had radiographic imaging and a clinical impression by the treating MD. For all except two cases, the clinical impression was followed by additional confirmatory biopsies and histology or by treatment.

^b Clinical Dx, NED with Level 2 clinical impression based on 06/02/2008 medical record notes.

There were 11 discrepancies with respect to disease recurrence status between the Z0030 and the CR databases. Among these instances, only one case was listed as recurrence free in the Z0030 trial and as disease recurrence in the CR (case 7 for Z0030 trial in Table 3); the last known recurrence-free date indicated in Z0030 was about 3 months earlier than the disease recurrence date indicated in CR. In the remaining cases, Z0030 indicated disease recurrence and the CR database listed the patients as recurrence free. In six cases, a blinded review of the medical records confirmed disease recurrence of which five were based on biopsy/surgery reports (Table 3, cases 1–6). Reasons the CR did not identify the disease recurrences for these cases included: (1) the relevant notes in the medical record indicating disease recurrence were overlooked by the CR abstractor; (2) the event was considered as a new primary by the CR abstractor; and (3) the event was not flagged for CR review or confirmed for recurrence due to lack of tissue. For two cases, the medical chart review could not differentiate between recurrence and new primary due to advanced disease. In the other two cases, the Z0030 indicated disease recurrence, but both the CR database and medical record review indicated that it was a new primary for one case, and no evidence of disease for the other. The Z0030 disease recurrence designations were likely due to unconfirmed abnormalities found at a follow-up examination (e.g., from imaging) and the level of diagnostic evidence for disease recurrence was less than definitive.

Discussion

Motivated by the cost, complexities, and challenges of obtaining long-term follow-up on clinical trials, we assessed the feasibility of using CR data as an alternative method of securing long-term cancer outcomes. We studied two clinical trial populations for which data were available within a single institution from both the clinical trial and the CR databases and found that the CR data, including demographics, treatment information, and survival data, were essentially as accurate as the clinical trial data. For the laparoscopic colon trial, the CR data generated the same survival curves and reached the same conclusions regarding the relative efficacy of laparoscopic versus open surgery as reached by the COST trial itself. For the lung trial, a comparison of OS between the two treatment arms was not presented because the primary manuscript for the trial has not been published yet, but the OS curves between the two databases were compared and found to be nearly identical. These results demonstrate the potential feasibility of garnering long-term follow-up from cancer registries.

However, we did identify some discrepancies between the two databases with respect to disease recurrence status, and a number of issues are yet to be resolved. The discrepancies identified in the course of the study prompted us to delve deeper into the methodologic differences between the data-gathering processes of the CR and the clinical trials. We anticipated that the CR might underperform; in fact, the registry identified the same events but designated them differently than the clinical trials. Closer scrutiny disclosed that the prescribed rules and definitions for the CR and the clinical trials differ, and these differences accounted for 6 of the 15 discrepancies (40%). For example, the clinical trial protocol defined the existing metastatic disease in patient 4 of the COST trial (Table 3) as an immediate recurrence, whereas the CR classified it as part of the initial diagnosis. This is not unexpected, since definitions for endpoints such as disease recurrence differ among the different cancer cooperative groups. The initiative of the National Cancer Institute’s (NCI) Standardized Case Report Forms is now in its final stages and should be completed within the next year. Once these are finalized, a standardized set of cancer outcomes will be collected by all groups performing cancer trials funded by NCI. The broader and more relevant question is whether the Commission on Cancer and the NCI can harmonize the collection of long-term outcome data, especially those related to the disease itself (recurrence or progression). This is likely very possible given that the NCDB is already committed and open to assisting the NCI cooperative groups, as evidenced by its assistance to the groups with targeting accruals.

In general, it is recognized that determining disease recurrence status is sometimes subjective. In our study, 3 of 15 (20%) disease recurrence status discrepancies were a matter of subjective interpretation of nondefinitive information. For example, the information available for patients 10 and 11 under the Z0030 trial (Table 3) could not definitively determine whether there was a disease recurrence or a new primary. The level of discrepancies observed between the clinical trials’ databases and the CR falls within the range of discrepancies observed between treating physician opinion and central review opinion. For randomized trials, it would be expected that the discrepancies affect the arms of the trials equally. However, if the CR consistently under-reports disease recurrence for a considerable fraction of the patients, this would dilute the real treatment effects; in other words, it would decrease the power of the study.

This study is limited in scope because it is based on data from a single CR. It is possible that the Mayo Clinic CR is not representative of a typical accredited CR, such as one at a community hospital. However, there are reasons to believe that other cancer registries would produce the same degree of concordance as the registry investigated in this study. First, the Commission on Cancer has well-established operational standards for cancer registries at all accredited programs, including the requirement that case abstraction is performed or supervised by a Certified Tumor Registrar. Second, each registry is audited every 3 years to ensure that case abstracting is performed in a timely manner using established guidelines; that data meet a broad range of stringent quality criteria; and that satisfactory long-term follow-up rates are maintained. This gives us confidence that utilization of cancer registries as a mechanism for effective long-term vital- and disease-status follow-up may be broadly applicable across many US institutions.

Furthermore, there are many clinicians who are interested in participating in clinical trials but are limited by the lack of an established clinical trials/research infrastructure at their institutions. In other instances, a definitive randomized trial for a relatively rare disease (e.g., GIST, gastrointestinal stromal tumors) or intervention under investigation (e.g., radiofrequency ablation) requires clinicians with access to patient populations or expertise that is outside that of the membership of the majority of NCI-funded cooperative groups. Recruiting clinicians who do not normally participate in clinical trials may be highly successful, but follow-up on the disease outcomes of patients enrolled onto clinical trials can be more challenging because it extends years beyond the trial capitation payment. The novel approach described here could offer an alternative mechanism for long-term follow-up that could harness a unique population of clinicians and their patients.

It should be acknowledged that the use of cancer registries for the collection of long-term outcome data is likely not an option for some trials. The feasibility of using cancer registries should be examined closely based on the nature of both the study and the outcomes of interest. First, we are only proposing the registries be used for long-term cancer outcomes, meaning once patients are being followed according to a standard-of-care schedule. In particular, it could not be used for patients for whom the protocol specifies a follow-up schedule that is more frequent than standard-of-care. It is only useful for those patients who are on a standard-of-care follow-up schedule and the data collected at follow-up is limited to disease recurrence and vital status. Furthermore, if an objective of a trial is to summarize the different types of disease recurrences (local, regional, and distant), a CR could not be used to obtain that data at the present time because it only captures the first disease occurrence of any type. Second, it would not make sense for trials that have long-term outcome endpoints other than those of vital status or disease recurrence to use the CR. Examples of those endpoints are functional assessments, quality-of-life measures, and late-term adverse events. These are not routinely collected by the CR currently. In addition, trials that are expected to be used as FDA registration trials would likely not be good candidates for obtaining long-term outcome follow-up from the cancer registries due to the additional reporting and oversight required for such trials. Third, trials that assess rare tumors that may not be treated by oncologists, such as eye cancer, would likely not be able to use the CR for long-term cancer outcome data. Finally, cancer registries would not be a viable option for obtaining treatment-level data that is required by most protocols. The treatment information captured by the registries is too coarse to be of use in trials investigating new therapies. During the treatment period, it is essential that the trial personnel monitor the patient’s condition and record the treatment dose, dose modification, and more importantly the adverse events. Hence, cancer registries may not be a good source for study data collection during the treatment period. However, after the completion of the treatment and when the patient is followed on a standard-of-care schedule, the long-term outcome data collection by trial personnel can be ended, and the use of the CR can be implemented.

Although this study was based on a limited sample size of patients enrolled from a single institution, the results show the initial ‘proof of principle’ that utilizing CR data can be a feasible alternative to long-term follow-up for future clinical trials. The next step would be to determine whether similar results would be obtained in a multi-institutional study, which would involve conducting a prospective study with a larger number of diverse institutions to test this novel follow-up approach. Such a study would allow us to set more rigorous prospective targets, that is, recruiting those institutions that would be expected to participate in clinical trials for which this novel opportunity might be best suited, and including a variety of therapeutic clinical trials in different disease populations. Finally, if the broader investigation shows that the results of this study are generalizable across multiple cancer registries, implementation issues would need to be addressed and resolved, for example, obtaining access to the NCDB for oncology clinical trials groups following the regulatory requirements and harmonizing audit requirements between the cancer registries and cancer clinical trial groups.

In summary, we conducted a preliminary investigation of whether it might be possible for cancer clinical trials to use long-term cancer outcome data that is already being collected in a standardized fashion, and which is subject to quality control processes, by cancer registries that deposit this information into the NCDB. Our preliminary results from a limited-scale study show that little information or data quality appears to be lost if CR data were to be used. At the same time, the cost savings and efficiencies gained by groups doing clinical trials would be considerable, because groups could use the existing data instead of replicating data collection already undertaken by the cancer registries.

Acknowledgments

This study was supported by the National Institutes of Health (grant no. U10CA076001).

References
