Abstract
The Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial is a large-scale research effort conducted by the National Cancer Institute. PLCO offers an example of coordinated research by both the extramural and intramural communities of the National Institutes of Health. The purpose of this article is to describe the PLCO research resource and how it is managed and to assess the productivity and the costs associated with this resource. Such an in-depth analysis of a single large-scale project can shed light on questions such as how large-scale projects should be managed, what metrics should be used to assess productivity, and how costs can be compared with productivity metrics. A comprehensive publication analysis identified 335 primary research publications resulting from research using PLCO data and biospecimens from 2000 to 2012. By the end of 2012, a total of 9679 citations (excluding self-citations) have resulted from this body of research publications, with an average of 29.7 citations per article, and an h index of 45, which is comparable with other large-scale studies, such as the Nurses’ Health Study. In terms of impact on public health, PLCO trial results have been used by the US Preventive Services Task Force in making recommendations concerning prostate and ovarian cancer screening. The overall cost of PLCO was $454 million over 20 years, adjusted to 2011 dollars, with approximately $37 million for the collection, processing, and storage of biospecimens, including blood samples, buccal cells, and pathology tissues.
Over the past two decades, the complexity and scale of medical research in the United States and in the developed world as a whole have increased substantially, as has the total dollar amount of research expenditures. For nations as a whole, as well as for institutions, assessing the productivity and costs of medical research programs is critical to managing them successfully and making them as productive as possible.
The Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial is a large-scale research effort conducted by the National Cancer Institute (NCI), a component of the National Institutes of Health (NIH). From its inception, the PLCO was designed not only as a randomized controlled trial (RCT) of screening for four cancers but also more broadly as a research enterprise consisting of the trial, a large, well-characterized cohort with all-cancer outcomes, and a biorepository (Figure 1). PLCO offers an example of coordinated research by both the extramural and intramural communities of the NIH. The purpose of this article is to describe the PLCO research resource and how it is managed, assess the productivity of the resource, and examine the costs associated with the resource. Such an in-depth analysis of a single large-scale project can shed light on questions such as how such projects should be managed, what metrics should be used to assess productivity, and how costs can be compared with productivity metrics overall and on a component-by-component basis.
Background/History of the Trial
The original impetus for the PLCO trial was a concern about advertised claims for prostate cancer screening with transrectal ultrasound (1). This evolved over a 2-to-3–year period into a Prostate, Lung, and Colorectal (PLC) trial of older men evaluating chest radiograph (CXR) screening for lung cancer, flexible sigmoidoscopy (FSG) screening for colorectal cancer and prostate-specific antigen (PSA) and digital rectal exam (DRE) screening for prostate cancer; transrectal ultrasound was dropped because it was found in pilot studies to add little to PSA and DRE. The PLC trial was approved by the Board of Scientific Counselors of the NCI’s Division of Cancer Prevention (DCP) in October 1989, and procedures were initiated to award contracts for the trial. However, in August 1990, the NCI Executive Committee halted the awarding of contracts for the trial because of concerns that women were not included even though they were at risk for lung and colorectal cancer. At that point, several options were explored, including adding women to PLC and starting a parallel screening trial for women. Eventually, it was decided to expand PLC to include screening for ovarian cancer, and the resultant PLCO trial was approved by the Board of Scientific Counselors in 1991. The ovarian cancer screening included CA125 and transvaginal ultrasound (TVU); bimanual ovarian palpation was also an initial component but was dropped because it did not detect additional cancers missed by CA125 and TVU. PLCO also included a biorepository that stored blood samples collected from trial participants. Table 1 shows the timeline of the PLCO trial.
Table 1.
Date | Event |
---|---|
October 1989 | PLC trial concept approved by the DCP BSC |
May 1990 | Request for proposals (RFP) issued |
August 1990 | NCI Executive Committee stopped the trial, demanded women be included |
January 1991 | PLCO trial concept approved by the DCP BSC |
September 1992 | Contracts awarded to 10 screening centers, a central laboratory, a coordinating center |
October 1992 | PLCO Biorepository concept approved by the DCP BSC |
November 1993 | Pilot phase enrollment began |
September 1994 | Main-phase enrollment began |
September 2001 | Enrollment completed |
September 2006 | Screening phase completed |
March 2009 | Prostate cancer mortality outcome result published |
June 2011 | Ovarian cancer mortality outcome result published |
October 2011 | Lung cancer mortality outcome result published |
October 2011 | Centralized follow-up begins |
March 2012 | Updated prostate cancer mortality outcome result published |
May 2012 | Colorectal cancer mortality outcome result published |
October 2015 | Planned completion of centralized follow-up |
* BSC = Board of Scientific Counselors; DCP = Division of Cancer Prevention; NCI = National Cancer Institute; PLC = Prostate, Lung, Colorectal Trial; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
The PLCO was funded through a contract mechanism administered by the NCI’s DCP. The major DCP PLCO contracts were for 10 screening centers, a central laboratory, and a coordinating center. In addition, there was a separate DCP contract for a data management and analysis center; this contract supported both PLCO activities and other division projects. An intramural NCI division, the Division of Cancer Epidemiology and Genetics (DCEG), used an existing specimen storage contract to physically house the PLCO biorepository.
Trial Protocol and Population
The design of PLCO called for 148 000 men and women to be randomized to either an intervention or a usual care arm at multiple screening centers across the United States (2). The primary inclusion criterion was age 55 to 74 years at entry. The primary exclusion criteria included history of a PLCO cancer and current treatment for cancer (except basal or squamous cell skin cancer). Enrollment began in 1993 and finished in 2001. Before October 1996, women with no ovaries were excluded, and from April 1995 onward, subjects with a lower gastrointestinal endoscopy within the past 3 years and men with more than one PSA test in the past 3 years were excluded. The original design specified 10 years of follow-up from randomization; this was increased to 13 years of follow-up early in the trial.
Men and women in the intervention arm received postero-anterior view CXR at baseline (T0) and then annually for 3 more years (T1–T3) and received FSG at baseline (T0) and then study year 5 (T5). Subjects randomized through April 1995 received FSG at year 3 instead of year 5; additionally, for subjects randomized after April 1995, never smokers were not offered the final (year 3) CXR. For sex-specific screens, intervention arm men received PSA and DRE tests at baseline and then annually for 5 years (PSA) and 3 years (DRE), whereas intervention arm women received CA125 and TVU at baseline and then annually for 5 years (CA125) and 3 years (TVU). Subjects randomized before the end of April 1995 received PSA or CA125 only through study year 3. Year 4 and year 5 PSA and CA125 screening were added in conjunction with the increase from 10 to 13 years in follow-up for the trial. Usual care arm subjects received no screening from the trial.
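The schedule described above depends on sex, randomization date, and smoking status, so it can help to see it as a small lookup. The following Python sketch is illustrative only and is not trial software: the function name `screening_schedule`, the `"M"`/`"F"` coding, and the reading of "through April 1995" as on or before 30 April 1995 are all assumptions made for this example.

```python
from datetime import date

# Assumption: "through April 1995" is read as on or before 30 April 1995.
PROTOCOL_CHANGE = date(1995, 4, 30)


def screening_schedule(sex, randomized, never_smoker):
    """Return {modality: [study years]} for one intervention-arm subject.

    Hypothetical helper that encodes the schedule as described in the text.
    """
    early = randomized <= PROTOCOL_CHANGE

    # CXR: baseline plus three annual screens (T0-T3); never smokers
    # randomized after April 1995 were not offered the year 3 CXR.
    cxr_last = 2 if (never_smoker and not early) else 3

    schedule = {
        "CXR": list(range(cxr_last + 1)),
        # FSG: baseline and year 5, or baseline and year 3 for early subjects.
        "FSG": [0, 3] if early else [0, 5],
    }

    # PSA and CA125 ran through year 5, or through year 3 for early subjects.
    blood_last = 3 if early else 5
    if sex == "M":
        schedule["PSA"] = list(range(blood_last + 1))
        schedule["DRE"] = [0, 1, 2, 3]
    elif sex == "F":
        schedule["CA125"] = list(range(blood_last + 1))
        schedule["TVU"] = [0, 1, 2, 3]
    else:
        raise ValueError("sex must be 'M' or 'F'")
    return schedule


late_man = screening_schedule("M", date(1998, 6, 1), never_smoker=True)
early_woman = screening_schedule("F", date(1994, 11, 15), never_smoker=False)
```

Usual care arm subjects are not modeled because they received no screening from the trial.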
Blood was collected at the screening centers for PSA and CA125 testing and sent to the central laboratory where the assays were performed. All other screening procedures were performed and interpreted at the screening centers. Additional blood was collected and processed at each screening visit for future research and sent to a central biorepository for long-term storage.
The primary endpoint for each of the four organ sites within the PLCO trial was cause-specific mortality through 13 years of follow-up. Secondary endpoints included screening compliance and positivity rates, diagnostic follow-up, sensitivity and specificity, cancer incidence and stage distribution, and all-cause mortality.
In the trial population, 50.5% were women, and 35.9% were aged 65 years or older at randomization. By race/ethnicity, 85.6% were non-Hispanic white, 5.0% were non-Hispanic black, 1.8% were Hispanic, and 3.6% were Asian (4.0% were other or unknown). Slightly more than half of subjects were current or former smokers, and about one-third were college graduates.
Major Trial Results
The primary outcome results for the four PLCO cancers were published in 2009 to 2012. Each described an intent-to-screen analysis of cancer-specific mortality and cancer incidence through a maximum of 13 years of follow-up. Final enrollment for PLCO was 154 901. For prostate cancer screening with PSA and DRE, a slightly (but not statistically significantly) elevated prostate cancer mortality relative risk (RR) for the intervention arm was observed, with a relative risk of 1.09 (95% confidence interval [CI] = 0.87 to 1.36). An excess of diagnosed cases (through 13 years) was observed in the intervention arm, with incidence relative risk of 1.12 (95% CI = 1.07 to 1.17) (3,4). For ovarian cancer screening with CA125 and TVU, the ovarian cancer mortality relative risk through 13 years of follow-up was 1.18 (95% CI = 0.82 to 1.71); a modest, borderline statistically significant increase in incidence was observed (RR = 1.21; 95% CI = 0.99 to 1.48) (5). Screening for lung cancer (with CXR) also did not show a mortality reduction, with a relative risk of 0.99 (95% CI = 0.87 to 1.22) (6). FSG screening for colorectal cancer was the only PLCO modality to demonstrate a statistically significant reduction in cause-specific mortality, with a relative risk of 0.74 (95% CI = 0.63 to 0.87) (7). Importantly, a reduction in colorectal cancer incidence was also observed for the intervention vs usual care arm, with incidence relative risk of 0.79 (95% CI = 0.72 to 0.85).
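As a reading aid, the statements about statistical significance above follow from whether each 95% confidence interval excludes a relative risk of 1. The short Python sketch below applies that rule to the figures quoted in this section; the dictionary labels and the helper name `excludes_null` are chosen for this example only.

```python
def excludes_null(ci_low, ci_high, null=1.0):
    """True when the confidence interval lies entirely on one side of the null."""
    return ci_high < null or ci_low > null


# (RR, lower 95% limit, upper 95% limit), intervention vs usual care,
# as quoted in the text above.
results = {
    "prostate mortality": (1.09, 0.87, 1.36),
    "prostate incidence": (1.12, 1.07, 1.17),
    "ovarian mortality": (1.18, 0.82, 1.71),
    "ovarian incidence": (1.21, 0.99, 1.48),
    "lung mortality": (0.99, 0.87, 1.22),
    "colorectal mortality": (0.74, 0.63, 0.87),
    "colorectal incidence": (0.79, 0.72, 0.85),
}

significant = [
    name for name, (rr, low, high) in results.items() if excludes_null(low, high)
]
for name in significant:
    print(name, results[name])
```

Under this rule, only the prostate incidence excess and the two colorectal reductions have intervals that exclude 1; the ovarian incidence interval (0.99 to 1.48) narrowly includes it, which matches the "borderline" description.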
The harms of screening were also assessed in the primary outcome manuscripts. These included direct complications (primarily minor) from PLCO screens, false-positive screens, diagnostic procedures arising from false-positive screens and resultant complications, and overdiagnosed cancers (3–7).
PLCO Resources and Management
Data Resources and Management
Table 2 summarizes the data collected by PLCO. The data types include four major questionnaires (baseline, supplemental, and two dietary history), screening test results, diagnostic procedures performed after positive screens and in association with PLCO cancer diagnoses, all-cancer incidence, and all-cause mortality.
Table 2.
Data type | Timing | Arm | Data description |
---|---|---|---|
Baseline questionnaire | Baseline | Both arms | Baseline risk factors such as demographics, history of health, smoking, NSAIDS, and sex-specific details |
Dietary questionnaire | Baseline | Intervention | Food frequency, nutrient data in grams |
Diet history questionnaire | 1998 | Both arms | Food frequency, nutrient data in grams |
Supplemental questionnaire | 2006–2008 | Both arms | Risk factors such as demographics, history of health, smoking, medications, physical activity, and sex-specific details |
Screening results | Study years 0–5 | Intervention | Positive/negative screen, quantitative levels (PSA, CA125), descriptions of abnormalities (TVU, FSG, CXR, DRE) |
Diagnostic procedures | Study years 0–5 | Both arms | Types and dates of diagnostic procedures after positive PLCO screens and/or associated with diagnoses of a PLCO cancer |
Cancer diagnoses | All years | Both arms | Cancer type (ICDO codes) and date of diagnosis. For PLCO cancers (and breast), stage, grade and initial treatment |
All-cause mortality | All years | Both arms | ICD underlying cause of death and date of death. Death review by expert panel for deaths possibly related to PLCO cancers |
Nonprotocol screening use questionnaire | 2001–2012 | Both arms | Use, reason for use, and timing of screening modalities (including exams used in PLCO) outside of PLCO protocol. Assessed for a sample of PLCO subjects |
* CXR = chest radiograph; DRE = digital rectal exam; FSG = flexible sigmoidoscopy; ICD = International Classification of Diseases; ICDO = International Classification of Diseases for Oncology; NSAIDS = Nonsteroidal anti-inflammatory drugs; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial; PSA = prostate-specific antigen; TVU = transvaginal ultrasound.
There were also several special studies conducted in PLCO, which collected additional data and/or performed additional exams. Two of the largest were a study of surveillance colonoscopy and a study of aberrant crypt foci. In the former, a sample of PLCO subjects was questioned about surveillance colonoscopy usage and findings; confirmatory medical records were abstracted. In the latter, a sample of PLCO subjects underwent rectal exams to detect aberrant crypt foci, with tissue samples taken and analyzed for molecular markers.
The PLCO protocol specified that each screening center would follow their subjects for cancer incidence and mortality endpoints for at least 13 years from randomization. In 2011, after the screening and planned follow-up had been completed, continued follow-up was switched to a centralized operation where a single coordinating center performed the follow-up; this centralized follow-up will continue until at least 2015. In general, the current PLCO data available for research are censored at 13 years from randomization or December 31, 2009, whichever came first. For this dataset, median follow-up was 11.9 years; approximately 26 000 (17%) subjects developed a verified cancer and 22 000 (14%) died.
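The censoring rule for the research dataset (13 years from randomization or December 31, 2009, whichever came first) can be written as a small date calculation. This Python sketch is an illustration under stated assumptions: the function names are hypothetical, and mapping a 29 February randomization date to 28 February in a non-leap anniversary year is a choice made here, not something the source specifies.

```python
from datetime import date

DATA_CUTOFF = date(2009, 12, 31)   # calendar cutoff stated in the text
FOLLOW_UP_YEARS = 13               # follow-up window stated in the text


def add_years(d, years):
    """Return the same calendar day `years` later.

    Assumption: 29 February maps to 28 February when the target year
    is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def censoring_date(randomized):
    """Earlier of the 13-year anniversary and the calendar cutoff."""
    return min(add_years(randomized, FOLLOW_UP_YEARS), DATA_CUTOFF)


print(censoring_date(date(1994, 9, 15)))  # anniversary falls before the cutoff
print(censoring_date(date(2001, 6, 1)))   # cutoff falls before the anniversary
```

Because enrollment ran until 2001, later enrollees reach the calendar cutoff before their 13-year anniversary, which is consistent with the median follow-up of 11.9 years being below 13.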
The management process for conducting the PLCO trial has been described in detail previously (8). Briefly, this involved protocol development, protocol implementation (including documentation and training), quality assurance, laboratory quality control, site visits and record audits, creation of data structures, and data monitoring and reporting. Below we describe the collaborative management of PLCO as a research enterprise.
The PLCO Steering Committee, consisting of the NCI project officer, chief statistician, and their associates, as well as the site principal investigators, was responsible for carrying out and publishing the major analyses related to the primary and secondary trial endpoints. For each endpoint, or group of endpoints, analysis and publication were performed in real time after the collection and processing of all relevant data. For example, for the group of secondary endpoints comprising screening compliance, screen positivity rates, and diagnostic follow-up for the baseline screening round, manuscripts for each of the four PLCO organ sites were published in 2005, about 3 years after the final relevant events occurred.
A subcommittee structure was developed to manage research projects outside of the major analyses of trial primary and secondary endpoints, including projects directly related to the screening interventions (and their downstream effects). Initially, there was a subcommittee for each of the PLCO cancers (prostate, lung, colorectal, and ovarian), as well as a publications subcommittee; later, additional subcommittees were added for other cancer sites or research areas. Projects were proposed, scientifically reviewed and approved, and tracked through the subcommittee structure. PLCO principal investigators and co-investigators (and others associated with the PLCO centers) and NCI staff from DCP (extramural) and DCEG (intramural) made up the subcommittee membership and could propose projects. Outside investigators could also propose projects, provided a PLCO or NCI investigator was a collaborator. The publications subcommittee reviewed and approved all publications arising from PLCO projects. Note that this subcommittee structure was designed for so-called “data only” projects (ie, projects not requiring the use of biospecimens); projects requiring biospecimens are discussed in the next section. The PLCO coordinating center, along with the data management and analysis center, provided critical support for the subcommittee process and also provided data analysis services for approved projects.
In general, for projects related to trial interventions, only after the primary or secondary endpoint results were published were the relevant data released for investigator use. For example, after the paper summarizing the results of the baseline screening round (and subsequent diagnostic follow-up) for prostate cancer was published, all data relating to this baseline screen (and follow-up) were made available to investigators for further analyses. With publication of the primary endpoint results for a given cancer site, all data for that site were made available. For projects not related to trial interventions—for example, projects on cancer etiology—generally only cases (and controls) in the intervention arm could be selected for study until the publication of the primary endpoint results for that cancer site (this applied only to the PLCO cancers).
As of November 2012, the process for submitting and approving data-only proposals changed to a more open, Web-based system—the Cancer Data Access System (CDAS). Under CDAS, any researcher can submit an application for standardized PLCO datasets, and there is no scientific review. Researchers must stipulate that they will use the data for research purposes only and not attempt to identify study participants, as well as sign a data transfer agreement. Proposals are reviewed by NCI staff solely to determine whether they constitute legitimate research and whether they could feasibly be performed with PLCO data. A list of approved projects and related PLCO publications is available on the CDAS website (https://biometry.nci.nih.gov/cdas/).
Biospecimen Resource and Management
As mentioned above, at each of the six screening rounds extra blood samples were collected from consented screening arm participants under an institutional review board–approved protocol to be used for future research. Amounts collected and fractionation protocols varied by study year (Table 3). Serum, plasma, red blood cell, and buffy coat samples are available for most study years. These samples are stored in −80°C upright freezers or in liquid nitrogen tanks. Cryopreserved whole blood was collected (only) at T3; these samples contain viable lymphocytes that can be used directly for in vitro functional studies or for Epstein-Barr virus (EBV) transformation to make cell lines. Whole blood samples are stored in vapor-phase liquid nitrogen freezers. Buccal cells were collected from control arm participants; these samples were intended primarily as a DNA source, although RNA can be extracted as well (up to 10 μg of DNA may be extracted from each vial).
Table 3.
Study year | Specimen type | Vacutainer | Vials/subject | Volume/Vial (mL) |
---|---|---|---|---|
T0 | Buffy coat | Green | 1 | 1.8 |
T3 | Buffy coat | Green | 1 | 1.8 |
T3 | Buffy coat | Lavender | 1 | 1.8 |
T5 | Buffy coat | Lavender | 2 | 1.8 |
T4 | Buffy coat/RBC | Lavender | 1 | 3.6 |
T0 | Plasma | Green | 2 | 1.8 |
T3 | Plasma | Green | 2 | 1.8 |
T3 | Plasma | Lavender | 4 | 0.8 |
T4 | Plasma | Lavender | 1 | 3.6 |
T5 | Plasma | Lavender | 4 | 1.8 |
T0 | RBC | Green | 1 | 1.8 |
T3 | RBC | Lavender | 1 | 1.8 |
T5 | RBC | Lavender | 1 | 1.8 |
T0 | Serum | Red | 4 | 1.8 |
T1 | Serum | Red | 2 | 1.8 |
T2 | Serum | Red | 2 | 1.8 |
T4 | Serum | Red | 2 | 1.8 |
T5 | Serum | Red | 2 | 1.8 |
T0 | Serum zinc free | Royal blue | 1 | 1.8 |
T3 | Whole blood | Yellow | 12 | 1.8 |
* RBC = red blood cell.
In January 2006, NCI began to collect pathology tissue samples from PLCO participants who developed a cancer. Formalin-fixed, paraffin-embedded tissue blocks were obtained from pathology departments for a subset of cases for selected cancer sites, and tissue microarrays were constructed from them. Additional tissue cores were stored for DNA, RNA, or protein extraction. Tissue microarrays for colorectal adenomas and colorectal, ovarian, prostate, lung, and breast cancers are available, with numbers of cases (total cores) ranging from 130 (n = 968) for ovarian cancer to 807 (n = 5557) for breast cancer. An added value of the tissue microarrays is that nearly all have corresponding prediagnostic blood samples or buccal cells.
The PLCO Biorepository Resource has been available for research to the general scientific community since 2005. Any investigator wishing to use PLCO biospecimens can submit a research application to the Etiologic and Early Marker Studies (EEMS) Program online at http://www.plcostars.com. Applications are accepted twice a year (June and December) and are reviewed based on the criteria of scientific merit, suitability for using the PLCO biospecimens, and parsimonious use of specimens. The PLCO Study Tracking and Review System (STaRS) is a Web application that supports online submission, peer review, and tracking of applications. Abstracts for approved applications are posted on STaRS, allowing prospective investigators to search for past and current research activities so that duplicate research efforts can be avoided.
Peer review of the applications is carried out by the EEMS Review Panel, which comprises two NCI DCP program officials, two NCI DCEG investigators, two PLCO Screening Center principal investigators, two statistical reviewers, and two extramural scientists not associated with PLCO. The DCP and DCEG review panel members are not directly involved in the management of PLCO. Reviewers typically serve a 2-to-3–year term. Ad hoc reviewers with particular areas of expertise are asked to join the panel when needed. The panel critiques and scores each application based on the review criteria.
DCP and DCEG jointly manage the PLCO Biorepository and oversee the scientific use of this resource through the EEMS program. Both divisions, with contract assistance from the coordinating center and data management center, support the PLCO EEMS infrastructure and provide extensive capabilities in biospecimen management and tracking, scientific coordination, administration, and strategic planning. The EEMS Steering Committee, which comprises DCP and DCEG scientists, develops management policies and procedures, provides oversight and direction, and resolves conflicts over management and policy issues. Additionally, the Steering Committee reviews the panel critiques and decides on final approval of proposals, sometimes requesting further clarification from the applicant. All Steering Committee decisions are subject to review by the directors of DCP and DCEG.
The above-described review process, however, covers only access to the PLCO biospecimens; funding is not provided. Therefore, extramural investigators must apply for funding separately. This two-tiered application process may have impeded the use of the PLCO biorepository resource by extramural investigators. To alleviate this problem, NCI's DCP issued a new funding opportunity announcement in December 2012 to facilitate use of the PLCO biospecimens by extramural researchers. This funding opportunity announcement (PAR-13-036) allows application for NCI funding and for PLCO biospecimens at the same time. Applications that are funded automatically obtain access to the specimens.
The PLCO Biorepository currently stores approximately 2.8 million vials of the above biologic specimens. These specimens and associated data are suitable for a wide range of research encompassing the continuum from etiologic studies on the causes and natural history of cancers to early markers studies aiming to develop reproducible, reliable biomarkers of early disease for cancer screening.
Productivity of PLCO
The PLCO as a broad research enterprise encompasses both direct trial-related and non–trial-related research. To assess the productivity of the overall research enterprise, we conducted a publications analysis. All research publications (published through 2012) that used PLCO data or specimens were included; design papers, review articles, commentaries, and opinion pieces were excluded.
We subcategorized PLCO research as follows. “Trial-related research” covers analyses of the screening examinations and of all potential downstream effects, including diagnostic follow-up, cancer incidence and characteristics, and mortality; it also covers studies related to trial recruitment and retention and other research concerning trial implementation. “Research using data only” includes studies of cancer etiology that use PLCO data but not biospecimens (or data derived from biospecimens), statistical and epidemiological methods studies that use data to illustrate or motivate novel methods, and risk modeling studies using PLCO data. Finally, “research using biospecimens” includes molecular epidemiological studies of various biochemical and genetic risk factors for cancer, studies of early detection biomarkers, and laboratory methods studies.
As of December 31, 2012, we had identified 335 PLCO research manuscripts (Table 4). More than half (n = 186 of 335) fell into the “research using biospecimens” category; a majority of these are molecular epidemiological studies of genetic and biochemical risk factors for cancer. Note that all 186 of these publications resulted from the use of blood/buccal specimens; the pathology tumor tissues are a relatively new addition to the PLCO biorepository, and only a handful of studies have been initiated to date. Within this group, etiology—genetic studies, or genotype–phenotype analyses, predominated (n = 132 of 186). Of the genetic studies, 37% (n = 49 of 132) were genome-wide association studies. PLCO has contributed samples to genome-wide association studies of prostate, breast, lung, pancreatic, colorectal, kidney, bladder, and ovarian cancers. The remaining two-thirds of the genetic studies took the candidate gene approach and focused on specific genes and pathways previously implicated in cancers. Only a small fraction (n = 4 of 186) focused on early detection biomarkers, which points to a potential gap as well as a research opportunity. Just under one-quarter (22%) of publications were categorized as trial-related research. These include interim analyses of enrollment and screening results, recruitment strategies and other implementation methodologies, quality-of-life and cost analyses, and final mortality outcome results. The category “research using data only” comprised another 22% of the total publications; these used the extensive data collected through the questionnaires and during screening and follow-up (Table 2). A majority of them are classical epidemiological studies of various dietary and lifestyle risk factors, such as smoking, alcohol, meat consumption, and physical activity.
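The category shares quoted above can be re-derived from the publication counts. The Python sketch below does this arithmetic; the counts of 75 and 74 for the trial-related and data-only categories are the subtotals implied by the category rows of Table 4 (11 + 64 and 53 + 19 + 2), and the variable names are illustrative.

```python
# Publication counts by top-level category (subtotals from Table 4).
counts = {
    "trial related": 11 + 64,
    "research using data only": 53 + 19 + 2,
    "research using biospecimens": 4 + 42 + 132 + 7 + 1,
}

total = sum(counts.values())

# Percentage share of each category, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Share of genetic etiology studies that were genome-wide association studies.
gwas_share = round(100 * 49 / 132, 1)

for name, pct in shares.items():
    print(f"{name}: {counts[name]} of {total} ({pct}%)")
print(f"GWAS among genetic studies: {gwas_share}%")
```

The subtotals sum to the 335 publications reported, and the two smaller categories each come to about 22%, in line with the text.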
Table 4.
Article type | Number of articles | Total citations through 2012† | Avg. total citations through 2012 | Total projected 10-year citations‡ | Average total projected 10-year citations | Average number of cohorts per article |
---|---|---|---|---|---|---|
Trial related | ||||||
Trial implementation | 11 | 121 | 12.10 | 190.09 | 19.01 | 1.09 |
Trial results/related to trial intervention | 64 | 2464 | 40.39 | 7636.68 | 152.73 | 1.06 |
Research using data only | ||||||
Etiology—diet/lifestyle/risk model/others | 53 | 1304 | 25.57 | 4022.37 | 85.58 | 2.36 |
Methods—statistics/epidemiology | 19 | 244 | 12.84 | 556.48 | 30.92 | 1.47 |
Others | 2 | 19 | 9.50 | 50.11 | 25.06 | 1.00 |
Research using biospecimens (EEMS) | ||||||
Early detection biomarkers | 4 | 81 | 20.25 | 731.05 | 182.76 | 1.25 |
Etiology— biochemical | 42 | 949 | 22.60 | 3902.15 | 102.69 | 2.26 |
Etiology—genetic | 132 | 4412 | 33.94 | 15140.15 | 131.65 | 6.21 |
Methods | 7 | 63 | 10.50 | 176.78 | 35.36 | 2.00 |
Others | 1 | 22 | 22.00 | 295.53 | 295.53 | 1.00 |
Grand total | 335 | 9679 | 29.69 | 32701.39 | 112.76 | 3.49 |
* EEMS = Etiologic and Early Marker Studies; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
† Citations were downloaded from SCOPUS, after excluding self-citations.
‡ Through 10 years from publication date. Method for calculating the projected citation is detailed in Appendix 1.
Citation data from SCOPUS were used to assess the scientific impact of the PLCO publications. As of December 31, 2012, this body of research publications had received 9679 citations (excluding self-citations), an average of 29.7 citations per article. Two research categories had substantially more citations than the others: etiology—genetic, with 4412 citations and an average of 33.9 citations per article; and trial results, with 2464 citations and an average of 40.4 citations per article. On closer examination, the high citation count of the genetic category is driven by the genome-wide association studies, whereas that of the trial results category is driven by the final mortality outcome publications of the trial. The h index of the PLCO research enterprise (defined as the largest number, h, such that h publications each have at least h citations) was 45. The impact index (9), a metric designed to be independent of the size of an institution or research project by dividing the h index by N^m, where N is the number of publications and m is a scaling exponent set at 0.4, was 4.4. Note that the h index and impact index were calculated excluding self-citations.
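For readers unfamiliar with these two metrics, the following Python sketch shows one way to compute them. It assumes the impact index takes the form h / N^m, which reproduces the reported value of 4.4 from h = 45, N = 335, and m = 0.4. The per-article citation list in the example is a made-up toy input, not PLCO data.

```python
def h_index(citations):
    """Largest h such that h publications each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def impact_index(h, n_publications, m=0.4):
    """h index divided by N**m (assumed form of the size-adjusted index)."""
    return h / n_publications ** m


# Toy example with hypothetical citation counts for five articles.
toy_h = h_index([10, 8, 5, 4, 3])

# Values reported in the text: h = 45 across N = 335 publications.
plco_impact = impact_index(45, 335)
print(toy_h, round(plco_impact, 1))
```

Dividing by N^m rather than N tempers the advantage that a large body of publications has in accumulating a high h index, which is why the index is described as independent of project size.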
We also estimated a total citations count over the 10-year period after each publication. Because not all papers had fully 10 years of follow-up citations data, we developed a projection method to do this (see Appendix 1 for details). The total projected number of citations through 10 years was 32 701, or 113 (projected) citations per article. The relative proportion of projected total citations for each research category remained similar to the observed total citations.
Many published etiologic studies included other cohorts in addition to PLCO, usually to achieve the desired statistical power. This is especially true for the genome-wide association study publications, which included an average of 10.4 cohorts per study (range = 1–48; median = 7). PLCO also contributed to other consortia studies, including the Breast and Prostate Cancer Cohort Consortium and the Vitamin D Pooling Project. Overall, 116 studies included at least one other cohort, whereas 219 involved only PLCO.
Figure 2 shows the number of publications over time by research type. The number of publications has increased steadily over the last 10 years. Trial-related publications were relatively constant, ranging from one to eight publications per year. In contrast, there was a sharp increase in research using biospecimens starting in 2005, the year in which EEMS was established to provide the scientific community with equitable access to this resource.
The impact factor of a journal provides another measure of scientific impact for publications. PLCO research publications are clustered around the impact factor bin of 4 to 5, with 90% of all articles published in journals with impact factor less than 15. There were 27 articles published in journals with impact factor greater than 30; these include the four mortality results papers published in JAMA and the New England Journal of Medicine and 16 genome-wide association studies published in Nature or Nature Genetics.
Table 5 shows publications by research type crossed with cancer site. As expected, the PLCO cancers (prostate, lung, colorectal, ovarian) are well represented, with these four cancers together accounting for more than half of all publications (n = 191 of 335). Prostate cancer has the most publications (n = 93) overall as well as in each research category; this may reflect the fact that prostate cancer is common among older men, and therefore a large number of cases are available for research. The same is true for colorectal adenoma (included in the colorectal cancer category). Breast cancer, although not a PLCO cancer, is third in number of publications, highlighting the fact that the PLCO resource is not limited to the PLCO cancers. The table also displays the case count for each cancer site to give an idea of how the number of publications correlates with the number of cases available for study.
Table 5.
Cancer site | Trial-related publications | Publications from research using data only | Publications from research using biospecimens (EEMS) | Total publications | Number of cases† |
---|---|---|---|---|---|
Prostate | 21 | 12 | 61 | 94 | 8468 |
Colorectal | 14 | 15 | 29 | 58 | 2291 |
Breast | | 8 | 17 | 25 | 4438 |
Lung | 7 | 3 | 12 | 22 | 3567 |
Pancreas | | 8 | 12 | 20 | 753 |
Ovary | 12 | | 6 | 18 | 372 |
Hematologic‡ | | 1 | 7 | 8 | 2555 |
Bladder | | 2 | 2 | 4 | 1430 |
Kidney | 2 | 2 | 776 | ||
Osteosarcoma | 2 | 2 | 38 | ||
Thyroid | 2 | 2 | 248 | ||
Endometrium | 1 | 1 | 703 | ||
Head and neck | 1 | 1 | 576 | ||
Stomach | 1 | 1 | 343 | ||
Multiple sites | 1 | 1 | 6 | 8 | |
Not applicable§ | 20 | 21 | 28 | 69 | |
Total | 75 | 74 | 186 | 335 |
* EEMS = Etiologic and Early Marker Studies; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
† Total number of cases from both arms, ascertained through 13 years of follow-up.
‡ Includes lymphoma, leukemia, and multiple myeloma.
§ Includes studies of common risk factors, such as smoking, obesity, and vitamin D, as well as methodological studies.
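One way to relate publication counts to case counts is to compute publications per 1000 cases for each site. The sketch below is illustrative only: it copies the total publications and case counts from Table 5, and the rate itself is an assumption of this example, not a metric used in the analysis.

```python
# (total publications, number of cases) per cancer site, copied from Table 5.
table5 = {
    "Prostate": (94, 8468),
    "Colorectal": (58, 2291),
    "Breast": (25, 4438),
    "Lung": (22, 3567),
    "Pancreas": (20, 753),
    "Ovary": (18, 372),
    "Hematologic": (8, 2555),
    "Bladder": (4, 1430),
    "Kidney": (2, 776),
    "Osteosarcoma": (2, 38),
    "Thyroid": (2, 248),
    "Endometrium": (1, 703),
    "Head and neck": (1, 576),
    "Stomach": (1, 343),
}

# Publications per 1000 cases for each site.
rates = {site: 1000 * pubs / cases for site, (pubs, cases) in table5.items()}

# Print sites from the highest to the lowest rate.
for site, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{site}: {rate:.1f} publications per 1000 cases")
```

Rates for sites with only one or two publications are unstable and should be read with caution.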
Impact on Clinical Guidelines
To assess how results from PLCO as a screening trial impacted clinical guidelines, we examined how PLCO findings were used by the United States Preventive Services Task Force (USPSTF) in making recommendations concerning cancer screening.
In 2012, the USPSTF released its Screening for Prostate Cancer Recommendations Statement, giving PSA screening a “D” rating (recommend against providing the service) (10). The results of two RCTs of prostate cancer screening were examined—PLCO and the European Randomized Study of Screening for Prostate Cancer, ERSPC (11). Based on the results of these two trials, the task force concluded that there was adequate evidence that PSA screening prevented from 0 to 1 prostate cancer deaths per 1000 men screened. The lack of clear benefit, in conjunction with the estimated harms, resulted in the “D” rating, meaning there is moderate or high certainty that the service has no net benefit or that the harms outweigh the benefits. Additionally, the task force stated that “the screening intervals, PSA thresholds, use of DRE, enrollee characteristics and follow-up diagnostic and treatment strategies used in the PLCO trial are most applicable to current U.S. settings and practice patterns” (10).
The USPSTF’s most recent recommendations for colorectal screening were in 2008, before the PLCO reported the colorectal component final outcome (12). The systematic evidence review supporting the 2008 recommendations statement did reference two earlier PLCO publications, which described screening results, including diagnostic follow-up and cancer (and adenoma) yield. The 2008 statement gave an “A” recommendation (recommend service) for various CRC screening modalities, including FSG, for the age group 50 to 75 years, similar to that of PLCO.
The Screening for Ovarian Cancer: USPSTF Reaffirmation Recommendation Statement was published in 2012 (13). It discussed three screening RCTs, including PLCO, the only one that has reported mortality outcome results to date. The task force reaffirmed its “D” rating for ovarian cancer screening, which was consistent with the PLCO trial results.
The USPSTF is expected to release its statement on lung cancer screening sometime in 2013.
Costs of PLCO
Table 6 shows the costs of PLCO by category. The total cost was $454 million over 20 years, adjusted to 2011 dollars. The majority of this amount, $284.5 million, or 63%, was for the 10 screening centers themselves, with another large fraction (26%) for the coordinating center and data management center. Approximately $25.5 million (6%) is estimated as the cost of the collection, processing, and storage of the blood samples and buccal cells comprising the original PLCO Biorepository, with an additional $11.5 million for the collection of pathology tissue samples, for a total of approximately $37 million; note the portion of these costs borne by the screening centers has been subtracted from the screening center total (and similarly for the coordinating and data management centers). Finally, a team of eight to ten full-time NCI employees, led by the PLCO project officer and the PLCO lead contracting (financial) officer, was involved in the day-to-day management of the trial (including scientific, operational, and financial oversight), the cost of which is estimated to be around $15 million.
Table 6.
Cost category | Total† |
---|---|
Coordinating center and data management center | 117500 |
Screening centers | 284500 |
Biorepository | |
Blood/buccal cell | 25500 |
Pathology tissue | 11500 |
NCI staff‡ | 15000 |
Total | 454000 |
* NCI = National Cancer Institute; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
† Through fiscal year 2012. In thousands. Adjusted to 2011 dollars.
‡ Approximate salaries and benefits.
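The percentages quoted in the text can be checked directly against Table 6. The short sketch below recomputes each category's share of the total; the dictionary keys are labels chosen for this example, and the amounts are the table values in thousands of 2011 dollars.

```python
# Table 6 amounts, in thousands of 2011 dollars.
costs_thousands = {
    "Coordinating center and data management center": 117_500,
    "Screening centers": 284_500,
    "Biorepository: blood/buccal cell": 25_500,
    "Biorepository: pathology tissue": 11_500,
    "NCI staff": 15_000,
}

total = sum(costs_thousands.values())

# Each category's percentage share of the total cost.
shares = {name: 100 * value / total for name, value in costs_thousands.items()}

for name, value in costs_thousands.items():
    print(f"{name}: ${value / 1000:.1f} million ({shares[name]:.0f}%)")
print(f"Total: ${total / 1000:.0f} million")
```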
Although the main phase of the PLCO trial concluded in 2011, the data and biospecimens will continue to be useful for future research. These resources will have long-term value beyond the initial cost of obtaining them. Only a small fraction of the stored biospecimens have been used so far. As of the end of 2012, about 13% of serum specimens from cancer patients had been depleted, with somewhat higher rates for buffy coat and buccal cells. During the same period, 186 research publications involving the use of PLCO biospecimens were published. Based on the low sample depletion rate, it is likely that the PLCO Biorepository will be able to continue to provide biospecimens for hundreds of research projects.
Part of the rationale for the design of PLCO, with the four screening trials performed simultaneously under a single administrative umbrella with the same cohort of subjects, was cost efficiency. If stand-alone trials had been performed for each of the four PLCO cancers with the same sample size as in PLCO, three times as many subjects would have been enrolled (note that the sizes for the prostate and ovarian components were effectively only half the full cohort size because they were restricted to a single sex). Many of the costs of PLCO, including recruitment and enrollment, participant tracking, questionnaire administration, blood collection and processing for the biorepository, medical record abstracting of incident cancers, and National Death Index (NDI) searches, were proportional to the total number of subjects enrolled. However, some costs, such as those of the screening tests themselves and of abstracting follow-up diagnostic procedures, would be the same regardless of how many separate studies were conducted. A rough estimate is that conducting four separate trials would have cost about 2.5 times as much as the four-in-one PLCO design.
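The enrollment and cost multiples above can be related through a simple two-part cost model. This model is an assumption made here for illustration and is not stated in the analysis: a share p of PLCO costs scales with enrollment and the remaining 1 - p is unchanged, so the cost multiple for separate trials is 3p + (1 - p). Under that assumption, the 2.5-fold estimate corresponds to roughly three-quarters of costs being proportional to enrollment.

```python
# Cohort fraction each stand-alone trial would need, relative to the full PLCO cohort.
# Prostate and ovarian components are single-sex, so each uses about half the cohort.
cohort_fraction = {"prostate": 0.5, "lung": 1.0, "colorectal": 1.0, "ovarian": 0.5}
enrollment_multiple = sum(cohort_fraction.values())


def cost_multiple(proportional_share, enrollment_multiple):
    """Relative cost of separate trials under the assumed two-part linear model."""
    return proportional_share * enrollment_multiple + (1 - proportional_share)


def implied_proportional_share(target_multiple, enrollment_multiple):
    """Invert the model: the share of costs that would have to scale with enrollment."""
    return (target_multiple - 1) / (enrollment_multiple - 1)


share = implied_proportional_share(2.5, enrollment_multiple)
print(enrollment_multiple, share, cost_multiple(share, enrollment_multiple))
```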
Discussion
We have described here the PLCO trial as a complex research enterprise and attempted to ascertain its productivity and costs. The experience with PLCO can provide guidance as to how similar projects in this era of large-scale collaborative science can be optimally managed.
The stated total costs ($454 million) of PLCO encompassed those of conducting the trial and of creating, maintaining and managing the PLCO resource. As noted above, most trial-related research was covered under these costs. In general, however, non–trial-related research was not included, and these projects incurred additional costs associated with conducting the research using the PLCO biospecimen and data resource. The majority (approximately three-quarters) of the PLCO research, as judged by the number of publications, was in fact not trial-related, and hence generally not funded through the major PLCO contracts. The preponderance of the nontrial research was conducted by researchers from the DCEG, an NCI intramural division. DCEG provided a large pool of able researchers (including fellows and tenure-track and tenured investigators), institutional knowledge of PLCO, dedicated laboratory facilities, relatively stable funding, and internal coordination for PLCO research.
In contrast, extramural investigators are often less familiar with PLCO, do not have institutional-level support, and often struggle to obtain NIH funding. Further, for research involving biospecimens, an extramural investigator must go through two separate application processes, one for specimen access and one for NIH funding. This two-step process adds an extra burden and lengthens the timeline considerably. This was why the NIH funding opportunity announcement for PLCO biospecimen research described above was initiated; it is our hope that it will facilitate research using the PLCO biospecimens by non-NCI scientists.
It should be noted that, whereas more than half (55%) of the PLCO publications were based on research using blood/buccal biospecimens, the additional cost of the biorepository was quite modest. Of the total $454 million in costs documented for PLCO, only $25.5 million was for the blood/buccal cell portion of the biorepository. This highlights the value of adding a biorepository to a planned cohort study.
As discussed, PLCO was both an RCT and a cohort study. With respect to evaluating the costs and benefits of PLCO as an RCT, efforts have been made to assess the costs and direct net benefits to society of RCTs. For example, an analysis of all RCTs conducted by the NIH’s National Institute of Neurologic Disorders and Stroke examined the results of each trial intervention in terms of increased (or decreased) use after the trial, the costs (or savings) of the net change in use, and the net change in quality-adjusted life years gained (valued at the per capita annual US gross domestic product) (14). A positive trial that increased use of an intervention could result in a net benefit or a net loss depending on whether the net change in treatment costs, plus the cost of the trial itself, was outweighed by the change in quality-adjusted life years. A negative trial could have net benefit by reducing the use (and hence costs) associated with an intervention that was not beneficial.
We performed a similar analysis for PLCO, using the example of prostate cancer screening. The costs associated with PSA screening are substantial, largely because of the well-described phenomenon of overdiagnosis. A European analysis estimated the added costs of PSA screening for a cohort of 100 000 men screened annually (biannually) over 25 years to be 46 million (40 million) Euro (15). Based on US estimates of PSA screening use from the National Health Interview Survey and translating to current dollars, the costs of current PSA screening in the United States would be estimated at roughly $2.5 billion per year. This clearly dwarfs the cost of the prostate component of PLCO. However, because of the mixed results of the two major trials, PLCO and ERSPC, one of which reported benefit (ERSPC) and one of which did not (PLCO), along with other non-RCT evidence interpreted in various ways, there is still no clear consensus on the efficacy of PSA screening, and the test continues to be widely used, despite the recent USPSTF “D” recommendation (not recommended for screening).
Another way of assessing trials is by their impact on guidelines issued by the USPSTF and other bodies. As discussed above, PLCO did contribute substantially to USPSTF screening guidelines for prostate and ovarian cancer, as well as to guidelines promulgated by other entities. We evaluated the productivity of PLCO as a cohort study primarily through our publications analysis. Other measures have also been used for evaluating cohort study productivity. For example, Colditz and Winn employed three metrics in evaluating the body of research produced by the Nurses’ Health Study, a large cohort study that analyzed the effect of a broad range of lifestyle, dietary, and hormonal factors on cancer incidence and all-cause mortality (16). Their “discovery” metric, defined as explaining the etiology of disease, was assessed through examination of scientific publications and was thus similar to our publications analysis. However, they also defined two other metrics that assessed the translation of primary research findings into population health impacts. These were development (providing a basis for control measures and prevention procedures) and delivery (implementation or use of findings) (17). The development metric included those findings from the Nurses’ Health Study that were cited as having an impact on public health, for example, as having influenced published guidelines or as having contributed to causal inference about an exposure and outcome. Delivery impacts included a change in US Food and Drug Administration labeling requirements for trans fatty acids in foods. Because much of the PLCO research is relatively recent, more time is needed to assess how PLCO (nontrial) research results are translated into public health interventions or effects.
To put the PLCO publications citation statistics into general perspective, we compared them with an analysis of research publications by all medical schools, which can be seen as representing an average citation level across all fields of medical research. Hendrix examined medical school research publications from 1997 to 2007, with citation analysis performed at the end of that period, which is similar to the current analysis of PLCO research publications from 2002 to 2012, with citation analysis performed at the end of 2012 (18). For all 123 accredited US medical schools, the mean impact index of their research output was 3.2 (standard deviation = 0.63); the average number of citations per article was 14.1. This compares with an impact index of 4.4 for PLCO, and an average number of citations per article of 30.
Conclusions
PLCO successfully completed four randomized controlled trials, the results of which have influenced clinical guidelines. Further, PLCO produced a large and complex research resource. The efficient and effective management of this resource has led to the productive use of its data and specimens.
Appendix 1: Projected Citations
We searched SCOPUS for all 1999 publications with search words of Cancer AND Epidemiology or Prevention. Of these, we eliminated review and opinion papers, studies with cell lines or animals, and other studies deemed not to represent clinical or population studies in cancer prevention or epidemiology. For the remaining 650 manuscripts, we extracted the number of citations for each year through 2011. We then modeled the total citations through the next 10 years, not counting the year of publication (ie, through 2009), denoted as Cit10, as a function of the number of citations in the first year, first two years, and so on. Specifically, we calculated the mean of Cit10/Citn, where Citn is the total number of citations in the first n years (the publication year is not counted). Then, for each PLCO publication, the number of years available for observed citations was determined and the corresponding multiplier was applied to compute projected citations. Note that for publications with 10 or more years of citations, the actual number of citations was used.
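The projection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy reference data are hypothetical, and reference papers with zero citations in the first n years are skipped when averaging Cit10/Citn, a detail the appendix does not specify.

```python
from statistics import mean


def projection_multipliers(reference_yearly, horizon=10):
    """Mean of Cit10/Citn over reference papers, for n = 1..horizon.

    reference_yearly holds one list per reference paper, giving citations in
    each year after publication (publication year excluded).
    """
    multipliers = {}
    for n in range(1, horizon + 1):
        ratios = []
        for yearly in reference_yearly:
            cit_n = sum(yearly[:n])
            cit_10 = sum(yearly[:horizon])
            if cit_n > 0:  # assumption: skip papers with no early citations
                ratios.append(cit_10 / cit_n)
        multipliers[n] = mean(ratios)
    return multipliers


def projected_citations(observed, years_observed, multipliers, horizon=10):
    """Scale observed citations by the multiplier for the years available."""
    if years_observed < 1:
        raise ValueError("at least one full year of citation data is needed")
    if years_observed >= horizon:
        return float(observed)  # enough follow-up: use the observed count
    return observed * multipliers[years_observed]


# Toy reference set (hypothetical yearly citation counts, not SCOPUS data).
reference = [
    [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
    [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
]
mult = projection_multipliers(reference)
print(mult[1], mult[2], mult[10])
print(projected_citations(4, 2, mult))
```

Because the multiplier is a mean of ratios, a few reference papers with very low early citation counts can inflate it, so projections for recently published papers are the least certain.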
References
- 1. Gohagan JK, Prorok PC, Hayes RB, et al. The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial of the National Cancer Institute: history, organization, and status. Control Clin Trials. 2000;21(6 Suppl):251S–272S.
- 2. Prorok PC, Andriole GL, Bresalier RS, et al. Design of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trials. 2000;21(6 Suppl):273S–309S.
- 3. Andriole GL, Grubb RL, Buys SS, et al. Mortality results from a randomized prostate-cancer screening trial. New Engl J Med. 2009;360(13):1310–1319.
- 4. Andriole GL, Crawford ED, Grubb RL, et al. Prostate cancer screening in the randomized prostate, lung, colorectal, and ovarian cancer screening trial: mortality results after 13 years of follow-up. J Natl Cancer Inst. 2011;104(2):125–132.
- 5. Buys SS, Partridge E, Black A, et al. Effect of screening on ovarian cancer mortality: the Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening randomized controlled trial. JAMA. 2011;305(22):2295–2302.
- 6. Oken MM, Hocking WG, Kvale PA, et al. Screening by chest radiograph and lung cancer mortality: the Prostate, Lung, Colorectal, and Ovarian (PLCO) randomized trial. JAMA. 2011;306(17):1865–1873.
- 7. Schoen RE, Pinsky PF, Weissfeld JL, et al. Colorectal-cancer incidence and mortality with screening flexible sigmoidoscopy. New Engl J Med. 2011;366(25):2345–2357.
- 8. O’Brien B, Nichaman L, Browne J, Levin D, Prorok P, Gohagan J. Coordination and management of a large multicenter screening trial: the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trials. 2000;21(6 Suppl):273S–309S.
- 9. Molinari JF, Molinari A. A new methodology for ranking scientific institutions. Scientometrics. 2008;75(1):163–174.
- 10. Moyer VA, US Preventive Services Task Force. Screening for prostate cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2012;157(2):120–134.
- 11. Schroder FH, Hugosson J, Roobol MJ, et al. Prostate-cancer mortality at 11 years of follow-up. New Engl J Med. 2012;366(11):981–990.
- 12. Moyer VA, US Preventive Services Task Force. Screening for colorectal cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2008;149(9):627–637.
- 13. Moyer VA, US Preventive Services Task Force. Screening for ovarian cancer: U.S. Preventive Services Task Force reaffirmation recommendation statement. Ann Intern Med. 2012;157(12):900–904.
- 14. Johnston SC, Rootenberg JD, Katrak S, Smith WS, Elkins JS. Effect of a US National Institutes of Health programme of clinical trials on public health and costs. Lancet. 2006;367(9519):1319–1327.
- 15. Heijnsdijk EAM, der Kinderen A, Wever EM, Draisma G, Roobol MJ, de Koning HJ. Overdetection, overtreatment and costs of prostate-antigen screening for prostate cancer. Br J Cancer. 2009;101(11):1833–1838.
- 16. Colditz GA, Hankinson SE. The Nurses’ Health Study: lifestyle and health among women. Nature Rev Cancer. 2005;5(5):388–396.
- 17. Colditz GA, Winn DM. Criteria for the evaluation of large cohort studies: an application to the Nurses’ Health Study. J Natl Cancer Inst. 2008;100(13):918–925.
- 18. Hendrix D. An analysis of bibliometric indicators, National Institutes of Health funding, and faculty size at Association of American Medical Colleges medical schools, 1997–2007. J Med Lib Assoc. 2008;96(4):324–334.