Abstract
The Clinical and Translational Science Award (CTSA) Program was designed by the National Institutes of Health (NIH) to develop processes and infrastructure for clinical and translational research throughout the United States. The CTSA initiative now funds 61 institutions. In 2012, the National Center for Advancing Translational Sciences, which administers the CTSA Program, charged the Evaluation Key Function Committee of the CTSA Consortium to develop common metrics to assess the efficiency of clinical research processes and outcomes. At this writing, the committee has identified 15 metrics in 6 categories. It has also developed a standardized protocol to define, pilot-test, refine, and implement the metrics. The ultimate goal is to learn critical lessons about how to evaluate the processes and outcomes of clinical research within the CTSAs and beyond. This article describes the work involved in developing and applying common metrics and benchmarks of evaluation.
Keywords: Clinical Research, Common metrics, CTSA, Efficiency of clinical research, Evaluation
Introduction
Researchers continue to struggle with the slow pace at which findings are translated from bench to bedside and, in particular, with the amount of time required to conduct clinical trials and publish results. In cancer research, for example, Dilts and colleagues (2009) found there are almost 300 distinct processes involved in activating a phase III trial and that the median time from conception to activation is over 600 days. While many clinical trials in various fields are never completed because of recruitment and other problems, Ross and colleagues (2012) found that the results of completed trials are published within 30 months in fewer than half of cases and that the overall publication rate is only 68%.
In an attempt to improve processes involved in clinical research, the National Institutes of Health (NIH) created and funded the Clinical and Translational Science Award (CTSA) Program. Administered by the National Center for Advancing Translational Sciences (NCATS), this program was designed to develop infrastructure for clinical and translational research at institutions throughout the United States. The CTSA initiative now funds 61 institutions and represents the largest NIH-funded program to date (CTSA Central, accessed 2013). Although CTSA-funded institutions have responded to the challenge of improving the processes of clinical and translational research in various ways, they all have core entities that help improve the efficiency of research by eliminating barriers and offering specialized services to investigators. Examples of such cores include biostatistics, regulatory compliance, informatics, and clinical research facilities.
Recognizing the importance of evaluation, NIH has required, in every Request for Applications since the program's inception in 2005, that each institution applying for a CTSA include an evaluation plan explaining in detail how it would evaluate its program and assess its use of funds if it were to receive an award. Many institutions employ a variety of evaluation methods, including surveys, bibliometric analyses, and social network analysis, to gain a better understanding of teams and multidisciplinary research. Evaluators at each CTSA institution are members of the Evaluation Key Function Committee. This committee has 4 workgroups and 2 interest groups that focus on methodology (bibliometrics, qualitative methods, and social network analysis) and on learning and defining best practices (research translational mapping and measurement, definitions, and shared resources).
The Evaluation Key Function Committee meets regularly via conference calls and annually at face-to-face meetings to share best practices in evaluation and to collaborate on evaluation projects. The purpose of the committee is not to engage in a national evaluation. In fact, during the first 6 years of the CTSA program, the NIH employed consultants to conduct a national evaluation, focusing on a summative approach and studying progress of the CTSAs, with special emphasis on the accomplishments of scholars trained through the Research Training and Education Key Function (Rubio, Sufian, & Trochim, 2012). The national evaluation was not designed to generate tools or metrics for the individual institutions but, rather, to report on what the institutions with CTSAs had accomplished.
In 2012, the acting director of the Division of Clinical Innovation within NCATS, where the CTSA Program is managed, charged the Evaluation Key Function Committee with generating common metrics to assess the efficiency of clinical research processes and outcomes. These metrics can then be used for benchmarking, allowing each institution to see where its performance on efficiency falls relative to other CTSA-funded institutions. The intent is to provide a tool that helps institutions decide whether they should undertake process improvements to increase the efficiency of clinical research at their institution. Collectively, the data can be used to document the efficiency of clinical research across all of the CTSA-funded institutions.
Also in 2012, the Institute of Medicine (IOM) was charged with evaluating the CTSA Program. In its report, the IOM argues for common metrics that can be used consistently at all CTSA sites to demonstrate the progress of the CTSA Program (IOM, 2013). The overarching goal of the CTSA Program is to improve health; as the report notes, however, this goal is not feasible or practical to evaluate directly. What the CTSA Program can do is develop common metrics that demonstrate improvements over time in the efficiency of clinical research.
Development of Clinical Research Metrics
In an effort to develop common metrics, the chair and co-chair of the Evaluation Key Function Committee began by asking an evaluation liaison from each of the 61 institutions to meet with the principal investigator of his or her institution’s CTSA, generate a list of 5 to 8 metrics for clinical research processes and outcomes, and bring the list to the annual face-to-face meeting. At the October 2012 face-to-face meeting, the 127 participants met in small groups and shared their metrics. Each group was assigned a facilitator and was asked to rank the metrics in order to identify the top 5–10. Afterwards, the facilitators from the small groups met to synthesize the top metrics. The process resulted in 15 metrics that they believed to be the most promising and feasible to collect. The following day, participants used a clicker (audience response) system to rate each metric on its importance and on the feasibility of collecting data for it. All of the metrics were strongly endorsed by the participants.
The committee presented the list of 15 metrics, along with the importance and feasibility scores for each metric, to the CTSA Consortium Steering Committee (CCSC), which consists of principal investigators from each CTSA institution. The CCSC gave its enthusiastic and unanimous support for the evaluation committee’s effort.
The 15 metrics (Table 1) can be grouped into 6 categories: clinical research processes, careers, services used at the institution, economic return, collaboration, and products. While the rationales for most of these categories are evident, the careers and collaboration categories deserve mention. The 2 metrics in the careers category (career development and career trajectory) reflect the training of investigators, and the 2 metrics in the collaboration category (researcher collaboration and institutional collaboration) reflect the willingness to engage in multidisciplinary approaches to clinical research (i.e., investigators from different disciplines collaborating on research) and to overcome barriers to such research. Together, training and collaboration affect the efficiency of the research endeavor.
Table 1.
Category | Metric |
---|---|
Clinical research processes | 1. Time from institutional review board (IRB) submission to approval |
 | 2. Studies meeting accrual goals |
 | 3. Time from notice of grant award to study opening |
Careers (how well investigators are trained in clinical research) | 4. Career development |
 | 5. Career trajectory |
Services used at the institution | 6. Volume of investigators who used services |
 | 7. Volume of types of services used |
 | 8. Satisfaction/needs assessment |
Economic return | 9. Leveraging/return on investment (ROI) of pilot studies and KL2 scholars |
Collaboration | 10. Researcher collaboration |
 | 11. Institutional collaboration |
Products | 12. Number of technology transfer products |
 | 13. Time to publication |
 | 14. Influence of research publication |
 | 15. Time from publication to research synthesis |
The Evaluation Key Function Committee formed a smaller workgroup (the Common Metrics Workgroup) to further refine each metric. To define the metrics consistently, the members of this workgroup examined and modified a template of measure attributes that had been developed earlier by the National Quality Measurement Clearinghouse (NQMC) and the Agency for Healthcare Research and Quality (AHRQ, accessed 2012). Our modified template is shown in Table 2.
Table 2.
Attribute | Explanation |
---|---|
Title | Identifies the title of the measure |
Description | Provides a concise statement of the specific aspects of health care, the patient population, providers, setting(s) of care, and the time period that the measure addresses |
Rationale | Identifies the rationale that briefly explains the importance of the measure (i.e., why it is being used) |
Inclusion/exclusion | Provides the general description of any component that is the basis for inclusions and exclusions for the variable (e.g., time period, type of research) |
Possible descriptive data to collect | Lists descriptive data that should be collected with this variable (e.g., type of funding source, type of research) |
Data source | Identifies the data source(s) necessary to implement the measure |
Scoring | Identifies the method used to score the measure (choose one of the following: unspecified, categorical, continuous, count, frequency distribution, nonweighted score/composite score, rate, ratio, or weighted score/composite score) |
Unit of analysis | Clarifies what each observation is |
This template was adapted from a template created by the National Quality Measurement Clearinghouse (NQMC) and Agency for Healthcare Research and Quality (AHRQ), U.S. Department of Health and Human Services. The original template is in the public domain and is available at http://www.qualitymeasures.ahrq.gov/about/template-of-attributes.aspx (accessed December 4, 2012).
In modifying the template, the Common Metrics Workgroup recognized that descriptive data needed to be specified and collected in conjunction with each metric. The descriptive data would provide context for the metric and enable better interpretation of the data. For example, expectations regarding the time that elapses between receipt of a grant award and recruitment of the first study subject would vary based on the type of study (phase I, II, or III) and whether the disease being studied was common or rare.
Definition of the Proposed Metrics
At this writing, these metrics have been defined: 1) time from institutional review board (IRB) submission to approval, 2) studies meeting accrual goals, and 3) time from notice of grant award to study opening (Tables 3–8). While these metrics may seem straightforward, defining them proved to be challenging.
Table 3. Time from IRB submission to approval
Description | The time in days between the date that the application for IRB review is received by the IRB office and the date of final approval granted by the IRB with no IRB-related contingencies remaining. Operational definitions: The receipt date is the actual date that the IRB office initially received an application for IRB review in its office or in an electronic in-box for review; this includes the receipt date for submissions to IRBs that perform triage or pre-review. If the IRB office serves as a distribution mechanism for applications that must first be reviewed by other committees or entities (e.g., scientific review committees) prior to IRB review, and the IRB office takes no action other than forwarding the application to another committee or entity, the receipt date should be when the IRB office receives the application and begins triage or triage and pre-review. The final approval date is the date that the IRB determined that the protocol was approved with no IRB-related contingencies remaining, so that, from the perspective of human subjects, the research can commence at the local site. The duration is the final approval date minus the receipt date, expressed in calendar days. |
Rationale | This is the primary duration for assessing the length of time needed for an IRB review. It is critical for establishing the duration of this part of the translational research process. |
Inclusion/Exclusion | NIH-funded, human subjects clinical research protocols that received IRB approval from a fully convened IRB. This may include multi-site studies. Protocols with a status such as deferred, tabled, or significant modifications needed should be excluded. |
Possible Descriptive Data to Collect | Different institutions are likely to have different processes for IRB review, and a number of contextual variables are likely to be related to IRB duration. Two previous CTSA studies have investigated, for small samples, the relationship of these variables to the review duration. This needs to be considered in interpreting the data. |
Data Source | The data source can be IRB electronic records, manual records, or a hybrid for each submitted protocol. This is a retrospective metric that is collected for already completed protocols. Protocols that are currently under review are excluded. |
Scoring | This is a continuous metric that ranges from 0 to the maximum number of days required to complete an IRB review. It is collected on an ongoing basis. The primary challenges with this metric are definitional and operational; it is simple to score the metric once the two required dates are obtained. However, to ensure comparability across institutions, it will be necessary to define carefully how the receipt and final approval dates are measured at each institution. It is likely that institutional definitions will differ slightly or that the operational definition given here will not be readily available at some institutions without some modification to their data collection systems. |
Unit of Analysis | Data will be collected at the institution level and the protocol level. |
Comments | |
Table 8. Problems with subject recruitment
Description | Percent of IRB-approved studies that are terminated due to failure to meet recruitment targets. |
Rationale | Is subject recruitment a significant barrier to clinical research? How many resources are wasted due to failure to meet recruitment targets? |
Inclusion/Exclusion | Inclusion: NIH-funded, human subjects clinical research studies that are closed to recruitment (studies may still be collecting data or conducting analyses). Exclusion: studies with corporate/non-federal funding; studies using only qualitative methods; multi-site studies. |
Possible Descriptive Data to Collect | Type of review: exempt/non-exempt/full board review. Type of study: clinical trial; observational/survey. Type of population: pediatric only; adult only. Disease/condition studied: need to determine the relevant groups. Minority accrual targets: were explicit minority accrual targets established (yes/no)? |
Data Source | Potential sources: IRB records, PI study records |
Scoring | Ratio/rate |
Unit of Analysis | Data will be collected at the protocol level and aggregated to the institution level |
Comments | |
The time from IRB submission to approval is defined as the number of days between the date that the IRB office received the IRB application for review and the date that the IRB gave final approval with no IRB-related contingencies remaining. One challenge in defining the metric was that some institutions require a scientific review of a protocol before the IRB reviews it, while other institutions do not. For the institutions that do require a scientific review, we grappled with whether to define the first date as the date that the proposal was submitted to the IRB or the date that the scientific review was completed. The goal was to provide a definition that could be applied consistently by all institutions. Thus, whenever such differences arose, we used descriptive data to capture them and to refine the definitions. This approach resulted in the need to collect more data for the metrics.
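To make the arithmetic concrete, the sketch below (Python, with hypothetical field names and made-up dates) computes the protocol-level duration as the final approval date minus the receipt date in calendar days and summarizes it at the institution level. It is offered only as an illustration of the definition above, not as part of the workgroup's protocol.

```python
from datetime import date
from statistics import median

# Hypothetical protocol records; in practice these would come from IRB
# electronic or manual records (field names are illustrative only).
protocols = [
    {"id": "P-001", "received": date(2012, 3, 1), "approved": date(2012, 4, 16)},
    {"id": "P-002", "received": date(2012, 5, 7), "approved": date(2012, 6, 4)},
    {"id": "P-003", "received": date(2012, 6, 20), "approved": None},  # still under review
]

def irb_duration_days(record):
    """Calendar days from IRB receipt to final approval (None if not yet approved)."""
    if record["approved"] is None:
        return None  # excluded: the metric is retrospective, completed reviews only
    return (record["approved"] - record["received"]).days

durations = [d for d in (irb_duration_days(p) for p in protocols) if d is not None]
print("Protocol-level durations (days):", durations)
print("Institution-level median (days):", median(durations))
```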
When we tried to define the second metric, studies meeting accrual goals, we found that it was too broad to define as a single metric. We needed 4 metrics to capture the original intent: studies with adequate accrual (recruitment/retention), length of time spent in recruitment, study startup time, and problems with subject recruitment.
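As an illustration of the first of these 4 metrics, the sketch below (with hypothetical field names and made-up counts) computes the proportion of closed studies that met their accrual target, which corresponds to the rate scoring described in Table 5.

```python
# Minimal sketch: proportion of closed studies that met their accrual target.
# Field names (target_n, enrolled_n) are hypothetical; actual sources would be
# grant proposals, IRB protocol submissions, and IRB progress reports.
studies = [
    {"id": "S-01", "target_n": 120, "enrolled_n": 131},
    {"id": "S-02", "target_n": 80,  "enrolled_n": 42},
    {"id": "S-03", "target_n": 200, "enrolled_n": 200},
]

def met_accrual_goal(study):
    """A study counts as having adequate accrual if enrollment reached the stated target."""
    return study["enrolled_n"] >= study["target_n"]

rate = sum(met_accrual_goal(s) for s in studies) / len(studies)
print(f"Studies meeting accrual goals: {rate:.0%}")
```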
The third metric, time from notice of grant award to study opening, was probably the least problematic, but it also presented challenges. We ended up defining study opening as the date that the first subject provided informed consent to participate in the study. Because the prevalence of the disease being studied can greatly affect the length of time until study opening, we included information about disease prevalence in the descriptive data.
Development of a Standardized Protocol
In addition to defining the first 3 metrics, the Common Metrics Workgroup developed a standardized protocol to do the following: define the remaining metrics, recruit CTSA institutions to pilot-test the metrics, use the results to refine the metrics, implement the refined metrics across the CTSA Consortium, and create benchmarks. The Common Metrics Workgroup recognizes the need to work with the leadership of the CTSA Consortium (e.g., the CCSC) to implement the metrics.
A standardized protocol enables the Common Metrics Workgroup to solicit assistance from other groups that may be interested in helping to define the common metrics. Within the Evaluation Key Function Committee, for example, groups include the definitions workgroup, which has already engaged in defining key constructs of the CTSA Consortium, and the bibliometric workgroup, which could be instrumental in defining metrics regarding publications.
To help implement our protocol, we are creating a database that will contain the various lists of metrics that were brought to the 2012 face-to-face meeting of the Evaluation Key Function Committee. While our initial efforts will focus on defining the 15 metrics listed in Table 1, the database will enable us to prioritize other metrics that need to be defined. As we progress through this process, we will strive to minimize the redundancies across the metrics and to keep the number of metrics to be implemented at a minimum. The intent is for the common metrics work to be useful, not burdensome to institutions.
Because we have a large number of metrics to consider, we anticipate that the metrics will be rolled out in waves, with definitions introduced every 4 months and pilot-testing instituted during each new wave. For each wave, we will recruit 3–5 CTSA institutions to pilot-test the metrics for 6 weeks. During this time, we will ask each institution to gather data on at least 10 protocols and enter the data into REDCap™ (Research Electronic Data Capture), a suite of electronic data capture tools hosted at Vanderbilt University (Harris et al., 2009). REDCap is a secure, web-based application designed to support data capture for research studies, providing 1) an intuitive interface for validated data entry; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for importing data from external sources.
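For pilot institutions that script their data entry rather than using REDCap's web forms, a record import could look roughly like the sketch below. The endpoint URL, API token, and field names are placeholders, and the parameters accepted can vary by REDCap version, so the hosting institution's REDCap API documentation should be treated as authoritative.

```python
import json
import requests  # third-party HTTP library

# Placeholders: a real project would supply its own REDCap endpoint and API token.
REDCAP_API_URL = "https://redcap.example.edu/api/"
API_TOKEN = "REPLACE_WITH_PROJECT_TOKEN"

# Hypothetical record layout for the IRB-duration metric pilot.
records = [
    {"record_id": "P-001", "irb_receipt_date": "2012-03-01",
     "irb_approval_date": "2012-04-16", "review_type": "full_board"},
]

# Form-encoded POST importing the records into the REDCap project.
response = requests.post(REDCAP_API_URL, data={
    "token": API_TOKEN,
    "content": "record",
    "format": "json",
    "type": "flat",
    "data": json.dumps(records),
})
print(response.status_code, response.text)
```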
At the end of 6 weeks, we will ask the piloting institutions to complete a brief survey about the feasibility of collecting data and about the barriers and obstacles they encountered. Then we will review all of the data to determine if the metric needs to be further refined. Given the diversity in how research is conducted and implemented by the CTSA institutions, we believe that the definitions will have to undergo several iterations. For example, some institutions have an electronic IRB submission and review process, while other institutions still rely on paper applications. These differences will impact the way in which data can be collected.
By collecting the data from all of the institutions in one database, we will be able to use the database for benchmarking. The long-term plan is that for each metric, each CTSA institution will be able to log in to the system, generate a report that displays the de-identified distribution of responses for the metric, and then determine where it lies on the continuum of all institutions. The benchmarking information will remain confidential for individual institutions. Institutions can use the data to determine if they should develop a process improvement plan to increase the efficiency of clinical research at their institution.
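Although the benchmarking reports are still being designed, the underlying comparison is straightforward: given the de-identified distribution of a metric across institutions, determine where one institution falls. The sketch below uses made-up values for the median IRB review duration to illustrate the idea.

```python
# Minimal benchmarking sketch: where does one institution's median IRB duration
# fall within the de-identified distribution across institutions? (Values are made up.)
all_institution_medians = [28, 35, 41, 44, 47, 52, 58, 63, 70, 88]  # days
our_median = 47

# Count how many institutions are at or below our value and convert to a percentile.
rank = sum(1 for m in all_institution_medians if m <= our_median)
percentile = 100 * rank / len(all_institution_medians)
print(f"Our median of {our_median} days falls at about the {percentile:.0f}th percentile "
      f"({rank} of {len(all_institution_medians)} institutions at or below this value).")
```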
Future Directions
The future of clinical research depends on developing significant efficiencies so that findings can be translated from bench to bedside more quickly and with fewer resources. The metrics that we are working to define and implement can help move us in that direction, but common metrics are not a panacea; they are a first step toward assessing several areas for possible improvement.
The overwhelming support for this work across the CTSA-funded institutions and the enthusiasm of the CCSC have strengthened our commitment to establishing common metrics for clinical research. Using these common metrics, we will learn critical lessons about how to evaluate and change the processes and outcomes of clinical research within CTSA-funded institutions. We believe that this, in turn, will affect clinical research throughout the academic research community and beyond.
Table 4. Time from notice of grant award (NOGA) to first accrual
Description | This measure assesses the part of the clinical research process from when an investigator is formally notified that an NIH grant has been awarded (the date on the NOGA) to the time the first subject is accrued (enrolled) in the study. |
Rationale | This measure is intended to capture improvements in the efficiency of clinical research processes attributable to the CTSA initiative. There is a need to improve the extent to which clinical research studies achieve sufficient enrollment. Studies without sufficient enrollment are unable to evaluate proposed scientific hypotheses and are not a cost-effective use of administrative and clinical resources. |
Inclusion/Exclusion | Inclusion: NIH-funded, human subjects clinical research grants that have accrued their first patient. Exclusion: basic research grants, contracts, and studies not directly involving human subjects. |
Possible Descriptive Data to Collect | Research phase. Population (age, gender). Indication of disease. |
Data Source | Sample: IRB, to obtain a list of qualifying approved protocols. Date of NOGA: grants administration database, PI, or department. Date of first accrual (FA): IRB (if a data element requesting this is required for renewal), PI records, or accounting/billing records (date the sponsor was first billed for a study subject). |
Scoring | Discrete interval: number of days from the NOGA date to the FA date. |
Unit of Analysis | Data will be collected at the protocol level |
Comments | On 1/8/13, the workgroup decided to change the originally proposed metric of NOGA to study opening to the metric NOGA to first accrual. A clear definition of study opening that could be uniformly applied across institutions did not seem feasible. Updates of enrollment tables requested by the IRB at renewal are not adequate to obtain this measure unless dates associated with enrollment are required. (At UC Davis, this data element is not available, but it will be added to the required data for IRB renewal applications in the coming months.) The date of first enrollment might be obtained from a clinical trial management system for studies that use the system. |
For a study measuring accrual achievement in cancer clinical trials, see Cheng et al., “Predicting Accrual Achievement: Monitoring Accrual Milestones of NCI-CTEP Sponsored Clinical Trials,” Clinical Cancer Research, 2011 April 1; 17(7): 1947–1955.
Table 5. Studies with adequate accrual (recruitment/retention)
Description | The proportion of studies that recruited and retained the number of subjects required to analyze their primary research question. |
Rationale | Studies that do not meet the required number will have insufficient statistical power to complete their analyses. |
Inclusion/Exclusion | Inclusion: NIH-funded, human subjects clinical research studies that are closed to recruitment (studies may still be collecting data or conducting analyses). Exclusion: studies with corporate/non-federal funding; studies using only qualitative methods; multi-site studies. |
Possible Descriptive Data to Collect | |
Data Source | Grant proposal; IRB protocol submission; IRB progress reports |
Scoring | Rate |
Unit of Analysis | Data will be collected at the protocol level and aggregated to the institution level |
Comments | |
Table 6. Length of time spent in recruitment
Description | How long did recruitment take, and how did this compare with the projected time period? |
Rationale | Study costs are directly related to recruitment, and inadequate planning adds to costs. This indicator would provide information on a potential intervention point and can be used as a performance metric. |
Inclusion/Exclusion | Inclusion: NIH-funded, human subjects clinical research studies that are closed to recruitment (studies may still be collecting data or conducting analyses). Exclusion: studies with corporate/non-federal funding; studies using only qualitative methods; multi-site studies. |
Possible Descriptive Data to Collect | Type of review: exempt/non-exempt/full board review. Type of study: clinical trial; observational/survey. Type of population: pediatric only; adult only. Disease/condition studied: need to determine the relevant groups. Minority accrual targets: were explicit minority accrual targets established (yes/no)? |
Data Source | IRB protocol and annual progress report |
Scoring | Rate, or categorical, for example: a) less than projected; b) as projected; c) <25% more time than projected; d) 26–50% more time; e) 51–75% more time; f) 76–100% more time; g) >100% more time |
Unit of Analysis | Data will be collected at the protocol level and aggregated to the institution level |
Comments | Uncertain how easily this information can be accessed. |
Table 7. Study startup time (IRB approval to first accrual)
Description | Time in days from the date of IRB approval to first subject accrual |
Rationale | How long does it take an investigator to implement a protocol once it has been approved by the IRB? |
Inclusion/Exclusion | Inclusion: NIH-funded, human subjects clinical research studies that are closed to recruitment (studies may still be collecting data or conducting analyses). Exclusion: studies with corporate/non-federal funding; studies using only qualitative methods; multi-site studies. |
Possible Descriptive Data to Collect | Type of review: exempt/non-exempt/full board review. Type of study: clinical trial; observational/survey. Type of population: pediatric only; adult only. Disease/condition studied: need to determine the relevant groups. Minority accrual targets: were explicit minority accrual targets established (yes/no)? |
Data Source | Potential sources: IRB records, PI records |
Scoring | Continuous |
Unit of Analysis | Data will be collected at the protocol level and aggregated to the institution level |
Comments | Does not identify the reasons for longer or shorter start-up times but will provide data to determine whether further study is needed. |
Acknowledgments
Funding: The project reported here was supported by the National Institutes of Health (NIH) through the Clinical and Translational Science Award (CTSA) Program. The NIH CTSA funding was awarded to the University of Pittsburgh (UL1 TR000005).
Footnotes
Declaration of Conflicting Interests: The author declares no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
REFERENCES
- Agency for Healthcare Research and Quality (no date). Template of Measure Attributes. http://www.qualitymeasures.ahrq.gov/about/template-of-attributes.aspx. Accessed December 4, 2012.
- CTSA Central. https://ctsacentral.org/institutions. Accessed June 26, 2013.
- Dilts DM, Sandler AB, Cheng SK, Crites JS, Ferranti LB, Wu AY, Finnigan S, Friedman S, Mooney M, Abrams J. Steps and time to process clinical trials at the Cancer Therapy Evaluation Program. Journal of Clinical Oncology. 2009;27:1761–1766. doi: 10.1200/JCO.2008.19.9133.
- Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap): A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics. 2009;42:377–381. doi: 10.1016/j.jbi.2008.08.010.
- IOM (Institute of Medicine). The CTSA Program at NIH: Opportunities for advancing clinical and translational research. Washington, DC: The National Academies Press; 2013.
- Ross JS, Tse T, Zarin DA, Xu H, Zhou L, Krumholz HM. Publication of NIH funded trials registered in ClinicalTrials.gov: Cross sectional analysis. BMJ. 2012;344:d7292. doi: 10.1136/bmj.d7292.
- Rubio DM, Sufian M, Trochim WM. Strategies for a national evaluation of the Clinical and Translational Science Awards. Clinical and Translational Science. 2012;5:138–139. doi: 10.1111/j.1752-8062.2011.00381.x.