ABSTRACT
A high‐quality protocol document is essential for the successful and efficient implementation of clinical trials, but there is no consensus on how clinical trial protocol document quality should be evaluated. We used a modified Delphi approach and cognitive interviews to develop a new protocol document quality assessment tool, the Protocol Quality Rating Tool (PQRT). We compiled a checklist of elements that should be included in a high‐quality trial protocol document and asked experts to rate the importance of each element. We developed the PQRT by describing the expected content of each element, identifying essential versus additional ("bonus") content that differentiates high‐quality from low‐quality protocol documents, and organizing the elements into 18 sections. We revised the PQRT based on feedback from, and cognitive interviews with, our protocol quality rating team. We then tested the PQRT using ten protocol documents previously approved by the Institutional Review Board. All the protocol quality raters found the tool easy to use, and their scores were highly concordant for eight of the ten protocol documents. We have developed and tested a simple tool to measure clinical trial protocol document quality and encourage other researchers to evaluate and validate it.
Keywords: clinical trial protocol checklist, clinical trials, cognitive interview, protocol document quality, quality rating tool
Abbreviations
- CTR‐Launch: Clinical and Translational Research‐Launch
- IIT: investigator‐initiated trials
- IRB: institutional review board
- NIH: National Institutes of Health
- PI: principal investigator
- PQRT: protocol quality rating tool
Summary
- What is the current knowledge on the topic?
○ A high‐quality clinical trial protocol document can facilitate faster and more efficient discovery and simplify the execution of complex study designs, making the protocol easier for less experienced clinical research staff to implement accurately and decreasing the likelihood of avoidable protocol deviations. There is no accepted scoring system for evaluating clinical trial protocol documents holistically.
- What question did this study address?
○ This study describes the development and feasibility testing of a tool for assessing the quality of an entire clinical trial protocol document: the protocol quality rating tool (PQRT).
- What does this study add to our knowledge?
○ We showed that the PQRT is user‐friendly, applicable to a wide range of investigator‐initiated trials, and yields qualitatively concordant scores when applied by experienced clinical trialists.
- How might this change clinical pharmacology or translational science?
○ Beyond evaluating clinical trial protocol document quality, the PQRT may be used by institutions to assess individual‐ and group‐level changes in protocol document quality after completion of protocol development training programs, by investigators to identify their own knowledge gaps and guide tailored training, and to verify that a protocol document is of high quality.
1. Introduction
Clinical trials are the gold standard for evaluating the benefits and risks of interventions in clinical research [1]. Regardless of trial design, condition studied, intervention evaluated, or outcomes measured, a high‐quality protocol document is essential for successful and efficient trial implementation. While several scoring systems have been developed to evaluate clinical trial protocol document quality, most have focused on features of trial design; few have evaluated the entire protocol document holistically [2, 3, 4, 5].
A high‐quality protocol document includes scientific information as well as implementation details. It should be comprehensive and appropriately detailed, as well as clear and consistent. It can facilitate faster and more efficient discovery and simplify the execution of complex study designs, making the protocol easier for less experienced clinical research staff to implement accurately and decreasing the likelihood of avoidable protocol deviations. While multiple strategies to improve clinical trial protocol quality have been proposed, including guidelines, checklists, templates, and internal review, their impact on protocol quality and trial performance has not been rigorously tested. We believe that quality assessment tools are needed for all clinical trial protocol documents. We focused on investigator‐initiated trials (IITs) because many IITs are initiated by less experienced investigators and are often prepared by a single investigator with limited support, in contrast to industry‐funded trials or trials initiated by research networks funded by the National Institutes of Health or other sources.
In response to the NIH mandate to accelerate discovery, we proposed a novel initiative to improve clinical trial protocol document quality. Clinical and Translational Research‐Launch (CTR‐Launch) is a single‐center, randomized, controlled parallel‐group study of two protocol writing support interventions: Professional Protocol Writing Support and Protocol Writing Templates with Brief Consultation. CTR‐Launch will enroll investigator‐initiated single‐center clinical trials with the primary endpoint being overall clinical trial protocol document quality. Because there is no consensus on protocol document quality measurement, we used a modified Delphi approach and cognitive interviews to develop a new assessment tool, the Protocol Quality Rating Tool (PQRT). Herein we describe the process of developing and testing the feasibility of reliable application of the PQRT (Figure S1).
2. Methods
Six clinical trialists (three in behavioral and three in drug/device trials) and two faculty statisticians were selected to serve as protocol document quality raters and to contribute to the quality tool development. Raters were selected based on expertise and experience (median 20, range 10–40 years) in designing and executing trials as well as reputation, track record, and nomination by clinical research leaders. This protocol document quality rating team was oriented to the project by the CTR‐Launch Principal Investigator (PI) and project manager.
We first compiled a list of elements that should be included in a high‐quality clinical trial protocol. These elements were drawn from discussions with leaders in our clinical research enterprise, a literature review [6], peer institution websites, standard templates (e.g., the NIH Protocol Template for Behavioral and Social Sciences Research [7]), and reporting guidelines for trials and trial protocols (e.g., CONSORT and SPIRIT) [8, 9]. Six stakeholders (two experienced clinical trialists, co‐authors ASL and SLM; one regulatory expert; and three senior research managers) independently rated the importance of the assembled elements and provided comments about their ratings. These stakeholders then met with the CTR‐Launch project manager to review the list of protocol elements and their ratings and to select the elements to be included in the quality tool as required elements. The required elements were grouped into protocol sections in accordance with existing protocol templates.
We developed an initial draft of the PQRT that included a description of essential and "bonus" content (additional details that differentiate a high‐quality from a low‐quality protocol document) for each element. We shared the initial draft PQRT with the protocol rating team and revised it based on discussions with the raters individually and as a group. We developed criteria for raters to use to generate a numeric quality score for each section and for the overall protocol document. The criterion for the overall rating was designed to capture the protocol holistically rather than as a sum or average of section scores, because sections carry different weights, which may vary across protocols. The criteria for scoring each section were designed to reflect how well the content, language, and organization of that section support the study's scientific rigor, reproducibility, and operational needs. A missing section is marked as missing or unable to judge; if the missing section is applicable to the trial, the overall score is higher (indicating lower quality), whereas a section that is irrelevant to the trial does not affect the overall score.
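To make these scoring rules concrete, the sketch below (ours, not part of the PQRT; the class and section names are illustrative) models how one rater's section scores could be recorded so that a missing-but-applicable section is flagged, while the overall score remains a holistic judgment rather than a computed aggregate:

```python
from dataclasses import dataclass, field

@dataclass
class SectionScore:
    """One rater's judgment of one protocol section (illustrative model)."""
    name: str
    score: int | None        # numeric score on the final 1-9 scale (1 = best); None when "M"
    applicable: bool = True  # whether this section is relevant to the trial

@dataclass
class ProtocolRating:
    sections: list[SectionScore] = field(default_factory=list)
    overall: int | None = None  # assigned holistically by the rater, not computed

    def missing_applicable(self) -> list[str]:
        """Missing sections that apply to the trial: per the rules described
        above, these should worsen (raise) the overall score; sections that
        are irrelevant to the trial do not affect it."""
        return [s.name for s in self.sections if s.score is None and s.applicable]

rating = ProtocolRating(sections=[
    SectionScore("Endpoints", score=2),
    SectionScore("Data Safety Monitoring", score=None, applicable=True),
])
print(rating.missing_applicable())  # ['Data Safety Monitoring']
```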
To verify that the protocol document quality raters interpreted the PQRT as it was intended and to assess its ease of use, the CTR‐Launch project manager conducted cognitive interviews with seven protocol raters (one rater joined the team after the completion of cognitive interviews). Each rater participated in an audio‐recorded cognitive interview [10]. One day before their interview, raters were asked to read a sample protocol document and use the PQRT to rate that protocol. During cognitive interviews, three main topics were addressed for each PQRT section: (1) what the protocol raters were thinking about when determining their score; (2) ease of assigning a score; and (3) the utility of the numeric rating scale for capturing the quality of that section. Feedback was synthesized by co‐author CZK and reviewed with the protocol rating team. The PQRT was finalized based on information gleaned from the interviews.
We then tested the feasibility of reliable application of the PQRT by inviting each protocol rating team member to apply it to three protocol documents selected from a library of 10 investigator‐initiated, single‐center clinical trial protocol documents that had previously been approved by our IRB and would have been eligible for CTR‐Launch (five behavioral, three drug, two device). Each protocol document was randomly assigned by the CTR‐Launch PI to three protocol raters for scoring. The PI's name and other potentially identifiable information were redacted from the protocol documents before they were sent for review. We convened the protocol rating team to review the scores and comments on each section to understand how scores were assigned. We reviewed the scores and comments and made minor wording changes to generate the final PQRT, which will be used to rate protocol document quality in CTR‐Launch.
3. Results
A total of 62 protocol elements were initially identified (Table S1). Of these, 53 were rated by stakeholders and the protocol rating team as “Very Important” to a high‐quality protocol document. These 53 elements were grouped into 18 sections (Table S2).
The initial tool used a scale of 1–5 in increments of 0.5, with one being the worst and five the best. Based on protocol raters' feedback, the scale was modified to align with the familiar NIH grant scoring rubric: an integer scale of 1–9, with one being the best and nine the worst, and quality categorized as high, medium, or low. Aside from this modification, only minor wording changes were made to items. The PQRT therefore comprises 19 scores (one for each of the 18 sections and one overall). Table 1 provides the scoring rubric for the protocol section on endpoints.
TABLE 1.
Protocol quality rating tool—scoring rubric.
| Score | Quality rating | Quality category | Criteria used to evaluate quality of Section 6: Endpoints |
|---|---|---|---|
| 1 | Exceptional | High quality | |
| 2 | Outstanding | High quality | |
| 3 | Excellent | High quality | |
| 4 | Very good | Medium quality | |
| 5 | Good | Medium quality | |
| 6 | Satisfactory | Medium quality | |
| 7 | Fair | Low quality | |
| 8 | Marginal | Low quality | |
| 9 | Poor | Low quality | |
| M | Missing or unable to judge | | |
Note: Example of the rubric for assigning a score to the protocol section on Endpoints (Outcomes). Reviewers are instructed to compare the comprehensiveness, clarity, consistency, and organization of each section of the protocol document against the criteria used to assess quality for that section, and to assign a score for that section.
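The Table 1 mapping from integer score to quality rating and category is simple enough to express in a few lines of code. The following Python sketch is our own illustration, not part of the published tool; the names are ours:

```python
# Mapping of Table 1 scores to quality ratings; illustrative only.
QUALITY_RATINGS = {
    1: "Exceptional", 2: "Outstanding", 3: "Excellent",
    4: "Very good", 5: "Good", 6: "Satisfactory",
    7: "Fair", 8: "Marginal", 9: "Poor",
}

def quality_category(score: int) -> str:
    """Return the Table 1 quality category for an integer score of 1-9."""
    if score not in QUALITY_RATINGS:
        raise ValueError("PQRT scores are integers from 1 (best) to 9 (worst)")
    if score <= 3:
        return "High quality"
    if score <= 6:
        return "Medium quality"
    return "Low quality"

print(QUALITY_RATINGS[5], "->", quality_category(5))  # Good -> Medium quality
```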
Overall protocol quality scores for the 10 pilot protocols are shown in Table 2. All the protocol raters reported the PQRT was easy to use, with a self‐reported median time of 30 min to score each test protocol. As shown, overall scores were highly concordant, with eight of ten protocols having a difference between best and worst scores of two or less, and six of ten protocols in the same quality category (high, medium, or low) across all three raters (these derived columns are recomputed in the sketch following Table 2). The median difference in overall scores between raters with more than versus fewer than 20 years of experience was 0.9. Wide differences in overall scores for some protocol documents were mainly due to differences in how raters weighted individual sections or judged the seriousness of a deficiency.
TABLE 2.
Overall protocol document quality score for 10 test protocol documents with each document assessed by 3 members of the protocol quality rating team.
| Trial no. | Type of intervention | Overall score, Rater 1 | Overall score, Rater 2 | Overall score, Rater 3 | Difference between best and worst overall scores | Quality category based on mean of 3 overall scores |
|---|---|---|---|---|---|---|
| 1 | Device | 9 | 8 | 9 | 1 | Low |
| 2 | Behavioral | 7 | 7 | 7 | 0 | Low |
| 3 | Behavioral | 9 | 5 | 8 | 4 | Low |
| 4 | Drug | 9 | 7 | 8 | 2 | Low |
| 5 | Drug | 2 | 1 | 1 | 1 | High |
| 6 | Device | 7 | 2 | 4 | 5 | Medium |
| 7 | Behavioral | 3 | 2 | 2 | 1 | High |
| 8 | Behavioral | 8 | 8 | 8 | 0 | Low |
| 9 | Drug | 8 | 8 | 6 | 2 | Low |
| 10 | Behavioral | 3 | 4 | 4 | 1 | Medium |
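As a check on the derived columns of Table 2, the sketch below (variable names ours) recomputes the score spread and the mean-based quality category for the ten test protocols, assuming the category boundaries fall at the Table 1 thresholds of 3 and 6. Running it reproduces the last two columns shown above.

```python
from statistics import mean

def category(score: float) -> str:
    """Quality category, assuming Table 1 thresholds: <=3 high, <=6 medium, else low."""
    if score <= 3:
        return "High"
    return "Medium" if score <= 6 else "Low"

# Trial no. -> overall scores from raters 1-3, as reported in Table 2.
scores = {1: (9, 8, 9), 2: (7, 7, 7), 3: (9, 5, 8), 4: (9, 7, 8),
          5: (2, 1, 1), 6: (7, 2, 4), 7: (3, 2, 2), 8: (8, 8, 8),
          9: (8, 8, 6), 10: (3, 4, 4)}

for trial, s in scores.items():
    spread = max(s) - min(s)  # difference between best and worst overall scores
    print(f"Trial {trial}: spread = {spread}, category = {category(mean(s))}")
```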
4. Discussion
A high‐quality clinical trial protocol document is essential not only for the success of the scientific goals of the trial but also for its safe and efficient execution [11, 12]. Because there is no consensus on the definition of a high‐quality protocol document and no published clinical trial protocol document quality rating tool, we developed the PQRT.
All the protocol raters indicated the tool was easy to use, and the median time required to rate the test protocol documents was 30 min (mean 31, range 10–60 min). We recognize that our protocol raters were experienced clinical trialists who had participated in PQRT development; however, other clinical researchers and experienced clinical research or regulatory staff can be trained to use the tool. In anticipation of turnover among protocol raters, we have developed a training process for new raters that involves an initial review of the PQRT and scoring rubric with the CTR‐Launch project manager, followed by review of at least two test protocol documents previously scored by the current protocol rating team. If the scores are concordant, the new rater will serve as a secondary reviewer for three protocol documents enrolled in CTR‐Launch in parallel with the current protocol raters and will participate in team meetings at which scores and comments from the other protocol raters are reviewed and discussed. After that, the new rater will officially join the rating team.
We developed the PQRT as a tool to assess the primary endpoint of CTR‐Launch; however, we believe it has many potential uses. First, institutions that invest resources in education programs that train clinical research faculty, staff, or trainees to develop high‐quality trial protocols may find the tool useful for assessing individual‐ and group‐level changes in protocol document quality after completion of these programs. Second, the tool can show individual investigators where their knowledge gaps lie, guiding the development of a tailored training program. Third, the PQRT can be used by investigators designing their own trials, and by protocol writers providing writing support, to verify that their protocol document meets the standards of a high‐quality protocol document.
Although we extensively evaluated, tested, and refined the PQRT, whether a protocol document judged to be of high quality by its PQRT score is associated with meeting the trial's scientific goals and with safe and efficient execution remains to be determined. Many factors contribute to clinical trial protocol document quality, including trial complexity, the experience of the IIT PI, and the experience of the protocol writer (if different from the IIT PI). As CTR‐Launch progresses, we will continue to examine inter‐rater variability in scores. In addition, we will assess whether protocols with better scores have a shorter time to initial IRB approval and fewer protocol amendments in the first year after initial IRB approval. We will also provide free access to the PQRT (https://available-inventions.umich.edu/product/protocol-quality-rating-tool-pqrt-for-clinical-trials) and encourage others to validate it.
We acknowledge several limitations to our process. First, we did not formally test inter‐rater reliability by having all 10 protocols in our test library rated by all eight raters. Second, the PQRT may not be the right tool for assessing the quality of highly complex multi‐center trial protocol documents. Such trials are more likely to be initiated by pharmaceutical companies or supported by clinical research networks with experienced protocol writers, and they are not the target of CTR‐Launch, which focuses on single‐center IITs, often initiated by early‐career researchers who are less experienced and have less time and fewer resources to develop a high‐quality protocol document. Third, quality assessment is inherently subjective and may be influenced by raters' expertise, experience, and personal biases. We mitigated these factors by selecting experienced researchers with a long history of success in IITs. In addition, we will use the mean of three protocol rater scores (two clinical trialists with expertise in the same stratum of trials, behavioral vs. drug/device, and one statistician) to determine overall protocol document quality. Furthermore, we will closely monitor discordance and will discuss IITs with a wide range in overall scores at our team meetings. Finally, a protocol document rated high quality by the PQRT may not be associated with a higher likelihood of trial success.
We have shared our experience in developing a comprehensive tool, the PQRT, to evaluate clinical trial protocol document quality. Our results show that it is user‐friendly based on feedback from our protocol raters, applicable to a wide range of IITs, and yields qualitatively concordant scores when applied by experienced clinical trialists. We hope colleagues at other institutions will evaluate the PQRT to determine its usability in a wider range of IITs and settings. We hope that institutions will find value in adopting the PQRT as a tool for training investigators and for evaluating the efficacy of strategies to improve clinical trial protocol document quality.
Author Contributions
A.K.L., C.Z.K., C.S., S.L.M., J.C.L., and A.S.L. wrote the manuscript; C.S., S.L.M., J.C.L., and A.S.L. designed the research; A.K.L., C.Z.K., and A.S.L. performed the research; A.K.L., C.Z.K., and A.S.L. analyzed the data.
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Data S1.
Acknowledgments
We deeply appreciate the input and feedback from our protocol quality rating team: Drs. Tom Braun, Sung Choi, Grant Comer, Peter Higgins, Dinesh Khanna, William Meurer, Jeremy Taylor, and Justine Wu; the experts who reviewed and ranked the initial list of protocol elements: Arijit Bhaumik, Sue Burhop, Sana Shaikh, and Sana Shakour; and the superb editorial assistance of the CTR‐Launch project coordinator, Ms. Johanna Carroll.
Funding: This work was supported by NIH Clinical and Translational Science Award to the University of Michigan (UM1 TR004404).
References
- 1. Backmann M., "What's in a Gold Standard? In Defence of Randomised Controlled Trials," Medicine, Health Care, and Philosophy 20, no. 4 (2017): 513–523, 10.1007/s11019-017-9773-2.
- 2. Berger V. W. and Alperson S. Y., "A General Framework for the Evaluation of Clinical Trial Quality," Reviews on Recent Clinical Trials 4, no. 2 (2009): 79–88, 10.2174/157488709788186021.
- 3. Juni P., Altman D. G., and Egger M., "Systematic Reviews in Health Care: Assessing the Quality of Controlled Clinical Trials," BMJ 323, no. 7303 (2001): 42–46, 10.1136/bmj.323.7303.42.
- 4. "Protocol Registration Quality Control Review Criteria," ClinicalTrials.gov, 2024, accessed October 19, 2024, https://clinicaltrials.gov/submit-studies/prs-help/protocol-registration-quality-control-review-criteria.
- 5. "CTTI Quality by Design Project: Critical to Quality (CTQ) Factors Principles Document," Clinical Trials Transformation Initiative, 2015, accessed October 19, 2024, https://ctti-clinicaltrials.org/wp-content/uploads/2021/07/CTTI_QbD_Workshop_Principles_Document-2.pdf.
- 6. Ciolino J. D., Spino C., Ambrosius W. T., et al., "Guidance for Biostatisticians on Their Essential Contributions to Clinical and Translational Research Protocol Review," Journal of Clinical and Translational Science 5, no. 1 (2021): e161, 10.1017/cts.2021.814.
- 7. NIH Office of Behavioral and Social Sciences Research, "Clinical Trials Protocol Template for the Behavioral and Social Sciences," accessed December 5, 2024, https://obssr.od.nih.gov/research-resources/bssr-clinical-trials.
- 8. Butcher N. J., Monsour A., Mew E. J., et al., "Guidelines for Reporting Outcomes in Trial Reports: The CONSORT-Outcomes 2022 Extension," JAMA 328, no. 22 (2022): 2252–2264, 10.1001/jama.2022.21022.
- 9. Butcher N. J., Monsour A., Mew E. J., et al., "Guidelines for Reporting Outcomes in Trial Protocols: The SPIRIT-Outcomes 2022 Extension," JAMA 328, no. 23 (2022): 2345–2356, 10.1001/jama.2022.21243.
- 10. Willis G. B., Cognitive Interviewing (SAGE Publications, Inc., 2005).
- 11. Fogel D. B., "Factors Associated With Clinical Trials That Fail and Opportunities for Improving the Likelihood of Success: A Review," Contemporary Clinical Trials Communications 11 (2018): 156–164, 10.1016/j.conctc.2018.08.001.
- 12. Smith Z., Bilke R., Pretorius S., and Getz K., "Protocol Design Variables Highly Correlated With, and Predictive of, Clinical Trial Performance," Therapeutic Innovation & Regulatory Science 56, no. 2 (2022): 333–345, 10.1007/s43441-021-00370-0.