The 1962 amendments to the Food, Drug, and Cosmetic Act of 1938 changed the drug authorization process from premarket notification to a mandatory approval system. They gave the US Food and Drug Administration (FDA) the power to refuse the marketing of a drug based not only on safety but, thereafter, also on efficacy. The requirement of substantial evidence of efficacy by adequate and well-controlled investigations conducted by experts qualified by scientific training and experience made it possible for the FDA to introduce standards in research.1 This led to the categorization of drug research into three phases in the Investigational Drug Regulation of 1963, thereby establishing a codified, prospective research design.1 In 1964, Investigational New Drug (IND) forms were created for each of the three phases, embedding them more firmly into medical research. From the outset of this system, discussions have taken place about the appropriateness of phasing experiments in the drug approval process,2 but by the 1970s phasing had become the textbook example for clinical research. Currently, the phasing of research even plays a role in the financial markets: transitions between phases mark the largest movements in pharmaceutical stock value.1 It is thus not surprising that critique of and changes to drug development often take place within the context of the phasing research paradigm.
Phase I studies are designed to determine the maximum tolerated dose (MTD) of a drug and are potentially followed by phase II trials, which investigate whether the drug at the MTD has any promising effects. Phase III trials establish whether the new drug is more efficacious than available treatments, usually in randomized controlled trials. The assumption behind finding the MTD in phase I studies is the existence of parallel but offset dose-toxicity and dose-efficacy relationships for small-molecule drugs. However, for new biopharmaceuticals, this is not necessarily the case. For instance, the pharmacology of replication-competent viral vectors may display nonclassic dose–response curves because of complex interactions modulated by the innate and adaptive immune systems.3
Similarly, the relationship between dosing and toxicity and effectiveness may be less predictable for cell-based interventions because the cells mediate multiple, potentially competing effects. For example, increasing doses of T lymphocytes in a donor hematopoietic stem cell graft can increase the recipient's risk of developing undesirable graft-versus-host disease (GVHD) but simultaneously decrease the risk of leukemic relapse by augmenting the graft-versus-leukemia reaction. Determining the optimal dose of T cells in hematopoietic stem cell transplants thus depends on a major complication and a welcome side effect.4 The successive phase I/II trial may not be the most efficient way to identify a safe (and efficacious) dose for these new biopharmaceuticals. For interventions using stem cells there is additional complexity; the stem cells could proliferate, or differentiate into distinct populations with diverse positive or negative functions, or mutate and become tumorigenic, making it difficult to establish a safe dose. Moreover, for stem cell interventions standardization is complicated by the variability of cell lines and the variability introduced by the different clinicians performing the transplantation.
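The competing effects of T-cell dose described above can be illustrated with a toy calculation. The following is a minimal sketch in Python, not a clinical model: the logistic curves, their midpoints, the arbitrary dose units, and the assumption that GVHD and relapse occur independently are all hypothetical choices made purely for illustration, and the function names are invented here.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gvhd_risk(dose: float) -> float:
    # Hypothetical curve: GVHD risk rises with T-cell dose (midpoint at dose 6).
    return logistic(dose - 6.0)


def relapse_risk(dose: float) -> float:
    # Hypothetical curve: relapse risk falls with T-cell dose (midpoint at dose 3).
    return logistic(3.0 - dose)


def any_bad_outcome(dose: float) -> float:
    # Probability of GVHD or relapse, assuming independence (a simplification).
    return 1.0 - (1.0 - gvhd_risk(dose)) * (1.0 - relapse_risk(dose))


# Candidate doses from 0.0 to 9.0 in steps of 0.5 (arbitrary units).
doses = [i * 0.5 for i in range(19)]
best_dose = min(doses, key=any_bad_outcome)

if __name__ == "__main__":
    for d in doses:
        print(f"dose={d:.1f}  GVHD={gvhd_risk(d):.2f}  "
              f"relapse={relapse_risk(d):.2f}  either={any_bad_outcome(d):.2f}")
    print(f"dose with the lowest combined risk: {best_dose}")
```

With these made-up curves, the lowest combined risk lies at an intermediate dose rather than at either end of the range, which is why escalating to a maximum tolerated dose may not identify the most suitable dose for such interventions.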
It is therefore not surprising that some professionals in the stem cell field have reservations about the suitability of the phasing research “paradigm” for first-in-human (FIH) pluripotent stem cell (PSC) interventions.5 Indeed, the exclusive focus on safety in phase I has been a point of contention, both among professionals and in the academic literature.5,6,7,8,9,10 Surprisingly, in the new regulatory system in Japan, the determination of efficacy has shifted from premarket clinical trials to a postmarket mechanism.11 By contrast, in this paper we argue that efficacy testing should begin earlier: it should be part of FIH studies as well as of phases II and III. Under certain conditions, PSC-based interventional FIH studies should have compatible safety and efficacy end points.
The case for testing safety and efficacy
Compatible efficacy and safety end points allow participants to gain a possible direct benefit. FIH studies are ethically the most challenging of all phases because finding a balance between the anticipated risks and anticipated benefits is difficult; indeed, risks cannot be reliably evaluated. Moreover, for new biopharmaceuticals, more uncertainty may exist owing to a lack of experience with interventions that are similar in mode of action. The typical irreversibility of these interventions is an added complication. Participants may thus be exposed to high risks and burdens. Direct benefits (therapeutic benefits to research participants received via the intervention tested) are not expected, as the aim is to find the MTD. Direct benefits therefore cannot be counted when justifying the initiation of an FIH study. However, in certain cases the study can be designed in such a way that there is at least a possibility that participants will gain a therapeutic benefit. The main argument in favor of designing an FIH trial to test efficacy is the potential for a participant to gain a direct benefit and thus balance the exposure to possible severe harms and heavy burdens.6 Although it is true that FIH studies are not designed to offer a prospect of direct benefit,8 and there is little empirical evidence of direct benefits in FIH studies, the very low probability of benefit is merely one dimension of direct benefits.7 Other dimensions are the nature and the magnitude of the anticipated benefit.12 Particularly for interventions for currently untreatable diseases that have a high morbidity or mortality, it may be desirable to design studies to offer at least a minimal chance of direct benefit. This can be done by using what is expected to be a therapeutic dose instead of starting with subtherapeutic levels, and by enrolling patients within a therapeutic window rather than refractory patients.
According to Hess, intending to provide clinical benefit to subjects in early-phase stem cell–based interventions in the brain is an ethical duty; she argues that the sole aim of generating scientific knowledge can never justify submitting participants (children) to the burdens of an invasive brain procedure.6 Gilbert and colleagues similarly note that modification of phase I end points to include efficacy as a target could promote acceptance of invasive and irreversible risks in optogenetic trials.9 Although adding efficacy end points will be necessary if we would like participants to have a minimum possibility of gaining some therapeutic benefit, it should be clear that the likelihood that participants would benefit, as well as the likelihood that FIH trials will generate data regarding efficacy, is extremely small. Indeed, only about 8–10% of interventions tested in FIH studies lead to market authorization,13,14 and the rate for novel interventions is lower.15
A major problem with designing trials to test safety and efficacy is an increased risk of therapeutic misconception, i.e., a lack of understanding that the main purpose of research is to produce generalizable knowledge.16 This has been a major point of discussion regarding the first FIH human embryonic stem cell (hESC) trial, conducted by Geron.17,18,19 Subacute complete spinal cord injury (SCI) patients without neurological function below their torso were injected with hESC-derived oligodendrocyte progenitor cells. These patients had an open therapeutic window, meaning that there was a possibility that they would be affected by the cells, in contrast to chronic patients in whom spinal cord damage has caused scarring and the injected cells would probably not have any effect. However, the patients had to decide within two weeks of their trauma whether to enroll in the trial. They were therefore likely to be emotionally unstable, and they had no knowledge of what life is like with a spinal cord injury. Criticism was expressed because the risk of therapeutic misconception is higher for these patients than for chronic complete-SCI patients.19 Here, a trade-off existed between the optimal trial design for obtaining valid informed consent and the optimal trial design to provide a possible benefit to participants. Careful consideration is necessary when establishing the goal of FIH studies and choosing a subject population.16,20,21,22 In addition, choosing subacute patients affected the risks, as enrolling in the trial may have prevented spontaneous recovery, which is no longer possible for chronic SCI patients. A trade-off between the optimal trial design for efficacy testing and the optimal design to reduce risks was thus another concern with the Geron trial. We elaborate on this trade-off later.
Abandoning effective interventions. Some stem cell researchers speculate that promising interventions might be discarded at the phase I level even though they might be effective.23 When participants are harmed in phase I trials, companies are unlikely to continue testing because marketing the product could lead to future litigation. Moreover, negative trial results will reduce stock value, making it more difficult to continue an expensive drug development process. Research may thus be halted before efficacy has even been tested. Indeed, soon after the introduction of the three phases in 1963, pharmaceutical companies began anticipating approval decisions;1 where approval was unlikely, development was halted. Because of the heavy initial emphasis on safety, not only might we fail to develop a potentially effective intervention, but some professionals believe that subsequent stages of research may be left undesigned and that an opportunity to require strong preclinical evidence of efficacy may be missed.5 Moreover, it should not be overlooked that risks and side effects can often be reduced, and outcomes improved, once a procedure is in development or has been adopted. This is especially important for life-threatening diseases. For example, outcomes of hematopoietic stem cell transplantation were poor when it was first used, before 1969,24 but have since improved because of a better understanding of transplant immunology and infectious diseases along with the availability of better immunosuppressive, antiviral, antibacterial, and antifungal agents.25 Similar developments might be expected with PSC or other biologic/cell therapies. Rejecting these innovative procedures solely on the basis of a high-risk outcome of phase I might eliminate an effective treatment that can be refined and made increasingly safe.
The twinning of efficacy and safety. A report published in 1944 by the FDA and the American Medical Association's Council on Pharmacy and Chemistry claimed that the efficacy of a drug could not be separated from its safety. Each can be evaluated only in terms of the other.26 The first reason is simply that the dose of a medication, or intervention, influences both risks and efficacy. Second, safety is a judgment. Depending on the therapeutic value of a drug (as well as the availability of alternative medication, the incidence of the risk, and the age and prognosis of the patient), the same side effects and harm can be seen as either tolerable or unacceptable. Again, allogeneic hematopoietic stem cell transplantation is a toxic, high-risk procedure with severe and sometimes fatal side effects; yet, given the large number of procedures performed annually, its safety profile is evidently considered tolerable for a disease such as acute leukemia, which itself has a highly fatal prognosis.
Despite the intertwining of efficacy and safety, a trade-off can exist between the optimal trial design for efficacy testing and the optimal design to reduce risks. For example, in a phase I study in patients with amyotrophic lateral sclerosis, neural stem cells were injected into the lumbar spinal cord.27 If the trial had been designed to provide potential direct benefits, the cells would have been injected at the level of the cervical spine. However, if harm were to occur, it would be more severe with an injection at the cervical level. The decision was made to maximize safety for the participants, and therefore the injections were administered to the lumbar region.28 Although we would argue for the additional aim of efficacy testing in phase I studies, we agree with the rationale here. Such decisions should be made on a case-by-case basis.
Another often-discussed problem concerning testing efficacy in phase I trials is the larger sample size needed to establish efficacy, meaning that more participants will be exposed to risks.20 We agree that statistical evidence for efficacy might not be obtained in phase I with a minimal number of participants. However, phase I trials do not provide any statistical evidence for safety either. Indeed, only great harm and/or high-frequency risks will be identified in phase I trials because of the limited number of participants. Although risks and benefits are measured in early-phase trials, long-term safety and efficacy are not confirmed until the completion of phase III trials, or even after market approval.29 Thus, neither efficacy nor safety can be confirmed with a limited number of participants; it is only possible to observe trends. We therefore do not argue for an increase in the number of participants in FIH studies, only for a focus on compatible safety and efficacy end points.
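The limited ability of a small cohort to reveal risks can be made concrete with a short calculation. The following is a minimal sketch in Python, assuming that each participant independently experiences a given adverse event with the same probability; the cohort size of 12 and the event rates are illustrative values rather than data from any trial discussed here, and the function names are hypothetical.

```python
def prob_at_least_one_event(event_rate: float, n_participants: int) -> float:
    """Probability that at least one of n independent participants
    experiences an adverse event with the given per-participant rate."""
    return 1.0 - (1.0 - event_rate) ** n_participants


def upper_bound_if_no_events(n_participants: int, confidence: float = 0.95) -> float:
    """One-sided upper confidence bound on the true event rate when
    zero events are observed among n independent participants.
    Solves (1 - p) ** n = 1 - confidence for p."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_participants)


if __name__ == "__main__":
    n = 12  # illustrative cohort size (assumption)
    for rate in (0.01, 0.05, 0.20):  # illustrative true event rates (assumptions)
        p_seen = prob_at_least_one_event(rate, n)
        print(f"true rate {rate:.2f}: P(observed at least once) = {p_seen:.2f}")
    bound = upper_bound_if_no_events(n)
    print(f"no events in {n} participants: true rate may be as high as {bound:.2f}")
```

Under these assumptions, a 1% risk would go unobserved in most 12-person cohorts, a 20% risk would usually surface, and a cohort with no events at all remains compatible with a true rate of roughly one in five. The same arithmetic applies to any binary efficacy signal, which is why only trends can be observed at this stage.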
Testing efficacy in FIH trials
We believe that the arguments above give sufficient reasons for testing some efficacy end points in addition to safety for new and invasive biopharmaceuticals such as pluripotent stem cells. This decreases the chance of abandoning effective interventions and gives participants the potential to benefit. Moreover, researchers can aim to reduce the risks and side effects during drug development. Supporting arguments for our thesis can be found in the literature on clinical translation of complex innovative interventions. Gilbert et al. have argued that bringing efficacy forward to phase I may increase the predictability of later-phase trials and thereby possibly reduce the late-phase attrition rate. In addition, dual safety and efficacy end points for optogenetic trials may prevent retesting for safety in phase I should traditional phase II trials yield suboptimal results.9 Hey and Kimmelman have proposed what they call a risk-escalation model of early-phase trials, a compromise between maximizing benefits to research participants and maximum avoidance of unintended harm in early-phase trials.10 Their model avoids catastrophic losses while providing (at least) minimal information. However, there are challenges in allowing efficacy to be tested simultaneously with safety in phase I trials. Below, we provide some preconditions and suggestions for efficacy testing in phase I.
Preconditions and suggestions for efficacy as an additional primary goal of phase I. First, testing efficacy in FIH studies should never be used to evade the stricter regulation of nontherapeutic trials. Although in certain circumstances we believe that phase I trials should be designed to allow participants to benefit, they should always be designated as nontherapeutic research. This is also important because the therapeutic misconception should be avoided, especially when efficacy will be addressed in FIH studies. Researcher–physicians should clearly convey the message that FIH studies are performed to obtain information for subsequent trials. Providing a benefit to the participant is not the aim of research. To help prevent misunderstanding, it may be preferable to have someone independent of the research team obtain the informed consent.
Second, the safety of the participants should always remain the main focus of researcher–clinicians. The investigators and independent regulatory bodies should carefully consider the trade-offs between safety and efficacy because, for recruiting efforts to succeed, society needs to have faith in the medical research community.
Third, for novel interventions, preclinical efficacy data are the only indication of clinical promise in humans, as there is no precedent yet. Kimmelman and Henderson have proposed a two-step, evidence-based process that reviewers can use to assess preclinical efficacy. In the first step, all preclinical evidence should be collected and evaluated in the light of potential threats to validity. In the second step, the outcome should be assessed by examining how similar interventions, supported by similar preclinical data, have fared in the translational process.30
Fourth, if efficacy is tested in FIH studies, it should be transparent which aspect of efficacy is being tested: surrogate outcomes, such as successful engraftment of transplanted cells, and/or clinical outcomes, such as better eyesight. Research ethics committees do not have a common language or a common approach for assessing these benefits in human research,31 and consent forms, as well as investigators' discussions with participants, are vague and ambiguous with respect to benefit, making this a critical issue.32
Fifth, to test both safety and efficacy it is important that statisticians and clinical pharmacologists discuss which trial design (dose-finding methodology) should be chosen. Lessons can be learned from oncology, in which a shift has taken place toward integrating phase I and phase II in order to accelerate drug development.33 Similarly, in cancer immunotherapy a change has been proposed in the clinical development process.34 The Cancer Vaccine Clinical Trial Working Group, with representatives from academia, the pharmaceutical and biotechnology industries, and the FDA, defined a new clinical development paradigm for cancer vaccines and related biologics. The PSC field can learn from both the process of interdisciplinary working groups and the discussion of this particular working group. A clinical development program has been recommended that consists of a two-phase drug development process: a proof-of-principle trial and an efficacy trial.34 The proposed proof-of-principle trial examines safety, dose and schedule, and biologic activity.
We believe that for FIH PSC studies it is also very important to demonstrate proof of mechanism. This requires a thorough preclinical understanding of the mechanism, and FIH studies should examine the biological activity of the mechanism in action, which is itself indicative of efficacy. Additional parameters linking the intervention (e.g., PSC-derived cell injection) with the clinical outcome (e.g., walking better) should be investigated. For example, for PSC injection in a damaged spinal cord, studies should examine the electrophysiological improvement in nerve conduction through the spinal cord lesion, demonstrate repair in appropriate locations via imaging studies (magnetic resonance or positron emission tomography), and/or show engraftment of PSC-derived cells. Statistically, the evidence would be stronger if all these mechanistic measures were aligned, and this may reduce the number of patients required to demonstrate some evidence of efficacy in an FIH study. This is even more important when PSC studies are carried out in patients with rare diseases because randomized controlled trials (phase III) are often not feasible owing to a limited number of patients. Furthermore, if mechanistic effects cannot be confirmed by a trial, the trial may at least provide a starting point for determining why the intervention behaved differently.15 This information can be used for additional preclinical studies and early-phase trials as well as subsequent phase II trials.15
Conclusion
The phasing of research is strongly embedded in the medical research field and is rarely addressed explicitly. However, changes are taking place, as is evident from the introduction of innovative trial designs such as sequential, adaptive, and pragmatic trials, especially in oncology. Although examining risk should remain the default aim of FIH studies, for (pluripotent) stem cell interventions it seems ethically desirable for participants to have at least a chance to benefit. Moreover, promising interventions may be identified in FIH studies and not discarded before their efficacy can be judged. Indeed, the safety of interventions is assessed on the basis of risks and efficacy.
Acknowledgments
We thank the reviewers for their helpful and constructive comments. We acknowledge funding from the Netherlands Organization for Health Research and Development (Veni grant 016.136.093).
References
- Daniel, C (2014). Reputation and Power: Organizational Image and Pharmaceutical Regulation at the FDA. Princeton University Press: Princeton, NJ.
- US Department of Health, Education, and Welfare (1977). Review Panel on New Drug Regulation. Interim Reports, vol. II, section A. Washington, DC.
- Le Bœuf, F, Batenchuk, C, Vähä-Koskela, M, Breton, S, Roy, D, Lemay, C et al. (2013). Model-based rational design of an oncolytic virus with improved therapeutic potential. Nat Commun 4: 1974.
- Gooley, TA, Martin, PJ, Fisher, LD and Pettinger, M (1994). Simulation as a design tool for phase I/II clinical trials: an example from bone marrow transplantation. Control Clin Trials 15: 450–462.
- Habets, MGJL, van Delden, JJM and Bredenoord, AL (2016). Studying the lay of the land: views and experiences of professionals in the translational pluripotent stem cell field. Regen Med 11: 63–71.
- Hess, P (2012). Intracranial stem cell-based transplantation: reconsidering the ethics of phase 1 clinical trials in light of irreversible interventions in the brain. AJOB Neurosci 3: 3–13.
- Miller, FG and Joffe, S (2008). Benefit in phase 1 oncology trials: therapeutic misconception or reasonable treatment option? Clin Trials 5: 617–623.
- Ross, L (2006). Phase I research and the meaning of direct benefit. J Pediatr 149: S20–S24.
- Gilbert, F, Harris, AR and Kapsa, RMI (2014). Controlling brain cells with light: ethical considerations for optogenetic clinical trials. AJOB Neurosci 5: 3–11.
- Hey, SP and Kimmelman, J (2014). The risk-escalation model: a principled design strategy for early-phase trials. Kennedy Inst Ethics J 24: 121–139.
- Sipp, D (2015). Conditional approval: Japan lowers the bar for regenerative medicine products. Cell Stem Cell 16: 353–356.
- King, NM (2000). Defining and describing benefit appropriately in clinical trials. J Law Med Ethics 28: 332–343.
- Hay, M, Thomas, DW, Craighead, JL, Economides, C and Rosenthal, J (2014). Clinical development success rates for investigational drugs. Nat Biotechnol 32: 40–51.
- US Food and Drug Administration (2004). Innovation/Stagnation: Challenge and Opportunity on the Critical Path to New Medical Products <http://www.fda.gov/ScienceResearch/SpecialTopics/CriticalPathInitiative/CriticalPathOpportunitiesReports/ucm077262.htm>.
- Kimmelman, J (2010). Gene Transfer and the Ethics of First-in-Human Research: Lost in Translation. Cambridge University Press: Cambridge, UK.
- Henderson, GE, Churchill, LR, Davis, AM, Easter, MM, Grady, C, Joffe, S et al. (2007). Clinical trials and medical care: defining the therapeutic misconception. PLoS Med 4: e324.
- Solbakk, JH and Zoloth, L (2011). The tragedy of translation: the case of ‘first use’ in human embryonic stem cell research. Cell Stem Cell 8: 479–481.
- Wirth, E, Lebkowski, JS and Lebacqz, K (2011). Response to Frederic Bretzner et al., ‘Target populations for first-in-human embryonic stem cell research in spinal cord injury.’ Cell Stem Cell 8: 476–478.
- Bretzner, F, Gilbert, F, Baylis, F and Brownstone, RM (2011). Target populations for first-in-human embryonic stem cell research in spinal cord injury. Cell Stem Cell 8: 468–475.
- Dresser, R (2009). First-in-human trial participants: not a vulnerable population, but vulnerable nonetheless. J Law Med Ethics 37: 38–50.
- Kimmelman, J (2007). Stable ethics: enrolling non-treatment-refractory volunteers in novel gene transfer trials. Mol Ther 15: 1904–1906.
- Nada, A and Somberg, J (2007). First-in-man (FIM) clinical trials post-TeGenero: a review of the impact of the TeGenero trial on the design, conduct, and ethics of FIM trials. Am J Ther 14: 594–604.
- Habets, MG, van Delden, JJ and Bredenoord, AL (2016). Studying the lay of the land: views and experiences of professionals in the translational pluripotent stem cell field. Regen Med 11: 63–71.
- Bortin, MM (1970). A compendium of reported human bone marrow transplants. Transplantation 9: 571–587.
- Malard, F, Chevallier, P, Guillaume, T, Delaunay, J, Rialland, F, Harousseau, J-L et al. (2014). Continuous reduced nonrelapse mortality after allogeneic hematopoietic stem cell transplantation: a single-institution's three decade experience. Biol Blood Marrow Transplant 20: 1217–1223.
- US Congress. Office of Technology (1978). Assessing the Efficacy and Safety of Medical Technologies <http://digital.library.unt.edu/ark:/67531/metadc39383/>.
- Glass, JD, Boulis, NM, Johe, K, Rutkove, SB, Federici, T, Polak, M et al. (2012). Lumbar intraspinal injection of neural stem cells in patients with amyotrophic lateral sclerosis: results of a phase I trial in 12 patients. Stem Cells 30: 1144–1151.
- Yarborough, M, Tempkin, T, Nolta, J and Joyce, N (2012). The complex ethics of first in human stem cell clinical trials. AJOB Neurosci 3: 14–16.
- Woodcock, J and Woosley, R (2008). The FDA Critical Path Initiative and its influence on new drug development. Annu Rev Med 59: 1–12.
- Kimmelman, J and Henderson, V (2016). Assessing risk/benefit for trials using preclinical evidence: a proposal. J Med Ethics 42: 50–53.
- Churchill, LR, Nelson, DK, Henderson, GE, King, NMP, Davis, AM, Leahey, E et al. (2003). Assessing benefits in clinical research: why diversity in benefit assessment can be risky. IRB Ethics Hum Res 25: 1–8.
- Henderson, GE, Davis, AM, King, NMP, Easter, MM, Zimmer, CR, Rothschild, BB et al. (2004). Uncertain benefit: investigators' views and communications in early phase gene transfer trials. Mol Ther 10: 225–231.
- Wages, NA and Tait, C (2014). Seamless phase I/II adaptive design for oncology trials of molecularly targeted agents. J Biopharm Stat 25: 903–920.
- Hoos, A, Parmiani, G, Hege, K, Sznol, M, Loibner, H, Eggermont, A et al. (2007). A clinical development paradigm for cancer vaccines and related biologics. J Immunother 30: 1–15.