Abstract
The different actors involved in health system decision-making and regulation must address the question of which parameters are valid for assessing the health value of health technologies.
So-called surrogate endpoints represent, in the best case, preliminary steps in the causal chain leading to the relevant outcome (e. g. mortality, morbidity) and are not usually directly perceptible by patients. Surrogate endpoints are used not only in trials of pharmaceuticals but also in studies of other technologies. Their use in the assessment of the benefit of a health technology is, however, problematic.
In this report we intend to answer the following research questions: Which criteria need to be fulfilled for a surrogate parameter to be considered a valid endpoint? Which methods have been described in the literature for assessing the validity of surrogate endpoints? Which methodological recommendations concerning the use of surrogate endpoints have been made by international HTA agencies? What role have surrogate endpoints played in international and German HTA reports?
For this purpose, we choose three different approaches. Firstly, we conduct a review of the methodological literature dealing with surrogate endpoints and their validation. Secondly, we analyse current methodological guidelines of HTA agencies that are members of the International Network of Agencies for Health Technology Assessment (INAHTA), as well as of agencies concerned with assessments for reimbursement purposes. Finally, we analyse the outcome parameters used in a sample of publicly available HTA reports.
The analysis of methodological guidelines shows that HTA institutions take a very cautious position on the use of surrogate endpoints in technology assessment. Surrogate endpoints have not been prominently used in HTA reports. None of the analysed reports on non-diagnostic technologies based its conclusions solely on the results of surrogate endpoints. The analysis of German HTA reports shows a similar pattern.
The validation of a surrogate endpoint requires extensive research, including randomized controlled trials (RCT) assessing clinically relevant endpoints. The validity of a surrogate parameter is technology-specific rather than disease-specific. Thus, even in the case of apparently similar technologies, it is necessary to validate the surrogate for every single technology (i. e. for every single active agent).
The use of surrogate endpoints in the assessment of the benefit of health technologies should still be viewed very critically.
Abstract (translated from the German version)
The question of which outcome parameters can be used for a valid assessment of the benefit of medical technologies concerns all actors involved in decision-making and regulation in the health system.
Whereas clinically relevant endpoints are those that matter to the patient (e. g. morbidity, mortality), surrogate endpoints represent at best precursors of the actual clinically relevant endpoints and are usually not directly perceptible by the patient. Surrogate parameters are used not only in studies of the efficacy of pharmaceuticals but also in studies of other technologies. The use of surrogate endpoints in assessing the benefit of health technologies is, however, problematic.
This report addresses the following questions:
Which criteria must a surrogate parameter fulfil in order to be regarded as a valid endpoint?
Which methods for validating surrogate endpoints are discussed in the literature?
Which methodological requirements do international agencies in the field of Health Technology Assessment (HTA) or pharmaceutical benefit assessment set for the use of surrogate endpoints?
What role do surrogate endpoints play in HTA reports of international HTA agencies and in reports produced in Germany?
In line with these questions, three different methodological approaches are chosen: a review of the relevant methodological literature on surrogate endpoints and their validation, an analysis of the current methods papers of HTA institutions, and an analysis of completed and publicly available international and German HTA reports.
In summary, the recommendations of the institutions considered here show a critical attitude towards the use of surrogate endpoints in HTA. It can also be noted that surrogate endpoints play a minor role in HTA reports. In none of the HTA reports examined is the assessment of non-diagnostic technologies based exclusively on the results of surrogate endpoints. The results of the analysis of German HTA reports are almost identical to those of the international reports.
The validation of a surrogate endpoint requires extensive research, including the conduct of randomized controlled trials (RCT) with clinically relevant endpoints. The validity of a surrogate endpoint is technology-specific rather than disease-specific, so the results of validating a surrogate endpoint cannot be transferred to other technologies (not even when the mechanism of action is supposedly similar). To achieve the highest degree of certainty, the validity of a surrogate endpoint must be examined separately for each technology or active agent.
The use of surrogate endpoints in benefit assessment must still be viewed very critically.
Executive Summary
1. Health policy background
The question of which parameters are valid and acceptable for assessing the health benefit of health technologies is recurrently discussed among the different actors and stakeholders in the health system. In Germany, with the establishment of the Institute for Quality and Efficiency in Health Care (IQWiG) in 2004, the discussions on the methods of health technology assessment in general, and on the use of surrogate endpoints in particular, have recently gained topicality and publicity.
2. Scientific background
So-called surrogate endpoints represent, in the best case, preliminary steps in the causal chain leading to the relevant outcome (e. g. mortality, morbidity) and are not usually directly perceptible by patients. Characteristics of surrogate endpoints are:
They are measured in lieu of the actually relevant outcome of interest.
They are usually biochemical markers, physiological parameters or subclinical endpoints which are not directly perceptible by the patient. However, they are correlated with relevant clinical endpoints (e. g. high blood pressure is associated with a higher risk of stroke, high LDL cholesterol (LDL = low-density lipoprotein) is a risk factor for a heart attack, the CD4 cell count is associated with AIDS mortality).
Changes in the surrogate are easier to observe than changes in the related relevant endpoint (i. e. they occur earlier and more commonly).
Surrogates are sometimes named intermediate or intermediary outcomes, since they represent an intermediate step in the causal chain leading to the clinically relevant endpoint.
Surrogate parameters are statistically associated with the clinically relevant outcome and have prognostic power.
The association between the surrogate and the relevant endpoint is plausible from a biological and pathophysiological point of view.
Surrogate endpoints are not only used in trials of pharmaceuticals but also in studies of other clinical technologies. Parameters with an intermediary character are also applied in the field of community and public health interventions.
Their use in the assessment of the benefit of a health technology is, however, problematic. In the past, reliance on surrogate outcomes has led to false conclusions concerning the effects of a technology on the relevant health outcome. In many situations, relying on a strong observed correlation between the surrogate and the relevant endpoint to select an intervention has had fatal consequences (i. e. positive effects on the surrogate but increased mortality with the intervention in question). The problem has been known for around 30 years. A classic example of the potential for fatal consequences resulting from reliance on a surrogate is the case of class I antiarrhythmic drugs. Some drugs have been removed from the market after increased mortality or morbidity was observed with their use, contrary to the expectations raised by positive effects on a surrogate endpoint. On other occasions, reliance on surrogates has led to effective therapies being withheld. For example, beta-blockers were for many years considered contraindicated in patients with heart failure because of their bradycardic effect: following pathophysiological reasoning, a reduction of the heart rate was thought to have deleterious effects in these patients.
3. Research questions
Which criteria need to be fulfilled for a surrogate parameter to be considered a valid endpoint?
Which methods have been described in the literature for the assessment of the validity of surrogate endpoints?
Which methodological recommendations concerning the use of surrogate endpoints have been made by international HTA agencies?
What role have surrogate endpoints played in international and German HTA reports?
4. Methods
In line with the above-mentioned research questions, we follow different methodological approaches.
In order to answer research questions 1 and 2, which relate to the concepts and methods of surrogate validation, we conduct a systematic review of methodological papers. Electronic databases are searched with the following terms:
SURROGATE END POINT; SURROGATE END POINTS; SURROGATE ENDPOINT; SURROGATE ENDPOINTS; ENDPOINT, SURROGATE; ENDPOINTS, SURROGATE; END POINT, SURROGATE; END POINTS, SURROGATE; BIOLOGICAL MARKER; BIOLOGICAL MARKERS; VALIDATION; STATISTICS; BIOMETRY; DECISION SUPPORT TECHNIQUES; ENDPOINT DETERMINATION; CAUSALITY
The methodological literature is summarised in a narrative review consisting of two parts: 1. an overview of the criteria to be fulfilled by a surrogate in order to be considered acceptable. 2. an overview of the statistical methods proposed in the literature for the validation of surrogates.
In order to answer research question 3, we analyse the methodological guidelines and recommendations of HTA agencies that are members of the International Network of Agencies for Health Technology Assessment (INAHTA), and of agencies involved in pharmaceutical pricing and reimbursement decisions.
In order to answer research question 4, we analyse a random sample of HTA reports from the HTA database. We extract the type of outcome parameter used and reported in the HTA reports. We analyse in the same way the full sample of HTA reports produced in Germany and registered in the database of the German Agency for HTA.
5. Results
5.1 Literature review
The literature search yields a total of N=1,109 hits. After checking title and abstract, n=2 duplicates and n=1,007 references lacking relevance are excluded. A total of n=100 papers are retrieved for more detailed analysis. In the end, n=25 methodological papers are summarised in the review on criteria and validation methods.
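The screening counts reported above can be checked with simple arithmetic. The following is a minimal sketch; the variable and function names are illustrative assumptions, and only the counts themselves come from the text.

```python
# Screening flow of the literature review (counts taken from the text above).
hits = 1109             # total hits from the electronic database search
duplicates = 2          # excluded as duplicates
excluded_screen = 1007  # excluded after checking title and abstract
included = 25           # methodological papers summarised in the review

# Papers left for detailed (full-text) analysis after the first screening step.
retrieved = hits - duplicates - excluded_screen


def screening_summary():
    """Return the number of papers retrieved and the share finally included."""
    return retrieved, included / retrieved


full_texts, share_included = screening_summary()
print(f"Retrieved for detailed analysis: {full_texts}")
print(f"Included in the review: {included} ({share_included:.0%} of retrieved)")
```

The difference of 100 papers matches the number reported as retrieved for detailed analysis.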
The criteria that a surrogate parameter needs to fulfil in order to be recognised as an acceptable and valid endpoint can be summarised as follows:
Biological plausibility: There is evidence from animal models and epidemiological studies of a causal relationship between the surrogate parameter and the clinically relevant endpoint. The surrogate is part of the pathophysiological causal path leading to the health outcome.
Magnitude of the association between surrogate and relevant endpoint: Epidemiological evidence has shown repeatedly and consistently that changes in the surrogate are qualitatively and quantitatively associated with changes in the relevant health outcome.
Evidence of effect from randomized controlled trials (RCT): There is evidence from RCT showing that the changes induced by an intervention in the surrogate lead to changes in the relevant outcome in the same direction. The effect of the intervention is fully captured by the surrogate. Even in the case of very similar active principles, the mechanism of action may differ. Thus, the transferability of conclusions on the validity of a surrogate from one technology to another needs to be carefully assessed.
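The requirement that the effect of the intervention is fully captured by the surrogate is commonly written as a conditional-independence condition, usually attributed to Prentice. As a sketch, with $T$ denoting the clinically relevant endpoint, $S$ the surrogate and $Z$ the treatment assignment (these symbols are assumptions for illustration, not notation taken from this report):

```latex
f(T \mid S, Z) = f(T \mid S)
```

That is, once the value of the surrogate is known, the treatment assignment carries no further information about the clinically relevant endpoint.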
In the full report, we summarise the different statistical methods discussed in the literature for the validation of surrogate endpoints. In summary, we conclude that there is no gold standard for the validation of surrogate endpoints. Since generalising results from single studies is more prone to produce fallacies, approaches summarising results from several studies (i. e. meta-analysis) are preferred.
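One way in which results from several studies can be summarised is to examine, across trials, how closely the treatment effect on the clinical endpoint follows the treatment effect on the surrogate. The following is a minimal sketch of such a trial-level analysis. It assumes made-up effect estimates (not data from this report), weights each trial by its size, and ignores the estimation error of the individual effects; it is an illustration, not the method applied in the report.

```python
# Hypothetical, made-up trial-level treatment effects -- NOT data from the report.
# Each tuple: (effect on surrogate, effect on clinical endpoint, number of patients)
trials = [
    (-0.10, -0.05, 200),
    (-0.25, -0.12, 450),
    (-0.40, -0.22, 300),
    (-0.55, -0.26, 800),
    (-0.70, -0.37, 500),
]


def weighted_trial_level_fit(data):
    """Weighted least-squares line of clinical effect on surrogate effect.

    Returns (intercept, slope, r_squared), weighting each trial by its size.
    """
    w_sum = sum(w for _, _, w in data)
    x_bar = sum(w * x for x, _, w in data) / w_sum
    y_bar = sum(w * y for _, y, w in data) / w_sum
    s_xx = sum(w * (x - x_bar) ** 2 for x, _, w in data)
    s_yy = sum(w * (y - y_bar) ** 2 for _, y, w in data)
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in data)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r_squared


intercept, slope, r2 = weighted_trial_level_fit(trials)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, trial-level R^2={r2:.2f}")
```

A trial-level R² close to 1 would indicate that, in the trials considered, the effect on the surrogate predicts the effect on the clinical endpoint well. Because the validity of a surrogate is technology-specific, such a result would not automatically carry over to trials of other technologies.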
5.2 Analysis of methodological guidance from HTA agencies
A total of 23 methodological papers from 14 INAHTA members (eleven countries) are identified. In addition, eleven further methodological guidelines from agencies involved in pricing and reimbursement decisions are found. We extract their recommendations concerning the selection of outcome parameters in general and the use of surrogate endpoints in particular.
A total of 13 of the 23 analysed methodological papers from INAHTA members and seven of the eleven from "fourth-hurdle agencies" provide information on how to choose outcome parameters for the assessment. All institutions agree that patient-relevant outcome parameters are strongly preferred in the assessment of the benefit of a health technology. All agencies underline that hard outcome parameters are to be preferred to surrogate endpoints. Nevertheless, the majority of agencies state that under some circumstances surrogate endpoints may exceptionally be accepted, provided the validity of the surrogate is well established. In order to accept a surrogate, HTA agencies require the presentation of evidence which supports the causal relationship between the surrogate and the clinically relevant endpoint.
None of the methodological guidance papers from HTA agencies provides a list of well-established or generally accepted surrogate endpoints.
5.3 Survey of HTA reports
A total of 140 HTA reports from INAHTA members and 131 HTA reports from German institutions are analysed.
The reports cover different types of technologies, although assessments of medical and surgical interventions represent the majority. A prospective description (e. g. in the research questions or in the methods section) of the outcome parameters on which the assessment would be based is present in less than half of the analysed HTA reports. Surrogate endpoints are extracted and reported in 87 (62%) HTA reports from the HTA database. Almost all HTA reports also include a clinical and patient-relevant outcome. Only five reports use surrogate parameters exclusively; all of them assess a diagnostic technology, with test characteristics serving as the surrogate.
Similar results are obtained for the sample of German HTA reports. Approximately one third of the German HTA reports state that they assess benefits and risks, or effectiveness and safety, of the technology without further describing how these terms were operationalised into outcome parameters. Surrogate endpoints are extracted and reported in 74 (56%) German HTA reports. Almost all German HTA reports also consider a clinical and patient-relevant parameter. Only six reports use surrogate parameters exclusively; all of them assess a diagnostic technology, with test characteristics serving as the surrogate.
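The proportions reported for the two samples can be reproduced from the counts given above. This is a minimal sketch; the function and variable names are illustrative assumptions, and only the counts come from the text.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


# Reports in which surrogate endpoints are extracted and reported.
international = share(87, 140)  # sample from the HTA database
german = share(74, 131)         # sample of German HTA reports

# Reports relying exclusively on surrogate parameters (all on diagnostic technologies).
only_surrogate_international = share(5, 140)
only_surrogate_german = share(6, 131)

print(f"International sample: {international:.0f}% report surrogate endpoints")
print(f"German sample: {german:.0f}% report surrogate endpoints")
print(f"Surrogates only: {only_surrogate_international:.1f}% vs {only_surrogate_german:.1f}%")
```

In both samples the reports relying exclusively on surrogate parameters make up only a small share of the total.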
6. Discussion
6.1 Methods of the report
Given the different nature of our research questions, we follow several approaches in this report.
The literature review represents a good overview of the field. Besides original works, we also identify three recent systematic reviews which summarise additional methodological papers.
We also provide a representative overview of the methodological guidance regarding surrogate endpoints from HTA agencies worldwide.
In addition, we provide a representative picture of the actual consideration of surrogate endpoints in HTA reports from international and German HTA agencies. This representative survey makes it possible to understand the value surrogate endpoints have in the field of HTA.
6.2 Results of the report
In order to be considered valid and acceptable, a surrogate needs to fulfil several criteria. Favourable results from statistical validation approaches are therefore not sufficient to conclude that a surrogate endpoint is valid. Information on biological and pathophysiological factors is also required. In addition, the validity of a surrogate is to be seen as technology-specific. Whether a surrogate is able to capture the full effect of a technology depends on the mechanism of action of the technology in question, irrespective of whether a strong and consistent association between the surrogate and the relevant health outcome has been well established.
In summary, HTA agencies take a cautious position regarding the reliability of surrogate endpoints in HTA. Clinically relevant endpoints such as mortality and morbidity are preferred, since they allow a sound assessment of the effectiveness and safety of the interventions. The considerations and recommendations provided by HTA agencies regarding the use of surrogates are consistent with the discussions in the theoretical and methodological literature.
The quantitative analysis of the sample of HTA reports allows us to conclude that surrogate endpoints have not been prominently used in HTA reports. The results are similar for the international sample and for the German sample, indicating that the handling of surrogate endpoints in Germany is not more stringent than in the international context.
7. Conclusions
The validation of a surrogate endpoint requires extensive research, including RCT assessing clinically relevant endpoints. The validity of a surrogate parameter is technology-specific rather than disease-specific. Thus, even in the case of apparently similar technologies, it is necessary to validate the surrogate for every single technology (i. e. for every single active agent), in order to avoid false conclusions potentially leading to fatal consequences.
The use of surrogate endpoints in the assessment of the benefit of health technologies should still be viewed very critically.
