Abstract
Scientific background
Cardiovascular diseases are of enormous epidemiological and economic importance. To select persons with increased total cardiovascular risk for individually targeted (e.g. drug-based) preventive interventions, various risk prognosis instruments (equations, point scores and risk charts) have been derived from studies or databases. The transferability of these prognostic instruments to populations not examined in these data sources, as well as their comparability, is unclear.
Research questions
The evaluation addresses which instruments exist for the risk prediction of cardiovascular diseases, how transferable they are and how comparable they are.
Methods
A systematic literature search of medical electronic databases was performed in April 2008, covering publications from 2004 onwards, and was supplemented by a hand search. Publications on prognostic instruments for cardiovascular diseases, as well as publications addressing external validity and/or comparing prognostic instruments, were included in the evaluation.
Results
The systematic literature search yielded 734 hits. Three systematic reviews, 38 publications with descriptions of prognostic instruments and 29 publications with data on the validity of the prognosis instruments were identified.
Most risk prognosis instruments are based on the Framingham cohort of the USA. Only the PROCAM study is based entirely on the German reference population. Almost all prognostic instruments use the variables sex, age and smoking together with different parameters of lipid status and blood pressure. The prognosis instruments use different cardiovascular events as endpoints. The time span for predicted events is mostly ten years.
Data on the calibration of the prognosis instruments (the ratio of predicted to observed risk) are presented in nearly half of the validation studies, but in no study from Germany. Only individual studies report calibration values between 0.9 and 1.1. Many studies on the transferability of the prognosis instruments report discrimination values (correct differentiation of persons with different risk levels; best value 1.0) between 0.7 and 0.8, a few report values between 0.8 and 0.9, and none reports a value above 0.9. The studies addressing discrimination in the German population almost always find values between 0.7 and 0.8.
When the validity of different risk prognosis instruments is compared on the derivation and/or validation cohort of one of the compared instruments, that instrument tends to show better calibration and better discrimination. When the instruments are compared on other cohorts, the more recently derived Framingham instruments show better discrimination than previously derived instruments. No studies exist that compare different prognostic instruments in the German population.
Discussion
The geographic variance of cardiovascular morbidity and mortality is considered the most important factor limiting the transferability of the prognostic instruments. Appropriate recalibration is regarded as one approach to improving transferability.
Conclusions
The identified instruments for the risk prediction of cardiovascular diseases are insufficiently validated in the German population. Their use can lead to incorrect risk estimates for individual persons. Therefore, the existing prognostic instruments should be used for informed decision-making and therapy selection in Germany only with caution.
Executive Summary
1. Scientific background
Cardiovascular diseases caused 358,684 deaths in Germany in 2007 and are of enormous epidemiological importance. They are also highly relevant from a health-economic perspective: in 2006, the costs of cardiovascular diseases amounted to nearly 35 billion euros.
It is assumed that cardiovascular morbidity and mortality can be modified through various preventive interventions. In addition to population-targeted preventive interventions, individually targeted (e.g. drug-based) preventive interventions are usually indicated in persons with an increased total risk. To select persons with an increased total cardiovascular risk, so-called risk prognosis instruments are constructed and used.
Risk prognosis instruments in the form of equations, point scores and risk charts are constructed through statistical analysis of data derived from populations. These instruments make it possible to estimate the risk of a cardiovascular event and/or the probability of surviving without this event as a function of the values of the risk factors. Risk prognosis instruments may also be represented graphically, for example as nomograms.
A number of different risk prognosis instruments exist. However, these instruments are based on different primary studies or databases, which usually do not include the German population. Both the transferability of these prognostic instruments to populations not examined in these data sources and the comparability of their validity are in question.
2. Research questions
The evaluation addresses the following questions:
Which instruments for the risk prediction of cardiovascular diseases are available?
What is the evidence for the transferability of the available risk prognosis instruments for cardiovascular diseases to populations not involved in the prognostic study?
To what extent are the available methods for risk prediction of cardiovascular diseases comparable?
3. Methods
Information sources and search strategy
A literature search was performed in the most important medical electronic databases (MEDLINE, EMBASE etc.) in April 2008. The search was restricted to publications from 2004 onwards and to the languages German and English. In addition, an extended hand search was performed to identify publications on prognostic instruments for cardiovascular diseases as well as publications on the external validity of different prognostic instruments.
Inclusion and exclusion criteria
Publications on prognostic instruments for cardiovascular diseases in persons without previous cardiovascular disease, as well as publications addressing external validation and/or the comparison of different prognostic instruments, were included in the evaluation. Instruments focusing on specific patient risk groups were not considered. Discrimination and calibration were used as accuracy criteria.
Data analysis and information synthesis
Systematic reviews and primary publications on prognostic instruments, as well as publications evaluating the validity and the comparability of different prognostic instruments, were considered as information sources. The information synthesis was performed qualitatively.
4. Results
Results of the literature search
The systematic literature search yielded 734 hits. Of these, 116 publications were selected for full-text review and examined for inclusion in the evaluation. Three systematic reviews, eight publications with descriptions of prognostic instruments and 13 publications addressing the validity of the prognostic instruments were identified through the literature search. The hand search in the reference lists of the relevant articles revealed 30 further publications with descriptions of prognostic instruments and 16 further publications addressing the validity of the prognostic instruments.
Risk prognosis instruments
Most risk prognosis instruments are based on the Framingham cohort of the USA; almost all others are based on European cohorts, mostly British or Italian. Only the PROCAM study is based entirely on the German reference population. Two other instruments, the SCORE charts for Germany and the WHO/ISH charts for the European risk region EUR-A, are partially based on this population. Population-based, patient-based and occupational cohorts, in some studies comprising only men or only women, were used as reference populations for the derivation of the prognostic instruments.
Almost all prognostic instruments use the variables sex, age and smoking together with one or more parameters of lipid status and blood pressure. Many prognostic instruments also use diabetes mellitus and/or blood glucose for the risk calculation; several use left ventricular hypertrophy on the electrocardiogram (ECG), body mass index or antihypertensive therapy; and some use other variables. The multinational studies additionally stratify their prognostic instruments by region. Most prognostic instruments use only five to six prognostic variables.
The most important endpoints are death from coronary heart disease, death from cardiovascular disease, coronary heart disease and coronary event (death, myocardial infarction, in some studies also angina pectoris and/or coronary revascularization), cerebrovascular event (stroke, in some studies also transient ischemic attack), and cardiovascular disease and cardiovascular event (coronary event, cerebrovascular event, in some studies also intermittent claudication and/or heart failure). The time span for predicted events is mostly ten years.
To construct the scores, one of three statistical regression models (logistic, Weibull or Cox regression) is used to analyze the data of the reference population. A stepwise regression procedure is used in all cases.
External validity of the risk prediction instruments of cardiovascular diseases
Data on the calibration of the prognostic instruments (the ratio of predicted to observed risk) are presented in nearly half of the studies. Only a single study shows a calibration value between 0.9 and 1.1. None of the three studies from Germany reports data on the calibration of the prognostic instruments.
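As a minimal sketch of the calibration measure described above, the ratio of predicted to observed risk can be computed as the mean predicted risk divided by the observed event proportion. The function name and the toy cohort are illustrative assumptions and are not taken from the evaluated studies.

```python
def calibration_ratio(predicted_risks, observed_events):
    """Ratio of mean predicted risk to the observed event proportion.

    predicted_risks: predicted probabilities (0..1), one per person.
    observed_events: 1 if the event occurred within the prediction
                     horizon, otherwise 0.
    """
    if not predicted_risks or len(predicted_risks) != len(observed_events):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_predicted = sum(predicted_risks) / len(predicted_risks)
    observed_rate = sum(observed_events) / len(observed_events)
    if observed_rate == 0:
        raise ValueError("no observed events; the ratio is undefined")
    return mean_predicted / observed_rate


# Hypothetical toy cohort (assumption): 10 persons, 2 observed events.
predicted = [0.10, 0.20, 0.30, 0.40, 0.30, 0.30, 0.50, 0.20, 0.30, 0.40]
observed = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
ratio = calibration_ratio(predicted, observed)
print(f"calibration ratio: {ratio:.2f}")
```

In this toy cohort the mean predicted risk exceeds the observed event proportion, so the ratio lies above the 0.9 to 1.1 range mentioned in the text, which would indicate overestimation of risk.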
Many studies on the transferability of the prognostic instruments report an AUC value for discrimination (AUC = area under the curve; a measure of the correct differentiation of persons with different risk levels; best value 1.0) between 0.7 and 0.8 for different prognostic instruments (sufficient discrimination), a few studies report an AUC value between 0.8 and 0.9 (good discrimination), and no study reports an AUC value above 0.9 (excellent discrimination).
Of the studies addressing the discrimination of prognostic instruments (different Framingham equations) in the German population, all but one find AUC values between 0.73 and 0.78 (sufficient discrimination). Studies evaluating the external validity of newer prognostic instruments derived from the German population, such as PROCAM (2007) and SCORE-Germany, are lacking.
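The AUC can be illustrated with a short sketch that counts, over all pairs of one person with and one person without an event, how often the person with the event received the higher predicted risk. The function names, the handling of values exactly on a category boundary and the toy data are assumptions made for illustration only.

```python
def auc(predicted_risks, observed_events):
    """Share of (event, non-event) pairs in which the person with the
    event has the higher predicted risk; ties count as 0.5."""
    cases = [p for p, y in zip(predicted_risks, observed_events) if y == 1]
    controls = [p for p, y in zip(predicted_risks, observed_events) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def discrimination_label(auc_value):
    """Labels follow the ranges named in the text; the treatment of
    exact boundary values is an assumption of this sketch."""
    if auc_value > 0.9:
        return "excellent"
    if auc_value >= 0.8:
        return "good"
    if auc_value >= 0.7:
        return "sufficient"
    return "below sufficient"


# Hypothetical toy data (assumption): 8 persons, 3 events.
predicted = [0.05, 0.10, 0.20, 0.30, 0.15, 0.25, 0.40, 0.10]
observed = [0, 0, 1, 0, 0, 1, 1, 0]
value = auc(predicted, observed)
print(f"AUC = {value:.3f} ({discrimination_label(value)})")
```

The pairwise count is quadratic in cohort size, which is acceptable for illustration; rank-based formulas give the same value more efficiently on large cohorts.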
Comparison of the validity of different risk prediction instruments of cardiovascular disease
The comparison of the validity of different risk prognosis instruments on the derivation cohort of one of these instruments (accuracy) showed a trend towards better calibration and better discrimination for the instrument calculated on the basis of that derivation cohort.
The comparison of the validity of different risk prognosis instruments on the validation cohort of one of these instruments (reproducibility) found a trend towards better calibration and better discrimination for the instrument calculated from the data of the corresponding derivation cohort.
When the prognostic instruments were compared on other populations (transferability), the newly derived Framingham instruments showed slightly better discrimination than previously calculated instruments. The value of the German prognostic instrument PROCAM 2002 relative to the Framingham instruments for the European population is unclear. No studies exist that compare different prognostic instruments in the German population.
5. Discussion
Literature search
Despite an extended search strategy in the most important medical databases, relevant articles on the topic of the report may have been missed, because literature searches for prognostic studies are complex.
Risk prognosis instruments
In many derivation studies of risk prognosis instruments, it is questionable whether the study participants are representative of the corresponding total population. The reference populations in the studies are not homogeneous with regard to disease stages.
The high number of rarely used variables in the risk prognosis instruments suggests that the relevance of these variables for risk prognosis has not been clearly established.
Endpoints comprising clinical events are more subjective than mortality alone; however, they are of clearly higher clinical and social importance for the individual.
Cox regression should be preferred for the derivation, because it allows the risk to be calculated for different follow-up periods and makes it relatively simple to adapt the model to other populations.
Although transforming a risk equation into a point score and then into a risk chart reduces precision, a risk chart illustrates the current and the target risk of a person better than a value determined directly from the risk equation.
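For illustration, a Cox-based risk equation is commonly written in the following general form. This is the standard textbook form rather than the equation of any specific instrument evaluated here, and the symbols are notation introduced for this sketch.

```latex
\[
  \hat{p}(t) \;=\; 1 - S_0(t)^{\exp\left(\sum_{i=1}^{k} \beta_i \,(x_i - \bar{x}_i)\right)}
\]
```

Here $\hat{p}(t)$ is the predicted risk of an event within follow-up time $t$, $S_0(t)$ is the baseline event-free survival at time $t$ for a person with average risk factor values, $\beta_i$ are the regression coefficients, $x_i$ are the person's risk factor values and $\bar{x}_i$ are the means in the reference population. A different follow-up period changes only $S_0(t)$, and adaptation to another population amounts to replacing $S_0(t)$ and $\bar{x}_i$ while keeping the coefficients $\beta_i$.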
External validity of the risk prediction instruments of cardiovascular disease
Different components of transferability, mostly geographic and historic as well as methodological and disease-spectrum components, were evaluated in the external validation studies presented. Geographic transferability appears to be the most important of these components because of the substantial differences in cardiovascular morbidity and mortality between countries and regions.
The populations underlying the prognostic instruments in most studies were recruited many years ago; the prognostic instruments derived from these populations may therefore not be transferable to present-day populations.
The slightly different measurement methods and disease spectra in the different studies are not expected to limit the transferability of the prognostic instruments to a relevant extent.
No exact threshold for good or poor calibration has yet been clearly defined in the literature. To limit the problem of poor calibration, the average values of the risk factors and the average event rates of the reference population used in the prognostic instrument should be replaced in the equations by the corresponding parameters of the population for which predictions are made (recalibration).
Nor does the literature state an exact and plausible threshold for good or poor discrimination of the prognostic instruments. The classification into excellent, good, sufficient, weak and very weak discrimination is subjective. Moreover, it is recommended that discrimination be evaluated only after the instruments have been recalibrated for the corresponding population.
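The recalibration step described above can be sketched for a Cox-type risk equation: the regression coefficients are kept, while the risk factor means and the baseline event-free survival are replaced by those of the target population. All coefficients, means and survival values below are hypothetical numbers chosen for illustration; they do not come from Framingham, PROCAM, SCORE or any other published instrument.

```python
import math


def cox_risk(values, coefficients, means, baseline_survival):
    """Risk = 1 - S0 ** exp(sum(beta_i * (x_i - mean_i))).

    baseline_survival (S0) is the event-free survival over the
    prediction horizon for a person with average risk factor values.
    """
    linear_predictor = sum(
        coefficients[name] * (values[name] - means[name])
        for name in coefficients
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical coefficients (assumption), kept unchanged by recalibration.
coefficients = {"age": 0.05, "systolic_bp": 0.02, "smoker": 0.6}

# Hypothetical derivation population: means and 10-year baseline survival.
original_means = {"age": 50.0, "systolic_bp": 130.0, "smoker": 0.40}
original_s0 = 0.90

# Hypothetical target population with a lower average event rate.
target_means = {"age": 52.0, "systolic_bp": 135.0, "smoker": 0.30}
target_s0 = 0.95

person = {"age": 60.0, "systolic_bp": 150.0, "smoker": 1.0}

original_risk = cox_risk(person, coefficients, original_means, original_s0)
recalibrated_risk = cox_risk(person, coefficients, target_means, target_s0)
print(f"original: {original_risk:.3f}, recalibrated: {recalibrated_risk:.3f}")
```

In this sketch the same person receives a lower predicted risk after recalibration, because the hypothetical target population has a lower average event rate. A person with exactly average risk factor values is assigned the average event rate of the target population, 1 - S0.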
Comparison of the validity of different risk prediction instruments of cardiovascular disease
The higher validity of risk prognosis instruments on their derivation cohort than on their validation cohort, and especially than on other populations, may be explained by the considerable geographic variance in cardiovascular morbidity and mortality.
Because studies comparing different prognostic instruments in the German population are lacking, no statements can be made on their comparability.
6. Conclusions
The identified instruments for the risk prediction of cardiovascular disease are not sufficiently validated in the German population; their use can lead to incorrect risk estimates for individual persons. Therefore, the existing prognostic instruments should be used for informed decision-making and therapy selection in Germany only with caution. Studies on the external validation of the prognostic instruments and on the comparison of different prognostic instruments in the German population (if possible after prior recalibration), as well as randomized studies on the therapeutic consequences and the clinical benefit of the prognostic instruments, are needed.