Pharmacoeconomics. 2025 Jul 28;43(10):1171–1178. doi: 10.1007/s40273-025-01520-0

Challenges in Modelling the Cost Effectiveness of Pharmacotherapies for Obesity

Becky Pennington 1, Ewen Cummins 2, Albany Chandler 3, James Fotheringham 1,4
PMCID: PMC12450117  PMID: 40717166

Abstract

The cost effectiveness of pharmacotherapies for obesity (such as semaglutide, tirzepatide, liraglutide, and newer agents) is increasingly being appraised by health technology assessment (HTA) bodies. Modelling is required to extrapolate weight change observed over relatively short clinical trial durations to long-term weight loss and associated cardio-metabolic outcomes and costs. Extrapolation is a common issue in HTA, but there is a unique challenge for anti-obesity drugs because of the number of interacting uncertainties. This is a particular concern given the substantial eligible population sizes and associated high financial decision risk of providing lifetime treatment. We describe four key challenges in modelling pharmacotherapies for obesity: (1) modelling long-term body mass index (BMI) trajectories with and without obesity pharmacotherapy, (2) modelling time on treatment, (3) using risk equations to link changes in BMI to clinical outcomes, and (4) modelling clinical outcomes not (solely) related to BMI changes. We discuss each of these challenges and the impact they have had in global HTA appraisals for pharmacotherapies. We speculate how these challenges relating to short-term clinical trials could be overcome to more robustly predict long-term outcomes and the role that observational data may play. As clinical trial and real-world evidence for technologies for obesity evolves, analysts and decision-makers need to determine which evidence sources are most appropriate and how they should be combined.

Key Points

Modelling to extrapolate weight loss in clinical trials for pharmacotherapies into long-term outcomes is uncertain and associated with a high decision risk because of the substantial size of the eligible populations and duration of treatment.
Predicting long-term body mass index trajectories on and off treatment and linking these to clinical outcomes have created challenges for the health technology assessment of obesity pharmacotherapies globally.
As wider use of pharmacotherapy makes trials challenging to perform, the carefully considered combination and analysis of observational data with clinical trial data will play an important role in overcoming these challenges.

Introduction

As more pharmacological therapies become available to treat overweight and obesity, there is a growing need to evaluate their cost effectiveness. Since obesity is a chronic condition [1] and clinical trials of anti-obesity agents are relatively short in duration, decision modelling is required to evaluate the costs and effects of these technologies over a (lifetime) horizon.

Obesity, defined as a body mass index (BMI) ≥30 kg/m², affects 26% of adults in England [2] and 16% globally [1], so the eligible patient population for anti-obesity agents is substantial. The UK National Institute for Health and Care Excellence (NICE) made a positive recommendation for tirzepatide in a population whose criteria include a BMI ≥35, and estimated that population at 3.4 million people [3]. The associated decision risk is therefore particularly large in this indication, so it is imperative that uncertainties are understood and minimised.

All weight loss pharmacotherapies reduce food intake and delay gastric emptying, which leads to the sensation of satiety [4]. Modes of action are illustrated elsewhere [4] and include glucagon-like peptide-1 (GLP-1) receptor agonists, such as liraglutide and semaglutide, which increase insulin secretion and suppress glucagon secretion [5, 6]. Tirzepatide adds glucose-dependent insulinotropic polypeptide receptor agonism to GLP-1 agonism to increase insulin sensitivity [7]. Other technologies may induce weight loss and alter glycaemia through glucagon agonism (adding increased energy expenditure) or amylin agonism (improving glucose metabolism by inhibiting glucagon secretion). Newer agents such as survodutide and mazdutide combine GLP-1 agonism with glucagon agonism, and retatrutide combines both these with glucose-dependent insulinotropic polypeptide agonism. Amylin agonists such as pramlintide and cagrilintide work in isolation [4].

Economic evaluations in obesity have historically focussed on public health or behavioural interventions (such as diet and exercise) [8]. However, economic evaluations of pharmacotherapies for obesity are becoming increasingly common: a 2023 systematic review identified 18 studies that evaluated the cost effectiveness of anti-obesity drugs [9]. Given the potential budget impact, there is an imperative for health technology assessment (HTA) bodies globally to appraise these therapies, including NICE [3, 10], the US-based Institute for Clinical and Economic Review (ICER) [11], the Swedish Dental and Pharmaceutical Benefits Agency (Tandvårds- och läkemedelsförmånsverket [TLV]) [12], the Dutch Healthcare Institute (Zorginstituut Nederland [ZIN]) [13], the Australian Pharmaceutical Benefits Advisory Committee [14], the Canadian Agency for Drugs and Technologies in Health [15], and the New Zealand Pharmacology and Therapeutics Advisory Committee [16]. We searched these agencies for published evaluations of liraglutide, semaglutide, and tirzepatide and reviewed the associated documentation to identify the key methodological challenges in analysing their cost effectiveness. We identified five analyses, four of which used cohort state-transition models [10, 12, 13] and one of which used an individual patient simulation [3]. The modelled states/events are summarised in Table 1.

Table 1.

Summary of included clinical events/states

Included TLV ZIN ICER NICE (semaglutide) NICE (tirzepatide)
Pre-diabetes Y Y Y Y Y
T2DM Y Y Y Y Y
ACS Y Y Y
Stroke Y Y Y Y Y
MI Y Y Y
Unstable angina Y Y
Ischaemic attack Y
Heart failure Y
Other CVDa Y
Cancer Y Y
Sleep apnoea Y Y Y
Bariatric surgery Y Y Y
OA/ knee surgery Y Y Y Y
NAFLD Y

ACS, acute coronary syndrome; CVD, cardiovascular disease; ICER, Institute for Clinical and Economic Review; MI, myocardial infarction; NAFLD, non-alcoholic fatty liver disease; NICE, National Institute for Health and Care Excellence; OA, osteoarthritis; TLV, Tandvårds- och läkemedelsförmånsverket; T2DM, type 2 diabetes mellitus; ZIN, Zorginstituut Nederland

a Other CVD included peripheral artery disease, angina, and transient ischemic attack.

In this opinion paper, we describe the challenges associated with economic evaluation of pharmacotherapies, discuss the implications, and propose how these challenges can be addressed in future research.

Key challenges

The primary endpoints of clinical trials for anti-obesity agents relate to weight loss, and secondary endpoints include blood pressure, glycated haemoglobin (HbA1c), and lipid levels [17, 18]. To predict long-term costs and quality-adjusted life-years, these surrogate endpoints (primarily BMI) must first be extrapolated beyond the trial period and then translated into clinical outcomes and mortality. This poses four inter-related challenges.

  1. Modelling long-term BMI trajectories while receiving pharmacotherapies.

  2. Modelling time on treatment for pharmacotherapies, and the impact of cessation.

  3. Using surrogate relationships to translate the effect of pharmacotherapies on interim endpoints to an impact on clinical events.

  4. Modelling effects of pharmacotherapies on clinical events that are not solely related to BMI changes.

The four challenges, the associated uncertainties, and the assumptions used to address them in the economic evaluations are summarised in Table 2 and discussed in detail in the following sections.

Table 2.

Summary of challenges, associated uncertainties, and assumptions employed in evaluations

Challenge: Modelling long-term BMI trajectories while receiving pharmacotherapies
Uncertainties: Limited duration of follow-up from trials for intervention and standard care
Assumptions used: Constant absolute difference in BMI after trial ends; natural history increase in BMI for standard care and no change for treatment arm; waning of treatment effect (people on intervention eventually start to gain weight)

Challenge: Modelling time on treatment for pharmacotherapies, and the impact of cessation
Uncertainties: Limited follow-up for patients who discontinue for various reasons
Assumptions used: Assume time to weight regain of 1–4 years after stopping treatment; use of STEP-1 semaglutide extension study; convergence of both arms to natural history or baseline

Challenge: Using surrogate relationships to translate the effect of pharmacotherapies on interim endpoints to an impact on clinical events
Uncertainties: Clinical trials not long enough to capture clinical events; relationship between surrogate markers and clinical events on pharmacotherapies unclear
Assumptions used: Risk equations from different sources (primarily USA and UK); assuming weight loss reverses risk, or residual effect; annualising multi-year risks

Challenge: Modelling effects of pharmacotherapies on clinical events that are not solely related to BMI changes
Uncertainties: Pharmacotherapies for obesity may affect other outcomes
Assumptions used: Including effects on diabetes and cardiovascular events; including other mortality effects

BMI, body mass index

Modelling long-term BMI trajectories on pharmacotherapies for obesity

The limited duration of follow-up data available at the time of appraisal (typically 68–72 weeks) [3, 10–14, 19] has required decision-makers to make assumptions about BMI trajectories on standard care (e.g. diet and exercise) and pharmacological treatment beyond the trial period. Options include assuming a constant absolute difference in BMI beyond the period for which data are available [3, 13, 14] or assuming a natural history increase in BMI in the standard care arm, with no change in BMI in the treatment arm. When taking into account the recognised natural history increase in BMI [20] for a comparator arm in a model (most commonly diet and exercise [3, 10–12, 14, 19]), assuming sustained weight loss on treatment after a chosen timepoint results in a relative increase in treatment effect over time. However, it could be considered pessimistic to assume a waning effect while still on treatment in the absence of a mechanistic explanation for this. The lack of long-term evidence to support submissions assuming a sustained, and in some cases increasing, relative treatment benefit on BMI has led decision-makers in HTAs for anti-obesity medicines to be conservative around assumptions of long-term effectiveness. For example, expert testimony in the TLV’s evaluation of liraglutide suggested that people will eventually start to regain weight on treatment [12].
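The extrapolation options described above can be made concrete with a small sketch. The Python below is illustrative only and is not taken from any of the appraisals: the function name, the linear natural history increase, the linear waning, and all numeric values are assumptions. It shows why holding BMI constant on treatment while the comparator follows a rising natural history trajectory produces a between-arm difference that grows over time.

```python
def bmi_trajectories(comparator_bmi, trial_difference, natural_gain, years,
                     scenario, wane_years=10):
    """Project annual BMI for both arms after the trial period ends.

    All inputs are hypothetical. `comparator_bmi` is standard-care BMI at the
    end of the trial, `trial_difference` is the observed between-arm BMI
    difference, and `natural_gain` is an assumed linear annual BMI increase.
    """
    standard, treatment = [], []
    for t in range(years + 1):
        sc = comparator_bmi + natural_gain * t
        if scenario == "constant_difference":
            # Treatment arm tracks the comparator, offset by the trial difference.
            tx = sc - trial_difference
        elif scenario == "no_change_on_treatment":
            # Treatment arm stays at its end-of-trial BMI.
            tx = comparator_bmi - trial_difference
        elif scenario == "waning":
            # Assumed linear loss of the treatment effect over `wane_years`.
            remaining = max(0.0, 1.0 - t / wane_years)
            tx = sc - trial_difference * remaining
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        standard.append(sc)
        treatment.append(tx)
    return standard, treatment


# Hypothetical inputs: BMI 38 on standard care, 5-point difference at trial
# end, 0.1 BMI points of natural history gain per year, 10-year projection.
for name in ("constant_difference", "no_change_on_treatment", "waning"):
    sc, tx = bmi_trajectories(38.0, 5.0, 0.1, 10, name)
    print(name, round(sc[-1] - tx[-1], 2))
```

Under these assumed inputs, the between-arm difference at 10 years stays at the trial value in the first scenario, exceeds it in the second, and falls to zero in the third, which is the pattern that makes the choice of assumption influential.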

Anti-obesity medicines have been licensed for use alongside a reduced-calorie diet and increased physical activity [21–24]. However, unknown real-world adherence to these diet and exercise interventions may affect the effectiveness of these medicines.

If patients are to take pharmacotherapies as a long-term treatment, better evidence on their long-term effectiveness is required to reduce the decision uncertainty. To understand the relative effectiveness of the pharmacotherapies compared with standard care, the long-term effectiveness of diet and exercise, which may be used alongside pharmacotherapies, must also be understood.

Modelling time on treatment for pharmacotherapies for obesity and the impact of cessation

Patients may discontinue obesity pharmacotherapies due to non-response (e.g., if <5% of initial body weight is lost within 6 months [3, 10]), adverse events, or limitations on treatment duration. The NICE appraisal of semaglutide assumed that treatment was limited to 2 years because of the restrictions in specialist weight management services and because of the lack of evidence for longer-term use. However, this view is evolving: there was no stopping rule in the NICE appraisal of tirzepatide [3]. The TLV also questioned the 2-year treatment duration of semaglutide [12]. It is generally accepted that, on average across a cohort of people taking anti-obesity pharmacotherapies, BMI will increase when treatment is stopped [3, 10, 11, 14, 19, 25], supported by data on withdrawal of semaglutide in STEP-4 [26] and the STEP-1 extension [27]. The choice of how to model weight regain has had a large impact on cost-effectiveness results: in the TLV’s evaluation of liraglutide, for example, results were sensitive to whether weight returned gradually over 1 year or 4 years after stopping treatment [12]. When NICE evaluated semaglutide, the STEP-1 extension trial data were not available, so they relied on the relatively short time on treatment in STEP-4 (20 weeks) and questioned whether the corresponding weight regain rate could be generalised to weight regain after stopping anti-obesity treatments after a longer time on treatment [10]. The STEP-1 extension has since indicated that two-thirds of lost weight is regained 1 year after stopping semaglutide (after 67 weeks on treatment). This longer-term evidence from STEP-1 was used to justify the modelled weight regain rate in the more recent evaluation of tirzepatide by NICE [3] and semaglutide by the ZIN [13], showing how beneficial further data collection in this area has been for HTA decision-makers.

The issue of weight regain after stopping an intervention is not unique to pharmacotherapies: weight regain is commonly observed after stopping behavioural weight management programmes [28, 29], and models may assume both arms converge to baseline or natural history trajectories (the ZIN noted that the weight at the end of the catch-up period was assumed equal to the baseline weight instead of natural history trajectories [13]). Although these data can inform long-term trajectories on standard care, uncertainties remain as to how applicable evidence for weight loss (and subsequent regain) from weight management programmes is to weight loss from pharmacotherapies. Furthermore, the interaction between time on treatment and weight regain rate after stopping, and the generalisability of evidence between different anti-obesity drugs, remain uncertain. The mechanisms of action that underpin the longer-term effects of anti-obesity medicines are not fully understood [30].
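To illustrate why the assumed regain period matters, the following Python sketch models a gradual return in BMI after stopping treatment. It is a simplification under stated assumptions and not the method of any appraisal: the linear catch-up, the function name `bmi_after_stopping`, and the example values are hypothetical. The target can be set to either the baseline BMI or a natural history value, reflecting the two convergence assumptions discussed above.

```python
def bmi_after_stopping(bmi_at_stop, target_bmi, regain_years, years_since_stop):
    """BMI after treatment cessation under an assumed linear catch-up.

    `target_bmi` is the BMI the cohort converges to (baseline or a natural
    history value); `regain_years` is the assumed time taken to reach it.
    """
    if regain_years <= 0:
        raise ValueError("regain_years must be positive")
    # Fraction of the gap to the target that has been closed so far.
    fraction = min(max(years_since_stop, 0.0) / regain_years, 1.0)
    return bmi_at_stop + fraction * (target_bmi - bmi_at_stop)


# Hypothetical cohort: BMI 33 when treatment stops, converging to BMI 38.
for regain_years in (1, 4):
    path = [bmi_after_stopping(33.0, 38.0, regain_years, t) for t in range(5)]
    print(regain_years, path)
```

With a 1-year regain period the modelled benefit has disappeared one year after stopping, whereas with a 4-year period three-quarters of the BMI difference remains at that point, so downstream event risks and costs differ accordingly.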

Using surrogate relationships to translate the effect of pharmacotherapies on BMI to an impact on clinical events

Many of the HTA appraisals used risk equations to link patient characteristics and surrogate markers to clinical events [3, 10, 11, 13, 14]. The surrogate markers were principally BMI, but also HbA1c, blood pressure, and lipids, with risk estimated separately for people with normal glycaemia and pre-diabetes at baseline. The clinical events generally included death, cardiovascular events (including cardiovascular disease [CVD], myocardial infarction, stroke, and heart failure), and type 2 diabetes mellitus.

The risk equations used in most of the appraisals were derived from US [31, 32] or UK [33–35] datasets. The development of the original UK-specific QRisk equations in 2007 was partly in response to concerns that the Framingham risk equations developed at the peak incidence of CVD in America may overestimate the risk in Europe, and that the predominantly white population may underestimate the risk in more diverse populations. The QRisk developers considered that it was important to include risk factors related to social deprivation, BMI, family history, and current treatments [36]. In the QRisk3 update, the developers noted the complexities arising from considering statin prescriptions [33]. This suggests that there are concerns regarding the transferability of risk equations between different populations when considering geographical settings, deprivation, and currently prescribed medications. The ICER appraisal used Framingham [31] and Edelman et al. [32] US risk equations, whereas the NICE and ZIN appraisals typically used QDiabetes [34], UKPDS [35], and QRisk3 [33], with Framingham for recurrent cardiovascular events (the TLV appraisal of liraglutide used QDiabetes, QRisk3, and Swedish studies for the risk of developing CVD events for people with diabetes [37, 38]). The ICER appraisal noted that they were unable to consider subpopulations with larger potential benefits because of the lack of data [39]. In addition to questioning the validity of using risk equations from different countries (the ZIN appraisal of semaglutide noted that validation was lacking in Dutch models that included BMI [13]), potential issues arise when considering whether GLP-1 agonists are differentially used across levels of deprivation. There may also be concerns with the age of the data used in the risk equations: QRisk3 uses data from 1998 to 2015, and Framingham uses data from a maximum of 12 years of follow-up since 1968–1987.

A study of modelling approaches in obesity found that risk equations gave better overall predictions of clinical events than other methods [40]. Some of the HTA appraisals noted that risk equations were the most appropriate approach [3, 10, 11], whereas others noted that the modelled benefits were highly uncertain [3, 14] or not transparent [13]. The NICE appraisal for semaglutide concluded that there was no alternative to using risk equations but noted several limitations with their use in this context [10]. The TLV also expressed concerns that there was no evidence that weight loss during 1 year of treatment with semaglutide would lead to a long-term reduction in related complications after treatment ended [41].

The UK QRisk3 and QDiabetes equations provide estimates of the 10-year risk of a first recorded diagnosis of CVD and diabetes, respectively, and were estimated using Cox proportional hazards models given patients’ baseline risk factors, including BMI [36]. The equations assume that a patient’s risk factors evolve at the same rate as those of the sample of the general public with the same baseline characteristics [33].
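For readers less familiar with how a Cox proportional hazards model yields a 10-year risk, the sketch below shows the generic calculation: risk equals one minus the baseline survival at 10 years raised to the power of the exponentiated linear predictor. The coefficient, baseline survival, and covariate means here are invented for illustration and are not the published QRisk3 or QDiabetes values, which also involve more covariates and transformations than this example includes.

```python
import math


def cox_ten_year_risk(baseline_survival, coefficients, covariates, means):
    """Generic 10-year risk from a Cox proportional hazards model.

    risk = 1 - S0(10) ** exp(sum(beta_k * (x_k - mean_k)))
    All parameter values passed in are hypothetical placeholders.
    """
    linear_predictor = sum(
        coefficients[name] * (covariates[name] - means[name])
        for name in coefficients
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical single-covariate example: log hazard ratio of 0.05 per BMI unit.
beta = {"bmi": 0.05}
mean = {"bmi": 30.0}
risk_at_38 = cox_ten_year_risk(0.95, beta, {"bmi": 38.0}, mean)
risk_at_33 = cox_ten_year_risk(0.95, beta, {"bmi": 33.0}, mean)
print(round(risk_at_38, 4), round(risk_at_33, 4))
```

Because only the baseline covariate values enter the calculation, a sketch like this makes visible the assumption noted above: the predicted risk depends on BMI at a single timepoint and not on how that BMI was reached or how it will evolve.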

As noted by ICER, these equations may have limitations when attempting to predict the impact of pharmacotherapies for weight loss [39]. Treating people with obesity with pharmacotherapies such as GLP-1 receptor agonists can lead to radically different evolutions of risk factors such as BMI compared with the general public, so applying the same risk equations may not be valid [18].

Applying the same risk equations assumes that the effects of having had obesity are entirely reversed, but evidence is emerging that those with substantial weight loss may not have the same risks of events as those who have always been stable at the lower weight. For example, analysis of intentional weight loss in a large UK primary care data set suggested that the risks of heart failure, atrial fibrillation, hip or knee osteoarthritis, and sleep apnoea remained statistically significantly elevated in the weight loss group compared with those who had not previously had obesity [42]. The NICE committee for the tirzepatide appraisal considered that these studies showed it is reasonable to assume some long-term residual impact of having previously had a higher BMI and noted that this was likely to make the GLP-1 agonists less cost effective [3].

A further study found that the benefit of weight loss on obesity-related outcomes may depend on the baseline BMI. For instance, for heart failure and atrial fibrillation, the relative effect of weight loss was greater for those with a higher baseline BMI, whereas for type 2 diabetes mellitus and sleep apnoea the relative effect was lower for those with a higher baseline BMI, and for other complications such as venous thromboembolism and chronic kidney disease (CKD), there was no obvious relationship between baseline BMI and risk reduction [43].

An additional issue arises in annualizing the 10-year risk estimates from QDiabetes and QRisk3 for use in economic models with 1-year cycles. Modelling typically updates the risk factors each annual model cycle, re-estimates the 10-year risk, annualizes this 10-year risk and applies this within the model cycle. Since the risk factors typically worsen with age [3], the annualized risk estimates increase over the 10-year period and, when compounded, cause the modelled 10-year risk to be greater than the original 10-year risk. Overestimation can have a particularly large impact where an interim event has been simulated, for example a patient develops CVD, which then feeds into subsequent QDiabetes risk estimates.
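The compounding effect described above can be reproduced with a few lines of Python. This is a sketch under assumptions: the conversion assumes a constant hazard within each 10-year window, which is one common way to annualise a multi-year risk and may not match the method used in any particular appraisal, and the rising sequence of 10-year risks is hypothetical.

```python
def annualise(ten_year_risk, horizon=10):
    """Convert a multi-year risk to an annual probability (constant hazard)."""
    return 1.0 - (1.0 - ten_year_risk) ** (1.0 / horizon)


def modelled_cumulative_risk(ten_year_risks):
    """Cumulative risk when the 10-year risk is re-estimated every cycle.

    `ten_year_risks[t]` is the 10-year risk re-estimated at the start of
    annual cycle t; each is annualised and applied for that cycle only.
    """
    event_free = 1.0
    for risk in ten_year_risks:
        event_free *= 1.0 - annualise(risk)
    return 1.0 - event_free


# If risk factors never change, the original 10-year risk is recovered.
constant = modelled_cumulative_risk([0.10] * 10)

# Hypothetical worsening with age: the re-estimated 10-year risk rises by
# one percentage point per cycle, from 10% to 19%.
rising = modelled_cumulative_risk([0.10 + 0.01 * t for t in range(10)])

print(round(constant, 4), round(rising, 4))
```

In this hypothetical example, the modelled 10-year cumulative risk with annually updated risk factors is noticeably higher than the 10% that the equation predicted at baseline, even though that baseline prediction already reflects how risk factors typically evolve in the source population.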

In an ideal world, the duration of clinical trials would be sufficiently long that the effectiveness of anti-obesity technologies on clinical outcomes could be directly observed. However, at present, decision-makers must choose between using short-term trial data specific to the anti-obesity technologies or longer-term risk equations not specific to these technologies (in some cases adapted to consider residual impacts of obesity) and consider how valid these are as longer-term data emerge [44].

Modelling effects of pharmacotherapies on clinical events that are not solely related to BMI changes

The same pharmacotherapies used to manage obesity may be approved for use in other indications, such as diabetes mellitus [5, 7] and cardiovascular events [5, 45]. HTA has so far considered these indications in isolation, with separate appraisals (e.g., NICE appraisal of tirzepatide for type 2 diabetes mellitus [46]). The appraisals in obesity have considered the development of diabetes, and this has introduced further uncertainty in modelling long-term effects on HbA1c in addition to BMI [14]. Furthermore, technologies with combinations of modes of action may have varying impacts in ways not captured by risk equations focussed on weight loss and/or glycaemic control alone. ICER noted that the risk equations may have limitations in predicting the impact of pharmacotherapies for weight loss where there are complex actions on the body [39]. Evidence for GLP-1 therapies in people with obesity (with and without diabetes) indicates they may impact specific non-weight-related endpoints [44, 47].

Building on observational data showing that changes in cardiovascular risk factors only modestly mediate the effects on kidney outcomes, authors of the trial of semaglutide in kidney disease stressed that improvements in kidney outcomes were unrelated to changes in body weight and postulated anti-inflammatory and anti-fibrotic properties [48]. Meanwhile, cardiovascular trial authors postulated that reductions in adipose tissue deposits lead to reduced atherosclerosis, myocardial dysfunction, and inflammation [44]. Obesity, diabetes, kidney disease, and CVD exist in a complex multi-directional relationship with each other, exacerbating uncertainty, especially when using methods such as risk equations, which assume cross-sectional unidirectional relationships.

Independently estimating the impact of changes in each component of cardiovascular-kidney-metabolic syndrome could inappropriately amplify therapeutic effects. ICER felt that their model implicitly addressed the influence of CKD, stating that most CKD results from diabetes or hypertension [39]. The NICE assessment of tirzepatide for obesity considered whether the mortality effects of events such as CVD should be individually modelled but concluded that this unnecessarily complicated the modelling, and the availability of general population standardized mortality ratios related to BMI was judged sufficient to capture the effects of weight loss upon mortality (subject to the caveat regarding residual risk of obesity) [3].

Analysis of long-term data is needed to consider how exposure to GLP-1 agonists and the weight loss or the improved glycaemic control they induce mediate cardiovascular and kidney outcomes. This could then inform the nature of the modelling of these events.

Discussion

In many ways, modelling to predict the clinical outcomes of pharmacotherapies for obesity has been done with the best available data and methods, but it is subject to such uncertainty that the challenges need to be understood and addressed through future research. The incorporation of sensitivity analyses to address parameter and structural uncertainty within health economic models has provided insight into the magnitude of the effect of different data sources and assumptions, demonstrating the value of such models in simulating long-term outcomes. Where this is done transparently and robustly, credible economic models can play a key role in aiding decision-makers in interpreting the impact of reimbursing anti-obesity therapies.

One obvious solution to the paucity of long-term evidence is longer-term follow-up of clinical trials. This would inform the long-term treatment effectiveness in terms of time on treatment and BMI trajectories. This could also collect information on clinical outcomes, but these would still be uncertain because of the limited follow-up duration and relatively small sample sizes (relative to the eligible population). As these technologies enter clinical practice, these trials are going to become more challenging to conduct when considering issues such as recruitment, attrition, and cross-over. Furthermore, the generalisability of evidence may present particular challenges in obesity when considering adherence to lifestyle changes.

A more pragmatic approach, therefore, may be to analyse real-world evidence (RWE). The development of risk equations demonstrates that it is possible (and has been for decades) to link patient characteristics, including BMI, HbA1c, and medications (in the case of statins), to clinical outcomes. A logical next step would be to extend this analysis to examine how the use of obesity pharmacotherapies affects clinical outcomes. Analysis considering the direct effect of pharmacotherapies on clinical outcomes may allow for the effect of GLP-1s beyond their effect on BMI, although this may require careful consideration (and pre-specification) of which outcomes are relevant. This may also allow analysis of repeated clinical events, rather than time to first event. Analysis of RWE may also provide greater evidence of the differential effectiveness of pharmacotherapies in subpopulations. Specifically, it may address concerns regarding underserved populations or those less likely or ineligible to enrol into clinical trials.

We are not alone in suggesting the use of RWE. ICER’s assessment acknowledged the need for studies examining long-term outcomes of weight loss medications in individuals without diabetes and recommended that manufacturers should initiate long-term studies on the benefits and harms of these treatments and use RWE to generate relative effectiveness of the different treatments [39].

Generating and using RWE has its own challenges, particularly relating to integrity, data quality and relevance, risk of bias, and confounding [49]. Efforts must be made to address these as this evidence plays an increasingly important role in modelling and decision-making (both in the application to obesity and more broadly).

Where new evidence is generated, it should be incorporated into re-evaluation of these treatments. ‘Technology management’ [50] can be used to understand whether treatments deliver value or how they can be better targeted to maximise cost effectiveness after reimbursement. NICE plans to review guidance on tirzepatide within 3 years, and notes that this will include RWE, with the use of routinely collected data to avoid unnecessary burden [3]. Guidance on using electronic health records for HTA may be particularly helpful [39] alongside RWE frameworks [51]. We have highlighted the key areas of uncertainty where analysis of RWE will be particularly helpful.

Conclusion

Although modelling the effects of pharmacotherapies on obesity is challenging, the decision risk is too large to ignore. Future research and ongoing re-evaluation, including analysis of real-world data, are essential to ensure decisions can be revisited, uncertainties minimised, and population health maximized.

Declarations

Funding

The authors did not receive support from any organization for the submitted work.

Conflicts of Interest

Becky Pennington is funded by a National Institute for Health and Care Research (NIHR) Fellowship (300160) and is a member of NICE Technology Appraisal Committee A. James Fotheringham is vice-chair of NICE Technology Appraisal Committee A. All authors were involved in the NICE Technology Appraisal of Tirzepatide for managing overweight and obesity (TA1026). Albany Chandler was involved in the NICE Technology Appraisal of Semaglutide for managing overweight and obesity (TA875). The conclusions presented within this paper are those of the authors themselves and do not necessarily represent those of NICE, the NIHR or the University of Sheffield.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Consent to participate

Not applicable.

Consent for publication

All authors provide this consent.

Data availability

All data described within the manuscript are publicly available from cited sources.

Code availability

Not applicable.

Author Contributions

All authors contributed to the conceptualisation of the manuscript, drafted sections of text, reviewed and revised content, and approved the final version.

References



Articles from Pharmacoeconomics are provided here courtesy of Springer
