Dear Editor,
We read with great interest the article by Ständer et al. [1], which reports the results of indirect treatment comparisons (ITCs) between dupilumab and lebrikizumab in achieving efficacy outcomes at 16 weeks and maintaining those outcomes at 52 weeks for patients with moderate-to-severe atopic dermatitis (AD). The authors conducted the induction phase analysis (week 16 outcomes) using aggregate data from the LIBERTY AD CHRONOS trial (NCT02260986; dupilumab + TCS) [2] and the ADhere trial (NCT04250337; lebrikizumab + TCS) [3]. In the maintenance phase analysis (week 52 outcomes), they used aggregate data from the maintenance phase (week 16–week 52) of the SOLO-CONTINUE trial (NCT02395133; dupilumab monotherapy) [4] and the ADvocate 1 and 2 trials (NCT04146363 and NCT04178967; lebrikizumab monotherapy) [5]. Both analyses are based on placebo-adjusted Bucher ITC (hereafter referred to as Bucher’s method). However, their use of Bucher’s method has major limitations that impact the robustness and accuracy of their conclusions.
Limitations of Bucher’s Methodology
Bucher’s method is a commonly used ITC that, when applied correctly, can compare treatments from different trials that share a common comparator arm, such as a placebo or standard treatment. By using this common comparator, Bucher’s method allows for an indirect comparison between treatments by evaluating how each performed relative to the comparator in their respective trials. Bucher’s method relies on two key assumptions: (1) the treatments must have a common comparator, and (2) the trial populations must be similar enough to make the comparisons valid [6, 7]. If either of these assumptions does not hold, Bucher’s method is not appropriate for making comparisons between trials [8]. In such cases, other ITC methods that do not require a common comparator and that adjust for differences in baseline characteristics between trial populations should be used to provide a more accurate and reliable assessment of treatment efficacy. With these assumptions in mind, we next examine the two analyses reported in Ständer et al. and contrast them with more appropriate statistical analyses [1].
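For readers less familiar with the calculation, the arithmetic behind an anchored Bucher comparison can be sketched as follows. This is a minimal illustration rather than the analysis code of any cited study: the function name `bucher_itc` and all input values are hypothetical. It assumes each trial reports a log odds ratio versus the shared comparator together with its standard error, and that the two trials are independent.

```python
import math

def bucher_itc(log_or_a, se_a, log_or_b, se_b, z=1.96):
    """Indirect comparison of A vs B through a shared comparator C.

    log_or_a, se_a: log odds ratio of A vs C and its standard error (trial 1).
    log_or_b, se_b: log odds ratio of B vs C and its standard error (trial 2).
    Returns the odds ratio of A vs B with a Wald confidence interval.
    """
    # The comparator effect cancels when the two relative effects are subtracted.
    diff = log_or_a - log_or_b
    # Independent trials: the variances of the two estimates add.
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
or_ab, ci_low, ci_high = bucher_itc(math.log(3.0), 0.30, math.log(2.0), 0.40)
print(f"OR A vs B = {or_ab:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The subtraction step is valid only if the comparator arms represent the same condition and the populations share the same distribution of effect modifiers, which are the two assumptions listed above.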
Induction Phase Analysis: Achievement of Efficacy Outcomes at Week 16
The authors acknowledge that “the patient populations enrolled in ADhere are presented with slightly lower disease severity according to IGA than patients in LIBERTY AD CHRONOS” and that “mean EASI scores were slightly higher in LIBERTY AD CHRONOS than in ADhere” (Table 1 in Ständer et al.). However, they concluded that “this was deemed acceptable for the analyses.”
To address the potential impact of heterogeneity in trial populations, a more accurate ITC approach is a population-adjusted ITC method, such as a matching-adjusted indirect comparison (MAIC). The MAIC method balances baseline covariates between populations from two studies and estimates the treatment effect as if the treatments were applied to the same comparator population. MAIC methods have been widely used in health technology appraisal submissions to evaluate clinical evidence in the absence of head-to-head trials.
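The reweighting step in MAIC can be sketched as below. This is a simplified illustration with a single matching covariate and invented baseline values; the analyses discussed in this letter matched on several covariates, and the function names (`maic_weights`, `effective_sample_size`) are hypothetical. Weights of the form exp(alpha × centered covariate) are estimated by the method of moments, so that the weighted mean in the individual patient data equals the mean reported for the comparator trial.

```python
import math

def maic_weights(ipd_values, target_mean, tol=1e-10, max_iter=100):
    """Method-of-moments MAIC weights for one covariate (simplified sketch)."""
    centered = [x - target_mean for x in ipd_values]
    alpha = 0.0
    for _ in range(max_iter):
        w = [math.exp(alpha * c) for c in centered]
        # Gradient and curvature of the convex objective sum(exp(alpha * c)).
        grad = sum(c * wi for c, wi in zip(centered, w))
        hess = sum(c * c * wi for c, wi in zip(centered, w))
        step = grad / hess
        alpha -= step  # Newton update
        if abs(step) < tol:
            break
    return [math.exp(alpha * c) for c in centered]

def effective_sample_size(weights):
    """Approximate number of independent patients the weighted sample represents."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical baseline EASI scores (unweighted mean 29) and a hypothetical
# aggregate mean of 31 in the comparator trial.
easi = [18.0, 22.0, 25.0, 27.0, 30.0, 33.0, 36.0, 41.0]
weights = maic_weights(easi, target_mean=31.0)
weighted_mean = sum(w * x for w, x in zip(weights, easi)) / sum(weights)
ess = effective_sample_size(weights)
print(f"weighted mean = {weighted_mean:.2f}, ESS = {ess:.1f} of {len(easi)}")
```

The effective sample size falls as the populations diverge, which is one practical indicator of how much adjustment was needed.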
Using a MAIC approach, we matched the trial populations on baseline demographic characteristics (age, gender, and race) and disease severity measures (Eczema Area and Severity Index [EASI] and Investigator’s Global Assessment [IGA] scores) to compare lebrikizumab combination therapy with dupilumab combination therapy. Non-responder imputation for efficacy endpoints (evaluated at week 16) was applied to both studies. On the basis of a placebo-anchored MAIC analysis, lebrikizumab combination therapy was found to have similar week-16 efficacy to dupilumab combination therapy across all tested endpoints: the proportions of patients achieving an IGA of 0/1 (IGA 0/1); a ≥ 75% improvement from baseline in the EASI (EASI 75); a ≥ 4-point improvement from baseline in the Pruritus Numerical Rating Scale score (PNRS ≥ 4); and a ≥ 4-point improvement from baseline in the Dermatology Life Quality Index score (DLQI ≥ 4). The odds ratios comparing lebrikizumab with dupilumab combination therapy were 1.39 (95% confidence interval [95% CI] 0.42–4.60) for IGA 0/1, 1.14 (95% CI 0.42–3.09) for EASI 75, 0.89 (95% CI 0.29–2.70) for DLQI ≥ 4, and 0.48 (95% CI 0.17–1.37) for PNRS ≥ 4 [9]. These findings were presented at the Maui Derm NP + PA Fall 2024 conference, and a manuscript describing these findings is under review.
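The reported intervals can be checked with simple arithmetic. The sketch below uses only the odds ratios and 95% CIs quoted above; it assumes Wald-type intervals that are symmetric on the log scale, which is our assumption rather than a detail stated in the source. Under that assumption, it back-calculates the standard error of each log odds ratio and confirms whether each interval includes 1.

```python
import math

# Odds ratios and 95% CIs as quoted in the text: (point, lower, upper).
reported = {
    "IGA 0/1": (1.39, 0.42, 4.60),
    "EASI 75": (1.14, 0.42, 3.09),
    "DLQI >= 4": (0.89, 0.29, 2.70),
    "PNRS >= 4": (0.48, 0.17, 1.37),
}

def summarize(point, lower, upper, z=1.96):
    # Standard error of the log odds ratio implied by the interval width.
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z)
    # For a log-symmetric interval the geometric midpoint recovers the point estimate.
    geometric_mid = math.sqrt(lower * upper)
    includes_one = lower < 1.0 < upper
    return se_log_or, geometric_mid, includes_one

summary = {name: summarize(*values) for name, values in reported.items()}
for name, (se_log_or, geometric_mid, includes_one) in summary.items():
    print(f"{name}: SE(log OR) = {se_log_or:.2f}, "
          f"midpoint = {geometric_mid:.2f}, CI includes 1: {includes_one}")
```

Each interval includes 1, which is consistent with the description of similar week-16 efficacy.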
If the trial populations are comparable, then the results obtained using MAIC would align with those using Bucher’s method [10]. However, the results of the MAIC analysis described above are not consistent with Ständer et al.’s results using Bucher’s method. This inconsistency highlights the heterogeneity in trial populations and reveals how it affected outcome comparisons, indicating that one of the key assumptions of Bucher’s method was not met.
Maintenance Phase Analysis: Maintenance of Efficacy Outcomes at Week 52
Ständer et al.’s article also used Bucher’s method to analyze the maintenance of efficacy outcomes at week 52 (week 16–week 52) in monotherapy trials [1]. Their analysis used the treatment withdrawal arm as a common placebo comparator. Bucher’s method requires a truly common comparator, but clinical trial outcomes indicate that withdrawing lebrikizumab is demonstrably different from withdrawing dupilumab. As such, the withdrawal arms cannot be considered truly equivalent conditions. This difference renders the withdrawal arms unsuitable as a common placebo arm in a Bucher’s ITC.
The ADvocate 1 and 2 trials [5] have study designs similar to the SOLO 1 and 2 and SOLO-CONTINUE trials [4]. These monotherapy trials include an induction phase of 16 weeks followed by a maintenance phase of 36 weeks. The withdrawal arms of the ADvocate and SOLO trials comprised patients who were week-16 per-protocol responders and were re-randomized to placebo (treatment withdrawal) during maintenance. Because patients in the withdrawal arms of these trials received different treatments for 16 weeks, one cannot assume that the withdrawal arms are comparable. Indeed, the unique mechanisms of action, pharmacokinetics, and pharmacodynamics of lebrikizumab and dupilumab suggest that their efficacy after treatment discontinuation may also vary. A key differentiator is that lebrikizumab is a monoclonal antibody that binds with high affinity and a slow dissociation rate to interleukin (IL)-13, whereas dupilumab is a monoclonal antibody that binds to the IL-4 receptor alpha, which results in a broader cytokine blockade of the IL-13 and IL-4 pathways. Moreover, the maintenance efficacy outcomes in the withdrawal arms were markedly different for patients who had received lebrikizumab and those who had received dupilumab. Among responders withdrawing treatment at week 16, the proportions maintaining response at week 52 in the ADvocate and SOLO trials were 56% vs 30% for EASI 75 and 40% vs 14% for IGA 0/1, based on efficacy analyses using non-responder imputation [4, 5]. These findings from the ADvocate and SOLO withdrawal arms indicate that lebrikizumab may have a longer off-drug treatment effect than dupilumab. Although the durability of a drug’s effect after discontinuation has beneficial clinical implications, the differences in withdrawal arm responses between lebrikizumab and dupilumab introduce bias in between-trial comparisons.
In Ständer et al.’s article, the withdrawal arms are used as a true placebo arm. However, this approach penalizes the drug with a higher durability of off-drug effect even if both drugs have equivalent on-drug effects. Specifically, the difference between the active treatment and withdrawal arms for lebrikizumab will be smaller than for dupilumab, making lebrikizumab appear less effective relative to dupilumab. Therefore, the use of the withdrawal arms as a common comparator makes Bucher’s method inappropriate for this between-trial comparison.
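A short numerical sketch makes this penalty concrete. The withdrawal-arm EASI 75 maintenance rates (56% and 30%) are those quoted in this letter [4, 5]; the on-drug maintenance rate of 80% is a hypothetical value, set equal for both drugs purely for illustration and not taken from any trial.

```python
def odds(p):
    return p / (1.0 - p)

def odds_ratio(p_treated, p_comparator):
    return odds(p_treated) / odds(p_comparator)

# ASSUMPTION: identical, hypothetical on-drug maintenance rate for both drugs.
on_drug_rate = 0.80

# Withdrawal-arm EASI 75 maintenance rates quoted in the letter [4, 5].
withdrawal_lebrikizumab = 0.56
withdrawal_dupilumab = 0.30

# Each drug versus its own withdrawal arm.
or_lebrikizumab = odds_ratio(on_drug_rate, withdrawal_lebrikizumab)
or_dupilumab = odds_ratio(on_drug_rate, withdrawal_dupilumab)

# Bucher-style indirect odds ratio, treating the withdrawal arms as one comparator.
indirect_or = or_lebrikizumab / or_dupilumab
print(f"lebrikizumab vs withdrawal: OR = {or_lebrikizumab:.2f}")
print(f"dupilumab vs withdrawal:    OR = {or_dupilumab:.2f}")
print(f"indirect lebrikizumab vs dupilumab: OR = {indirect_or:.2f}")
```

Because the on-drug odds cancel, the indirect odds ratio in this scenario equals the ratio of the withdrawal-arm odds (about 0.34) whatever on-drug rate is chosen. The comparison therefore reflects the difference in off-drug durability rather than any difference in on-drug efficacy.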
When a common comparator is unavailable, healthcare payers have recommended a population-adjusted unanchored comparison of treatment arm data, such as an unanchored MAIC, as more appropriate for between-trial comparisons [10–12]. This method adjusts the characteristics of a target trial population to closely resemble those of a comparator trial population by reweighting individual patient data. Although unanchored comparisons require stronger assumptions than anchored comparisons because they do not include a common comparator, they are commonly used when an anchored comparison is not possible [12]. Without a common comparator, an unanchored comparison requires matching selected covariates (i.e., effect modifiers and prognostic factors) between the target trial population and the comparator trial population. Sensitivity analyses using different sets of covariates can further support the robustness of the results if they are consistent with the results of the primary analysis. In an unanchored MAIC analysis, individual patient data from the ADvocate trials were matched with aggregate data from the SOLO-CONTINUE trial. In this analysis, lebrikizumab every 4 weeks provided equal or better long-term maintenance of EASI 75 and IGA 0/1 than dupilumab every week or every 2 weeks in the SOLO-CONTINUE-matched population [13]. Sensitivity analyses showed consistent results [13], which minimizes concern about unmeasured confounders.
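The final step of an unanchored comparison can be sketched as follows. All values are hypothetical: the responder indicators, the weights (which in practice would come from a weight-estimation step that balances the matched covariates), and the aggregate comparator rate are invented for illustration and do not reproduce the analysis in [13].

```python
def weighted_response_rate(responses, weights):
    """Weighted proportion of responders; responses are coded 1 or 0."""
    return sum(w * r for w, r in zip(weights, responses)) / sum(weights)

def odds(p):
    return p / (1.0 - p)

# Hypothetical individual patient data for the treatment arm (1 = responder)
# and hypothetical MAIC weights for the same patients.
responses = [1, 1, 0, 1, 0, 1, 1, 0]
weights = [0.6, 0.8, 1.0, 1.2, 0.9, 1.1, 1.4, 1.0]

adjusted_rate = weighted_response_rate(responses, weights)

# Hypothetical aggregate response rate reported for the comparator treatment arm.
comparator_rate = 0.60

# Unanchored: the treatment arms are compared directly, with no placebo adjustment.
unanchored_or = odds(adjusted_rate) / odds(comparator_rate)
print(f"adjusted rate = {adjusted_rate:.4f}, unanchored OR = {unanchored_or:.2f}")
```

Because no comparator arm absorbs between-trial differences, the estimate is only as good as the set of prognostic factors and effect modifiers included in the weights, which is why sensitivity analyses over alternative covariate sets matter.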
Conclusion
Ständer et al.’s article did not adequately consider the two key assumptions of Bucher’s method. In their analysis of week-16 efficacy outcomes, the authors did not adjust for key differences in the trial populations, and in their analysis of week-52 outcomes, they did not use a truly common comparator to compare outcomes. In light of the limitations of Bucher’s method, the analyses described here employed MAIC, both anchored and unanchored, to adjust for potential population differences and ensure methodological rigor and transparency. These advanced, population-adjusted methods account for between-study differences and align with current best practices in comparative effectiveness research [7, 14]. By using up-to-date statistical methods, these analyses provide a more accurate and reliable assessment of treatment efficacy, supporting informed clinical decision-making for the care of patients with moderate-to-severe atopic dermatitis.
Acknowledgements
Medical Writing
Medical writing was provided by Michael Franklin, MS, of PPD, clinical research business of Thermo Fisher Scientific, in accordance with Good Publication Practice guidelines and was funded by Eli Lilly and Company.
Author Contributions
Lucia Seminario-Vidal and Yuxin Ding contributed to the interpretation of the data. Lucia Seminario-Vidal, Yuxin Ding, and Chao Yang prepared the manuscript. Raj Chovatiya, Lucia Seminario-Vidal, Gaia Gallo, Yuxin Ding, Chao Yang, Bülent Akmaz, Laia Solé-Feu, and Kim Rand critically revised and approved the final manuscript.
Funding
This study was funded by Eli Lilly and Company. No funding or sponsorship was received for the publication of this article.
Data Availability
Data sharing is not applicable to this article as no data was generated or analyzed during the current study.
Declarations
Conflict of Interest
Raj Chovatiya served as an advisor, consultant, speaker, and/or investigator for AbbVie, Acelyrin, Alumis, Amgen, AnaptysBio, Apogee Therapeutics, Arcutis Biotherapeutics Inc., Argenx, Astria Therapeutics Inc., Avalere Health, Beiersdorf, Boehringer Ingelheim, Bristol Myers Squibb, Cara Therapeutics, Castle Biosciences, CLn Skin Care, Dermavant, Eli Lilly and Company, EMD Serono, Formation Bio, Galderma, Genentech, GSK, Incyte, Johnson & Johnson, Kenvue, LEO Pharma, L’Oréal, Nektar Therapeutics, Novartis, Opsidio, Pfizer Inc., RAPT, Regeneron, Sanofi, Sitryx, Takeda, TRex Bio, and UCB. Lucia Seminario-Vidal, Gaia Gallo, Yuxin Ding, and Chao Yang are employees of, and own stock in, Eli Lilly and Company, which funded this research. Bülent Akmaz and Laia Solé-Feu are employees of Almirall. Kim Rand is an employee of Maths In Health B.V.
Ethical Approval
This article is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Ständer S, Pinter A, Hougeir FG, et al. Dupilumab versus lebrikizumab demonstrates greater likelihood of achieving and maintaining improvements in efficacy outcomes using a placebo-adjusted indirect treatment comparison. Dermatol Ther. 2025;15(9):2537–51.
2. Blauvelt A, de Bruin-Weller M, Gooderham M, et al. Long-term management of moderate-to-severe atopic dermatitis with dupilumab and concomitant topical corticosteroids (LIBERTY AD CHRONOS): a 1-year, randomised, double-blinded, placebo-controlled, phase 3 trial. Lancet. 2017;389(10086):2287–303.
3. Simpson EL, Gooderham M, Wollenberg A, et al. Efficacy and safety of lebrikizumab in combination with topical corticosteroids in adolescents and adults with moderate-to-severe atopic dermatitis: a randomized clinical trial (ADhere). JAMA Dermatol. 2023;159(2):182–91.
4. Worm M, Simpson EL, Thaçi D, et al. Efficacy and safety of multiple dupilumab dose regimens after initial successful treatment in patients with atopic dermatitis: a randomized clinical trial. JAMA Dermatol. 2020;156(2):131–43.
5. Blauvelt A, Thyssen JP, Guttman-Yassky E, et al. Efficacy and safety of lebrikizumab in moderate-to-severe atopic dermatitis: 52-week results of two randomized double-blinded placebo-controlled phase III trials. Br J Dermatol. 2023;188(6):740–8.
6. Chaimani A, Caldwell DM, Li T, Higgins JP, Salanti G. Chapter 11: Undertaking network meta-analyses. In: Higgins JP, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al, editors. Cochrane handbook for systematic reviews of interventions. Chichester: Wiley-Blackwell; 2019. p. 285–320.
7. Remiro-Azócar A, Heath A, Baio G. Methods for population adjustment with limited access to individual patient data: a review and simulation study. Res Synth Methods. 2021;12(6):750–75.
8. Macabeo B, Quenéchdu A, Aballéa S, François C, Boyer L, Laramée P. Methods for indirect treatment comparison: results from a systematic literature review. J Mark Access Health Policy. 2024;12(2):58–80.
9. Chovatiya R, Kircik L, Binamer Y, et al. Matching-adjusted indirect comparison of efficacy in patients with moderate-to-severe atopic dermatitis treated with lebrikizumab plus topical corticosteroids versus dupilumab plus topical corticosteroids. Maui Derm NP+PA Fall. Br J Dermatol. 2024;178(5):1064–71.
10. Phillippo D, Ades T, Dias S, Palmer S, Abrams KR, Welton N. NICE DSU technical support document 18: methods for population-adjusted indirect comparisons in submissions to NICE. UK: NICE Decision Support Unit; 2016.
11. Health Technology Assessment Coordination Group. Directorate-general for health and food safety. Practical guideline for quantitative evidence synthesis: direct and indirect comparisons. 2024; https://health.ec.europa.eu/document/download/1f6b8a70-5ce0-404e-9066-120dc9a8df75_en?filename=hta_practical-guideline_direct-and-indirect-comparisons_en.pdf. Accessed Sept 19, 2025.
12. Phillippo DM, Ades AE, Dias S, Palmer S, Abrams KR, Welton NJ. Methods for population-adjusted indirect comparisons in health technology appraisal. Med Decis Making. 2018;38(2):200–11.
13. Rand K, Ramos-Goñi JM, Akmaz B, Solé-Feu L, Armario-Hita JC. Matching-adjusted indirect comparison of the long-term efficacy maintenance and adverse event rates of lebrikizumab versus dupilumab in moderate-to-severe atopic dermatitis. Dermatol Ther (Heidelb). 2024;14(1):169–82.
14. Phillippo DM, Dias S, Elsada A, Ades AE, Welton NJ. Population adjustment methods for indirect comparisons: a review of National Institute for Health and Care Excellence technology appraisals. Int J Technol Assess Health Care. 2019;35(3):221–8.
