Clinical Orthopaedics and Related Research. 2021 Aug 5;479(11):2350–2361. doi: 10.1097/CORR.0000000000001909

How Large a Study Is Needed to Detect TKA Revision Rate Reductions Attributable to Robotic or Navigated Technologies? A Simulation-based Power Analysis

Matthew D Hickey 1, Carolyn Anglin 2, Bassam Masri 3, Antony J Hodgson 4
PMCID: PMC8509967  PMID: 34351313

Abstract

Background

Robotic and navigated TKA procedures have been introduced to improve component placement precision in the hope of improving implant survivorship and other clinical outcomes. Although numerous comparative studies have shown enhanced precision and accuracy in placing components, most comparative studies have not shown that such interventions result in improved implant survival. Given what we know about effect sizes from large arthroplasty registries, large cohort studies, and large randomized controlled trials (RCTs), we wondered how large randomized trials would need to be to detect such small differences, and if the number is very high, what that would tell us about the value of these treatments for preventing revision surgery.

Questions/purposes

In this simulation study, we asked: Given that survivorship differences between technology-assisted TKA (TA-TKA, which we defined as either navigated or robot-assisted TKA) and conventional TKA are either small or absent based on large arthroplasty registries, large cohort studies, and large RCTs, how large would randomized trials need to be to detect small differences between TA-TKA and conventional TKA if they exist, and how long would the follow-up period need to be to have a reasonable chance to detect those differences?

Methods

We used estimated effect sizes drawn from previous clinical and registry studies, combined with estimates of the accuracy and precision of various navigation and robotic systems, to model and simulate the likely outcomes of potential comparative clinical study designs. To characterize the ranges of patients enrolled and general follow-up times associated with traditional RCT studies, we conducted a structured search of previously published studies evaluating the effect of robotics and navigation on revision rates compared with that of conventional TKA. The structured search of the University of British Columbia’s library database (which automatically searches medical publication databases such as PubMed, Embase, Medline, and Web of Science) and subsequent searching through included studies' reference lists yielded 103 search results. Only clinical studies assessing implant survival differences between patient cohorts of TA-TKA and conventional TKA were included. Studies analyzing registry data, using cadaver specimens, assessing revision TKA, conference proceedings, and preprint services were excluded. Twenty studies met all our inclusion criteria, but only one study reported a statistically significant difference between the conventional and robotic or navigated groups. Next, we generated a large set of patients with simulated TKA (1.5 million), randomly assigning each simulated patient a set of patient-specific factors (age at the index surgery, gender, and BMI) drawn from data from registries and published information. We divided this set of simulated procedures into four groups, each associated with a coronal alignment precision reported for different types of surgical procedures, and randomly assigned each patient an overall coronal alignment consistent with their group’s precision. TA procedures were modeled based on the alignment precision that an intervention could deliver, regardless of whether the technology used was navigation- or robot-assisted. 
To evaluate the power associated with using different cohort sizes, we ran a Monte Carlo simulation generating 3000 simulated populations that were drawn (with replacement) from the large set of simulated patients with TKA. We simulated the time to revision for aseptic loosening for each patient, computed the corresponding Kaplan-Meier survival curves, and applied a log-rank test to each study for statistical differences in revision rates at concurrent follow-up timepoints (1-25 years). From each simulation associated with a given cohort size, we determined the percentage of simulated studies that found a statistically significant difference at each follow-up interval. For each alternative precision, we then also calculated the expected reduction in revision rates (effect size) attributable to TA-TKA intervention and the number needed to treat (NNT) using TA-TKA to prevent one revision at 2, 5, 10, and 15 years after index surgery for the entire set of Kaplan-Meier survival analyses.

Results

The results from our simulation found survivorship differences favoring TA-TKA ranging from 1.4% to 2.0% at 15 years of follow-up. Comparative studies would need to enroll between 2500 and 4000 patients in each arm of the study, depending on the precision of the navigated or robotic procedure, to have an 80% chance of showing this reduction in revision rates at 15 years of follow-up. For the highest precision simulated intervention, the NNT using TA-TKA to prevent one revision was 1000 at 2 years, 334 at 5 years, 100 at 10 years, and 50 at 15 years post–index surgery.

Conclusion

Based on these simulations, it appears that TA-TKA interventions could potentially result in a relative reduction in revision rates as large as 27% (from 7.5% down to about 5.5% at 15 years for the intervention with the most precise coronal alignment); however, since this 2% absolute reduction in revision rates is relatively small in comparison with the baseline success rate of TKA and would not be realized until 15 years after the index surgery, traditional RCT studies would require excessively large numbers of patients to be enrolled and excessively long follow-up times to demonstrate whether such a reduction actually exists.

Clinical Relevance

Given that the NNTs to avoid revisions at various time points are predicted to be high, it would require correspondingly low system costs to justify broad adoption of TA-TKA based on avoided revision costs alone, though we speculate that technology assistance could perhaps prove to be cost effective in the care of patients who are at an elevated risk of revision.

Introduction

TKA is one of the most successful and commonly used approaches to treating end-stage symptomatic knee osteoarthritis, with more than 800,000 procedures performed annually in the United States and Canada [8, 17]. Furthermore, TKA generally has very good to excellent long-term survivorship of approximately 93% (64% to 96%) and 96% (79% to 99%) at 15 years of follow-up in registry and clinical studies, respectively [15, 22]. Nonetheless, revision procedures still account for roughly 4% to 7% of all TKAs [2, 8]. Studies have argued that some revision procedures are related to the mechanical aspects of the procedure, which are controlled by the surgeon. For example, leaving the femoral component in excessive varus or valgus alignment leads to elevated revision rates [26]. To reduce such outliers, robotic and navigated TKA procedures (collectively referred to herein as technology-assisted TKA [TA-TKA]) have been introduced to improve the precision of bone cuts and implant alignment [4, 9, 11, 12, 14, 18-25, 28-31, 33, 35, 36]. Although these innovations in surgical navigation have improved the accuracy of implant placement, most comparative outcome studies have not demonstrated clear reductions in revision rates [9, 11, 12, 14, 18-25, 28-31, 33, 35, 36]. One retrospective study [4] reported a significant difference but had a high loss-to-follow-up rate (56%) and no controls for preoperative patient demographics, with statistically significant differences in age at the index surgery and BMI between the conventional group and the navigated group, likely resulting in a higher risk of revision for the conventional group simply based on these patient-specific factors (younger age at surgery and increased BMI [5, 7]).
With the increasing usage of TA-TKA (up from 1.2% in 2005 to 7.0% in 2014 according to a study of the Nationwide Inpatient Sample database [3]), it is critical to understand whether the enhanced precision afforded by robot-assisted or navigated TKA intervention is associated with improved implant survivorship and can therefore justify any increase in costs (as much as USD 6000 per procedure [3]) or operating time (up to 25 minutes per procedure [32]).

It is possible, even highly likely, that most previous comparative studies have been underpowered, perhaps severely so. If proposed technology-assisted interventions produce an effect size causing a 1% to 2% reduction in a baseline revision rate of 7%, then the number of avoided revisions will be relatively low when viewed from the perspective of the entire patient population and, therefore, they would be difficult to detect against the large baseline of well-functioning implants by using a conventional RCT approach, even if technology assistance results in an important reduction in overall revision rates. If the effect size attributable to technology assistance is low (as measured relative to the entire patient population), then it would likely require impractically large RCT studies to conclusively demonstrate such reductions. This would also suggest that, given the additional costs involved in using such systems, it would likely be difficult to demonstrate that any reduction in revision rates because of technology assistance would or could result in lower overall surgical costs. Unfortunately, it is difficult to use standard power analysis methods to evaluate the statistical power of studies comparing revision rates in conventional TKA and those in TA-TKA, primarily because the effect size of malalignment on revision surgery is unknown and estimates of the effect size are further complicated by wide variations in patient-specific factors in patients undergoing TKA. Because of these limitations, we evaluated the effect size through large-scale simulations based on registry data, creating large sets of simulated patients who are assigned different patient-specific characteristics and surgeon-controlled factors estimating their risk of revision. Similar simulation-based methodologies have been used previously to estimate appropriate sample sizes in the absence of clinical research data [1, 27].
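To give a sense of scale, the following is a minimal sketch of a conventional closed-form sample-size approximation for comparing two proportions. It is not the method used in this study: it treats revision by a fixed time point as a simple yes/no outcome, ignores time-to-event information, and assumes the effect size is known. The function name `n_per_arm_two_proportions`, the two-sided α of 0.05, and the reading of "1% to 2% reduction" as absolute percentage points are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_arm_two_proportions(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of two proportions.

    Uses the normal-approximation formula
        n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / effect ** 2)


if __name__ == "__main__":
    baseline = 0.07  # baseline revision rate quoted in the text
    for reduction in (0.01, 0.02):  # assumed absolute reductions
        n = n_per_arm_two_proportions(baseline, baseline - reduction)
        print(f"absolute reduction {reduction:.0%}: about {n} patients per arm")
```

Under these simplifying assumptions, the required enrollment is already in the thousands per arm and grows roughly fourfold when the absolute reduction is halved, which is the difficulty the paragraph above describes.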

Specifically, we used a Monte Carlo simulation approach [1, 27] to estimate the relationship between the effect size potentially resulting from the use of technology assistance and the trade-off between study size and power associated with that effect size. The simulation drew on registry data to estimate expected distributions and effects of patient demographic parameters on the risk of aseptic revision. We combined these with estimates, derived from previous studies, of the implant alignment precision achieved by conventional and technology-assisted procedures and of the effect of alignment on implant longevity.

In this simulation study, we asked: Given that survivorship differences between technology-assisted TKA (TA-TKA, which we defined as either navigated or robot-assisted TKA) and conventional TKA are either small or absent based on large arthroplasty registries, large cohort studies, and large RCTs, how large would randomized trials need to be to detect small differences between TA-TKA and conventional TKA if they exist, and how long would the follow-up period need to be to have a reasonable chance to detect those differences?

Materials and Methods

Overview

To address our proposed research question, we designed simulations of hypothetical comparative clinical studies, each characterized by different numbers of patients and the precision and accuracy of various TA surgical techniques (Fig. 1). We modeled the various TA procedures based on the alignment precision that an intervention could deliver, regardless of whether the technology used was navigation- or robot-assisted. We assumed three alternative precisions (SD, σ) in achieving neutral coronal alignment that could be facilitated by TA-TKA: σ = 2°, σ = 1.5°, and σ = 1°. We used demographic information drawn from previous studies to characterize the distributions of factors describing patients undergoing TKA (principally age, gender, and BMI) and technical information to characterize the accuracy and precision of conventional and TA surgical procedures. In addition, we analyzed survival data from a recent study [26] to identify estimated effect sizes associated with deviations in one key implant alignment parameter (coronal plane alignment), which was selected because it is the most commonly reported criterion to define a well-aligned knee, and used this information to estimate revision due to aseptic loosening in various sets of simulated procedures. For each simulated study, we applied a log-rank test to test for statistically significant differences in survivorship at discrete yearly follow-up points up to 25 years postoperatively. We then used a Monte Carlo technique in which we repeated each simulated study 1000 times to calculate the expected reductions in revision rates attributable to TA-TKA intervention and then empirically estimated the power of each proposed study design in detecting those reductions (Supplementary Digital Content 1; http://links.lww.com/CORR/A605).

Fig. 1.

This flowchart shows an overview of our comparative study design. First, patient demographic data and surgical procedure precision data were acquired from previous studies. Next, implant survival time was modeled using a patient-specific life factor, estimated using studies with a high number of participants. Alternative study designs were simulated by varying the number of patients enrolled, follow-up time, and precision of the intervention used in a simulated study. The statistical power associated with each study design was then calculated by simulating repeated iterations and determining the probability of achieving statistical significance.

Search Methodology

To identify published studies that compared revision rates for conventional and navigated or robotic TKA procedures, the first author (MDH) conducted a structured search of the University of British Columbia’s library database (which automatically searches medical publication databases such as PubMed, Embase, Medline, and Web of Science) using the following keywords: “total knee replacement/arthroplasty,” “revision,” “survival,” “robot,” “robotic,” “navigated,” and “navigation” (Fig. 2). The keywords were used to search the titles, abstracts, and full text of publications. We then screened the abstracts of all results and excluded any items from further review if the main text was not written in English or was not available, was a duplicate, if a navigated or robotic approach to TKA was not mentioned, used cadaveric specimens, was a review paper, or focused on revision TKA. Conference proceedings and preprint services were also excluded. Because registries usually do not list implant alignment and because this analysis was completed to assess the statistical power of clinical studies, studies analyzing registry data only were not included. Included studies had to have received informed consent from participants and approval from an ethics committee or institutional review board.

Fig. 2.

This flowchart shows our study selection process. We included 20 studies. The intervention tool was navigation-assisted in 16 studies and robot-assisted in four; TA = technology-assisted.

After screening abstracts, we analyzed the full text of each article. We excluded studies that did not compare implant survivorship for at least two groups: a control group treated with a conventional TKA approach and a second group treated with either navigation or robotic technology. For each article, we noted the structure of the clinical study (such as randomized controlled or retrospective), the number of participants and/or knees in the study, mean follow-up time for survival analysis, intervention method (robotic or navigated), patient demographics, and precision in obtaining the coronal alignment targeted in the intervention (the SD for overall coronal alignment, hip-knee-ankle angle, or femorotibial angle). We searched Google Scholar to identify articles that cited any of the included studies to find further publications that met the search criteria. The reference lists of all included articles were screened to locate any further studies matching the search criteria (Fig. 2).

Search Summary

The structured search yielded 103 results using the University of British Columbia Library databases and subsequent searching through included studies’ reference lists. Twenty-nine publications were selected for full-text review after initial abstract screening (Fig. 2). All included works had been peer-reviewed and had followed the institutional ethics review process. Nine were excluded because the study did not include appropriate survival or revision data, or the full-text article was not available online. Twenty papers met all the inclusion criteria as of August 10, 2020 (Table 1).

Table 1.

Summary of relevant publications resulting from structured search

| Reference | Study type | Technology assistance | Follow-up in years | Number of knees, conventional/TA | Precision (±°), conventional/TA | BMI in kg/m2, conventional/TA | Age in years at index, conventional/TA | Gender, % women (n), conventional/TA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| σ ≤ 1.25° | | | | | | | | |
| Molfetta [28] | Retrospective | Navigated | 5.4 | 30/30 | 0.8/0.6 | NR | 67 (62-80)/68 (65-81) | 86.6% (26)/74.0% (22) |
| Song [31] | RCT | Navigated | 9 | 38/37 | 4.0/1.0 | NR | 66.1 ± 8.1/65.4 ± 5.9 | 75.6% (31)/74.4% (29) |
| Todesca [33] | RCT | Navigated | 6.4 | 117/121 | 1.2^b/1.0^b | NR | NR | NR |
| 1.25° < σ < 1.75° | | | | | | | | |
| Hernández-Vaquero [18] | RCT | Navigated | 8.3 | 50/50 | 3.9/1.7 | NR | 68.8 ± 8.5/70.4 ± 6.9 | 68.0% (34)/74.0% (37) |
| Cip [12] | RCT | Navigated | 5 | 91/92 | 1.7/1.4 | 28.5 ± 4.7/30.2 ± 5.4 | 76.1 ± 7.0/74.9 ± 8.6 | 73.6% (67)/69.6% (64) |
| Cip [11] | RCT | Navigated | 12 | 32/27 | 1.7/1.4 | 29.4 ± 3.5/29.4 ± 4.1 | 79.3 ± 7.1/79.3 ± 7.5 | 75.0% (24)/77.7% (21) |
| Yang [35] | Retrospective | Robotic | 10 | 42/71 | 3.7/1.5 | NR | 67.8 ± 6.5/66.3 ± 7.5 | 88.1% (37)/95.8% (68) |
| σ ≥ 1.75° | | | | | | | | |
| Kim^c [22] | RCT | Navigated | 10.8 | 520/520 | 2.0^b/1.9^b | NR | 68 (49-88) | 86.9% (452) |
| Zhu [36] | RCT | Navigated | 9 | 37/30 | 3.6/2.7 | 27.7 ± 4.5/27.6 ± 4.7 | 65.3 ± 7.4/67.9 ± 8.1 | 83.8% (31)/93.3% (28) |
| Ouanezar [30] | Case Control | Navigated | 10.5 | 36/59 | 3.4/2.5 | 31 ± 6/30 ± 5 | 69 ± 9/71 ± 8 | 74.5% (38)/65.5% (57) |
| Kim^b [23] | RCT | Navigated | 12 | 162/162 | 2.0^b/1.9^b | 27 ± 3.3 | 68.1 ± 7.5 | 94.4% (153) |
| Ollivier [29] | RCT | Navigated | 11 | 36/35 | 4.0/2.0 | 26 ± 5/27 ± 4 | 66 [40-80]/64 [46-78] | 52.5% (21)^d/52.5% (21) |
| Kim^c [24] | RCT | Navigated | 15 | 282/282 | 2.1^b/1.9^b | 28 ± 8 | 59 ± 7 | 79.1% (223) |
| Cho [9] | Retrospective | Robotic | 11 | 230/160 | 4.0/2.2 | NR | 67.6 (56-81)/68.2 (57-80) | 83.2% (163)/91.0% (141) |
| Jeon [21] | Retrospective | Robotic | 10.8 | 79/84 | 3.1^b/2.4^b | 27.7 (16.5-39.7)/26.7 (18.9-37.2) | 70.1 (56-83)/69.2 (47-84) | 81.5% (44)/77.0% (60) |
| D’Amato [14] | RCT | Navigated | 10.3 | 45/48 | 2.8/2.4 | NR | 68.6 ± 9.0^d/70.6 ± 11.0^d | 50.0% (30)/50.0% (30) |
| Hsu^c [19] | RCT | Navigated | 8.1 | 56/56 | 2.5/1.8 | 28.8 ± 4.1 | 68.7 ± 5.8 | 78.6% (44) |
| Kim [25] | RCT | Robotic | 13 | 724/724 | 2.7^b/2.0^b | 29 ± 8/28 ± 9 | 61 ± 8/60 ± 7 | 80.4% (566)/78.6% (530) |
| Jenny [20] | Retrospective | Navigated | 13 | 513/513 | NR | 29.4 ± 3.6/28.6 ± 4.3 | 72.1 ± 7.3/71.8 ± 6.9 | 65.5% (336)/66.1% (339) |
| Baumbach^a [4] | Retrospective | Navigated | 10 | 46/50 | 4.2/3.0 | 32.3 ± 5.2/30.8 ± 4.7 | 68.6 ± 7.0/73.9 ± 8.9 | 75.2% (85)/81.7% (85) |

Data presented as mean ± SD, % (n patients who were women in the respective study arm), mean (range), median [range], or NR (not reported). The table is broken down into three sections based on the level of precision (σ) reported in obtaining neutral coronal alignment, with the most precise interventions in the first section (σ ≤ 1.25°) and the least precise in the last (σ ≥ 1.75°). Bolded table entries indicate a statistically significant difference (p < 0.05) in study group demographics.

^a Indicates that the study found a statistical difference in revision rates.
^b Procedure SD not reported; SD calculated using outlier percentages assuming errors are normally distributed.
^c Study conducted with patients who underwent TA-TKA for one knee and conventional TKA for the other.
^d Two different implants were used for both TA-TKA and conventional. Patient demographic SDs for implant groups were combined.
TA-TKA = technology-assisted total knee arthroplasty.

The number of patients enrolled in these studies ranged from 27 to 724 in the TA-TKA study arms and 32 to 724 in the conventional TKA study arms, respectively. The average BMI, age at index surgery, and distribution of gender ranged from 26.0 kg/m2 to 32.3 kg/m2, 59 to 79 years, and 50% to 94% women, respectively, in the conventional study arms and from 26.7 kg/m2 to 30.8 kg/m2, 59 to 79 years, and 50% to 96% women, respectively, in the TA-TKA study arms. To facilitate comparison with our simulation results over various TA-TKA intervention precisions, we grouped studies into one of three categories based on the precision of the TA-TKA procedure in achieving neutral overall coronal alignment: σ ≤ 1.25°, 1.25° < σ < 1.75°, or σ ≥ 1.75°.

Distributions of Patient Characteristics

We began building a simulated population of patients with TKA by identifying the distributions of key patient demographic information and typical coronal alignment of implants using data reported in the American Joint Replacement Registry [2] and in recently published studies [5, 7]. We chose to use patient demographic data from larger databases as opposed to studies uncovered in the structured search as we wanted our simulated TKA patient population to have distributions in their patient-specific characteristics that match the typical TKA patient population in the United States. The key patient-specific factors that have been associated with risk of revision include BMI [7], age at the index surgery, and gender [5] (Table 2). The mean values of patient demographics used in our simulation were well within the ranges calculated across studies uncovered during the structured search.

Table 2.

Simulated TKA patient population demographics

| Property | Distribution |
| --- | --- |
| BMI in kg/m2 | 27 ± 3 [7] |
| Age in years at index | 67 ± 9 [1] |
| Gender (% women) | 59% [1] |

Data presented as mean ± SD or %.
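As an illustration of how a simulated population could be drawn from the values in Table 2, here is a minimal sketch. It assumes, beyond what the table states, that BMI and age are independent and normally distributed with no truncation, and that gender is an independent draw with a 59% probability of being a woman. The names `SimulatedPatient`, `draw_patient`, and `draw_population` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedPatient:
    bmi: float       # kg/m2
    age: float       # years at index surgery
    is_woman: bool


def draw_patient(rng):
    """Draw one patient; normality and independence are assumptions."""
    return SimulatedPatient(
        bmi=rng.gauss(27.0, 3.0),       # mean ± SD from Table 2
        age=rng.gauss(67.0, 9.0),       # mean ± SD from Table 2
        is_woman=rng.random() < 0.59,   # 59% women from Table 2
    )


def draw_population(n, seed=0):
    """Draw n simulated patients reproducibly from a seeded generator."""
    rng = random.Random(seed)
    return [draw_patient(rng) for _ in range(n)]


if __name__ == "__main__":
    population = draw_population(10_000)
    mean_bmi = sum(p.bmi for p in population) / len(population)
    share_women = sum(p.is_woman for p in population) / len(population)
    print(f"mean BMI {mean_bmi:.1f}, women {share_women:.0%}")
```

A fuller model would likely bound age and BMI to plausible ranges and could introduce correlation between them; neither refinement is specified in this section.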

Estimation of Patient-specific Life Factor

We modeled each simulated patient’s probability of implant survival as a Gaussian function because, for times after the index surgery (t ≥ 0), this is a monotonically decreasing function that closely matches the typical survival curve shapes and can be easily parameterized using a single parameter, the characteristic survival time, τ. As the characteristic survival time for a particular patient decreases, so does the probability that the patient will still have a functioning TKA implant at a given time after the index surgery (Fig. 3). In our simulations, we modeled the elevated revision risk for a particular patient with risk-altering characteristics by assigning them a life factor, k, derived from an analysis of previous studies linking the risk of revision to patient-specific and surgeon-controlled factors. A patient characterized as having a high revision risk (such as a younger man with a high BMI) will be assigned a smaller life factor than a patient with a lower revision risk (for example, an older woman with a lower BMI).

Fig. 3.

This graph shows the probability of TKA implant survival in the form of a Gaussian survival function, parameterized by the characteristic survival time, τ, whereby τ is the number of years until the probability of survival is 37% (= 1/e). As the characteristic survival time decreases, the probability that a patient’s TKA implant is still functioning at any given time after surgery also decreases. A color image accompanies the online version of this article.
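One functional form consistent with this description is sketched below. The exact expression is an assumption, since the equation itself is not printed in this section; it is chosen so that the survival probability starts at 1, decreases monotonically for t ≥ 0, and equals 1/e at t = τ, as the caption states.

```latex
S(t \mid \tau) = \exp\!\left[-\left(\frac{t}{\tau}\right)^{2}\right],
\qquad S(0 \mid \tau) = 1,
\qquad S(\tau \mid \tau) = e^{-1} \approx 0.37 .
```

Under this form, a smaller τ gives a lower survival probability at every t > 0, which matches the behavior shown in the figure.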

To estimate the relative risks and life factor associated with the patient-specific factors of age at index surgery and gender, k(age, gender), we used data acquired in a large registry study on the risk of revision from the Clinical Practice Research Datalink between 1991 and 2011 [5, 34] (Supplementary Fig. 1; http://links.lww.com/CORR/A606). To estimate relative risks of revision and life factor associated with BMI, k(BMI), we used survival data presented in a recent systematic review on the effect of BMI on implant survival in TKA [7] (Supplementary Fig. 2; http://links.lww.com/CORR/A607). The overall patient-specific life factor kPS was then presumed for our simulations to be the multiplicative product of the individual life factors associated with the different patient-specific characteristics. For example, an older, lighter woman will be assigned a k value closer to 1 compared with a younger, heavier man (Supplementary Digital Content 2; http://links.lww.com/CORR/A608).

Surgeon-controlled Life Factor

For this study, we modeled a single surgeon-controlled parameter: the overall coronal alignment. We chose coronal alignment because this factor is the primary alignment parameter targeted and reported in most studies of TA-TKA systems. To estimate the life factor value, kSC, associated with various degrees of varus and valgus alignment, we modeled survival functions from reported distributions [26] (Supplementary Fig. 3; http://links.lww.com/CORR/A609). This study was chosen to model the risk related to coronal alignment because of its large number of enrolled patients (n = 982) and because the survival data were separated by alignment category, which allowed us to interpolate between and extrapolate beyond category averages to calculate life factors for all possible values of coronal alignment.

Modeling Time to Revision

In our simulations, we assigned demographic and surgeon-controlled factors to each simulated patient and calculated a patient-specific expected survival curve by applying the relevant life factors. For an individual simulated patient, the overall life factor was defined as the product of the surgeon-controlled and patient-specific life factors described. We used these life factors to compute a simulated patient-specific value for τ by multiplying that patient’s assigned life factors by an overall maximum attainable characteristic survival time, τmax, which we chose to characterize the expected survival curve for a patient without risk factors (Supplementary Fig. 4; http://links.lww.com/CORR/A610). We selected a value for τmax by using a Bayesian optimization algorithm to adjust the mean survival probability of the entire simulated patient population to 93% at 15 years, as reported in a recent review of 47 studies from the Finnish and Australian joint replacement registries [15]. Through this process, τmax was calculated to be 113 years. We decided to use survival rates from registries because survival rates reported in clinical studies can be inflated by a bias toward reporting favorable outcomes [15]. Furthermore, since revision rates are likely to be higher in heterogeneous practices than in high-volume centers, we chose to model the heterogeneous scenario, resulting in the smallest possible study size to uncover an effect. As a check, we also calculated the mean extrapolated survival rate across all studies included from the structured search and found it to be comparable, at 93.5% at 15 years.
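The calibration step can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it swaps in a simple bisection search in place of the Bayesian optimization algorithm, assumes the survival form S(t) = exp(-(t/τ)²) with τ = k × τmax, and uses a short list of made-up life factors. The function names are hypothetical.

```python
import math


def characteristic_time(k_patient, k_surgeon, tau_max):
    """Patient-specific tau: product of both life factors and tau_max."""
    return k_patient * k_surgeon * tau_max


def mean_survival(tau_max, life_factors, t_years):
    """Population mean of S(t) = exp(-(t / tau)^2) with tau = k * tau_max.

    The functional form of S is an assumption consistent with Fig. 3.
    """
    total = sum(math.exp(-(t_years / (k * tau_max)) ** 2) for k in life_factors)
    return total / len(life_factors)


def calibrate_tau_max(life_factors, t_years=15.0, target=0.93,
                      lo=1.0, hi=1000.0, tol=1e-6):
    """Bisection on tau_max (a stand-in for Bayesian optimization).

    Mean survival rises monotonically with tau_max, so the target is
    bracketed as long as it lies between the values at lo and hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_survival(mid, life_factors, t_years) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical overall life factors (k_PS * k_SC), for illustration only.
    example_factors = [1.0, 0.8, 0.6, 0.5, 0.4]
    tau_max = calibrate_tau_max(example_factors)
    print(f"tau_max = {tau_max:.1f} years, "
          f"mean survival at 15 years = {mean_survival(tau_max, example_factors, 15.0):.3f}")
```

Because the mean survival is monotonic in τmax, any one-dimensional root finder would serve here; the value of 113 years reported above depends on the study's actual distribution of life factors, which this sketch does not reproduce.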

Simulation Design

To estimate the power of various study designs in finding a statistically significant difference in revision rates between conventional TKA and various technology-supported alternatives, we used a Monte Carlo approach. For a particular proposed study design (characterized by the number of patients enrolled in each of the control and intervention arms, ranging from 500 to 4000), we generated a large number of simulated patient populations, M = 3000, and used the methods described above to randomly assign patient-specific factors to the simulated patients, calculate associated life factors, and randomly select the time at which a revision surgery would be needed for each simulated patient’s implant.

For each simulated study design, we used the method described above to generate four sets of simulated patients: one set with coronal alignment parameters governed by the precision of a conventional manual surgical approach (the mean precision for conventional procedures reported in the included studies from the structured search was σ = 3.02°) and three additional sets with coronal alignment precision parameters governed by the precision of various robotic or navigated approaches, as reported (Supplementary Fig. 5; http://links.lww.com/CORR/A611) [6]. The three alternative precisions were σ = 1.0°, σ = 1.5°, and σ = 2.0°. These values were selected to cover the range of precisions reported for various robotic and navigation-based surgical systems [29, 31, 35].
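To make these precisions more concrete, the sketch below converts between an alignment SD and the share of knees falling outside a tolerance band, assuming alignment errors are normally distributed around neutral (zero mean). The ±3° band is an assumption chosen for illustration and is not specified in this section; the same normality assumption underlies the SDs that Table 1 derives from outlier percentages. Function names are hypothetical.

```python
from statistics import NormalDist


def outlier_fraction(sigma_deg, limit_deg=3.0):
    """Share of knees outside +/- limit_deg when error ~ N(0, sigma_deg^2)."""
    return 2.0 * (1.0 - NormalDist(0.0, sigma_deg).cdf(limit_deg))


def sigma_from_outliers(fraction, limit_deg=3.0):
    """Inverse of outlier_fraction: SD implied by a reported outlier share."""
    return limit_deg / NormalDist().inv_cdf(1.0 - fraction / 2.0)


if __name__ == "__main__":
    for sigma in (3.02, 2.0, 1.5, 1.0):
        share = outlier_fraction(sigma)
        print(f"sigma = {sigma:.2f} deg: {share:.1%} of knees beyond +/-3 deg")
```

The two functions are exact inverses of each other under the normality assumption, so a study reporting only an outlier percentage can be mapped to an approximate σ and back.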

Ethical Approval

This study did not use animal or human subjects, and therefore, ethical approval was not sought.

Statistical Analysis

Given the number of patients enrolled per arm in a study, we simulated the progression of the survival analysis year-by-year to model what would be found at annual patient follow-up visits. We then generated Kaplan-Meier survival curves for the conventional groups and three intervention groups representing each of the alternative surgical precisions; in all cases, we assumed no patients were lost to follow-up or died. For each alternative precision, we then calculated the expected reduction in revision rates (effect size) attributable to TA-TKA intervention and the number needed to treat (NNT) using TA-TKA to prevent one revision at 2, 5, 10, and 15 years after index surgery for the entire set of Kaplan-Meier survival analyses. At each year, we applied a log-rank test [13] to test for statistically significant difference in implant survival between the control group and each intervention group, assuming the null hypothesis that there was no difference in revision rates. At each timepoint in the M = 3000 simulated studies, we evaluated the empirical probability that the log-rank test would reject the null hypothesis (p < 0.05), indicating a statistically significant difference between populations in terms of implant survival.
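The procedure described above can be sketched in simplified form as follows. This is an illustrative reduction under stated assumptions, not the study's implementation: every patient in an arm shares a single characteristic time τ (the study assigns each patient an individual τ), the survival form S(t) = exp(-(t/τ)²) is assumed, revisions are recorded at the next annual visit, and the arm-level τ values are back-calculated from the 15-year revision rates of 7.5% and 5.5% quoted in this article. All function names are hypothetical.

```python
import math
import random


def tau_for_survival(survival, t_years):
    """Characteristic time giving S(t_years) = survival for S(t) = exp(-(t/tau)^2)."""
    return t_years / math.sqrt(-math.log(survival))


def sample_revision_time(tau, rng):
    """Inverse-transform draw of a revision time from S(t) = exp(-(t/tau)^2)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return tau * math.sqrt(-math.log(u))


def logrank_p(times_a, times_b, followup):
    """Two-sided log-rank p-value; all patients are censored at `followup` years."""
    def yearly_events(times):
        counts = [0] * (followup + 1)
        for t in times:
            year = max(1, math.ceil(t))  # revision recorded at next annual visit
            if year <= followup:
                counts[year] += 1
        return counts

    events_a, events_b = yearly_events(times_a), yearly_events(times_b)
    at_risk_a, at_risk_b = len(times_a), len(times_b)
    observed_minus_expected = variance = 0.0
    for year in range(1, followup + 1):
        n = at_risk_a + at_risk_b
        d = events_a[year] + events_b[year]
        if d > 0 and n > 1:
            observed_minus_expected += events_a[year] - d * at_risk_a / n
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
        at_risk_a -= events_a[year]
        at_risk_b -= events_b[year]
    if variance == 0.0:
        return 1.0  # no events observed, so no evidence of a difference
    z = observed_minus_expected / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


def estimate_power(tau_control, tau_treated, n_per_arm, followup, n_studies, seed=0):
    """Fraction of simulated studies in which the log-rank test gives p < 0.05."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_studies):
        control = [sample_revision_time(tau_control, rng) for _ in range(n_per_arm)]
        treated = [sample_revision_time(tau_treated, rng) for _ in range(n_per_arm)]
        if logrank_p(control, treated, followup) < 0.05:
            significant += 1
    return significant / n_studies


if __name__ == "__main__":
    tau_c = tau_for_survival(0.925, 15.0)  # 7.5% revised at 15 years
    tau_t = tau_for_survival(0.945, 15.0)  # 5.5% revised at 15 years
    for n in (500, 2500):
        power = estimate_power(tau_c, tau_t, n, followup=15, n_studies=300)
        print(f"{n} per arm, 15-year follow-up: estimated power {power:.2f}")
```

The number of simulated studies is kept small here so the sketch runs quickly; a larger count, such as the M = 3000 used in the study, narrows the Monte Carlo uncertainty in the estimated power.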

To enable clearer comparisons with previous studies, we smoothed and replotted the simulation results in statistical power plots that show the estimated statistical power of each study design (specified by the number of patients enrolled per arm) at various years of follow-up, and overlaid these plots with dots representing the corresponding size and average years of follow-up for the various comparative studies uncovered from the structured review.

Results

Our simulations found survivorship differences favoring TA-TKA of 1.4% (SD, σ = 2.0°), 1.8% (σ = 1.5°), and 2.0% (σ = 1.0°) at 15 years of follow-up relative to a baseline revision rate of 7.5% for the conventional procedure (Fig. 4); the resulting sample sizes needed to detect these differences with a power of 80% (a commonly used value for choosing a sample size [10]) generally exceeded 5000 (2500 per arm) for the highest precision intervention (σ = 1.0°), and were as high as 8000 (4000 per arm) for the least precise intervention (σ = 2.0°) at a follow-up of 15 years (Fig. 5). For the highest precision intervention at 2 and 5 years of follow-up, the number of enrolled patients required to achieve a power of 80% to detect any reduction in revision rates would have far exceeded the number of patients in our simulation study, decreasing to 8000 (4000 per arm), 3500 (1750 per arm), and 2700 (1350 per arm) at 10, 20, and 25 years of follow-up, respectively. For the highest simulated intervention precision (σ = 1.0°), representing the best-case scenario, the NNT using TA-TKA to prevent one revision was 1000 at 2 years, 334 at 5 years, 100 at 10 years, and 50 at 15 years post–index surgery.
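The relationship between these figures can be written out as a short worked calculation. The symbols R_conv and R_TA, denoting the cumulative revision rates in the conventional and TA-TKA arms, are notation introduced here for illustration; the absolute risk reductions at 2, 5, and 10 years are back-calculated from the reported NNTs rather than taken from the text.

```latex
\begin{align*}
\mathrm{ARR}(t) &= R_{\text{conv}}(t) - R_{\text{TA}}(t),
  & \mathrm{NNT}(t) &= \frac{1}{\mathrm{ARR}(t)} \\[4pt]
\mathrm{ARR}(15\,\text{y}) &= 0.020
  & \Rightarrow\ \mathrm{NNT} &= 1/0.020 = 50 \\
\mathrm{NNT}(10\,\text{y}) &= 100
  & \Rightarrow\ \mathrm{ARR} &= 1/100 = 1.0\% \\
\mathrm{NNT}(5\,\text{y}) &= 334
  & \Rightarrow\ \mathrm{ARR} &= 1/334 \approx 0.30\% \\
\mathrm{NNT}(2\,\text{y}) &= 1000
  & \Rightarrow\ \mathrm{ARR} &= 1/1000 = 0.10\% \\[4pt]
\text{Relative reduction at 15 y} &= \frac{0.020}{0.075} \approx 27\%
\end{align*}
```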

Fig. 4.

The results from the Monte Carlo analysis show the distribution of Kaplan-Meier survival curves (mean ± SD) for the conventional procedure and two of the TA-TKA groups. The σ = 1.5° group is omitted to improve image clarity. A color image accompanies the online version of this article.

Fig. 5.

(A-C) These figures show the probability that a survival analysis will reject the null hypothesis (p < 0.05), given the mean follow-up time and number of enrolled patients in each arm of the study, for three alternative effect sizes (that is, differences in the revision rate) corresponding to three TA-TKA system precisions in coronal alignment: (A) σ = 1.0°, (B) σ = 1.5°, and (C) σ = 2.0°. The precision of the conventional group was assumed to be σ = 3.02°. The dotted line in each plot indicates the 80% power line. The overlaid red and blue dots correspond to previous studies, each characterized by a study size and average follow-up period; these are plotted on the subplot most closely corresponding to the precision of the technique reported in the study. The star markers in each plot show the number of patients required per arm to reach an expected statistical power of 80% for each effect size, given a follow-up time of 15 years. For clarity, some studies in (C) are noted by citation number rather than by name; RCT = randomized controlled trial. A color image accompanies the online version of this article.

Discussion

Whether TA-TKA reduces revision rates is among the most actively debated questions in orthopaedic surgery. Although robotic and navigated innovations have demonstrated improved implant placement accuracy, many studies have not shown that this has resulted in detectable improvements in revision rates [9, 11, 12, 14, 18-25, 28-31, 33, 35, 36]. Given that traditional power analysis techniques are difficult to apply when the effect size is unknown, we used simulation techniques informed by data from previous studies to estimate potential effect sizes in patient populations with various demographic characteristics and to study the tradeoff between study size and power. The results from these simulations suggest that the effect size of TA-TKA interventions on revision rates is simply too small to be detected using traditional RCT methodologies with reasonable numbers of enrolled patients and follow-up times. We found that large clinical studies would be needed to prove that TA-TKA reduces revision rates, because the NNT did not fall below 100 until more than 10 years after the index surgery. That being so, routine use of TA-TKA is likely not justifiable based on avoided revision costs alone unless the associated per-procedure incremental costs prove to be substantially lower than have been reported for use of technology-assisted systems (averaging over USD 6000 in a relatively recent analysis of the US Nationwide Inpatient Sample database [3]).
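Once an effect size has been assumed, a closed-form approximation offers a rough cross-check on simulated sample sizes. The sketch below is not part of the original analysis: it applies Schoenfeld's approximation for the number of events required by a two-sided log-rank test, assuming proportional hazards, equal allocation, complete follow-up to 15 years, and the 7.5% and 5.5% 15-year revision rates from the best-case scenario. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.80):
    """Total events needed by a two-sided log-rank test with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2


def hazard_ratio_from_rates(rev_control, rev_treated):
    """Hazard ratio implied by two cumulative revision rates under proportional hazards."""
    return math.log(1 - rev_treated) / math.log(1 - rev_control)


def patients_per_arm(rev_control, rev_treated, alpha=0.05, power=0.80):
    """Patients per arm so that the expected number of events meets the requirement."""
    hr = hazard_ratio_from_rates(rev_control, rev_treated)
    events = schoenfeld_events(hr, alpha, power)
    mean_event_prob = (rev_control + rev_treated) / 2
    return events / mean_event_prob / 2


n_arm = patients_per_arm(0.075, 0.055)
print(f"Approximate patients per arm: {n_arm:.0f}")
```

Under these assumptions the approximation calls for roughly 2300 to 2400 patients per arm, the same order of magnitude as the approximately 2500 per arm found in the simulations for the σ = 1.0° intervention at 15 years.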

Limitations

This study has several important limitations. First, as this is a simulation-based analysis, we inferred the effects of patient-specific and surgeon-controlled factors, as well as their interactions, on the implant revision rates but did not directly measure such effects and have not directly verified the presumed multiplicative effect of the various risk factors. However, our risk models are based on risk data acquired from large clinical studies, systematic reviews, and registry studies, and capture what we believe are reasonable estimates of the effect size of TA-TKA on revision rates. Although these estimates may be refined in future studies, we believe that our overall findings are plausible as the implant survivorship results generated in our simulations are within the expected bounds reported in the cited comparative studies.

Second, we assumed that the principal effect of robot-assisted and navigation interventions would be to improve the coronal alignment precision achieved, and we did not consider potential benefits such as improved ligament balance. Although it is possible that such additional benefits may increase the effect size of technology assistance and decrease the study size required to demonstrate benefit, we are not aware of studies that claim such an effect or that would allow us to model it. Third, our simulation operated under the assumption that no simulated patients were lost to follow-up or died during the analysis. This likely made our estimate of the number of patients required to adequately power an RCT optimistic; actual studies would require more patients and longer follow-up times than reported here, which further supports our conclusions. Fourth, we chose to use survival rates from registries to determine the mean survival rate of our simulated TKA patient population. Studies from high-volume centers, such as those identified in the structured search, may report lower revision rates, which would require enrolling even more patients than we have calculated, so we feel that this assumption is conservative with respect to the conclusions we have drawn.

Fifth, our modeling of patient-specific factors affecting risk of revision was limited to age at index surgery, gender, and BMI. We did not consider other factors that may increase the risk of revision, such as smoking or the primary indication for TKA, because these factors are rarely reported in clinical studies assessing the effect of TA-TKA on revision rates. However, we feel that our model does capture the range of usually reported patient-specific risk factors and believe the exclusion of other factors would have negligible impact on our conclusions. Finally, our simulation was based on a binary classification of gender. We found no evidence that would allow us to estimate the relative risks associated with patients who are transgender or nonbinary who undergo TKA, but since such patients would constitute a small fraction of most patient populations, we do not anticipate that our conclusions would be strongly influenced by this assumption.

Effect Size and Value of Technologically Assisted TKA

Kim et al. [25] reported what appears to be the largest comparative study to date, enrolling nearly 1000 patients in each arm of a comparison study between conventional and robotically assisted TKAs. All patients were younger than 65 years at the time of the index surgery. The authors reported a variability of 2° in the robot-assisted procedure versus 3° in the conventional procedure, and no difference in terms of implant survival at 15 years; survival was 98% for both. They concluded that, “considering the additional time and expense associated with robot-assisted TKA, we cannot recommend its widespread use.” The survival rate reported in that study is higher than the rate we used in our simulations, likely because of the relatively low BMI (average 28.5 kg/m²), the predominance of women (approximately 80%), and the robotic procedure having an alignment variability at the high end of the range we simulated. Collectively, these factors tend to minimize the potential differences between robot-assisted and conventional procedures, so it is perhaps not surprising that the authors found no such differences given their practice situation.

The value proposition of TA-TKA can be further explored by assessing the economic impact of using TA-TKA as a tool to reduce revision rates in TKA [16]. Based on the simulated NNT to prevent one revision and factoring in an additional USD 6000 per TA-TKA case [3], the cost of preventing one revision is high, reaching USD 6 million at 2 years (NNT of 1000), just over USD 2 million at 5 years (NNT of 334), USD 600,000 at 10 years (NNT of 100), and USD 300,000 at 15 years (NNT of 50) post–index surgery. In our simulations, TA-TKA intervention did result in up to a 27% reduction in the incidence of revision at 15 years (from approximately 7.5% for conventional TKA to 5.5% for the most precise TA-TKA intervention investigated). Even though this reflects a large fraction of revisions, given that the overall success rate of the procedure is so high, and that even this 2% difference in revision rate is not apparent until 15 years after the index surgery, traditional RCTs would require excessively large numbers of patients to be enrolled and excessively long follow-up times even to demonstrate the existence of such a reduction, let alone to put reasonable confidence bounds on the estimate of reduction of revision rate. Therefore, we strongly recommend that investigators cease trying to demonstrate reductions in revision risk using an RCT approach applying TA-TKA to a broad patient population, as the estimated effect size is too small to be detected with reasonable numbers of patients in a reasonable period of follow-up.
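The cost figures quoted above are the product of the NNT and the per-case incremental cost. A minimal sketch of that arithmetic follows, using only the NNT values and the USD 6000 increment [3] stated in the text; the variable and function names are illustrative, and the calculation ignores discounting and any savings from the avoided revision itself.

```python
INCREMENTAL_COST_USD = 6000  # additional cost per TA-TKA case, from [3]

# Simulated NNT to prevent one revision, keyed by years after index surgery
NNT_BY_YEAR = {2: 1000, 5: 334, 10: 100, 15: 50}


def cost_per_revision_avoided(nnt, incremental_cost=INCREMENTAL_COST_USD):
    """Technology spending needed to avoid one revision."""
    return nnt * incremental_cost


costs = {year: cost_per_revision_avoided(nnt) for year, nnt in NNT_BY_YEAR.items()}
for year, cost in costs.items():
    print(f"{year:>2} years: USD {cost:,}")
#  2 years: USD 6,000,000
#  5 years: USD 2,004,000
# 10 years: USD 600,000
# 15 years: USD 300,000
```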

Conclusion

Based on our simulations, traditional RCT studies would require excessively large numbers of patients to be enrolled and excessively long follow-up times even to demonstrate whether a reduction in revision rate actually exists. Given that such large patient numbers and long follow-up times are needed to demonstrate a difference in outcome, the average per-patient benefit is likely too small to warrant universal clinical application if the per-procedure differential costs of using technology assistance remain as high as reported in some studies [3]. We note, though, that technology assistance may provide other benefits not considered here that may affect decisions about use. We also speculate that TA-TKA may potentially prove cost-effective in reducing revision risk for a more select group of patients who are at intrinsically higher risk of revision due to patient-specific factors such as younger age at index surgery or higher BMI. We therefore recommend that future comparative studies focus on the effect of TA-TKA intervention on patients who are at a relatively higher risk of revision compared with the general population.

Supplementary Material

abjs-479-2350-s001.docx (35.3KB, docx)
abjs-479-2350-s002.docx (25.2KB, docx)
abjs-479-2350-s003.docx (102.1KB, docx)

Acknowledgment

We thank the Centre for Hip Health and Mobility for providing facilities and technical support.

Footnotes

The institution of one or more of the authors (MDH, AJH) has received funding from the Natural Sciences and Engineering Research Council of Canada.

One of the authors (AJH) certifies that he holds shares in Traumis Surgical Systems and holds several patents (US9554812, US8548559, US10010381, and US9037295) that are broadly relevant to the work.

One of the authors (CA) certifies that she is President of and holds stock/stock options in Ammolite BioModels and holds several patents relevant to the work.

One of the authors (BM) certifies receipt of personal payments or benefits, during the study period, in an amount of less than USD 10,000 from Stryker, as well as research support from Zimmer, Smith & Nephew, DePuy, and Stryker.

All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.

Ethical approval was not sought for the present study.

This work was performed at the University of British Columbia, Vancouver, BC, Canada.

Contributor Information

Carolyn Anglin, Email: carolyn@ammolitebiomodels.com.

Bassam Masri, Email: bas.masri@ubc.ca.

Antony J. Hodgson, Email: ahodgson@mech.ubc.ca.

References

  • 1. Allgoewer A, Mayer B. Sample size estimation for pilot animal experiments by using a Markov Chain Monte Carlo approach. Altern Lab Anim. 2017;45:83-90.
  • 2. American Academy of Orthopaedic Surgeons. American Joint Replacement Registry (AJRR) 2019 Annual Report. Available at: http://connect.ajrr.net/2019-ajrr-annual-report. Accessed September 10, 2020.
  • 3. Antonios JK, Korber S, Sivasundaram L, et al. Trends in computer navigation and robotic assistance for total knee arthroplasty in the United States: an analysis of patient and hospital factors. Arthroplast Today. 2019;5:88-95.
  • 4. Baumbach JA, Willburger R, Haaker R, Dittrich M, Kohler S. 10-year survival of navigated versus conventional TKAs: a retrospective study. Orthopedics. 2016;39:S72-S76.
  • 5. Bayliss LE, Culliford D, Monk AP, et al. The effect of patient age at intervention on risk of implant revision after total replacement of the hip or knee: a population-based cohort study. Lancet. 2017;389:1424-1430.
  • 6. Bechtold B. Violin Plots for Matlab. 2020. Available at: https://github.com/bastibe/Violinplot-Matlab. Accessed October 4, 2020.
  • 7. Boyce L, Prasad A, Barrett M, et al. The outcomes of total knee arthroplasty in morbidly obese patients: a systematic review of the literature. Arch Orthop Trauma Surg. 2019;139:553-560.
  • 8. Canadian Institute for Health Information. Hip and knee replacements in Canada: CJRR quick stats, 2018-2019. Available at: https://www.cihi.ca/sites/default/files/document/cjrr-hip-knee-qs-2018-en.xlsx. Accessed August 28, 2020.
  • 9. Cho KJ, Seon JK, Jang WY, Park CG, Song EK. Robotic versus conventional primary total knee arthroplasty: clinical and radiological long-term results with a minimum follow-up of ten years. Int Orthop. 2019;43:1345-1354.
  • 10. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Erlbaum Associates; 1988.
  • 11. Cip J, Obwegeser F, Benesch T, Bach C, Ruckenstuhl P, Martin A. Twelve-year follow-up of navigated computer-assisted versus conventional total knee arthroplasty: a prospective randomized comparative trial. J Arthroplasty. 2018;33:1404-1411.
  • 12. Cip J, Widemschek M, Luegmair M, Sheinkop MB, Benesch T, Martin A. Conventional versus computer-assisted technique for total knee arthroplasty: a minimum of 5-year follow-up of 200 patients in a prospective randomized comparative trial. J Arthroplasty. 2014;29:1795-1802.
  • 13. Creed J, Gerke T, Berglund A. MatSurv: Survival analysis and visualization in MATLAB. J Open Source Softw. Available at: https://github.com/aebergl/MatSurv. Accessed August 2, 2020.
  • 14. D’Amato M, Ensini A, Leardini A, Barbadoro P, Illuminati A, Belvedere C. Conventional versus computer-assisted surgery in total knee arthroplasty: comparison at ten years follow-up. Int Orthop. 2019;43:1355-1363.
  • 15. Evans JT, Walker RW, Evans JP, Blom AW, Sayers A, Whitehouse MR. How long does a knee replacement last? A systematic review and meta-analysis of case series and national registry reports with more than 15 years of follow-up. Lancet. 2019;393:655-663.
  • 16. Gøthesen Ø, Slover J, Havelin L, Askildsen JE, Malchau H, Furnes O. An economic model to evaluate cost-effectiveness of computer assisted knee replacement surgery in Norway. BMC Musculoskelet Disord. 2013;14:202.
  • 17. Healthcare Cost and Utilization Project. HCUP Fast Stats - Most Common Operations During Inpatient Stays. Available at: https://hcup-us.ahrq.gov/faststats/NationalProceduresServlet?year1=2018&characteristic1=0&included1=1&year2=2008&characteristic2=54&included2=1&expansionInfoState=hide&dataTablesState=hide&definitionsState=hide&exportState=hide. Accessed September 5, 2020.
  • 18. Hernández-Vaquero D, Suarez-Vazquez A, Iglesias-Fernandez S. Can computer assistance improve the clinical and functional scores in total knee arthroplasty? Clin Orthop Relat Res. 2011;469:3436-3442.
  • 19. Hsu RWW, Hsu WH, Shen WJ, Hsu W Bin, Chang SH. Comparison of computer-assisted navigation and conventional instrumentation for bilateral total knee arthroplasty: the outcomes at mid-term follow-up. Medicine. 2019;98.
  • 20. Jenny J-Y, Saragaglia D, Bercovy M, et al. Navigation improves the survival rate of mobile-bearing total knee arthroplasty by severe preoperative coronal deformity: a propensity matched case–control comparative study. J Knee Surg. 2020.
  • 21. Jeon SW, Kim K Il, Song SJ. Robot-assisted total knee arthroplasty does not improve long-term clinical and radiologic outcomes. J Arthroplasty. 2019;34:1656-1661.
  • 22. Kim Y-H, Park J-W, Kim J-S. Computer-navigated versus conventional total knee arthroplasty. J Bone Joint Surg Am. 2012;94:2017-2024.
  • 23. Kim YH, Park JW, Kim JS. The clinical outcome of computer-navigated compared with conventional knee arthroplasty in the same patients: a prospective, randomized, double-blind, long-term study. J Bone Joint Surg Am. 2017;99:989-996.
  • 24. Kim YH, Park JW, Kim JS. 2017 Chitranjan S. Ranawat Award: Does computer navigation in knee arthroplasty improve functional outcomes in young patients? A randomized study. Clin Orthop Relat Res. 2018;476:6-15.
  • 25. Kim YH, Yoon SH, Park JW. Does robotic-assisted TKA result in better outcome scores or long-term survivorship than conventional TKA? A randomized, controlled trial. Clin Orthop Relat Res. 2020;478:266-275.
  • 26. Lee B, Cho H, Bin S, Kim J, Jo B. Femoral component varus malposition is associated with tibial aseptic loosening after TKA. Clin Orthop Relat Res. 2018;687:400-407.
  • 27. Moayyeri A, Sadatsafavi M, Leslie WD. Sample size requirements for bone density precision assessments and effect on patient categorization: a Monte Carlo simulation study. Bone. 2007;41:679-684.
  • 28. Molfetta L, Caldo D. Computer navigation versus conventional implantation for varus knee total arthroplasty: a case-control study at 5 years follow-up. Knee. 2008;15:75-79.
  • 29. Ollivier M, Parratte S, Lino L, Flecher X, Pesenti S, Argenson JN. No benefit of computer-assisted TKA: 10-year results of a prospective randomized study. Clin Orthop Relat Res. 2018;476:126-134.
  • 30. Ouanezar H, Franck F, Jacquel A, Pibarot V, Wegrzyn J. Does computer-assisted surgery influence survivorship of cementless total knee arthroplasty in patients with primary osteoarthritis? A 10-year follow-up study. Knee Surg Sports Traumatol Arthrosc. 2016;24:3448-3456.
  • 31. Song EK, Agrawal PR, Kim SK, Seo HY, Seon JK. A randomized controlled clinical and radiological trial about outcomes of navigation-assisted TKA compared to conventional TKA: long-term follow-up. Knee Surg Sports Traumatol Arthrosc. 2016;24:3381-3386.
  • 32. Song EK, Seon JK, Yim JH, Netravali NA, Bargar WL. Robotic-assisted TKA reduces postoperative alignment outliers and improves gap balance compared to conventional TKA knee. Clin Orthop Relat Res. 2013;471:118-126.
  • 33. Todesca A, Garro L, Penna M, Bejui-Hugues J. Conventional versus computer-navigated TKA: a prospective randomized study. Knee Surg Sports Traumatol Arthrosc. 2017;25:1778-1783.
  • 34. Social Security Administration. Actuarial life table. Available at: https://www.ssa.gov/oact/STATS/table4c6_2016.html. Accessed August 4, 2020.
  • 35. Yang HY, Seon JK, Shin YJ. Robotic total knee arthroplasty with a cruciate-retaining implant: a 10-year follow-up study. Clin Orthop Surg. 2017;9:169-176.
  • 36. Zhu M, Ang CL, Yeo SJ, Lo NN, Chia SL, Chong HC. Minimally invasive computer-assisted total knee arthroplasty compared with conventional total knee arthroplasty: a prospective 9-year follow-up. J Arthroplasty. 2016;31:1000-1004.


Articles from Clinical Orthopaedics and Related Research are provided here courtesy of The Association of Bone and Joint Surgeons
