Author manuscript; available in PMC: 2023 Jun 1.
Published in final edited form as: Bone. 2022 Feb 28;159:116376. doi: 10.1016/j.bone.2022.116376

Reverse Engineering the FRAX Algorithm: Clinical Insights and Systematic Analysis of Fracture Risk

Jules D Allbritton-King a, Julia K Elrod b, Philip S Rosenberg b, Timothy Bhattacharyya a
PMCID: PMC9035136  NIHMSID: NIHMS1795079  PMID: 35240349

Abstract

The Fracture Risk Assessment Tool (FRAX) is a computational tool developed to predict the 10-year probability of hip fracture and major osteoporotic fracture based on inputs of patient characteristics, bone mineral density (BMD), and a set of seven clinical risk factors. While the FRAX tool is widely available and clinically validated, its underlying algorithm is not public. The relative contribution and necessity of each input parameter to the final FRAX score is unknown. We systematically collected hip fracture risk scores from the online FRAX calculator for osteopenic Caucasian women across 473,088 unique inputs. This dataset was used to dissect the FRAX algorithm and construct a reverse-engineered fracture risk model to assess the relative contribution of each input variable.

Within the reverse-engineered model, age and T-Score were the strongest contributors to hip fracture risk, while BMI had marginal contribution. Of the clinical risk factors, parent history of fracture and ongoing glucocorticoid treatment had the largest additive effect on risk score. A generalized linear model largely recapitulated the FRAX tool with an R² of 0.91. Observed effect sizes were then compared to a true patient population by creating a logistic regression model of the Study of Osteoporotic Fractures (SOF) cohort, which closely paralleled the effect sizes seen in the reverse-engineered fracture risk model.

Analysis identified several clinically relevant observations of interest to FRAX users. The role of major osteoporotic fracture risk prediction in contributing to an indication of treatment need is very narrow, as the hip fracture risk prediction accounted for 98% of treatment indications for the SOF cohort. Removing any risk factor from the model substantially decreased its accuracy and confirmed that more parsimonious models are not ideal for fracture prediction. For women 65 years and older with a previous fracture, 98% of FRAX combinations exceeded the treatment threshold, regardless of T-score or other factors. For women age 70+ with a parent history of fracture, 99% of FRAX combinations exceed the treatment threshold.

Based on these analyses, we re-affirm the efficacy of the FRAX as the best tool for fracture risk assessment and provide deep insight into the interplay between risk factors.

Keywords: FRAX, Hip Fracture Risk, Osteoporosis, Generalized Linear Model, Bone Mineral Density

1. Introduction:

When a patient presents for discussion of osteoporosis management, clinicians must assess the individual’s risk of future fracture to make a recommendation for pharmacologic treatment. Fortunately, clinicians worldwide have free access to the online Fracture Risk Assessment Tool (FRAX), a clinically validated computational tool that integrates patient biometrics and risk factors to predict fracture risk. The National Osteoporosis Foundation (NOF) treatment guidelines recommend a clinical diagnosis of osteoporosis for patients with a 10-year hip fracture (HF) risk of ≥3% or a 10-year major osteoporotic fracture (MOF) risk of ≥20%, as calculated by the FRAX algorithm.

When using the FRAX software, clinicians input a patient’s sex, age, BMI, T-score, and seven yes/no risk factor datapoints in order to compute 10-year HF and MOF risk. Over 70 separate FRAX algorithms have been developed and validated using region and ethnicity-specific cohort data to account for population variations. Numerous adjustments have also been proposed to account for relevant risk factors excluded from the standard FRAX calculator, such as fall history, type 2 diabetes, and chronic kidney disease.[1–7]

Despite being the most widely used fracture risk prediction system in clinical assessment guidelines worldwide, the mathematical framework of the FRAX algorithm is not publicly available.[8–10] As such, several pertinent questions arise when using the FRAX in practice. Are all datapoints, particularly the T-score, needed? Are all risk factors equally weighted, or do they have variable effects based on other patient characteristics? How will a patient’s FRAX score, and thus fracture risk, change over time? Because the underlying algorithm is unknown, these are challenging to answer. From a clinical standpoint, FRAX is well-accepted but can be cumbersome. Clinicians are always searching for simplified models to determine fracture risk.

Several groups have developed their own simplified models for fracture risk prediction in specific subpopulations. These simplified models are then claimed to perform favorably when compared to FRAX predictions of a given cohort. As the creators of FRAX themselves have noted, these comparisons are misleading, as existing internally derived models have been constructed to fit the data of the relevant index cohort, and their predictive value for future patients is not validated.[11] Other more recent models have shown promising results for fracture risk prediction using classification and regression tree analysis (CART) and machine learning principles; however, FRAX remains the most widely used tool by clinicians.[12–16]

Our primary aim was to better describe the inner workings of the FRAX algorithm by systematically collecting 473,088 FRAX scores across a clinically relevant range of inputs to create a descriptive regression model with additive effects that recapitulated the FRAX tool. We then assessed the effects of each input variable on the final risk predictions, validated our fracture risk model against the Study of Osteoporotic Fractures (SOF) patient cohort, and identified several clinically relevant heuristics for determination of patient pharmacologic intervention.

2. Materials and Methods:

2.1. Parameter Ranges for FRAX Calculator Data Collection

The online FRAX software provides separate predictive tools validated against population-based cohorts from different regions. For the purposes of this study, FRAX data was gathered from the US (Caucasian) calculator for women only. The FRAX calculator required input of four continuous parameters (Age, Weight, Height, and Femoral Neck BMD) and seven binary risk factors (Previous Fracture, Parent Fractured Hip, Current Smoking, Glucocorticoids, Rheumatoid Arthritis, Secondary Osteoporosis, and Alcohol 3+ units/day) in order to generate an output prediction. The range of ages studied was defined as 50 to 90 years in increments of 2 years, as NOF guidelines for anti-osteoporotic medications apply to ages 50 and older. Height was held constant at 161.8 cm, the average for American women, while weight was varied to achieve BMI values from 15 to 40 in increments of 2.5. T-score was selected as the measure of femoral neck BMD and ranged from −2.5 to −1.0 in increments of 0.1, the range of T-scores for which NOF criteria recommend FRAX usage. Each of the seven binary risk factors required a yes/no input. T-scores below −2.5 were not studied because these patients meet criteria for treatment without further analysis.
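The parameter ranges above can be enumerated directly. The following is a minimal sketch, not the authors' script: the names `AGES`, `BMIS`, `T_SCORES`, `RISK_FACTORS`, `weight_for_bmi`, and `iter_inputs` are illustrative assumptions. It shows how 21 ages, 11 BMI values, and 16 T-scores give the 3,696-point grid, and how crossing that grid with the seven yes/no risk factors gives 473,088 unique inputs.

```python
from itertools import product

HEIGHT_M = 1.618  # height held constant at 161.8 cm

AGES = list(range(50, 91, 2))                              # 50, 52, ..., 90
BMIS = [15 + 2.5 * i for i in range(11)]                   # 15.0, 17.5, ..., 40.0
T_SCORES = [round(-2.5 + 0.1 * i, 1) for i in range(16)]   # -2.5, -2.4, ..., -1.0

RISK_FACTORS = [
    "Previous Fracture",
    "Parent Fractured Hip",
    "Current Smoking",
    "Glucocorticoids",
    "Rheumatoid Arthritis",
    "Secondary Osteoporosis",
    "Alcohol 3+ units/day",
]


def weight_for_bmi(bmi, height_m=HEIGHT_M):
    """Weight in kg that yields the requested BMI at a fixed height."""
    return bmi * height_m ** 2


def iter_inputs():
    """Yield every (age, bmi, t_score, flags) combination in the grid."""
    for age, bmi, t_score in product(AGES, BMIS, T_SCORES):
        # Each risk factor is either absent (False) or present (True).
        for flags in product((False, True), repeat=len(RISK_FACTORS)):
            yield age, bmi, t_score, dict(zip(RISK_FACTORS, flags))


if __name__ == "__main__":
    grid_points = len(AGES) * len(BMIS) * len(T_SCORES)
    total_inputs = sum(1 for _ in iter_inputs())
    print(grid_points, total_inputs)
```

Generating inputs lazily avoids holding all combinations in memory while a calculator is being queried one input at a time.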

2.2. Automated Data Collection from Online FRAX Calculator

A custom Python script was developed to automate data collection of all parameter combinations within the predetermined, discrete input ranges from the online FRAX calculator. The Selenium browser control framework (https://pypi.org/project/selenium/) using the Chrome browser was employed to systematically access the US (Caucasian) FRAX webpage, input a unique ID and prespecified parameter values, and record the output 10-year HF and MOF risk percentage. The software was iterated over the full range of input parameters for a total of 473,088 unique FRAX predictions, as well as 6,726 sets of patient parameters from the SOF cohort.

2.3. Three-Dimensional Risk Space for Individual Binary Risk Factors

To visualize the relative contribution of each binary risk factor to FRAX-predicted 10-year HF risk, the full 3,696-point range of age, BMI, and T-score combinations were plotted in three dimensions for subsets of collected data including only one risk factor. All combinations that yielded ≥3% 10-year HF risk were plotted in red, and risk space was defined as the percentage of all 3,696 points that produced a ≥3% 10-year HF risk output.
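As a rough illustration of the risk space definition, the sketch below computes the percentage of grid points at or above the 3% threshold. The function name `risk_space` and the sample values are hypothetical and are not taken from the collected dataset.

```python
def risk_space(hf_risks, threshold=3.0):
    """Percentage of points whose 10-year HF risk (in %) is >= threshold."""
    risks = list(hf_risks)
    if not risks:
        raise ValueError("no risk scores supplied")
    above = sum(1 for r in risks if r >= threshold)
    return 100.0 * above / len(risks)


if __name__ == "__main__":
    # Hypothetical HF risk values, for illustration only.
    example = [1.0, 3.0, 5.0, 2.9]
    print(risk_space(example))
```

Applied to one 3,696-point subset per risk factor, this yields the single percentage reported for each panel.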

2.4. Reverse-Engineered Fracture Risk Model

In order to better understand the FRAX algorithm, a generalized linear model predicting the log of FRAX 10-year HF risk was constructed. Age was blocked into two-year groups and converted to a categorical variable to account for its nonlinear association with HF risk, while BMI and T-score were included as continuous numeric variables. The six binary predictors were included in the model as factors. For each observation, a weight was calculated using the square of predictions from a simple linear model regressing HF Risk against only Age, BMI, and T-score. These weights account for a positive association between the mean and the variance of the 10-year FRAX score in our dataset. We noted that the model more accurately described the well-established negative correlation between BMI and hip fracture risk when datapoints with exceedingly low BMIs of <18.5 were excluded. This may be because a large portion of underweight women above age 50 are already osteoporotic with a T-score below −2.5. Thus, the T-scores between −2.5 and −1 in our generated dataset likely do not apply to many underweight patients. We chose to exclude datapoints with BMIs below 18.5 to avoid obscuring the true relationship between BMI and HF Risk. All subsequent modeling approaches excluded this group as well. The accuracy of the reverse-engineered fracture model was assessed by comparing its prediction of treatment recommendation for each patient from the SOF cohort to the actual FRAX score for each patient.

2.5. Logistic Regression Model of Study of Osteoporotic Fractures (SOF) Cohort

Data from the SOF cohort was used to cross-validate the associations observed in the data generated with the online FRAX calculator. SOF was an observational study following a cohort of 10,366 women age 65 or older from 1986 to 2008. Study participants were recruited from the Baltimore, Pittsburgh, Minneapolis, and Portland metropolitan areas and attended nine data collection visits. Age, BMI, Previous Fracture, Parent Fracture, Current Smoking, Steroid Use, and Alcohol 3+ units per day were collected from the first visit. BMD, which was used to calculate T-score, was collected at the second visit (year 2) and Rheumatoid Arthritis status was collected at visit 4 (year 6). Hip Fracture is a binary variable indicating whether the patient experienced a hip fracture during the first ten years of the study.

A few subtle differences between these variables and the FRAX variables should be noted. First, in the SOF dataset, Previous Fracture refers to any previous bone fracture after the age of 50. The comparable FRAX variable refers to a previous fragility fracture experienced at any age. Similarly, in the SOF dataset, Parent Fracture refers to any parent fracture after the age of 50, while the comparable FRAX variable, Parent Fractured Hip, refers only to previous parental hip fractures. For the SOF data, Steroids refers to any past or present use of steroid pills while the FRAX variable Glucocorticoids refers to any past or present long-term use of oral glucocorticoids.

In order to understand the association between these SOF variables and hip fracture risk, a logistic regression model predicting the log-odds of hip fracture from the three quantitative predictors and six binary predictors was constructed. Age was blocked into two-year groups and converted to a categorical variable, while BMI and T-score were treated as continuous numeric variables. As with the reverse-engineered fracture model, 51 patients with BMIs <18.5 were excluded from the dataset prior to training the logistic regression model.

2.6. Reduced FRAX Model for Individual Risk Factor Utility

We aimed to assess the amount of variation in hip fracture risk explained by each binary predictor. Thus, we constructed a simple linear model regressing the log of HF Risk against Age and BMI as factors. Then, T-score as a factor and each binary predictor was added to the reduced model individually and the partial R-squared statistic for this addition was calculated. This provided a quantitative measurement of the residual variance explained by T-score and each binary predictor.

2.7. Performance of FRAX calculator, reverse-engineered fracture risk model, and reduced model in predicting 10-year hip fracture for SOF patients.

To illustrate changes in predictive power for more parsimonious models of FRAX, we compared the prediction of treatment recommendation from FRAX, the reverse-engineered fracture risk model, and a reduced model including only Age, BMI, T-Score, and Parent Fractured Hip versus true 10-year incidence of hip fracture in the SOF cohort. Accuracy was calculated for each model.

2.8. Comparative Utility of FRAX-Predicted HF Risk and MOF Risk

In order to assess the comparative utility of FRAX scores for HF and MOF when making a treatment recommendation, we plotted HF risk vs MOF risk for the full collected FRAX dataset. Divergent datapoints, defined as FRAX scores where MOF risk was ≥20% but HF risk was less than 3%, were displayed in red. A separate plot of HF risk vs MOF risk was plotted for the SOF patient cohort. The percentages of divergent datapoints within the full FRAX dataset and SOF patient FRAX scores were also calculated.
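The divergence definition can be written as a small classifier. This is a minimal sketch using the NOF thresholds stated in the Introduction (HF ≥3%, MOF ≥20%); the function names and the sample scores are illustrative assumptions.

```python
HF_THRESHOLD = 3.0    # 10-year hip fracture risk threshold, in percent
MOF_THRESHOLD = 20.0  # 10-year major osteoporotic fracture risk threshold, in percent


def treatment_indicated(hf_risk, mof_risk):
    """True if either risk meets its treatment threshold."""
    return hf_risk >= HF_THRESHOLD or mof_risk >= MOF_THRESHOLD


def is_divergent(hf_risk, mof_risk):
    """True if MOF risk meets its threshold while HF risk stays below its own."""
    return mof_risk >= MOF_THRESHOLD and hf_risk < HF_THRESHOLD


def percent_divergent(scores):
    """Percentage of (hf_risk, mof_risk) pairs that are divergent."""
    scores = list(scores)
    if not scores:
        raise ValueError("no scores supplied")
    return 100.0 * sum(is_divergent(hf, mof) for hf, mof in scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical (HF, MOF) pairs, for illustration only.
    sample = [(2.5, 21.0), (3.5, 15.0), (1.0, 8.0), (4.0, 25.0)]
    print(percent_divergent(sample))
```

A divergent case is one where the MOF score alone triggers the treatment indication.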

3. Results:

3.1. Three-Dimensional Risk Space for Individual Binary Risk Factors

Three-dimensional graphing of individual 3,696-point datasets including only one risk factor allowed visualization of the effect of each risk factor as a percentage of the total set of FRAX scores (Figure 1A, Supplemental File 1). With no additional risk factors included, 28% of combinations yielded a ≥3% 10-year HF risk. Points above the threshold clustered at higher age and lower T-score. Inclusion of the ‘Glucocorticoids’ risk factor produced the greatest increase in total risk space, with 54% of all points at or above the 3% 10-year hip fracture risk threshold. Inclusion of the ‘Secondary Osteoporosis’ risk factor had no effect on FRAX scores, reflecting its exclusion from the FRAX algorithm when a BMD measurement is inputted.[17] Our subsequent analysis and modeling approaches thus excluded the ‘Secondary Osteoporosis’ risk factor. Notably, >90% of combinations above age 70 with the ‘Parent Fractured Hip’ risk factor included were at ≥3% HF risk. We noted that addition of ‘Glucocorticoids’, ‘Current Smoking’, ‘Previous Fracture’, ‘Alcohol, 3+ units/day’, and ‘Rheumatoid Arthritis’ risk factors led to a more uniform expansion of the risk space relative to the no risk factor condition.

Figure 1. Three-dimensional representation of risk space for ≥3% 10-year hip fracture risk.


The FRAX calculation tool was used to calculate 10-year hip fracture risk for 3,696 evenly-spaced combinations of age, BMI, and T-score. Red points indicate a combination that yielded a 10-year hip fracture risk of ≥3% thus meeting criteria for medication treatment. The percentage value for each graph indicates the percentage of all 3,696 combinations per dataset that were ≥3%.

3.2. Reverse-Engineered Fracture Risk Model

The reverse-engineered fracture risk model had an accuracy of 83.4% when predicting treatment recommendations for SOF patients, as compared to the recommendation based on their full FRAX score (Table 1A). The model was then used to visualize the additive effect of each input variable (Age, BMI, T-Score, and all risk factors) on the log of the predicted 10-year hip fracture risk (Figure 2A). Age was observed to have an increasing effect on log(HF Risk) peaking at age 84. At ages 85 and up, the competing risk of death is revealed by a decrease in the risk of hip fracture even with advancing age (Supplemental Figure 1). BMI had a negative correlation with log(HF Risk) in the range of 20 to 40, with the highest additive contribution to log(HF Risk) observed at the lowest BMIs. T-score demonstrated a linear effect on log(HF Risk), with the greatest additive contribution at the lowest T-score of −2.5. All risk factors had an increased additive effect on log(HF Risk), with ‘Parent Fractured Hip’ and ‘Glucocorticoids’ having the greatest effect. The additive effects of each clinical risk factor in the generalized linear model largely recapitulated the respective expansion of risk space for each risk factor observed in Figure 1.
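Because the model is additive on the log scale, each effect multiplies the predicted risk itself. The sketch below illustrates this relationship only: it assumes a natural logarithm, and the baseline and effect values are hypothetical placeholders, not coefficients from the fitted model.

```python
import math


def predicted_risk(baseline_log_risk, effects):
    """Risk implied by a log-linear model: exp(baseline + sum of additive effects)."""
    return math.exp(baseline_log_risk + sum(effects))


def risk_ratio(effect):
    """Multiplicative change in risk produced by one additive log-scale effect."""
    return math.exp(effect)


if __name__ == "__main__":
    # Hypothetical values for illustration only (natural log assumed).
    baseline = math.log(2.0)   # a baseline 10-year HF risk of 2%
    effect = math.log(1.5)     # an effect that multiplies risk by 1.5
    print(predicted_risk(baseline, [effect]), risk_ratio(effect))
```

Under this reading, two risk factors with additive effects a and b together multiply the baseline risk by exp(a + b).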

Table 1: Performance of reverse-engineered fracture risk model predictions vs. actual FRAX scores for SOF patients.

(A) The predicted 10-year risk of hip fracture for 6,675 SOF patients was calculated using the actual FRAX calculator and the reverse-engineered fracture risk model. The confusion matrix shows that the reverse-engineered fracture risk model has an accuracy of 83.4% compared to the full FRAX model.

Rows: FRAX score. Columns: reverse-engineered fracture risk model.
FRAX Score | Model <3% HF Risk | Model ≥3% HF Risk
<3% HF Risk | 1518 | 348
≥3% HF Risk | 763 | 4046
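The reported accuracy can be checked from the four cells of Table 1. This is a minimal sketch; the dictionary keys and the function name `accuracy` are illustrative, and the counts are those given in the table.

```python
# Counts from Table 1, keyed by (FRAX classification, model classification).
CONFUSION = {
    ("<3%", "<3%"): 1518,   # both below threshold
    ("<3%", ">=3%"): 348,   # model above threshold, FRAX below
    (">=3%", "<3%"): 763,   # model below threshold, FRAX above
    (">=3%", ">=3%"): 4046, # both at or above threshold
}


def accuracy(confusion):
    """Fraction of patients for whom the model and FRAX agree."""
    total = sum(confusion.values())
    agree = sum(n for (frax, model), n in confusion.items() if frax == model)
    return agree / total


if __name__ == "__main__":
    print(sum(CONFUSION.values()), round(100 * accuracy(CONFUSION), 1))
```

The four cells sum to the 6,675 SOF patients, and agreement on 5,564 of them gives the 83.4% figure.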

Figure 2: Additive effects of clinical risk factors on 10-year hip fracture risk for women.


(A) A generalized linear model was fitted for 236,544 combinations of the 9 risk predictors and their corresponding FRAX 10-year hip fracture risk scores. The additive effects on the log of the hip fracture risk for rising age, BMI, and T-Score are shown in the top panel. The additive effects of the binary factors are seen in the lower panel. (B) A logistic regression model was fitted for data from the first ten years of the Study of Osteoporotic Fractures (SOF). Additive effects on the log-odds of hip fracture are shown.

3.3. Logistic Regression Model of Study of Osteoporotic Fractures (SOF) Cohort

The logistic regression model fitted for SOF data was used to visualize the additive effect of each FRAX-included input variable on hip fracture data from a real patient cohort (Figure 2B). The additive effects of age, BMI, and T-score followed trends similar to those in the reverse-engineered fracture risk model constructed from the FRAX dataset. The SOF-defined risk factor of ‘Previous Fracture’ had the greatest average additive contribution to the log-odds of hip fracture, and ‘Alcohol 3+ units/day’ showed a borderline significant association. ‘Any Parent Fracture’ also demonstrated a positive association with hip fracture, while the other predictors did not show statistically significant correlations with hip fracture risk.

3.4. Reduced FRAX Model for Individual Risk Factor Utility

To measure the relative contributions of each risk factor to hip fracture risk, we created a reduced model by regressing the log of hip fracture risk against age and BMI (Table 2). We then sequentially added risk factors to the reduced model and computed the partial R-squared value to measure each risk factor’s contribution to the model. The reduced model with no risk factors or T-score included had an R-squared value of 0.533 for the full FRAX dataset. Adding ‘Parent Fractured Hip’ to the reduced model yielded the greatest increase among the binary risk factors for an R-squared value of 0.616, with a partial R-squared of 0.179. Inclusion of T-score alone raised the R-squared value to 0.659 with a partial R-squared of 0.269. Including all risk factors while omitting T-score yielded an R-squared value of 0.791 with a partial R-squared of 0.552. Including every risk factor and T-score to recapitulate the full FRAX algorithm yielded an R-squared value of 0.917 and a partial R-squared of 0.821. This approach allowed an explicit ranking of the relative effect of each clinical risk factor and paralleled the findings of the risk space visualizations and reverse-engineered fracture risk model.

Table 2. Comparative Utility of FRAX Risk Factors.

A reduced model regressing log(HF Risk) against Age and BMI was developed to further assess the additive predictive capacity of T-Score and individual FRAX risk factors. Each variable was individually added to the reduced model and the partial R-squared value was calculated.

Risk Factors Included | R² | Partial R²
T-Score, Previous Fracture, Parent Fractured Hip, Glucocorticoids, Current Smoking, Rheumatoid Arthritis, Alcohol 3+ units/day (Full model) | 0.917 | 0.821
Previous Fracture, Parent Fractured Hip, Glucocorticoids, Current Smoking, Rheumatoid Arthritis, Alcohol 3+ units/day | 0.791 | 0.552
T-Score, Parent Fractured Hip | 0.742 | 0.448
T-Score, Glucocorticoids | 0.711 | 0.382
T-Score, Current Smoking | 0.697 | 0.352
T-Score, Previous Fracture | 0.694 | 0.345
T-Score, Alcohol 3+ units/day | 0.687 | 0.330
T-Score, Rheumatoid Arthritis | 0.678 | 0.311
T-Score | 0.659 | 0.269
Parent Fractured Hip | 0.616 | 0.179
Glucocorticoids | 0.585 | 0.113
Current Smoking | 0.571 | 0.082
Previous Fracture | 0.568 | 0.075
Alcohol 3+ units/day | 0.561 | 0.060
Rheumatoid Arthritis | 0.552 | 0.042
Reduced Model (Age + BMI) | 0.533 | NA
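The partial R-squared values in Table 2 are consistent with the standard definition, (R²_full − R²_reduced) / (1 − R²_reduced), which is assumed here rather than stated in the text. The sketch below applies that formula to a few rows of the table. Because the inputs are the published three-decimal R² values, results match the table only to within rounding.

```python
R2_REDUCED = 0.533  # reduced model (Age + BMI), from Table 2


def partial_r2(r2_full, r2_reduced=R2_REDUCED):
    """Share of the reduced model's unexplained variance explained by the added terms."""
    return (r2_full - r2_reduced) / (1.0 - r2_reduced)


# (R², published partial R²) pairs copied from Table 2.
TABLE_2_ROWS = {
    "Full model": (0.917, 0.821),
    "All risk factors, no T-Score": (0.791, 0.552),
    "T-Score": (0.659, 0.269),
    "Parent Fractured Hip": (0.616, 0.179),
    "Rheumatoid Arthritis": (0.552, 0.042),
}

if __name__ == "__main__":
    for label, (r2, published) in TABLE_2_ROWS.items():
        print(f"{label}: computed {partial_r2(r2):.3f}, published {published:.3f}")
```

The small discrepancies in the third decimal place are expected when recomputing from rounded R² values.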

3.5. Performance of FRAX calculator, reverse-engineered fracture risk model, and reduced model in predicting 10-year hip fracture for SOF patients.

To better visualize the loss of predictive power in more parsimonious models of FRAX, the accuracy in predicting actual incidence of fracture among the SOF cohort was calculated for FRAX, the reverse-engineered fracture risk model, and a reduced model (Figure 3). The true FRAX prediction and the reverse-engineered fracture risk model performed comparably, with accuracies of 40% and 44%, respectively. The accuracy of a reduced model incorporating only Age, BMI, T-Score, and Parent Fractured Hip decreased substantially to 23%.

Figure 3. Performance of FRAX calculator, reverse-engineered fracture risk model, and reduced model in predicting 10-year hip fracture for SOF patients.


The predicted 10-year risk of hip fracture for 6,675 SOF patients was calculated using (A) the FRAX calculator, (B) the reverse-engineered fracture risk model, and (C) a reduced model incorporating only Age, BMI, T-Score, and Parent Fractured Hip. The accuracy of each model in predicting a sustained hip fracture is displayed to the right.

3.6. Comparative Utility of FRAX-Predicted HF Risk and MOF Risk

To assess the utility of MOF Risk vs. HF Risk for making treatment recommendations, we identified divergent cases in which MOF was ≥20% but HF risk was <3%. For the collected FRAX dataset, 6.4% of scores were divergent (Figure 4A), as compared to 1.8% divergent SOF patient FRAX scores (Figure 4B). Most divergent cases in both datasets were between ages 65 and 70 with multiple clinical risk factors.

Figure 4. Low Divergence of Major Osteoporotic Fracture Risk in FRAX and SOF Datasets.


(A) Visual representation of FRAX-calculated 10-year HF risk vs. 10-year MOF risk for all datapoints collected. Points in red represent the population of cases for which HF was <3% and MOF was ≥20%. (B) Visual representation of FRAX-calculated 10-year HF risk vs. 10-year MOF risk for all SOF patients. Points in red represent the population of cases for which HF was <3% and MOF was ≥20%.

3.7. Derivation of Simple Heuristics from Modeling Approaches

We used our generated FRAX dataset to identify several simplified cutoffs which may be clinically helpful (Table 3). First, we observed that >99% of FRAX combinations with age over 70 and ‘Parent Fractured Hip’ met or exceeded the 3% 10-year HF risk treatment threshold. Second, for combinations over age 65 with ‘Previous Fracture’, >98% of combinations exceeded the treatment threshold. Third, for combinations over age 60 with ‘Rheumatoid Arthritis’, >90% of combinations exceeded the treatment threshold. Finally, HF risk is a more reliable indicator than MOF risk for a treatment recommendation, as only 6.4% of FRAX datapoints yielded a recommendation for treatment based on MOF risk and not HF risk. Similarly, only 1.8% of SOF patient FRAX scores exceeded the MOF risk treatment threshold but not the HF risk threshold.

Table 3. Clinically Relevant Heuristics Derived from FRAX Analysis.

Three patterns were identified based on visualization and descriptive analysis of collected FRAX data.

Patient Subpopulation | Clinical Heuristic
Age >70, Parent Fractured Hip | >99% of combinations exceed FRAX treatment threshold
Age >65, Previous Fracture | >98% of combinations exceed FRAX treatment threshold
Age >60, Rheumatoid Arthritis | >90% of combinations exceed FRAX treatment threshold

4. Discussion:

Systematic collection of FRAX scores across a clinically relevant range of inputs allowed us to reconstruct a model of the FRAX algorithm, which was then used to assess the relative contribution of individual risk factors to the final HF risk prediction. We concluded that removing any of the risk factors, especially the T-score, substantially degraded the hip fracture prediction accuracy. Further, our study provides a descriptive analysis of the FRAX algorithm’s integration of each input variable and derives several clinically useful takeaways from these findings.

There is a great deal of interest in simplifying the FRAX algorithm by removing risk factors to create a more expedient system for clinical use. However, our analysis reaffirms the utility and robustness of the FRAX algorithm. Removal of any one clinical risk factor from the reverse-engineered fracture model substantially reduced its accuracy. While other tools exist, the FRAX tool is freely available, widely validated, and provides the most value for future fracture risk assessment. Collection of 11 datapoints may be cumbersome; however, these are simple clinical questions that have no cost other than time and data entry.

An accurate method for risk stratification without BMD measurement is also highly sought after, as measuring BMD requires additional cost, time, and radiation. Medicare reimbursement of dual-energy X-ray absorptiometry (DXA) has fallen about 70% since 2006, further limiting access.[18] Numerous groups have claimed that removal of BMD measures does not pose a significant detriment based on comparisons of FRAX calculations with and without BMD for small patient cohorts. However, our data make it clear that without inclusion of T-score the accuracy of the reverse-engineered fracture model’s hip fracture prediction falls substantially, similar to previously published studies assessing FRAX scoring with and without input of BMD measures.[19,20]

There has been extensive analysis of clinical risk factors that affect hip fracture probability independently of BMD, seven of which are incorporated into the online FRAX calculator. The external validity of the FRAX model has been verified using many region and ethnicity-based cohorts, but the actual effect of each risk factor on the final FRAX prediction is unknown.[21–28] Our three-dimensional visualization of how FRAX-predicted HF risk changed according to age, BMI, and T-score, as well as each individual clinical risk factor, provides insight into the underlying weighting of the FRAX algorithm. Each risk factor produced relatively uniform increases in risk space from the ‘No Risk Factor’ data. ‘Parent Fractured Hip’ is an exception in that it sharply increased the proportion of datapoints above age 70 that were ≥3% 10-year HF risk. This interaction reflects the variable effect size of Parent Fractured Hip with age, as stated on the official FRAX website. An understanding of the interplay between HF risk and age for patients with a given risk factor is of use to clinicians for discussion of treatment options with patients who are likely to exceed the HF risk threshold in the near future.

Isolating the additive effects of each input variable on the log of FRAX-calculated HF risk revealed the differential contribution towards predicted risk across each variable’s range. The additive effect of age increased until 80–85 years. After age 85, the ten-year risk of hip fracture dropped, reflecting the competing risk of death. The additive effect of BMI decreased linearly as BMI increased. Independently, the additive effect of T-score on log of hip fracture risk decreased linearly with T-score. These findings closely paralleled prior studies on the relationship between age, BMI, BMD, and 10-year hip fracture risk in women.[29,30] Inclusion of the ‘Parent Fractured Hip’ risk factor produced the single most pronounced increase in hip fracture risk, paralleling the expansion of risk space observed across the full range of age, BMI, and T-score.

A logistic regression model constructed from the SOF dataset confirmed the positive association between age and HF risk observed in the FRAX-based model, as seen by the increasing effect size until age 80–85. The negative association between T-score and hip fracture risk also paralleled the findings of the FRAX-based model, albeit with a steeper slope due to the grouping of patients with T-scores below −2.5 or above −1.0 with the respective extremes of our prespecified T-score range, −2.5 and −1.0. The effect of BMI on the log(odds) of HF decreased linearly as BMI increased, similar to the reverse-engineered fracture risk model. It should be noted that the official FRAX tool indicates the effect of BMI is substantially reduced when a BMD measurement is included. This interaction accounts for the differences in BMI effect magnitude between the reverse-engineered fracture risk model and the SOF model.

Each clinical risk factor included in FRAX has a differential effect on the final risk prediction, as demonstrated by our calculation of each factor’s partial R-squared value when incorporated into a reduced model. Thus, removal of risk factors to create a more parsimonious model substantially degrades the model’s accuracy. Notably, a reduced model displays a substantially decreased proportion of true negatives (No Fracture/<3% HF Risk) and a concurrent increase in false positives (No Fracture/≥3% HF Risk).

Current guidelines recommend pharmacologic treatment if a patient has either a ≥3% 10-year risk of hip fracture or a ≥20% 10-year risk of a major osteoporotic fracture. We found that the calculation of major osteoporotic fracture risk has little effect in the real world. Very few datapoints were divergent, that is, above the MOF threshold but below the hip fracture risk threshold (Figure 4). Only 1.8% of SOF patients were in this divergent group. Thus, while both risk calculations can be done, the probability of hip fracture dominates the treatment decision.

In addition to our validation of the FRAX algorithm and the necessity of its included inputs, our descriptive analysis and modeling approaches yielded several heuristics valuable to clinicians. The shortcuts identified will help clinicians to assess some patients without BMD measures immediately available, and to evaluate the suitability of follow-up DXA scans. These heuristics are of value in curbing the overuse of DXA in lower-risk patients.[31,32]

A major limitation of our study is that we only addressed the osteopenic population (T score −1.0 to −2.5), but prior to evaluation it is impossible to know if the patient falls into this category. However, it is precisely this osteopenic group that requires the most comprehensive risk assessment, as only 1.4% of all fractures among the SOF cohort occurred in patients with T-scores greater than −1.0. As we have established, a measurement of bone mineral density is the single most important variable apart from age for producing an accurate prediction of fracture risk. Further, our FRAX dataset was generated from the algorithm for American women of Caucasian descent. Because FRAX algorithms are modified based on region and ethnicity-specific cohorts, our findings may not align perfectly for other patient groups.

Though clinicians have sought to simplify the FRAX calculator since its implementation, we found that all included variables are necessary to provide an accurate prediction of absolute fracture risk. Our analysis provides a quantitative description of the effect of each FRAX input on the output risk predictions and identifies several clinically applicable heuristics for rapid evaluation of some patients. Further analysis of the FRAX algorithms for other regions is warranted to assess the validity of these heuristics among other cohorts and potentially identify other useful clinical shortcuts.

Supplementary Material

1
2
Download video file (94.4MB, mp4)

Highlights.

  • The FRAX is an online tool used to decide who needs osteoporosis treatment. It is a black box because the algorithm is unknown.

  • By testing 473,088 possible risk factor combinations, we reverse-engineered the model.

  • The FRAX is the most accurate model for hip fracture prediction. It performs better than other models on real world patients.

  • For women 65 years and older with a previous fracture, treatment for osteoporosis is recommended regardless of T-Score. 98% of FRAX combinations exceed the treatment threshold.

Funding Sources:

This work was supported by the Intramural Research Program of the National Institute of Arthritis and Musculoskeletal and Skin Diseases (Z01 AR041184). The Study of Osteoporotic Fractures (SOF) is supported by National Institutes of Health funding. The National Institute on Aging (NIA) provides support under the following grant numbers: R01 AG005407, R01 AR35582, R01 AR35583, R01 AR35584, R01 AG005394, R01 AG027574, and R01 AG027576.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

DISCLOSURE PAGE

All the authors state that they have no conflicts of interest.

CRediT authorship contribution statement

Jules D. Allbritton-King: Methodology, Software, Analysis, Writing.

Julia K. Elrod: Statistical Analysis, Methodology, Writing.

Philip S. Rosenberg: Methodology, Statistical Analysis, Writing – review & editing.

Timothy Bhattacharyya: Conceptualization, Writing – review & editing, Supervision, Project administration, Funding acquisition.

References

  • [1].Crandall CJ, Larson J, Cauley JA, Schousboe JT, LaCroix AZ, Robbins JA, et al. Do Additional Clinical Risk Factors Improve the Performance of Fracture Risk Assessment Tool (FRAX) Among Postmenopausal Women? Findings From the Women’s Health Initiative Observational Study and Clinical Trials. JBMR Plus. 2019;3(12):1–12.
  • [2].Kayan K, Johansson H, Oden A, Vasireddy S, Pande K, Orgee J, et al. Can fall risk be incorporated into fracture risk assessment algorithms: A pilot study of responsiveness to clodronate. Osteoporos Int. 2009;20(12):2055–61.
  • [3].Whitlock RH, Leslie WD, Shaw J, Rigatto C, Thorlacius L, Komenda P, et al. The Fracture Risk Assessment Tool (FRAX®) predicts fracture risk in patients with chronic kidney disease. Kidney Int [Internet]. 2019;95(2):447–54. Available from: 10.1016/j.kint.2018.09.022
  • [4].Giangregorio LM, Leslie WD, Lix LM, Johansson H, Oden A, McCloskey E, et al. FRAX underestimates fracture risk in patients with diabetes. J Bone Miner Res. 2012;27(2):301–8.
  • [5].Bisson EJ, Finlayson ML, Ekuma O, Marrie RA, Leslie WD. Accuracy of FRAX® in People With Multiple Sclerosis. J Bone Miner Res. 2019;34(6):1095–100.
  • [6].Leslie WD, Schousboe JT, Morin SN, Martineau P, Lix LM, Johansson H, et al. Loss in DXA-estimated total body lean mass but not fat mass predicts incident major osteoporotic fracture and hip fracture independently from FRAX: a registry-based cohort study. Arch Osteoporos. 2020;15(1):1–14.
  • [7].Leslie WD, Schousboe JT, Morin SN, Martineau P, Lix LM, Johansson H, et al. Measured height loss predicts incident clinical fractures independently from FRAX: a registry-based cohort study. Osteoporos Int. 2020;31(6):1079–87.
  • [8].Kanis JA, Harvey N, Cooper C, Johansson H, Odén A, McCloskey E, et al. A systematic review of intervention thresholds based on FRAX: A report prepared for the National Osteoporosis Guideline Group and the International Osteoporosis Foundation. Arch Osteoporos. 2016;11(1):1–101.
  • [9].Adami S. The FRAX®: critical appraisal. Int J Clin Rheumtol [Internet]. 2009. Dec;4(6):645–50. Available from: http://www.futuremedicine.com/doi/abs/10.2217/ijr.09.55
  • [10].Xu G, Yamamoto N, Hayashi K, Takeuchi A, Miwa S, Igarashi K, et al. The accuracy of different FRAX tools in predicting fracture risk in Japan: A comparison study. J Orthop Surg. 2020;28(2):1–6.
  • [11].Kanis JA, Oden A, Johansson H, McCloskey E. Pitfalls in the external validation of FRAX. Osteoporos Int. 2012;23(2):423–31.
  • [12].Su Y, Kwok TCY, Cummings SR, Yip BHK, Cawthon PM. Can Classification and Regression Tree Analysis Help Identify Clinically Meaningful Risk Groups for Hip Fracture Prediction in Older American Men (The MrOS Cohort Study)? JBMR Plus. 2019;3(10):1–6.
  • [13].Kong SH, Ahn D, (Raymond) Kim B, Srinivasan K, Ram S, Kim H, et al. A Novel Fracture Prediction Model Using Machine Learning in a Community‐Based Cohort. JBMR Plus. 2020;4(3):1–9.
  • [14].Ferizi U, Besser H, Hysi P, Jacobs J, Rajapakse CS, Chen C, et al. Artificial Intelligence Applied to Osteoporosis: A Performance Comparison of Machine Learning Algorithms in Predicting Fragility Fractures From MRI Data. J Magn Reson Imaging [Internet]. 2019. Apr 25;49(4):1029–38. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/jmri.26280
  • [15].Kruse C, Eiken P, Vestergaard P. Machine Learning Principles Can Improve Hip Fracture Prediction. Calcif Tissue Int. 2017;100(4):348–60.
  • [16].Badgeley MA, Zech JR, Oakden-Rayner L, Glicksberg BS, Liu M, Gale W, et al. Deep learning predicts hip fracture using confounding patient and healthcare variables. arXiv [Internet]. 2018; (October 2018). Available from: 10.1038/s41746-019-0105-1
  • [17].Leslie WD, Morin SN, Lix LM, Niraula S, McCloskey EV., Johansson H, et al. Performance of FRAX in Women with Breast Cancer Initiating Aromatase Inhibitor Therapy: A Registry-Based Cohort Study. J Bone Miner Res. 2019;34(8):1428–35.
  • [18].Lewiecki EM, Ortendahl JD, Vanderpuye‐Orgle J, Grauer A, Arellano J, Lemay J, et al. Healthcare Policy Changes in Osteoporosis Can Improve Outcomes and Reduce Costs in the United States. JBMR Plus. 2019;3(9):1–7.
  • [19].Johansson H, Azizieh F, al Ali N, Alessa T, Harvey NC, McCloskey E, et al. FRAX- vs. T-score-based intervention thresholds for osteoporosis. Osteoporos Int. 2017;28(11):3099–105.
  • [20].Choi S, Kwon S-R, Jung J-Y, Kim H-A, Kim S-S, Kim S, et al. Prevalence and Fracture Risk of Osteoporosis in Patients with Rheumatoid Arthritis: A Multicenter Comparative Study of the FRAX and WHO Criteria. J Clin Med. 2018;7(12):507.
  • [21].Lesnyak O, Zakroyeva A, Lobanchenko O, Johansson H, Liu E, Lorentzon M, et al. A surrogate FRAX model for the Kyrgyz Republic. Arch Osteoporos. 2020;15(1).
  • [22].Povoroznyuk VV., Grygorieva NV., Kanis JA, McCloskey EV, Johansson H, Harvey NC, et al. Epidemiology of hip fracture and the development of FRAX in Ukraine. Arch Osteoporos. 2017;12(1).
  • [23].Sornay-Rendu E, Munoz F, Delmas PD, Chapurlat RD. The FRAX tool in French women: How well does it describe the real incidence of fracture in the OFELY cohort. J Bone Miner Res. 2010;25(10):2101–7.
  • [24].Naureen G, Johansson H, Iqbal R, Jafri L, Khan AH, Umer M, et al. A surrogate FRAX model for Pakistan. Arch Osteoporos. 2021;16(1).
  • [25].Clark P, Denova-Gutiérrez E, Zerbini C, Sanchez A, Messina O, Jaller JJ, et al. FRAX-based intervention and assessment thresholds in seven Latin American countries. Osteoporos Int. 2018;29(3):707–15.
  • [26].Lesnyak O, Ismailov S, Shakirova M, Alikhanova N, Zakroyeva A, Abboskhujaeva L, et al. Epidemiology of hip fracture and the development of a FRAX model for Uzbekistan. Arch Osteoporos. 2020;15(1).
  • [27].Leslie WD, Lix LM, Johansson H, Oden A, McCloskey E, Kanis JA. Independent clinical validation of a Canadian FRAX tool: Fracture prediction and model calibration. J Bone Miner Res. 2010;25(11):2350–8.
  • [28].Kirilova E, Johansson H, Kirilov N, Vladeva S, Petranova T, Kolarov Z, et al. Epidemiology of hip fractures in Bulgaria: development of a country-specific FRAX model. Arch Osteoporos. 2020;15(1).
  • [29].Kanis JA, Johnell O, Oden A, Dawson A, De Laet C, Jonsson B. Ten year probabilities of osteoporotic fractures according to BMD and diagnostic thresholds. Osteoporos Int. 2001;12(12):989–95.
  • [30].Joakimsen RM, Fønnebø V, Magnus JH, Tollan A, Søgaard AJ. The Tromsø Study: Body Height, Body Mass Index and Fractures. 1998;436–42.
  • [31].Morden NE, Schpero WL, Zaha R, Sequist TD, Colla CH. Overuse of short-interval bone densitometry: assessing rates of low-value care. Osteoporos Int [Internet]. 2014. Sep 9;25(9):2307–11. Available from: http://link.springer.com/10.1007/s00198-014-2725-2
  • [32].Lewiecki EM, Lane NE. Common mistakes in the clinical use of bone mineral density testing. Nat Clin Pract Rheumatol [Internet]. 2008. Dec 21;4(12):667–74. Available from: http://link.springer.com/10.1007/s00198-014-2725-2
