Abstract
Purpose:
Osteoradionecrosis (ORN) of the mandible represents a severe, debilitating complication of radiation therapy (RT) for head and neck cancer (HNC). At present, no normal tissue complication probability (NTCP) models for risk of ORN exist. The aim of this study was to develop a multivariable clinical/dose-based NTCP model for the prediction of ORN any grade (ORNI-IV) and grade IV (ORNIV) after RT (±chemotherapy) in patients with HNC.
Methods and Materials:
Included patients with HNC were treated with (chemo-)RT between 2005 and 2015. Mandible bone radiation dose-volume parameters and clinical variables (ie, age, sex, tumor site, pre-RT dental extractions, chemotherapy history, postoperative RT, and smoking status) were considered as potential predictors. The patient cohort was randomly divided into a training (70%) and independent test (30%) cohort. Bootstrapped forward variable selection was performed in the training cohort to select the predictors for the NTCP models. Final NTCP model(s) were validated on the holdback test subset.
Results:
Of 1259 included patients with HNC, 13.7% (n = 173 patients) developed any grade ORN (ORNI-IV primary endpoint) and 5% (n = 65) ORNIV (secondary endpoint). All dose and volume parameters of the mandible bone were significantly associated with the development of ORN in univariable models. Multivariable analyses identified D30% and pre-RT dental extraction as independent predictors for both ORNI-IV and ORNIV best-performing NTCP models with an area under the curve (AUC) of 0.78 (AUCvalidation = 0.75 [0.69–0.82]) and 0.81 (AUCvalidation = 0.82 [0.74–0.89]), respectively.
Conclusions:
This study presented NTCP models based on mandible bone D30% and pre-RT dental extraction that predict ORNI-IV and ORNIV (ie, needing invasive surgical intervention) after HNC RT. Our results suggest that less than 30% of the mandible should receive a dose of 35 Gy or more for an ORNI-IV risk lower than 5%. These NTCP models can improve ORN prevention and management by identifying patients at risk of ORN.
Introduction
Osteoradionecrosis (ORN) of the mandible is a severe, late toxicity after chemo-radiation for head and neck cancer (HNC) with a reported incidence between 1% and 16%.1–4 Although ORN is less prevalent than other radiation-attributable HNC toxicities, it is often extremely debilitating, is resource-intensive to manage, and substantially reduces quality of life.5 With the rising incidence of human papillomavirus (HPV) associated subtypes of HNC,6 survival rates have improved, as HPV-associated tumors are more sensitive to radiation therapy (RT) than HPV-negative tumors and exhibit improved tumor control.7–9 Moreover, because patients with HPV-positive tumors are typically younger and healthier,10 their longer life expectancy after radiation and the expected chronic compromise to bone healing result in a higher cumulative lifetime risk of ORN development,11 highlighting the importance of dedicated strategies to prevent ORN in modern practice.
ORN is characterized by nonhealing bone and mucosal insult after radiation treatment, and the condition may present with variable severity.2,12 Some cases of ORN may clinically heal spontaneously over time (grade I), while other presentations of ORN may require minor debridement of the injured tissue (grade II), hyperbaric therapy (grade III), or major invasive mandible surgery (grade IV).13 Due to the characteristic presence of devitalized bone and reduced blood supply, successful treatment for ORN may be challenging and unpredictable, thus the optimal management for the condition is prevention. Normal tissue complication probability (NTCP) prediction of ORN based on dose-volume parameters can guide RT mandibular dose constraints in an attempt to prevent the development of ORN in patients with HNC.14,15 NTCP may also be used to guide alternative selections for treatment modalities with less distal beam-path toxicity, such as proton therapy.16 Furthermore, NTCP prediction may be used to identify patients at medium-high risk of ORN to prescribe dedicated follow-up imaging for early detection of ORN and intervention before advanced stages.17
At present, no NTCP model has been developed for ORN, yet previous case-control studies have identified a significant relationship between mandibular dose and the development of ORN.1,12,13,18,19 Many identified the mandible bone volume receiving 50 Gy (V50Gy) as the most important volume parameter (VxGy); the related dose parameters (Dx) were not investigated in these studies.1,12,13,18,19 Moreover, pre-RT dental extractions have also been identified as a risk factor for ORN development.13,20 Some studies observed a significant association between smoking status and ORN12,13,20; however, others did not observe this correlation.1,3 Consequently, in the absence of a formal NTCP model with clinical variables, monolithic nonpatient-specific dose constraints are used in general practice. Without a usable NTCP model, the confounding effect of clinical variables on dose-toxicity may be obscured.
To this end, the aim of this study was to develop a multivariable NTCP model for the prediction of development of any grade of ORN after RT in patients with HNC. The model building considers both dose-volume parameters and clinical risk factors to provide an optimized pretreatment ORN risk assessment. Secondary study analysis aimed at development of an NTCP model for the prediction of advanced (grade IV) ORN.
Methods and Materials
Patients
After institutional review board approval (RCR030800), data were retrospectively collected for patients with proven squamous cell carcinoma HNC who received RT alone, in combination with surgery, or with chemotherapy with curative intent between 2005 and 2015 at a single institution, MD Anderson Cancer Center. These patients were part of a larger “big data RT HNC” collection effort that is currently being constructed. Patients with previously documented head and neck irradiation, a history of salivary gland cancer, or a survival or follow-up time of less than 1 year were excluded from the study. Generally, the prescribed dose to the primary tumor was 68 to 72 Gy for definitive treatment (typically, 2.12 Gy in 33 fractions 5 times per week), 60 to 66 Gy for postoperative indications (typically, 2 Gy in 30–33 fractions 5 times per week), and 57 Gy to the elective lymph node levels (1.72 Gy in 33 fractions). Generally, a radiation source of 6 MV, a traditional beam, and a nominal dose rate of 600 monitor units/min were used. In the study period, for primary tumor and upper neck nodal disease, the vast majority of patients received a split-field technique matching a lower anterior neck field with a larynx midline block. Alternatively, “whole-field” intensity modulated radiation therapy (IMRT) was deployed when tumors were located more inferiorly to avoid underdosing.
Data extraction and processing
Planned dose distribution and corresponding planning computed tomography (CT) were extracted from various planning systems (Pinnacle, Philips Radiation Oncology Systems; Eclipse, Varian Medical Systems; Raystation, RaySearch Laboratories) to standardized DICOM-RT format. The mandibular bone was subsequently auto-segmented with a previously validated multiatlas-based auto-segmentation using commercial software ADMIRE (research version 1.1; Elekta AB, Stockholm, Sweden).21 Dose-volume histogram (DVH) parameters were extracted with bulk extraction using an in-house developed software script in MATLAB (version R2014a).
NTCP endpoints
The primary NTCP endpoint of this study was binary ORN (ORNI-IV) development at any time point after treatment in patients with a minimum of 12 months of post-RT follow-up. The secondary NTCP endpoint was the development of grade IV ORN (ORNIV) at any time point after treatment. The ORN grades are defined as follows13: grade I, minimal bone exposure requiring conservative management; grade II, bone exposure requiring and receiving minor debridement; grade III, hyperbaric oxygen needed; grade IV, major invasive surgery required. ORN cases and grades were identified through querying radiology HNC RT CT scan reports from the radiology information systems, together with a thorough manual inspection of the electronic health record for ORN diagnosis.
Candidate predictors
Candidate DVH parameters of the mandibular bone were mean, minimum, and maximum dose; D2%; from D5% – in increments of 5% – to D95%; D97%; D98%; D99%; from V5Gy – in increments of 5 Gy – to V70Gy. The following clinical variables were considered: age; sex (female vs male); tumor subsite (oral cavity vs oropharynx vs hypopharynx/unknown-primary/larynx/nasopharynx: as discrete ordinal 1, 2, 3); smoking status (current vs former/never); smoking pack-years (continuous); postoperative RT (PORT) (definitive vs PORT); dental extraction (no/edentulous vs dental extractions); and chemotherapy (no vs chemotherapy). Only dental extractions within 6 weeks before treatment were considered; preradiation dental extractions are typically performed 4 to 6 weeks before RT at our institution.
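To make the candidate DVH parameters concrete, the following is a minimal Python sketch of how Dx% and VxGy can be read from a list of per-voxel mandible doses. It is not the authors' MATLAB script: the function names are hypothetical, all voxels are assumed to have equal volume, and no interpolation between voxels is performed, whereas clinical DVH tools typically interpolate.

```python
import math

def d_x_percent(doses_gy, x):
    """Dx%: minimum dose (Gy) received by the hottest x% of the structure.

    Assumes equal-volume voxels and no interpolation (illustrative only).
    """
    ordered = sorted(doses_gy, reverse=True)           # hottest voxels first
    k = max(1, math.ceil(len(ordered) * x / 100))      # voxels in the hottest x%
    return ordered[k - 1]

def v_x_gy(doses_gy, x):
    """VxGy: percentage of the structure volume receiving at least x Gy."""
    return 100 * sum(d >= x for d in doses_gy) / len(doses_gy)

def mean_dose(doses_gy):
    """Mean structure dose (Gy)."""
    return sum(doses_gy) / len(doses_gy)

# Hypothetical toy dose grid (Gy), not patient data.
toy_doses = [10, 20, 30, 40, 50, 60, 70, 60, 50, 40]
toy_d30 = d_x_percent(toy_doses, 30)
toy_v50 = v_x_gy(toy_doses, 50)
```

Under these assumptions, D30% and V50Gy are two views of the same cumulative DVH curve: one fixes the volume and reads off the dose, the other fixes the dose and reads off the volume.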
Statistical modeling
The complete retrospectively collected data set was randomly divided into a training set and an independent test set with a 70:30 ratio. Univariable logistic regression analysis was performed on the training set to identify statistically significant DVH and clinical variables (P < .05). Multivariable NTCP model development was performed with all candidate variables using step-wise forward selection, with ranking based on Akaike information criterion (AIC) score, while testing each variable selection “step” for significance at P < .01 with a likelihood-ratio test for nested model comparison. The internal validity of the variable selection was estimated by repeating the variable selection 5000 times with a bootstrap procedure (ie, sampling with replacement), as suggested by the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement.22 Internal model robustness of variable selection was confirmed if variables were consistently selected in the bootstrapped samples. These analyses were performed for the primary (ORNI-IV) and secondary (ORNIV) endpoints separately on the training cohort. Final models were independently validated (ie, without changing variables or coefficients) using the embargoed test subset. Model performance was evaluated with the area under the receiver operating characteristic curve (area under the curve [AUC]), Nagelkerke’s R2, and the discrimination slope. In addition, nested model improvement was determined with the AIC score difference (Δ), which was considered “significant” when ΔAIC > 2 and “strong” when ΔAIC > 5. The R package Regression Modeling Strategies (version 4.3–1)23 was used for these analyses.
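The bootstrap check of variable selection described above can be sketched as follows. This is an illustrative stand-in, not the study's procedure: instead of AIC-ranked forward selection with logistic regression, the hypothetical `first_selected` helper simply picks the candidate with the highest univariable AUC in each bootstrap sample. All names and the toy data are assumptions.

```python
import random
from collections import Counter

def auc(y, x):
    """Probability that a random event case scores higher than a non-event case."""
    pos = [xi for yi, xi in zip(y, x) if yi == 1]
    neg = [xi for yi, xi in zip(y, x) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def first_selected(rows, outcome, candidates):
    """Stand-in selector: candidate with the highest univariable AUC.

    The study used AIC-ranked forward selection; this is a simplification.
    """
    y = [r[outcome] for r in rows]
    return max(candidates, key=lambda c: auc(y, [r[c] for r in rows]))

def bootstrap_selection_frequency(rows, outcome, candidates, n_boot=200, seed=0):
    """Fraction of bootstrap samples in which each candidate is selected first."""
    rng = random.Random(seed)
    counts = Counter()
    done = 0
    while done < n_boot:
        sample = [rng.choice(rows) for _ in rows]       # sampling with replacement
        if len({r[outcome] for r in sample}) < 2:       # need events and non-events
            continue
        counts[first_selected(sample, outcome, candidates)] += 1
        done += 1
    return {c: counts[c] / n_boot for c in candidates}

# Hypothetical toy data: "signal" separates the classes, "noise" does not.
toy_rng = random.Random(1)
toy_rows = [{"orn": i % 2,
             "signal": 10 * (i % 2) + toy_rng.random(),
             "noise": toy_rng.random()} for i in range(30)]
toy_freq = bootstrap_selection_frequency(toy_rows, "orn", ["signal", "noise"], n_boot=50)
```

The returned frequencies play the role of the selection percentages reported for the bootstrapped samples; a variable chosen in a large share of resamples is less likely to be an artifact of one particular training split.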
Results
Patients
Of the total 1789 patients with HNC, 1259 patients were included in this study after screening (inclusion diagram in Supplementary Materials). They were randomly split into a training set of 882 patients (70%) and a validation set of 377 patients (30%). Median follow-up time for all patients was 57 months (range, 12–174). Patient characteristics and demographics are detailed in Table 1. Briefly, the vast majority were patients with oropharyngeal cancer (OPC) (66%), followed by patients with oral cavity cancer (15%) and patients with laryngeal cancer (13%). The majority were male (83%) and treated with IMRT (71%). Demographics were not significantly different between the training and validation sets, with the exception of sex (P = .02). From the total cohort, 13.7% (n = 173 patients) developed any grade ORN (primary endpoint) and 5% (n = 65) ORNIV (secondary endpoint). Median time to development of ORN was 17 months (range, 2–142) post-RT. The distribution of ORN grades among the 173 patients with ORN was as follows: grade I (12.7%), grade II (20.8%), grade III (28.9%), and grade IV (37.6%).
Table 1.
n (%) | Total 1259 (100) | | Training set 882 (70) | | Validation set 377 (30) | | P value |
---|---|---|---|---|---|---|---|
ORN grade (%) | .681 | ||||||
Any | 173 | (14) | 124 | (14) | 49 | (13) | |
G1 | 22 | (2) | 14 | (2) | 8 | (2) | .44 |
G2 | 36 | (3) | 30 | (3) | 6 | (2) | |
G3 | 50 | (4) | 36 | (4) | 14 | (4) | |
G4 | 65 | (5) | 44 | (5) | 21 | (6) | |
Mean mandible dose (SD) | 37.74 | (12.51) | 37.45 | (12.86) | 38.41 | (11.67) | .217 |
Sex (%) | .021 | ||||||
Female | 215 | (17) | 136 | (15) | 79 | (21) | |
Male | 1044 | (83) | 746 | (85) | 298 | (79) | |
Age (SD) | 60.72 | (10.07) | 60.82 | (9.89) | 60.68 | (10.15) | .824 |
Tumor site (%) | .375 | ||||||
Oral cavity | 190 | (15) | 136 | (15) | 54 | (14) | |
Oropharynx | 826 | (66) | 577 | (65) | 249 | (66) | |
Larynx | 159 | (13) | 117 | (13) | 42 | (11) | |
Hypopharynx | 24 | (2) | 17 | (2) | 7 | (2) | |
Nasopharynx | 22 | (2) | 12 | (1) | 10 | (3) | |
Unknown primary | 38 | (3) | 23 | (3) | 15 | (4) | |
T stage (%) | .748 | ||||||
Tx | 10 | (1) | 7 | (1) | 3 | (1) | |
T0 | 38 | (3) | 23 | (3) | 15 | (4) | |
T1 | 268 | (21) | 192 | (22) | 76 | (20) | |
T2 | 416 | (33) | 295 | (33) | 121 | (32) | |
T3 | 272 | (22) | 185 | (21) | 87 | (23) | |
T4 | 255 | (20) | 180 | (20) | 75 | (20) | |
N stage (%) | .496 | ||||||
N0 | 248 | (20) | 179 | (20) | 69 | (18) | |
N1 | 146 | (12) | 95 | (11) | 51 | (14) | |
N2 | 834 | (66) | 587 | (67) | 247 | (66) | |
N3 | 31 | (2) | 21 | (2) | 10 | (3) | |
p16 HPV positive (%) | .921 | ||||||
Positive | 397 | (32) | 281 | (32) | 116 | (31) | |
Negative | 71 | (6) | 50 | (6) | 21 | (6) | |
Unknown | 791 | (63) | 551 | (62) | 240 | (64) | |
Technique (%) | .053 | ||||||
3D-CRT | 123 | (10) | 88 | (10) | 35 | (9) | |
IMRT | 891 | (71) | 608 | (69) | 283 | (75) | |
VMAT | 224 | (18) | 173 | (20) | 51 | (14) | |
IMPT | 21 | (2) | 13 | (1) | 8 | (2) | |
Chemotherapy (%) | .965 | ||||||
No chemotherapy | 233 | (19) | 164 | (19) | 69 | (18) | |
Concurrent | 624 | (50) | 436 | (49) | 188 | (50) | |
Induction + concurrent | 285 | (23) | 197 | (22) | 88 | (23) | |
Induction | 97 | (8) | 70 | (8) | 27 | (7) | |
missing | 20 | (2) | 15 | (2) | 5 | (1) | |
Surgery (%) | 1.000 | ||||||
Definitive | 1043 | (83) | 731 | (83) | 312 | (83) | |
Postoperative | 216 | (17) | 151 | (17) | 65 | (17) | |
Dental status (%) | .212 | ||||||
No extraction | 707 | (56) | 506 | (57) | 201 | (53) | |
Edentulous | 210 | (17) | 137 | (16) | 73 | (19) | |
Dental extraction | 342 | (27) | 239 | (27) | 103 | (27) | |
Smoking status (%) | .49 | ||||||
Current | 180 | (14) | 121 | (14) | 59 | (16) | |
Former | 607 | (48) | 434 | (49) | 173 | (46) | |
Never | 472 | (37) | 327 | (37) | 145 | (38) | |
Pack years (SD) | 20.39 | (28.62) | 21.84 | (28.89) | 19.77 | (28.50) | .238 |
Abbreviations: 3D-CRT = three-dimensional conformal radiotherapy; HPV = human papilloma virus; IMPT = intensity modulated proton therapy; IMRT = intensity modulated radiation therapy; ORN = osteoradionecrosis; SD = standard deviation; VMAT = volumetric-modulated arc therapy.
P values compare the training and validation sets. The chi-squared test was used for categorical variables and the t test for continuous variables.
Univariable analyses
All DVH parameters were significantly associated with the development of ORN (any grade) in univariable analyses. The parameters ranging between D2% and D98% and between V15Gy and V70Gy were highly significant (P < .0001) (Supplementary Materials). D30% and V50Gy showed the best classification performance, each with an AUC of 0.76 (95% confidence interval, 0.72–0.80). Notably, DVH parameters ranging from D15% to D55%, V40Gy to V60Gy, and mean mandible dose performed similarly (AUC, 0.74–0.76). Figure 1 depicts the dose (Dx%) and volume (VxGy) distributions of patients who did and did not develop ORN and the DVH parameter significance level. For example, Figure 1 shows that patients who did not develop ORN received an average D30% of 46 ± 16 Gy, whereas this was 57 ± 9 Gy for those who did develop ORN. Additionally, dental extraction (odds ratio [OR], 1.67 [1.35–2.06]; P < .0001), PORT (OR, 1.68 [1.07–2.65]; P = .0253), and chemotherapy (OR, 1.85 [1.06–3.21]; P = .0293) were significantly associated with the development of ORN (Table 2). Specifically for tumor site, compared with oral cavity (ie, as reference), ORs were 0.58 (0.37–0.92) (P = .021) for OPC and 0.08 (0.03–0.23) (P < .0001) for others. Neither smoking status nor pack-years showed a significant relationship with ORN development.
Table 2.
Endpoint | Any grade ORN | | | | | Grade IV ORN | | | | |
---|---|---|---|---|---|---|---|---|---|---|
Variable | β | OR (95% CI) | AIC | AUC | P value | β | OR (95% CI) | AIC | AUC | P value |
D30%* | 0.09 | 1.10 (1.07–1.12) | 627 | 0.76 | <.0001 | 0.11 | 1.12 (1.08–1.16) | 306 | 0.80 | <.0001 |
V50Gy* | 0.04 | 1.04 (1.03–1.05) | 643 | 0.76 | <.0001 | 0.05 | 1.05 (1.03–1.06) | 307 | 0.82 | <.0001 |
Tumor site | −0.92 | 0.40 (0.28–0.56) | 691 | 0.63 | <.0001 | −1.34 | 0.26 (0.15–0.45) | 329 | 0.68 | <.0001 |
Dental extraction | 0.51 | 1.67 (1.35–2.06) | 698 | 0.62 | <.0001 | 0.51 | 1.67 (1.19–2.33) | 345 | 0.62 | .003 |
PORT | 0.52 | 1.68 (1.07–2.65) | 716 | 0.54 | .025 | 1.30 | 3.67 (1.96–6.88) | 339 | 0.63 | <.0001 |
Chemotherapy | 0.61 | 1.85 (1.06–3.21) | 715 | 0.54 | .029 | 0.14 | 1.15 (0.53–2.53) | 353 | 0.51 | .721 |
Smoking status | 0.08 | 1.09 (0.62–1.91) | 720 | 0.50 | .776 | 1.25 | 3.48 (0.83–14.55) | 349 | 0.55 | .088 |
Sex | 0.60 | 1.83 (0.98–3.41) | 716 | 0.53 | .059 | −0.04 | 0.96 (0.42–2.20) | 354 | 0.50 | .926 |
Age | −0.01 | 0.99 (0.97–1.01) | 720 | 0.52 | .449 | −0.01 | 0.99 (0.96–1.02) | 353 | 0.53 | .369 |
Pack years | 0.00 | 1.00 (0.99–1.01) | 720 | 0.50 | .699 | −0.01 | 0.99 (0.98–1.01) | 352 | 0.52 | .251 |
Abbreviations: AIC = Akaike information criterion; AUC = area under the receiver operating characteristic curve; CI = confidence interval; OR = odds ratio; ORN = osteoradionecrosis; PORT = postoperative radiation therapy.
*Best-performing mandible bone dose-volume histogram (DVH) variables are shown; note that all candidate DVH parameters were significant (refer to Supplementary Materials).
β: model coefficient.
Multivariable NTCP model development and validation
AIC-ranked step-wise forward selection in the training set identified D30% first (P < .0001), followed by pre-RT dental extraction (likelihood-ratio test; P = .005) with a “significant” ΔAIC of 5.96. Bootstrapped forward variable selection in the training cohort also showed that D30% was the most frequently selected first variable (50% of the bootstrapped samples; note, D25% in 23%), and the clinical variable dental extraction was the most frequently selected second variable (47%; Supplementary Materials). The positive regression coefficients reveal that higher D30% (OR, 1.10 [1.07–1.12]) and dental extraction (OR, 1.93 [1.28–2.92]) are associated with higher risk of developing ORN (Table 3). The model performance was good, with an AUC of 0.78 (0.74–0.82) and R2 of 0.20 (Table 4). Performance of the NTCP model with D30% and dental extraction in the independent test set (n = 377) was also good (AUCvalidation = 0.75 [0.69–0.82]; R2 = 0.17). The calibration plot (Supplementary Materials) showed that the predicted NTCP values underestimated the observed ORNI-IV rate in the validation cohort.
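The relationship between the reported ΔAIC and the likelihood-ratio P value can be checked with a short calculation. Because AIC = 2k − 2·logL, adding one parameter to a nested model gives a likelihood-ratio statistic of ΔAIC + 2, which is referred to a chi-squared distribution with 1 degree of freedom. The sketch below assumes that the reported ΔAIC and P value come from the same nested comparison with exactly one added parameter; the function names are hypothetical.

```python
import math

def lr_test_one_added_variable(delta_aic):
    """Likelihood-ratio statistic and P value implied by a given ΔAIC.

    Assumes the larger model adds exactly one parameter, so that
    LR = ΔAIC + 2 and the reference distribution is chi-squared with 1 df.
    """
    lr_stat = delta_aic + 2.0
    # Survival function of chi-squared(1 df): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr_stat / 2.0))
    return lr_stat, p_value

def interpret_delta_aic(delta_aic):
    """Apply the thresholds stated in the Statistical modeling section."""
    if delta_aic > 5:
        return "strong"
    if delta_aic > 2:
        return "significant"
    return "not significant"

lr_stat, p_value = lr_test_one_added_variable(5.96)
label = interpret_delta_aic(5.96)
```

Under these assumptions, a ΔAIC of 5.96 implies a P value just below .005, in line with the reported likelihood-ratio result, and exceeds both the “significant” (> 2) and “strong” (> 5) thresholds.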
Table 3.
Endpoint | Grade I-IV ORN | | | Grade IV ORN | | | Grade IV ORN | | |
---|---|---|---|---|---|---|---|---|---|
Variables | β | OR | P value | β | OR | P value | β | OR | P value |
Intercept | −6.85 | | | −9.16 | | | −12.27 | | |
D30 | 0.09 | 1.10 (1.07–1.12) | <.0001 | 0.11 | 1.12 (1.07–1.16) | <.0001 | 0.12 | 1.13 (1.08–1.17) | <.0001 |
Dental extractions | 0.66 | 1.93 (1.28–2.92) | .002 | 0.62 | 1.85 (0.98–3.49) | .057 | | | |
Smoking status | | | | | | | 1.51 | 4.54 (1.05–19.68) | .043 |
Abbreviations: NTCP = normal tissue complication probability; OR = odds ratio; ORN = osteoradionecrosis.
β: model coefficient.
Table 4.
Endpoint | Any grade ORN | Grade IV ORN | Grade IV ORN |
---|---|---|---|
Model variables | D30 + dental extractions | D30 + dental extractions | D30 + smoking status |
Training (n = 882) | | | |
AIC | 619.2 | 304.9 | 302.3 |
AUCtraining | 0.78 (0.74–0.82) | 0.81 (0.76–0.86) | 0.81 (0.76–0.86) |
Nagelkerke R2training | 0.20 | 0.17 | 0.18 |
Discrimination slope | 0.12 | 0.06 | 0.06 |
HL test χ2 (P value) | 8.44 (.39) | 10.82 (.21) | 11.7 (.17) |
Validation (n = 377) | | | |
AUCvalidation | 0.75 (0.69–0.82) | 0.82 (0.74–0.89) | 0.75 (0.64–0.86) |
Nagelkerke R2validation | 0.17 | 0.20 | 0.14 |
Abbreviations: AIC = Akaike information criterion; AUC = area under the receiver operating characteristic curve; HL = Hosmer-Lemeshow; NTCP = normal tissue complication probability; ORN = osteoradionecrosis.
n: number of patients.
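For readers unfamiliar with the performance measures in Table 4, the sketch below shows one common way to compute the AUC, the discrimination slope, and Nagelkerke's R2 from observed outcomes and predicted probabilities. It is a plain-Python illustration with hypothetical function names and toy numbers, not the rms implementation used in the study.

```python
import math

def auc(y, p):
    """AUC as the probability that an event case gets a higher prediction
    than a non-event case (ties count one half)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def discrimination_slope(y, p):
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi) for yi, pi in zip(y, p))

def nagelkerke_r2(y, p):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    prevalence = sum(y) / n
    ll_null = log_likelihood(y, [prevalence] * n)     # intercept-only model
    ll_model = log_likelihood(y, p)
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    return cox_snell / (1 - math.exp(2 * ll_null / n))

# Hypothetical toy outcomes and predictions, not study data.
toy_y = [0, 0, 1, 1]
toy_p = [0.1, 0.4, 0.35, 0.8]
toy_metrics = (auc(toy_y, toy_p), discrimination_slope(toy_y, toy_p), nagelkerke_r2(toy_y, toy_p))
```

The AUC and discrimination slope describe how well the model separates patients with and without ORN, while Nagelkerke's R2 summarizes the gain in likelihood over a model that predicts the cohort prevalence for everyone.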
For the secondary NTCP endpoint ORNIV (ie, needing major surgical intervention), forward selection first selected the dose variable D30%. Disregarding dose variables (V70/65Gy) that flipped to a negative coefficient in multivariable analyses (ie, suggesting over/incorrect-fitting), smoking status was the next most-associated variable but did not meet our prespecified significance level (likelihood-ratio test; P = .013), nor did dental extraction (P = .06). Bootstrapped variable selection selected V55Gy (32%) over D25% (20%), D30% (18%), and D40% (14%) (Supplementary Materials), together with the clinical variables smoking status (29%) and/or dental extraction (21%) in multiple “runs.” In training, ORNIV model performance was nearly identical for NTCP models with D30% or V55Gy alone or combined with smoking status or dental extraction (AUC range, 0.80–0.82; R2 range, 0.16–0.18). However, validation in the independent test set showed that the model with D30% and dental extraction (AUCvalidation = 0.82 [0.74–0.89]; R2validation = 0.21) performed significantly better than the model with V55Gy and smoking status (AUCvalidation = 0.71 [0.60–0.83]; Z-test P = .02) (refer to Supplementary Materials for alternative models). Performance improved with a model combining D30% and dental extraction compared with the model with D30% alone; even though this improvement was limited, for consistency with ORNI-IV, we selected the same 2 variables for the final ORNIV NTCP model (note: coefficients deviate). Moreover, the calibration plot (Supplementary Materials) of the model also showed an underestimation of the NTCP values compared with the observed ORNIV rates, yet this was less pronounced than for ORNI-IV.
Final NTCP models that were developed in the training cohort (model coefficients in Table 3) and validated in the unseen/embargoed test cohort are plotted in Figure 2. Binned actual observed ORN proportions, represented by points with error bars, correspond with the NTCP models. The horizontal gray lines in Figure 2 indicate the 5% ORN threshold risks.
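As an illustration of how the final models translate dose and dental status into risk, the sketch below evaluates a standard logistic NTCP function with the rounded coefficients from Table 3. It assumes that the first coefficient set in Table 3 (intercept −6.85) belongs to the ORNI-IV model and the second (intercept −9.16) to the ORNIV model with dental extraction, and that dental extraction is coded 1 for pre-RT extraction and 0 otherwise. The names are hypothetical, and because the published coefficients are rounded to 2 decimals, the outputs are approximate.

```python
import math

# Rounded coefficients as read from Table 3 (assumed column assignment).
COEFFICIENTS = {
    "ORN_I_IV": {"intercept": -6.85, "d30": 0.09, "extraction": 0.66},
    "ORN_IV": {"intercept": -9.16, "d30": 0.11, "extraction": 0.62},
}

def ntcp(d30_gy, dental_extraction, endpoint="ORN_I_IV"):
    """Logistic NTCP: 1 / (1 + exp(-(b0 + b1 * D30% + b2 * extraction)))."""
    c = COEFFICIENTS[endpoint]
    linear_predictor = (c["intercept"]
                        + c["d30"] * d30_gy
                        + c["extraction"] * (1 if dental_extraction else 0))
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Example: risk for a mandible D30% of 50 Gy with and without pre-RT extraction.
risk_without_extraction = ntcp(50.0, dental_extraction=False)
risk_with_extraction = ntcp(50.0, dental_extraction=True)
```

With these rounded values, the any-grade risk at a D30% of 42 Gy without extraction and at 35 Gy with extraction both fall slightly below 5%, which is consistent with the 5% threshold lines described for Figure 2.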
Subcohort analyses
The final NTCP model (Table 3) performed similarly for patients with OPC only (n = 826; AUCOPC-cohort = 0.76 [0.71–0.80]), for patients with larynx/hypopharynx/nasopharynx/unknown-primary cancer (ie, others; n = 243; AUCOther-cohort = 0.76 [0.53–0.98]), and for the combined cohorts (ie, OPC + other patients; n = 1069; AUCOPC + other-cohort = 0.79 [0.75–0.83]). In contrast, performance in the patients with oral cavity cancer was poor (n = 190; AUCOral cavity = 0.59 [0.50–0.68]). A similar trend was seen for ORNIV (AUCOPC-cohort = 0.80 [0.74–0.86], AUCOPC + other-cohort = 0.84 [0.79–0.89], AUCOral cavity = 0.57 [0.46–0.68]), except that no ORNIV was present in the “others” cohort. Refer to Supplementary Materials for subanalysis results per tumor site and for definitive and PORT patients.
Discussion
Although ORN rates are relatively low (~5%−15%), the consequences for patients experiencing ORN are highly disabling with a substantial effect on the health care utilization and quality of life.5 Once ORN develops, treatment is complicated by the lack of regenerative bone and tissue cells needed for healing and repair. Advanced stage ORN requires extensive surgery associated with significant perioperative morbidity.24 Given the potential severity of ORN and limitations in treatment once ORN has developed, improved pretreatment risk assessment tools aimed at identifying high-risk patients and guiding strategies for prevention and early intervention of ORN represent an important unmet need.
Due to the low relative prevalence of ORN among HNC survivors, a large data set is needed to design a robust NTCP model, which is particularly challenging in this case as large-scale radiation dose plans with matching late toxicity scores for each patient are rarely readily available. Previous studies assessing radiation dose to the mandible and development of ORN have, at best, 200 to 600 patients.1,12,13,18,19 In response to the unmet need for prediction models for the development of ORN and ORN severity validated across data from a sufficient patient cohort, this study developed NTCP models for the prediction of ORN of any grade and grade IV in a large cohort of 1259 patients with HNC treated with definitive or postoperative (chemo-) RT.
The association between ORN development and mandible radiation dose was clearly observed with the univariable significance of all DVH parameters (Fig. 1). The final NTCP models were based on D30% and pretreatment dental extraction. This NTCP model had good performance in both the training and validation cohort for ORNI-IV (AUCtrain/validation = 0.78/0.75) and ORNIV (AUCtrain/validation = 0.81/0.82). These models are clinically useful tools to determine appropriate dose constraints for the mandibular bone when feasible (ie, when tumor coverage is not compromised).16 Additionally, the models identify patients at high risk for ORN development who may benefit from more intensive clinical surveillance programs with dedicated imaging follow-up25 and/or earlier intervention, whether conservative or surgical, to prevent ORN progression.
Our results demonstrate that mandible dose constraints can be distilled from these NTCP models to optimize patients’ IMRT plans. For example, our models suggest that mandibular D30% of patients without pretreatment dental extraction should be kept below 42 Gy to achieve <5% risk of ORN development, while a D30%<35 Gy is required for patients with dental extractions to achieve the same level of risk (Fig. 2). Alternatively, for a more conservative risk threshold of 1%, D30% should be <25 Gy (without dental extractions) and <17 Gy (with dental extractions). With respect to ORNIV only, maintaining D30%<56 Gy without pre-RT dental extractions or D30%<50 Gy with pre-RT dental extractions may be sufficient to achieve <5% risk of ORNIV development.
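The dose constraints above follow from inverting the logistic model: for a target risk p, the limiting dose is D30% = (logit(p) − β0 − βextraction · x) / βD30. The sketch below performs this inversion with the rounded Table 3 coefficients, under the assumptions that the first coefficient set belongs to the ORNI-IV model, the second to the ORNIV model with dental extraction, and that dental extraction is coded 0/1. The function name is hypothetical.

```python
import math

# Rounded coefficients as read from Table 3 (assumed column assignment).
COEFFICIENTS = {
    "ORN_I_IV": {"intercept": -6.85, "d30": 0.09, "extraction": 0.66},
    "ORN_IV": {"intercept": -9.16, "d30": 0.11, "extraction": 0.62},
}

def d30_limit(target_risk, dental_extraction, endpoint="ORN_I_IV"):
    """D30% (Gy) at which the logistic NTCP model predicts target_risk."""
    c = COEFFICIENTS[endpoint]
    logit = math.log(target_risk / (1.0 - target_risk))
    offset = c["intercept"] + c["extraction"] * (1 if dental_extraction else 0)
    return (logit - offset) / c["d30"]

# D30% limits for a 5% any-grade ORN risk, without and with pre-RT extraction.
limit_without_extraction = d30_limit(0.05, dental_extraction=False)
limit_with_extraction = d30_limit(0.05, dental_extraction=True)
```

Because the published coefficients are rounded to 2 decimals, this calculation gives limits of roughly 43 Gy and 36 Gy for a 5% ORNI-IV risk, about 1 Gy above the reported 42 Gy and 35 Gy; the reported values, presumably derived from the unrounded model, should take precedence.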
Our findings of a significant association between ORN and several DVH parameters, as well as with pretreatment dental status, match the results of several recent publications.1,12,13,18,19,26 For instance, a recent publication from a Danish group showed that several DVH parameters in the intermediate- and high-dose range including Dmean were associated with ORN in a cohort of patients with HNC with 56 ORN cases and 112 controls.26 Another study from the Princess Margaret Cancer Centre reported that V50 and V60 were significantly higher in 71 patients with ORN compared with 142 patients with no ORN.12 In addition, another group previously reported that maximum radiation dose to the mandible as a single dose constraint was a poor correlate of ORN in patients with OPC, and mandibular volumes receiving 44 Gy (V44Gy) and 58 Gy (V58Gy) were comparatively more discriminatory of patients with ORN versus non-ORN patients.1 However, these studies were case-control studies, were based on a limited number of patients, and did not design a multivariable NTCP model.
Our multivariable NTCP models showed that a combination of mandibular dosimetric parameters (D30%) with the pre-RT dental extraction status achieved the best performing model for ORN risk prediction. A study by the Memorial Sloan Kettering Cancer Center group showed that, in addition to mandibular radiation doses, the presence of mild-severe periodontal bone loss was associated with increased ORN risk. However, in this study, pre-RT radiographs were only available for 18 patients with ORN who were matched with 36 controls.3 In concurrence with our study results, several recent studies have demonstrated that pre-RT dental extractions are a significant risk factor for ORN development.12,20,27–29
Whether pre-RT dental extraction is a direct incipient insult preceding ORN development or merely a surrogate for poor dentition remains unclear. Although all patients receive pretherapy dental oncology assessment, we do not routinely deploy asymptomatic dental surveillance posttherapy, referring these cases to their community dentists. Consequently, our data set lacked significant prospective postradiation dental assessment variables, surveillance of radiation caries, and posttherapy dental extractions that may have been completed outside our facility. There therefore remains a significant need to undertake prospective assessment of orodental health with developed instruments (eg, formal sialometry, radiation caries monitoring with DMFS160 [a grading system for postradiation caries],30 and patient-reported outcomes) to determine whether the observed association of ORN with pretherapy dental extractions can be related to 1 or more mechanisms. In particular, we plan to expand the current research to investigate the relationship between the location of the pretherapy dental extraction and posttherapy ORN with dental reports and pre-/posttherapy CT and magnetic resonance images.
In contrast to previous studies,12,13,20 smoking status was not found to be significantly associated with any-grade ORN (ORNI-IV) in the current study, but smoking status was frequently identified on variable selection for the higher ORN grade (ie, grade IV). Notably, our validation showed reduced performance of models with smoking status included compared with that in the training cohort (in contrast to the model with dental extraction). Other groups have shown similar ambiguity as to the role of tobacco in development of ORN, with other publications also showing no association between smoking status and ORN.1,3 These contradictory findings may be due to intercohort variables inherent in different studies’ populations. A second possible explanation is that continued smoking during and after treatment may have more influence on the development of ORN than smoking history in patients who stop before treatment; our data set had few active smokers. More research is needed to investigate the discordance between our findings and other group reports.12,13,20
Subanalyses showed that the NTCP model performance was poor when tested in the patients with oral cavity cancer only (AUCOral-cavity = 0.59), especially relative to the performance in the patients with nonoral cavity cancer (AUCOPC + Other-cohort = 0.79). For the patients with oral cavity cancer, both the mandible dose (Dmean = 46.5 ± 6.5 Gy) and ORN prevalence (23%) were higher compared with the rest of the cohort (36.2 ± 12.7 Gy; prevalence, 12%). Although a relatively small sample size of patients with oral cavity cancer (n = 190) could explain the limited significance of the dose variables (ie, only D30%, D35%, and D40%), the poor performance of the NTCP models suggests that there is an effect in these patients not captured in the present data set. One consideration is that patients with oral cavity cancer typically received PORT (89%), whereas patients with other tumor sites were generally treated with primary RT (94%). Across tumor locations, NTCP model performance was better in patients treated with definitive RT (AUC = 0.78) than in those in the PORT group (AUC = 0.65) (Supplementary Materials). Although PORT was significant in univariable analysis, it did not perform well in the multivariable analyses. Further research with specific focus on the role of pre-RT surgical intervention and/or other oral cavity-specific factors is needed to better explain ORN development in patients with oral cavity cancer. Additionally, although patients with oral cavity cancer generally receive radiation to greater volumes of the mandible, the dose gradients across the mandible were more homogeneous. The current NTCP approach treats DVH dose-volume “bins” as discrete independent constructs, which may obfuscate discriminatory signal in organs with more homogeneous cohort dose distributions and suggests that further investigation with alternative normal tissue injury approaches is warranted.
Previous studies have proposed and approach to investigate a spatial dose-toxicity association by warping the dose distribution with deformable regis- tration techniques of the patients to a reference CT scan.31,32 This may allow for voxel-based identification of ORN significantly associated mandible areas, which are projected on the reference patient.
Although the validation NTCP model performance measures were good, the calibration plots in the validation cohort suggested that the predicted NTCP values were underestimated, that is, the model coefficients should have been larger. Additional external validation is needed to improve the estimation of the model coefficients according to the closed-testing procedure.33
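One of the update options that a closed-testing procedure can consider is logistic recalibration, in which a new intercept and slope are fitted to the logit of the original predictions. The sketch below illustrates that single step only, assuming a logistic NTCP model; it is not the full closed-testing procedure and not the authors' code. The function names and all data values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def recalibrate(pred: Sequence[float], outcome: Sequence[int],
                iters: int = 50) -> Tuple[float, float]:
    """Fit outcome ~ a + b * logit(pred) by Newton-Raphson.
    a is the calibration intercept, b the calibration slope."""
    lp = [logit(p) for p in pred]
    a, b = 0.0, 1.0  # start from "no update needed"
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(lp, outcome):
            mu = expit(a + b * x)
            w = mu * (1.0 - mu)
            g0 += y - mu            # score for intercept
            g1 += (y - mu) * x      # score for slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Made-up validation data in which observed risk exceeds predicted risk.
pred = [0.05] * 4 + [0.10] * 4 + [0.20] * 4 + [0.40] * 4
obs = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

a, b = recalibrate(pred, obs)
updated: List[float] = [expit(a + b * logit(p)) for p in pred]
print(f"intercept={a:.3f}, slope={b:.3f}")
```

A positive fitted intercept indicates that the original predictions were too low on average; after the update, the mean predicted risk matches the observed event rate in the data used for recalibration.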
This study is based on an extensive retrospective cohort of patients with HNC treated between 2005 and 2015 at MD Anderson Cancer Center, but the sample represents only a fraction (an estimated 25%) of all patients treated in the study time frame. Nevertheless, we believe that the included patient cohort is a fair representation of our institutional HNC population. In addition, other variables that are not included in this study may be related to ORN development and could potentially improve the NTCP models. For instance, posttreatment alcohol use was associated with development of ORN in the study by Owosho et al3; however, alcohol use history had no relation with ORN in other studies.1,20 Alcohol use may also act as a surrogate variable for general oral health, and the same can be reasoned for socioeconomic status, insurance status, ethnicity, and smoking status. More extensive research is needed to identify the role of general oral health in the development of ORN. Moreover, we considered ORN development as a binary variable, leading (as in most NTCP studies) to potential limitations with regard to right-censored event prediction. For simplicity, we used conventional NTCP model approaches, but efforts toward dynamic time-incorporating risk models (eg, partially observable Markov decision processes) are ongoing.
Despite these limitations, to our knowledge, this study represents the largest extant ORN survey of dose-response data and the first published ORN NTCP model. To ensure findable, accessible, interoperable and reusable (FAIR) data34 and allow external validation, an anonymized version of the data set, including DVH and clinical variables with ORN grades, has been deposited at doi:https://doi.org/10.6084/m9.figshare.13568207. Our hope is that this can afford others the opportunity to validate our approach, generate institutional-specific models, and engender further cross-platform research for ORN toxicity modeling and multi-institutional dose constraints.
Conclusions
The developed NTCP models performed well in predicting ORNI-IV (primary NTCP endpoint) and ORNIV (secondary NTCP endpoint) in both the training and independent test cohorts of patients with HNC. The NTCP models were based on mandible bone D30% and pretreatment dental extraction. Our results show a distinct association between planned mandible bone radiation dose and ORN development and suggest that less than 30% of the mandible should receive a dose of 35 Gy or more for an ORNI-IV risk lower than 5%. These NTCP models may be used to improve prevention of ORN and to guide ORN surveillance and management strategies by identifying and stratifying patients at risk of ORN.
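As a reading aid, a risk threshold of this kind can be related to a generic two-predictor logistic NTCP model. The logistic form is an assumption here (it is the conventional choice for multivariable NTCP models), and the coefficients β0, β1, and β2 are symbolic placeholders, not the fitted values from this study; x_ext is 1 if pre-RT dental extraction was performed and 0 otherwise.

```latex
\begin{align}
\mathrm{NTCP} &= \frac{1}{1 + e^{-S}}, \qquad
S = \beta_0 + \beta_1 D_{30\%} + \beta_2\, x_{\mathrm{ext}} \\
\mathrm{NTCP} < 0.05
&\iff S < \ln\frac{0.05}{0.95} = -\ln 19 \approx -2.94 \\
&\iff D_{30\%} < \frac{-\ln 19 - \beta_0 - \beta_2\, x_{\mathrm{ext}}}{\beta_1}
\qquad (\beta_1 > 0)
\end{align}
```

Under this form, a fixed risk ceiling translates into an upper bound on D30% that depends on the patient's dental extraction status.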
Acknowledgments
Sources of support: L.vD. received/receives funding and salary support from the Dutch organization NWO ZonMw during the period of study execution via the Rubicon Individual career development grant. A.S.R.M. and A.A.A. were funded for this work by The University of Texas MD Anderson Cancer Center-Oropharynx Cancer Program generously supported by Mr. and Mrs. Charles W. Stiefel. K.A.H., A.S.R.M., S.Y.L., and C.D.F. received/receive funding and salary support related to this project during the period of study execution from the NIH National Cancer Institute (NCI) Early Phase Clinical Trials in Imaging and Image Guided Interventions Program (R01CA218148). S.Y.L., A.S.R.M., and C.D.F. received/receive funding and salary support related to this project during the period of study execution from the National Institutes of Health (NIH) National Institute for Dental and Craniofacial Research (NIDCR) Establishing Outcome Measures Award (R01DE025248/R56DE025248). C.D.F. received funding unrelated to this project during the period of study execution from NIH/NCI Cancer Center (P30CA016672, P50 CA097007, and R01CA2148250), from NIH/NIBIB (R25EB025787–01), from NIH/NSF (NSF1557679), NSF-CMMI grant (NSF1933369), and the Sabin Family Foundation.
Footnotes
Disclosures: none.
Data sharing statement: An anonymized version of the data set, including DVH and clinical variables with ORN grades is available at doi:10.6084/m9.figshare.13568207.
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.ijrobp.2021.04.042.
References
1. Mohamed ASR, Hobbs BP, Hutcheson KA, et al. Dose-volume correlates of mandibular osteoradionecrosis in oropharynx cancer patients receiving intensity-modulated radiotherapy: Results from a case-matched comparison. Radiother Oncol 2017;124:232–239.
2. Chronopoulos A, Zarra T, Ehrenfeld M, Otto S. Osteoradionecrosis of the jaws: Definition, epidemiology, staging and clinical and radiological findings. A concise review. Int Dent J 2018;68:22–30.
3. Owosho AA, Tsai CJ, Lee RS, et al. The prevalence and risk factors associated with osteoradionecrosis of the jaw in oral and oropharyngeal cancer patients treated with intensity-modulated radiation therapy (IMRT): The Memorial Sloan Kettering Cancer Center experience. Oral Oncol 2017;64:44–51.
4. Beadle BM, Liao K-P, Chambers MS, et al. Evaluating the impact of patient, tumor, and treatment characteristics on the development of jaw complications in patients treated for oral cancers: A SEER-Medicare analysis. Head Neck 2013;35:1599–1605.
5. Wong ATT, Lai SY, Gunn GB, et al. Symptom burden and dysphagia associated with osteoradionecrosis in long-term oropharynx cancer survivors: A cohort analysis. Oral Oncol 2017;66:75–80.
6. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin 2020;70:7–30.
7. Licitra L, Perrone F, Bossi P, et al. High-risk human papillomavirus affects prognosis in patients with surgically treated oropharyngeal squamous cell carcinoma. J Clin Oncol 2006;24:5630–5636.
8. Sedaghat AR, Zhang Z, Begum S, et al. Prognostic significance of human papillomavirus in oropharyngeal squamous cell carcinomas. Laryngoscope 2009;119:1542–1549.
9. Ang KK, Harris J, Wheeler R, et al. Human papillomavirus and survival of patients with oropharyngeal cancer. N Engl J Med 2010;363:24–35.
10. Young D, Xiao CC, Murphy B, Moore M, Fakhry C, Day TA. Increase in head and neck cancer in younger patients due to human papillomavirus (HPV). Oral Oncol 2015;51:727–730.
11. Patel D, Haria S, Patel V. Oropharyngeal cancer and osteoradionecrosis in a novel radiation era: A single institution analysis. Oral Surg 2020;14:113–121.
12. Caparrotti F, Huang SH, Lu L, et al. Osteoradionecrosis of the mandible in patients with oropharyngeal carcinoma treated with intensity-modulated radiotherapy. Cancer 2017;123:3691–3700.
13. Tsai CJ, Hofstede TM, Sturgis EM, et al. Osteoradionecrosis and radiation dose to the mandible in patients with oropharyngeal cancer. Int J Radiat Oncol Biol Phys 2013;85:415–420.
14. Kierkels RGJ, Korevaar EW, Steenbakkers RJHM, et al. Direct use of multivariable normal tissue complication probability models in treatment plan optimisation for individualised head and neck cancer radiotherapy produces clinically acceptable treatment plans. Radiother Oncol 2014;112:430–436.
15. Witte MG, Van Der Geer J, Schneider C, Lebesque JV, Alber M, Van Herk M. IMRT optimization including random and systematic geometric errors based on the expectation of TCP and NTCP. Med Phys 2007;34:3544–3555.
16. Langendijk JA, Lambin P, De Ruysscher D, Widder J, Bos M, Verheij M. Selection of patients for radiotherapy with protons aiming at reduction of side effects: The model-based approach. Radiother Oncol 2013;107:267–273.
17. Sandulache VC, Hobbs BP, Mohamed ASR, et al. Dynamic contrast-enhanced MRI detects acute radiotherapy-induced alterations in mandibular microvasculature: Prospective assessment of imaging biomarkers of normal tissue injury. Sci Rep 2016;6:29864.
18. Zhang W, Zhang X, Yang P, et al. Intensity-modulated proton therapy and osteoradionecrosis in oropharyngeal cancer. Radiother Oncol 2017;123:401–405.
19. Brodin NP, Tomé WA. Revisiting the dose constraints for head and neck OARs in the current era of IMRT. Oral Oncol 2018;86:8–18.
20. Moon DH, Moon SH, Wang K, et al. Incidence of, and risk factors for, mandibular osteoradionecrosis in patients with oral cavity and oropharynx cancers. Oral Oncol 2017;72:98–103.
21. Mohamed ASR, Ruangskul M-N, Awan MJ, et al. Quality assurance assessment of diagnostic and radiation therapy-simulation CT image registration for head and neck radiation therapy: Anatomic region of interest-based comparison of rigid and deformable algorithms. Radiology 2015;274:752–763.
22. Moons KGM, Altman DG, Reitsma JB, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): Explanation and elaboration. Ann Intern Med 2015;162:W1.
23. R Development Core Team. R: A language and environment for statistical computing; Vienna, Austria: The R Foundation for Statistical Computing; 2011. Available at: http://www.R-project.org/. Accessed 28 May 2021.
24. Lambade PN, Lambade D, Goel M. Osteoradionecrosis of the mandible: A review. Oral Maxillofac Surg 2013;17:243–249.
25. Head Joint and Neck Radiotherapy-MRI Development Cooperative. Dynamic contrast-enhanced MRI detects acute radiotherapy-induced alterations in mandibular microvasculature: Prospective assessment of imaging biomarkers of normal tissue injury. Sci Rep 2016;6:29864.
26. Aarup-Kristensen S, Hansen CR, Forner L, Brink C, Eriksen JG, Johansen J. Osteoradionecrosis of the mandible after radiotherapy for head and neck cancer: Risk factors and dose-volume correlations. Acta Oncol (Madr) 2019;58:1373–1377.
27. Beech NM, Porceddu S, Batstone MD. Radiotherapy-associated dental extractions and osteoradionecrosis. Head Neck 2017;39:128–132.
28. Sathasivam HP, Davies GR, Boyd NM. Predictive factors for osteoradionecrosis of the jaws: A retrospective study. Head Neck 2018;40:46–54.
29. Kojima Y, Yanamoto S, Umeda M, et al. Relationship between dental status and development of osteoradionecrosis of the jaw: A multicenter retrospective study. Oral Surg Oral Med Oral Pathol Oral Radiol 2017;124:139–145.
30. Watson E, Eason B, Kreher M, Glogauer M. The DMFS160: A new index for measuring post-radiation caries. Oral Oncol 2020;108:104823.
31. Monti S, Palma G, D’Avino V, et al. Voxel-based analysis unveils regional dose differences associated with radiation-induced morbidity in head and neck cancer patients. Sci Rep 2017;7:7220.
32. Beasley W, Thor M, McWilliam A, et al. Image-based data mining to probe dosimetric correlates of radiation-induced trismus. Int J Radiat Oncol 2018;102:1330–1338.
33. Vergouwe Y, Nieboer D, Oostenbrink R, et al. A closed testing procedure to select an appropriate method for updating prediction models. Stat Med 2017;36:4529–4539.
34. Goble C, Cohen-Boulakia S, Soiland-Reyes S, Garijo D, Gil Y, Crusoe MR. FAIR computational workflows. Data Intell 2019;23:2.