PLOS ONE. 2020 May 22;15(5):e0232639. doi: 10.1371/journal.pone.0232639

Computed tomography-derived radiomic signature of head and neck squamous cell carcinoma (peri)tumoral tissue for the prediction of locoregional recurrence and distant metastasis after concurrent chemo-radiotherapy

Simon Keek 1,#, Sebastian Sanduleanu 1,*,#, Frederik Wesseling 2, Reinout de Roest 3, Michiel van den Brekel 4,5, Martijn van der Heijden 4,6, Conchita Vens 6,7, Calareso Giuseppina 8, Lisa Licitra 9,10, Kathrin Scheckenbach 11, Marije Vergeer 12, C René Leemans 3, Ruud H Brakenhoff 3, Irene Nauta 3, Stefano Cavalieri 8, Henry C Woodruff 1,13, Tito Poli 14, Ralph Leijenaar 2, Frank Hoebers 2,#, Philippe Lambin 1,13,#
Editor: Jason Chia-Hsun Hsieh 15
PMCID: PMC7244120  PMID: 32442178

Abstract

Introduction

In this study, we investigate the role of radiomics for prediction of overall survival (OS), locoregional recurrence (LRR) and distant metastases (DM) in stage III and IV HNSCC patients treated by chemoradiotherapy. We hypothesize that radiomic analysis of (peri-)tumoral tissue may detect invasion of surrounding tissues indicating a higher chance of locoregional recurrence and distant metastasis.

Methods

Two comprehensive data sources were used: the Dutch Cancer Society Database (Alp 7072, DESIGN) and “Big Data To Decide” (BD2Decide). The gross tumor volumes (GTV) were delineated on contrast-enhanced CT. Radiomic features were extracted using the RadiomiX Discovery Toolbox (OncoRadiomics, Liege, Belgium). Clinical patient features such as age, gender and performance status were collected. Two machine learning methods were chosen for their ability to handle censored data: Cox proportional hazards regression and random survival forest (RSF). Multivariable clinical and radiomic Cox/RSF models were generated based on significance in univariable Cox regression/RSF analyses on the held-out data in the training dataset. Features were selected in order of decreasing hazard ratio for Cox and decreasing relative importance for RSF.

Results

A total of 444 patients with radiotherapy planning CT-scans were included in this study: 301 head and neck squamous cell carcinoma (HNSCC) patients in the training cohort (DESIGN) and 143 patients in the validation cohort (BD2DECIDE). We found that the highest performing model was a clinical model that was able to predict distant metastasis in oropharyngeal cancer cases with an external validation C-index of 0.74 and 0.65 with the RSF and Cox models respectively. Peritumoral radiomics based prediction models performed poorly in the external validation, with C-index values ranging from 0.32 to 0.61 utilizing both feature selection and model generation methods.

Conclusion

Our results suggest that radiomic features from the peritumoral regions are not useful for the prediction of OS, LRR and DM.

Introduction

Head and neck squamous cell carcinoma (HNSCC) is the sixth most common malignant disease worldwide [1]. In the Netherlands, approximately 39,000 men and women were diagnosed with HNSCC between 2000 and 2015 [2]. Roughly two thirds of patients have advanced stage of disease at diagnosis with debilitating symptoms.

Major progress has been made in the treatment of advanced HNSCC throughout the last decade [6]. The “traditional” treatment of these advanced tumors consists of surgical excision followed by complementary (adjuvant) radiotherapy or chemoradiotherapy (CRT). CRT either applied upfront or postoperatively significantly improves survival in HNSCC patients with overall 5-year survival rates up to 61%, 41%, and 69% for oral, pharyngeal and laryngeal cancers, respectively [36]. The introduction of organ-preserving therapies (induction chemotherapy, upfront concomitant CRT, or molecular targeted drugs such as cetuximab) has notably changed treatment protocols of advanced stage HNSCC patients, especially in patients where surgical resection is considered too invasive and where severe problems with speech and swallowing are expected after surgery. Concomitant CRT consists of systemic administration of cisplatin in combination with locoregional radiotherapy and is the mainstay of organ-preserving treatment for advanced HNSCC.

It has been shown that 40% of patients treated upfront with CRT develop a locoregional recurrence or distant metastasis within 2 years after treatment and consequently have an unfavorable prognosis [7]. Several studies have found that advanced and human papillomavirus (HPV)-16-negative tumors respond poorly to CRT in contrast to HPV-positive tumors, in particular in oropharyngeal HNSCC [4, 8]. Clinicians rely on TNM classification to estimate prognosis, but it does not accurately predict which HNSCC patients treated with CRT will develop locoregional recurrences and hence might have benefited from alternative treatment options. Several other potentially prognostic factors have been proposed, such as chemotherapy dose, radiotherapy dose, co-morbidity, World Health Organization (WHO) Performance Status (PS), and HPV status. Through the use of machine learning algorithms, complex survival models can be created that take these clinical factors into account, while accounting for e.g. interaction between the predictors and right-censored data [9].

Currently used biomarkers comprise tumor size, local tumor extent and a few molecular markers (e.g. p16 staining or HPV-PCR). Radiologic imaging, which is routinely performed prior to initiation of CRT, provides an additional source of information that can be exploited through the use of advanced image analysis methods such as radiomics. Radiomics turns radiographic images into a high-throughput data-mining format. The format of the extracted data is a set of features, including first-order intensity histogram statistics, shape- and size statistics, and (filtered) texture features. Complex models that combine radiomics with clinical parameters may be better in detecting HNSCC patients that have a higher likelihood to relapse early after CRT [10].

A growing body of research shows that the tumor microenvironment is a key player in head and neck cancer development and progression [11,12], and hence the immediate surroundings of the tumor may be a source for the extraction of imaging biomarkers. One of the hypotheses is that information about underlying malignancy-associated changes (MACs) in the tumor microenvironment can be detected by these imaging biomarkers. MACs are subtle changes in the nuclear morphology and chromatin structure of seemingly normal cells located within the stroma distally to neoplastic lesions. The tumor microenvironment has been shown to dictate the tumor's ability to grow and spread, evade the body’s immune defenses, and resist therapeutic intervention [13].

In this study, we aim to investigate the role of radiomics for prediction of overall survival (OS), locoregional recurrence (LRR) and distant metastasis (DM) in stage III and IV HNSCC patients, both in an HPV-negative oropharyngeal cohort (high risk) and in the general HNSCC population. We hypothesize that radiomic analysis of peritumoral tissue detects changes associated with malignancy and therefore with the likelihood of locoregional recurrence and distant metastasis following CRT.

Methods

Patient characteristics

Two sources of clinical and imaging data were available to us for this study: the Dutch Cancer Society Database (Alp 7072, acronym DESIGN) and “Big Data To Decide” (BD2Decide, NCT02832102). DESIGN is a Dutch multi-center clinical study to create predictive models for stage III and IV HPV-negative HNSCC patients treated by CRT. BD2Decide is a European multi-center clinical study to improve clinical decision making in stage III and IV HNSCC patients irrespective of treatment. In the present study, we included patients from both consortiums with pathologically-confirmed HNSCC, who received contrast-enhanced pre-treatment CT and have been treated upfront with CRT.

The DESIGN data consists of contrast enhanced CT images (and associated clinical data) acquired from 4 different centers: Amsterdam UMC location VUmc, Netherlands Cancer Institute (NKI), Maastricht Radiation Oncology Clinic (MAASTRO), and the University Medical Center Utrecht (UMCU). The BD2Decide data consists of contrast-enhanced CT images retrospectively acquired from 4 different centers: Fondazione IRCCS Istituto dei Tumori Milano (INT), Maastricht Radiation Oncology Clinic (MAASTRO), Amsterdam UMC, location VUmc (VUMC), and the Heinrich-Heine-university in Düsseldorf. There were no overlapping patients between DESIGN and BD2DECIDE.

Both DESIGN and BD2Decide data included clinical, pathological, radiologic imaging, and molecular markers for each case. After comparing datasets, a selection was made to include patients based on the overlap of available clinical data between the two cohorts. These consist of age, sex, performance status, ACE-27 baseline comorbidity, number of pack years, alcohol consumption, hemoglobin at baseline, chemotherapy regimen, HPV status (defined as p16-status) for oropharyngeal cancer, induction chemotherapy (yes/ no), chemotherapy completion (yes/no), and RT dose to the high-risk clinical target volume (HR-CTV).

CT acquisition parameters and segmentation

Patients were selected according to the following inclusion criteria: (i) concomitant CRT of unresected HNSCC, (ii) hypopharyngeal, laryngeal or (HPV-negative on p16 staining) oropharyngeal tumor site, (iii) no prior treatment with chemotherapy or with radiotherapy in the head and neck area, (iv) availability of contrast-enhanced baseline planning CT imaging with a slice thickness ≤ 5mm and artifacts in less than 50% of the GTV slices, and (v) availability of patient outcome data for OS, LRR, and DM. A wide range of scanners was used to acquire the images (S1 Appendix).

GTVs were delineated in each center by an assigned radiation oncologist or radiologist. All contours were revised by a radiation oncologist with over 18 years of experience, using MIM software version 6.9.0 (MIM, Cleveland, United States).

Tumor border regions of interest (ROIs) extending 3mm and 5mm from the 3D GTV border were generated in MIM (outer ring expansion, see Fig 1). Afterwards, air and bone were removed from the delineation by setting minimum and maximum thresholds, and the borders of the final peritumoral ROIs were adjusted manually.
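The ring construction described above can be illustrated with a minimal sketch. It is not the MIM procedure used in the study: the function names, the voxel-set representation and the HU thresholds (-200 and 300) are assumptions chosen for illustration only, since the actual threshold values are not reported here.

```python
from itertools import product


def peritumoral_ring(gtv, shape, spacing_mm, margin_mm):
    """Return voxels within margin_mm of the GTV that are not part of the GTV.

    gtv is a set of (z, y, x) voxel indices, shape is the image size in voxels,
    spacing_mm is the voxel size along each axis (hypothetical representation).
    """
    reach = [int(margin_mm // s) for s in spacing_mm]
    ring = set()
    for voxel in gtv:
        for offset in product(*(range(-r, r + 1) for r in reach)):
            dist2 = sum((o * s) ** 2 for o, s in zip(offset, spacing_mm))
            cand = tuple(v + o for v, o in zip(voxel, offset))
            inside = all(0 <= c < n for c, n in zip(cand, shape))
            if inside and dist2 <= margin_mm ** 2 and cand not in gtv:
                ring.add(cand)
    return ring


def drop_air_and_bone(ring, hu, low=-200, high=300):
    """Keep ring voxels whose HU value lies inside [low, high] (assumed thresholds)."""
    return {v for v in ring if low <= hu[v] <= high}
```

A set of voxel indices keeps the sketch dependency-free; a real pipeline would more likely operate on binary mask arrays with a morphological dilation.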

Fig 1. Contrast-enhanced CT image from an oropharyngeal cancer patient.

Primary gross tumor volume (GTV1) border in green, 3mm peritumoral border in blue, 5mm peritumoral border in yellow.

Ethical approval

This study was performed following the guidelines of the Code of Conduct for Human Tissue and Medical Research (https://www.federa.org/codes-conduct) and the EU General Data Protection Regulation.

Medical Ethics Committee approval was provided by the individual centers (full list provided in S2 Appendix).

Written informed consent was given and was placed under the responsibility of the Principal Investigators of the relevant Clinical Participating Centers mentioned above and remain under the custodianship of the specific Participating Centers.

For reproducibility purposes, our code can be found on: https://github.com/PeritumoralRadiomics/Peritumoral-radiomics-HN.git.

Clinical outcome

The clinical endpoints evaluated in this study were overall survival (OS), locoregional recurrence (LRR) and distant metastasis (DM). The missForest (non-parametric missing value imputation using Random Forest) function within the R environment (https://www.R-project.org/) was used to impute missing data. Time to OS was defined as the time between CRT start date and date of death, or censored at the last follow-up date.

Time to LRR was defined as the time between CRT start date and the first scan date of radiologically evident local or regional recurrence (event), or censored at the last follow-up date or date of death.

Time to DM was defined as the time between CRT start date and the first scan date of radiologically evident distant metastasis, or censored at the last follow-up date or date of death.
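The endpoint definitions above share one pattern: the clock starts at the CRT start date and stops either at the event or at a censoring date. A minimal sketch of that pattern follows; the function and argument names are hypothetical and are not taken from the study code.

```python
from datetime import date


def time_to_event(crt_start, event_date, censor_date):
    """Return (time in days, event indicator) for one patient.

    event_date is the date of death (OS) or of the first scan showing
    recurrence/metastasis (LRR, DM), or None if no event was observed.
    censor_date is the last follow-up date or, for LRR and DM, the date of death.
    """
    if event_date is not None:
        return (event_date - crt_start).days, 1
    return (censor_date - crt_start).days, 0
```

The returned pair (time, indicator) is the usual input format for Cox regression and random survival forests.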

Image pre-processing, radiomic feature extraction and feature harmonization

International Biomarker Standardization Initiative (IBSI)-compliant radiomic features as well as other non-IBSI covered features were extracted with our in-house RadiomiX research software (supported by Oncoradiomics, Liège, Belgium) implemented in Matlab 2017a (Mathworks, Natick, Mass). Hounsfield Unit (HU) intensities below -1024 HU or above +3071 HU were clipped (assigned the values -1024 and +3071, respectively). An image intensity discretization applying a fixed bin width of 25 HU was used for feature extraction in CT. Voxel size resampling was performed before feature extraction: images were resampled to isotropic voxels of 3 x 3 x 3 mm³ using cubic interpolation (upsampling to highest slice thickness).
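The clipping and fixed-bin-width discretization steps can be sketched as follows. This is an illustration only, not the RadiomiX implementation; in particular, anchoring the first bin at -1024 HU is an assumption, and the function names are hypothetical.

```python
import math

HU_MIN, HU_MAX = -1024, 3071  # clipping range stated in the text
BIN_WIDTH = 25                # fixed bin width in HU stated in the text


def clip_hu(value):
    """Assign values outside [HU_MIN, HU_MAX] to the nearest bound."""
    return max(HU_MIN, min(HU_MAX, value))


def discretize(value, anchor=HU_MIN, width=BIN_WIDTH):
    """Map an HU value to a 1-based bin index (anchor at HU_MIN is assumed)."""
    return math.floor((clip_hu(value) - anchor) / width) + 1
```

With a fixed bin width, the same HU interval always maps to the same gray level regardless of the intensity range inside a given ROI, which keeps texture features comparable across patients.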

Radiomic features were extracted consisting of five main groups: 1) fractal features 2) first order statistics, 3) shape and size, 4) texture descriptors including gray level co-occurrence (GLCM), gray level run-length (GLRLM) and gray level size-zone texture matrices (GLSZM), 5) features from groups 1, 3 and 4 after wavelet decomposition of the original image. There were no missing feature values. Definitions and detailed feature descriptions are described elsewhere [14].

Radiomic feature values are potentially sensitive to variations in scanner model, acquisition protocol and reconstruction settings. The ComBat statistical feature harmonization technique was therefore employed in our analysis. This technique was initially developed by Johnson et al. [15] for gene expression microarray data (even for small sample sizes) and was recently applied in multicenter PET, MRI and CT radiomic studies [16,17]. Feature values were adjusted for the batch effect according to treatment center, without adjustment for other covariates. Finally, features were normalized in the training dataset by their mean and standard deviation, which were subsequently used to normalize the validation dataset.
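The final normalization step, in which the training statistics are reused for the validation data, can be sketched per feature as below. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev


def fit_scaler(train_values):
    """Estimate mean and (sample) standard deviation on the training cohort only."""
    return mean(train_values), stdev(train_values)


def apply_scaler(values, mu, sigma):
    """Z-score values with previously fitted training statistics."""
    return [(v - mu) / sigma for v in values]


# Usage: fit on training data, then apply the same statistics to both cohorts.
mu, sigma = fit_scaler([1.0, 2.0, 3.0])
train_z = apply_scaler([1.0, 2.0, 3.0], mu, sigma)
val_z = apply_scaler([4.0], mu, sigma)
```

Fitting the statistics on the training cohort alone avoids leaking information from the validation cohort into model development.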

Univariable analysis and generation of multivariable models

The prognostic value of the individual radiomic and clinical features was evaluated using concordance index (CI) with the survival package (Therneau T (2015). A Package for Survival Analysis in R. version 2.38, URL: https://CRAN.R-project.org/package=survival) and randomForestSRC package (Ishwaran H (2017) Fast Unified Random Forests for Survival, Regression and Classification (RF-SRC) version 2.9.1, URL: https://cran.r-project.org/web/packages/randomForestSR).
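For readers unfamiliar with the concordance index, a simplified pairwise version of Harrell's C is sketched below. It is an illustration under simplifying assumptions (pairs with tied times are ignored) and is not the implementation in the survival or randomForestSRC packages; the function name is hypothetical.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i has an observed event and a
    shorter time than patient j. Ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; values below 0.5, such as some of the validation results reported later, indicate that the predicted ordering is worse than chance.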

Noether’s method was applied to assess the statistical significance of the difference between the computed CI and random chance (CI = 0.5) with the survcomp package (Benjamin Haibe-Kains (2017). Performance Assessment and Comparison for Survival Analysis in R. version 1.36.0, URL: https://www.pmgenomics.ca/bhklab/). To account for multiple testing, the false-discovery-rate (FDR) procedure by Benjamini and Hochberg was applied to adjust the p-values in univariate Cox regression.
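The Benjamini-Hochberg adjustment can be written in a few lines. The sketch below follows the standard step-up definition (adjusted p-value = smallest p·m/rank over all equal or larger ranks); the function name is hypothetical and the study itself performed this step in R.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Features whose adjusted p-value stays below 0.05 would then be treated as significant in the univariate screening.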

Two machine learning methods were employed that are able to use censored survival data as inputs: Cox proportional hazards regression and random survival forest (RSF).

Multivariable radiomic Cox models were generated using the significant features selected through univariate Cox modelling on the training dataset. In a 100-repeat 2-fold cross-validation on the training data, significant features were selected based on univariate significance (p<0.05) adjusted for multiple testing.

These features were then ranked according to adjusted hazard ratios, where hazard ratios lower than 1 were inverted, and were gradually added to a multivariate Cox model until the first peak in the cross-validation testing C-index or, where oscillation or noise produced multiple peaks, until the C-index dropped by more than 0.02 after the first peak. The number of occurrences of each feature in all repetitions was determined, and a selection rate > 50% was used as threshold for the final set of features, ensuring that the selected features were chosen in the majority of the models.
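The ranking, stopping rule and selection-rate threshold described above can be sketched as follows. This is one possible reading of the stopping rule (keep the best C-index seen so far and stop once the current value falls more than 0.02 below it); the function names and data structures are hypothetical and are not taken from the study code.

```python
from collections import Counter


def rank_by_hazard_ratio(hazard_ratios):
    """Order features by effect size, inverting hazard ratios below 1."""
    effect = {f: (hr if hr >= 1 else 1 / hr) for f, hr in hazard_ratios.items()}
    return sorted(effect, key=effect.get, reverse=True)


def n_features_at_peak(c_indices, tolerance=0.02):
    """c_indices[k-1] is the test C-index of the model with the top k features."""
    best_k, best_c = 1, c_indices[0]
    for k, c in enumerate(c_indices[1:], start=2):
        if c > best_c:
            best_k, best_c = k, c
        elif best_c - c > tolerance:
            break  # dropped by more than the tolerance: stop adding features
    return best_k


def stable_features(selected_per_repeat, min_rate=0.5):
    """Keep features selected in more than min_rate of the repetitions."""
    counts = Counter(f for sel in selected_per_repeat for f in set(sel))
    n = len(selected_per_repeat)
    return sorted(f for f, c in counts.items() if c / n > min_rate)
```

The tolerance lets the search continue past small dips caused by noise, while the selection-rate filter keeps only features that were chosen in the majority of cross-validation repetitions.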

Multivariable clinical models included features selected through Cox-regression based on univariate significance (p<0.05) adjusted for multiple testing. The selected clinical features were then used to train multivariable Cox or RSF models.

Multivariable clinical RSF models were generated based on selecting all features with a relative feature importance >0 in the Random Survival Forest.

RSF strictly adheres to the prescription laid out by Breiman (2003) and requires taking into account the outcome (splitting criterion used in growing a tree must explicitly involve survival time and censoring information) in growing a random forest model. Further, the predicted value for a terminal node in a tree, the resulting ensemble predicted value from the forest, and the measure of prediction accuracy must all properly incorporate survival information.

Multivariable radiomic RSF models were generated by consecutively adding features in order of decreasing relative importance in the Random Survival Forest. The optimal number of features corresponded to the first peak in C-index value in the out-of-bag cases or, where oscillation or noise produced multiple peaks, to the point after the first peak at which the C-index dropped by more than 0.02.

Results

Clinical characteristics

Contrast enhanced CT images from a total of 444 patients were included in this study: The training cohort (DESIGN) consisted of 301 head and neck squamous cell carcinoma (HNSCC) patients and the validation cohort (BD2DECIDE) of 143 patients. At time of diagnosis, the median age in the training cohort (DESIGN) was 61 years (range: 36 to 80 years), while the median age in the external validation cohort (BD2DECIDE) was 60.5 years (range: 41 to 78 years).

In the training dataset the median OS time was 1118 days, the median time to LRR or last follow-up was 1042 days and the median time to DM or last follow-up was 1060 days. In the external validation dataset the median time to death or last follow-up was 1268 days, the median time to LRR or last follow-up was 1217 days and the median time to DM or last follow-up was 1189 days.

The full list of patient characteristics and time to progression is presented in Table 1.

Table 1. DESIGN/ BD2DECIDE patient characteristics.

| Characteristic | DESIGN training cohort (n = 301) | BD2DECIDE validation cohort (n = 143) | P-value |
|---|---|---|---|
| | Median (range) | Median (range) | |
| GTVprim Volume (cm³) | 21.28 (0.65–176.10) | 19.82 (0.54–157.28) | 0.82 |
| Age (years) | 61 (36–80) | 60 (41–78) | 0.52 |
| | Number of pts (%) | Number of pts (%) | |
| WHO PS | | | <0.001 |
| 0 | 0 (0) | 120 (83.9) | |
| 1 | 79 (26.2) | 20 (14.0) | |
| 2 | 139 (46.2) | 3 (2.1) | |
| 3 | 10 (3.3) | 0 (0) | |
| Missing | 73 (24.3) | 0 (0) | |
| Clinical TNM (T), 7th Edition | | | 0.08 |
| cTX | 0 (0) | 0 (0) | |
| cT1 | 14 (4.7) | 3 (2.1) | |
| cT2 | 63 (20.9) | 25 (17.5) | |
| cT3 | 106 (35.2) | 68 (47.6) | |
| cT4 | 118 (39.2) | 47 (32.9) | |
| Clinical Nodal stage (N), 7th Edition | | | 0.01 |
| cNX | 1 (0.3) | 0 (0) | |
| cN0 | 41 (13.6) | 37 (25.9) | |
| cN1 | 41 (13.6) | 19 (13.3) | |
| cN2 a-b-c | 209 (69.5) | 79 (55.2) | |
| cN3 | 9 (3.0) | 8 (5.6) | |
| HPV status (p16 stain) | | | <0.001 |
| Negative | 207 (68.8) | 64 (44.8) | |
| Positive/Unknown | 94 (31.2) | 79 (55.2) | |
| Treatment | | | |
| Chemotherapy regimen | | | <0.001 |
| Platin | 292 (97.0) | 81 (56.6) | |
| Platin + others | 9 (3.0) | 23 (16.1) | |
| Cetuximab | 0 (0) | 39 (27.3) | |
| Cumulative radiotherapy dose high-risk CTV | 70 (60–84) Gy | 70 (20–76) Gy | |
| Tumor site | | | |
| Oropharynx | 145 (48.2) | 49 (34.3) | 0.02 |
| Larynx | 57 (18.9) | 39 (27.3) | |
| Hypopharynx | 99 (32.9) | 55 (38.5) | |

Clinical characteristics

Clinical models (Tables 2 and 3) to predict OS, LR and DM ranged from a C-index of 0.61–0.85 in training with both methods and a C-index of 0.49–0.75 in external validation. Details on the clinical variables selected in the final Cox/RSF models are presented in Table 4.

Table 2. Multivariable Cox Regression method, C-index and number of radiomic and (non)-treatment related prognostic clinical factors in validation dataset (BD2DECIDE).

Values are C-index (No. feat).

| Model | Prognostic, Train | Prognostic, Val | GTVprim, Train | GTVprim, Val | TB 3mm, Train | TB 3mm, Val | TB 5mm, Train | TB 5mm, Val | GTVprim + TB 3mm + TB 5mm, Train | GTVprim + TB 3mm + TB 5mm, Val |
|---|---|---|---|---|---|---|---|---|---|---|
| Oropharynx | | | | | | | | | | |
| Clinical-OS | 0.61 (1) | 0.49 (1) | | | | | | | | |
| Clinical-LR | 0.61 (1) | 0.55 (1) | | | | | | | | |
| Clinical-DM | 0.67 (1) | 0.65 (1) | | | | | | | | |
| Radiomics-OS | | | 0.65 (3) | 0.57 (3) | 0.69 (3) | 0.52 (3) | 0.79 (1) | 0.60 (1) | 0.70 (2) | 0.56 (2) |
| Radiomics-LR | | | 0.57 (1) | 0.52 (1) | 0.70 (2) | 0.56 (2) | 0.76 (6) | 0.51 (6) | 0.72 (4) | 0.48 (4) |
| Radiomics-DM | | | - | - | 0.69 (2) | 0.61 (2) | 0.73 (3) | 0.44 (3) | 0.72 (2) | 0.60 (2) |
| All subsites | | | | | | | | | | |
| Clinical-OS | 0.64 (4) | 0.56 (4) | | | | | | | | |
| Clinical-LR | - | - | | | | | | | | |
| Clinical-DM | 0.67 (1) | 0.49 (1) | | | | | | | | |
| Radiomics-OS | | | 0.61 (1) | 0.60 (1) | 0.63 (4) | 0.61 (4) | 0.61 (2) | 0.62 (2) | 0.61 (3) | 0.59 (3) |
| Radiomics-LR | | | 0.66 (3) | 0.51 (3) | 0.67 (3) | 0.51 (3) | 0.58 (1) | 0.47 (1) | 0.61 (1) | 0.47 (1) |
| Radiomics-DM | | | 0.63 (2) | 0.54 (2) | 0.54 (2) | 0.47 (4) | 0.61 (2) | 0.56 (2) | 0.64 (3) | 0.55 (2) |

Abbreviations GTVprim—Primary Gross Tumor Volume, OS- Overall Survival, LR- Locoregional Recurrence, DM- Distant Metastasis.

Table 3. Random survival forest method, C-index and number of radiomic or (non)-treatment related prognostic clinical factors.

Values are C-index (No. feat).

| Model | Prognostic, Train | Prognostic, Val | Treatment, Train | Treatment, Val | GTVprim, Train | GTVprim, Val | TB 3mm, Train | TB 3mm, Val | TB 5mm, Train | TB 5mm, Val | GTVprim + TB 3mm + TB 5mm, Train | GTVprim + TB 3mm + TB 5mm, Val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Oropharynx | | | | | | | | | | | | |
| Clinical-OS | 0.74 (5) | 0.74 (5) | 0.52 (1) | 0.53 (1) | | | | | | | | |
| Clinical-LR | 0.81 (5) | 0.81 (5) | 0.51 (1) | 0.51 (1) | | | | | | | | |
| Clinical-DM | 0.85 (4) | 0.85 (4) | 0.51 (1) | 0.52 (1) | | | | | | | | |
| Radiomics-OS | | | | | 0.73 (3) | 0.58 (3) | 0.77 (6) | 0.49 (6) | 0.79 (5) | 0.60 (5) | 0.78 (6) | 0.61 (6) |
| Radiomics-LR | | | | | 0.77 (2) | 0.49 (2) | 0.83 (3) | 0.43 (3) | 0.83 (2) | 0.59 (7) | 0.71 (3) | 0.57 (3) |
| Radiomics-DM | | | | | 0.82 (2) | 0.49 (2) | 0.91 (8) | 0.55 (8) | 0.81 (3) | 0.50 (3) | 0.86 (4) | 0.32 (4) |
| All subsites | | | | | | | | | | | | |
| Clinical-OS | 0.77 (7) | 0.77 (7) | 0.56 (1) | 0.51 (1) | | | | | | | | |
| Clinical-LR | 0.79 (3) | 0.79 (3) | 0.56 (2) | 0.49 (2) | | | | | | | | |
| Clinical-DM | 0.84 (4) | 0.84 (4) | - | - | | | | | | | | |
| Radiomics-OS | | | | | 0.79 (7) | 0.58 (4) | 0.89 (4) | 0.58 (4) | 0.77 (5) | 0.60 (5) | 0.78 (6) | 0.59 (6) |
| Radiomics-LR | | | | | 0.81 (3) | 0.52 (2) | 0.80 (2) | 0.52 (2) | 0.86 (7) | 0.59 (7) | 0.83 (2) | 0.53 (2) |
| Radiomics-DM | | | | | 0.86 (3) | 0.49 (3) | 0.86 (4) | 0.49 (4) | 0.96 (3) | 0.50 (3) | 0.86 (3) | 0.43 (3) |

Abbreviations GTVprim—Primary Gross Tumor Volume, OS- Overall Survival, LR- Locoregional Recurrence, DM- Distant Metastasis.

Table 4. Multivariable clinical Cox/ RSF models.

Outcome Clinical Cox, all subsites Prognostic Clinical Cox, Oropharynx Prognostic Clinical RSF, all subsites Prognostic Clinical RSF, Oropharynx Prognostic Clinical RSF, all subsites Treatment Clinical RSF, Oropharynx Treatment
OS N-stage N-stage N-stage N-stage Chemotherapy regimen Chemotherapy regimen
Tumor site Tumor site Age Chemotherapy completion
Gender Hb baseline Pack Years
Alcohol consumption Age Alcohol consumption
Pack-years Gender
LR N-stage Gender Hb baseline Gender Chemotherapy regimen Chemotherapy regimen
Tumor site Alcohol consumption Chemotherapy completion
Gender Age
Pack years
N-stage
DM N-stage N-stage N-stage N-stage Chemotherapy regimen
T-stage T-stage
Hb baseline Age
Pack-years Pack years

The highest performing model in external validation was a clinical model (Oropharynx-DM). With this clinical model a significant survival split was found in training (Fig 2a) but not in validation (Fig 2b), based on the median prediction probabilities in training according to the Cox model.

Fig 2.

a. Training Kaplan-Meier (distant metastasis free) survival split for oropharyngeal patients (best performing clinical model in validation with Cox regression, oropharynx-DM) based on above (blue line) and below (yellow) median prediction probabilities. b. Validation Kaplan-Meier (distant metastasis free) survival split for oropharyngeal patients (best performing clinical model in validation with Cox regression, oropharynx-DM) based on above (blue line) and below (yellow) median prediction probabilities. Non-significant split in survival according to median in training, though in all of the above median cases the time to event is not observed (censoring).
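The kind of split shown in Fig 2 can be sketched in two steps: patients are grouped by whether their predicted probability lies above the median of the training predictions, and a Kaplan-Meier curve is estimated per group. The code below is a minimal illustration with hypothetical function names; it is not the study's R code and omits the significance test.

```python
from statistics import median


def split_by_training_median(train_pred, pred):
    """Label each prediction as above ('high') or not above ('low') the training median."""
    cutoff = median(train_pred)
    return ["high" if p > cutoff else "low" for p in pred]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event."""
    survival, curve = 1.0, []
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve
```

Using the training median as the cutoff for the validation cohort keeps the risk groups defined independently of the validation outcomes.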

Radiomics characteristics

A total of 1298 radiomic features were extracted from all contrast-enhanced CT images. Results of training (DESIGN) and validation (BD2DECIDE) C-index metrics are provided in Tables 2 and 3. Both in oropharyngeal cases alone and in all tumor subsites combined, peritumoral radiomics performed poorly in external validation, with C-index ranging from 0.32 to 0.61 with both feature selection and model generation methods (Figs 3 and 4).

Fig 3. Error rate stabilizes with increasing number of trees.

Features with an importance > 0 in an RFSRC model trained with all clinical variables were combined in the multivariable clinical (prognostic/treatment-related) RFSRC model and externally validated on the BD2DECIDE dataset.

Fig 4. Variable dependence of predicted distant metastasis at 1, 2 and 5 years on the 4 clinical variables of interest (highest performing clinical model in validation, oropharyngeal-DM) according to Random Survival Forest.

Individual cases are marked with blue triangles for censored cases and red circles for distant metastasis events. Loess smooth curve indicates the distant metastasis trend with increasing values of the individual clinical feature.

Volumetric information was calculated for GTVprim, and Spearman correlation coefficients between the individual selected features and volume were calculated. With the Cox method these coefficients were all <0.60 (all P>0.05 correlation with model features). With the RSF method these varied between 0.28–0.45 (all P>0.05 correlation with model features).
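The volume check above relies on the Spearman rank correlation, which is the Pearson correlation computed on ranks. A dependency-free sketch is given below; the function names are hypothetical, ties receive average ranks, and no p-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(feature_values, volumes):
    """Spearman correlation between a radiomic feature and tumor volume."""
    return pearson(average_ranks(feature_values), average_ranks(volumes))
```

A feature with a high absolute correlation to volume would add little beyond tumor size itself, which is why this check is commonly reported alongside radiomic signatures.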

Radiomics quality assurance and TRIPOD statement

For quality assurance a radiomics quality score (RQS) was calculated [14] for this study. The RQS score for this specific study was 44% (most points allocated for external validation and use of feature reduction analysis).

Scores were likewise calculated for the 22-item adherence data extraction checklist of the TRIPOD (Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis), which was in the range of 0.75–0.86 (See S3 Appendix).

Discussion

In this first peritumoral H&N radiomics study we found that the highest performing model in external validation was a clinical model which was able to predict distant metastasis in oropharyngeal cancer cases with an external validation C-index of 0.75 and 0.65 with the RSF and Cox models respectively. Both in oropharyngeal cases alone and in all tumor subsites, peritumoral radiomics performed poorly in external validation, with C-index ranging from 0.32 to 0.61 with both feature selection and model generation methods.

The reasoning for choosing a 5mm tumor border is based on radiotherapy margins which are defined outside the visible/palpable or imaging-detectable (macroscopic) tumor GTV, the clinical target volume (CTV), whereby potential microscopic tumor spread is taken into account. Based on experience from pathological examination of surgical resections, the Danish Head and Neck Cancer (DAHANCA) group concluded that for primary tumors (GTV-T), the risk of subclinical microscopic spread was around 50% of which more than 99% was within 5 mm and 95% within 4 mm of the rim of GTV-T [18].

Previous studies on peritumoral radiomics in other tumor models have not been able to produce promising results in internal cross-validation either. We have not yet seen a peritumoral H&N radiomics study with an external validation dataset.

Dou et al. [19] for instance found a testing C-index of 0.55 with a lung radiomic tumor border model in the prediction of distant metastasis, while Shan et al. [20] found that in predicting early recurrence in hepatocellular carcinoma (HCC), by comparing AUC values between training and validation cohorts, the prediction accuracy in the validation cohort was good for the peritumoral radiomics model (0.80 vs. 0.79, P = 0.47) but poor for the tumoral radiomics model (0.82 vs. 0.62, P < 0.01).

Despite the poor performance in external validation of the GTVprim, 3mm and 5mm tumor border radiomics models, we found that a clinical model for the prediction of distant metastasis in oropharyngeal cancer patients performed best in external validation.

We find an overlapping clinical parameter, namely N-stage, between these two clinical models. Indeed, high N-stage is hypothesized to be one of the major risk factors for the development of distant metastasis [21,22]. We also see some discrepancies between the two clinical models. For instance, T-stage, age, and pack-years (the number of packs of cigarettes per day multiplied by the years spent smoking) are also selected as predictors of distant metastasis in the RSF model.

Strengths of the current study include the use of an external validation dataset, the extensive clinical data and the rigorous feature selection methods that take into account time-to-event outcomes.

One of the limitations is the retrospective nature of the study, which left several clinical variables (e.g. weight loss) not comparable between training and validation. Another limitation is the heterogeneity between the training and validation datasets in terms of WHO PS, N-stage, chemotherapy regimen (mostly platin-alone regimens in DESIGN versus platin + other regimens in BD2DECIDE) and tumor site (more oropharyngeal and fewer laryngeal cases in DESIGN compared to BD2DECIDE). We hypothesize that this has negatively impacted the model performance.

Another limitation is the omission of valuable semantic imaging features, qualitative imaging features that are defined by experienced radiologists (e.g. extracapsular growth, necrosis) as well as the omission of radiomics description of the GTV2 (positive lymph nodes).

Most radiomic features are designed to be extracted from a fully enclosed 3D volume, as is often the case with the primary tumor. In contrast, the peritumoral regions are rings with limited volume, especially the 3mm regions. Therefore, features such as those extracted from filtered images require a certain volume of the region of interest and therefore have limited application in small volumes or disjointed regions. These technical issues may have contributed to the relatively poor performance of peritumoral radiomics.

We believe that in the future, to improve clinical use of these kinds of signatures, larger, more homogeneous and prospectively collected data should be sought, taking into account imaging features derived from GTV2/lymph node regions and gene expression profiles in order to construct more reliable prognostic biomarkers. An intrinsic problem might be that recurrences cannot be predicted well with bulk tumor characteristics. In a recent genetics study [23] it was shown that half of the local relapses of CRT-treated HNSCCs did not share genetic changes with the index tumors, suggesting that minor treatment-resistant subclones determine outcome in many cases. Taking this into account, we believe that future radiomics studies should derive information not only from the planning CTs, but also from imaging at the multiple follow-up moments after treatment.

Conclusion

In this study, we have investigated whether clinical data and computer-extracted radiomic features from peritumoral and intratumoral regions on CT can predict OS, LRR and DM. Our results show that radiomic features from the primary peritumoral regions, as well as from the primary intratumoral regions, do not predict OS, LRR and DM.

More homogeneous cohorts, both in patient and imaging characteristics, and the combination of clinical, radiomics, and genomics models may increase the generalizability and predictive power of prognostic models.

Supporting information

S1 Appendix. Datasets, imaging parameters and missing data.

(DOCX)

S2 Appendix. Full names of ALL the ethics committees/institutional review boards that approved study.

(DOCX)

S3 Appendix. TRIPOD adherence data extraction checklist.

(DOCX)

Acknowledgments

Authors acknowledge financial support from ERC advanced grant (ERC-ADG-2015, n° 694812—Hypoximmuno), ERC-2018-PoC (n° 81320 –CL-IO). This research is also supported by the Dutch technology Foundation STW (grant n° P14-19 Radiomics STRaTegy), which is the applied science division of NWO, and the Technology Programme of the Ministry of Economic Affairs. Authors also acknowledge financial support from SME Phase 2 (RAIL—n°673780), EUROSTARS (DART, DECIDE), the European Program H2020-2015-17 (BD2Decide—PHC30-689715, PREDICT—ITN—n° 766276), TRANSCAN Joint Transnational Call 2016 (JTC2016 “CLEARLY”- n° UM 2017–8295), Interreg V-A Euregio Meuse-Rhine (“Euradiomics”) and the Dutch Cancer Society.

Authors further acknowledge financial support by the Dutch Cancer Society (KWF Kankerbestrijding), Project number 12085/2018-2 and A6C 7072/2014-2.

Data Availability

Data cannot be shared publicly because we are not allowed to share the imaging data as a third party (2 year moratorium on BD2DECIDE data). Radiomics/clinical data and scripts are available from GitHub https://github.com/SebastianSanduleanu/Peritumoral-HN-Radiomics.git for researchers who meet the criteria for access to confidential data.

Funding Statement

Funded: DESIGN: Alpe d’Huzes/ KWF Program Grant A6C 7072. BD2DECIDE: European Union Horizon 2020 research/innovation program (689715). Dr. R Leijenaar received support in the form of a salary from OncoRadiomics. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  • 1. Clayburgh Daniel R, Grandis Jennifer R et al. “Molecular Biology” Oral, Head and Neck Oncology and Reconstructive Surgery 2018: 79–89
  • 2. Dutch Cancer Registry (NKR) from Integral Cancer Center Netherlands (IKNL): https://www.iknl.nl/cijfers/de-nederlandse-kankerregistratie.
  • 3. Ang K. K., Harris J., Wheeler R., et al. Human papillomavirus and survival of patients with oropharyngeal cancer. New England Journal of Medicine. 2010;363(1):24–35. doi: 10.1056/NEJMoa0912217
  • 4. Fakhry C., Westra W. H., Li S., et al. Improved survival of patients with human papillomavirus-positive head and neck squamous cell carcinoma in a prospective clinical trial. JNCI Journal of the National Cancer Institute. 2008;100(4):261–269. doi: 10.1093/jnci/djn011
  • 5. Nahavandipour Arvin, Jakobsen Kathrine Kronberg, Gronhoj Christian et al. “Incidence and survival of laryngeal cancer in Denmark: a nation-wide study from 1980 to 2014.” Acta Oncologica 2019; 58 (7)
  • 6. Gregoire V., Lefebvre J.-L., Licitra L. et al. "Squamous cell carcinoma of the head and neck: EHNS-ESMO-ESTRO Clinical Practice Guidelines for diagnosis, treatment and follow-up." Ann Oncol 2010; 21 Suppl 5: v184–186.
  • 7. Beaumont J., Acosta O., Devillers A. et al. “Voxel-based identification of local recurrence sub-regions from pre-treatment PET/CT for locally advanced head and neck cancers.” EJNMMI Res 2019; 9: 90. doi: 10.1186/s13550-019-0556-z
  • 8. Trosman S. J., Koyfman S.A., Ward M.C. et al. "Effect of human papillomavirus on patterns of distant metastatic failure in oropharyngeal squamous cell carcinoma treated with chemoradiotherapy." JAMA Otolaryngol Head Neck Surg 2015; 141(5): 457–462. doi: 10.1001/jamaoto.2015.136
  • 9. Leger S., Zwanenburg A., Pilz K. et al. "A comparative study of machine learning methods for time-to-event survival data for radiomics risk modelling." Sci Rep 2017; 7(1): 13206. doi: 10.1038/s41598-017-13448-3
  • 10. Giraud P., Giraud P., Gasnier A. et al. "Radiomics and Machine Learning for Radiotherapy in Head and Neck Cancers." Front Oncol 2019; 9: 174. doi: 10.3389/fonc.2019.00174
  • 11. Alsahafi E., Begg K., Amelio I. et al. "Clinical update on head and neck cancer: molecular biology and ongoing challenges." Cell Death Dis 2019; 10(8): 540. doi: 10.1038/s41419-019-1769-9
  • 12. Peltanova B., Raudenska M., Masarik M. et al. "Effect of tumor microenvironment on pathogenesis of the head and neck squamous cell carcinoma: a systematic review." Mol Cancer 2019; 18(1): 63. doi: 10.1186/s12943-019-0983-5
  • 13. Gallagher M., Hogan J., Maire F. et al. “Intelligent Data Engineering and Automated Learning” LNCS 2005: 382–383
  • 14. Lambin P., Leijenaar R.T.H., Deist T.M. et al. "Radiomics: the bridge between medical imaging and personalized medicine." Nat Rev Clin Oncol 2017; 14(12): 749–762. doi: 10.1038/nrclinonc.2017.141
  • 15. Johnson W. E., Li C., Rabinovic A. et al. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007; 8(1): 118–127. doi: 10.1093/biostatistics/kxj037
  • 16. Orlhac F., Frouin F., Nioche C. et al. Validation of a Method to Compensate Multicenter Effects Affecting CT Radiomics. Radiology 2019; 291(1): 52–58.
  • 17. Lucia F., Visvikis D., Vallieres M. et al. "External validation of a combined PET and MRI radiomics model for prediction of recurrence in cervical cancer patients treated with chemoradiotherapy." Eur J Nucl Med Mol Imaging 2019; 46(4): 864–877. doi: 10.1007/s00259-018-4231-9
  • 18. Campbell S., Poon I., Markel D. et al. "Evaluation of microscopic disease in oral tongue cancer using whole-mount histopathologic techniques: implications for the management of head-and-neck cancers." Int J Radiat Oncol Biol Phys 2012; 82(2): 574–581. doi: 10.1016/j.ijrobp.2010.09.038
  • 19. Dou T. H., Coroller T.P., van Griethuysen J.J.M. et al. "Peritumoral radiomics features predict distant metastasis in locally advanced NSCLC." PLoS ONE 2018; 13(11): e0206108. doi: 10.1371/journal.pone.0206108
  • 20. Shan Q. Y., Hu H., Feng S. et al. "CT-based peritumoral radiomics signatures to predict early recurrence in hepatocellular carcinoma after curative tumor resection or ablation." Cancer Imaging 2019; 19(1): 11. doi: 10.1186/s40644-019-0197-5
  • 21. Garavello W., Ciardo A., Spreafico R. et al. "Risk factors for distant metastases in head and neck squamous cell carcinoma." Arch Otolaryngol Head Neck Surg 2006; 132(7): 762–766. doi: 10.1001/archotol.132.7.762
  • 22. Kim D. H., Kim W.T., Lee J.H. et al. "Analysis of the prognostic factors for distant metastasis after induction chemotherapy followed by concurrent chemoradiotherapy for head and neck cancer." Cancer Res Treat 2015; 47(1): 46–54. doi: 10.4143/crt.2013.212
  • 23. de Roest R. H., Mes S., Brink A. et al. "Molecular Characterization of Locally Relapsed Head and Neck Cancer after Concomitant Chemoradiotherapy." Clin Cancer Res 2019; OF1–OF9

Decision Letter 0

Jason Chia-Hsun Hsieh

6 Jan 2020

PONE-D-19-31722

Computed Tomography-Derived Radiomic Signature of Head and Neck Squamous Cell Carcinoma (Peri)tumoral Tissue for the Prediction of Locoregional Recurrence and Distant Metastasis After Concurrent Chemo-radiotherapy

PLOS ONE

Dear Dr. Sanduleanu, 

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Feb 20 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jason Chia-Hsun Hsieh, M.D. Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for your ethics statement: "According to the decisions of the Institutional Review Board and individual patient informed consent this study was performed following the guidelines of the Code of Conduct for Human Tissue and Medical Research (https://www.federa.org/codesconduct) and the EU General Data Protection Regulation."

a) Please amend your current ethics statement to include the full name of the ethics committee/institutional review board(s) that approved your specific study.

b) Please amend your current ethics statement to confirm that your named institutional review board or ethics committee specifically approved this study.

c) Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was suitably informed and (2) what type you obtained (for instance, written or verbal). If your study included minors under age 18, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

4. We noticed you have some minor occurrence(s) of overlapping text with the following previous publication(s), which needs to be addressed:

https://doi.org/10.1371/journal.pone.0206108

In your revision ensure you cite all your sources (including your own works), and quote or rephrase any duplicated text outside the Methods section. Further consideration is dependent on these concerns being addressed.

5. Please provide an updated Competing Interests Statement in your cover letter that includes the following:

"Dr. Philippe Lambin reports, within and outside the submitted work, grants/sponsored research agreements from Varian medical, Oncoradiomics, ptTheragnostic, Health Innovation Ventures and DualTpharma. He received an advisor/presenter fee and/or reimbursement of travel costs/external grant writing fee and/or in kind manpower contribution from Oncoradiomics, BHV, Merck and Convert pharmaceuticals. Dr Lambin has shares in the company Oncoradiomics SA and Convert pharmaceuticals SA and is co-inventor of two issued patents with royalties on radiomics (PCT/NL2014/050248, PCT/NL2014/050728) licensed to Oncoradiomics and one issue patent on mtDNA (PCT/EP2014/059089) licensed to ptTheragnostic/DNAmito, three non-patentable invention (softwares) licensed to  ptTheragnostic/DNAmito, Oncoradiomics and Health Innovation Ventures. Dr. Woodruff has (minority) shares in the company Oncoradiomics. CR Leemans and RH Brakenhoff have received financial support from GenMab BV, InteRNA Technologies BV and Bristol Myers-Squibb, all outside the presented work. Ralph Leijenaar has OncoRadiomics shares."

6. We note that you have reported significance probabilities of 0 in places. Since p=0 is not strictly possible, please correct this to a more appropriate limit, e.g. 'p<0.0001'.

7. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

8. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Additional Editor Comments:

The "clinical model" prominently featured in the results must be described in methods. Due to the negative finding, conclusions/discussion should also look into whether the finding is due to implementation or theoretical limitations.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Major Comments:

1) In general, the paper is well written and the study is executed carefully.

2) Clinical models are a big part of results but are never defined. Please make sure to define how they are derived in methods.

3) A negative finding in an engineering paper can always mean that the method’s implementation was not good enough (e.g. different radiomics features or different preprocessing would have made the method useful). To help secure the paper’s niche in the field, please review the conclusions and limitations to focus specifically on the type of analysis and features of the data (peritumoral/inter tumoral masks) and describe why the prediction may be failing, what improvements could be made, and what is a theoretical limitation. For example, the relatively thin peritumoral layer likely poses an boundary-effect issue for wavelet decomposition, while the radiomics kernels may have a lot of missing values.

Minor Comments:

Abstract: Meaning of “clinical” model is not clear at this point in the text.

In the section “Univariable analysis and generation of multivariable models” please affirm very explicitly that the feature selection was done based on analysis of training results, not holdout data.

WHO PS is not defined

“More homogenous cohorts of patients and the combination of clinical, radiomics and genomics models may help to generate predictive models in the future, and include genetic/ radiomics analyses of index tumors and relapses.” – this part of conclusions maybe should be avoided. If radiomics features were found to have poor performance, their inclusion may not improve the prediction of the overall model, so this statement becomes quite speculative.

“In this study, we have investigated whether clinical data as well as computer-extracted radiomic features from peritumoral as well as inter-tumoral derived imaging features on CT can predict OS, LRR and DM. Our results show that radiomic features from the primary peritumoral regions do not predict OS, LRR and DM.” – Please also state about inter-tumoral to have parallel logical flow.

Reviewer #2: This paper describes a negative finding indicating that peritumoral Radiomics from CT does not help to predict overall survival (OS), locoregional recurrence (LRR), or distant metastases (DM), which is opposite to the hypothesis. The paper is easy to follow and understand. I have the following comments that might help the authors improving the manuscript.

Based on Table 1, it seems that the validation set and training set have lots of differences (WHO, etc.), did such differences cause difficulties in the cross-validation? It seems that the authors have used a stringent validation strategy, i.e., training based on one dataset and validation using another. While it is good and desirable, case is not such ideal. Based on Table 1, it seems that the validation set and training set have lots of differences (WHO, etc.), did such differences cause difficulties in the cross-validation for the failure of replication? The 2-fold validation may also be changed to 5 fold or 10 fold if the performance is poor.

As for the multivariate Cox model, the sorted univariate-based features were gradually (forward) fed in until the first peak. This part is unclear to me. What if there is no peak or there are many trivial peaks (noisy)? How to determine a valid peak? It is helpful to understand if the authors could provide more details. Also, an occurrence frequency threshold of 50% is kind of arbitrary. Any consideration when determine such a threshold?

While multivariate Cox model is easier to understand, the multivariate version of RSF deserves a more detailed description. How does it handle the centered data for this time-to-event type of analysis? RF is easier to understand and widely used, better state the difference between RSF and RF.

Minor issues:

In the last paragraph of Introduction, “HPV-ve” should be “HPV-negative”?

“Multivariable radiomic Cox models were generated on the significant features in univariate cox” seems redundant.

There are some typos.

“Multivariable clinical RSF models were generated based on selecting all features with a relative feature importance >0 in the Random Survival Forest. Multivariable radiomic RSF models were generated based on the number of features corresponding to the first peak in C-index value in the out-of-bag cases after consequently adding features based on decreasing relative importance in the Random Survival Forest. The rule used to obtain the optimal amount of features is at the first peak OR (depending on the C-index graph if there is an oscillation pattern that you describe) after the peak until the C-index drops more than 0.02.” -- This part may contain multiple repeated descriptions and unclear descriptions. Please elaborate.

The peritumor region of interests were defined including regions 3mm and 5mm away from the tumor entity borderline. Why using two different distances? Will the results become better if increasing such a distance (e.g., to 1 cm)?

I did not see the definition of “clinical model”. Was that a model based purely on the clinical (rather than radiomics features)? Did the authors combine the clinical features and radiomics features together to predict survival? Tables 2-3 seem that the two types of methods were done separately.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 May 22;15(5):e0232639. doi: 10.1371/journal.pone.0232639.r002

Author response to Decision Letter 0


3 Mar 2020

Detailed response to reviewers manuscript:

‘Computed Tomography-Derived Radiomic Signature of Head and Neck Squamous Cell Carcinoma (Peri)tumoral Tissue for the Prediction of Locoregional Recurrence and Distant Metastasis After Concurrent Chemo-radiotherapy’

We would first like to thank both the editor and the reviewers for their comments to our manuscript. We have responded to each comment separately and made amendments to the manuscript accordingly. Please see our responses below. Note, each line number refers to the revised manuscript. Italic text refers to added or modified section in the manuscript.

Reviewer 1:

Major comments

1) In general, the paper is well written and study is executed carefully.

We appreciate the kind words.

2) Clinical models are a big part in results but are never defined. Please make sure to define how they are derived in the methods.

We would like to thank the reviewer for addressing this point. We have updated the methods section to include the methodology of the clinical models, similar to how the radiomics models were described. (page 8, lines 34 to 36).

“Multivariable clinical models included features selected through Cox-regression based on univariate significance (p<0.05) adjusted for multiple testing. The selected clinical features were then used to train multivariable Cox or RSF models.“
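As an illustration of this selection step, the sketch below adjusts univariable p-values for multiple testing and keeps the features that remain below 0.05. It makes two assumptions that are not stated in the text above: the Holm step-down procedure stands in for the unspecified correction method, and the feature names and p-values are hypothetical toy values. The function names `holm_adjust` and `select_features` are invented for this example.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


def select_features(univariable_p, alpha=0.05):
    """Keep features whose adjusted univariable p-value is below alpha."""
    names = list(univariable_p)
    adjusted = holm_adjust([univariable_p[name] for name in names])
    return [name for name, p in zip(names, adjusted) if p < alpha]


# Hypothetical univariable Cox p-values, for illustration only.
toy_p = {"age": 0.004, "who_ps": 0.03, "gender": 0.20, "n_stage": 0.01}
selected = select_features(toy_p)
```

With these toy values, `who_ps` is nominally significant at 0.05 but no longer passes after adjustment, which is the kind of feature the adjustment is meant to filter out before the multivariable model is trained.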

3) A negative finding in an engineering paper can always mean that the method’s implementation was not good enough (e.g. different radiomics features or different preprocessing would have made the method useful). To help secure the paper’s niche in the field, please review the conclusions and limitations to focus specifically on the type of analysis and features of the data (peritumoral/inter tumoral masks) and describe why the prediction may be failing, what improvements could be made, and what is a theoretical limitation. For example, the relatively thin peritumoral layer likely poses a boundary-effect issue for wavelet decomposition, while the radiomics kernels may have a lot of missing values.

We would like to thank the reviewer for addressing this point. We recognize that we had not fully described why the peritumoral/tumoral method we chose fails. A section has been added that describes these limitations, what we hypothesize could improve the current method, and why we think the current state of radiomics may also limit the possible effectiveness of the method (page 16, lines 24-29):

“Most radiomic features are designed to be extracted from a fully enclosed 3D volume, as is often the case with the primary tumor. In contrast, the peritumoral regions are rings with limited volume, especially the 3mm regions. Therefore, features such as those extracted from filtered images require a certain volume of the region of interest and therefore have limited application in small volumes or disjointed regions. These technical issues may have contributed to the relatively poor performance of peritumoral radiomics.”

Minor comments:

Abstract: Meaning of “clinical” model is not clear at this point of the text.

We would like to thank the reviewers for pointing out this lack of clarity. We have added the missing segment describing the clinical features to the abstract (page 3, methods section, line 16):

“Clinical patient features such as age, gender, performance status, etc. were collected.”

“More homogenous cohorts of patients and the combination of clinical, radiomics and genomics models may help to generate predictive models in the future, and include genetic/ radiomics analyses of index tumors and relapses.” – this part of conclusions maybe should be avoided. If radiomics features were found to have poor performance, inclusion may not improve the prediction of the overall model, so this statement becomes quite speculative.

We would like to thank the reviewer for this comment. We have removed the relevant section from the conclusion: “and include genetic/ radiomics analyses of index tumors and relapses.”, as this is indeed speculative. The previous section we have decided to keep, as the inhomogeneity between the 2 cohorts was found to be a deciding factor of the poor performance of the different models. The modified text now reads (lines 11-13 page 17):

“More homogenous cohorts, both in patient and imaging characteristics, and the combination of clinical, radiomics, and genomics models may increase the generalizability and predictive power of prognostic models.”

“In this study, we have investigated whether clinical data as well as computer-extracted radiomic features from peritumoral as well as inter-tumoral derived imaging features on CT can predict OS, LRR and DM. Our results show that radiomic features from the primary peritumoral regions do not predict OS, LRR and DM.” – Please also state about inter-tumoral to have parallel logical flow.

We would like to thank the reviewer for this comment. We have adjusted the section to also state that radiomics features from the primary inter-tumoral region do not predict OS, LRR and DM. Page 17, lines 8-10, now reads:

“Our results show that radiomic features from the primary peritumoral regions, as well as from the primary inter-tumoral regions, do not predict OS, LRR, and DM.”

Reviewer #2: This paper describes a negative finding indicating that peritumoral Radiomics from CT does not help to predict overall survival (OS), locoregional recurrence (LRR), or distant metastases (DM), which is opposite to the hypothesis. The paper is easy to follow and understand. I have the following comments that might help the authors improving the manuscript.

We would like to thank the reviewer for addressing these valid points that can help improve the quality of the manuscript.

Based on Table 1, it seems that the validation set and training set have lots of differences (WHO, etc.), did such differences cause difficulties in the cross-validation? It seems that the authors have used a stringent validation strategy, i.e., training based on one dataset and validation using another. While it is good and desirable, case is not such ideal. Based on Table 1, it seems that the validation set and training set have lots of differences (WHO, etc.), did such differences cause difficulties in the cross-validation for the failure of replication? The 2-fold validation may also be changed to 5 fold or 10 fold if the performance is poor.

There are indeed many differences between the training and validation sets, not only in WHO PS but also in, for example, chemotherapy regimen (training mostly cisplatin alone, validation cisplatin + others and cetuximab) and clinical node stage. One way to lower the bias in estimating the generalization error of the clinical models would indeed be to increase the number of folds. In this case we did not increase the number of folds because we have 1) a relatively large training dataset and 2) a separate validation dataset.

As for the multivariate Cox model, the sorted univariate-based features were gradually (forward) fed in until the first peak. This part is unclear to me. What if there is no peak or there are many trivial peaks (noisy)? How to determine a valid peak? It is helpful to understand if the authors could provide more details. Also, an occurrence frequency threshold of 50% is kind of arbitrary. Any consideration when determine such a threshold?

We would like to thank the reviewer for this valid concern. We determined a valid peak by visual inspection; if a noisy pattern existed, we determined the optimal number of features according to the point on the C-index graph after the first peak where the C-index drops no more than 0.02. The occurrence frequency threshold of 50% is indeed somewhat arbitrary; it was chosen simply to keep the features that were selected in the majority of the models. We have added this to the methods section (page 8, lines 27-33):

“These features were then ranked according to adjusted hazard ratios, where hazard ratios lower than 1 were inverted, and were gradually added to a multivariate Cox model until the first peak in the cross-validation testing C-index or after the first peak until the C-index drops by more than 0.02, depending on whether there is an oscillation or noise pattern leading to multiple peaks. The number of occurrences of each feature in all repetitions was determined, and a selection rate > 50% was used as threshold for the final set of features, ensuring that the selected features were chosen in the majority of the models.”
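
The selection procedure quoted above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the authors' script: the function names, the interpretation of the 0.02 rule (keep scanning past small dips and return the best point seen before the first drop larger than the tolerance), and the example numbers are all assumptions.

```python
from collections import Counter


def rank_by_hazard_ratio(hazard_ratios):
    """Rank features by hazard ratio, inverting ratios below 1 (assumed reading)."""
    strength = {name: max(hr, 1.0 / hr) for name, hr in hazard_ratios.items()}
    return sorted(strength, key=strength.get, reverse=True)


def n_features_at_stop(c_index_curve, tolerance=0.02):
    """Pick the number of features from a cross-validated C-index curve.

    c_index_curve[k - 1] is the C-index of the model with the top k features.
    The scan tolerates dips of at most `tolerance` below the best value so far
    and stops at the first larger drop; tolerance=0 stops at the first peak.
    """
    best_k, best_c = 1, c_index_curve[0]
    for k, c in enumerate(c_index_curve[1:], start=2):
        if c > best_c:
            best_k, best_c = k, c
        elif best_c - c > tolerance:
            break
    return best_k


def stable_features(selected_per_repetition, min_rate=0.5):
    """Keep features selected in more than `min_rate` of the repetitions."""
    n_reps = len(selected_per_repetition)
    counts = Counter(f for sel in selected_per_repetition for f in set(sel))
    return sorted(f for f, c in counts.items() if c / n_reps > min_rate)


# Hypothetical values, for illustration only.
ranked = rank_by_hazard_ratio({"x": 0.5, "y": 1.5, "z": 1.1})
curve = [0.60, 0.63, 0.62, 0.64, 0.61, 0.66]
k_noisy = n_features_at_stop(curve)               # tolerates the small dip at k=3
k_first_peak = n_features_at_stop(curve, 0.0)     # stops at the first peak
kept = stable_features([["a", "b"], ["a", "c"], ["a", "b"], ["b"]])
```

With a tolerance of zero the rule reduces to "stop at the first peak"; the 0.02 tolerance only matters when the curve oscillates.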

While the multivariate Cox model is easier to understand, the multivariate version of RSF deserves a more detailed description. How does it handle the censored data for this time-to-event type of analysis? RF is easier to understand and widely used; it would be better to state the difference between RSF and RF.

We have added more information about RSF and how this method handles a time-to-event type of analysis on page 9, lines 1-6, in the methods section:

“RSF strictly adheres to the prescription laid out by Breiman (2003) and requires taking into account the outcome (splitting criterion used in growing a tree must explicitly involve survival time and censoring information) in growing a random forest model. Further, the predicted value for a terminal node in a tree, the resulting ensemble predicted value from the forest, and the measure of prediction accuracy must all properly incorporate survival information.”
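
To make concrete what a splitting criterion that "explicitly involves survival time and censoring information" can look like, the sketch below computes the two-sample log-rank statistic for one candidate split of a node. The log-rank rule is a common choice for survival forests, but the quoted text does not say which splitting rule was used in this study, so treat the choice, the function name, and the toy data as assumptions.

```python
import math


def logrank_split_statistic(times, events, in_left):
    """Absolute log-rank Z statistic comparing the two daughter nodes of a split.

    times   : follow-up time per patient
    events  : 1 if the event was observed, 0 if the patient was censored
    in_left : True if the patient falls in the left daughter node
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation at time t (censored ones count
        # in the risk set until their censoring time).
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_left = sum(1 for i in at_risk if in_left[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d_left = sum(
            1 for i in at_risk if times[i] == t and events[i] and in_left[i]
        )
        observed_minus_expected += d_left - d * n_left / n
        if n > 1:
            variance += d * (n_left / n) * (1 - n_left / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return abs(observed_minus_expected) / math.sqrt(variance)


# Toy data: six patients, all with an observed event.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
separated = logrank_split_statistic(times, events, [True, True, True, False, False, False])
mixed = logrank_split_statistic(times, events, [True, False, True, False, True, False])
```

A split that separates early from late failures scores higher than one that mixes them, which is the sense in which the tree is grown on survival information rather than on a class label or a continuous outcome as in an ordinary random forest.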

Minor issues:

In the last paragraph of Introduction, “HPV-ve” should be “HPV-negative”?

Changes have been made.

“Multivariable radiomic Cox models were generated on the significant features in univariate cox” seems redundant.

We have updated the sentence to read: “Multivariable radiomic Cox models were generated using the significant features selected through univariate Cox modelling on the training dataset.”

There are some typos.

We attempted to rectify all of them.

“Multivariable clinical RSF models were generated based on selecting all features with a relative feature importance >0 in the Random Survival Forest. Multivariable radiomic RSF models were generated based on the number of features corresponding to the first peak in C-index value in the out-of-bag cases after consequently adding features based on decreasing relative importance in the Random Survival Forest. The rule used to obtain the optimal amount of features is at the first peak OR (depending on the C-index graph if there is an oscillation pattern that you describe) after the peak until the C-index drops more than 0.02.” -- This part may contain multiple repeated descriptions and unclear descriptions. Please elaborate.

Indeed multiple repeated/ unclear descriptions are given here. We have modified this sentence into:

“Multivariable radiomic RSF models were generated based on the optimal number of features corresponding to the first peak in C-index value in the out-of-bag cases or after the first peak until the C-index drops by more than 0.02, depending on whether there is an oscillation or noise pattern leading to multiple peaks. Features were added consecutively in order of decreasing relative importance in the Random Survival Forest.”
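
Since the stopping rule is driven by the C-index, a minimal sketch of how a concordance index can be computed on censored data may help. This is a plain implementation of Harrell's C-index under the usual pairing convention (a pair is comparable only if the patient with the shorter time had an observed event); the function name and toy values are hypothetical, and the authors' software may handle ties differently.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event and
    times[i] < times[j]. The pair is concordant when the model assigns the
    higher risk score to patient i; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: higher risk score should mean earlier failure.
perfect = harrell_c_index([1, 2, 3, 4], [1, 1, 1, 1], [4, 3, 2, 1])
reversed_order = harrell_c_index([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 3, 4])
uninformative = harrell_c_index([1, 2, 3, 4], [1, 1, 1, 1], [0, 0, 0, 0])
with_censoring = harrell_c_index([1, 2, 3], [0, 1, 1], [1, 3, 2])
```

A value of 0.5 corresponds to a model with no discriminative ability, which is why differences of the order of 0.02 between neighbouring points on the curve are easily explained by noise.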

The peritumoral regions of interest were defined as including regions 3 mm and 5 mm away from the tumor border. Why use two different distances? Would the results improve if this distance were increased (e.g., to 1 cm)?

Ideally, we would have liked to do a sensitivity analysis in which we gradually increase the peritumoral regions of interest. However, regions that are too small would have resulted in fewer radiomic features being calculated (some filters cannot be applied to small regions of interest), and peritumoral regions that are too large would have resulted in many regions falling outside the anatomical boundaries of the H&N region (e.g. in bone, cartilage and air).
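
The trade-off described here can be illustrated with a toy construction of a peritumoral ring on a 2D mask. The sketch assumes the ring is the set of non-tumor pixels within a given Euclidean distance of any tumor pixel; the response does not spell out how the 3 mm and 5 mm regions were actually generated, so the function, the pixel spacing, and the mask are hypothetical.

```python
def peritumoral_ring(mask, margin_mm, spacing_mm):
    """Return a 0/1 ring mask of non-tumor pixels within margin_mm of the tumor.

    mask       : 2D list of 0/1 values, 1 marking tumor pixels
    margin_mm  : ring width in millimetres
    spacing_mm : (row spacing, column spacing) of the pixel grid in millimetres
    """
    rows, cols = len(mask), len(mask[0])
    dy_mm, dx_mm = spacing_mm
    tumor = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    ring = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                continue  # the ring excludes the tumor itself
            if any(
                ((r - tr) * dy_mm) ** 2 + ((c - tc) * dx_mm) ** 2 <= margin_mm ** 2
                for tr, tc in tumor
            ):
                ring[r][c] = 1
    return ring


# A single tumor pixel in the middle of a 5 x 5 grid with 1 mm pixels.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1
size_1mm = sum(map(sum, peritumoral_ring(mask, 1.0, (1.0, 1.0))))
size_2mm = sum(map(sum, peritumoral_ring(mask, 2.0, (1.0, 1.0))))
size_half_mm = sum(map(sum, peritumoral_ring(mask, 0.5, (1.0, 1.0))))
```

A margin smaller than the pixel spacing yields an empty ring, while a large margin quickly reaches pixels that in a real scan could lie in bone, cartilage or air; a real pipeline would additionally have to exclude those voxels.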

I did not see the definition of “clinical model”. Was that a model based purely on the clinical (rather than radiomic) features? Did the authors combine the clinical features and radiomic features to predict survival? Tables 2-3 suggest that the two types of methods were applied separately.

The clinical model was indeed based solely on clinical variables. Both clinical and radiomic features appear to perform poorly in predicting any outcome. Combining the two would not result in any major improvement and would most likely confuse readers. We have therefore chosen to omit this additional analysis.

To clarify the clinical model building we have updated the text to read (page 8 lines 34-38):

“Multivariable clinical models included features selected through Cox-regression based on univariate significance (p<0.05) adjusted for multiple testing. The selected clinical features were then used to train multivariable Cox or RSF models.

Multivariable clinical RSF models were generated based on selecting all features with a relative feature importance >0 in the Random Survival Forest.”

Attachment

Submitted filename: 14022020 Detailed response to reviewers Tumor Ring SK_HW.docx

Decision Letter 1

Jason Chia-Hsun Hsieh

20 Apr 2020

Computed Tomography-Derived Radiomic Signature of Head and Neck Squamous Cell Carcinoma (Peri)tumoral Tissue for the Prediction of Locoregional Recurrence and Distant Metastasis After Concurrent Chemo-radiotherapy

PONE-D-19-31722R1

Dear Dr. Sanduleanu,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Jason Chia-Hsun Hsieh, M.D. Ph.D

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

All the questions from the reviewers were answered adequately.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Acceptance letter

Jason Chia-Hsun Hsieh

7 May 2020

PONE-D-19-31722R1

Computed Tomography-Derived Radiomic Signature of Head and Neck Squamous Cell Carcinoma (Peri)tumoral Tissue for the Prediction of Locoregional Recurrence and Distant Metastasis After Concurrent Chemo-radiotherapy

Dear Dr. Sanduleanu:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Jason Chia-Hsun Hsieh

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix. Datasets, imaging parameters and missing data.

    (DOCX)

    S2 Appendix. Full names of ALL the ethics committees/institutional review boards that approved study.

    (DOCX)

    S3 Appendix. TRIPOD adherence data extraction checklist.

    (DOCX)

    Attachment

    Submitted filename: 14022020 Detailed response to reviewers Tumor Ring SK_HW.docx

    Data Availability Statement

    Data cannot be shared publicly because we are not allowed to share the imaging data as third party (2 year moratorium on BD2DECIDE data). Radiomics/ clinical data and script are available from Github https://github.com/SebastianSanduleanu/Peritumoral-HN-Radiomics.git for researchers who meet the criteria for access to confidential data.


    Articles from PLoS ONE are provided here courtesy of PLOS
