The British Journal of Radiology. 2023 Sep 19;96(1150):20230172. doi: 10.1259/bjr.20230172

The quality and clinical translation of radiomics studies based on MRI for predicting Ki-67 levels in patients with breast cancer

Min Wang 1, Ting Mei 1, Youling Gong 1
PMCID: PMC10546437  PMID: 37724784

Abstract

Objective:

To evaluate the methodological quality of radiomics literature predicting Ki-67 levels based on MRI in patients with breast cancer (BC) and to propose suggestions for clinical translation.

Methods:

In this review, we searched PubMed, Embase, and Web of Science for studies published on radiomics in patients with BC. We evaluated the methodological quality of the studies using the Radiomics Quality Score (RQS). The Cochrane Collaboration’s software (RevMan 5.4), Meta-DiSc (v. 1.4) and IBM SPSS (v. 26.0) were used for all statistical analyses.

Results:

Eighteen studies met our inclusion criteria, and the average RQS was 10.17 (standard deviation [SD]: 3.54). None of these studies scored on any of the following items: a phantom study on all scanners, cut-off analyses, prospective study, cost-effectiveness analysis, or open science and data. In the meta-analysis, radiomics models based on the apparent diffusion coefficient (ADC) predicted the Ki-67 level better than those based on dynamic contrast-enhanced (DCE) MRI, with a pooled area under the curve (AUC) of 0.969.

Conclusion:

The Ki-67 index is a common tumor biomarker with high clinical value. Radiomics is an ever-growing quantitative data-mining method that helps predict tumor biomarkers from medical images. However, the quality of the reviewed studies, as evaluated by the RQS, was unsatisfactory, and there are ample opportunities for improvement. In the future, researchers should pay more attention to open science and data, external validation, phantom studies, publicly open radiomics databases and standardization of radiomics practice.

Advances in knowledge:

According to the RQS, the radiomics studies used to predict the Ki-67 level were of poor methodological quality. ADC performed better than DCE in radiomic prediction. We propose some measures to facilitate the clinical translation of radiomics.

Introduction

Surpassing lung cancer, breast cancer (BC) in females has become the most common cancer, according to global cancer statistics in 2020. 1 Ki-67, the cell proliferation index, plays an important role in the molecular subtype classification and subsequent therapy selection in BC. Luminal A and B are two main types of BC and are differentiated mainly by the level of Ki-67 and hormone receptor expression. Luminal B lesions tend to have a higher Ki-67 level, while Luminal A lesions have a lower level. 2 The cut-off value for the Ki-67 level remains contentious. As proposed by the St Gallen Consensus Conference, the cut-off value has changed three times: most recently in 2015, 3 it was changed to 20–29% from 20% in 2013 4 and before that, it was 14% in 2011. 5 BC with a high Ki-67 index is more likely to be heterogeneous and aggressive 6,7 and to have a higher risk of recurrence, 8 and its treatment differs from that of BC with a lower Ki-67 index. A higher Ki-67 index was a significant indicator of better efficacy of chemotherapy for BC, 9 and, when measured before neoadjuvant chemotherapy, a predictor of pathological complete response (pCR). 10 In contrast, patients classified as Luminal A are more sensitive to endocrine therapy. Therefore, it is crucial to measure the Ki-67 index before selecting a specific therapy for patients with BC.

However, the Ki-67 index must be determined by biopsy and histopathological staining, which requires both a long wait time and high cost. More importantly, the index is measured in a small sample taken from the tumor, which may not represent the status of the entire tumor because proliferative activity may vary between regions of the same tumor. 11 Radiology is considered a promising solution because it has the advantage of reflecting the whole tumor. 12 As an adjunct to mammography, MRI can yield a higher detection rate in females at high risk for BC. 13 Each MRI sequence has its own imaging mechanism and predictive power, which benefits radiomics data mining. Radiomics depends little on the experience level of the radiologist, and it can objectively extract more imaging information to reveal tumor biomarkers, help direct a particular treatment, track response to treatment, detect relapse and predict prognosis. 14–17 In addition, radiomics is expected to indicate the best sites for biopsy, thus reducing the error rate in pathology. 14

While the field of radiomics is growing rapidly and numerous studies are being continually published, there are obstacles to clinical translation because the current quality of the medical literature is low and some study results have been difficult to reproduce. 18,19 Therefore, the radiomics quality score (RQS) was proposed as a tool for evaluating the radiomics literature, with the aim of making radiomics more reliable in data selection, medical imaging, feature extraction, exploratory analysis, and modeling. 20

To the best of our knowledge, no systematic review has been conducted to assess the quality of radiomics literature based on MRI for predicting the Ki-67 level in patients with BC. Ki-67 is a promising index for clinical applications and radiomic modeling because it is expressed in almost all tumors and is strongly correlated with tumor size, 21–24 which means that radiomic models may be applicable to many other tumors, not only BC. Therefore, in our review article, we aimed to use RQS to systematically evaluate the methodological quality of the literature, and point out implications and limitations as well as offer suggestions for future research.

Methods and material

We conducted and reported this systematic review based on the prespecified criteria outlined by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Supplementary Table 1.). The study protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO, registry number CRD42022333994). No formal ethical approval was required for this study because this is a systematic review article.

Search strategy

Two investigators independently searched PubMed, Embase, and Web of Science. Only studies published in English were included. The coverage dates for this review began from each database’s inception and ended on May 8, 2022. Keywords “breast cancer”, “radiomics”, “Ki-67”, and “MRI” were used in the database searches. After removing duplicates, the titles and abstracts of studies were screened for eligibility by two investigators. Any disagreements were adjudicated by a third investigator to reach a consensus. All studies deemed eligible by title and abstract screening underwent a full-text review by two independent investigators using the same criteria. Studies were eligible for inclusion if: (1) patients who were diagnosed with BC had undergone a biopsy for Ki-67 level and MRI before treatment and (2) radiomics was used to estimate the Ki-67 level in patients with BC. Studies comprising animal subjects, articles such as letters to the editor, reviews, and studies lacking data were excluded. We also manually screened the reference lists of all included studies and key journals in the field to include all potentially eligible studies.

Data extraction

We extracted items for the characteristics of the included studies such as first author, year of publication, study design, MRI magnet strength, MRI manufacturers, MRI sequence, region of interest (ROI) segmentation methods, feature process software, sample size, sensitivity, specificity and area under the curve (AUC) of models. The summary tables were thoroughly and independently reviewed by two investigators. Study authors were contacted for additional information or missing data if necessary.

Methodological quality and risk of bias assessment

Two investigators assessed the methodological quality of the included studies using the RQS, which comprises 16 items with a maximum total score of 36; scoring details are given in Table 1. We also used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool to evaluate the risk of bias of the diagnostic studies in four key domains: the selection of cases, the implementation or interpretation of the index test, the implementation and interpretation of the gold standard, and the flow and timing of cases. The risk of bias was assigned to one of three categories: “low risk”, “high risk” or “unclear risk”. If the answer to all signalling questions was “yes”, the risk of bias was assessed as low; if the answer to any question was “no”, there was a possibility of bias. The Cochrane Collaboration’s software (RevMan 5.4) was used to complete this evaluation process and generate the corresponding figures. Two investigators performed the assessments independently, and conflicts regarding scoring were resolved through discussion with an expert.
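The QUADAS-2 judgement rule described above can be sketched in a few lines. This is a minimal illustration, not the RevMan implementation: the function name `domain_risk` is hypothetical, and mapping any “no” answer to “high risk” is an assumption, since the text only says that a “no” indicates a possibility of bias.

```python
from typing import Iterable


def domain_risk(answers: Iterable[str]) -> str:
    """Map the answers for one QUADAS-2 domain to a risk-of-bias label.

    Each answer is expected to be 'yes', 'no' or 'unclear'.
    Assumption: any 'no' is reported here as 'high risk'.
    """
    normalized = [a.strip().lower() for a in answers]
    if not normalized:
        raise ValueError("at least one answer is required")
    if any(a == "no" for a in normalized):
        return "high risk"      # a single 'no' flags possible bias
    if all(a == "yes" for a in normalized):
        return "low risk"       # every question answered 'yes'
    return "unclear risk"       # no 'no', but at least one non-'yes' answer


if __name__ == "__main__":
    print(domain_risk(["yes", "yes", "yes"]))
    print(domain_risk(["yes", "unclear"]))
    print(domain_risk(["yes", "no", "unclear"]))
```

In practice the final judgement for a domain remains a reviewer decision; the rule above only encodes the default described in the text.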

Table 1.

The RQS a

| No. | Criteria | Points |
|---|---|---|
| 1 | Image protocol quality: well-documented image protocols (e.g. contrast, slice thickness, energy, etc.) and/or usage of public image protocols allow reproducibility/replicability | +1 (if protocols are well-documented); +1 (if public protocol is used) |
| 2 | Multiple segmentations: possible actions are segmentation by different physicians/algorithms/software, perturbing segmentations by (random) noise, segmentation at different breathing cycles. Analyze feature robustness to segmentation variabilities | +1 |
| 3 | Phantom study on all scanners: detect interscanner differences and vendor-dependent features. Analyze feature robustness to these sources of variability | +1 |
| 4 | Imaging at multiple time points: collect images of individuals at additional time points. Analyze feature robustness to temporal variabilities (e.g. organ movement, organ expansion/shrinkage) | +1 |
| 5 | Feature reduction or adjustment for multiple testing: decreases the risk of overfitting. Overfitting is inevitable if the number of features exceeds the number of samples. Consider feature robustness when selecting features | −3 (if neither measure is implemented); +3 (if either measure is implemented) |
| 6 | Multivariable analysis with non-radiomics features (e.g. EGFR mutation): is expected to provide a more holistic model. Permits correlating/inferencing between radiomics and non-radiomics features | +1 |
| 7 | Detect and discuss biological correlates: demonstration of phenotypic differences (possibly associated with underlying gene–protein expression patterns) deepens understanding of radiomics and biology | +1 |
| 8 | Cut-off analyses: determine risk groups by either the median, a previously published cut-off or report a continuous risk variable. Reduces the risk of reporting overly optimistic results | +1 |
| 9 | Discrimination statistics: report discrimination statistics (e.g. C-statistic, ROC curve, AUC) and their statistical significance (e.g. p-values, confidence intervals). One can also apply a resampling method (e.g. bootstrapping, cross-validation) | +1 (if a discrimination statistic and its statistical significance are reported); +1 (if a resampling method technique is also applied) |
| 10 | Calibration statistics: report calibration statistics (e.g. calibration-in-the-large/slope, calibration plots) and their statistical significance (e.g. p-values, confidence intervals). One can also apply a resampling method (e.g. bootstrapping, cross-validation) | +1 (if a calibration statistic and its statistical significance are reported); +1 (if a resampling method technique is also applied) |
| 11 | Prospective study registered in a trial database: provides the highest level of evidence supporting the clinical validity and usefulness of the radiomics biomarker | +7 (for prospective validation of a radiomics signature in an appropriate trial) |
| 12 | Validation: the validation is performed without retraining and without adaptation of the cut-off value, and provides crucial information with regard to credible clinical performance. *Data sets should be of comparable size and should have at least 10 events per model feature | −5 (if validation is missing); +2 (if validation is based on a data set from the same institute); +3 (if validation is based on a data set from another institute); +4 (if validation is based on two data sets from two distinct institutes); +4 (if the study validates a previously published signature); +5 (if validation is based on three or more data sets from distinct institutes) |
| 13 | Comparison to ‘gold-standard’: assess the extent to which the model agrees with/is superior to the current ‘gold-standard’ method (e.g. TNM-staging for survival prediction). This comparison shows the added value of radiomics | +2 |
| 14 | Potential clinical utility: report on the current and potential application of the model in a clinical setting (e.g. decision curve analysis) | +2 |
| 15 | Cost-effectiveness analysis: report on the cost-effectiveness of the clinical application (e.g. QALYs generated) | +1 |
| 16 | Open science and data: make code and data publicly available. Open science facilitates knowledge transfer and reproducibility of the study | +1 (if scans are open source); +1 (if region of interest segmentations are open source); +1 (if code is open source); +1 (if radiomics features are calculated on a set of representative ROIs and the calculated features and representative ROIs are open source) |
| | Total points | 36 = 100% |

AUC, area under the curve; EGFR, epidermal growth factor receptor; ROI, region of interest; RQS, radiomics quality score.

a The RQS was described in earlier work. 20
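To make the scoring scheme in Table 1 concrete, the following sketch encodes the permitted points per item and totals them. It is an illustration only: the item keys, the function names and the example study are hypothetical, and the allowed point values are transcribed from Table 1 under the assumption that an item not fulfilled scores 0 unless the table specifies a deduction.

```python
# Allowed point values per RQS item, transcribed from Table 1.
# Assumption: an unfulfilled item scores 0 unless Table 1 lists a deduction.
RQS_ALLOWED = {
    "image_protocol_quality": (0, 1, 2),
    "multiple_segmentations": (0, 1),
    "phantom_study": (0, 1),
    "multiple_time_points": (0, 1),
    "feature_reduction": (-3, 3),
    "multivariable_analysis": (0, 1),
    "biological_correlates": (0, 1),
    "cut_off_analyses": (0, 1),
    "discrimination_statistics": (0, 1, 2),
    "calibration_statistics": (0, 1, 2),
    "prospective_study": (0, 7),
    "validation": (-5, 2, 3, 4, 5),
    "gold_standard_comparison": (0, 2),
    "clinical_utility": (0, 2),
    "cost_effectiveness": (0, 1),
    "open_science_and_data": (0, 1, 2, 3, 4),
}

# Maximum attainable score: the sum of the best value of every item (36).
RQS_MAX = sum(max(values) for values in RQS_ALLOWED.values())


def rqs_total(item_scores: dict) -> int:
    """Sum the item scores after checking that all 16 items are valid."""
    if set(item_scores) != set(RQS_ALLOWED):
        raise ValueError("scores must be given for exactly the 16 RQS items")
    for item, score in item_scores.items():
        if score not in RQS_ALLOWED[item]:
            raise ValueError(f"{score} is not a permitted score for {item}")
    return sum(item_scores.values())


def rqs_percent(total: float) -> float:
    """Express a total (or mean) RQS as a percentage of the maximum."""
    return 100.0 * total / RQS_MAX


# Hypothetical study, not one of the 18 reviewed papers.
example_study = {
    "image_protocol_quality": 1, "multiple_segmentations": 1,
    "phantom_study": 0, "multiple_time_points": 0, "feature_reduction": 3,
    "multivariable_analysis": 1, "biological_correlates": 1,
    "cut_off_analyses": 0, "discrimination_statistics": 2,
    "calibration_statistics": 0, "prospective_study": 0, "validation": 2,
    "gold_standard_comparison": 2, "clinical_utility": 0,
    "cost_effectiveness": 0, "open_science_and_data": 0,
}

if __name__ == "__main__":
    total = rqs_total(example_study)
    print(total, round(rqs_percent(total), 2))
```

Requiring a score for every item avoids silently treating a missing feature-reduction or validation entry as 0, when Table 1 assigns a deduction in those cases.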

Statistical analyses

The mean and variance were used for the statistical description of the RQS. We collected the sample size, sensitivity and specificity of each model based on ADC and DCE, and studies without sufficient data were excluded from data synthesis. The numbers of true-positive (TP), false-positive (FP), true-negative (TN) and false-negative (FN) cases were used to derive the pooled AUC value. The Cochrane Collaboration’s software (RevMan 5.4), Meta-DiSc (v. 1.4) and IBM SPSS (v. 26.0) were used for all statistical analyses.
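As an illustration of how 2×2 counts relate to the reported accuracy measures, the sketch below computes sensitivity and specificity from TP, FP, TN and FN and pools several tables by summing their cells. The class and function names and all counts are hypothetical, and summing cells is a simplifying assumption; it is not necessarily the pooling model applied by Meta-DiSc in this review.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a diagnostic 2x2 table for one radiomics model."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def sensitivity(self) -> float:
        # Proportion of high Ki-67 cases that the model labels as high.
        if self.tp + self.fn == 0:
            raise ValueError("no positive reference cases")
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of low Ki-67 cases that the model labels as low.
        if self.tn + self.fp == 0:
            raise ValueError("no negative reference cases")
        return self.tn / (self.tn + self.fp)


def pool_by_summing(tables: Iterable[TwoByTwo]) -> TwoByTwo:
    """Pool studies by adding up each cell (a simplifying assumption)."""
    tables = list(tables)
    return TwoByTwo(
        tp=sum(t.tp for t in tables),
        fp=sum(t.fp for t in tables),
        tn=sum(t.tn for t in tables),
        fn=sum(t.fn for t in tables),
    )


if __name__ == "__main__":
    # Hypothetical counts, not taken from the included studies.
    studies = [TwoByTwo(tp=40, fp=10, tn=40, fn=10),
               TwoByTwo(tp=30, fp=5, tn=45, fn=20)]
    pooled = pool_by_summing(studies)
    print(pooled.sensitivity, pooled.specificity)
```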

Results

Search results

After removing duplicates, 56 unique articles were identified from the electronic databases and screened by reading their abstracts. Finally, 18 studies fulfilled our inclusion criteria (Figure 1).

Figure 1.

The flow diagram of the study selection process.

Study characteristics

The characteristics and main results of the studies included are summarized in Table 2. These studies were published between 2017 and 2022, and their sample sizes varied from 27 to 922. In total, 4019 participants were included.

Table 2.

Characteristics of the 18 studies included

| Author | Year | Study design | Number of patients | AUC, training set | AUC, validation set | AUC if no grouping |
|---|---|---|---|---|---|---|
| Fan | 2019 | Retrospective | 144 | NA | NA | S50+ADC and Multitask: 0.821 ± 0.074 |
| Fan | 2020 | Retrospective | 322 | NA | NA | original: 0.755; bicubic: 0.770; SRGAN: 0.801; EDSR: 0.818 |
| Kayadibi | 2021 | Retrospective | 154 | 14%: ADC: 0.785, DCEI: 0.696, BOTH: 0.755; 20%: ADC: 0.744, DCE: 0.629, BOTH: 0.761 | 14%: ADC: 0.849, DCEI: 0.695, BOTH: 0.635; 20%: ADC: 0.617, DCE: 0.741, BOTH: 0.618 | NA |
| Liang | 2018 | Retrospective | 318 | T2WI: 0.762 | T2WI: 0.74 | NA |
| Liu | 2020 | Retrospective | 328 | T2WI: 0.727; T1+C: 0.873; DWI: 0.674; multiparametric MRI: 0.888 | T2WI: 0.706; T1+C: 0.829; DWI: 0.643; multiparametric MRI: 0.875 | NA |
| Ma | 2018 | Retrospective | 377 | NA | NA | Naive Bayes: 0.773; SVM: 0.665; kNN: 0.489 |
| Zhang | 2019 | Retrospective | 128 | 0.75 ± 0.08 | 0.72 | NA |
| Castaldo | 2022 | Retrospective | 32 | NA | NA | LDA: 0.81; Logit Boost: 0.70; RF: 0.76 |
| He | 2020 | Retrospective | 88 | NA | NA | The first post-contrast frame: the AUC values among various regions of tumor and peritumoral stroma were all higher than 0.6 |
| Holli-Helenius | 2017 | Retrospective | 27 | NA | NA | Pre-contrast images: Sum entropy: 0.827 ± 0.065; Sum variance: 0.832 ± 0.076; Sum entropy+Sum variance: 0.876 ± 0.077 |
| Jiang | 2021 | Retrospective | 209 | DCE: 0.825; DWI: 0.832; Both: 0.863 | DCE: 0.816; DWI: 0.795; Both: 0.838 | NA |
| Li | 2021 | Retrospective | 351 | Intra RS: 0.845; Peri RS: 0.835; Com RS: 0.875 | Intra RS: 0.714; Peri RS: 0.692; Com RS: 0.749 | NA |
| Monti | 2018 | Retrospective | 49 | NA | NA | 0.811 ± 0.005 |
| Ni | 2020 | Retrospective | 112 | NA | NA | Histogram: 0.926 (0.870–0.981); Texture: 0.718 (0.615–0.820); GLCM: 0.975 (0.949–1.000); RLM: 0.946 (0.905–0.988) |
| Saha | 2018 | Retrospective | 922 | NA | 0.624 | NA |
| Zhou | 2020 | Retrospective | 126 | Ktrans: 0.67; Kep: 0.69; Ve: 0.72; Vp: 0.74 | NA | NA |
| Umutlu | 2021 | Retrospective | 124 | NA | NA | All MRI and PET: 0.997 |
| Zhang | 2021 | Retrospective | 208 | NA | NA | The feature of Dependence Non-uniformity Normalized obtained the highest AUC of 0.8923 for Ki-67; the 3D tumor subregion related to FFK yielded the highest AUC of 0.8080 for Ki-67 |

ADC, apparent diffusion coefficient; AUC, area under the curve; Com RS, combined radiomics signatures; DCE-MRI, dynamic contrast-enhanced magnetic resonance imaging; DWI, diffusion-weighted imaging; EDSR, enhanced deep super resolution; FFK, fast-flow kinetics; GLCM, gray level co-occurrence matrix; Intra RS, intratumoral radiomics signatures; Kep, flux rate constant; Ki-67, cell proliferation index; Ktrans, volume transfer constant; LDA, linear discriminant analysis; PET, positron emission tomography; Peri RS, peritumoral radiomics signatures; Pk-DCE MRI, pharmacokinetic dynamic contrast-enhanced magnetic resonance imaging; RF, random forest decision trees; RLM, run length matrix; ROI, region of interest; S50, the last time period of the DCE-MRI series; SRGAN, super-resolution generative adversarial network; SVM, support vector machine; T1+C, contrast-enhanced T1-weighted imaging; T2WI, T2-weighted imaging; Ve, extravascular extracellular volume fraction; Vp, plasma volume fraction; kNN, k-nearest neighbor.

Risk of bias assessment

QUADAS-2 was used in our risk of bias assessment. The results showed that most studies had a moderate risk of bias (Figure 2). The unclear risk was mainly in the index test, including the bias and applicability (Figure 3).

Figure 2.

The risk of bias summary for each study.

Figure 3.

The risk of bias graph for the 18 included studies.

RQS distribution of the 18 studies

The RQS has 16 items, and its maximum possible total score is 36 (100% on a percentage scale). The mean RQS of the 18 included studies was 10.17 (28.25%; standard deviation [SD]: 3.54). As shown in Figure 4, no study scored more than half of the maximum possible RQS. The RQS ranged from 0 to 13, with two studies receiving the highest score. Three studies had an RQS of less than 10; all of them lacked feature reduction, so three points were deducted from each. The study by He et al received a score of zero in our evaluation, the lowest RQS.

Figure 4.

Calculation of the RQS for the 18 included studies. Maximum possible RQS is 36. Shading in the figure is as follows: grey indicates where points were deducted, light green indicates zero points, dark green indicates not fully scored items and orange indicates fully scored. RQS, radiomics quality score.

In our review of the 18 studies, no study received any points for the following five RQS items: a phantom study on all scanners, cut-off analyses, prospective design, cost-effectiveness analysis, and open science and data. The sum of possible scores for these five items is 14 points (14 of the maximum possible RQS of 36, 38.89%). In contrast, all studies received a score on three items: detect and discuss biological correlates, discrimination statistics, and comparison to the gold standard. Four of 18 (22.22%) studies lost three points on feature reduction. One of 18 (5.56%) studies lost five points on validation. The other 17 studies all used internal data set validation, and 8 of them adopted cross-validation rather than independent validation. The maximum of five points for validation is awarded when three or more external data sets are used, but none of the studies performed external validation. Of note, only one study (Jiang et al) received points for the potential clinical utility item, for its use of decision curve analysis (DCA) and a nomogram to advance the development of clinical applications. It was also the only study that used calibration curves of the nomogram to compare the model-predicted and actual probabilities, so it also received points for the calibration statistics item.
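The share of the maximum score taken up by the unscored items can be checked from the item maxima in Table 1 (phantom study 1, cut-off analyses 1, prospective study 7, cost-effectiveness analysis 1, open science and data 4). A worked version of that arithmetic:

```latex
\[
1 + 1 + 7 + 1 + 4 = 14,
\qquad
\frac{14}{36} \approx 0.3889 = 38.89\%
\]
```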

Predictive power of different MRI sequences

In the 18 studies, apparent diffusion coefficient (ADC), diffusion-weighted imaging (DWI), T2-weighted imaging (T2WI), dynamic contrast-enhanced MRI (DCE-MRI), and pharmacokinetic DCE (Pk-DCE) were the sequences used for ROI segmentation, feature selection, and subsequent quantitative analysis. Our study compared the predictive power of ADC with that of DCE in radiomics prediction of the Ki-67 level (cut-off value: 14%). Six studies were included in the meta-analysis, and eight data sets were pooled. As shown in Table 3, ADC showed a higher predictive power, with a pooled AUC of 0.969 compared to 0.944 for DCE. However, the analysis was highly heterogeneous, with a Spearman’s rank correlation coefficient (rho) of 0.8; therefore, a summary ROC (SROC) curve was used. Figure 5 shows the SROC curve and forest plot.
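A Spearman rank correlation of this kind is commonly computed between each data set's sensitivity and its false-positive rate (1 − specificity); a strong positive value suggests that differences between studies partly reflect different decision thresholds. The sketch below is a plain-Python illustration with hypothetical values, not a reproduction of the Meta-DiSc calculation. Because the statistic depends only on ranks, applying a monotonic transform such as the logit to both inputs would leave the result unchanged.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-data-set values, not the data pooled in this review.
    sensitivity = [0.70, 0.75, 0.80, 0.85, 0.90]
    false_positive_rate = [0.10, 0.15, 0.25, 0.20, 0.30]
    print(round(spearman_rho(sensitivity, false_positive_rate), 3))
```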

Table 3.

ADC vs DCE in radiomics predictive power for the Ki-67 level

| MRI sequence | Sensitivity | Specificity | AUC of sROC |
|---|---|---|---|
| ADC | 0.801 | 0.844 | 0.969 |
| DCE | 0.802 | 0.819 | 0.944 |

ADC, apparent diffusion coefficient; AUC, area under the curve; DCE, dynamic contrast-enhanced; sROC, summary ROC.

Figure 5.

The SROC curve and forest plot. ADC, apparent diffusion coefficient; DCE, dynamic contrast-enhanced; SROC, summary ROC.

Discussion

Using the RQS as an evaluation tool, this review provides a systematic analysis of the methodological quality of 18 studies that used radiomics based on MRI to predict the Ki-67 level in a total of more than 4000 BC patients. While the results of this review support the efficacy of the booming field of radiomics in predicting the Ki-67 level, the studies we included were limited by the poor methodological quality.

MRI has the advantage of multisequence imaging and plays an important role in breast imaging diagnosis. Some findings have shown that ADC performed well in predicting BC biomarkers, especially the Ki-67 index. 25,26 In radiomics, researchers disagree about which sequence yields the higher AUC. 27–31 Our review compared the summary AUC of models based on DCE and ADC. While DCE was used most commonly, ADC showed a better AUC. Of note, Kayadibi et al compared two Ki-67 cut-off values (14% vs 20%) and observed that the radiomics model based on 14% produced a significantly higher AUC (0.849 vs 0.614 in the testing set). Our evaluation of the 18 included studies showed that most studies (12/18) chose 14% as the cut-off value. However, as mentioned above, the cut-off value is contentious and has been changed multiple times. A long-term follow-up study showed that 20% was optimal for predicting the prognosis in luminal BC; this cut-off value can better distinguish the risk of recurrence and death than 14%. Which cut-off is more suitable for radiomics in BC remains an open question. 32

Improvement for radiomics quality is urgently needed

The RQS, a tool promoted to quantify the quality of radiomics studies, has unique benefits: it is a pioneering standard in the radiomics field, widely adaptable, and easy to administer. Using 16 items, the RQS can comprehensively evaluate whether a radiomics study is of good quality or where it is deficient. Lambin et al 20 put more emphasis on four items (a prospective study, feature selection, validation, and open science and data), suggesting that these items might be more important for reflecting whether a study is of good quality. The prospective study design is recommended because it provides the highest level of evidence, helps construct robust radiomics models and supports clinical validity. Disappointingly, none of the studies we evaluated were designed prospectively.

Four studies lost points on feature reduction. The included study by Saha et al 33 is an example: it extracted 529 DCE-MRI features in order to compare the differing AUCs obtained by previous radiomics studies, which had used small, limited, and varied feature sets. However, compiling such a comprehensive set of features exposed the investigators to overfitting, which could jeopardize the performance of the model. 20 In the end, the AUCs of their models were all below 0.7, an inferior result compared to other studies on this topic.

In our review, none of the 18 studies received any points for four items (phantom study on all scanners, cut-off analyses, cost-effectiveness analysis, and open science and data), indicating that these aspects receive insufficient attention in the real-world practice of radiomics. A phantom study is designed to improve the reproducibility and robustness of radiomic features, because the imaging protocols for medical imaging scanners are not completely identical. However, previous reviews 34–37 and our review found that none of the reviewed radiomics studies had carried out this test. A recent phantom study used the interscanner coefficient of variation (COV) and interscanner intraclass correlation coefficient (ICC) to assess the features extracted from two phantoms on eight scanners. 38 Ray et al showed that first-order statistics were the most robust (having ICCs > 0.8) followed by kurtosis and skewness. Shape was the least robust metric, indicating that intervendor and interscanner variability needs to be addressed. We also note that the RQS does not include an item about sample size, which is crucial for the validity of radiomics results. 39 A sufficient number of patients facilitates validation, and results obtained from a small number of patients are not compelling, especially in radiomics, which relies on large amounts of data. Three of our included studies 40–42 had sample sizes of fewer than 50 patients, which undoubtedly limited the interpretation of the radiomics results.

The information presented in Supplementary Table 3 shows great variability in the software used by different investigators for image segmentation and feature processing (extraction and selection). Also noteworthy is the heterogeneity within and between studies in data sets, arising from scan protocols, naming conventions of features, and modeling steps. 14,43,44 In addition, it has been reported that different teams obtained different feature values even when using the same feature. 43 The combination of these factors made it difficult for us to compare these studies and leaves little opportunity for subsequent research to reproduce the original results. Taken together, this situation points to another problem underlying the current status of radiomics: standardization. As proposed by Zwanenburg et al. in 2020, 43 using the Image Biomarker Standardization Initiative (IBSI) reference values can result in excellent reproducibility for features. However, in our review, we found that only three studies 31,45,46 claimed to follow the IBSI reference when defining and calculating the features. Regarding feature extraction, semi- and fully automatic approaches are recommended because manual feature extraction can take a lot of time and add inter- and intra-observer variance. 47 To improve the prospects of clinical translation, researchers should pay more attention to the use of uniform standards for software and algorithms when processing features.

No data in our reviewed studies were made publicly available. This situation was discouraging because openly sharing in-house data can greatly improve reproducibility, so that clinical translation can be achieved earlier. 48 It was also reported that open science in radiomics can facilitate external validation, which is another key point in the RQS. 49 We believe that a large, publicly open and standardized radiomics database, from which researchers can download data with little discrepancy in image acquisition and protocols, can provide tremendous power for clinical translation. Furthermore, such a database can also be built by many institutions to meet the requirements of external validation more easily. 49,50 The Cancer Imaging Archive (TCIA) (http://cancerimagingarchive.net/), an ever-growing archive of medical images accompanied by a large quantity of clinical information, sets a good example for data sharing. 51 A study based on MRI has used the TCIA data set to predict molecular classifications of BC subtypes, 52 taking advantage of multicenter data set validation. Moreover, future data sets should be based on phantom studies to lessen the impact of intervendor and interscanner variability.

We nonetheless recommend that the authors of each radiomics study pre-score their study with the RQS before submitting their work for publication and explain any incomplete items in the discussion. These measures will not only help improve their work but will also help editors and reviewers to objectively estimate the quality of the study.

There are several potential limitations of this review article. Primarily, we were limited by the small number of published studies in the literature. In addition, because of the large heterogeneity among them, the quality of our meta-analysis was limited. Lastly, the RQS itself has limitations. It is not a well-rounded tool, and it should be developed further to keep pace with the rapid progress of radiomics.

Conclusion

Ki-67 is a common index of the cell proliferation rate, and radiomics models that predict it might be applicable to many other tumors. Many researchers have worked to predict it by radiomics, and our review indicated a promising result for Ki-67 level prediction in patients with BC. However, according to the RQS, many aspects of these studies need improvement: the incorporation of a prospective study design, external validation, open science and data, and phantom analysis. The exploration of the most suitable MRI sequence for radiomics is warranted. Standardization of feature processing and establishment of a standard and open radiomics database might foster clinical translation.

Footnotes

Acknowledgements: The authors are grateful to Jingyu Zhong, MD, for helping solve some problems with the data collection.

Contributor Information

Min Wang, Email: wangmin77058@sina.cn.

Ting Mei, Email: meitingys@163.com.

Youling Gong, Email: drgongyouling@hotmail.com.

Associated Data

Supplementary Materials

Supplementary Table 1.

Articles from The British Journal of Radiology are provided here courtesy of Oxford University Press
