Table 3.
Process | Considerations | Recommendations |
---|---|---|
Study design | Study registration | Pre-register studies in databases such as the Open Science Framework (OSF) |
Study design | Cohort selection | Focus on specific molecular subtypes or subclasses of cancers may enable more accurate radiogenomic models. Meta-analysis of multiple cohorts can be used to achieve more generalizable models. |
Study design | Study design | Prospective study design to enable longitudinal feature assessment may be ideal for generating models to predict immunotherapy response and identify biomarkers of resistance. For retrospective study design, statistical and modeling approaches should be decided a priori. |
Evaluating molecular data | Tumor and TME gene expression data procurement and processing | Use RNA-seq to assess gene expression; refer to Conesa et al. 2016 for a review of good data practices [121]. RNA-seq may eventually be supplanted by single-cell RNA-seq, which can improve the ability to distinguish tumor versus immune cell gene expression. |
Evaluating molecular data | Pathway and immune infiltration analysis | Software such as Gene Set Enrichment Analysis (GSEA), Ingenuity Pathway Analysis, DAVID, and Metascape is standard for pathway enrichment analysis. Approaches including single-sample GSEA (ssGSEA), CIBERSORT, and Immunoscore are useful for more specific quantification of the types of tumor immune cell infiltration. |
Evaluating molecular data | Cell markers by IHC | Specific staining of cell surface markers remains the gold standard for quantifying immune cell infiltration. To increase staining throughput, consider using tissue microarrays and multiplexed IHC. |
Evaluating molecular data | Quantifying TILs by H&E | H&E allows for good quantitation of TILs but is often subject to clinician-reader bias. Best clinical practices are outlined in Salgado et al. 2015 [122]. |
Image acquisition, processing, and extraction | Image acquisition parameters | Use standardized acquisition parameters |
Image acquisition, processing, and extraction | Image pre-processing | Normalize voxel intensities of images, particularly MRI, to extract radiomic features more accurately and reproducibly |
Image acquisition, processing, and extraction | Feature definition and extraction | Use feature standardization platforms, such as MITK Phenotyping and the Image Biomarker Standardization Initiative |
Image acquisition, processing, and extraction | Tumor segmentation | Use multiple independent observers if segmenting manually, or consider semi-automatic or automatic approaches, to maximize reproducibility |
Image acquisition, processing, and extraction | Deep learning | Use algorithm visualization methods, such as saliency maps, to increase interpretability, explainability, and transparency |
Modeling and data analysis | Feature selection | Reduce feature dimensionality, for example through regression modeling (e.g., LASSO Cox, Elastic Net) or intra-class feature similarity measures (e.g., intra-class correlation coefficient), to prevent overfitting and improve feature reliability |
Modeling and data analysis | Model design | The best-performing models for predicting prognosis and immunotherapy response are likely achieved by combining radiogenomic models with other covariates into composite models. Correct for multiple hypothesis testing where appropriate. |
Modeling and data analysis | Machine learning | Use hold-out data sets to evaluate models and to prevent data leakage from training to evaluation sets. Validate on data that are independent of the training set and ideally from multi-institutional sources. |
Data transparency and reporting | Public data and code repositories | Share code in open-source repositories like GitHub. Share imaging data in public repositories like the Imaging Data Commons (IDC). |
Data transparency and reporting | Radiomics quality score (RQS) | Report the RQS (out of 36 points), as described in Sanduleanu et al. 2018 [1] |
Data transparency and reporting | Study reporting checklists | Use the TRIPOD 22-item checklist for model development and validation [120] |
Legend: DAVID: Database for Annotation, Visualization, and Integrated Discovery; TME: tumor microenvironment; IHC: immunohistochemistry; TIL: tumor infiltrating lymphocyte; H&E: hematoxylin and eosin; MITK: Medical Imaging Interaction Toolkit; LASSO: least absolute shrinkage and selection operator; TRIPOD: Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis.
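
The recommendation in Table 3 to correct for multiple hypothesis testing can be illustrated with a short sketch. The table does not name a specific procedure; the Benjamini-Hochberg false discovery rate adjustment is used here only as one common choice. The function names `benjamini_hochberg` and `significant_features`, the feature names, and the p-values are all hypothetical and serve only as an illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is one common correction, not a method prescribed
    by the table. Pure Python, no external dependencies.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_features(names, p_values, fdr=0.05):
    """Return the names whose adjusted p-value is at or below `fdr`."""
    adjusted = benjamini_hochberg(p_values)
    return [name for name, q in zip(names, adjusted) if q <= fdr]


if __name__ == "__main__":
    # Hypothetical p-values from testing five radiomic features
    # against a gene expression signature.
    names = ["feature_a", "feature_b", "feature_c", "feature_d", "feature_e"]
    p_values = [0.001, 0.20, 0.012, 0.04, 0.60]
    print(significant_features(names, p_values))
```

In this hypothetical example, the feature with an unadjusted p-value of 0.04 would pass an uncorrected 0.05 threshold but no longer passes after adjustment, which is the kind of false positive the recommendation is meant to guard against when many radiomic features are tested at once.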