Adding radiomic features to the routine clinical assessment categories with different machine learning algorithms has a highly variable effect on the discriminative accuracy for predicting significant versus insignificant prostate cancer (PCa). Prediction performance for clinically significant PCa was analyzed using 15 feature subsets and 3 machine learning algorithms. The subsets were based on PI-RADS (PI) and the top four quantitative imaging features: surface to volume ratio (SVR), joint entropy (JE), least axis (LA), and maximum 3D diameter (max3D). The prediction models were built using support vector machine (SVM; a, d), neural network (NN; b, e), or random forest (RF; c, f) algorithms. a–c Box-and-whisker plots (whiskers at the 5th–95th percentiles) of the area under the receiver operating characteristic (ROC) curve for predicting significant PCa, obtained for each machine learning algorithm by 100-fold cross-validation experiments as described in detail in the “Materials and methods” section. Asterisks mark significant differences between PI and the indicated subset, assessed with a two-tailed, unpaired Student’s t test (a–c). The ROC curves of the 100 cross-validation runs (colors) and the mean ROC curve (blue) are shown for each prediction model for PI and its combination with SVR, LA, or max3D (e, f). The adjacent gray area depicts ± one standard deviation (e, f). Results are shown for the validation cohort, a 30% holdback proportion drawn at random. Patients with PI-RADS = 3 (n = 8) were excluded due to training/validation redundancy, to avoid overfitting and bias, as the respective lesions were always insignificant PCa in the studied cohort (Fig. 5b). The SVM used an rbf kernel with C = 1 and probability = true. The NN consisted of 1 layer with 3 hidden nodes, a maximum of 100 iterations, the logistic activation function, and the lbfgs solver. 
For the RF analysis, 20 estimators with random_state = 0 were specified.
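The settings listed above can be sketched in code as follows. This is a minimal illustration, not the study's actual pipeline. It assumes scikit-learn and SciPy (the caption's parameter names resemble scikit-learn's, but the caption does not name the library). The data are synthetic, and the helper `repeated_holdout_auc`, the stratified splitting, and the column choices used as feature subsets are all hypothetical stand-ins.

```python
import warnings

import numpy as np
from scipy.stats import ttest_ind
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import ConvergenceWarning
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# lbfgs may stop at max_iter=100 before converging; silence that warning here.
warnings.filterwarnings("ignore", category=ConvergenceWarning)

# Synthetic stand-in for the feature table (hypothetical, not study data).
X, y = make_classification(n_samples=150, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# Hyperparameters as stated in the caption; everything else is left at defaults.
models = {
    "SVM": SVC(kernel="rbf", C=1, probability=True),
    "NN": MLPClassifier(hidden_layer_sizes=(3,), activation="logistic",
                        solver="lbfgs", max_iter=100),
    "RF": RandomForestClassifier(n_estimators=20, random_state=0),
}


def repeated_holdout_auc(model, X, y, n_repeats=100, test_size=0.3, seed=0):
    """Return one ROC AUC per random 70/30 split (hypothetical helper).

    Stratification is an assumption made here so that every validation
    split contains both classes and the AUC is always defined.
    """
    splitter = StratifiedShuffleSplit(n_splits=n_repeats,
                                      test_size=test_size, random_state=seed)
    aucs = []
    for train_idx, test_idx in splitter.split(X, y):
        fitted = clone(model).fit(X[train_idx], y[train_idx])
        scores = fitted.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], scores))
    return np.array(aucs)


# AUC distribution per algorithm on the full synthetic feature set.
results = {name: repeated_holdout_auc(m, X, y) for name, m in models.items()}
for name, aucs in results.items():
    print(f"{name}: mean AUC = {aucs.mean():.3f} (SD {aucs.std():.3f})")

# Two-tailed, unpaired Student's t test between two feature subsets:
# column 0 alone versus columns 0 and 1 (illustrative column choices only).
auc_single = repeated_holdout_auc(models["RF"], X[:, [0]], y)
auc_pair = repeated_holdout_auc(models["RF"], X[:, [0, 1]], y)
p_value = ttest_ind(auc_single, auc_pair).pvalue
print(f"RF, one feature vs. two features: p = {p_value:.4f}")
```

The per-split AUC arrays in `results` are the kind of values a box-and-whisker plot would summarize, and the per-split predictions could likewise be used to draw individual and mean ROC curves.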