BMJ. 2023 May 10;381:e073800. doi: 10.1136/bmj-2022-073800

Table 4.

Description of machine learning model architectures and hyperparameter tuning performed

Model, basic architecture, and hyperparameters tuned | Range explored during tuning | Final selected value after tuning
XGBoost
Tree based booster with GPU_hist; gradient based subsampling; RMSE evaluation metric:
 Maximum tree depth | 1-6 | 6
 Learning rate (eta) | 0.0001-0.1 | 0.073
 Subsampling proportion | 0.1-0.5 | 0.1
 No of boosting rounds | 1-500 | 251
 Alpha (regularisation) | 0-20 | 18
 Gamma (regularisation) | 0-20 | 0
 Lambda (regularisation) | 0-20 | 3
 Column sampling by tree | 0.1-0.8 | 0.501
 Column sampling by level | 0.1-0.8 | 0.518
Neural network
Feed forward ANN with fully connected layers; 26 input nodes (No of predictors); ReLU activation functions in hidden layers; Adam optimiser; single output node with linear activation; RMSE loss function; batch size 1024:
 No of hidden layers | 1-5 | 2
 No of nodes in each hidden layer | 26-50 | 30
 No of epochs | 1-50 | 32
 Initial learning rate | 0.001-0.1 | 0.032

ANN=artificial neural network; ReLU=rectified linear unit; RMSE=root mean squared error.
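
For illustration, the tuned XGBoost configuration above can be expressed in the xgboost Python API roughly as follows. This is a minimal sketch, not the authors' published code: the squared error regression objective is an assumption (reasonable for a continuous pseudovalue outcome), and X_train/y_train are hypothetical variable names.

```python
import xgboost as xgb

# Final tuned hyperparameters from the table above (sketch, not the authors' code)
params = {
    "tree_method": "gpu_hist",            # GPU histogram tree builder
    "sampling_method": "gradient_based",  # gradient based subsampling
    "eval_metric": "rmse",                # RMSE evaluation metric
    "objective": "reg:squarederror",      # assumption: continuous pseudovalue target
    "max_depth": 6,
    "eta": 0.073,                         # learning rate
    "subsample": 0.1,                     # gradient based sampling permits this low proportion
    "alpha": 18,                          # L1 regularisation
    "gamma": 0,                           # minimum loss reduction to split
    "lambda": 3,                          # L2 regularisation
    "colsample_bytree": 0.501,
    "colsample_bylevel": 0.518,
}

# X_train (26 predictors) and y_train (10 year pseudovalues) are hypothetical names
dtrain = xgb.DMatrix(X_train, label=y_train)
booster = xgb.train(params, dtrain, num_boost_round=251)
```

Note that gradient based subsampling in xgboost requires the gpu_hist tree method, which is why the unusually low subsampling proportion of 0.1 remains workable.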

The continuous outcome variables for both models were the jack-knife pseudovalues for the cumulative incidence function for breast cancer related mortality at 10 years. The final neural network model had a total of 1771 parameters (all trainable).
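
The jack-knife pseudovalue for individual i is n*theta_hat - (n-1)*theta_hat_(-i), where theta_hat is the cumulative incidence of breast cancer related mortality at 10 years estimated on the full sample of n individuals and theta_hat_(-i) is the same estimate with individual i left out; the pseudovalues then serve as a continuous regression target. The selected two hidden layer network can be sketched in Keras as below. This is an illustration under an assumed framework (the paper specifies the architecture and hyperparameters but not the library), with hypothetical X_train/y_train names; the layer arithmetic reproduces the stated count of 1771 parameters.

```python
from tensorflow import keras

# Sketch of the final tuned architecture (assumed framework: Keras)
model = keras.Sequential([
    keras.Input(shape=(26,)),                    # 26 predictors
    keras.layers.Dense(30, activation="relu"),   # hidden layer 1: 26*30+30 = 810 parameters
    keras.layers.Dense(30, activation="relu"),   # hidden layer 2: 30*30+30 = 930 parameters
    keras.layers.Dense(1, activation="linear"),  # output node: 30*1+1 = 31 parameters
])  # total 810+930+31 = 1771 trainable parameters, matching the text

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.032),  # tuned initial learning rate
    loss="mean_squared_error",  # MSE has the same minimiser as the RMSE stated in the table
)

# X_train (predictors) and y_train (10 year pseudovalues) are hypothetical names
model.fit(X_train, y_train, batch_size=1024, epochs=32)
```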