Table 1. Summary of features of existing R packages for risk prediction.
✓ denotes the presence of a feature and × denotes its absence.
Package | Model building: calibration to population incidence | Model building: detailed family history | Model building: special option for SNP markers | Model building: imputation of missing risk-factors | Model validation: full cohort | Model validation: two-phase study | Model validation: imputation of missing risk-factors |
---|---|---|---|---|---|---|---|
riskRegressionᵃ | × | × | × | × | ✓ | × | × |
predictABELᵃ | × | × | × | × | ✓ | × | × |
BCRAᵇ | ✓ᶜ | × | × | × | × | × | × |
BayesMendelᵇ | × | ✓ | × | ✓ | × | × | ×ᶠ |
rmap | × | × | × | × | ✓ | ✓ᵉ | × |
iCAREᵇ | ✓ᶜ | × | ✓ᵈ | ✓ | ✓ | ✓ᵉ | ✓ᶠ |
a These packages include some functions for model building (see Section), but those functions do not provide the key features listed in the table above.
b Capability to use information from multiple data sources, e.g., BCRA and iCARE can use relative risk parameters from cohort or case-control studies and disease incidence and mortality rates from population registries.
c BCRA estimates the baseline hazard and calibrates the model to the underlying population incidence rates using the distribution of risk-factors among cases in a specific study, which may not be representative of the general population. iCARE implements this step using a reference dataset that provides information on the distribution of risk-factors in the general population.
d iCARE includes the special option in which independent SNP markers can be included using published estimates of odds ratios and allele frequencies.
e Inverse probability weighted estimators of model validation statistics are implemented, accounting for bias due to non-random sampling using sampling weights.
f BayesMendel incorporates imputation methods for certain risk-factors (e.g., age), but it does not implement any method for validating risk prediction models. iCARE implements a built-in imputation approach that handles missing risk-factors using a reference risk-factor dataset representative of the underlying population. The standardized model validation methods implemented in iCARE can use this built-in feature to impute missing risk-factors in the validation study.
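
The inverse probability weighting mentioned in footnote e can be illustrated with a minimal sketch. This is not the implementation used in rmap or iCARE. It assumes that each subject in the two-phase validation sample has a known, nonzero sampling probability, and the function names `ipw_expected_to_observed` and `ipw_auc` are hypothetical. Each subject is weighted by the inverse of that probability, so that the weighted sample stands in for the full cohort when computing a calibration ratio and a discrimination statistic.

```python
def ipw_expected_to_observed(outcomes, predicted_risks, sampling_probs):
    """Weighted expected-to-observed ratio for a two-phase validation sample.

    outcomes: 1 for a case, 0 for a non-case.
    predicted_risks: model-predicted absolute risks.
    sampling_probs: assumed-known probability that each subject was sampled.
    """
    weights = [1.0 / p for p in sampling_probs]
    expected = sum(w * r for w, r in zip(weights, predicted_risks))
    observed = sum(w * y for w, y in zip(weights, outcomes))
    return expected / observed


def ipw_auc(outcomes, predicted_risks, sampling_probs):
    """Weighted AUC: the weighted share of case/non-case pairs in which the
    case has the higher predicted risk (ties count as one half)."""
    weights = [1.0 / p for p in sampling_probs]
    rows = list(zip(outcomes, predicted_risks, weights))
    cases = [(r, w) for y, r, w in rows if y == 1]
    controls = [(r, w) for y, r, w in rows if y == 0]

    concordant = 0.0
    total = 0.0
    for case_risk, case_w in cases:
        for control_risk, control_w in controls:
            pair_w = case_w * control_w  # each pair carries the product of weights
            total += pair_w
            if case_risk > control_risk:
                concordant += pair_w
            elif case_risk == control_risk:
                concordant += 0.5 * pair_w
    return concordant / total


# Toy data (illustrative only): all cases sampled, non-cases subsampled.
outcomes = [1, 1, 0, 0, 0]
predicted_risks = [0.30, 0.10, 0.20, 0.05, 0.05]
sampling_probs = [1.0, 1.0, 0.5, 0.25, 0.25]

eo_ratio = ipw_expected_to_observed(outcomes, predicted_risks, sampling_probs)
auc = ipw_auc(outcomes, predicted_risks, sampling_probs)
```

With all sampling probabilities equal to 1, both functions reduce to the usual full-cohort statistics, which is the sense in which the weights correct for non-random sampling.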