PLOS Digit Health. 2023 Jul 17;2(7):e0000108. doi: 10.1371/journal.pdig.0000108

Table 1. Imperfectly interoperable (IIO) data sets.

From the 3,192-patient CDSS-derived data set, we create two training sets at three levels of imperfect feature overlap (60%, 80% and 90%), which we compare with perfect interoperability (100%). In our experiments, the owner of a small ‘target’ data set (fewer patients) wants to benefit from a larger ‘source’ data set without having access to its data. The ‘source’ may lack several features that are available in the ‘target’, yielding several levels of ‘imperfect interoperability’. We construct validation sets with and without these missing features, as well as a held-out test set. The F1 scores we report in this paper are averaged over five randomized folds of this data-splitting procedure.

| Split | Partition | Patients | Interoperable | Imperfectly interoperable |
| --- | --- | --- | --- | --- |
| Train | Source (A) | 2,068 | [inline graphic] | [inline graphic] |
| Train | Target (B) | 516 | [inline graphic] | [inline graphic] |
| Validation | Source | 288 | [inline graphic] | [inline graphic] |
| Validation | Target | 288 | [inline graphic] | [inline graphic] |
| Test | Source | 288 | [inline graphic] | [inline graphic] |
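
The data-splitting procedure described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `make_iio_split`, the partition names, the feature names and the toy sizes are all hypothetical, and the random choice of which features the source retains is an assumption, since the caption does not say how the missing features are selected.

```python
import random


def make_iio_split(n_patients, features, overlap, sizes, seed):
    """Partition patient indices and pick the features the source retains.

    `overlap` is the fraction of target features that the source also has
    (1.0 corresponds to perfect interoperability).
    """
    if sum(sizes.values()) > n_patients:
        raise ValueError("partition sizes exceed the number of patients")

    rng = random.Random(seed)

    # Shuffle patient indices, then cut them into disjoint partitions.
    ids = list(range(n_patients))
    rng.shuffle(ids)
    partitions, start = {}, 0
    for name, size in sizes.items():
        partitions[name] = ids[start:start + size]
        start += size

    # Assumption: the features missing from the source are chosen at random.
    n_shared = round(overlap * len(features))
    source_features = sorted(rng.sample(features, n_shared))
    return partitions, source_features


# Toy numbers for illustration only (not the sizes reported in the table).
features = [f"f{i}" for i in range(10)]
sizes = {"train_source": 60, "train_target": 15, "validation": 10, "test": 10}

# Five randomized folds, one per seed, at 80% feature overlap.
folds = [make_iio_split(100, features, 0.8, sizes, seed) for seed in range(5)]
partitions, source_features = folds[0]
print(len(partitions["train_source"]), len(source_features))
```

In such a setup, a model would be trained and scored once per fold, and the per-fold F1 scores averaged. Evaluating the validation partition once with all features and once with only `source_features` would give the "with and without missing features" views mentioned in the caption.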