2023 May 8;29(5):1113–1122. doi: 10.1038/s41591-023-02332-5

Extended Data Fig. 1. Preprocessing and filtering of the DK disease trajectory datasets.


Filtering of the Danish (DK-DNPR) patient registries prior to training. In the Danish dataset, patient status codes were used to remove discontinuous disease histories, such as those of patients living in Greenland, patients with alterations in their patient ID or patients lacking a stable residence in Denmark. We also removed referral and temporary diagnosis codes, which are not final diagnosis codes and could be misleading if used for training. Patients with short trajectories (<5 diagnosis codes) were removed. The final set of patients was split into training (80%), validation (10%) and test (10%) sets.
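The filtering and splitting steps described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Patient` and `Diagnosis` records, the `continuous_history` flag (standing in for whatever is derived from the registry's patient status codes) and the `kind` label on each diagnosis are hypothetical names introduced here. The sketch also assumes that referral and temporary codes are dropped before the trajectory length is checked, and that the split is a seeded random split at the patient level; the caption specifies neither.

```python
import random
from dataclasses import dataclass


@dataclass
class Diagnosis:
    code: str
    kind: str  # hypothetical label: "final", "referral" or "temporary"


@dataclass
class Patient:
    patient_id: str
    continuous_history: bool  # hypothetical flag derived from patient status codes
    diagnoses: list


MIN_CODES = 5  # trajectories with fewer than 5 diagnosis codes are removed


def filter_patients(patients):
    """Keep patients with a continuous history and at least MIN_CODES final codes."""
    kept = []
    for p in patients:
        if not p.continuous_history:
            continue
        # Assumption: non-final codes are dropped before counting trajectory length.
        final_codes = [d for d in p.diagnoses if d.kind == "final"]
        if len(final_codes) < MIN_CODES:
            continue
        kept.append(Patient(p.patient_id, True, final_codes))
    return kept


def split_patients(patients, seed=0):
    """Shuffle patients and split them 80/10/10 into train, validation and test."""
    shuffled = list(patients)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Splitting by patient rather than by individual diagnosis keeps each trajectory entirely within one set, which avoids leaking a patient's history between training and evaluation.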