2022 Nov 21;109(12):2163–2177. doi: 10.1016/j.ajhg.2022.10.013

Figure 1.

Data set preparation

Steps taken to prepare the three data sets in this study, extracted from ClinVar (A and C) and gnomAD (B). Numbers on the right give the number of variants remaining after each step, and numbers in parentheses give the number of genes remaining after each step. The data set resulting from (A) is referred to as the ClinVar 2019 set, from (B) the gnomAD set, and from (C) the ClinVar 2020 set. The asterisk marks numbers obtained after removing variants that appear in the MPC training sets. This removal was done post hoc, after all filtering and downsampling steps had been carried out for the ClinVar 2019 and gnomAD sets.