Author manuscript; available in PMC: 2022 Oct 1.
Published in final edited form as: Ophthalmol Retina. 2021 Feb 6;5(10):1027–1035. doi: 10.1016/j.oret.2020.12.013

Table 1. Training and Test Sets.

All included images from the North American, Nepali, and combined datasets were split so that 90% of the images went to the training set (A) and 10% went to the test set (B). Each dataset was stratified to maintain similar ratios of images with and without stage in the training and test sets, and was split at the patient level so that no patient appeared in both a training set and a test set. Each training set (A) was further divided into 5 cross-validation splits that retained the underlying distribution of stage. Models were created using the North American dataset alone, the Nepali dataset alone, or both datasets combined.

A) Training Set, divided into 5 splits for 5-fold cross-validation

No. in Training Split

                         North American Dataset    Nepali Dataset
Split  Images  Patients  No Stage    Stage         No Stage    Stage
1      7899    895       3018        1102          3599        180
2      7941    882       2901        1161          3694        185
3      7977    899       2998        1191          3651        137
4      8149    909       3056        1263          3661        169
5      8033    893       2974        1145          3762        152

No. in Validation Split

                         North American Dataset    Nepali Dataset
Split  Images  Patients  No Stage    Stage         No Stage    Stage
1      2138    231       735         380           995         28
2      2096    244       852         321           900         23
3      2060    227       755         291           943         71
4      1888    217       697         219           933         39
5      2004    233       779         337           832         56
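
The counts in part A can be checked for internal consistency with a few lines of arithmetic. The sketch below copies the rows of the table and confirms that the four per-dataset counts in each row add up to the image total, and that the training and validation portions of every split add up to the same 10037 images and 1126 patients. The names `TRAIN`, `VAL`, `stage_fraction`, and `check_split` are illustrative and do not come from the paper.

```python
# Rows copied from Table 1A. Each tuple is
# (images, patients, NA no stage, NA stage, Nepali no stage, Nepali stage).
TRAIN = {
    1: (7899, 895, 3018, 1102, 3599, 180),
    2: (7941, 882, 2901, 1161, 3694, 185),
    3: (7977, 899, 2998, 1191, 3651, 137),
    4: (8149, 909, 3056, 1263, 3661, 169),
    5: (8033, 893, 2974, 1145, 3762, 152),
}
VAL = {
    1: (2138, 231, 735, 380, 995, 28),
    2: (2096, 244, 852, 321, 900, 23),
    3: (2060, 227, 755, 291, 943, 71),
    4: (1888, 217, 697, 219, 933, 39),
    5: (2004, 233, 779, 337, 832, 56),
}


def stage_fraction(row):
    """Fraction of images with stage, pooled over both datasets."""
    images, _patients, _na_no, na_stage, _np_no, np_stage = row
    return (na_stage + np_stage) / images


def check_split(split):
    """Return (train sums ok, val sums ok, total images, total patients)."""
    train, val = TRAIN[split], VAL[split]
    train_ok = sum(train[2:]) == train[0]  # four category counts vs. image total
    val_ok = sum(val[2:]) == val[0]
    return train_ok, val_ok, train[0] + val[0], train[1] + val[1]


for split in sorted(TRAIN):
    train_ok, val_ok, images, patients = check_split(split)
    print(
        f"split {split}: sums ok={train_ok and val_ok}, "
        f"images={images}, patients={patients}, "
        f"stage fraction train={stage_fraction(TRAIN[split]):.3f}, "
        f"val={stage_fraction(VAL[split]):.3f}"
    )
```

The printed stage fractions give a quick view of how closely each validation split tracks its training split. Because the splits are made at the patient level, exact agreement should not be expected.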
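
The caption describes a split that is both stratified by stage and separated at the patient level. The paper does not state the exact procedure, so the following is only a minimal sketch of one way such a split could be made. It assumes each record is a `(patient_id, image_id, has_stage)` tuple and that patients are stratified by whether any of their images shows stage. The function name `patient_level_split`, its parameters, and the toy records are hypothetical.

```python
import random
from collections import defaultdict


def patient_level_split(records, test_fraction=0.10, seed=0):
    """Split (patient_id, image_id, has_stage) records into train and test.

    Assumption (not from the paper): patients are grouped by whether any of
    their images shows stage, and test_fraction of each group goes to the
    test set. All images of a patient stay together, so no patient
    contributes images to both sets.
    """
    images_by_patient = defaultdict(list)
    for patient_id, image_id, has_stage in records:
        images_by_patient[patient_id].append((image_id, has_stage))

    # One stratum per patient-level "any stage" flag.
    strata = defaultdict(list)
    for patient_id, images in images_by_patient.items():
        strata[any(flag for _, flag in images)].append(patient_id)

    rng = random.Random(seed)
    test_patients = set()
    for patients in strata.values():
        patients.sort()  # fixed order before shuffling, for reproducibility
        rng.shuffle(patients)
        n_test = round(len(patients) * test_fraction)
        test_patients.update(patients[:n_test])

    train = [r for r in records if r[0] not in test_patients]
    test = [r for r in records if r[0] in test_patients]
    return train, test


# Toy data (hypothetical): 100 patients with 3 images each;
# every fifth patient has one image with stage.
records = [
    (p, f"{p}-{i}", p % 5 == 0 and i == 0)
    for p in range(100)
    for i in range(3)
]
train, test = patient_level_split(records)
print(len(train), len(test))
```

Because whole patients are assigned to one side, the image-level ratio is only approximately 90/10 when patients contribute different numbers of images. The same idea, applied five times to the training set, yields patient-disjoint cross-validation splits.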