2025 Jun 9;5(6):100847. doi: 10.1016/j.xops.2025.100847

Table 1.

Key Considerations for Data Used to Train an AI Model

Category	Key Questions
Dataset size	• How large is the training dataset? Consider both the number of images and the number of unique patients or eyes. • Was augmentation used to artificially increase dataset size? If so, what type?
Dataset scope	• Does the dataset include a range of disease severities (e.g., mild, moderate, severe)? • Was demographic information (age, gender, and race) reported? • Does the dataset reflect diverse and representative demographics? • Could potential biases in the dataset affect model performance?
Labeling methods	• Were labels assigned by a single expert or through consensus labeling? • For unsupervised learning, how was the model evaluated for clinical relevance?
Pretraining	• Was the model pretrained on a general dataset (e.g., ImageNet)?

AI = artificial intelligence.
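
The dataset size and dataset scope questions in Table 1 can be checked directly when per-image metadata are available. The following is a minimal sketch, not a method described in the article: it assumes each image has a metadata record, and the field names (`patient_id`, `eye`, `severity`) and the function `summarize_dataset` are hypothetical names chosen for illustration. It counts images, unique patients, and unique eyes separately, and tallies the disease severity labels.

```python
from collections import Counter


def summarize_dataset(records):
    """Summarize dataset size and severity mix from per-image metadata.

    Each record is a dict with hypothetical keys:
    'patient_id', 'eye' (e.g., 'OD' or 'OS'), and 'severity'.
    """
    # Unique patients, regardless of how many images each contributed.
    patients = {r["patient_id"] for r in records}
    # An eye is identified by the (patient, laterality) pair.
    eyes = {(r["patient_id"], r["eye"]) for r in records}
    # Number of images per severity label.
    severity_counts = Counter(r["severity"] for r in records)
    return {
        "n_images": len(records),
        "n_patients": len(patients),
        "n_eyes": len(eyes),
        "severity_counts": dict(severity_counts),
    }


# Illustrative records only; these are not data from any study.
records = [
    {"image_id": "img1", "patient_id": "p1", "eye": "OD", "severity": "mild"},
    {"image_id": "img2", "patient_id": "p1", "eye": "OD", "severity": "mild"},
    {"image_id": "img3", "patient_id": "p1", "eye": "OS", "severity": "severe"},
    {"image_id": "img4", "patient_id": "p2", "eye": "OD", "severity": "moderate"},
]

summary = summarize_dataset(records)
print(summary)
```

In this toy example, four images come from only two patients and three eyes, which illustrates why the image count alone can overstate how much independent data a model was trained on.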