2025 Jun 9;5(6):100847. doi: 10.1016/j.xops.2025.100847

Table 1.

Key Considerations for Data Used to Train an AI Model

Category: Key Questions
Dataset size
  • How large is the training dataset? Consider both the number of images and the number of unique patients or eyes.

  • Was augmentation used to artificially increase dataset size? If so, what type?

Dataset scope
  • Does the dataset include a range of disease severities (e.g., mild, moderate, severe)?

  • Was demographic information (age, gender, and race) reported?

  • Does the dataset reflect diverse and representative demographics?

  • Could potential biases in the dataset affect model performance?

Labeling methods
  • Were labels assigned by a single expert or through consensus labeling?

  • For unsupervised learning, how was the model evaluated for clinical relevance?

Pretraining
  • Was the model pretrained on a general dataset (e.g., ImageNet)?

AI = artificial intelligence.
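
As an illustration of the dataset size and dataset scope questions above, the following minimal sketch shows how image counts can differ from unique patient and eye counts, and how the spread of disease severities might be tallied. The manifest, its field names (`image`, `patient`, `eye`, `severity`), and the `summarize` helper are hypothetical and are not taken from any dataset discussed in the article.

```python
from collections import Counter

# Hypothetical manifest: one record per image. All values are made up
# for illustration only.
records = [
    {"image": "img001.png", "patient": "P01", "eye": "OD", "severity": "mild"},
    {"image": "img002.png", "patient": "P01", "eye": "OD", "severity": "mild"},
    {"image": "img003.png", "patient": "P01", "eye": "OS", "severity": "moderate"},
    {"image": "img004.png", "patient": "P02", "eye": "OD", "severity": "severe"},
    {"image": "img005.png", "patient": "P03", "eye": "OS", "severity": "mild"},
]


def summarize(records):
    """Return image, patient, and eye counts plus the severity distribution."""
    # Unique patients are identified by their patient ID alone.
    patients = {r["patient"] for r in records}
    # An eye is unique per (patient, laterality) pair.
    eyes = {(r["patient"], r["eye"]) for r in records}
    # Count how many images carry each severity label.
    severity = Counter(r["severity"] for r in records)
    return {
        "images": len(records),
        "patients": len(patients),
        "eyes": len(eyes),
        "severity": dict(severity),
    }


summary = summarize(records)
print(summary)
```

In this toy manifest, five images come from only three patients and four eyes, which is why reporting the image count alone can overstate how much independent data a model was trained on.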