2016 Mar 29;11(3):e0152117. doi: 10.1371/journal.pone.0152117

Table 6. Training Data refers to the first part of the dataset used for model creation.

Validation Data refers to the second part of the dataset used for validation purposes. Simulated data refers to a random sample of 200 restaurants drawn from the complete pilot-study dataset, with the sampling and analysis repeated over 10,000 iterations. “Prevalence” is the proportion of restaurants with a low health code rating in the given dataset.

Dataset	AUC	Sensitivity	Specificity	PPV	Prevalence
Training data	0.79	0.70	0.58	0.50	0.25
Validation data	0.70	0.65	0.56	0.50	0.33
Simulated data (10,000 iterations)	0.78	0.72	0.44	0.61	0.29
Sample of all San Francisco restaurants with no cuisine exclusion	0.98	0.91	0.74	0.29	0.10
Sample of all New York City restaurants with no cuisine exclusion	0.77	0.74	0.54	0.25	0.12
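
The columns above and the "simulated data" row can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `diagnostic_metrics` and `resampled_metrics`, the 0/1 label encoding, the fixed seed, and sampling without replacement are all assumptions made for illustration, and AUC is left out because it needs continuous scores rather than binary predictions.

```python
import random


def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and prevalence for binary labels.

    Assumed encoding: 1 = low health code rating (positive class), 0 = otherwise.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # flagged among truly low-rated
        "specificity": ratio(tn, tn + fp),   # cleared among truly not low-rated
        "ppv": ratio(tp, tp + fp),           # truly low-rated among flagged
        "prevalence": ratio(tp + fn, len(pairs)),
    }


def resampled_metrics(y_true, y_pred, sample_size=200, iterations=10_000, seed=0):
    """Average the metrics over repeated random samples of restaurants.

    Assumption: each sample is drawn without replacement from the full dataset.
    """
    rng = random.Random(seed)
    totals, counts = {}, {}
    for _ in range(iterations):
        chosen = rng.sample(range(len(y_true)), sample_size)
        metrics = diagnostic_metrics(
            [y_true[i] for i in chosen], [y_pred[i] for i in chosen]
        )
        for name, value in metrics.items():
            if value == value:  # skip NaN, which never equals itself
                totals[name] = totals.get(name, 0.0) + value
                counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}
```

Because PPV depends on prevalence, the same classifier can show a lower PPV on a dataset where low ratings are rarer, so the PPV column is best read alongside the Prevalence column.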