. 2022 Feb 21;2021:611–620.

Table 4:

Model performance using Frequency Tokenization Encoding, trained with an 80/20 split on a dataset of size 450,000 (360,000 training, 90,000 testing). The top 10% and bottom 90% columns refer to the frequency of the target LOINC code in the testing dataset.

	Accuracy (%)	F1 Score (Weighted)	Precision (Weighted)	Top 10% LOINC Weighted Precision (n=74,866)	Bottom 90% LOINC Weighted Precision (n=15,134)
Logistic Regression	87.3	0.859	0.864	0.896	0.6854
Random Forest	94.5	0.943	0.943	0.957	0.874
KNN	88.95	0.877	0.878	0.911	0.714
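
The frequency-stratified columns above can be illustrated with a minimal sketch. It assumes that "top 10%" means the 10% of distinct target LOINC codes that occur most often in the testing set, and that weighted precision is the support-weighted mean of per-class precision computed within each group; the source does not spell out either definition. The function names `weighted_precision` and `split_by_frequency` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Support-weighted mean of per-class precision over the given samples."""
    support = Counter(y_true)      # how often each class is the true label
    predicted = Counter(y_pred)    # how often each class is predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        # Assumption: precision counts as 0 when a class is never predicted.
        prec = correct[label] / predicted[label] if predicted[label] else 0.0
        score += prec * n / total
    return score


def split_by_frequency(y_true, y_pred, top_fraction=0.10):
    """Split (true, predicted) pairs by how frequent the true code is.

    Assumption: the "top" group holds the top_fraction of distinct codes
    ranked by their count in y_true; all other codes form the "bottom" group.
    """
    ranked = [label for label, _ in Counter(y_true).most_common()]
    n_top = max(1, round(len(ranked) * top_fraction))
    top_codes = set(ranked[:n_top])
    pairs = list(zip(y_true, y_pred))
    head = [(t, p) for t, p in pairs if t in top_codes]
    tail = [(t, p) for t, p in pairs if t not in top_codes]
    return head, tail


# Toy data (not from the paper): one frequent code "A" and nine rare codes.
y_true = ["A"] * 11 + list("BCDEFGHIJ")
y_pred = ["A"] * 10 + ["B"] + list("BCDEAGHIJ")

head, tail = split_by_frequency(y_true, y_pred)
head_precision = weighted_precision(*zip(*head))
tail_precision = weighted_precision(*zip(*tail))
print(len(head), len(tail), head_precision, tail_precision)
```

Under these assumptions each group is scored only on the test samples whose true code falls in that group, which is why the two group sizes in the table header sum to the 90,000 testing samples.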