. 2021 May 26;5(5):e22461. doi: 10.2196/22461

Table 2.

Descriptive statistics of operative procedure text, preoperative diagnosis, and preferred terms.

| Data set | Number of unique CPT[a] codes | Number of unique procedure texts | Tokens in procedure text[b], mean (SD) | Tokens in procedure text[b], range | Tokens in preoperative diagnosis[b], mean (SD) | Tokens in preoperative diagnosis[b], range | Tokens in preferred terms[b], mean (SD) | Tokens in preferred terms[b], range |
|---|---|---|---|---|---|---|---|---|
| Training | 252 | 13,847 | 5.12 (3.57) | 1-60 | 4.12 (2.5) | 0-15 | 13.23 (6.44) | 3-46 |
| Validation | 231 | 6012 | 5.15 (3.64) | 1-51 | 4.11 (2.5) | 0-13 | 13.23 (4.45) | 3-41 |
| Holdout | 224 | 6731 | 4.98 (3.52) | 1-60 | 4.01 (2.52) | 0-13 | 13.24 (6.18) | 3-41 |

[a] CPT: Current Procedural Terminology.

[b] The unit of the descriptive statistics is the token (word).
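
As an illustration of how token-level statistics like those in the table could be computed, the following is a minimal sketch. It rests on assumptions that are not stated in the source: tokens are obtained by splitting on whitespace, the SD is the sample standard deviation, and the example strings and the `token_stats` helper are hypothetical rather than drawn from the study data or the authors' code.

```python
from statistics import mean, stdev


def token_stats(texts):
    """Return mean, sample SD, min, and max of per-text token counts.

    Assumption: a token is a whitespace-delimited word; the source does
    not specify the tokenizer or whether SD is sample or population.
    """
    counts = [len(text.split()) for text in texts]
    return {
        "mean": round(mean(counts), 2),
        # stdev needs at least two values; fall back to 0.0 otherwise.
        "sd": round(stdev(counts), 2) if len(counts) > 1 else 0.0,
        "min": min(counts),
        "max": max(counts),
    }


# Hypothetical texts; an empty string yields 0 tokens, which is how a
# range can start at 0 when a field is left blank.
examples = ["laparoscopic appendectomy", "total knee arthroplasty right", ""]
stats = token_stats(examples)
print(f"{stats['mean']} ({stats['sd']}) {stats['min']}-{stats['max']}")
```

Applying such a helper separately to each text field and each data split would yield one "mean (SD)" and "range" pair per cell, in the same layout as the table.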