2015 Nov 12;15:91. doi: 10.1186/s12911-015-0215-x

Table 4.

Corpus statistics

Name  Description          Filter                                  #      %
-     all TTE reports      -                                       70441  100.0
-     only relevant sites  f_site                                  68915   97.8
T_d   dominant layouts     f_site, f_char≥800, f̄_li                63489   90.1
T_u   mostly unstructured  f_site, f_char≥100, f_char<800, f̄_li    2712    3.9
T_c   uncommon layout      f_site, f_li                             1041    1.5
-     mostly defective     f_site, f_char<100                       1673    2.4

f_site: filter that excludes three sites of the hospital. f_char≥n: requires at least n non-whitespace characters. f_li: requires at least 5 list elements.
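The filter combinations in the table can be read as a decision procedure over each report. The following Python sketch is one possible reading, not the authors' implementation. It assumes that the overbar on the list filter denotes its negation (fewer than 5 list elements), that the list filter is checked before the character thresholds, and that the caller supplies the list-element count, since the table does not say how list elements are detected. The names `count_non_whitespace` and `classify_report` are hypothetical.

```python
def count_non_whitespace(text: str) -> int:
    """Count the non-whitespace characters, the quantity tested by f_char."""
    return sum(1 for ch in text if not ch.isspace())


def classify_report(text: str, n_list_elements: int, relevant_site: bool = True) -> str:
    """Assign a report to one row of Table 4 (illustrative reading only).

    n_list_elements is provided by the caller; how list elements are
    detected is not specified in the table and is left open here.
    """
    if not relevant_site:
        return "excluded"      # fails f_site
    if n_list_elements >= 5:
        return "T_c"           # f_li holds (assumed to take precedence)
    n_chars = count_non_whitespace(text)
    if n_chars < 100:
        return "defective"     # f_char<100
    if n_chars < 800:
        return "T_u"           # f_char>=100 and f_char<800, f_li negated
    return "T_d"               # f_char>=800, f_li negated


# Counts copied from the table: the four subsets add up to the
# number of reports that pass f_site.
COUNTS = {"T_d": 63489, "T_u": 2712, "T_c": 1041, "defective": 1673}
assert sum(COUNTS.values()) == 68915
```

Under these assumptions every report from a relevant site falls into exactly one of the four subsets, which matches the counts in the table summing to 68915.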