2021 Oct 7;9(10):e30588. doi: 10.2196/30588

Table 5.

Changes in the area under the receiver operating characteristic curve through successive permutation of features.

| Feature | LR^a | C5^b |
| --- | --- | --- |
| Sentences with complex semantic categories | 0.011 | 0 |
| Content words | 0.013 | 0 |
| Low-stroke characters | 0.012 | 0 |
| Adverbs of negation | 0.01 | 0 |
| Average strokes per character | 0.016 | 0 |
| Two-character words | 0 | 0.001 |
| Conjunctions | 0.015 | 0.001 |
| Ratios of noun phrases | 0.009 | 0.001 |
| Normalized frequency of noun phrases per 10,000 words | 0.024 | 0.001 |
| Three-character words | 0 | 0.002 |
| Difficult words | 0.01 | 0.002 |
| Middle-stroke characters | 0.001 | 0.004 |
| Pronouns | 0.001 | 0.008 |
| Average sentences per paragraph | 0.048 | 0.011 |
| Type-token ratios | 0.005 | 0.011 |
| High-stroke characters | 0.01 | 0.011 |
| Personal pronouns | 0.01 | 0.018 |
| Positive conjunctions | 0.009 | 0.02 |
| Negative conjunctions | 0.007 | 0.021 |
| Simple sentences | 0.026 | 0.029 |
| Average words per sentence | 0.007 | 0.032 |
| Density of content words | 0.01 | 0.073 |

^a LR: logistic regression.

^b C5: C5.0 decision tree.
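
The table does not spell out how each change was computed. As a rough illustration only, the sketch below assumes the common permutation-importance approach: shuffle the values of one feature, re-score the data with the unchanged model, and report the drop in the area under the curve. The toy data, the feature names `signal` and `noise`, and the `predict` function are hypothetical stand-ins, not the study's features or classifiers.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_auc_drop(predict, rows, labels, feature, rng):
    """Baseline AUC minus the AUC after shuffling one feature column."""
    baseline = auc(labels, [predict(r) for r in rows])
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    # Copy each row with only the chosen feature replaced.
    permuted_rows = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
    return baseline - auc(labels, [predict(r) for r in permuted_rows])

# Hypothetical toy data: names and values are illustrative only.
rng = random.Random(0)
rows = [{"signal": rng.random(), "noise": rng.random()} for _ in range(200)]
labels = [1 if r["signal"] > 0.5 else 0 for r in rows]

def predict(row):
    # Stand-in for a fitted classifier: the score depends only on "signal".
    return row["signal"]

drop_signal = permutation_auc_drop(predict, rows, labels, "signal", rng)
drop_noise = permutation_auc_drop(predict, rows, labels, "noise", rng)
print(f"signal: {drop_signal:.3f}  noise: {drop_noise:.3f}")
```

Under this reading, a value of 0 in the table means that shuffling the feature left the model's ranking of cases unchanged, as with `noise` here, while larger values mark features the model relies on more heavily.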