Fig. 6.
Illustrations of the categorical regression analyses predicting taboo word status from semantic variables and written corpus frequency. a Distribution of valence (x-axis), arousal (y-axis), and written corpus frequency (point size) for taboo words and fillers (color-coded) in the combined dataset of all samples, alongside their classification accuracy (point type). Regression lines predicting arousal from valence are fitted with local polynomial regression (loess). b Accuracy rates (darker colors) and F1 scores (lighter colors) for the LOOCV analysis, in which taboo word status in the left-out sample is predicted by a GLMM trained on all other samples (predictors: valence, arousal, concreteness, AoA ratings, and written corpus frequency).
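The leave-one-sample-out procedure summarized in panel b can be sketched as follows. This is a minimal illustration, not the analysis code behind the figure: the GLMM is replaced by a deliberately simple stand-in (a threshold on a single predictor, placed midway between the class means), and all function names and the row layout `(sample_id, features, label)` are assumptions made for this example. Labels are coded 1 for taboo words and 0 for fillers.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Share of items whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (taboo) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(train):
    """Hypothetical stand-in for the GLMM: midpoint between the class means
    of the first feature. Assumes both classes occur in the training data."""
    pos = [features[0] for features, label in train if label == 1]
    neg = [features[0] for features, label in train if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, features):
    """Classify an item as taboo (1) if its first feature exceeds the threshold."""
    return 1 if features[0] > threshold else 0


def leave_one_sample_out(rows, fit, predict):
    """Hold out each sample in turn, train on the rest, score the held-out sample.

    rows: iterable of (sample_id, features, label) tuples.
    Returns a dict mapping sample_id to (accuracy, f1).
    """
    by_sample = defaultdict(list)
    for sample_id, features, label in rows:
        by_sample[sample_id].append((features, label))

    scores = {}
    for held_out, test_items in by_sample.items():
        # Training data: every item from every sample except the held-out one.
        train = [item
                 for sample_id, items in by_sample.items()
                 if sample_id != held_out
                 for item in items]
        model = fit(train)
        y_true = [label for _, label in test_items]
        y_pred = [predict(model, features) for features, _ in test_items]
        scores[held_out] = (accuracy(y_true, y_pred), f1_score(y_true, y_pred))
    return scores
```

Calling `leave_one_sample_out(rows, fit_threshold, predict_threshold)` returns one accuracy and one F1 value per held-out sample, which is the pair of quantities plotted per sample in panel b. Passing `fit` and `predict` as arguments keeps the cross-validation loop independent of the model, so a mixed-effects classifier could be substituted without changing the loop.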