J Am Med Inform Assoc. 2020 Nov 17;28(3):541–548. doi: 10.1093/jamia/ocaa263

Table 2.

Tested methods, functions, and parameters

Method | Tested functions and parameters
Text vectorization

Function: TfidfVectorizer()

ngram_range: [(1,1), (1,2), (1,3)]

max_df: [0.70, 0.80, 0.90, 0.95, 1.0]

min_df: [2, 10, 50]

binary: [False, True]

use_idf: [False, True]

norm: ['l1', 'l2', None]

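A minimal sketch of how the TfidfVectorizer settings listed above could be enumerated with scikit-learn's ParameterGrid; the variable names (tfidf_grid, vectorizers, train_texts) are illustrative and not part of the original table:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import ParameterGrid

# Grid of the TfidfVectorizer settings listed in the table.
tfidf_grid = ParameterGrid({
    "ngram_range": [(1, 1), (1, 2), (1, 3)],
    "max_df": [0.70, 0.80, 0.90, 0.95, 1.0],
    "min_df": [2, 10, 50],
    "binary": [False, True],
    "use_idf": [False, True],
    "norm": ["l1", "l2", None],
})

# One vectorizer per parameter combination.
vectorizers = [TfidfVectorizer(**params) for params in tfidf_grid]

# Each vectorizer would then be fit on the training notes, e.g.:
#   X_train = vectorizers[0].fit_transform(train_texts)
# where train_texts is a hypothetical list of documents.
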
Logistic regression

Function: LogisticRegression()

penalty: 'none'

class_weight: 'balanced'

max_iter: 1e4

solver: 'saga'

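A minimal sketch of the logistic regression configuration above in scikit-learn; X_train and y_train are hypothetical placeholders for the TF-IDF features and labels, so the fit call is commented out:

from sklearn.linear_model import LogisticRegression

# Unpenalized logistic regression with the settings listed above.
# Note: penalty='none' is the spelling accepted by scikit-learn < 1.2;
# newer releases expect penalty=None.
log_reg = LogisticRegression(
    penalty="none",
    class_weight="balanced",
    max_iter=int(1e4),
    solver="saga",
)
# log_reg.fit(X_train, y_train)
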
Support vector machine

Function: SVC()

kernel: 'linear'

class_weight: 'balanced'

max_iter: 1e4

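A minimal sketch of the support vector machine configuration above in scikit-learn; again, X_train and y_train are hypothetical:

from sklearn.svm import SVC

# Linear-kernel SVM with balanced class weights and the iteration cap above.
svm = SVC(
    kernel="linear",
    class_weight="balanced",
    max_iter=int(1e4),
)
# svm.fit(X_train, y_train)
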
Random forest

Function: RandomForestClassifier()

class_weight: 'balanced'

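A minimal sketch of the random forest configuration above; all parameters other than class_weight are left at scikit-learn defaults, as in the table:

from sklearn.ensemble import RandomForestClassifier

# Random forest with balanced class weights; other parameters at defaults.
rf = RandomForestClassifier(class_weight="balanced")
# rf.fit(X_train, y_train)  # hypothetical training data
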
Adaptive boosting

Function: AdaBoostClassifier()
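
A minimal sketch of the adaptive boosting configuration above, using scikit-learn defaults as listed in the table:

from sklearn.ensemble import AdaBoostClassifier

# AdaBoost with scikit-learn default parameters.
ada = AdaBoostClassifier()
# ada.fit(X_train, y_train)  # hypothetical training data
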
Neural networks

Function: Sequential()

Layers:

Dense(units = number of variables, activation = 'relu')

Dropout(rate = 0.2)

Dense(units = 1, activation = 'sigmoid')

optimizer: 'adam'

loss: 'binary_crossentropy'

metrics: 'binary_accuracy'

epochs: 1000

callbacks: EarlyStopping(monitor = 'val_loss', min_delta = 0.01)

output threshold: [0.01, 0.02, …, 0.98, 0.99]
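
A minimal sketch of the neural network configuration above in Keras; the input width (n_features), the training and validation arrays, and the use of validation_data are assumptions added for illustration, so the fit and predict calls are commented out:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

n_features = 5000  # hypothetical: the number of TF-IDF variables

# Architecture listed above: one hidden ReLU layer as wide as the input,
# 20% dropout, and a single sigmoid output unit.
model = Sequential([
    Dense(units=n_features, activation="relu", input_shape=(n_features,)),
    Dropout(rate=0.2),
    Dense(units=1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])

early_stop = EarlyStopping(monitor="val_loss", min_delta=0.01)

# X_train, y_train, X_val, y_val are hypothetical dense arrays of TF-IDF
# features and binary labels.
# model.fit(X_train, y_train,
#           validation_data=(X_val, y_val),
#           epochs=1000,
#           callbacks=[early_stop])

# The sigmoid output is then binarized over the listed grid of cut-offs.
thresholds = np.arange(0.01, 1.00, 0.01)
# y_prob = model.predict(X_val).ravel()
# predictions = {t: (y_prob >= t).astype(int) for t in thresholds}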