Overview of the methods used for data extraction, training, and testing. Data were extracted from the NIS database using a tailored Python script to select patients who underwent TAVR. The cohort was then split into a training set (70% of the data, n=7,615) and a test set (30% of the data, n=3,268). A feature ranking method was applied to the training set to determine the top 5, 10, 15, 20, and 30 variables. Because of class imbalance, the training data were randomly oversampled before the ML algorithms (i.e., LR, ANN, NB, RF) were trained to develop the models. Each set of variables (including the full set of 43 variables) was used independently to train each of the ML algorithms. The developed models were validated on the test set by computing performance metrics, with a focus on the AUC.
NIS, National Inpatient Sample; TAVR, Transcatheter Aortic Valve Replacement; ML, Machine Learning; ANN, Artificial Neural Network; LR, Logistic Regression; NB, Naïve Bayes; RF, Random Forest; AUC, Area Under the Curve.
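
The split, oversampling, and AUC steps in this workflow can be illustrated with a minimal Python sketch. It is not the authors' script: the function names (`split_train_test`, `random_oversample`, `auc`), the synthetic labels, and the seeds are assumptions made for illustration, and the feature ranking and model training steps are omitted. Only the standard library is used.

```python
import random


def split_train_test(labels, test_fraction=0.30, seed=0):
    """Shuffle row indices and split them into train and test index lists."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    return indices[n_test:], indices[:n_test]


def random_oversample(train_idx, labels, seed=0):
    """Resample minority-class rows with replacement until both classes
    have the same count. Assumes binary 0/1 labels, both present."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs in which the positive
    case receives the higher score (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic, imbalanced outcome labels (hypothetical: 5% positive).
labels = [1 if i % 20 == 0 else 0 for i in range(1000)]

# 70/30 split, then oversample the training indices only.
train_idx, test_idx = split_train_test(labels, test_fraction=0.30, seed=0)
balanced_idx = random_oversample(train_idx, labels, seed=0)

# AUC on a tiny hand-checkable example: 3 of 4 pairs are ranked correctly.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

Oversampling is applied after the split and only to the training indices, so the test set keeps the original class distribution and no duplicated row can appear in both sets.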