Assessing treatment switch among patients with multiple sclerosis: A machine learning approach

Jieni Li; Yinan Huang; George J Hutton; Rajender R Aparasu

doi:10.1016/j.rcsop.2023.100307

2023 Jul 10;11:100307. doi: 10.1016/j.rcsop.2023.100307

PMCID: PMC10405092 PMID: 37554927

Abstract

Background

Patients with multiple sclerosis (MS) frequently switch their Disease-Modifying Agents (DMA) for effectiveness and safety concerns. This study aimed to develop and compare the random forest (RF) machine learning (ML) model with the logistic regression (LR) model for predicting DMA switching among MS patients.

Methods

This retrospective longitudinal study used the TriNetX data from a federated electronic medical records (EMR) network. Between September 2010 and May 2017, adult (aged ≥18 years) MS patients with ≥1 DMA prescription were identified, and the earliest DMA date was assigned as the index date. Patients prescribed any DMA different from their index DMA were considered to have switched treatment. The RF and LR models were built with 72 baseline characteristics and trained with 70% of the randomly split data after up-sampling. Area under the curve (AUC), accuracy, recall, G-measure, and F-1 score were used to evaluate the model performance.

Results

In this study, 7258 MS patients with ≥1 DMA were identified. Within two years, 16% of MS patients switched to a different DMA. The RF model obtained significantly better discrimination than the LR model (AUC = 0.65 vs. 0.63, p < 0.0001); however, the RF model had a similar predictive performance to the LR model with respect to F- and G-measures (RF: 72% and 73% vs. LR: 72% and 73%, respectively). The most influential features identified from the RF model were age, type of index medication, and year of index.

Conclusions

Compared to the LR model, RF performed better in predicting DMA switch in MS patients based on AUC measures; however, judged by F- and G-measures, the RF model performed similarly to LR. Further research is needed to understand the role of ML techniques in predicting treatment outcomes for the decision-making process to achieve optimal treatment goals.

Keywords: Multiple sclerosis, Treatment switching, Real-world evidence, Machine learning

1. Introduction

Multiple sclerosis (MS) is an inflammatory disease of the central nervous system that results in demyelination and axonal degradation.¹ MS is mainly diagnosed in women of reproductive age and affects about 0.9 million individuals in the United States (US).2, 3, 4 Diagnosis of MS is associated with functional impairments, lower health-related quality of life (HRQoL), and a significant economic burden.5, 6, 7 Meanwhile, research has found that over 65% of all healthcare-related costs among MS patients were attributed to prescription drugs.⁶ There is no specific autoimmune target for MS management; therefore, none of the medications for MS are curative.⁸ Nevertheless, the introduction of disease-modifying agents (DMA) has changed the management landscape by reducing relapse rates, improving patients' HRQoL, and delaying the development of disability.9, 10, 11

In recent years, treatment options for MS patients have evolved quickly with the introduction of several DMAs.¹² Specifically, the US Food and Drug Administration (FDA) has authorized 21 DMAs to treat MS.¹⁰ With many treatment options for MS, deciding on the initial DMA therapy plan can be challenging.¹³ More importantly, each medication failure might result in cumulative neurologic damage, emphasizing the importance of an appropriate therapy plan for newly diagnosed individuals.¹⁴ If the initial DMA is ineffective for MS patients, other DMAs are often prescribed. The findings from the 2019 Milliman MS report suggested that 25.8% of patients switched to another DMA within 12 months after their initial DMA.¹⁵ Several considerations, including tolerability, safety, and effectiveness, influence the decision to change the current DMA for MS patients.¹¹ Moreover, previous studies based on traditional regression models identified that age, sex, year of index, relapse, and MS severity were associated with treatment switching among MS patients.16, 17, 18

Machine learning (ML) algorithms are gaining traction as viable alternatives to traditional regression approaches for prediction and categorization. ML models are increasingly being used in healthcare to generate evidence for improving the quality of care.19, 20, 21, 22 Moreover, ML-derived individual treatment options have been shown to generate better outcomes in various diseases.23, 24, 25 However, applications of ML models in MS have been limited to predicting the onset, subtypes, and progression of MS with clinical data.26, 27, 28, 29 To the best of our knowledge, ML algorithms have not been applied to examine treatment-related considerations in MS. Identifying the subpopulation at high risk for treatment switching helps to guide clinical decisions, which is critical for improving health outcomes in MS patients. An ML prediction model has the potential to improve the accuracy of predicting treatment switching in MS. Hence, this study aimed to develop and validate an ML model to predict DMA switching and to compare it with a traditional regression model for treatment switching among patients with MS.

2. Methods

2.1. Study design and data sources

This retrospective cohort study used TriNetX, a federated electronic medical record (EMR) network, with data from 2009 to 2019 to evaluate treatment switching in MS. TriNetX data consisted of electronic health records for over 84 million patients from 55 healthcare organizations (HCO).³⁰ In the TriNetX data, both inpatient and outpatient records from the HCOs were included. Overall, the de-identified information on patients' demographic characteristics, diagnoses, procedures, lab tests, vital signs, and prescriptions was combined to assess the utilization of healthcare services. This study was deemed exempt under Category 4 by the Institutional Review Board at the University of Houston.

2.2. Study population

Patients with at least one injectable (including glatiramer acetate and interferon beta) or oral DMA prescription (including fingolimod, dimethyl fumarate, and teriflunomide) were identified during the study period (September 2010–May 2017). Considering the data availability, the DMAs approved before 2017 were considered in this study. Patients' DMA prescription records were identified using RxNorm (produced by the National Library of Medicine) or Healthcare Common Procedure Coding System (HCPCS) codes.³¹ The date of the earliest DMA prescription was identified as the index date. Next, patients were selected if they had at least one MS diagnosis within 12 months before or after the index date (identified by the International Classification of Diseases Ninth/Tenth Revision Clinical Modification [ICD-9/10-CM] codes: 340 or G35).³² Patients with any DMA different from their index drug within 24 months from the index date were considered as treatment switching, and the rest of the patients were flagged as non-switching.

This study did not include patients <18 years old at the index date. Also, patients who were prescribed any DMAs during the 12-month washout period were excluded to eliminate prevalent-user bias. Moreover, to ensure that patients were continuously involved within the system during the study period, patients with no outpatient visit and no prescription visit within either the 12 months before or the 24 months after the index date were excluded. Due to their higher rate of relapse reduction and lower disability progression, infusion DMAs were considered high-efficacy treatments.³³,³⁴ Accordingly, patients with any infusion DMA prescriptions (alemtuzumab, mitoxantrone, natalizumab, ocrelizumab) from 12 months before the index date to the earlier of the DMA switching date or 24 months after the index date were excluded.

2.3. Conceptual framework and measurement of variables

A total of 72 variables were selected for training the ML model based on a review of existing literature that identified factors associated with treatment switching in MS.¹³,¹⁸,³¹,³⁵ These factors were further conceptualized based on the Andersen Behavioral Model (ABM) of Health Services Use, which provides a strong conceptual foundation for predicting treatment switching.³⁶ According to the ABM, healthcare utilization depends on predisposing, enabling, and need factors. Predisposing factors that explain patients' predisposition to use healthcare services include demographic and socioeconomic factors. Enabling factors affect an individual's ability to access healthcare services. Finally, need factors reflect perceived and actual health status.

The following factors were included in this study for model training: (1) Predisposing factors: age, sex, race, and ethnicity; (2) Enabling factors: time-period (year of the index date); and (3) Need factors: Elixhauser Comorbidities, MS-related symptoms (speech symptoms, brainstem symptoms, general symptoms, cerebellar symptoms, difficulty in walking, pyramidal symptoms, sensory symptoms, bladder/bowel symptoms or sexual dysfunction, visual symptoms, and cerebral/cognitive symptoms), MS symptomatic medication use (fatigue medications, spasticity drugs, antidepressants, analgesics, bladder dysfunction medications, anticonvulsant medications, and cognition medications), relapse drugs use, and healthcare utilization (emergency room visits, inpatient visits, outpatient visits, and use of magnetic resonance imaging [MRI] procedures).³⁷ To ensure that differences in the approval dates of DMAs would not affect the results, the year of the index date was included in the analysis. All of the above factors were measured in the 12 months before the index date to capture patients' clinically relevant characteristics.

2.4. Machine learning methods

This study used a 70% random sample for training, in which the model was trained to predict treatment switching. The remaining 30% of the data was used for testing to evaluate the model performance. The non-switching class vastly outnumbered the switching class in the training dataset, with a ratio of switching to non-switching of about 1 to 6, resulting in a class imbalance problem. Imbalanced data might bias the predictions of machine learning classifiers toward the non-switching class.³⁸,³⁹ Thus, an up-sampling method that duplicated random records from the minority class (switching cohort) was applied to generate balanced cohorts, helping the classifiers handle both the switching and non-switching classes.⁴⁰ Another advantage of the up-sampling method is that it can outperform the down-sampling method, which reduces the sample size and increases the risk of overfitting.⁴¹ The up-sampling method was implemented using the R package “caret”.³⁵ All of the analyses were conducted in R version 3.6.0 (R core team, Vienna, Austria).
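The split-then-balance procedure described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, not in R with “caret” as in the study, and the function names, the `switch` label key, and the toy cohort are all assumptions made for illustration. The sketch up-samples only the training portion, so the testing portion keeps its original class balance.

```python
import random

def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def up_sample(train, label_key="switch", seed=0):
    """Duplicate random minority-class records until both classes have equal size."""
    rng = random.Random(seed)
    positives = [r for r in train if r[label_key] == 1]
    negatives = [r for r in train if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    # Draw minority records with replacement to fill the gap to the majority size
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra

# Hypothetical toy cohort with a 16% switching rate (not the study data)
cohort = [{"id": i, "switch": 1 if i % 100 < 16 else 0} for i in range(1000)]
train, test = split_train_test(cohort)
balanced = up_sample(train)
n_switch = sum(r["switch"] for r in balanced)
print(len(train), len(test), len(balanced), n_switch)
```

Because the duplicated records are exact copies, the balanced training set contains no new information; it only changes how much weight the classifier gives to the switching class.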

2.4.1. Random forest (RF) model

The RF model is one of the most popular predictive approaches due to its higher prediction accuracy than other classification and regression models.⁴² The tree-based ML method applies the decision tree framework to segregate values of predictors in a series of binary splits. The RF model is a supervised tree-based ensemble learning method that creates many decision trees inferred from bootstrap samples using bagging approaches.⁴³ Further, with random feature selection, the ensemble's predictions are aggregated (majority vote) to make the final prediction.⁴³ Thus, the RF model is more resistant to over-fitting than the classification and regression tree model.⁴⁴,⁴⁵ Compared to traditional regression models, the critical advantage of the RF algorithm is that it is very flexible and can evaluate many predictor variables without being constrained by model assumptions, such as the absence of multicollinearity.⁴⁶ This study implemented the RF model with the R package “randomForest”.⁴⁷
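The three ingredients named above (bootstrap samples, random feature selection, and a majority vote) can be sketched in a deliberately simplified form. This is not the “randomForest” implementation: each "tree" here is a single split (a stump) on one randomly chosen feature, whereas a real random forest grows deeper trees and draws candidate features at every split. All names and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, feature):
    """Fit a one-split 'tree': threshold one feature at its mean value."""
    threshold = sum(x[feature] for x, _ in rows) / len(rows)
    above = [y for x, y in rows if x[feature] > threshold]
    below = [y for x, y in rows if x[feature] <= threshold]
    # Predict the most common label on each side (default to 0 if a side is empty)
    label_above = Counter(above).most_common(1)[0][0] if above else 0
    label_below = Counter(below).most_common(1)[0][0] if below else 0
    return feature, threshold, label_above, label_below

def predict_stump(stump, x):
    feature, threshold, label_above, label_below = stump
    return label_above if x[feature] > threshold else label_below

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample (bagging)
        feature = rng.randrange(n_features)         # random feature selection
        forest.append(fit_stump(sample, feature))
    return forest

def predict_forest(forest, x):
    votes = [predict_stump(stump, x) for stump in forest]
    return Counter(votes).most_common(1)[0][0]      # majority vote

# Hypothetical toy data: (features, label), both features carry signal
rows = [((0.8, 0.7), 1)] * 20 + [((0.2, 0.3), 0)] * 20
forest = fit_forest(rows)
print(predict_forest(forest, (0.9, 0.8)), predict_forest(forest, (0.1, 0.2)))
```

Averaging many such weak, decorrelated learners is what makes the ensemble less sensitive to the quirks of any single bootstrap sample.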

There were three critical hyperparameters in the RF model that this study focused on: (1) the number of trees (ntree), which is the number of trees that the algorithm creates before taking the majority vote or averaging the predictions; (2) the number of variables randomly sampled at each split (mtry), which could influence the model's error rates, stability, and the accuracy of single trees; and (3) the maximal number of leaf nodes for each tree (maxdepth), which affects the balance and complexity of trees.⁴⁸ The optimal values for the above parameters were tuned using 10-fold cross-validation and then used to decide the final model. Since a large number of trees are built, the RF model does not require a substantial tuning process; in this study, a model consisting of 200 distinct trees (ntree) was generated, large enough for the out-of-bag (OOB) error to stabilize.⁴⁹ The default of $\sqrt{n}$ (n means the total number of predictors at each node) was used for the selected features at each node (mtry = 16).⁴³ Moreover, the maximal number of leaf nodes for each tree was tuned in a grid ranging between 10 and 200, and the optimal number of terminal nodes was 50. The most influential predictors were identified based on the mean decrease in accuracy and the mean decrease in Gini of the RF model.⁵⁰
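The tuning loop described above (a grid of candidate values, each scored by 10-fold cross-validation) follows a generic pattern that can be sketched as below. The grid step of 10 and the `toy_fit_and_score` function are assumptions: the latter is a hypothetical stand-in for "fit a model with this leaf-node cap on the training folds and score it on the held-out fold", and its peak at 50 is built in only to keep the example deterministic, not derived from any data.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Assign shuffled row indices to k folds of near-equal size."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_score(n_rows, fit_and_score, param, k=10):
    """Average the held-out score over k folds for one hyperparameter value."""
    folds = k_fold_indices(n_rows, k)
    scores = []
    for i, valid_idx in enumerate(folds):
        # Train on every fold except the i-th, validate on the i-th
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, valid_idx, param))
    return sum(scores) / k

def grid_search(n_rows, fit_and_score, grid, k=10):
    """Return the grid value with the best mean cross-validated score."""
    return max(grid, key=lambda p: cross_validated_score(n_rows, fit_and_score, p, k))

def toy_fit_and_score(train_idx, valid_idx, max_leaf_nodes):
    """Hypothetical placeholder score; a real version would fit and evaluate a model."""
    return -abs(max_leaf_nodes - 50)

best = grid_search(100, toy_fit_and_score, grid=range(10, 201, 10))
print(best)
```

In practice the placeholder would be replaced by a function that trains on `train_idx` and returns a validation metric such as accuracy or AUC on `valid_idx`.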

2.4.2. Logistic regression (LR) model

The LR model is a traditional regression method for classification, primarily for dichotomous outcomes. The assumptions of the LR model (e.g., linearity assumption, multicollinearity, and homoscedasticity) are difficult to satisfy because the underlying data-generating model is unknown. For the LR model, 10-fold cross-validation was applied to select the value of the penalty parameter (λ) that minimized the misclassification error while keeping the model deviance at a minimum. The important predictors for switching were identified based on statistical significance. In this study, the LR model was implemented with the “glmnet” package.⁵¹

2.5. Statistical analysis

Using the testing data, this study assessed the LR and the RF models with the area under the receiver operating characteristic curve (AUROC).⁵² The receiver operating characteristic (ROC) curve plots the sensitivity (true positive rate) against 1 − specificity (false positive rate) for a variety of thresholds. The areas under the curve (AUC) for the LR and the RF classifiers were compared using the 2-sided DeLong test.⁵³ In addition, considering the imbalanced data, the F-measure and G-measure were used to evaluate these models on the testing data.⁵⁴ The F-measure has been widely employed in ML applications, especially for binary outcomes.⁵⁵ The F-measure is the harmonic mean of precision and recall, which integrates both into a single measurement.⁵⁶ The G-measure ( $\sqrt{Precision \times Recall}$ ) aims to evaluate the model performance in predicting the majority and minority groups.⁵⁷,⁵⁸ The G-measure also aims to evaluate model performance while guarding against overfitting of the negative class and underfitting of the positive class.⁵⁹ Accordingly, both measures could provide additional justification for imbalanced data. The F- and G-measures for the RF classifier and the LR model were compared based on absolute difference.⁵⁷,⁵⁸ Other performance measurements (accuracy, specificity, precision, and recall) for the RF and the LR models were also reported based on a confusion matrix using the caret package.⁶⁰
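To make the AUC concrete, the sketch below computes it in two equivalent ways on hypothetical scores and labels: as the share of (positive, negative) pairs that the model ranks correctly, and as the trapezoidal area under the ROC points of sensitivity against the false positive rate. The function names and the toy data are illustrative assumptions, and the DeLong test used in the study to compare two AUCs is not implemented here.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_points(scores, labels):
    """(false positive rate, sensitivity) at every distinct score threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_area(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical predicted probabilities and true labels (1 = positive class)
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(round(auc(scores, labels), 3))                         # 0.889
print(round(trapezoid_area(roc_points(scores, labels)), 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading reported test-set values.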

Furthermore, this study compared switching and non-switching cohorts for their baseline characteristics. Additionally, to ensure the power of validation, each baseline characteristic was compared between the training and testing data. All analyses were conducted using server SAS statistical software version 9.4 (SAS Institute Inc., Cary, NC, USA) at a 0.05 level of significance.

3. Results

3.1. Cohort characteristics

This study identified 7258 MS patients who were prescribed ≥1 DMA between September 2010 and May 2017 after applying the inclusion/exclusion criteria (see Fig. 1). Overall, 1161 (16%) of MS patients switched to a different DMA within two years. From this study sample, 5079 (69.98%) MS patients were selected for the training data, and 2179 (30.02%) MS patients were assigned to the testing data. Additionally, after applying the up-sampling method, 8566 MS patients (4283 patients in each cohort) were used to build the RF and LR models.

3.2. Study population characteristics

Patients who switched to different DMAs were younger than those who did not (mean age: 42.83 vs. 47.09, p < 0.0001, Table 1). A higher percentage of patients from the non-switching cohort had oral DMA as their index medication (29.63% vs. 37.56%, p < 0.0001). The most prevalent comorbidities among patients who switched were connective tissue diseases (27.56%), mood disorders (20.50%), and depression (20.07%). Patients who switched had a higher proportion of brainstem symptoms (12.06% vs. 9.89%, p = 0.0255), walking difficulties (11.20% vs. 7.77%, p = 0.0001), sensory symptoms (20.24% vs. 16.43%, p = 0.0016), and visual symptoms (12.58% vs. 10.04%, p = 0.0096), as well as being more likely to be prescribed at least one MS symptomatic medication (64.34% vs. 59.46%, p = 0.0018), and corticosteroids (23.86% vs. 18.09%, p < 0.0001). Patients from the switching cohort had more outpatient visits (10.07 vs. 8.98, p = 0.0051), and more of them had at least one MRI procedure during the baseline period (30.84% vs. 25.68%, p = 0.0003). No significant differences were observed in patients' characteristics between the testing and training datasets (Supplemental Table 1).

Table 1.

Patients' baseline characteristics.

	Patients with treatment switching		Patients without treatment switching		p-value
	1161 (16.00%)		6097 (84.00%)
	N	%	N	%
Predisposing Factors
Age Category
Mean, SD	42.83	11.90	47.09	12.12	<0.0001
Younger adults (18–44 years)	646	55.64%	2602	42.68%	<0.0001
Middle-aged adults (45–64 years)	471	40.57%	3001	49.22%	<0.0001
Old adults (≥65 years)	44	3.79%	494	8.10%	<0.0001
Sex
Female	883	76.06%	4634	76.00%	0.9705
Male	278	23.94%	1463	24.00%	0.9705
Race
African American	123	10.59%	573	9.40%	0.2045
Whites	955	82.26%	5061	83.01%	0.5333
Others	83	7.15%	463	7.59%	0.5984
Ethnicity
Hispanic or Latino	50	4.31%	219	3.59%	0.2374
Not Hispanic or Latino	1031	88.80%	5455	89.47%	0.499
Unknown	80	6.89%	423	6.94%	0.9537
Enabling Factor
Time-period (of index date)
2010–11	148	12.75%	1108	18.17%	<0.0001
2012–13	391	33.68%	1576	25.85%	<0.0001
2014–15	461	39.71%	2364	38.77%	0.5497
2016–17	161	13.87%	1049	17.21%	0.0052
Index Drug
Injectable DMA	817	70.37%	3807	62.44%	<0.0001
Interferon beta 1a	253	21.79%	1154	18.93%	0.0237
Interferon beta 1b	55	4.74%	235	3.85%	0.1592
Peginterferon beta	18	1.55%	41	0.67%	0.0023
Glatiramer acetate	491	42.29%	2377	38.99%	0.0348
Oral DMA	344	29.63%	2290	37.56%	<0.0001
Fingolimod	100	8.61%	796	13.06%	<0.0001
Dimethyl fumarate	202	17.40%	1170	19.19%	0.1531
Teriflunomide	42	3.62%	324	5.31%	0.0155
Need Factors
Elixhauser Comorbidities
Congestive heart failure	7	0.60%	43	0.71%	0.6992
Cardiac arrhythmia	29	2.50%	218	3.58%	0.0634
Valvular disease	12	1.03%	74	1.21%	0.6032
Pulmonary circulation disorders	3	0.26%	36	0.59%	0.1561
Peripheral vascular disorders	10	0.86%	69	1.13%	0.4158
Hypertension uncomplicated	121	10.42%	781	12.81%	0.0238
Hypertension complicated	8	0.69%	40	0.66%	0.8988
Paralysis	37	3.19%	228	3.74%	0.3575
Other neurological disorders	158	13.61%	705	11.56%	0.0484
Chronic pulmonary disease	59	5.08%	306	5.02%	0.9283
Diabetes uncomplicated	50	4.31%	265	4.35%	0.9514
Diabetes complicated	20	1.72%	84	1.38%	0.3647
Hypothyroidism	68	5.86%	364	5.97%	0.8813
Renal failure	7	0.60%	50	0.82%	0.4423
Peptic ulcer disease, excluding bleeding	7	0.60%	62	1.02%	0.1828
AIDS/HIV	4	0.34%	12	0.20%	0.3253
Lymphoma	0	0.00%	3	0.05%	0.4497
Metastatic cancer	2	0.17%	10	0.16%	0.9494
Liver disease	0	0.00%	20	0.33%	0.0507
Solid tumor without metastasis	14	1.21%	98	1.61%	0.309
Rheumatoid arthritis/collagen related disorders	26	2.24%	127	2.08%	0.7338
Coagulopathy	4	0.34%	62	1.02%	0.027
Obesity	49	4.22%	275	4.51%	0.6611
Weight loss	5	0.43%	71	1.16%	0.0244
Fluid and electrolyte disorders	41	3.53%	234	3.84%	0.6161
Blood loss anemia	0	0.00%	14	0.23%	0.1022
Deficiency anemia	14	1.21%	98	1.61%	0.309
Alcohol abuse	8	0.69%	53	0.87%	0.5376
Drug abuse	7	0.60%	86	1.41%	0.0249
Psychoses	9	0.78%	76	1.25%	0.1713
Depression	233	20.07%	1056	17.32%	0.0247
Elixhauser Comorbidities Score
Mean, SD	0.22	4.01	0.47	4.67	0.0577
≤0	926	79.76%	4902	80.40%	0.6146
1–4	47	4.05%	277	4.54%	0.4541
≥5	188	16.19%	918	15.06%	0.3234
Other Comorbidities
Cancer	30	2.58%	202	3.31%	0.1955
Metabolic Disorder
Thyroid disorders	92	7.92%	439	7.20%	0.3853
Nutritional deficiencies	140	12.06%	840	13.78%	0.1163
Lipid disorders	92	7.92%	568	9.32%	0.1306
Mental Illness
Anxiety	116	9.99%	582	9.55%	0.6368
Mood disorders	238	20.50%	1084	17.78%	0.0277
Other Neurological disorders
Paralysis	35	3.01%	227	3.72%	0.2356
Epilepsy, convulsions	37	3.19%	195	3.20%	0.9839
Headache; including migraine	197	16.97%	805	13.20%	0.0007
Eye disorders	133	11.46%	528	8.66%	0.0024
Ear and sense organ disorders	36	3.10%	183	3.00%	0.8561
Circulatory/vascular disorders
Heart diseases	27	2.33%	220	3.61%	0.0271
Cerebrovascular disease	18	1.55%	117	1.92%	0.3942
Respiratory disorders
Chronic obstructive pulmonary disease and bronchiectasis	3	0.26%	53	0.87%	0.0292
Genitourinary disorders
Diseases of the urinary system	166	14.30%	896	14.70%	0.7253
Diseases of skin and subcutaneous tissue	24	2.07%	125	2.05%	0.9701
Musculoskeletal disorders
Non-traumatic joint disorders	121	10.42%	588	9.64%	0.4132
Spondylosis; intervertebral disc disorders; other back problems	194	16.71%	1076	17.65%	0.4406
Other connective tissue diseases (including fibromyalgia)	320	27.56%	1575	25.83%	0.2187
Ill-defined conditions
Nausea, vomiting/abdominal	49	4.22%	234	3.84%	0.5371
MS-Related Symptoms
Bladder/bowel symptoms or sexual dysfunction	182	15.68%	989	16.22%	0.6436
Brainstem symptoms	140	12.06%	603	9.89%	0.0255
Cerebellar symptoms	6	0.52%	59	0.97%	0.135
Cerebral/Cognitive symptoms	133	11.46%	600	9.84%	0.0942
Difficulty walking	130	11.20%	474	7.77%	0.0001
General symptoms	232	19.98%	1252	20.53%	0.6691
Pyramidal symptoms	95	8.18%	522	8.56%	0.6713
Sensory symptoms	235	20.24%	1002	16.43%	0.0016
Speech symptoms	16	1.38%	112	1.84%	0.2763
Visual symptoms	146	12.58%	612	10.04%	0.0096
None of the above	506	43.58%	2883	47.29%	0.0205
MS Severity Score
Mean, SD	0.93	1.55	0.85	1.51	0.09
MS Symptomatic Medication Use
Analgesics	382	32.90%	1811	29.70%	0.0296
Antidepressants	342	29.46%	1580	25.91%	0.0122
Bladder dysfunction medications	92	7.92%	554	9.09%	0.2024
Cognition medications	10	0.86%	41	0.67%	0.4801
Fatigue medications	160	13.78%	771	12.65%	0.2889
Other anticonvulsant medications	135	11.63%	570	9.35%	0.0162
Spasticity drugs	388	33.42%	1922	31.52%	0.2037
None of the above	414	35.66%	2472	40.54%	0.0018
Relapse Drugs
Any corticosteroids	277	23.86%	1103	18.09%	<0.0001
All-Cause Healthcare Utilization
Any emergency room visits	159	13.70%	762	12.50%	0.2614
# of emergency room visits	0.30	1.10	0.24	0.86	0.0786
Any inpatient visits	101	8.70%	429	7.04%	0.0459
# of inpatient visits	0.16	0.82	0.12	0.63	0.1053
Length of stays	0.61	3.06	0.52	3.51	0.4005
Any outpatient visits	1161	100.00%	6097	100.00%
# of outpatient visits	10.07	12.01	8.98	12.67	0.0051
MS-related Healthcare Utilization
Any emergency room visits	95	8.18%	470	7.71%	0.5807
# of emergency room visits	0.14	0.68	0.12	0.51	0.2711
Any inpatient visits	79	6.80%	358	5.87%	0.2207
# of inpatient visits	0.11	0.65	0.08	0.42	0.2103
Length of stays	0.47	2.69	0.41	2.97	0.4662
Any outpatient visits	1040	89.58%	5363	87.96%	0.1173
# of outpatient visits	4.45	5.20	4.01	5.08	0.0077
Any MRI procedures	358	30.84%	1566	25.68%	0.0003
# of MRI procedures	0.44	0.76	0.34	0.68	<0.0001
Vital Sign
BMI
Underweight, <18	379	32.64%	1993	32.69%	0.9767
Healthy, 18–25	11	0.95%	62	1.02%	0.828
Overweight, 26–29	267	23.00%	1490	24.44%	0.2935
Obese, ≥30	217	18.69%	1076	17.65%	0.3947
Not recorded	287	24.72%	1476	24.21%	0.7096
Blood Pressure Diastolic
Normal	539	46.43%	3159	51.81%	0.0008
Abnormal	361	31.09%	1692	27.75%	0.0205
Not recorded	261	22.48%	1246	20.44%	0.1155
Blood Pressure Systolic
Normal	539	46.43%	3159	51.81%	0.0008
Abnormal	227	19.55%	1043	17.11%	0.0444
Not recorded	395	34.02%	1895	31.08%	0.0481
Lab Test
Hemoglobin [Mass/volume]
Normal	576	49.61%	2833	46.47%	0.0489
Abnormal	107	9.22%	599	9.82%	0.5215
Not recorded	478	41.17%	2665	43.71%	0.1096
Leukocytes [#/volume]
Normal	546	47.03%	2583	42.37%	0.0033
Abnormal	114	9.82%	495	8.12%	0.0555
Not recorded	501	43.15%	3019	49.52%	<0.0001
Hematocrit [Volume Fraction]
Normal	546	47.03%	2490	40.84%	<0.0001
Abnormal	113	9.73%	549	9.00%	0.4294
Not recorded	502	43.24%	3058	50.16%	<0.0001
Platelets [#/volume]
Normal	633	54.52%	2882	47.27%	<0.0001
Abnormal	29	2.50%	143	2.35%	0.7543
Not recorded	499	42.98%	3072	50.39%	<0.0001
Creatinine [Mass/volume]
Normal	511	44.01%	2606	42.74%	0.4225
Abnormal	61	5.25%	372	6.10%	0.2639
Not recorded	589	50.73%	3119	51.16%	0.791
Monocytes [#/volume]
Normal	592	50.99%	2874	47.14%	0.016
Abnormal	15	1.29%	42	0.69%	0.0329
Not recorded	554	47.72%	3181	52.17%	0.0054

SD: Standard deviation.

DMA: Disease Modifying Agent.

AIDS/ HIV: Acquired Immunodeficiency Syndrome/ Human Immunodeficiency Virus.

3.3. Important predictors of switching

From the RF model, the top five most influential predictors identified in this study were: age, type of index medication, year of index, number of outpatient visits, and Body mass index (BMI) (See Fig. 2). Meanwhile, age, year of index, Elixhauser Comorbidities Score, type of index medication, BMI, comorbidities conditions (cerebral/cognitive symptoms, eye disorders, nutritional deficiencies, hypertension, other neurological disorders, diabetes, thyroid disorders, heart diseases, and spondylosis), lab test (hemoglobin level, platelet count, and monocyte count), and healthcare utilization (inpatient visits, MS-related inpatient, and MS-related outpatient) were statistically significant factors associated with the treatment switching in the LR model.

Fig. 2 — Top 10 Most Influential Predictors from the Random Forest Model.

3.4. Comparison of random forest model and logistic regression

The RF model improved the AUC on the testing data compared to the LR model (0.65 vs. 0.63, p < 0.0001). However, the RF model had a G-measure similar to that of the LR model (0.73 vs. 0.73) and an almost equal F-measure (0.72 vs. 0.72). Other performance measurements varied between the models: for RF, the accuracy was 0.61, specificity was 0.63, precision was 0.89, and recall was 0.60; for LR, the accuracy was 0.63, specificity was 0.57, precision was 0.87, and recall was 0.61. The AUCs observed in the training data were comparable to the AUCs observed in the testing data in both models. Details of all performance measures can be found in Table 2.
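As a quick arithmetic check, the F- and G-measures can be recomputed from the precision and recall values reported in this section, using the definitions given in Section 2.5 (harmonic mean and geometric mean of precision and recall). The sketch below is illustrative only; the function names are assumptions, and the inputs are the rounded values from the text, so small rounding differences from the authors' unrounded results are possible.

```python
from math import sqrt

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def g_measure(precision, recall):
    """Geometric mean of precision and recall."""
    return sqrt(precision * recall)

# (precision, recall) as reported for each model in this section
reported = {"RF": (0.89, 0.60), "LR": (0.87, 0.61)}
for model, (precision, recall) in reported.items():
    print(model, round(f_measure(precision, recall), 2), round(g_measure(precision, recall), 2))
# RF 0.72 0.73
# LR 0.72 0.73
```

Both models round to the same values because RF's slightly higher precision is offset by its slightly lower recall.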

Table 2.

Model Performance: Random Forest vs. Logistic Regression.

	Test AUC	95% CI	Accuracy	Specificity	Precision	Recall (Sensitivity)	F-1	G-measure	p-value	Train AUC
Logistic Regression	0.6318	0.6007–0.6229	63%	57%	87%	61%	72%	73%	Reference	0.6748
Random Forest	0.6513	0.6204–0.6823	61%	63%	89%	60%	72%	73%	<0.0001	0.7900

AUC: Area under the ROC (Receiver Operating Characteristics) curve.

CI: Confidence interval.

4. Discussion

ML methods have been broadly applied to predict healthcare utilization (e.g., readmissions and inpatient mortality).23, 24, 25, 53 In MS, ML models have been applied to predict the onset of MS, MS subtypes, and the progression of MS.26, 27, 28, 29 However, studies utilizing the ML approach to evaluate treatment considerations are very limited. To the best of our knowledge, this study is the first attempt to leverage ML methods, namely RF, to predict treatment switching in MS patients. This study developed and compared the RF model with the LR model to predict treatment switching among MS patients using national electronic medical record data. Accurately identifying MS patients at risk of switching to inform targeted intervention may improve MS patients' health outcomes. The literature suggests that ML-derived individual treatment selections have been shown to generate better disease outcomes.23, 24, 25 A previous study found that patients who received the epilepsy treatment suggested by the RF model had considerably higher success rates and lower healthcare utilization than those who received different treatments.²³ Thus, personalized and evidence-based medicine might be achieved using ML approaches for treatment selection and optimizing patient outcomes.

This study adds to existing research regarding the utility and value of ML for treatment considerations. The AUC observed in this study for the RF model was 0.65, an improvement over the AUC of 0.63 observed in the LR model. A previous study involving the RF model found an AUC of 0.70 for epilepsy treatment, but its definition of treatment change, which included any regimen changes or a complete withdrawal, differed from that of our study.²³ However, the AUC indicated an acceptable model performance in distinguishing between switching and non-switching patients. Additionally, in the case of predicting treatment switching among MS patients, it is preferable to have a greater sensitivity than specificity since identifying patients at risk for treatment switching could assist providers in their treatment decision-making process. Therefore, for healthcare providers who aim to identify true positives correctly, a higher sensitivity is preferred, especially when evaluating model performance.

There is an ongoing debate about the comparative performance of traditional regression approaches and ML models.⁶¹ Thus, this study applied several performance metrics to evaluate the relative performance of the RF model vs. the LR model in predicting treatment switching for MS patients. When data are imbalanced, the ROC curve, which plots sensitivity against 1 − specificity, is regarded as a more robust measurement than accuracy.⁶²,⁶³ However, misinterpreting specificity might also make the AUC misleading when evaluating ML models on imbalanced data.⁶⁴ Considering the limitations of the AUC, F- and G-measures were identified as more valuable tools because of their ability to detect the true positives correctly.⁶³ In this study, the RF model achieved fair scores for the F- and G-measures. Although findings showed that the RF model offered a notable improvement over LR based on AUC, RF performed similarly to LR as judged by F- and G-measures.

Predictive factors are valuable in identifying targets to optimize treatment selection and associated health outcomes in patients with MS. Results from the LR model can help identify the factors and the strength of their association with treatment switching among MS patients. Additionally, one of the advantages of applying the RF algorithm is that the model can rank predictors by their relative importance.⁶⁵ Among all the factors, the most critical factor identified by the RF model in this study was age. This finding is consistent with previous studies that recognize the importance of age in the treatment patterns of MS patients.13, 16, 17, 18 A systematic review found that age could impact MS progression, immunosenescence, and DMA selection and engagement.⁶⁶ Also, the progression of the disease is observed to be slower among older MS patients than among young and middle-aged MS patients, which might decrease the need for treatment switching among older patients. Furthermore, adverse drug events, such as severe infections, are more commonly observed among the older population.⁶⁷ Additionally, the risk-benefit consideration of selecting a DMA could be influenced by age-related comorbidities, leading to potential treatment switching among older MS patients.⁶⁶

Moreover, the type of index medication and the year of the index date were the other two most important factors in predicting treatment switching, which could be explained by the introduction of the first oral DMA in 2010.¹⁸ This might also be due to the prescriber's experience and the availability of oral DMAs on the market. Many MS patients switched from injectable to oral DMAs after the launch of oral DMAs.⁴⁹^,⁶⁸ Also, evidence from clinical trials and real-world data showed that oral DMAs were associated with reduced relapse rates, delayed MS progression, and higher treatment adherence rates than injectable DMAs.⁶⁹^,⁷⁰^,⁷¹^,⁷² In addition to demographic and clinical factors, the RF model revealed an association between BMI and treatment switching among MS patients. Several studies have evaluated the influence of nutrition and diet on the pathophysiology of MS.⁷³^,⁷⁴^,⁷⁵^,⁷⁶ For instance, obese MS patients were reported to have higher disease activity than normal-weight or underweight patients.⁷⁶^,⁷⁷ Thus, based on the literature and the findings from this study, consideration of BMI could add value to treatment decisions for MS patients. Consistent with the results from the RF model, the LR model also found that the most important factors noted above were statistically significantly associated with treatment switching among MS patients.

4.1. Strengths and limitations

To our knowledge, this is the first study to apply ML methods to examine DMA treatment switching among MS patients. Using a predictive analytics framework to evaluate treatment switching can help manage MS care better and more efficiently. The factors considered in this study included both demographic and clinical characteristics, which are generally richer in EMR data than in claims data. This study also considered healthcare utilization and laboratory tests when predicting treatment switching. In addition, the study's findings can provide real-world insights into patients' treatment patterns and the factors that could influence treatment switching in MS.

This study, however, also had some limitations that need to be considered. First, medical records in the TriNetX data were mainly gathered from academic hospitals, which may restrict the generalizability of our findings. Second, although the rate of treatment switching was low compared to prior research using claims data, the study finding was consistent with previous analyses using electronic medical records.¹⁸ Third, this study developed and compared models for predicting any DMA switching among patients with MS; future studies should compare treatment switching across different types of DMAs to improve MS care. Fourth, this study evaluated model performance with an internal validation dataset; further work is needed to validate these models with an external dataset before they are implemented to predict treatment switching. Furthermore, several sociodemographic, clinical, and behavioral characteristics, such as insurance status, prescribers, and patients' preferences, were not available in the TriNetX data, which might result in residual confounding and limit further investigation of our findings. The TriNetX data also lacked information on MS activity and duration of MS. Nevertheless, our analysis included the MS severity score, which had previously been validated as a proxy for disease severity and disability status.⁷⁸^,⁷⁹

5. Conclusion

The RF model developed in this study provided a better prediction for treatment switching in MS compared to the LR model based on the AUC. The important contributing factors the RF model identified for treatment switching were age, type of index DMA, and time-period of the initial DMA. These factors may help identify targets to optimize treatment selection and associated health outcomes in patients with MS. More research is needed to understand the role and impact of ML algorithms in therapy selection for improving MS care.

Funding

We have no funding support regarding this study.

Disclosures

We confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere. A part of the study findings was submitted to the 2022 International Society for Pharmacoeconomics and Outcomes Research (ISPOR) National Conference.

Declaration of Competing Interest

Dr. Aparasu has received research funding from Astellas Inc., Incyte Corp., Gilead, and Novartis Inc. for projects unrelated to the current work. Dr. Hutton reports grants from Biogen, Novartis, MedImmune, Hoffman-LaRoche, E.M.D. Serono, Sanofi, and personal fees from Novartis, Sanofi, Celgene outside the submitted work. The other authors declare no conflicts of interest for this article.

Footnotes

^{Appendix A}

Supplementary data to this article can be found online at https://doi.org/10.1016/j.rcsop.2023.100307.

Appendix A. Supplementary data

Supplementary Material for Patient Baseline Characteristics by Training and Testing Sample

mmc1.docx (44.1 KB, docx)

References

1.Kutzelnigg A., Lassmann H. Pathology of multiple sclerosis and related inflammatory demyelinating diseases. Handb Clin Neurol. 2014;122:15–58. doi: 10.1016/B978-0-444-52001-2.00002-9. [DOI] [PubMed] [Google Scholar]
2.Wallin M.T., Culpepper W.J., Campbell J.D., et al. The prevalence of MS in the United States. Neurology. 2019;92(10) doi: 10.1212/WNL.0000000000007035. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.McGinley M.P., Goldschmidt C.H., Rae-Grant A.D. Diagnosis and treatment of multiple sclerosis. JAMA. 2021;325(8) doi: 10.1001/jama.2020.26858. [DOI] [PubMed] [Google Scholar]
4.Hunter S.F. Overview and diagnosis of multiple sclerosis. Am J Manag Care. 2016;22(6 Suppl):s141–s150. [PubMed] [Google Scholar]
5.Campbell J.D., Ghushchyan V., McQueen R.B., et al. Burden of multiple sclerosis on direct, indirect costs and quality of life: national US estimates. Mult Scler Relat Disord. 2014;3(2):227–236. doi: 10.1016/j.msard.2013.09.004. [DOI] [PubMed] [Google Scholar]
6.Earla J.R., Thornton J.D., Hutton G.J., Aparasu R.R. Marginal health care expenditure burden among US civilian noninstitutionalized individuals with multiple sclerosis: 2010-2015. J Manag Care Spec Pharm. 2020;26(6):741–749. doi: 10.18553/jmcp.2020.26.6.741. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Li J., Zakeri M., Hutton G.J., Aparasu R.R. Health-related quality of life of patients with multiple sclerosis: analysis of ten years of national data. Mult Scler Relat Disord. 2022;66 doi: 10.1016/j.msard.2022.104019. [DOI] [PubMed] [Google Scholar]
8.Rae-Grant A., Day G.S., Marrie R.A., et al. Practice guideline recommendations summary: disease-modifying therapies for adults with multiple sclerosis. Neurology. 2018;90(17):777–788. doi: 10.1212/WNL.0000000000005347. [DOI] [PubMed] [Google Scholar]
9.Comi G., Radaelli M., Soelberg Sørensen P. Evolving concepts in the treatment of relapsing multiple sclerosis. Lancet. 2017;389(10076) doi: 10.1016/S0140-6736(16)32388-1. [DOI] [PubMed] [Google Scholar]
10.Elsisi Z., Hincapie A.L., Guo J.J. Expenditure, utilization, and cost of specialty drugs for multiple sclerosis in the US Medicaid population, 2008-2018. Am Health Drug Benefits. 2020;13(2):74–84. [PMC free article] [PubMed] [Google Scholar]
11.Gajofatto A., Benedetti M.D. Treatment strategies for multiple sclerosis: when to start, when to change, when to stop? World J Clin Cases. 2015;3(7):545. doi: 10.12998/wjcc.v3.i7.545. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Chitnis T., Giovannoni G., Trojano M. Complexity of MS management in the current treatment era. Neurology. 2018;90(17) doi: 10.1212/WNL.0000000000005399. [DOI] [PubMed] [Google Scholar]
13.Saccà F., Lanzillo R., Signori A., et al. Determinants of therapy switch in multiple sclerosis treatment-naïve patients: a real-life study. Mult Scler J. 2019;25(9) doi: 10.1177/1352458518790390. [DOI] [PubMed] [Google Scholar]
14.Naismith R.T. Multiple sclerosis therapeutic strategies: start safe and effective, reassess early, and escalate if necessary. Neurol Clin Pract. 2011;1(1):69–72. doi: 10.1212/CPJ.0b013e31823cc2b0. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Milliman Client Report. Multiple Sclerosis: New Perspectives on the Patient Journey - 2019 Update. 2019. https://us.milliman.com/-/media/milliman/importedfiles/uploadedfiles/insight/2019/ms-patient-journey-2019.ashx Accessed October 7, 2021.
16.Freeman L., Kee A., Tian M., Mehta R. Retrospective Claims Analysis of Treatment Patterns, Relapse, Utilization, and Cost Among Patients with Multiple Sclerosis Initiating Second-Line Disease-Modifying Therapy. Drugs Real World Outcomes. 2021 Dec;8(4):497–508. doi: 10.1007/s40801-021-00251-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
17.Desai R.J., Mahesri M., Gagne J.J., et al. Utilization patterns of Oral disease-modifying drugs in commercially insured patients with multiple sclerosis. J Manag Care Spec Pharm. 2019;25(1) doi: 10.18553/jmcp.2019.25.1.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Li J., Chikermane S.G., Earla J.R., Hutton G.J., Aparasu R.R. Factors associated with switching from injectable to oral disease modifying agents among patients with multiple sclerosis. Mult Scler Relat Disord. 2022;60 doi: 10.1016/j.msard.2022.103703. [DOI] [PubMed] [Google Scholar]
19.Ling M., Tao X., Ma S., et al. Predictive value of intraoperative facial motor evoked potentials in vestibular schwannoma surgery under 2 anesthesia protocols. World Neurosurg. 2018;111:e267–e276. doi: 10.1016/j.wneu.2017.12.029. [DOI] [PubMed] [Google Scholar]
20.Liang Z., Zhang G., Huang J.X., Hu Q.V. 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) IEEE; 2014. Deep learning for healthcare decision making with EMRs; pp. 556–559. [Google Scholar]
21.Manogaran G., Lopez D. A survey of big data architectures and machine learning algorithms in healthcare. Int J Biomed Eng Technol. 2017;25(2–4):182–211. [Google Scholar]
22.Buchlak Q.D., Esmaili N., Leveque J.C., et al. Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review. Neurosurg Rev. 2020;43(5):1235–1253. doi: 10.1007/s10143-019-01163-8. [DOI] [PubMed] [Google Scholar]
23.Devinsky O., Dilley C., Ozery-Flato M., et al. Changing the approach to treatment choice in epilepsy using big data. Epilepsy Behav. 2016;56:32–37. doi: 10.1016/j.yebeh.2015.12.039. [DOI] [PubMed] [Google Scholar]
24.Wu C.S., Luedtke A.R., Sadikova E., et al. Development and validation of a machine learning individualized treatment rule in first-episode schizophrenia. JAMA Netw Open. 2020;3(2) doi: 10.1001/jamanetworkopen.2019.21660. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Senders J.T., Staples P.C., Karhade A.V., et al. Machine learning and neurosurgical outcome prediction: a systematic review. World Neurosurg. 2018;109:476–486.e1. doi: 10.1016/j.wneu.2017.09.149. [DOI] [PubMed] [Google Scholar]
26.Seccia R., Romano S., Salvetti M., Crisanti A., Palagi L., Grassi F. Machine learning use for prognostic purposes in multiple sclerosis. Life. 2021;11(2):122. doi: 10.3390/life11020122. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Darvishi S., Hamidi O., Poorolajal J. Prediction of multiple sclerosis disease using machine learning classifiers: a comparative study. J Prev Med Hyg. 2021;62(1):E192–E199. doi: 10.15167/2421-4248/jpmh2021.62.1.1651. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Eshaghi A., Young A.L., Wijeratne P.A., et al. Identifying multiple sclerosis subtypes using unsupervised machine learning and MRI data. Nat Commun. 2021;12(1):2078. doi: 10.1038/s41467-021-22265-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Vázquez-Marrufo M., Sarrias-Arrabal E., García-Torres M., Martín-Clemente R., Izquierdo G. A systematic review of the application of machine-learning algorithms in multiple sclerosis. Neurologia (Engl Ed) 2021 Feb 3 doi: 10.1016/j.nrl.2020.10.017. S0213-4853(20)30431-X. English, Spanish. [DOI] [PubMed] [Google Scholar]
30.Stapff M., Hilderbrand S. First-line treatment of essential hypertension: a real-world analysis across four antihypertensive treatment classes. J Clin Hypertens. 2019;21(5) doi: 10.1111/jch.13531. [DOI] [PMC free article] [PubMed] [Google Scholar]
31.Ontaneda D., Nicholas J., Carraro M., et al. Comparative effectiveness of dimethyl fumarate versus fingolimod and teriflunomide among MS patients switching from first-generation platform therapies in the US. Mult Scler Relat Disord. 2019:27. doi: 10.1016/j.msard.2018.09.038. [DOI] [PubMed] [Google Scholar]
32.Culpepper W.J., Marrie R.A., Langer-Gould A., et al. Validation of an algorithm for identifying MS cases in administrative health claims datasets. Neurology. 2019;92(10):e1016–e1028. doi: 10.1212/WNL.0000000000007043. [DOI] [PMC free article] [PubMed] [Google Scholar]
33.Giovannoni G., Butzkueven H., Dhib-Jalbut S., et al. Brain health: time matters in multiple sclerosis. Mult Scler Relat Disord. 2016;9:S5–S48. doi: 10.1016/j.msard.2016.07.003. [DOI] [PubMed] [Google Scholar]
34.Vollmer T.L., Nair K.V., Williams I.M., Alvarez E. Multiple sclerosis phenotypes as a continuum. Neurol Clin Pract. 2021;11(4):342–351. doi: 10.1212/CPJ.0000000000001045. [DOI] [PMC free article] [PubMed] [Google Scholar]
35.Kuhn M. The caret package. R Foundation for Statistical Computing, Vienna, Austria. 2023. https://cran.r-project.org/web/packages/caret/caret.pdf Published March 21, 2023. [Google Scholar]
36.Andersen RM. Revisiting the behavioral model and access to medical care: does it matter? J Health Soc Behav. 1995 Mar;36(1):1–10. [PubMed] [Google Scholar]
37.Elixhauser A., Steiner C., Harris D.R., Coffey R.M. Comorbidity measures for use with administrative data. Med Care. 1998;36(1) doi: 10.1097/00005650-199801000-00004. [DOI] [PubMed] [Google Scholar]
38.Blagus R., Lusa L. Class prediction for high-dimensional class-imbalanced data. BMC Bioinformatics. 2010;11(1):1–17. doi: 10.1186/1471-2105-11-523. [DOI] [PMC free article] [PubMed] [Google Scholar]
39.Kamalov F., Thabtah F., Leung H.H. Feature Selection in Imbalanced Data. Ann. Data. Sci. 2022 doi: 10.1007/s40745-021-00366-5. [DOI] [Google Scholar]
40.Provost F. Proceedings of the AAAI’2000 Workshop on Imbalanced Data Sets. vol. 68. AAAI Press; 2000. Machine learning from imbalanced data sets 101; pp. 1–3. [Google Scholar]
41.van Smeden M., Moons K.G., de Groot J.A., et al. Sample size for binary logistic prediction models: beyond events per variable criteria. Stat Methods Med Res. 2019;28(8):2455–2474. doi: 10.1177/0962280218784726. [DOI] [PMC free article] [PubMed] [Google Scholar]
42.Verikas A., Gelzinis A., Bacauskiene M. Mining data with random forests: a survey and results of new tests. Pattern Recogn. 2011;44(2):330–349. [Google Scholar]
43.Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. doi: 10.1023/A:1010933404324. [DOI] [Google Scholar]
44.Acion L., Kelmansky D., van der Laan M., Sahker E., Jones D., Arndt S. Use of a machine learning framework to predict substance use disorder treatment success. PloS One. 2017;12(4) doi: 10.1371/journal.pone.0175383. [DOI] [PMC free article] [PubMed] [Google Scholar]
45.Chen C., Liaw A., Breiman L. 110(1−12) University of California; Berkeley: 2004. Using Random Forest to Learn Imbalanced Data; p. 24. [Google Scholar]
46.Deo R.C., Nallamothu B.K. Learning about machine learning: the promise and pitfalls of big data and the electronic health record. Circ Cardiovasc Qual Outcomes. 2016;9(6):618–620. doi: 10.1161/CIRCOUTCOMES.116.003308. [DOI] [PMC free article] [PubMed] [Google Scholar]
47.Liaw A., Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18–22. [Google Scholar]
48.Probst P., Wright M.N., Boulesteix A. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9(3) [Google Scholar]
49.Desai R.J., Wang S.V., Vaduganathan M., Evers T., Schneeweiss S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw Open. 2020;3(1) doi: 10.1001/jamanetworkopen.2019.18962. [DOI] [PMC free article] [PubMed] [Google Scholar]
50.DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837. doi: 10.2307/2531595. [DOI] [PubMed] [Google Scholar]
51.Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1. [PMC free article] [PubMed] [Google Scholar]
52.DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837. doi: 10.2307/2531595. [DOI] [PubMed] [Google Scholar]
53.DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Vol. 44. 1988. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. https://about.jstor.org/terms [PubMed] [Google Scholar]
54.Dugan J.B., Bavuso S.J., Boyd M.A. Annual Proceedings on Reliability and Maintainability Symposium. IEEE; 1990. Fault trees and sequence dependencies; pp. 286–293. [Google Scholar]
55.Chicco D., Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21(1):6. doi: 10.1186/s12864-019-6413-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
56.Taha A.A., Hanbury A. Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Med Imaging. 2015;15(1):1–28. doi: 10.1186/s12880-015-0068-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
57.Kubat M., Holte R.C., Matwin S. Machine learning for the detection of oil spills in satellite radar images. Mach Learn. 1998;30(2):195–215. [Google Scholar]
58.Sokolova M., Japkowicz N., Szpakowicz S. 2006. Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation; pp. 1015–1021. [DOI] [Google Scholar]
59.Josephine S.A. SAS Global Forum; 2017. Predictive accuracy: a misleading performance measure for highly imbalanced data classified negative. [Google Scholar]
60.Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28:1–26. [Google Scholar]
61.Christodoulou E., Ma J., Collins G.S., Steyerberg E.W., Verbakel J.Y., van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi: 10.1016/j.jclinepi.2019.02.004. [DOI] [PubMed] [Google Scholar]
62.Davis J., Goadrich M. Proceedings of the 23rd International Conference on Machine Learning. 2006. The relationship between Precision-Recall and ROC curves; pp. 233–240. [Google Scholar]
63.Saito T., Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS One. 2015;10(3) doi: 10.1371/journal.pone.0118432. [DOI] [PMC free article] [PubMed] [Google Scholar]
64.He H., Garcia E.A. Learning from imbalanced data. IEEE Trans Knowl Data Eng. 2009;21(9):1263–1284. [Google Scholar]
65.Wadekar A.S. Understanding opioid use disorder (OUD) using tree-based classifiers. Drug Alcohol Depend. 2020;208 doi: 10.1016/j.drugalcdep.2020.107839. [DOI] [PubMed] [Google Scholar]
66.Jakimovski D., Eckert S.P., Zivadinov R., Weinstock-Guttman B. Considering patient age when treating multiple sclerosis across the adult lifespan. Expert Rev Neurother. 2021;21(3):353–364. doi: 10.1080/14737175.2021.1886082. [DOI] [PubMed] [Google Scholar]
67.Patti F., Penaherrera J.N., Zieger L., Wicklein E.M. Clinical characteristics of middle-aged and older patients with MS treated with interferon beta-1b: post-hoc analysis of a 2-year, prospective, international, observational study. BMC Neurol. 2021;21(1):324. doi: 10.1186/s12883-021-02347-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
68.Earla J.R., Paranjpe R., Kachru N., Hutton G.J., Aparasu R.R. Use of disease modifying agents in patients with multiple sclerosis: analysis of ten years of national data. Res Social Adm Pharm. 2020;16(12) doi: 10.1016/j.sapharm.2020.02.016. [DOI] [PubMed] [Google Scholar]
69.Earla J.R., Hutton G.J., Thornton J.D., Chen H., Johnson M.L., Aparasu R.R. Comparative adherence trajectories of Oral Fingolimod and injectable disease modifying agents in multiple sclerosis. Patient Prefer Adherence. 2020;14 doi: 10.2147/PPA.S270557. [DOI] [PMC free article] [PubMed] [Google Scholar]
70.Boster A., Nicholas J., Wu N., et al. Comparative effectiveness research of disease-modifying therapies for the Management of Multiple Sclerosis: analysis of a large health insurance claims database. Neurol Ther. 2017;6(1) doi: 10.1007/s40120-017-0064-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
71.English C., Aloi J.J. New FDA-approved disease-modifying therapies for multiple sclerosis. Clin Ther. 2015;37(4) doi: 10.1016/j.clinthera.2015.03.001. [DOI] [PubMed] [Google Scholar]
72.Scolding N., Barnes D., Cader S., et al. Association of British Neurologists: revised (2015) guidelines for prescribing disease-modifying treatments in multiple sclerosis. Pract Neurol. 2015;15(4) doi: 10.1136/practneurol-2015-001139. [DOI] [PubMed] [Google Scholar]
73.Markianos M., Evangelopoulos M.E., Koutsis G., Davaki P., Sfagos C. Body mass index in multiple sclerosis: associations with CSF neurotransmitter metabolite levels. Int Sch Res Notices. 2013:2013. doi: 10.1155/2013/981070. [DOI] [PMC free article] [PubMed] [Google Scholar]
74.Gianfrancesco M.A., Barcellos L.F. Obesity and multiple sclerosis susceptibility: a review. J Neurol Neuromed. 2016;1(7):1. doi: 10.29245/2572.942x/2016/7.1064. [DOI] [PMC free article] [PubMed] [Google Scholar]
75.Mowry E.M., Azevedo C.J., McCulloch C.E., et al. Body mass index, but not vitamin D status, is associated with brain volume change in MS. Neurology. 2018;91(24):e2256–e2264. doi: 10.1212/WNL.0000000000006644. [DOI] [PMC free article] [PubMed] [Google Scholar]
76.Kvistad S.S., Myhr K.M., Holmøy T., et al. Body mass index influence interferon-beta treatment response in multiple sclerosis. J Neuroimmunol. 2015;288:92–97. doi: 10.1016/j.jneuroim.2015.09.008. [DOI] [PubMed] [Google Scholar]
77.Dardiotis E., Tsouris Z., Aslanidou P., et al. Body mass index in patients with multiple sclerosis: a meta-analysis. Neurol Res. 2019;41(9):836–846. doi: 10.1080/01616412.2019.1622873. [DOI] [PubMed] [Google Scholar]
78.Nicholas J., Ontaneda D., Carraro M., et al. Development of an algorithm to identify multiple sclerosis (MS) disease severity based on healthcare costs in a US Administrative Claims Database (P2.052) Neurology. 2017;88(16 Supplement) https://n.neurology.org/content/88/16_Supplement/P2.052 [Google Scholar]
79.Toliver J.C., Barner J.C., Lawson K.A., Rascati K.L. Use of a claims-based algorithm to estimate disease severity in the multiple sclerosis Medicare population. Mult Scler Relat Disord. 2021;49 doi: 10.1016/j.msard.2021.102741. [DOI] [PubMed] [Google Scholar]

[bb0005] 1.Kutzelnigg A., Lassmann H. Pathology of multiple sclerosis and related inflammatory demyelinating diseases. Handb Clin Neurol. 2014;122:15–58. doi: 10.1016/B978-0-444-52001-2.00002-9. [DOI] [PubMed] [Google Scholar]

[bb0010] 2.Wallin M.T., Culpepper W.J., Campbell J.D., et al. The prevalence of MS in the United States. Neurology. 2019;92(10) doi: 10.1212/WNL.0000000000007035. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0015] 3.McGinley M.P., Goldschmidt C.H., Rae-Grant A.D. Diagnosis and treatment of multiple sclerosis. JAMA. 2021;325(8) doi: 10.1001/jama.2020.26858. [DOI] [PubMed] [Google Scholar]

[bb0020] 4.Hunter S.F. Overview and diagnosis of multiple sclerosis. Am J Manag Care. 2016;22(6 Suppl):s141–s150. [PubMed] [Google Scholar]

[bb0025] 5.Campbell J.D., Ghushchyan V., McQueen R.B., et al. Burden of multiple sclerosis on direct, indirect costs and quality of life: national US estimates. Mult Scler Relat Disord. 2014;3(2):227–236. doi: 10.1016/j.msard.2013.09.004. [DOI] [PubMed] [Google Scholar]

[bb0030] 6.Earla J.R., Thornton J.D., Hutton G.J., Aparasu R.R. Marginal health care expenditure burden among US civilian noninstitutionalized individuals with multiple sclerosis: 2010-2015. J Manag Care Spec Pharm. 2020;26(6):741–749. doi: 10.18553/jmcp.2020.26.6.741. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0035] 7.Li J., Zakeri M., Hutton G.J., Aparasu R.R. Health-related quality of life of patients with multiple sclerosis: analysis of ten years of national data. Mult Scler Relat Disord. 2022;66 doi: 10.1016/j.msard.2022.104019. [DOI] [PubMed] [Google Scholar]

[bb0040] 8.Rae-Grant A., Day G.S., Marrie R.A., et al. Practice guideline recommendations summary: disease-modifying therapies for adults with multiple sclerosis. Neurology. 2018;90(17):777–788. doi: 10.1212/WNL.0000000000005347. [DOI] [PubMed] [Google Scholar]

[bb0045] 9.Comi G., Radaelli M., Soelberg Sørensen P. Evolving concepts in the treatment of relapsing multiple sclerosis. Lancet. 2017;389(10076) doi: 10.1016/S0140-6736(16)32388-1. [DOI] [PubMed] [Google Scholar]

[bb0050] 10.Elsisi Z., Hincapie A.L., Guo J.J. Expenditure, utilization, and cost of specialty drugs for multiple sclerosis in the US Medicaid population, 2008-2018. Am Health Drug Benefits. 2020;13(2):74–84. [PMC free article] [PubMed] [Google Scholar]

[bb0055] 11.Gajofatto A., Benedetti M.D. Treatment strategies for multiple sclerosis: when to start, when to change, when to stop? World J Clin Cases. 2015;3(7):545. doi: 10.12998/wjcc.v3.i7.545. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0060] 12.Chitnis T., Giovannoni G., Trojano M. Complexity of MS management in the current treatment era. Neurology. 2018;90(17) doi: 10.1212/WNL.0000000000005399. [DOI] [PubMed] [Google Scholar]

[bb0065] 13.Saccà F., Lanzillo R., Signori A., et al. Determinants of therapy switch in multiple sclerosis treatment-naïve patients: a real-life study. Mult Scler J. 2019;25(9) doi: 10.1177/1352458518790390. [DOI] [PubMed] [Google Scholar]

[bb0070] 14.Naismith R.T. Multiple sclerosis therapeutic strategies: start safe and effective, reassess early, and escalate if necessary. Neurol Clin Pract. 2011;1(1):69–72. doi: 10.1212/CPJ.0b013e31823cc2b0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0075] 15.Milliman Client Report Multiple Sclerosis: New Perspectives on the Patient Journey - 2019 Update2019. 2019. https://us.milliman.com/-/media/milliman/importedfiles/uploadedfiles/insight/2019/ms-patient-journey-2019.ashx Accessed October 7, 2021.

[bb0080] 16.Freeman L., Kee A., Tian M., Mehta R. Retrospective Claims Analysis of Treatment Patterns, Relapse, Utilization, and Cost Among Patients with Multiple Sclerosis Initiating Second-Line Disease-Modifying Therapy. Drugs Real World Outcomes. 2021 Dec;8(4):497–508. doi: 10.1007/s40801-021-00251-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0085] 17.Desai R.J., Mahesri M., Gagne J.J., et al. Utilization patterns of Oral disease-modifying drugs in commercially insured patients with multiple sclerosis. J Manag Care Spec Pharm. 2019;25(1) doi: 10.18553/jmcp.2019.25.1.113. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0090] 18.Li J., Chikermane S.G., Earla J.R., Hutton G.J., Aparasu R.R. Factors associated with switching from injectable to oral disease modifying agents among patients with multiple sclerosis. Mult Scler Relat Disord. 2022;60 doi: 10.1016/j.msard.2022.103703. [DOI] [PubMed] [Google Scholar]

[bb0095] 19.Ling M., Tao X., Ma S., et al. Predictive value of intraoperative facial motor evoked potentials in vestibular schwannoma surgery under 2 anesthesia protocols. World Neurosurg. 2018;111:e267–e276. doi: 10.1016/j.wneu.2017.12.029. [DOI] [PubMed] [Google Scholar]

[bb0100] 20.Liang Z., Zhang G., Huang J.X., Hu Q.V. 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) IEEE; 2014. Deep learning for healthcare decision making with EMRs; pp. 556–559. [Google Scholar]

[bb0105] 21.Manogaran G., Lopez D. A survey of big data architectures and machine learning algorithms in healthcare. Int J Biomed Eng Technol. 2017;25(2–4):182–211. [Google Scholar]

[bb0110] 22.Buchlak Q.D., Esmaili N., Leveque J.C., et al. Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review. Neurosurg Rev. 2020;43(5):1235–1253. doi: 10.1007/s10143-019-01163-8. [DOI] [PubMed] [Google Scholar]

[bb0115] 23.Devinsky O., Dilley C., Ozery-Flato M., et al. Changing the approach to treatment choice in epilepsy using big data. Epilepsy Behav. 2016;56:32–37. doi: 10.1016/j.yebeh.2015.12.039. [DOI] [PubMed] [Google Scholar]

[bb0120] 24.Wu C.S., Luedtke A.R., Sadikova E., et al. Development and validation of a machine learning individualized treatment rule in first-episode schizophrenia. JAMA Netw Open. 2020;3(2) doi: 10.1001/jamanetworkopen.2019.21660. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0125] 25.Senders J.T., Staples P.C., Karhade A.V., et al. Machine learning and neurosurgical outcome prediction: a systematic review. World Neurosurg. 2018;109:476–486.e1. doi: 10.1016/j.wneu.2017.09.149. [DOI] [PubMed] [Google Scholar]

[bb0130] 26.Seccia R., Romano S., Salvetti M., Crisanti A., Palagi L., Grassi F. Machine learning use for prognostic purposes in multiple sclerosis. Life. 2021;11(2):122. doi: 10.3390/life11020122. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0135] 27.Darvishi S., Hamidi O., Poorolajal J. Prediction of multiple sclerosis disease using machine learning classifiers: a comparative study. J Prev Med Hyg. 2021;62(1):E192–E199. doi: 10.15167/2421-4248/jpmh2021.62.1.1651. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0140] 28.Eshaghi A., Young A.L., Wijeratne P.A., et al. Identifying multiple sclerosis subtypes using unsupervised machine learning and MRI data. Nat Commun. 2021;12(1):2078. doi: 10.1038/s41467-021-22265-2. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0145] 29.Vázquez-Marrufo M., Sarrias-Arrabal E., García-Torres M., Martín-Clemente R., Izquierdo G. A systematic review of the application of machine-learning algorithms in multiple sclerosis. Neurologia (Engl Ed) 2021 Feb 3 doi: 10.1016/j.nrl.2020.10.017. S0213-4853(20)30431-X. English, Spanish. [DOI] [PubMed] [Google Scholar]

[bb0150] 30.Stapff M., Hilderbrand S. First-line treatment of essential hypertension: a real-world analysis across four antihypertensive treatment classes. J Clin Hypertens. 2019;21(5) doi: 10.1111/jch.13531. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0155] 31.Ontaneda D., Nicholas J., Carraro M., et al. Comparative effectiveness of dimethyl fumarate versus fingolimod and teriflunomide among MS patients switching from first-generation platform therapies in the US. Mult Scler Relat Disord. 2019:27. doi: 10.1016/j.msard.2018.09.038. [DOI] [PubMed] [Google Scholar]

[bb0160] 32.Culpepper W.J., Marrie R.A., Langer-Gould A., et al. Validation of an algorithm for identifying MS cases in administrative health claims datasets. Neurology. 2019;92(10):e1016–e1028. doi: 10.1212/WNL.0000000000007043. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bb0165] 33.Giovannoni G., Butzkueven H., Dhib-Jalbut S., et al. Brain health: time matters in multiple sclerosis. Mult Scler Relat Disord. 2016;9:S5–S48. doi: 10.1016/j.msard.2016.07.003. [DOI] [PubMed] [Google Scholar]

34. Vollmer T.L., Nair K.V., Williams I.M., Alvarez E. Multiple sclerosis phenotypes as a continuum. Neurol Clin Pract. 2021;11(4):342–351. doi: 10.1212/CPJ.0000000000001045.

35. Kuhn M. The caret package. R Foundation for Statistical Computing, Vienna, Austria. 2023. https://cran.r-project.org/web/packages/caret/caret.pdf. Published March 21, 2023.

36. Andersen R.M. Revisiting the behavioral model and access to medical care: does it matter? J Health Soc Behav. 1995;36(1):1–10.

37. Elixhauser A., Steiner C., Harris D.R., Coffey R.M. Comorbidity measures for use with administrative data. Med Care. 1998;36(1). doi: 10.1097/00005650-199801000-00004.

38. Blagus R., Lusa L. Class prediction for high-dimensional class-imbalanced data. BMC Bioinformatics. 2010;11(1):1–17. doi: 10.1186/1471-2105-11-523.

39. Kamalov F., Thabtah F., Leung H.H. Feature selection in imbalanced data. Ann Data Sci. 2022. doi: 10.1007/s40745-021-00366-5.

40. Provost F. Machine learning from imbalanced data sets 101. Proceedings of the AAAI’2000 Workshop on Imbalanced Data Sets. vol. 68. AAAI Press; 2000. pp. 1–3.

41. van Smeden M., Moons K.G., de Groot J.A., et al. Sample size for binary logistic prediction models: beyond events per variable criteria. Stat Methods Med Res. 2019;28(8):2455–2474. doi: 10.1177/0962280218784726.

42. Verikas A., Gelzinis A., Bacauskiene M. Mining data with random forests: a survey and results of new tests. Pattern Recogn. 2011;44(2):330–349.

43. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. doi: 10.1023/A:1010933404324.

44. Acion L., Kelmansky D., van der Laan M., Sahker E., Jones D., Arndt S. Use of a machine learning framework to predict substance use disorder treatment success. PloS One. 2017;12(4). doi: 10.1371/journal.pone.0175383.

45. Chen C., Liaw A., Breiman L. Using Random Forest to Learn Imbalanced Data. 110(1−12). University of California; Berkeley: 2004. p. 24.

46. Deo R.C., Nallamothu B.K. Learning about machine learning: the promise and pitfalls of big data and the electronic health record. Circ Cardiovasc Qual Outcomes. 2016;9(6):618–620. doi: 10.1161/CIRCOUTCOMES.116.003308.

47. Liaw A., Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18–22.

48. Probst P., Wright M.N., Boulesteix A. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9(3).

49. Desai R.J., Wang S.V., Vaduganathan M., Evers T., Schneeweiss S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw Open. 2020;3(1). doi: 10.1001/jamanetworkopen.2019.18962.

50. DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837. doi: 10.2307/2531595.

51. Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1.

52. DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837. doi: 10.2307/2531595.

53. DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Vol. 44. 1988. https://about.jstor.org/terms

54. Dugan J.B., Bavuso S.J., Boyd M.A. Fault trees and sequence dependencies. Annual Proceedings on Reliability and Maintainability Symposium. IEEE; 1990. pp. 286–293.

55. Chicco D., Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21(1):6. doi: 10.1186/s12864-019-6413-7.

56. Taha A.A., Hanbury A. Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Med Imaging. 2015;15(1):1–28. doi: 10.1186/s12880-015-0068-x.

57. Kubat M., Holte R.C., Matwin S. Machine learning for the detection of oil spills in satellite radar images. Mach Learn. 1998;30(2):195–215.

58. Sokolova M., Japkowicz N., Szpakowicz S. Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation. 2006. pp. 1015–1021.

59. Josephine S.A. Predictive accuracy: a misleading performance measure for highly imbalanced data classified negative. SAS Global Forum; 2017.

60. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28:1–26.

61. Christodoulou E., Ma J., Collins G.S., Steyerberg E.W., Verbakel J.Y., van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi: 10.1016/j.jclinepi.2019.02.004.

62. Davis J., Goadrich M. The relationship between Precision-Recall and ROC curves. Proceedings of the 23rd International Conference on Machine Learning. 2006. pp. 233–240.

63. Saito T., Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS One. 2015;10(3). doi: 10.1371/journal.pone.0118432.

64. He H., Garcia E.A. Learning from imbalanced data. IEEE Trans Knowl Data Eng. 2009;21(9):1263–1284.

65. Wadekar A.S. Understanding opioid use disorder (OUD) using tree-based classifiers. Drug Alcohol Depend. 2020;208. doi: 10.1016/j.drugalcdep.2020.107839.

66. Jakimovski D., Eckert S.P., Zivadinov R., Weinstock-Guttman B. Considering patient age when treating multiple sclerosis across the adult lifespan. Expert Rev Neurother. 2021;21(3):353–364. doi: 10.1080/14737175.2021.1886082.

67. Patti F., Penaherrera J.N., Zieger L., Wicklein E.M. Clinical characteristics of middle-aged and older patients with MS treated with interferon beta-1b: post-hoc analysis of a 2-year, prospective, international, observational study. BMC Neurol. 2021;21(1):324. doi: 10.1186/s12883-021-02347-w.

68. Earla J.R., Paranjpe R., Kachru N., Hutton G.J., Aparasu R.R. Use of disease modifying agents in patients with multiple sclerosis: analysis of ten years of national data. Res Social Adm Pharm. 2020;16(12). doi: 10.1016/j.sapharm.2020.02.016.

69. Earla J.R., Hutton G.J., Thornton J.D., Chen H., Johnson M.L., Aparasu R.R. Comparative adherence trajectories of oral fingolimod and injectable disease modifying agents in multiple sclerosis. Patient Prefer Adherence. 2020;14. doi: 10.2147/PPA.S270557.

70. Boster A., Nicholas J., Wu N., et al. Comparative effectiveness research of disease-modifying therapies for the management of multiple sclerosis: analysis of a large health insurance claims database. Neurol Ther. 2017;6(1). doi: 10.1007/s40120-017-0064-x.

71. English C., Aloi J.J. New FDA-approved disease-modifying therapies for multiple sclerosis. Clin Ther. 2015;37(4). doi: 10.1016/j.clinthera.2015.03.001.

72. Scolding N., Barnes D., Cader S., et al. Association of British Neurologists: revised (2015) guidelines for prescribing disease-modifying treatments in multiple sclerosis. Pract Neurol. 2015;15(4). doi: 10.1136/practneurol-2015-001139.

73. Markianos M., Evangelopoulos M.E., Koutsis G., Davaki P., Sfagos C. Body mass index in multiple sclerosis: associations with CSF neurotransmitter metabolite levels. Int Sch Res Notices. 2013;2013. doi: 10.1155/2013/981070.

74. Gianfrancesco M.A., Barcellos L.F. Obesity and multiple sclerosis susceptibility: a review. J Neurol Neuromed. 2016;1(7):1. doi: 10.29245/2572.942x/2016/7.1064.

75. Mowry E.M., Azevedo C.J., McCulloch C.E., et al. Body mass index, but not vitamin D status, is associated with brain volume change in MS. Neurology. 2018;91(24):e2256–e2264. doi: 10.1212/WNL.0000000000006644.

76. Kvistad S.S., Myhr K.M., Holmøy T., et al. Body mass index influence interferon-beta treatment response in multiple sclerosis. J Neuroimmunol. 2015;288:92–97. doi: 10.1016/j.jneuroim.2015.09.008.

77. Dardiotis E., Tsouris Z., Aslanidou P., et al. Body mass index in patients with multiple sclerosis: a meta-analysis. Neurol Res. 2019;41(9):836–846. doi: 10.1080/01616412.2019.1622873.

78. Nicholas J., Ontaneda D., Carraro M., et al. Development of an algorithm to identify multiple sclerosis (MS) disease severity based on healthcare costs in a US Administrative Claims Database (P2.052). Neurology. 2017;88(16 Supplement). https://n.neurology.org/content/88/16_Supplement/P2.052

79. Toliver J.C., Barner J.C., Lawson K.A., Rascati K.L. Use of a claims-based algorithm to estimate disease severity in the multiple sclerosis Medicare population. Mult Scler Relat Disord. 2021;49. doi: 10.1016/j.msard.2021.102741.
