To the Editor:
We greatly appreciate the insightful comments and suggestions regarding our publication.1 Briefly, in this publication, we combined unsupervised and supervised classification machine learning techniques to predict patients' response to spinal cord stimulation. Clusters identified using unsupervised K-means clustering were fitted with individualized predictive models of logistic regression (LR), random forest, and extreme gradient boosting.
In our original publication, the elbow method was used to determine the number of clusters (K). The elbow point of the resulting curve was taken as the maximal number of clusters, and an elbow was detected at k = 3. However, as the letter authors noted, the elbow method has a subjective component when the “elbow” cannot be unambiguously identified,2 which can be mitigated by supplementing it with an alternative means of determining the number of clusters. We projected the computed clusters onto the first 2 principal components, where the data separated more clearly into 2 clusters than into 3. To minimize overfitting, the third cluster (n = 17) was considered too small for multiclass classification. Per the letter authors' recommendation,3 we have added a silhouette analysis for our K-means clustering (Figure 1A-1C). Here, we demonstrate that k = 2 had the highest score (0.263) compared with k = 3 (0.245), congruent with the elbow-principal component analysis method. We also used t-distributed stochastic neighbor embedding to better visualize the clusters (Figure 1D), which further confirms that k = 2 is optimal for this data set.
FIGURE 1.
Silhouette analysis plot demonstrating the scores for A, k-value = 2 and B, k-value = 3. C, Silhouette score plotted per values of k; k = 2 has the highest score. D, T-distributed stochastic neighbor embedding visualization for the K-means clustering demonstrating that k = 2 is optimal for this data set.
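For readers who wish to reproduce this kind of check, the following is a minimal sketch of scanning candidate values of k with both the elbow criterion (inertia) and the mean silhouette score. It is an illustration only and not the code used in the study: the data are synthetic, and the function name `scan_cluster_counts`, the range of k, and the K-means settings are assumptions chosen for the example, using scikit-learn.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def scan_cluster_counts(X, k_values, seed=0):
    """Fit K-means for each k; return {k: (inertia, mean silhouette score)}."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        results[k] = (model.inertia_, silhouette_score(X, model.labels_))
    return results


# Synthetic stand-in for a patient feature matrix (assumption, not study data):
# two well-separated groups in five dimensions.
X_raw, _ = make_blobs(n_samples=150, centers=[[-4.0] * 5, [4.0] * 5],
                      cluster_std=1.5, random_state=0)
X = StandardScaler().fit_transform(X_raw)

results = scan_cluster_counts(X, range(2, 7))

# Inertia is what an elbow plot displays; the silhouette score gives a
# less subjective criterion: pick the k with the highest mean score.
best_k = max(results, key=lambda k: results[k][1])
for k, (inertia, sil) in results.items():
    print(f"k={k}: inertia={inertia:.1f}, silhouette={sil:.3f}")
print("k with highest mean silhouette:", best_k)
```

Plotting the inertia column against k gives the elbow curve, whereas the silhouette column can be compared directly across values of k, as in Figure 1C.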
The letter authors pointed out that clustering techniques other than K-means should be considered. During preprocessing, we used several techniques, including density-based spatial clustering of applications with noise (DBSCAN), K-modes (which includes both categorical and numeric features), and hierarchical clustering, that were not included in our manuscript. At the letter authors' request, we have added the results of the hierarchical clustering method. This method, in contrast to K-means, does not require the number of clusters as an input, is typically less sensitive to noise, and is considered better suited for smaller data sets.4 Using this method, 3 clusters were found; however, the number of samples in cluster 3 (n = 9) was too low for multiclass classification without critical overfitting. Similarly, using t-distributed stochastic neighbor embedding for visualization, k = 2 was found to be the optimal number of clusters for hierarchical clustering (Figure 2). Finally, using k = 2 hierarchical clustering, we redeveloped all models using the same nested cross-validation method. Performance was lower in all LR algorithms compared with our original K-means clustering (Figure 3). The area under the receiver operating characteristic curve decreased in LR for cluster 1 (from 0.76 to 0.53), for cluster 2 (from 0.71 to 0.64), and for all patients combined (from 0.63 to 0.46). The area under the curve was similar for both random forest and extreme gradient boosting algorithms.
FIGURE 2.
T-distributed stochastic neighbor embedding visualization for the hierarchical clustering demonstrating that k = 2 is optimal for this data set.
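The hierarchical workflow described above can be sketched as follows: build the full tree without specifying a cluster count, cut it at candidate values of k, check whether any cluster is too small to model, and embed the data with t-distributed stochastic neighbor embedding for visual inspection. This is a hedged illustration on synthetic data using SciPy and scikit-learn. Ward linkage, the perplexity value, and the `MIN_CLUSTER_SIZE` threshold are assumptions made for the example and are not reported in the letter.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

MIN_CLUSTER_SIZE = 20  # hypothetical threshold for "too small to model"

# Synthetic stand-in for a patient feature matrix (assumption, not study data).
X_raw, _ = make_blobs(n_samples=120, centers=[[-4.0] * 5, [4.0] * 5],
                      cluster_std=1.5, random_state=1)
X = StandardScaler().fit_transform(X_raw)

# The linkage tree is built once, with no cluster count supplied.
tree = linkage(X, method="ward")


def cut_tree_sizes(tree, k):
    """Cut the tree into at most k clusters; return labels and cluster sizes."""
    labels = fcluster(tree, t=k, criterion="maxclust")  # labels start at 1
    sizes = np.bincount(labels)[1:]
    return labels, sizes


for k in (2, 3):
    _, k_sizes = cut_tree_sizes(tree, k)
    too_small = bool((k_sizes < MIN_CLUSTER_SIZE).any())
    print(f"k={k}: sizes={k_sizes.tolist()}, cluster below threshold: {too_small}")

labels, sizes = cut_tree_sizes(tree, 2)

# Two-dimensional embedding for plotting; color points by `labels` to inspect
# how well the clusters separate.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print("embedding shape:", embedding.shape)
```

The embedding is suited to visual inspection only; distances in the embedded space are not a quantitative measure of cluster quality.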
FIGURE 3.

Receiver operating characteristic curves comparison for LR model on cluster 1, LR model on cluster 2, and LR, random forest, and XGBoost models on the entire cohort using A, K-means clustering and B, hierarchical clustering. Lower performances were found in all LR algorithms for hierarchical clustering compared with K-means clustering. AUC, area under the curve; LR, logistic regression; XGBoost, extreme gradient boosting.
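As a rough illustration of how a nested cross-validation estimate of the area under the curve can be obtained for a logistic regression model, the sketch below tunes the regularization strength in an inner loop and scores the tuned model in an outer loop. It uses synthetic data and scikit-learn. The fold counts, the hyperparameter grid, and the function name `nested_cv_auc` are assumptions for the example and do not describe the configuration used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def nested_cv_auc(X, y, n_outer=5, n_inner=3, seed=0):
    """Return one ROC AUC per outer fold for a tuned logistic regression."""
    inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    # Scaling sits inside the pipeline so it is refit on each training fold.
    pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}  # illustrative grid
    # Inner loop: choose C by AUC. Outer loop: score the tuned model on
    # folds that played no part in the tuning.
    search = GridSearchCV(pipeline, grid, scoring="roc_auc", cv=inner)
    return cross_val_score(search, X, y, scoring="roc_auc", cv=outer)


# Synthetic binary-outcome data (assumption, not study data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=0)
aucs = nested_cv_auc(X, y)
print("AUC per outer fold:", [round(float(a), 3) for a in aucs])
print("mean AUC:", round(float(aucs.mean()), 3))
```

To compare clustering schemes, the same function would be applied separately to the patients in each cluster, so that every cluster-specific model is tuned and evaluated on its own folds.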
We agree that as machine learning techniques become increasingly used within neurosurgical research, it becomes important to verify methodology and confirm that the optimal technique is being used. Further investigation of these advanced methods as they apply to neurosurgical research is certainly warranted.
Funding
This study did not receive any funding or financial support.
Disclosures
Dr Pilitsis is a consultant for Boston Scientific, Nevro, TerSera, and Abbott and receives grant support from Medtronic, Boston Scientific, Abbott, Nevro, TerSera, NIH 2R01CA166379-06 and NIH U44NS115111. She is a medical advisor for Aim Medical Robotics and Karuna and has stock equity. Dr Hadanny has stock equity in Aviv Scientific and EEG Sense. Dr Telkes has grant support from NIH/NINDS K99NS119672 and NIH U44NS115111.
REFERENCES
1. Hadanny A, Harland T, Khazen O, Dimarizo M, et al. Development of machine learning-based models to predict treatment response to spinal cord stimulation. Neurosurgery. 2022;90(5):523-532.
2. Kodinariya TM, Makwana PR. Review on determining number of cluster in K-means clustering. Int J. 2013;1(6):90-95.
3. Naik A, Varshney LR, Hassaneen W, Arnold PM. Letter: development of machine learning-based models to predict treatment response to spinal cord stimulation. Neurosurgery. 2022;91(1):E30.
4. Kaushik M, Mathur B. Comparative study of K-means and hierarchical clustering techniques. Int J Softw Hardw Res Eng. 2014;2(6):93-98.


