Clinical Orthopaedics and Related Research. 2023 Jul 25;482(1):158–160. doi: 10.1097/CORR.0000000000002782

CORR Insights®: Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm

Gregory P Gebauer
PMCID: PMC10723897  PMID: 37493449

Where Are We Now?

Algorithms based on machine learning that are designed to anticipate prognosis and guide treatment plans are seeing wider use. In the field of spine surgery, these have been used to predict patient disposition after surgery, the risk of postoperative narcotic use, patient outcomes after surgery, and the prognosis for patients with epidural abscesses. The SORG Orthopaedic Research Group at Harvard has developed a number of these algorithms and has made them available online (https://sorg.mgh.harvard.edu/predictive-algorithms/). This permits physicians to integrate these algorithms into their clinical decision-making process.

It can be difficult to know whether to recommend surgery for patients with spinal metastases. The SORG machine-learning algorithm (SORG-MLA) for metastatic disease is helpful because it estimates which patients are likely to survive for 30 days, 90 days, or 1 year after surgery; that information factors heavily into the decision of whether or not to operate. This algorithm has been extensively tested and validated [5, 6, 10, 11]. However, in real-world situations, not all the data the algorithm needs to work properly may be readily available. This can be especially concerning for patients in whom the metastatic lesions are causing cauda equina syndrome or spinal cord compression, and for whom decisions must be made quickly to preserve neurological function.

The study by Huang et al. [4] in this month’s Clinical Orthopaedics and Related Research® evaluated whether the SORG-MLA remained accurate in the face of missing data. The authors used prior studies [5, 6, 10, 11] to ascertain which data points are often missing from patients’ charts in this setting, and then replaced those missing values with artificially generated (imputed) data using a missForest technique. The authors noted that the albumin level and lymphocyte count were necessary to accurately predict patient survival; without them, the SORG-MLA did not provide an accurate survival prediction. Importantly, they demonstrated that the SORG-MLA still functions well when one to three data points are missing. This suggests that the SORG-MLA can still be a useful tool even when complete data are not readily available. The authors then used these findings to create their own internet-based application that can predict survival in patients with spinal metastatic disease even when some data are missing. This allows spinal surgeons and oncologists in other areas to access this algorithm and integrate it into their clinical decision-making, especially when making time-critical surgical decisions.
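
For readers less familiar with this kind of evaluation, the general design (hide a predictor, impute it, then re-measure performance) can be sketched in a few lines. This is an illustration only, under loud assumptions: the data are synthetic, the two-predictor risk score and its coefficients are hypothetical, and simple median imputation stands in for missForest. Nothing here reproduces the SORG-MLA or the methods of Huang et al.

```python
import random
import statistics


def auc(scores, labels):
    """Chance that a random positive case outranks a random negative case (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def risk_score(albumin, lymphocytes):
    """Hypothetical toy score (not the SORG-MLA): lower values mean higher risk."""
    return -1.0 * albumin - 0.5 * lymphocytes


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    med = statistics.median(v for v in values if v is not None)
    return [med if v is None else v for v in values]


def simulate(n=400, missing_rate=0.3, seed=0):
    """Return (AUC with complete data, AUC after masking and imputing albumin)."""
    rng = random.Random(seed)
    # Synthetic predictors; the means and spreads are arbitrary assumptions.
    albumin = [rng.gauss(3.5, 0.6) for _ in range(n)]
    lymph = [rng.gauss(1.5, 0.5) for _ in range(n)]
    # Synthetic binary outcome: noisy risk above the cohort median counts as an event.
    noisy = [risk_score(a, l) + rng.gauss(0, 0.5) for a, l in zip(albumin, lymph)]
    cut = statistics.median(noisy)
    labels = [1 if x > cut else 0 for x in noisy]

    full = auc([risk_score(a, l) for a, l in zip(albumin, lymph)], labels)

    # Hide albumin at random, then fill the gaps before scoring again.
    masked = [None if rng.random() < missing_rate else a for a in albumin]
    filled = impute_median(masked)
    imputed = auc([risk_score(a, l) for a, l in zip(filled, lymph)], labels)
    return full, imputed


if __name__ == "__main__":
    for rate in (0.0, 0.3, 0.6):
        full_auc, imputed_auc = simulate(missing_rate=rate)
        print(f"missing={rate:.0%}  complete AUC={full_auc:.3f}  imputed AUC={imputed_auc:.3f}")
```

Comparing the two returned values across missing rates mirrors the question the study asks, although on toy data. A random-forest-based imputer such as missForest would replace `impute_median` in a more faithful version.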

Where Do We Need To Go?

Predictive algorithms are only as good as the data used to create them. A review of 152 machine-learning algorithms found that 87% were at high risk of bias [7]. Factors contributing to bias included inadequate data, mishandling of missing data, and overfitting of the model. In the current study, data were obtained from patients at the National Taiwan University Hospital [4]. The data from many of the SORG-MLA studies come primarily from large, retrospective datasets at large academic centers. These data may not accurately represent patients from other geographic areas and patients treated at smaller, nonacademic institutions. More external validation studies are needed.

Retrospective data were used both in the current study [4] and in the original development of the SORG-MLA [7, 10]. These data may not represent changes in outcomes that arise from ever-evolving medical and oncological treatments. Recent studies have found that the SORG-MLA still performed well in a contemporary group of patients [2]. However, ongoing and repeated analyses will be necessary to ensure that this continues to be the case. In addition, prospective studies with a current set of patients can further validate the models.

The impact of missing data is not unique to the SORG-MLA for metastatic disease. The algorithm for patient outcomes after lumbar decompression surgery requires PROMIS scores as input, and these may not be collected by physicians in private or nonacademic practices. The effect of missing data on the performance of machine-learning algorithms is an important issue, and one that has not been extensively examined. The current study found that the SORG-MLA still performs well with missing data, and it identified which data points are necessary for an accurate prediction. Other machine-learning algorithms must be similarly tested, adjusted if necessary so that they can still produce predictions when data are missing, and made available on the internet. This will facilitate meaningful, real-world use of these tools.

How Do We Get There?

As machine-learning predictive models continue to be developed, their impact on clinical practice will also grow. It is important for the clinicians who use these tools to understand how these algorithms are developed, the limitations of the data used in their creation, and how they can be best employed to improve patient care. Several papers have already been published in orthopaedic and spine journals to help educate clinicians on these important topics [3, 8, 9]. Continued education is necessary, including further journal publications, information and discussion sessions at national meetings, and the development of webinars or other educational materials from medical societies.

Machine-learning algorithms are most accurate when they have large pools of data from a wide range of patients. Large national patient registries from hip and knee arthroplasty societies are already being used to develop machine-learning algorithms [1]. I hope that the new American Spine Registry (https://www.americanspineregistry.org/) will provide additional data on a wide range of spinal conditions. It is important for this registry to include both academic and nonacademic locations around the country. This will allow for the widest array of data to be collected so that the models can be as inclusive and accurate as possible.

Finally, as machine learning expands, it is important to remember that in real-world practice, missing data are the norm, not the exception. That being so, developing machine-learning models that need the fewest possible data points to be accurate will be essential. The current study [4] demonstrated that doing so is both possible and helpful. Currently, many algorithms (including the SORG-MLA) predict binary outcomes, such as death within a given period or the occurrence of a complication. As other algorithms are developed, they can be expanded to include outcomes that may matter more to patients, such as improvement in pain and quality of life. With this information, machine-learning algorithms will help us better anticipate prognostic endpoints that matter to our patients, and guide us toward better individual patient decisions in an era of personalized medicine.

Footnotes

This CORR Insights® is a commentary on the article “Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm” by Huang and colleagues, available at: DOI: 10.1097/CORR.0000000000002737.

The author certifies that there are no funding or commercial associations (consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) that might pose a conflict of interest in connection with the submitted article related to the author or any immediate family members.

All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.

The opinions expressed are those of the writer, and do not reflect the opinion or policy of CORR® or The Association of Bone and Joint Surgeons®.

References

1. Alsoof D, McDonald CI, Kuis EO, Daniels AH. Machine learning for the orthopaedic surgeon, uses and limitations. J Bone Joint Surg Am. 2022;104:1586-1594.
2. Bongers MER, Karhade AV, Villavieja J, et al. Does the SORG algorithm generalize to a contemporary cohort of patients with spinal metastases on external validation? Spine J. 2020;20:1646-1652.
3. Hornung AL, Hornung CM, Mallow GM, et al. Artificial intelligence in spine care: current applications and future utility. Eur Spine J. 2022;31:2057-2081.
4. Huang C, Peng K, Hsieh H, et al. Does the presence of missing data affect the performance of the SORG machine-learning algorithm for patients with spinal metastasis? Development of an internet application algorithm. Clin Orthop Relat Res. 2024;482:143-157.
5. Karhade AV, Ahmed AK, Pennington Z, et al. External validation of the SORG 90-day and 1-year machine learning algorithms for survival in spinal metastatic disease. Spine J. 2020;20:14-21.
6. Karhade AV, Fenn B, Groot OQ, et al. Development and external validation of predictive algorithms for six-week mortality in spinal metastasis using 4,304 patients from five institutions. Spine J. 2022;22:2033-2041.
7. Navarro CL, Damen JAA, Takada T, et al. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ. 2021;375:n2281.
8. Makhni EC, Makhni S, Ramkumar PN. Artificial intelligence for the orthopaedic surgeon: an overview of potential benefits, limitations, and clinical applications. J Am Acad Orthop Surg. 2021;29:235-243.
9. Myers TG, Ramkumar PN, Ricciardi BF, Urish KL, Kipper J, Ketonis C. Artificial intelligence and orthopaedics: an introduction for clinicians. J Bone Joint Surg Am. 2020;102:830-840.
10. Tseng TE, Lee CC, Yen HK, et al. International validation of the SORG machine-learning algorithm for predicting the survival of patients with extremity metastases undergoing surgical treatment. Clin Orthop Relat Res. 2022;480:367-378.
11. Zegarek G, Tessitore E, Chaboudez E, Nouri A, Schaller K, Gondar R. SORG algorithm to predict 3- and 12-month survival in metastatic spinal disease: a cross-sectional population-based retrospective study. Acta Neurochir (Wien). 2022;164:2627-2635.

Articles from Clinical Orthopaedics and Related Research are provided here courtesy of The Association of Bone and Joint Surgeons
