Plastic and Reconstructive Surgery Global Open. 2021 Jun 24;9(6):e3638. doi: 10.1097/GOX.0000000000003638

Machine Learning Demonstrates High Accuracy for Disease Diagnosis and Prognosis in Plastic Surgery

Angelos Mantelakis, Yannis Assael, Parviz Sorooshian, Ankur Khajuria
PMCID: PMC8225366  PMID: 34235035

Abstract

Introduction:

Machine learning (ML) is a set of models and methods that can detect patterns in vast amounts of data and use this information to perform various kinds of decision-making under uncertain conditions. This review explores the current role of this technology in plastic surgery by outlining its applications in clinical practice, reported diagnostic and prognostic accuracies, and proposed future directions for clinical applications and research.

Methods:

EMBASE, MEDLINE, CENTRAL, and ClinicalTrials.gov were searched from 1990 to 2020. Any clinical studies (including case reports) that presented the diagnostic and prognostic accuracies of machine learning models in the clinical setting of plastic surgery were included. Data collected were the clinical indication, model utilised, reported accuracies, and comparison with clinical evaluation.

Results:

The database search identified 1181 articles, of which 51 were included in this review. The clinical utility of these algorithms was to assist clinicians in diagnosis prediction (n = 22), outcome prediction (n = 21), and pre-operative planning (n = 8), with mean accuracies of 88.80%, 86.11%, and 80.28%, respectively. The most commonly used models were neural networks (n = 31), support vector machines (n = 13), decision trees/random forests (n = 10), and logistic regression (n = 9).

Conclusions:

ML has demonstrated high accuracies in the diagnosis and prognostication of burn patients, congenital or acquired facial deformities, and in cosmetic surgery. No studies compared ML with clinicians' performance. Future research can be enhanced by using larger datasets or data augmentation, employing novel deep learning models, and applying these to other subspecialties of plastic surgery.

INTRODUCTION

An expanding population in the United States has resulted in an increasing demand for plastic surgery services, which, coupled with a static number of residents and an increasing number of retiring surgeons, is increasing the pressure on the delivery of high-quality care.1 It is now estimated that there is a workforce shortage of 800 attending physicians in the United States, reducing the availability of care.1 Artificial Intelligence (AI) could have a major impact on addressing the challenges that healthcare systems face. Digital technologies are predicted to affect more than 80% of the healthcare workforce in the next 2 decades, changing the way physicians practice medicine and meeting the increasing demand for services.2 AI can help drive this change by automating repetitive tasks to free up clinicians' time, improving the diagnostic accuracy of diseases, and predicting patient outcomes.2

Machine learning (ML), a subfield of AI, is a set of models able to learn from past cases (data) to make future predictions. A wide variety of such algorithms are in use today, such as the automated, individualized suggestions generated during a Google Search based on one's previous searches. These models can be classified into two broad categories: supervised learning and unsupervised learning. The difference between the two lies in the existence of labeled data. In supervised learning, the models are trained using examples with known labels (labeled data) and, after training, aim to predict outcomes for new data.3,4 This function has been utilized in healthcare to assist both in making a diagnosis and in predicting disease outcomes. Authors have utilized supervised learning to successfully classify whether a skin lesion is benign (eg, benign nevi) or malignant (malignant melanoma), outperforming the accuracy of 21 board-certified dermatologists (accuracy 72% versus 66%, P < 0.05).5 Similarly, supervised learning has been utilized to predict the risk of developing a condition such as breast cancer based on epidemiological data, and the risk of recurrence after treatment.6,7
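
To make the supervised-learning workflow concrete, the sketch below trains a support vector machine on labeled examples and then predicts labels for unseen cases. It is an illustrative example only, not a model from any included study; the bundled scikit-learn breast cancer dataset, the model choice, and all parameters are arbitrary.

```python
# Minimal supervised-learning sketch (illustrative; not a model from this review).
# A classifier is fitted to labeled examples, then predicts labels for held-out cases.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)            # features and known labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)               # hold out unseen cases

model = SVC(kernel="rbf", gamma="scale")               # support vector machine classifier
model.fit(X_train, y_train)                            # learn from the labeled training data

predictions = model.predict(X_test)                    # predict labels for new data
print(f"Accuracy on unseen cases: {accuracy_score(y_test, predictions):.2%}")
```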

In contrast, unsupervised learning models are trained using unlabeled data, and after training, aim to discover underlying groupings or patterns from the data themselves.3,8 These algorithms can be particularly useful in identifying previously unknown patterns in vast amounts of unprocessed data, which may then be used in clinical practice. Examples include novel classification of diseases into various subtypes and identifying subgroups of patients with increased risk of certain conditions based on various characteristics (for example, their genome).9,10
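
As a contrast with the supervised sketch above, the following minimal example (again illustrative only, using synthetic unlabeled data) applies K-means clustering to group observations without any outcome labels, analogous to discovering previously unknown patient subgroups.

```python
# Minimal unsupervised-learning sketch (illustrative; synthetic data).
# K-means groups unlabeled observations into clusters; no labels are supplied.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
patients = rng.normal(size=(100, 5))        # hypothetical: 100 patients, 5 numeric features

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(patients)  # each patient is assigned to a discovered group

print("Patients per cluster:", np.bincount(cluster_ids))
```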

In addition to meeting demand for plastic surgery services, this technology has the potential to revolutionize how plastic surgery is practiced and enhance surgeons' diagnosis prediction, preoperative planning, and outcome prediction, leading to improved patient care. In burn surgery, even the most experienced surgeons achieve a clinical estimation accuracy of 64%–76% in the diagnosis of burn depth.11,12 ML models may outperform this, achieving correct burn depth identification from 2D photographs in up to 87% of cases, potentially leading to more appropriate clinical management at presentation.13 Further, in the prognostication of whether a burn injury will heal within 14 days of presentation, ML models have demonstrated an accuracy of 86%, again surpassing the accuracy of prognostication by clinicians.4 In the field of microsurgery, postoperative monitoring via 2D image analysis achieves a 95% accuracy in classifying a flap as normal, showing venous obstruction, or showing arterial occlusion, leading to potential early identification of flap failure and increased salvage rates.4 However, the evidence on clinical applications of ML remains fragmented, with no systematic reviews summarizing the clinical accuracy of such models in practice. Such a review could act as a starting point for developing clinical practice guidelines and guide future research.14–17 The aim of this study was to systematically synthesize and report the current literature on the clinical applications of ML in plastic surgery.

METHODS

Search Strategy

The protocol for this systematic review was registered with PROSPERO (International Prospective Register of Systematic Reviews; registration number CRD42019140924). The full protocol was published a priori, and there were no deviations from the original protocol.18 This systematic review was conducted and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.19

A systematic literature search was performed in MEDLINE (OVID SP), EMBASE (OVID SP), CENTRAL, and ClinicalTrials.gov to identify relevant studies for review. The reference lists of all included studies were also screened, and relevant studies were added. Lastly, manual searches of bibliographies, citations, and related articles (PubMed function) were performed to identify missed relevant studies. Medical Subject Headings (MeSH) terms were used in combination with free text to construct our search strategy. A sample search strategy used in MEDLINE (OVID SP) is shown in Table 1.

Table 1.

Example Search Strategy Used for MEDLINE

1 (“deep learning” OR “artificial intelligence” OR “machine learning” OR “decision trees” OR “random forests” OR SVM OR “support vector machine”)
2 exp “NEURAL NETWORKS (COMPUTER)”/ OR exp “DEEP LEARNING”/
3 exp “ARTIFICIAL INTELLIGENCE”/
4 (1 OR 2 OR 3)
5 (microsurgery OR (surgery AND (plastic OR reconstructive OR esthetic OR aesthetic OR burns OR hand OR craniofacial OR “peripheral nerve”)))
6 exp “SURGERY, PLASTIC”/ OR exp “RECONSTRUCTIVE SURGICAL PROCEDURES”/
7 (5 OR 6)
8 (4 AND 7)

Selection Criteria

All eligible studies between January 1990 and June 2020 were included in this review. We included any primary studies (including case reports) that present clinical data on the application of ML in plastic surgery. Only articles in the English language were included. Our exclusion criteria included descriptions of ML in plastic surgery without clinical data, review articles, conference abstracts, animal studies, and articles pertaining to the use of ML outside the remit of the specialty (as defined by the Intercollegiate Surgical Curriculum Program in Plastic Surgery).

After the library preparation, two independent reviewers (AM and PS) screened the search results for inclusion based on the title and abstracts. Subsequently, a full-text review was performed independently by the same two researchers (AM and PS) for all included studies. At each step, any discrepancy of opinion was resolved with consensus, and if not resolved, was referred to a third reviewer (AK). If any doubt remained, the article proceeded to the next step of the review. The search results of all included articles, abstracts, full-text articles, and records of the reviewers’ decisions, including reasons for exclusion, were recorded.

Outcome Measures

The primary outcome was the ML algorithm's statistical accuracy in performing a prespecified clinical task (eg, prediction of a clinical diagnosis or postoperative outcome). Secondary outcomes included the reported specificity, sensitivity, area under the curve (AUC), and technical characteristics of the algorithms.
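
For reference, the snippet below shows how these outcome measures relate to a binary confusion matrix. The labels and predicted probabilities are hypothetical and serve only to illustrate the definitions used in this review.

```python
# Illustrative computation of accuracy, sensitivity, specificity, and AUC
# from hypothetical binary predictions (not data from any included study).
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]                          # ground-truth labels
y_pred  = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]                          # model classifications
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.3, 0.7, 0.2, 0.95, 0.6]     # predicted probabilities

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy    = (tp + tn) / (tp + tn + fp + fn)   # correct predictions / all predictions
sensitivity = tp / (tp + fn)                    # true-positive rate
specificity = tn / (tn + fp)                    # true-negative rate
auc         = roc_auc_score(y_true, y_score)    # area under the ROC curve

print(accuracy, sensitivity, specificity, auc)
```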

Data Extraction and Analysis

The data from all full-text articles accepted for the final analysis were independently retrieved by AM and PS, using a standardized data extraction form. Any disagreements were resolved by discussion or referred to the third researcher (AK). The following data (where available) were extracted:

  • a) Study details (year of publication, country), patient demographics, study setting, clinical condition examined.

  • b) ML algorithm characteristics (intended function, whether the model was supervised or unsupervised, function via classification or outcome prediction, usage of real or synthetic data, and which type of ML model was used)

  • c) Primary and secondary outcomes, as above.

Statistical meta-analysis could not be performed because of the heterogeneity of the studies in the conditions examined and software models utilized. Instead, a narrative review was performed, with a subgroup analysis of the mean accuracy of the models, calculated as the number of correct predictions over the total predictions made.

The subgroup analyses were based on the model function (diagnosis prediction, preoperative planning, and outcome prediction) and the type of model (neural networks [NNs], support vector machines [SVMs], decision trees/random forests, and linear regression). This subgroup classification was based on the objectives set for AI models in clinical practice by NHS England.2
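
The sketch below illustrates how such a subgroup summary can be computed once the extracted accuracies are tabulated per algorithm alongside its clinical function and model type; the values shown are placeholders, not the review data.

```python
# Subgroup-analysis sketch (placeholder values, not the extracted review data):
# mean reported accuracy grouped by clinical function and by model type.
import pandas as pd

extracted = pd.DataFrame({
    "function": ["diagnosis", "diagnosis", "outcome", "planning"],
    "model":    ["NN", "SVM", "NN", "SVM"],
    "accuracy": [0.917, 0.954, 0.860, 0.797],
})

print(extracted.groupby("function")["accuracy"].mean())  # by clinical utility
print(extracted.groupby("model")["accuracy"].mean())     # by model type
```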

Quality Assessment

The quality of the included studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool by two independent reviewers (AM and PS).71 There were no disagreements between the reviewers. The QUADAS-2 tool allows for assessment of risk of bias and of applicability concerns in primary diagnostic accuracy studies. Risk of bias was assessed based on patient selection, index test (in this review, the ML algorithm), reference standard (comparator), and flow and timing. Concerns regarding applicability were assessed on the first three domains alone.

RESULTS

Literature Search Results

From a total of 1536 studies, after removal of duplicates, 1181 articles were eligible for title and abstract review. Of these, 1074 articles did not meet the inclusion criteria and were excluded. Following full-text review of the remaining 107 articles, 56 were excluded because the inclusion criteria were not met. A total of 51 articles were included and formed the basis of this systematic review (Fig. 1). Details of the included studies are summarized in Table 2.20–70

Fig. 1. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram.

Table 2.

Primary Outcomes of Accuracy, Sensitivity, and Specificity for Reconstructive and Burns Surgery

Study Author, Year Function Model Accuracy Sensitivity Specificity AUC
1 Abubakar et al, 202020 DP CNN White: 99.3%; Afro-Caribbean: 97.1% NR NR NR
2 Chauhan J et al, 202021 DP BPBSAM (CNN + SVM) 91.70% NR NR NR
3 Desbois et al, 202022 DP DNN with 3 measures 91.98% NA NA NR
DNN with 4 measures 92.45% NA NA NR
Boost with 3 measures 97.89% NA NA NR
Boost with 4 measures 98.08% NA NA NR
avNN with 3 measures 97.45% NA NA NR
avNN with 4 measures 98.30% NA NA NR
4 Rashidi et al, 202023 OP DNN 100% 92% 93% 0.880
LR 95% 91% 90% 0.940
SVM 98% NR NR 0.780
RF 93% NR NR 1.000
k-NN 98% 91% 82% 0.960
5 Bhalodia et al, 202024 DP Shapeswork software with PCA NR NR NR NR
6 Guarin et al, 202025 DP NR NR NR NR NR
7 Formeister et al, 202026 OP Gradient Boosted Decision Tree 60.00% 62.00% 60.00% NR
8 Boczar et al, 202027 Intervention IBM Watson 92.30% NR NR NR
9 O’Neill et al, 202028 OP Decision Tree NR 5.00% 86.80% 0.672
10 Yoo et al, 202029 OP Deep Learning (Generative adversarial network- GAN) NR NR NR NR
Pix2pix NR NR NR NR
Lightweight CycleGAN NR NR NR NR
DP Deep Learning + No data augmentation 74.20% 75.80% 72.70% 0.824
Deep Learning + Std data augmentation 83.30% 78.80% 87.90% 0.872
Deep Learning + GAN data augmentation 90.90% 87.80% 93.90% 0.957
11 Angullia et al, 202030 OP Least squares radial basis function NA NA NA NA
12 Eguia et al, 202031 OP Decision Tree NA NA NA 0.690
Stepwise Logistic Regression NA NA NA 0.800
LR NA NA NA 0.830
k-NN NA NA NA 0.840
13 Ohura et al, 201932 DP SegNet 97.60% 90.90% 98.20% 0.994
LinkNet 97.20% 98.90% 98.90% 0.987
U-Net 98.80% 99.30% 99.30% 0.997
Unet_VGG16 98.90% 99.20% 99.20% 0.998
14 Porras et al, 201933 DP SVM 95.30% 94.70% 96% NR
15 Knoops et al, 201934 DP SVM 95.40% 95.50% 95.20% NR
OP LRRRLARLASSO NR NR NR NR
16 Hallac et al, 201935 DP Pretrained Google-Net 94.10% 97.80% 86% NR
17 Levites et al, 201936 DP Text-based emotion analysis NR NR NR NR
18 Shew et al, 201937 OP 2-class Decision Forest 64.40% NR NR NR
19 Dorfman et al, 201938 DP Neural Nets NR NR NR NR
20 Qiu et al, 201939 PP U-Net CNN NR NR NR NR
21 Aghaei et al, 201940 OP ANN-MLP 73.30% 76.20% 70.20% 0.762
SVM 67.20% 66.10% 68.40% 0.731
RF 67.20% 61% 73.70% 0.751
LR (FS) 67.20% 61% 73.70% 0.711
LR (BS) 66.40% 64.40% 67.70% 0.718
22 Cirillo et al, 201941 DP VGG-16 77.53% NR NR NR
Google-Net 73.80% NR NR NR
Res-Net 50 77.79% NR NR NR
Res-Net 101 without data aug 90.54% 74.35% 94.25% NR
Res-Net 101 with data aug 82.72% NR NR NR
23 Tran et al, 201942 OP k-NN with k = 1-6 or 8-20 100% NA NA NR
24 Yadav et al, 201943 DP MDS modeling 80% 97.00% 60.00% NR
SVM 82.43% 87.80% 83.33% NR
25 Jiao et al, 201944 DP R101A CNN 82.04% NA NA NR
IV2RA CNN 83.02% NA NA NR
R101FA CNN 84.51% NA NA NR
26 Liu et al, 201845 PP Least Squares Regression NR NR NR NR
Decision tree NR NR NR NR
Sigmoid Neural Nets NR NR NR NR
Hyperbolic Tangent Neural Net NR NR NR NR
Combined Model (Tree +NN) NR NR NR NR
27 Martinez-Jimenez et al, 201846 OP Recurrent Partitioning Random Forest 85.35% NR NR NR
28 Su et al, 201847 OP Random Forest NA NA NA NR
29 Tang et al, 201848 OP L.R 80.50% 84.40% 77.70% 0.875
XGBoost 85.40% 82.00% 89.70% 0.920
30 Cobb et al, 201849 OP Random Forest NA NA NA NR
Stochastic Gradient Boosting NR
31 Cho MJ et al, 201850 DP K-means 96% NR NR NR
32 Kuo et al, 201851 OP MLR 72.70% 22.10% 93.30% NR
33 Tan et al, 201752 PP NR NR NR NR NR
34 Huang et al, 201653 OP SVM 100% NA NA NR
35 Park et al, 201554 PP Feature wrapping 77.30% 99% 74.10% NR
36 Serrano et al, 201555 PP SVM 79.73% 97% 60% NR
37 Mukherjee et al, 201456 DP SVM with 3rd polynomial kernel 86.13% NA NA NR
Bayesian classifier 81.15% NA NA NR
38 Mendoza et al, 201457 DP LDA 95.70% 97.90% 99.60% NR
DP Random Forest 87.90% NR NR NR
DP SVM 90.80% NR NR NR
39 Acha et al, 201358 DP k-NN 66.2% NR NR NR
SVM 75.7% NR NR NR
PP k-NN 83.8% NR NR NR
SVM 82.4% NR NR NR
40 Schneider et al, 201259 OP CART Decision Tree with Gini splitting function 73.30% NA NA NR
41 Patil et al, 200960 OP Bayesian classifier 97.78% 100% 95.50% 0.978
Decision Tree 96.12% 96.60% 95.51% 0.961
SVM 96.12% 98.60% 93.26% 0.961
Back propagation 95% 96.71% 93.26% 0.949
42 Yamamura et al, 200861 OP ANN 100% NA NA NR
LR 72% NA NA NR
43 Ruiz-Correa et al, 200862 DP SVM 95.05% NR NR NR
44 Acha et al, 200563 DP Fuzzy-ArtMap Neural Network 82.26% 83.01% NA NR
45 Yeong et al, 200564 OP ANN 86% 75% 97% NR
46 Serrano et al, 200565 DP Fuzzy-ArtMap Neural Network 88.57% 83.01% NA NR
47 Yamamura et al, 200466 OP ANN 100% 100% 100% NR
LR 80% 66.70% 85.70% NR
ANN with leave-one-out crossvalidation 86.60% 66.70% 95.20% NR
48 Acha et al, 200367 OP Fuzzy-ArtMap Neural Network 82.60% NR NR NR
49 Estahbanati et al, 200268 OP ANN 90% 80% NA NR
50 Hsu et al, 200069 PP Shallow Neural Net NA NA NA NR
51 Frye et al, 199670 OP Feed forward, back propagation error adjustment model 98% NA NA NR
77% NA NA NR

ADTree, alternating decision tree; AUC, area under the curve; CNN, convolutional neural network; DNN, deep neural network; DP, diagnosis prediction; k-NN, k-nearest neighbor; LASSO, least absolute shrinkage and selection operator; LDA, linear discriminant analysis; MLR, multiple logistic regression; NA, not applicable; NB classifier, Naive Bayes classifier; NR, not reported; OP, outcome prediction; PP, preoperative planning; RF, random forest.

Breakdown of the Applications of ML Models in Diagnosis Prediction, Outcome Prediction, and Preoperative Planning

In total, 51 studies were included in the review, which evaluated the accuracy of 103 ML algorithms. Of these, 27 were on burns surgery and 24 on general reconstructive surgery. The publication years ranged from 1996 to 2020, with 25 studies published in the past year alone (2019–2020). The clinical utility of these algorithms was to assist clinicians in diagnosis prediction (n = 22), outcome prediction (n = 21), and preoperative planning (n = 8).

In diagnosis prediction, algorithms were created to assist in automated burn depth diagnosis from 2D photography (n = 9) and total burn surface area (n = 1), automated diagnosis of craniosynostosis (n = 5), wound identification in 2D photography (n = 2), diagnosis and severity assessment of facial palsy (n = 1), diagnosis of congenital auricular deformities (n = 1), identification of emotional responses to plastic surgery on Twitter (n = 1), automated age estimation after rhinoplasty (n = 1), and identifying the correct answer to frequently asked questions (n = 1).

In outcome prediction, the ML algorithms created predicted mortality in burn patients (n = 5), the occurrence of AKI in burn and trauma patients (n = 4), occurrence of postoperative complications in breast and head and neck free flap reconstruction (n = 3), concentration and response of aminoglycosides in burn patients (n = 2), postoperative facial appearance after oculoplastic and craniosynostosis surgery (n = 2), burn healing time (n = 1), mortality in patients with necrotizing soft tissue infection (n = 1), delay in radiotherapy following cancer excision (n = 1), posttraumatic stress disorder following burns (n = 1), and factors predicting the occurrence of burns in the pediatric population (n = 1).

In preoperative planning, ML was used to predict which wounds will need grafting (n = 2), which patients will need orthognathic or cleft palate operations (n = 2), to plan orthognathic and mandibular resections (n = 2), to predict open wound size (n = 1), and to predict the complexity of reconstruction following head and neck cancer excision (n = 2).

ML Models Demonstrate High Accuracy, Sensitivity, and Specificity That May Enhance Clinical Decision-making

The 51 studies evaluated 103 ML algorithms (Table 2). The pooled mean accuracy of the ML algorithms was 86.84% (range 60.00%–100%). The pooled mean sensitivity and specificity were 81.88% (range 5.00%–99.30%) and 86.38% (range 60.00%–100%), respectively, as reported in 39 models.

A subgroup analysis was performed based on the clinical utility of the algorithms. For diagnosis prediction, the pooled accuracy, sensitivity, and specificity of ML algorithms was 88.80% (range 66.20–97.60%), 90.62% (range 75.80–97.90%), and 86.81% (range 60.00–99.60%). In outcome prediction, this was 86.11% (range 66.20–97.60%), 69.67% (range 5.00–100%), and 85.94% (range 60.00–100%), respectively. In preoperative planning, two studies reported the accuracy, sensitivity, and specificity, which were 80.28% (range 77.30–83.80%), 98.00% (range 97.00–99.00%), and 67.05% (range 60.00–74.10%).

A second subgroup analysis on the reported accuracy was performed based on the type of model utilized. The mean accuracy for NNs was 88.25% (range 73.80–100%), SVMs 88.02% (range 67.20–100%), decision trees/random forest 78.75% (range 60.00–96.12%), and linear regression 76.85% (range 66.40–95.00%).

Breakdown and Analysis of the Supervised and Unsupervised ML Models Utilized

Supervised ML was utilized in 50 of the included studies and unsupervised learning in three studies (two studies employed both supervised and unsupervised learning). The supervised ML algorithms identified are summarized in Table 3. The most commonly used were NNs (n = 34), SVMs (n = 13), decision trees/random forests (DT/RF, n = 10), and LR (n = 9). The unsupervised ML models utilized were K-means clustering and ShapeWorks software with principal component analysis; in one study, the algorithm was not reported.

Table 3.

Technical Characteristics of ML Algorithms Utilized in Burns and Reconstructive Surgery

Study No. Author Function Purpose Input Output Supervised or Unsupervised Modeling (Classification or Regression) Real or Synthetic Data Training
Training Validation Test
1 Abubakar et al, 202020 DP Differentiate healthy versus burned skin in both white and black skin 2D photographs Differentiate healthy versus burned skin in both white and black skin Supervised Classification Data augmentation 80% NA 20%
2 Chauhan J et al, 202021 DP Diagnose depth of burns 2D photographs Differentiate body part + severity of burn Supervised Classification Data augmentation 80% 20% Separate test set
3 Desbois et al, 202022 DP Automated assessment of TBSA Anthropometric measurements Automated assessment of TBSA Supervised Regression Real data 80% NA 20%
4 Rashidi et al, 202023 OP Prediction of AKI in burn and trauma patients Renal injury biomarkers and urine output Prediction of AKI in burn and trauma patients Supervised Classification Real data 59% NA 41%
5 Bhalodia et al, 202024 DP Measuring severity of craniosynostosis CT images Measuring severity of craniosynostosis Unsupervised NA Real data NR NR NR
6 Guarin et al, 202025 DP Diagnosis and severity assessment of facial palsy 2D photographs Automatic localization of 68 facial features in healthy and patients photographs Unsupervised N/A Real data 90% 5% 5%
7 Formeister et al, 202026 OP Predicting any type of complications following free flap reconstruction 14 patient characteristics Prediction of complications in microvascular free flaps Supervised Classification Real data 80% NA 20%
8 Boczar et al, 202027 DP Answering frequently asked questions Participant question Correct answer to FAQs Supervised Classification Real data NR NR NR
9 O’Neill et al, 202028 OP Predicting flap failure in microvascular breast free flap reconstruction 7 patient characteristics Flap failure (yes/no) Supervised Classification Data augmentation 50%–70% NA 30%–50%
10 Yoo et al, 202029 OP Postoperative appearance following oculoplastic surgery for thyroid-associated ophthalmopathy Preoperative photograph Postoperative photograph Supervised Regression Data augmentation NR NR NR
11 Angullia et al, 202030 OP Prediction of changes in face shape from craniosynostosis surgery High resolution CT Predict changes in face shape from craniosynostosis surgery Supervised Regression Real data NR NR NR
12 Eguia et al, 201931 OP Prediction of in-hospital mortality in patients with necrotizing skin and soft tissue infection Patient demographics, co-morbidities, and hospital characteristics (73 parameters in total) Prediction of in-hospital mortality in patients with necrotizing skin and soft tissue infection Supervised Classification Real data 80% NA 20%
13 Ohura et al, 201932 DP Diagnosis of wound ulcer 2D photographs Differentiation of healthy tissue from ulcer region Supervised Classification Real data 90% NA 10%
14 Porras et al, 201933 DP Diagnosis of craniosynostosis from 3D photographs 3D photographs Diagnosis of craniosynostosis from 3D photographs Supervised Classification Real data NR NR NR
15 Knoops et al, 201934 PP Orthognathic surgery CT Need for orthognathic surgery (yes/no) Supervised Classification Real data 80% NA 20%
16 Hallac et al, 201935 DP Diagnosis of congenital auricular deformities 2D photographs Identify presence of congenital auricular deformities (yes/no) Supervised Classification Real data NR NR NR
17 Levites et al, 201936 DP Identify emotional responses to plastic surgery Twitter key words Analyze emotional responses to plastic surgery procedures Supervised Classification Real data 60% 20% 20%
18 Shew et al, 201937 OP Prediction of delay in radiotherapy Variable inpatient patient data Prediction of delay of radiotherapy (more or less than 50 days to treatment) Supervised Classification Real data NR NR NR
19 Dorfman et al, 201938 DP Identification of age perception following rhinoplasty 2D photographs Automated age prediction Supervised Classification Real data NR NR NR
20 Qiu et al, 201939 PP Plan mandibular resections CT Automated 3D mandibular segmentation preoperatively Supervised Regression Real data 48% 7% 45%
21 Aghaei et al, 201940 OP Elaboration of factors predicting pediatric burns Various health, social, and demographic risk factors Most important factors in predicting burn occurrence Supervised Classification Real data 70% NA 30%
22 Cirillo et al, 201941 DP Diagnose depth of burns 2D photographs Classification of burn depth Supervised Classification Data augmentation NR NR NR
23 Tran et al, 201942 OP Prediction of AKI in burn and trauma patients Renal injury biomarkers and urine output Prediction of AKI in burn and trauma patients Supervised Classification Real data 80% NA 20%
24 Yadav et al, 201943 DP Diagnose depth of burns 2D photographs Classify burns by depth and surface area Supervised Classification Real data NR NR NR
25 Jiao et al, 201944 DP Diagnose depth of burns 2D photographs Classify burns by depth and surface area Supervised Classification Real data 87% NA 13%
26 Liu et al, 201845 PP Explore whether ML can predict open wound size Fluid resus volume and other patient factors Predict open wound size Supervised Regression Real data 90% NA 10%
27 Martinez-Jimenez et al, 201846 PP Predicting which wounds need grafting Infrared thermography Prediction of treatment modality required for burn wound Supervised Classification Real data 61% NA 39.00%
28 Su et al, 201847 OP Prediction of PTSD & major depressive disorder in burn patients Burn-related variables, empirically-derived risk factors from previous meta-analysis & theory-derived cognitive variables Prediction of PTSD & major depressive disorder in burn patients NR NR NR NR NR NR
29 Tang et al, 201848 OP Prediction of AKI in burn patients Patient risk factors and laboratory measurements Prediction of AKI in burn patients Supervised Classification Real data NR NR NR
30 Cobb et al, 201849 OP Prediction of mortality of burn patients Patient risk factors and laboratory measurements Predict whether a patient would (1) live versus (2) die Supervised Classification Real data 66% NA 34%
31 Cho MJ et al, 201850 DP Diagnosis of craniosynostosis CT images Automated differentiation of craniosynostosis from benign metopic ridge from CT Unsupervised Classification Real data NR NR NR
32 Kuo et al, 201851 OP Predicting surgical site infection Patient risk factors Prediction of SSI (yes/no) Supervised Classification Real data 70% NA 30%
33 Tan et al, 201752 PP Complexity of reconstruction following basal cell cancer excision Patient risk factors Prediction of intraoperative surgical complexity Supervised Classification Real data NR NR NR
34 Huang et al, 201653 OP Prediction of mortality of burn patients Patient risk factors and laboratory measurements Prediction of whether a patient would (1) live versus (2) die Supervised Classification Real data 21% 66% 13%
35 Park et al, 201554 PP Prediction of need for surgery in patients with cleft lip/palate Lateral cephalograms Prediction of need for surgery in patients with cleft lip/palate Supervised Classification Real data NR NR NR
36 Serrano et al, 201555 PP Predicting which wounds need grafting 2D photographs Predicting which wounds need grafting (yes/no) Supervised Classification Real data 21% NA 79%
37 Mukherjee et al, 201456 DP Wound recognition and classification 2D photographs Automated assessment of wound classification Supervised Classification Real data NR NR NR
38 Mendoza et al, 201457 DP Diagnosis of craniosynostosis CT images Automated craniosynostosis diagnosis from CT Supervised Classification Real data NR NR NR
39 Acha et al, 201358 DP Diagnose depth of burns 2D photographs Classify burns by depth Supervised Classification Real data 21% NA 79%
PP Predicting which wounds need grafting 2D photographs Predict whether a burn will need grafting Supervised Classification Real data 21% NA 79%
40 Schneider et al, 201259 OP Prediction of AKI in burn patients Patient risk factors and laboratory measurements Prediction of AKI in burn patients Supervised Classification Real data 71% NA 29.00%
41 Patil et al, 200960 OP Prediction of mortality of burn patients Patient risk factors and laboratory measurements Prediction of mortality in burn patients Supervised Classification Real data K-cross validation K-cross validation K-cross validation
42 Yamamura et al, 200861 OP Prediction of response of aminoglycosides against MRSA infection in burn patients Patient risk factors and laboratory measurements Prediction of response of aminoglycosides against MRSA infection in burn patients Supervised Classification Real data K-cross validation K-cross validation K-cross validation
43 Ruiz-Correa et al, 200862 DP Diagnosis of craniosynostosis CT images Classification of craniosynostosis Supervised Classification Real data
44 Acha et al, 200563 DP Diagnose depth of burns 2D photographs Automated assessment of burn wound depth Supervised Classification Real data 56% NA 44%
45 Yeong et al, 200564 OP Prediction of burn healing time Reflectance spectometer measurements Prediction of burn healing time Supervised Classification Real data NR NR NR
46 Serrano et al, 200565 DP Diagnose depth of burns 2D photographs Automated assessment of burn wound depth Supervised Classification Real data NR NR NR
47 Yamamura et al, 200466 OP Prediction of aminoglycoside/ab × concentration in burn patients Patient risk factors and laboratory measurements Prediction of aminoglycoside/ab × concentration in burn patients Supervised Classification Real data 100% 100% 100%
Supervised Classification Real data 80% 66.70% 85.70%
48 Acha et al, 200367 DP Identify burn tissue from healthy, and classify depth of burn 2D photographs Identify burn tissue from healthy, and classify depth of burn Supervised Classification Real data 80% NA 20%
49 Estahbanati et al, 200268 OP Prediction of mortality of burn patients Patient risk factors and laboratory measurements Prediction of mortality of burn patients Supervised Classification Real data 75% NA 25%
50 Hsu et al, 200069 PP Skull reconstruction of areas needing an operation CT Skull reconstruction in CT for preoperative planning Supervised Regression Real data NA NA NA
51 Frye et al, 199670 OP Prediction of mortality of burn patients Patient risk factors and laboratory measurements Prediction of mortality of burn patients Supervised Classification Real data 90% NA 10%
Prediction of hospital stay of burn patients Prediction of hospital stay of burn patients Supervised Classification Real data 90% NA 10%

NA, not applicable; NR, not reported.

Lack of Data Augmentation and Validation during Training

Data augmentation is often used with small datasets to artificially create more data samples, increasing the effective dataset size and, as a result, the statistical performance of a model. Data augmentation was used in only six of the 51 included studies; the remaining articles relied on real data alone. For diagnostic prediction, the majority of studies utilized 2D photographs (n = 15) and CT scans (n = 4). For clinical outcome prediction, patient risk factors and laboratory measurements on admission were utilized in most models (n = 17). In preoperative planning, CT scans (n = 3) and 2D photographs (n = 2) comprised the majority of inputs utilized.

Training ML models requires splitting the dataset into training, validation, and test sets, where the validation set is used for hyperparameter tuning during training to prevent "overfitting" of the model to the given data. In total, 35 studies reported their training and testing splits, with an 80%–20% split between the training and testing sets being the most common methodology (n = 9); only 10 of these 35 studies utilized a validation set during training.
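
A minimal sketch of such a three-way split is shown below, assuming scikit-learn, synthetic data, and an arbitrary 80/10/10 partition; the validation set is used for hyperparameter tuning, and the test set is touched only once for the final performance estimate.

```python
# Train/validation/test split sketch (synthetic data, arbitrary 80/10/10 proportions).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 10))                 # hypothetical features
y = rng.integers(0, 2, size=1000)          # hypothetical binary labels

# First hold out 20%, then split the holdout half-and-half into validation and test sets.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.2, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 800 / 100 / 100
```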

In terms of output, ML algorithms functioned primarily via classification in 45 studies and via regression in six studies. Classification was used to allocate a new subject to a specific outcome (for example, a burn patient needing grafting versus healing by secondary intention). Regression was used in studies aiming to predict a continuous or image-based postoperative outcome (postoperative CT scan, postoperative 2D photograph, or predicted wound size).

Risk of Bias Assessment

The risk of bias (RoB) and concerns over applicability were assessed with the QUADAS-2 tool (Fig. 2). The majority of studies had an unclear RoB in the patient selection (n = 20) and index test (n = 24) domains. Most had a low RoB in the reference standard (n = 39) and flow and timing (n = 35) domains. For applicability, more than half of the studies had low concern regarding the patient selection, index test, and reference standard domains (n = 32, n = 33, and n = 38, respectively).

Fig. 2. Summary of the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies-2) analysis.

DISCUSSION

This is the first systematic review focusing on the application of ML in plastic surgery, adding to previous reviews on AI in the specialty.72 After careful selection of studies that demonstrated the clinical application of these algorithms, we identified 51 articles describing the application of 103 ML algorithms. In our review, the mean accuracy for diagnosis prediction, outcome prediction, and preoperative planning was 88.80%, 86.11%, and 80.28%, respectively. The model with the highest mean accuracy was NNs (88.25%), followed by SVMs (88.02%), decision trees/random forest (78.75%), and linear regression (76.85%).

Similar findings have been reported in systematic reviews of other surgical specialties. In orthopedic surgery and neurosurgery, the most common models utilized have been neural networks (NNs), followed by support vector machines (SVMs) and logistic regression (LR).3,73 Outcome prediction of ML models in these specialties ranged from 70% to 97%, which is in line with the findings of this report.8,72 Nonsurgical specialties have also utilized NNs and SVMs most frequently, with accuracies approaching 96% depending on the specialty and model intent.74,75 The reason behind this preference is potentially that NNs, SVMs, and decision trees most closely resemble the cognition behind clinical judgment, where clinicians aim to derive outcome classifications from multiple, nonlinear inputs. In plastic surgery, ML demonstrated potentially superior accuracy in diagnosis and outcome prediction when compared with clinician judgment. In burn surgery, models included in this review were able to classify burn thickness with an accuracy of up to 99.3%, in contrast to the 60%–70% achieved by surgeons.21,76 Models have also demonstrated the ability to predict mortality rates with an accuracy of 93%, outperforming commonly used predictive models such as the Belgian score, Boston score, and APACHE II, which have sensitivities of 72%, 66%, and 81%, respectively.50 In microsurgery, models produced an accuracy of 66% in the prognosis of free flap failure, whereas commonly used prognostic surgical risk calculators have been deemed unreliable for head and neck and breast microsurgical reconstruction (Brier score <0.01 and 0.09–0.44, respectively).77,78 In addition, ML models demonstrated a predictive capacity for outcomes for which predictive models have not yet been developed but may assist the surgeon in the clinical workplace. Examples include prediction of AKI in burn patients, mortality from necrotizing infections, and postoperative surgical outcomes in craniosynostosis surgery and reconstructive surgery following craniosynostosis correction.29,31,48,59

ML in plastic surgery has incredible potential to advance patient care, but it is still in its infancy. This review has highlighted several patterns in successful application. Whenever a diagnosis relies solely on a visual stimulus, for example 2D photography or CT, ML has consistently and reliably outperformed surgeons' diagnostic accuracy.18,37,39,40,46,51,53,59,63 Further, in conditions in which there are well-established correlations between certain risk markers and an outcome of interest, such as deranged blood tests on admission and AKI in burn patients, ML yielded highly accurate predictive algorithms.24,38,44,47,55 However, attempts to include weakly related risk markers resulted in algorithms with an overall lower predictive accuracy, rendering them unsafe for clinical practice. This review further identified that some plastic surgery subspecialties, such as hand surgery, have yet to incorporate this technology. This may be due to the challenging nature of classifying potential outcomes (eg, classification of hand function outcomes) or a lack of data, yet future studies should aim to harness the potential of this technology.

From a technological standpoint, this review identified three key areas for improving future algorithms: expanding the dataset size using data augmentation, utilizing novel deep learning models, and making proper use of algorithm validation in research. Data augmentation can be invaluable in the creation of future algorithms, addressing the main obstacle of access to the large amounts of data needed to train these models. It is a process by which one can artificially enhance the diversity of a patient database without actually collecting new data. (See figure, Supplemental Digital Content 1, which displays data augmentation utilizing random cropping, random rotation, and mirroring (horizontal flipping). A single datapoint has now been augmented to seven novel datapoints. http://links.lww.com/PRSGO/B676.)
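
The sketch below illustrates these augmentation operations (random rotation, mirroring, and random cropping) on a synthetic image; it assumes NumPy and SciPy, and none of the array sizes or parameters are taken from the included studies.

```python
# Data-augmentation sketch (synthetic image; illustrative parameters only).
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))                    # stand-in for a clinical photograph

def augment(img, rng):
    out = rotate(img, angle=rng.uniform(-15, 15), reshape=False, mode="nearest")  # random rotation
    if rng.random() < 0.5:                           # mirroring (horizontal flip)
        out = out[:, ::-1, :]
    top, left = rng.integers(0, 32, size=2)          # random 224 x 224 crop
    return out[top:top + 224, left:left + 224, :]

# A single original photograph yields several distinct training samples.
augmented = [augment(image, rng) for _ in range(7)]
print([a.shape for a in augmented])
```

In this way, one datapoint can be expanded into several novel training samples, as illustrated in Supplemental Digital Content 1.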

This was utilized in only five studies in this review. O’Neill et al utilized data augmentation to enhance a database of 11 patients to 269, allowing the creation of an algorithm to predict the probability of total free flap failure in microvascular breast reconstruction.28 Until large-scale anonymized medical datasets, such as the OpenSAFELY platform, become more readily available, data augmentation can help clinicians overcome the challenges of limited patient datasets. Secondly, future research could benefit substantially from more recent advances in the field of NNs and deep learning. Compared with traditional ML, deep NNs can process vast amounts of data efficiently and discover complex underlying patterns at scale. A limitation here is the large volume of appropriately structured data needed to train these models. Lastly, future research should ensure that all algorithms created are validated before testing. Separating the validation and test sets is crucial because it prevents overfitting of an algorithm to a given set of data and avoids reporting misleadingly high performance. Our review identified that only 10 of the 51 studies utilized validation, indicating a high risk of bias in the remaining studies, as their high accuracies could be the result of overfitting.

The evidence in this study is limited by the lack of high-quality level I evidence. The existing studies are mostly small retrospective case series, which are inherently at risk of bias. There are no prospective, randomized controlled trials evaluating these technologies in the clinical setting against clinician acumen, which limits our assessment of the safety and utility of the technologies. Further, the mean accuracy, sensitivity, and specificity of included algorithms were reported collectively for all algorithms, rather than by subgroup analysis based on the condition examined, because of insufficient studies in the specialty. This pooling of results is not an indication of the accuracy of any individual model, and each algorithm should be examined in isolation; however, it still provides an invaluable insight into the accuracy of these algorithms in plastic surgery. Lastly, because of the limited MeSH terms currently used for ML in medicine, potentially important studies on the topic may have been missed. These are expected to be minimal, as we performed a wide library search, complemented by extensive reference checking, to provide an accurate, up-to-date review.

CONCLUSIONS

ML has the potential to enhance clinical decision-making in plastic surgery by making highly accurate diagnostic and outcome predictions; however, the technology is still in its infancy. There is vast heterogeneity between published studies in the clinical tasks the algorithms are designed for and the models utilized, precluding data synthesis and meta-analysis. There is a pressing need for larger prospective, randomized controlled trials generating level I and II data, in which these algorithms are utilized in the clinical setting. Future research could benefit from larger datasets, data augmentation, state-of-the-art deep learning models, and more rigorous validation during design.

Supplementary Material

gox-9-e3638-s001.pdf (43.8KB, pdf)

Footnotes

Published online 24 June 2021.

Disclosure: All the authors have no financial interest to declare in relation to the content of this article.

Related Digital Media are available in the full-text version of the article on www.PRSGlobalOpen.com.

REFERENCES

  • 1. Yang J, Jayanti MK, Taylor A, et al. The impending shortage and cost of training the future plastic surgical workforce. Ann Plast Surg. 2014;72:200–203.
  • 2. Topol E. Preparing the healthcare workforce to deliver the digital future. Health Educ Engl. 2019;1:1–48.
  • 3. Celtikci E. A systematic review on machine learning in neurosurgery: the future of decision-making in patient care. Turk Neurosurg. 2018;28:167–173.
  • 4. Kanevsky J, Corban J, Gaster R, et al. Big data and machine learning in plastic surgery: a new frontier in surgical innovation. Plast Reconstr Surg. 2016;137:890e–897e.
  • 5. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–118.
  • 6. Ahmad LG, Eshlaghy AT, Poorebrahimi A, et al. Using three machine learning techniques for predicting breast cancer recurrence. J Health Med Inform. 2013;4:3.
  • 7. Ayer T, Chhatwal J, Alagoz O, et al. Informatics in radiology: comparison of logistic regression and artificial neural network models in breast cancer risk estimation. Radiographics. 2010;30:13–22.
  • 8. Senders JT, Staples PC, Karhade AV, et al. Machine learning and neurosurgical outcome prediction: a systematic review. World Neurosurg. 2018;109:476–486.e1.
  • 9. Folweiler KA, Sandsmark DK, Diaz-Arrastia R, et al. Unsupervised machine learning reveals novel traumatic brain injury patient phenotypes with distinct acute injury profiles and long-term outcomes. J Neurotrauma. 2020;37:1431–1444.
  • 10. Lopez C, Tucker S, Salameh T, et al. An unsupervised machine learning method for discovering patient clusters based on genetic signatures. J Biomed Inform. 2018;85:30–39.
  • 11. Heimbach DM, Afromowitz MA, Engrav LH, et al. Burn depth estimation–man or machine. J Trauma. 1984;24:373–378.
  • 12. Brown RF, Rice P, Bennett NJ. The use of laser Doppler imaging as an aid in clinical management decision making in the treatment of vesicant burns. Burns. 1998;24:692–698.
  • 13. Liu NT, Salinas J. Machine learning in burn care and research: a systematic review of the literature. Burns. 2015;41:1636–1641.
  • 14. Brinker TJ, Hekler A, Utikal JS, et al. Skin cancer classification using convolutional neural networks: systematic review. J Med Internet Res. 2018;20:e11936.
  • 15. Gardezi SJS, Elazab A, Lei B, Wang T. Breast cancer detection and diagnosis using mammographic data: systematic review. J Med Internet Res. 2019;21:e14464.
  • 16. Nindrea RD, Aryandono T, Lazuardi L, et al. Diagnostic accuracy of different machine learning algorithms for breast cancer risk calculation: a meta-analysis. Asian Pac J Cancer Prev. 2018;19:1747–1752.
  • 17. Thomsen K, Iversen L, Titlestad TL, Winther O. Systematic review of machine learning for diagnosis and prognosis in dermatology. J Dermatol Treat. 2019;29:1–5.
  • 18. Mantelakis A, Khajuria A. The applications of machine learning in plastic and reconstructive surgery: protocol of a systematic review. Syst Rev. 2020;9:44.
  • 19. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6:e1000100.
  • 20. Abubakar A, Ugail H, Bukar AM. Assessment of human skin burns: a deep transfer learning approach. J Med Biol Eng. 2020;40:321–333.
  • 21. Chauhan J, Goyal P. BPBSAM: body part-specific burn severity assessment model. Burns. 2020;46:1407–1423.
  • 22. Desbois A, Beguet F, Leclerc Y, et al. Predictive modeling for personalized three-dimensional burn injury assessments. J Burn Care Res. 2020;41:121–130.
  • 23. Rashidi HH, Sen S, Palmieri TL, et al. Early recognition of burn- and trauma-related acute kidney injury: a pilot comparison of machine learning techniques. Sci Rep. 2020;10:205.
  • 24. Bhalodia R, Dvoracek LA, Ayyash AM, et al. Quantifying the severity of metopic craniosynostosis: a pilot study application of machine learning in craniofacial surgery. J Craniofac Surg. 2020;31:697–701.
  • 25. Guarin DL, Yunusova Y, Taati B, et al. Toward an automatic system for computer-aided assessment in facial palsy. Facial Plast Surg Aesthet Med. 2020;22:42–49.
  • 26. Formeister EJ, Baum R, Knott PD, et al. Machine learning for predicting complications in head and neck microvascular free tissue transfer. Laryngoscope. 2020;130:E843–E849.
  • 27. Boczar D, Sisti A, Oliver JD, et al. Artificial intelligent virtual assistant for plastic surgery patient’s frequently asked questions: a pilot study. Ann Plast Surg. 2020;84:e16–e21.
  • 28. O’Neill AC, Yang D, Roy M, et al. Development and evaluation of a machine learning prediction model for flap failure in microvascular breast reconstruction. Ann Surg Oncol. 2020;27:3466–3475.
  • 29. Yoo TK, Choi JY, Kim HK. A generative adversarial network approach to predicting postoperative appearance after orbital decompression surgery for thyroid eye disease. Comput Biol Med. 2020;118:103628.
  • 30. Angullia F, Fright WR, Richards R, et al. A novel RBF-based predictive tool for facial distraction surgery in growing children with syndromic craniosynostosis. Int J Comput Assist Radiol Surg. 2020;15:351–367.
  • 31. Eguia E, Vivirito V, Cobb AN, et al. Predictors of death in necrotizing skin and soft tissue infection. World J Surg. 2019;43:2734–2739.
  • 32. Ohura N, Mitsuno R, Sakisaka M, et al. Convolutional neural networks for wound detection: the role of artificial intelligence in wound care. J Wound Care. 2019;28(Sup10):S13–S24.
  • 33. Porras AR, Tu L, Tsering D, et al. Quantification of head shape from three-dimensional photography for presurgical and postsurgical evaluation of craniosynostosis. Plast Reconstr Surg. 2019;144:1051e–1060e.
  • 34. Knoops PGM, Papaioannou A, Borghi A, et al. A machine learning framework for automated diagnosis and computer-assisted planning in plastic and reconstructive surgery. Sci Rep. 2019;9:13597.
  • 35. Hallac RR, Lee J, Pressler M, et al. Identifying ear abnormality from 2D photographs using convolutional neural networks. Sci Rep. 2019;9:499–504.
  • 36. Levites HA, Thomas AB, Levites JB, et al. The use of emotional artificial intelligence in plastic surgery. Plast Reconstr Surg. 2019;144:499–504.
  • 37. Shew M, New J, Bur AM. Machine learning to predict delays in adjuvant radiation following surgery for head and neck cancer. Otolaryngol Head Neck Surg. 2019;160:1058–1064.
  • 38. Dorfman R, Chang I, Saadat S, et al. Making the subjective objective: machine learning and rhinoplasty. Aesthet Surg J. 2020;40:493–498.
  • 39. Qiu B, Guo J, Kraeima J, et al. Automatic segmentation of the mandible from computed tomography scans for 3D virtual surgical planning using the convolutional neural network. Phys Med Biol. 2019;64:175020.
  • 40. Aghaei A, Soori H, Ramezankhani A, et al. Factors related to pediatric unintentional burns: the comparison of logistic regression and data mining algorithms. J Burn Care Res. 2019;40:606–612.
  • 41. Cirillo MD, Mirdell R, Sjöberg F, et al. Time-independent prediction of burn depth using deep convolutional neural networks. J Burn Care Res. 2019;40:857–863.
  • 42. Tran NK, Sen S, Palmieri TL, et al. Artificial intelligence and machine learning for predicting acute kidney injury in severely burned patients: a proof of concept. Burns. 2019;45:1350–1358.
  • 43. Yadav DP, Sharma A, Singh M, et al. Feature extraction based machine learning for human burn diagnosis from burn images. IEEE J Transl Eng Health Med. 2019;7:1800507.
  • 44. Jiao C, Su K, Xie W, et al. Burn image segmentation based on mask regions with convolutional neural network deep learning framework: more accurate and more convenient. Burns Trauma. 2019;7:6.
  • 45. Liu NT, Rizzo JA, Shields BA, et al. Predicting the ability of wounds to heal given any burn size and fluid volume: an analytical approach. J Burn Care Res. 2018;39:661–669.
  • 46. Martínez-Jiménez MA, Ramirez-GarciaLuna JL, Kolosovas-Machuca ES, et al. Development and validation of an algorithm to predict the treatment modality of burn wounds using thermographic scans: prospective cohort study. PLoS One. 2018;13:e0206477.
  • 47. Su YJ. Prevalence and predictors of posttraumatic stress disorder and depressive symptoms among burn survivors two years after the 2015 Formosa Fun Coast Water Park explosion in Taiwan. Eur J Psychotraumatol. 2018;9:1512263.
  • 48. Tang CQ, Li JQ, Xu DY, et al. [Comparison of machine learning method and logistic regression model in prediction of acute kidney injury in severely burned patients]. Zhonghua Shao Shang Za Zhi. 2018;34:343–348.
  • 49. Cobb AN, Daungjaiboon W, Brownlee SA, et al. Seeing the forest beyond the trees: predicting survival in burn patients with machine learning. Am J Surg. 2018;215:411–416.
  • 50. Cho MJ, Hallac RR, Effendi M, et al. Comparison of an unsupervised machine learning algorithm and surgeon diagnosis in the clinical differentiation of metopic craniosynostosis and benign metopic ridge. Sci Rep. 2018;8:6312.
  • 51. Kuo PJ, Wu SC, Chien PC, et al. Artificial neural network approach to predict surgical site infection after free-flap reconstruction in patients receiving surgery for head and neck cancer. Oncotarget. 2018;9:13768–13782.
  • 52. Tan E, Lin F, Sheck L, et al. A practical decision-tree model to predict complexity of reconstructive surgery after periocular basal cell carcinoma excision. J Eur Acad Dermatol Venereol. 2017;31:717–723.
  • 53. Huang Y, Zhang L, Lian G, et al. A novel mathematical model to predict prognosis of burnt patients based on logistic regression and support vector machine. Burns. 2016;42:291–299.
  • 54. Park HM, Kim PJ, Kim HG, et al. Prediction of the need for orthognathic surgery in patients with cleft lip and/or palate. J Craniofac Surg. 2015;26:1159–1162.
  • 55. Serrano C, Boloix-Tortosa R, Gómez-Cía T, et al. Features identification for automatic burn classification. Burns. 2015;41:1883–1890.
  • 56. Mukherjee R, Manohar DD, Das DK, et al. Automated tissue classification framework for reproducible chronic wound assessment. Biomed Res Int. 2014;2014:851582.
  • 57. Mendoza CS, Safdar N, Okada K, et al. Personalized assessment of craniosynostosis via statistical shape modeling. Med Image Anal. 2014;18:635–646.
  • 58. Acha B, Serrano C, Fondón I, et al. Burn depth analysis using multidimensional scaling applied to psychophysical experiment data. IEEE Trans Med Imaging. 2013;32:1111–1120.
  • 59. Schneider DF, Dobrowolsky A, Shakir IA, et al. Predicting acute kidney injury among burn patients in the 21st century: a classification and regression tree analysis. J Burn Care Res. 2012;33:242–251.
  • 60. Patil BM, Joshi RC, Toshniwal D, et al. A new approach: role of data mining in prediction of survival of burn patients. J Med Syst. 2011;35:1531–1542.
  • 61. Yamamura S, Kawada K, Takehira R, et al. Prediction of aminoglycoside response against methicillin-resistant Staphylococcus aureus infection in burn patients by artificial neural network modeling. Biomed Pharmacother. 2008;62:53–58.
  • 62. Ruiz-Correa S, Gatica-Perez D, Lin HJ, et al. A Bayesian hierarchical model for classifying craniofacial malformations from CT imaging. Annu Int Conf IEEE Eng Med Biol Soc. 2008;2008:4063–4069.
  • 63. Acha B, Serrano C, Acha JI, et al. Segmentation and classification of burn images by color and texture information. J Biomed Opt. 2005;10:034014.
  • 64. Yeong EK, Hsiao TC, Chiang HK, et al. Prediction of burn healing time using artificial neural networks and reflectance spectrometer. Burns. 2005;31:415–420.
  • 65. Serrano C, Acha B, Gómez-Cía T, et al. A computer assisted diagnosis tool for the classification of burns by depth of injury. Burns. 2005;31:275–281.
  • 66. Yamamura S, Kawada K, Takehira R, et al. Artificial neural network modeling to predict the plasma concentration of aminoglycosides in burn patients. Biomed Pharmacother. 2004;58:239–244.
  • 67. Acha B, Serrano C, Acha JI, et al. CAD tool for burn diagnosis. Inf Process Med Imaging. 2003;18:294–305.
  • 68. Estahbanati HK, Bouduhi N. Role of artificial neural networks in prediction of survival of burn patients-a new approach. Burns. 2002;28:579–586.
  • 69. Hsu JH, Tseng CS. Application of orthogonal neural network to craniomaxillary reconstruction. J Med Eng Technol. 2000;24:262–266.
  • 70. Frye KE, Izenberg SD, Williams MD, et al. Simulated biologic intelligence used to predict length of stay and survival of burns. J Burn Care Rehabil. 1996;17(6 Pt 1):540–546.
  • 71. Whiting PF, Rutjes AW, Westwood ME, et al.; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–536.
  • 72. Jarvis T, Thornburg D, Rebecca AM, et al. Artificial intelligence in plastic surgery: current applications, future directions, and ethical implications. Plast Reconstr Surg Glob Open. 2020;8:e3200.
  • 73. Cabitza F, Locoro A, Banfi G. Machine learning in orthopedics: a literature review. Front Bioeng Biotechnol. 2018;6:75.
  • 74. Krittanawong C, Virk HUH, Bangalore S, et al. Machine learning prediction in cardiovascular diseases: a meta-analysis. Sci Rep. 2020;10:16057.
  • 75. Choy G, Khalilzadeh O, Michalski M, et al. Current applications and future impact of machine learning in radiology. Radiology. 2018;288:318–328.
  • 76. Thatcher JE, Squiers JJ, Kanick SC, et al. Imaging techniques for clinical burn assessment with a focus on multispectral imaging. Adv Wound Care (New Rochelle). 2016;5:360–378.
  • 77. Tierney W, Shah J, Clancy K, et al. Predictive value of the ACS NSQIP calculator for head and neck reconstruction free tissue transfer. Laryngoscope. 2020;130:679–684.
  • 78. O’Neill AC, Murphy AM, Sebastiampillai S, et al. Predicting complications in immediate microvascular breast reconstruction: validity of the breast reconstruction assessment (BRA) surgical risk calculator. J Plast Reconstr Aesthet Surg. 2019;72:1285–1291.
