Author manuscript; available in PMC: 2021 Feb 1.
Published in final edited form as: Surgery. 2019 Nov 18;167(2):321–327. doi: 10.1016/j.surg.2019.10.008

A machine learning approach to predict surgical learning curves

Yuanyuan Gao 1, Uwe Kruger 1,2, Xavier Intes 1,2, Steven Schwaitzberg 3,4,5, Suvranu De 1,2
PMCID: PMC6980926  NIHMSID: NIHMS1542368  PMID: 31753325

Abstract

Background:

Contemporary surgical training programs rely on the repetition of selected surgical motor tasks. Such methodology is inherently open-ended, with no control over the time taken to attain a set level of proficiency, given trainees’ intrinsic differences in initial skill levels and learning abilities. Hence, an efficient training program should aim to tailor the surgical training protocol to each trainee. In this regard, a predictive model that uses information from the initial learning stage to predict learning curve characteristics should facilitate the whole surgical training process.

Methods:

This paper analyzes learning curve data to train a multivariate supervised machine learning model. One factor is extracted to define the trainees’ learning ability. An unsupervised machine learning model is also utilized for trainee classification. Once established, the model can robustly predict the learning curve characteristics from the first few trials.

Results:

We show that the information present in the first ten trials of surgical tasks can be utilized to predict the number of trials required to achieve proficiency (R² = 0.72) and the final performance level (R² = 0.89). Furthermore, only a single factor, the learning index, is required to describe the learning process and to classify learners with unique learning characteristics.

Conclusions:

Using machine learning models, we show, for the first time, that the first few trials contain sufficient information to predict learning curve characteristics and that a single factor can capture the complex learning behavior. Using such models holds the potential for personalization of training regimens, leading to greater efficiency and lower costs.

Keywords: Learning curve, FLS tasks, machine learning

Introduction

Bimanual motor skill learning is an important aspect of surgical training. Surgeons learn technical skills through repeated practice. However, most residency programs provide the opportunity to practice skills without ensuring that a certain level of proficiency has been reached. Technical surgical skills have traditionally been assessed using in-training evaluation reports (ITER), procedure-based assessments (PBA), or surgical log books. All of these approaches are based on the traditional apprenticeship model, with the faculty responsible for assessing technical proficiency by direct observation. However, problems with such approaches, including leniency/severity errors, central tendency errors, and “halo effects,” are well known1,2. Moreover, trainees have inherent differences in initial skill levels and learning rates, and surgical procedures vary, which limits the effectiveness of the time-limited training approach.

Technical skill testing for certification and competency-based curricula are becoming increasingly popular. Demonstrating proficiency in basic laparoscopic and endoscopic skills is now a prerequisite for certification in general surgery3. Starting in 2018, the Fundamentals of Laparoscopic Surgery (FLS) program is also required for board certification in obstetric and gynecological surgery (www.flsprogram.org). Recognizing the inherent problems with the approach of repeated practice, there is significant interest in proficiency-based training4–7. In this approach, repetition continues until a certain level of proficiency is achieved. However, the procedure is cumbersome and time-consuming, as the number of repetitions is not known in advance. To develop structured training programs that account for individual variability in skills and learning abilities, a more personalized method is needed that can predict individual learning curves for any surgical procedure based on initial performance. This requires a deeper understanding of learning curves for surgical skills.

Different techniques to analyze and model the learning curves of surgical procedures have been presented in the literature, and a review paper8 has summarized these approaches. First, in a substantial number of studies, a simple graph or table displaying the outcome of the surgery against the number of operations was presented to show the learning curve, without any statistical analysis8. At a higher level of analysis, basic statistical tools such as the t-test, analysis of variance, or chi-squared test were applied to two or three groups of data split by the number of practice trials8. However, the splitting points in these studies were arbitrarily selected, and the underlying curves in each group were not analyzed8.

Besides statistical analysis of learning curves, analytical modeling methods have also been presented in the literature. A commonly used approach is to fit a curve to the learning curve data using least squares regression, with or without adjustment for confounding factors such as age and sex8. Both linear and exponential curves have been used to fit learning curves, without much justification for the choice of these curves8.

The cumulative summation (CUSUM) technique was initially suggested for monitoring surgical performance9 but has recently been applied to analyze the learning curves of surgical skills10–15. CUSUM has a simple formulation: positive or negative increments are added to a cumulative score according to the failure or success of each successive trial16. A CUSUM graph is intuitive in that a declining trend indicates successive successes and an increasing trend indicates successive failures16. The two boundary limits h1 and h0 in CUSUM graphs indicate whether or not the observed failure rate differs significantly from the desired acceptable failure rate16. The number of trials to achieve proficiency is derived by counting the number of attempts before crossing the boundary limits17. The design of CUSUM schemes varies across studies with their specific aims. In some studies18–20, CUSUM is calculated as the cumulative difference between observed and expected outcome values, such as operation time or blood loss, instead of binary success/failure results. A change-point in these CUSUM graphs was determined to derive the number of trials to achieve proficiency18,20 or the learning phases19. Some studies set the downward part of CUSUM graphs to zero to monitor only failures21.

A shared disadvantage of all these existing approaches is that they require information about training that has already taken place. While academically of interest, their utility in designing individualized training programs is limited. To the best of our knowledge, no approaches in the literature focus on predicting learning curve features, including the final performance level, after a certain number of trials. Hence, our goal in this study is to overcome these limitations by testing two hypotheses. First, we hypothesize that the performance of a trainee during the first few trials of a bimanual motor task contains sufficient information to predict the number of trials required to achieve proficiency and the final performance level. Second, we hypothesize that it is possible to define a single factor that describes how these parameters, including the initial skill level of the trainee, are related to each other. To this end, we have performed a meta-analysis of bimanual skill acquisition in the pattern cutting task, which is part of the FLS program13,15. Learning data from the physical FLS trainer box as well as from a virtual basic laparoscopic skill trainer (VBLaST) replicating the FLS tasks14,22–25 have been utilized.

Materials and methods

Data sources

In this study, we performed a retrospective analysis of learning curve data from three IRB-approved studies13,15,26. The studies were associated with the FLS pattern cutting task, which involves cutting a piece of gauze along a marked circle13,15, and the FLS intracorporeal suturing task26. The three studies were selected because of the similarity of their experimental setups and procedures. Novice medical students with no prior surgical experience were recruited13,15,26. The training was carried out over a three-week period using the FLS trainer box or the VBLaST replicating these tasks. The FLS program has been shown to be a reliable quantifier of surgical skill level27,28, and the metrics for these tasks are established in the surgical literature29 and used in board certification in general surgery. The performance score of each trial was calculated from task performance time and performance error using the accredited FLS scoring methodology, with consent under a nondisclosure agreement from the FLS Committee. The metrics for the VBLaST are derived from the FLS metrics and discussed elsewhere23. In both cases, the metrics are aggregated into a final score, either the FLS score or the VBLaST score; these two scores are the quantities used to build the surgical learning curves. We excluded as outliers those curves that did not exhibit a clear initial learning stage or a learning plateau. Fifteen learning curves were then selected for this study (Table 1).

Table 1.

The source of the data13,15,26

Study                  Platform   Motor task       No. of learning curves
Nemani et al. 201713   FLS*       Pattern cutting  4
Nemani et al. 201713   VBLaST*    Pattern cutting  6
Linsk et al. 201715    VBLaST*    Pattern cutting  2
Fu et al. 201926       FLS*       Suturing         3
*

FLS is a physical training box and VBLaST is a virtual reality version of it.

Variables

This study involved three variables: the number of trials required to achieve proficiency, the initial performance level, and the final performance level. We defined the number of trials required to achieve proficiency based on the Technical Skill Proficiency-Based Training Curriculum4 for the FLS program. For the pattern cutting task, proficiency is achieved when the task is performed within 98 seconds on two consecutive repetitions; for intracorporeal suturing, trainees achieve proficiency when they perform 10 additional trials after two consecutive trials within 112 seconds with allowable errors. A similar definition was employed for the VBLaST data. We defined the initial performance level as the average score of the first three trials. To define the final performance level, multiple two-tailed t-tests were performed between trial intervals. As Table 2 shows, performance over the fifth set of 10 trials is not significantly different from the fourth set, whereas all other successive pairs of intervals differ significantly. This indicates that from the fortieth trial onward, the performance scores do not change significantly. We therefore defined the final performance level as the average score after the fortieth trial.

Table 2.

Significance test results between trial intervals

Trial intervals               Significance test
First vs. second 10 trials    p = 0.000*
Second vs. third 10 trials    p = 0.004*
Third vs. fourth 10 trials    p = 0.023*
Fourth vs. fifth 10 trials    p = 0.205
*

* Significant at α = 0.05
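The interval-wise comparison summarized in Table 2 can be scripted in a few lines. The sketch below uses synthetic plateauing scores and assumes a paired two-tailed t-test across subjects; the data, noise level, and curve shape are illustrative, not the study’s data.

```python
import numpy as np
from scipy import stats

# Synthetic plateauing learning curves: 15 subjects x 50 trials
# (illustrative stand-in; not the study's data).
rng = np.random.default_rng(1)
trials = np.arange(1, 51)
scores = 70 - 50 * trials ** -0.4 + rng.normal(scale=2.0, size=(15, 50))

# Compare mean scores of successive 10-trial blocks across subjects
# with a paired two-tailed t-test (pairing by subject is an assumption).
for k in range(4):
    a = scores[:, 10 * k: 10 * (k + 1)].mean(axis=1)
    b = scores[:, 10 * (k + 1): 10 * (k + 2)].mean(axis=1)
    t, p = stats.ttest_rel(a, b)
    print(f"block {k + 1} vs block {k + 2}: p = {p:.3g}")
```

Averaging each 10-trial block per subject before testing keeps the comparison paired, one value per subject per interval.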

Multivariate supervised learning model

To test our first hypothesis, we used a multivariate supervised machine learning approach known as kernel partial least squares (KPLS)30,31 to predict the learning curve features, namely the number of trials to achieve proficiency and the final performance level, from the initial learning performance. KPLS first computes a non-linear transformation of the ten initial trial performance scores (denoted X) of the learning curve entries into a high-dimensional feature space. The projected observations are then linearly regressed to maximize their covariance with the final performance score or the number of trials to achieve proficiency (denoted y). In this way, the non-linear relationship between the input X and the output y can be modeled.

We used the coefficient of determination (R²), defined below, to quantify the accuracy of the proposed model:

R² = 1 − SSE / SST    (2)

SSE = Σ_{i=1}^{n} (y_i − ŷ_i)² / (n − 1)    (3)

SST = Σ_{i=1}^{n} (y_i − ȳ)² / (n − 1)    (4)

where y_i is the true outcome value, ŷ_i is the predicted outcome value, ȳ is the mean of the true outcome values, and n is the total number of samples. R² = 1 indicates that all of the variance in the data is explained by the model30; conversely, small or negative values indicate poor predictability.
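Since the (n − 1) factors in Eqs. (3) and (4) cancel in the ratio, Eq. (2) reduces to the standard coefficient of determination, which can be computed directly:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination per Eqs. (2)-(4); the (n - 1)
    normalizations of SSE and SST cancel in the ratio."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - sse / sst

y = [1.0, 2.0, 3.0, 4.0]
print(r_squared(y, y))          # → 1.0 (perfect prediction)
print(r_squared(y, [2.5] * 4))  # → 0.0 (constant mean predictor)
```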

Considering the small sample size of the dataset, we validated our modeling results using a leave-one-out cross-validation (LOOCV) scheme31. In LOOCV, we excluded one learning curve from the training dataset and tested the model on the curve left out, repeating the analysis for all learning curves to ensure robustness of the model and to avoid overfitting.
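A minimal LOOCV sketch, using synthetic stand-in data and ridge regression as a stand-in for the KPLS model (both assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import Ridge  # stand-in for the KPLS model

# Hypothetical data: one row per learning curve, ten first-trial scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(15, 10))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=15)

preds = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    preds[test_idx] = model.predict(X[test_idx])

# R^2 over the held-out predictions pooled across all folds
sse = np.sum((y - preds) ** 2)
sst = np.sum((y - y.mean()) ** 2)
print(f"LOOCV R^2 = {1 - sse / sst:.2f}")
```

Pooling the held-out predictions before computing R² gives a single cross-validated accuracy figure for the whole dataset.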

Since the dataset comes from subjects practicing on two types of training platform, the physical FLS trainer box and the VBLaST, it is necessary to validate whether the model works across platforms. To this end, a platform-wise cross-validation scheme was designed in which we trained the model on one platform and tested it on the other, to demonstrate the effectiveness of the model across the two platforms.
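The platform-wise scheme amounts to a two-fold, grouped train/test split. A sketch with synthetic data and a hypothetical platform assignment (ridge regression again standing in for KPLS):

```python
import numpy as np
from sklearn.linear_model import Ridge  # stand-in for the KPLS model

rng = np.random.default_rng(2)
X = rng.normal(size=(15, 3))                       # stand-in trial features
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=15)
platform = np.array(["FLS"] * 7 + ["VBLaST"] * 8)  # hypothetical assignment

r2s = {}
for train_p, test_p in [("FLS", "VBLaST"), ("VBLaST", "FLS")]:
    tr, te = platform == train_p, platform == test_p
    model = Ridge(alpha=0.1).fit(X[tr], y[tr])
    r2s[(train_p, test_p)] = model.score(X[te], y[te])
    print(f"trained on {train_p}, tested on {test_p}: R^2 = {r2s[(train_p, test_p)]:.2f}")
```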

Wright32 was the first to discuss learning curve modeling, and his log-linear model has been widely used33. Other log-linear-based models have been developed to account for different factors affecting learning curves, such as the S-curve model, which accounts for a gradual start-up33. In our experimental setup, the subjects were novice medical students with no prior surgical experience, and the training paradigm was consistent across trials. We therefore adopted the conventional log-linear model for comparison, given by the following equation:

Y = Y_0 · N^θ    (5)

where N is the number of trials, Y is the performance, θ is the learning rate, and Y_0 is the initial performance level. The model parameters are estimated independently for each subject.
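Because Eq. (5) is linear in log-log coordinates, its parameters can be fit per subject by ordinary least squares on (log N, log Y). The sketch below assumes strictly positive scores; subjects with a zero initial score (e.g., subjects 14 and 15 in Table 3) would need special handling.

```python
import numpy as np

def fit_log_linear(scores):
    """Fit Y = Y0 * N**theta by least squares on
    log Y = log Y0 + theta * log N (scores must be > 0)."""
    scores = np.asarray(scores, dtype=float)
    n = np.arange(1, len(scores) + 1)
    theta, log_y0 = np.polyfit(np.log(n), np.log(scores), 1)
    return np.exp(log_y0), theta

# Noiseless synthetic curve with Y0 = 20 and theta = 0.3
y = 20.0 * np.arange(1, 51) ** 0.3
y0_hat, theta_hat = fit_log_linear(y)
print(round(y0_hat, 2), round(theta_hat, 2))  # → 20.0 0.3
```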

Factor analysis model

Factor analysis is a statistical method for exploring latent variables, or factors, underlying observed variables. In our case, there are three observed variables: the initial performance level, the number of trials to achieve proficiency, and the final performance level. The question we asked is whether a single variable can represent all three. Here, we used a factor analysis model known as kernel principal component analysis (KPCA)31 to extract a representative factor, the learning index (LI), from the three learning curve features. We first mapped the three features into a high-dimensional non-linear space, and then extracted one principal component as the LI from these high-dimensional data by principal component analysis (PCA). To test whether the LI represents the three learning curve features, we used the LI to predict the three features by KPLS. If the LI can predict all three features accurately, then the information compressed into the LI is sufficient to represent them.
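A sketch of the LI extraction using scikit-learn’s KernelPCA; the standardization step, kernel, and gamma are assumptions, and the feature rows are a few illustrative entries from Table 3:

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.preprocessing import StandardScaler

# A few illustrative rows from Table 3:
# [initial performance, trials to proficiency, final performance]
features = np.array([
    [13.10,  57, 65.46],
    [ 4.88,  15, 83.36],
    [42.98,  10, 74.93],
    [21.43,  20, 80.32],
    [34.88, 114, 66.84],
])

# Standardize, then keep one nonlinear principal component as the LI.
Z = StandardScaler().fit_transform(features)
li = KernelPCA(n_components=1, kernel="rbf", gamma=0.5).fit_transform(Z).ravel()
print(li)  # one learning-index value per trainee
```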

Unsupervised classification of skill level

An unsupervised learning approach known as k-means clustering34,35 was adopted to separate the trainees according to their learning curve characteristics. By analyzing the learning curve features of the resulting trainee groups, the learning characteristics of each group were summarized. Furthermore, the grouping derived from the LI was compared to the grouping derived from all three features, to see whether the LI alone could identify distinct learning characteristics.
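A sketch of the clustering step on a subset of the Table 3 features (the standardization step and random seed are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Subset of Table 3 rows:
# [initial performance, trials to proficiency, final performance]
features = np.array([
    [ 4.88,  15, 83.36],  # subject 2
    [42.98,  10, 74.93],  # subject 10
    [21.43,  20, 80.32],  # subject 11
    [34.88, 114, 66.84],  # subject 6
    [16.90,  50, 62.28],  # subject 7
    [25.36,  67, 59.64],  # subject 8
])

Z = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
print(labels)  # the three fast learners should share one label
```

Standardizing first keeps the trials-to-proficiency column (which spans 10–114) from dominating the Euclidean distances.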

Results

Learning curve data and features

The learning curve data for all subjects (4 trained on the FLS physical training box and 9 on the VBLaST virtual reality trainer) are presented in Figure 1, and the three features calculated from the learning curves are summarized in Table 3.

Figure 1.


The learning curves are from three studies: (a) the FLS pattern cutting study in Nemani et al. 201713; (b) the VBLaST pattern cutting study in Nemani et al. 201713; (c) the VBLaST pattern cutting study in Linsk et al. 201715; and (d) the FLS intracorporeal suturing study in Fu et al. 201926. (Permission to reuse the data has been obtained from the journals.)

Table 3.

The learning curve feature values for all subjects (performance levels refer to the FLS/VBLaST scores)

Subject No.   Initial performance level        Number of trials required   Final performance level
              (average of trials #1 to #3)     to achieve proficiency      (average of trials #40 to #50)
1 13.10 57 65.46
2 4.88 15 83.36
3 2.38 33 72.86
4 18.57 22 69.32
5 26.07 45 70.50
6 34.88 114 66.84
7 16.90 50 62.28
8 25.36 67 59.64
9 26.31 52 68.21
10 42.98 10 74.93
11 21.43 20 80.32
12 40.83 50 71.14
13 16.13 56 83.50
14 0 114 66.77
15 0 97 76.95

Prediction performance

Table 4 reports the R² values for the KPLS and log-linear models when used to predict the number of trials to achieve proficiency and the final performance level from performance in the initial trials. The performance of the KPLS model indicates that the initial performance pattern can be used to predict the two learning curve features with a high degree of accuracy. Conversely, with the log-linear model, the R² values for predicting the number of trials to achieve proficiency are negative, and those for predicting the final performance level improve only as a considerable number of initial trials is included.

Table 4.

Accuracy of KPLS and log-linear model

Learning curve variable                    KPLS                          Log-linear model
Number of trials to achieve proficiency    R² = 0.72 (first 10 trials)   R² = −4.21 (first 50 trials)*
                                                                         R² = −9.27 (first 40 trials)*
                                                                         R² = −15.17 (first 30 trials)*
                                                                         R² = −49.53 (first 20 trials)*
                                                                         R² = −109.55 (first 10 trials)*
Final performance level                    R² = 0.89 (first 10 trials)   R² = 0.76 (first 50 trials)
                                                                         R² = 0.58 (first 40 trials)
                                                                         R² = 0.37 (first 30 trials)
                                                                         R² = −0.27 (first 20 trials)
                                                                         R² = −3.36 (first 10 trials)
*

For the first 10 trials, the predicted learning curves of subjects 4, 6, 12, and 13 do not achieve proficiency within 1000 trials and are excluded from the R² calculation; for the first 20 trials, the predicted curve of subject 6 is likewise excluded; for the first 30, 40, and 50 trials, the predicted curve of subject 14 is excluded.

The log-linear model performance was further explored by varying the number of initial trials whose data were used for model development. An example learning curve, from subject 1, is shown in Figure 2. The figure shows that the log-linear model works well when the FLS scores for the first 50 trials are known. However, when fewer trials are used as input, the log-linear curve becomes clearly less accurate. This is particularly evident when only the first 10 trials are used, where the log-linear curve considerably underestimates the learning effect. This underestimation also explains why the R² values for predicting the number of trials to achieve proficiency are all negative with this approach (Table 4): the predicted curves imply that many more trials are required to achieve proficiency than are actually needed. Some predicted curves do not achieve proficiency even after 1000 trials and are excluded from the analysis (Table 4). Thus, although the learning curve variables considered in this work are simple, they are difficult to predict using existing models such as the log-linear model.

Figure 2.

Figure 2

The performance of the log-linear model developed using the scores for the first 50, 40, 30, 20, and 10 trials for subject 1. ‘o’ marks represent the trials for which the scores are assumed to be known; ‘x’ marks represent the remaining trials, whose data are not used in developing the model. The solid line represents the log-linear model.

Considering the two platforms the subjects practiced on (physical and virtual), platform-wise cross-validation tests were performed on the KPLS model; the results are listed in Table 5. When predicting the number of trials to achieve proficiency and the final performance level, the R² values of the KPLS model are all above 0.70 in platform-wise cross-validation. This indicates that the learning curve data patterns are consistent across the physical FLS training box and the VBLaST, and that the datasets can be meta-analyzed.

Table 5.

Cross-validation results of KPLS

Learning curve variable                    Trained on FLS,       Trained on VBLaST,
                                           tested on VBLaST      tested on FLS
Number of trials to achieve proficiency    R² = 0.72             R² = 0.78
Final performance level                    R² = 0.73             R² = 0.78

Factor analysis

We derived a single factor, which we refer to as the LI, from the three learning curve features (initial performance level, number of trials to achieve proficiency, and final performance level) using the KPCA method. The LI is a latent variable that depicts the learning characteristics of the learners. The R² values obtained when using the LI to predict the three learning curve features with a KPLS model are listed in Table 6. Since all the R² values are above 0.8, the extracted factor can be regarded as representative of the three features.

Table 6.

The R² values when using the LI to predict the three learning curve features

Case                                        R²
Predicting all three features               0.92
 Initial performance level                  0.87
 Number of trials to achieve proficiency    0.93
 Final performance level                    0.94

K-means clustering

Next, we grouped the subjects by their learning curve features (initial performance level, number of trials to achieve proficiency, and final performance level) using the k-means clustering algorithm. Two groups emerged naturally from this analysis: group 1 comprising subjects 2, 10, 11, and 13, and group 2 consisting of the remaining subjects. To understand the implication of this grouping, we plotted the number of trials to achieve proficiency and the final performance level against the initial performance level in Figure 3(a), where crosses represent subjects in group 1 and circles represent subjects in group 2. From these plots it is clear that trainees in group 1 have higher initial performance levels, require fewer trials to achieve proficiency, and reach higher final performance levels, whereas trainees in group 2 have lower initial performance levels, need more trials to achieve proficiency, and reach lower final performance levels. When the same k-means algorithm is used to group the subjects based on the LI alone, the same grouping emerges, as shown in Figure 3(b). The two groups are clearly separable, which supports the claim that the LI is sufficient to classify learners with distinct learning characteristics.

Figure 3.


The trainees were clustered into two groups by the k-means clustering algorithm. ‘x’ marks represent trainees clustered into Group 1 and ‘o’ marks represent trainees clustered into Group 2. The grouping results are from (a) the three learning curve features and (b) the extracted factor LI (“learning ability”).

Discussion

Our findings highlight that machine learning enables assessment of a trainee from his or her performance during the first ten trials alone. Based on only ten trials, we can predict, with a high degree of accuracy, (i) the number of trials required to achieve proficiency and (ii) the final performance level, defined as the average FLS score after the fortieth trial. Furthermore, a single factor, the LI, which we also refer to as the learning ability, can be derived from a non-linear factor analysis model. This single factor describes the common variation within these two parameters and the initial performance level, implying that the number of trials required to achieve proficiency, the final performance level, and the initial performance level are not independent of each other.

These findings relate to earlier work in this area showing that the initial learning stages are related to the later stages36,37. Moreover, Jirapinyo et al.38 showed that a log-linear regression model can describe surgical training learning curves reasonably well when all learning data are used; however, other studies contradict this19,39. In our study, even though we extracted very simple learning curve characteristics, we still found limitations in the use of log-linear models. One potential reason is that the log-linear regression model is derived from group data, whereas individual learning progress may differ from the group trend40. Another, more intuitive reason is that the log-linear model assumes a predefined form for the learning curves prior to fitting; in sharp contrast, the KPLS model makes no such assumption. In addition, it is well known that a large set of highly correlated or collinear variables can make regression models difficult to identify. KPLS is a nonlinear regression tool developed in the field of chemometrics precisely to handle such situations30. The KPLS regression model is therefore able to capture the complex process of surgical skill learning in a data-driven fashion that is missed by simplistic analytical models such as the log-linear model. Once a KPLS regression model has been developed for a surgical task, the trained model can be employed to predict the number of trials needed by any new learner to achieve proficiency, based on the scores of the initial few trials.

Moreover, the use of machine learning for learning curve prediction has the potential to redesign surgical training programs, with implications for skill decay and retraining throughout the entire professional life of a surgeon. Predicting the learning curve variables early in the training process would help provide more focused feedback and implement adaptive learning strategies. The idea of adaptive training is not new and has been suggested in the surgical literature. For example, Stefanidis et al.41 pointed out that, by establishing skill learning curves, training curricula could be tailored to provide additional training to those who need it. Another study42 compared adaptive and volume-based curricula in surgical training and demonstrated that the group trained with the adaptive curriculum achieved the same level of performance but required fewer training hours.

Another important finding of our study is that the single factor allows clustering the trainees into two groups with distinct learning curve characteristics. Participants in Group 1 had a higher initial ability to carry out the FLS task, showed a higher final performance level, and did not require many trials to converge from their initial to their final performance level. The subjects in Group 2, on the other hand, had a lower initial ability, showed a lower final performance level, and required more trials to reach it. Differences in surgical motor learning between individuals have also been demonstrated in other studies. For example, Louridas et al.43 showed that, under the same training curriculum, participants demonstrated different learning results and could be divided into three groups with top, moderate, and low performance; similar groupings of surgical residents have been reported44. Our study provides a quantitative understanding that individuals with different initial skill levels require different amounts of practice to reach a final performance level. Differences in learning characteristics may be due to innate factors including handedness, gender, visual-spatial ability, and confidence level45–51, or to extrinsic factors including research experience, selection of specialty, and grouping49,52,53.

Our study has several limitations that suggest further research in this area. First, the learning curves were quantified by task performance scores calculated from performance time and performance error. While these are the metrics used in the FLS, other kinematic (e.g., hand trajectory) or physiological (e.g., eye motion or skin conductance) metrics could provide further information about performance. In separate work, we have shown that functional brain imaging, which relies on neuro-vascular coupling, provides a more accurate quantification of bimanual motor skill learning than the traditional FLS metrics54,55. Second, the bimanual motor tasks in this study are limited to pattern cutting and intracorporeal suturing; extending the analysis to other FLS tasks and actual surgical procedures is left for future work. It is important to note that the results reported here are intended to assist in planning individualized training regimens; the predicted number of trials should not be shared with the learners, as this may affect their performance negatively. In addition, physiological measurements such as functional brain imaging could play an important role in monitoring workload and attention level, to determine whether trainees are trying their best to learn. Finally, data from only a few learning curves have been used in this study. Learning curve studies are inherently difficult to conduct because of the extended time commitment required of medical students and the limited number of willing participants. As summarized in a comprehensive review56, sample sizes of 8–23 are reported in previous simulation-based surgical training studies, indicating the difficulty of recruiting participants into multi-day training protocols. We combined three of our previous studies collecting learning curve information to obtain a sample size of fifteen.
To build machine learning models from such a small sample, we selected KPLS, a kernel-based projection method. Such methods have an extensive record of applications in chemometrics57–59, where small sample sizes are the norm rather than the exception. Moreover, we report results based on an independent assessment of model performance to ensure that the established regression models are not compromised by overfitting. Future multi-center studies may mitigate some of these issues.

In conclusion, we propose the use of sophisticated machine learning models for predicting the learning curve features from the initial few trials of bimanual surgical motor tasks. A single factor, LI, which we define as the learning ability, is able to capture complexities of learning behavior. Use of such models holds the potential for personalization of training regimens, leading to greater efficiency and lower costs.

Acknowledgments

Funding/Support:

We gratefully acknowledge funding from the NIH/National Institute of Biomedical Imaging and Bioengineering (grants 2R01EB005807, 5R01EB010037, 1R01EB009362, 1R01EB014305, and R01EB019443).

Footnotes

COI/Disclosure:

Yuanyuan Gao, Uwe Kruger, Xavier Intes, Steve Schwaitzberg and Suvranu De have no conflicts of interest or financial ties to disclose.

References:

  • 1.Moorthy K, Munz Y. Objective assessment of technical skills in surgery. Br Med J. 2003;327(7422): 1032–1037. doi: 10.1136/bmj.327.7422.1032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Shah J, Darzi A. Surgical skills assessment: An ongoing debate. BJU Int. 2001;88(7):655–660. doi: 10.1046/j.1464-4096.2001.02424.x. [DOI] [PubMed] [Google Scholar]
  • 3.Soper N, Fried GM. The fundamentals of laparoscopic surgery: its time has come. Bull Am Coll Surg. 2008;93(9):30–32. [PubMed] [Google Scholar]
  • 4.Scott DJ, Ritter EM, Tesfay ST, Pimentel EA, Nagji A, Fried GM. Certification pass rate of 100% for fundamentals of laparoscopic surgery skills after proficiency-based training. Surg Endosc Other Interv Tech. 2008;22(8):1887–1893. doi: 10.1007/s00464-008-9745-y. [DOI] [PubMed] [Google Scholar]
  • 5.Stefanidis D, Korndorffer JR, Sierra R, Touchard C, Dunne JB, Scott DJ. Skill retention following proficiency-based laparoscopic simulator training. Surgery. 2005;138(2):165–170. doi: 10.1016/j.surg.2005.06.002. [DOI] [PubMed] [Google Scholar]
  • 6.Ahlberg G, Enochsson L, Gallagher AG, et al. Proficiency-based virtual reality training significantly reduces the error rate for residents during their first 10 laparoscopic cholecystectomies. Am J Surg. 2007;193(6):797–804. doi: 10.1016/j.amjsurg.2006.06.050. [DOI] [PubMed] [Google Scholar]
  • 7.Sroka G, Feldman LS, Vassiliou MC, Kaneva PA, Fayez R, Fried GM. Fundamentals of Laparoscopic Surgery simulator training to proficiency improves laparoscopic performance in the operating room-a randomized controlled trial. Am J Surg. 2010;199(1):115–120. doi: 10.1016/j.amjsurg.2009.07.035. [DOI] [PubMed] [Google Scholar]
  • 8.Ramsay CR, Grant AM, Wallace SA, Garthwaite PH, Monk AF, Russell IT. Assessment of the learning curve in health technologies - A systematic review. Int J Technol Assess Health Care. 2000;16(4):1095–1108. [DOI] [PubMed] [Google Scholar]
  • 9.Steiner SH, Cook RJ, Farewell VT. Monitoring paired binary surgical outcomes using cumulative sum charts. Stat Med. 1999;18(1):69–86. [DOI] [PubMed] [Google Scholar]
  • 10.Biau DJ, Resche-Rigon M, Godiris-Petit G, Nizard RS, Porcher R. Quality control of surgical and interventional procedures: a review of the CUSUM. Qual Saf Heal Care. 2007;16(3):203–207. doi: 10.1136/qshc.2006.020776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Young A, Miller JP, Azarow K. Establishing learning curves for surgical residents using Cumulative Summation (CUSUM) Analysis. Curr Surg. 2005;62(3):330–334. doi: 10.1016/j.cursur.2004.09.016. [DOI] [PubMed] [Google Scholar]
  • 12.Bolsin S, Colson M. The use of the cusum technique in the assessment of trainee competence in new procedures. Int J Qual Heal Care. 2000;12(5):433–438. doi: 10.1093/intqhc/12.5.433. [DOI] [PubMed] [Google Scholar]
  • 13.Nemani A, Ahn W, Cooper C, Schwaitzberg S, De S. Convergent validation and transfer of learning studies of a virtual reality-based pattern cutting simulator. Surg Endosc Other Interv Tech. 2017:1–8. doi: 10.1007/s00464-017-5802-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Zhang L, Sankaranarayanan G, Arikatla VS, et al. Characterizing the learning curve of the VBLaST-PT© (Virtual Basic Laparoscopic Skill Trainer). Surg Endosc Other Interv Tech. 2013;27(10):3603–3615. doi: 10.1007/s00464-013-2932-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Linsk AM, Monden KR, Sankaranarayanan G, et al. Validation of the VBLaST pattern cutting task: a learning curve study. Surg Endosc Other Interv Tech. 2018;32(4):1990–2002. doi: 10.1007/s00464-017-5895-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Kestin IG. A statistical approach to measuring the competence of anaesthetic trainees at practical procedures. Br J Anaesth. 1995;75(6):805–809. doi: 10.1093/bja/75.6.805. [DOI] [PubMed] [Google Scholar]
  • 17.Biau DJ, Williams SM, Schlup MM, Nizard RS, Porcher R. Quantitative and individualized assessment of the learning curve using LC-CUSUM. Br J Surg. 2008;95(7):925–929. doi: 10.1002/bjs.6056. [DOI] [PubMed] [Google Scholar]
  • 18.Miskovic D, Ni M, Wyles SM, Tekkis P, Hanna GB. Learning curve and case selection in laparoscopic colorectal surgery: systematic review and international multicenter analysis of 4852 cases. Dis Colon Rectum. 2012;55(12):1300–1310. doi: 10.1097/DCR.0b013e31826ab4dd. [DOI] [PubMed] [Google Scholar]
  • 19.Yoshida M, Kakushima N, Mori K, et al. Learning curve and clinical outcome of gastric endoscopic submucosal dissection performed by trainee operators. Surg Endosc Other Interv Tech. 2017;31(9):3614–3622. doi: 10.1007/s00464-016-5393-9. [DOI] [PubMed] [Google Scholar]
  • 20.Wen Z, Liang H, Liang J, Liang Q, Xia H. Evaluation of the learning curve of laparoscopic choledochal cyst excision and Roux-en-Y hepaticojejunostomy in children: CUSUM analysis of a single surgeon’s experience. Surg Endosc Other Interv Tech. 2017;31(2):778–787. doi: 10.1007/s00464-016-5032-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Lim TO, Soraya A, Ding LM, Morad Z. Assessing doctor’s competence: Application of CUSUM technique in monitoring doctors’ performance. Int J Qual Heal Care. 2002;14(3):251–258. doi: 10.1093/oxfordjournals.intqhc.a002616. [DOI] [PubMed] [Google Scholar]
  • 22.Moustris GP, Hiridis SC, Deliparaschos KM, Konstantinidis KM. Evolution of autonomous and semi-autonomous robotic surgical systems: a review of the literature. Int J Med Robot. 2011;7(4):375–392. doi: 10.1002/rcs. [DOI] [PubMed] [Google Scholar]
  • 23.Chellali A, Ahn W, Sankaranarayanan G, et al. Preliminary evaluation of the pattern cutting and the ligating loop virtual laparoscopic trainers. Surg Endosc. 2015;29(4):815–821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Sankaranarayanan G, Lin H, Arikatla VS, et al. Preliminary face and construct validation study of a virtual basic laparoscopic skill trainer. J Laparoendosc Adv Surg Tech. 2010;20(2):153–157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Arikatla VS, Sankaranarayanan G, Ahn W, et al. Face and construct validation of a virtual peg transfer simulator. Surg Endosc. 2013;27(5):1721–1729. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Fu Y, Cavuoto L, Qi D, et al. Characterizing the learning curve of a virtual intracorporeal suturing simulator VBLaST-SS©. Surg Endosc. September 2019:1–10. doi: 10.1007/s00464-019-07081-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Zendejas B, Ruparel RK, Cook DA. Validity evidence for the Fundamentals of Laparoscopic Surgery (FLS) program as an assessment tool: a systematic review. Surg Endosc Other Interv Tech. 2016;30(2):512–520. doi: 10.1007/s00464-015-4233-7. [DOI] [PubMed] [Google Scholar]
  • 28.Zendejas B, Jakub JW, Terando AM, et al. Laparoscopic skill assessment of practicing surgeons prior to enrollment in a surgical trial of a new laparoscopic procedure. Surg Endosc Other Interv Tech. 2017;31(8):3313–3319. doi: 10.1007/s00464-016-5364-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Fraser SA, Klassen DR, Feldman LS, Ghitulescu GA, Stanbridge D, Fried GM. Evaluating laparoscopic skills, setting the pass/fail score for the MISTELS system. Surg Endosc Other Interv Tech. 2003;17(6):964–967. doi: 10.1007/s00464-002-8828-4. [DOI] [PubMed] [Google Scholar]
  • 30.Kim K, Lee JM, Lee IB. A novel multivariate regression approach based on kernel partial least squares with orthogonal signal correction. Chemom Intell Lab Syst. 2005;79(1–2):22–30. doi: 10.1016/j.chemolab.2005.03.003. [DOI] [Google Scholar]
  • 31.Fu Y, Kruger U, Li Z, et al. Cross-validatory framework for optimal parameter estimation of KPCA and KPLS models. Chemom Intell Lab Syst. 2017;167:196–207. doi: 10.1016/j.chemolab.2017.06.007. [DOI] [Google Scholar]
  • 32.Wright TP. Factors affecting the cost of airplanes. J Aeronaut Sci. 1936;3(4):122–128. doi: 10.2514/8.155. [DOI] [Google Scholar]
  • 33.Badiru AB. Computational survey of univariate and multivariate learning curve models. IEEE Trans Eng Manag. 1992;39(2):176–188. doi: 10.1109/17.141275. [DOI] [Google Scholar]
  • 34.Hartigan JA, Wong MA. Algorithm AS 136: A K-Means Clustering Algorithm. J R Stat Soc Ser C (Applied Stat. 1979;28(1):100–108. doi: 10.2307/2346830. [DOI] [Google Scholar]
  • 35.Jain AK, Murty MN, Flynn PJ. Data clustering: a review. ACM Comput Surv. 1999;31(3): 264–323. doi: 10.1145/331499.331504. [DOI] [Google Scholar]
  • 36.Vegter RJK, de Groot S, Lamoth CJ, Veeger DH, van der Woude LHV. Initial skill acquisition of handrim wheelchair propulsion: A new perspective. IEEE Trans Neural Syst Rehabil Eng. 2014;22(1):104–113. [DOI] [PubMed] [Google Scholar]
  • 37.Vegter RJK, Lamoth CJ, de Groot S, Veeger DHEJ, van der Woude LHV. Inter-individual differences in the initial 80 minutes of motor learning of handrim wheelchair propulsion. PLoS One. 2014;9(2):e89729. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Jirapinyo P, Abidi WM, Aihara H, et al. Preclinical endoscopic training using a part-task simulator: learning curve assessment and determination of threshold score for advancement to clinical endoscopy. Surg Endosc Other Interv Tech. 2017;31(10):4010–4015. doi: 10.1007/s00464-017-5436-x. [DOI] [PubMed] [Google Scholar]
  • 39.Dias JA, Dall’oglio MF, Colombo JR, Coelho RF, Nahas WC. The influence of previous robotic experience in the initial learning curve of laparoscopic radical prostatectomy. Int Braz J Urol. 2017;43(5):871–879. doi: 10.1590/S1677-5538.IBJU.2016.0526. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Gallistel CR, Fairhurst S, Balsam P. The learning curve: Implications of a quantitative analysis. Proc Natl Acad Sci. 2004;101(36):13124–13131. doi: 10.1073/pnas.0404965101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Stefanidis D, Gardner C, Paige JT, Korndorffer JR, Nepomnayshy D, Chapman D. Multicenter longitudinal assessment of resident technical skills. Am J Surg. 2015;209(1): 120–125. doi: 10.1016/j.amjsurg.2014.09.018. [DOI] [PubMed] [Google Scholar]
  • 42.Hu Y, Brooks KD, Kim H, et al. Adaptive simulation training using cumulative sum: A randomized prospective trial. Am J Surg. 2016;211(2):377–383. doi: 10.1016/j.amjsurg.2015.08.030. [DOI] [PubMed] [Google Scholar]
  • 43.Louridas M, Szasz P, Fecso AB, et al. Practice does not always make perfect: need for selection curricula in modern surgical training. Surg Endosc Other Interv Tech. 2017;31 (9):3718–3727. doi: 10.1007/s00464-017-5572-3. [DOI] [PubMed] [Google Scholar]
  • 44.Grantcharov TP, Funch-Jensen P. Can everyone achieve proficiency with the laparoscopic technique? Learning curve patterns in technical skills acquisition. Am J Surg. 2009; 197(4):447–449. doi: 10.1016/j.amjsurg.2008.01.024. [DOI] [PubMed] [Google Scholar]
  • 45.Elneel FHF, Carter F, Tang B, Cuschieri A. Extent of innate dexterity and ambidexterity across handedness and gender: Implications for training in laparoscopic surgery. Surg Endosc Other Interv Tech. 2008;22(1):31–37. doi: 10.1007/s00464-007-9533-0. [DOI] [PubMed] [Google Scholar]
  • 46.Hughes DT, Forest SJ, Foitl R, Chao E. Influence of medical students’ past experiences and innate dexterity on suturing performance. Am J Surg. 2014;208(2):302–306. doi: 10.1016/j.amjsurg.2013.12.040. [DOI] [PubMed] [Google Scholar]
  • 47.Martin AN, Hu Y, Le IA, et al. Predicting surgical skill acquisition in preclinical medical students. Am J Surg. 2016;212(4):596–601. doi: 10.1016/j.amjsurg.2016.06.024. [DOI] [PubMed] [Google Scholar]
  • 48.Gardner AK, Marks JM, Pauli EM, Majumder A, Dunkin BJ. Changing attitudes and improving skills: demonstrating the value of the SAGES flexible endoscopy course for fellows. Surg Endosc Other Interv Tech. 2017;31(1):147–152. doi: 10.1007/s00464-016-4944-4. [DOI] [PubMed] [Google Scholar]
  • 49.Nomura T, Matsutani T, Hagiwara N, et al. Characteristics predicting laparoscopic skill in medical students: nine years’ experience in a single center. Surg Endosc Other Interv Tech. 2017;32(1):1–9. doi: 10.1007/s00464-017-5643-5. [DOI] [PubMed] [Google Scholar]
  • 50.Ali A, Subhi Y, Ringsted C, Konge L. Gender differences in the acquisition of surgical skills: a systematic review. Surg Endosc Other Interv Tech. 2015;29(11):3065–3073. doi: 10.1007/s00464-015-4092-2. [DOI] [PubMed] [Google Scholar]
  • 51.Mackenzie H, Dixon AR. Proficiency gain curve and predictors of outcome for laparoscopic ventral mesh rectopexy. Surgery. 2014;156(1):158–167. doi: 10.1016/j.surg.2014.03.008. [DOI] [PubMed] [Google Scholar]
  • 52.Berger AP, Giacalone JC, Barlow P, Kapadia MR, Keith JN. Choosing surgery as a career: Early results of a longitudinal study of medical students. Surgery. 2017;161(6):1683–1689. doi: 10.1016/j.surg.2016.12.016. [DOI] [PubMed] [Google Scholar]
  • 53.Roch PJ, Rangnick HM, Brzoska JA, et al. Impact of visual-spatial ability on laparoscopic camera navigation training. Surg Endosc Other Interv Tech. 2017:1–10. doi: 10.1007/s00464-017-5789-1. [DOI] [PubMed] [Google Scholar]
  • 54.Nemani A, Yucel MA, Kruger U, et al. Assessing bimanual motor skills with optical neuroimaging. Sci Adv. 2018;4(10):1–10. doi: 10.1101/204305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Nemani A, Kruger U, Cooper CA, Schwaitzberg SD, Intes X, De S. Objective assessment of surgical skill transfer using non-invasive brain imaging. Surg Endosc. 2018. doi: 10.1007/s00464-018-6535-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Sturm LP, Windsor JA, Cosman PH, Cregan P, Hewett PJ, Maddern GJ. A systematic review of skills transfer after surgical simulation training. Ann Surg. 2008;248(2):166–179. doi: 10.1097/SLA.0b013e318176bf24. [DOI] [PubMed] [Google Scholar]
  • 57.Li Z, Kruger U, Xie L, Almansoori A, Su H. Adaptive KPCA modeling of nonlinear systems. IEEE Trans Signal Process. 2015;63(9):2364–2376. [Google Scholar]
  • 58.Gutierrez PA, Silva M, Serrano JM, Herva C. Combining classification and regression approaches for the quantification of highly overlapping capillary electrophoresis peaks by using evolutionary sigmoidal and product unit neural networks. J Chemom. 2007;21:567–577. doi: 10.1002/cem. [DOI] [Google Scholar]
  • 59.An Y, Sherman W, Dixon SL. Kernel-based partial least squares: Application to fingerprint-based QSAR with model visualization. J Chem Inf Model. 2013;53(9):2312–2321. doi: 10.1021/ci400250c. [DOI] [PubMed] [Google Scholar]