Transplantation. 2017 Apr 25;101(4):e125–e132. doi: 10.1097/TP.0000000000001600

Machine-Learning Algorithms Predict Graft Failure After Liver Transplantation

Lawrence Lau 1, Yamuna Kankanige 2, Benjamin Rubinstein 2, Robert Jones 1, Christopher Christophi 1, Vijayaragavan Muralidharan 1, James Bailey 2
PMCID: PMC7228574  PMID: 27941428

Abstract

Background

The ability to predict graft failure or primary nonfunction at liver transplant decision time assists in the utilization of the scarce resource of donor livers, while ensuring that patients who urgently require a liver transplant are prioritized. An index derived to predict graft failure using donor and recipient factors, based on local data sets, will be more beneficial in the Australian context.

Methods

Liver transplant data from the Austin Hospital, Melbourne, Australia, from 2010 to 2013 were included in the study. The top 15 donor, recipient, and transplant factors influencing the outcome of graft failure within 30 days were selected using a machine-learning methodology. An algorithm predicting the outcome of interest was developed using those factors.

Results

The Donor Risk Index predicted the outcome with an area under the receiver operating characteristic curve (AUC-ROC) of 0.680 (95% confidence interval [CI], 0.669-0.690). The combination of the factors used in the Donor Risk Index with the model for end-stage liver disease score yielded an AUC-ROC of 0.764 (95% CI, 0.756-0.771), whereas the survival outcomes after liver transplantation score obtained an AUC-ROC of 0.638 (95% CI, 0.632-0.645). The top 15 donor and recipient characteristics within random forests resulted in an AUC-ROC of 0.818 (95% CI, 0.812-0.824).

Conclusions

Using donor, transplant, and recipient characteristics known at the decision time of a transplant, donors and recipients can be matched with high accuracy, potentially providing assistance with clinical decision making.


Outcome after liver transplantation depends on a complex interaction between donor, recipient, and process factors. Driven by the disparity between the increasing number of potential transplant recipients and the limited number of suitable organ donors, there is increasing use of organs of marginal quality.1,2 This shift brings into focus the delicate balance in organ allocation between organ utility and the potential to cause harm to the recipient. Added to this are the significant financial costs and regulatory pressures accompanying each transplant; a quantitative tool that can help the transplant surgeon optimize this decision-making process is urgently required.

Surgeon intuition in the evaluation of donor risk is inconsistent and often inaccurate.3 Scoring indices such as the Donor Risk Index (DRI)4 attempt to quantify the quality of the donor liver based on donor characteristics, but they include factors that may not be applicable internationally (eg, ethnicity and regional location of the donor) and omit factors that are known to be strong predictors of outcome but may not be consistently appraised (eg, hepatic steatosis). The DRI has not found wide adoption in routine practice.5

Beyond the assessment of donor organ quality is the concept of donor-recipient matching6 to maximize organ utilization while protecting patients from posttransplant complications. Risk scores that use both donor and recipient characteristics, such as the survival outcomes after liver transplantation (SOFT) score,7 have been proposed for this purpose. Theoretically, the success of a transplant may be altered if a given donor organ were transplanted into different recipients. Unfortunately, aside from blood group matching and recipient urgency, there is currently little to guide this decision, and the ideal donor-recipient matching algorithm6 remains a long-term vision. Attempts to match donors to recipients based on recipient model for end-stage liver disease (MELD) score have had conflicting results.8,9

Machine-learning algorithms can be used to predict the outcome of a new observation, based on a training data set containing previous observations where the outcome is known. They can detect complex nonlinear relationships between numerous variables and are used for predictive applications in a wide range of fields, including agriculture, financial markets, search engines, and match-making.10-13 They are also finding increasing application in medicine.14 A machine-learning algorithm developed from the experience of a particular liver transplant unit may be able to predict the likelihood of transplant success in a unit-specific manner and potentially allow for evolving practice.

The objective of this study is to evaluate the utility of machine-learning algorithms, such as random forests and artificial neural networks, in predicting outcome based on donor and recipient variables that are known before organ allocation. The performance of these algorithms will be compared against current standards of donor and recipient risk assessment, such as the DRI, MELD, and SOFT score, in predicting transplant outcome. This risk quantification tool may potentially assist donor-recipient matching, with improved balancing of the considerable risks associated with liver transplantation.

MATERIALS AND METHODS

Study Cohort

This study included the Liver Transplant Database from Austin Health, Melbourne, Australia, from January 1988 to October 2013. Austin Health is 1 of 5 state-based liver transplant units within Australia and serves the population of the States of Victoria and Tasmania. Whole liver and split liver transplants from brain-dead and cardiac death organ donors were included. Transplants involving paediatric recipients (younger than 18 years) and transplants from living-related donors were excluded from the study. Although transplant records are available from 1988, due to the significant number of values not available in the records before 2010 (particularly among the factors used to calculate DRI), only transplants that occurred after January 1, 2010, were included for analysis. Transplants from November 2013 to May 2015 were used for validating the results. This research was approved by the Austin Health Human Research Ethics Committee (Project Number: LNR/14/Austin/368).

Data Set Collation

The prospectively maintained database contains comprehensive information about each transplant including donor factors, transplant factors, recipient factors as well as recipient outcomes. The database was collated into the working data set, with all fields arranged into categorical, ordinal, or continuous variables.

Model Development

Well-known machine learning techniques, such as random forests,15,16 artificial neural networks, and logistic regression, were used for model development.17 However, logistic regression was not used for models with many factors due to its comparatively poor performance during initial testing.

Training and test data sets were created by bootstrap sampling with replacement. In brief, an equivalent number of cases from the original data set were randomly selected, with duplicates, to create a sample training set. It has been shown in the literature that such a bootstrap sample will contain about 63% of the unique cases from the original data set.18 The remaining transplants, not included in the training set, were allocated as the corresponding test set. This methodology, known as out-of-bag error estimation, ensures that there will be no overlap between the training and test sets18 and is similar to the leave-one-out bootstrap technique for estimating prediction error.19 This process was then repeated 1000 times to yield a set of 1000 training and corresponding testing data sets. The performance of each algorithm was evaluated by the average of the area under the receiver operating characteristic curve (AUC-ROC) values over the corresponding 1000 testing samples. Random forest (Figure S1, SDC, http://links.lww.com/TP/B381) and artificial neural network (Figure S2, SDC, http://links.lww.com/TP/B381) implementations in the Weka data mining software were used for the experiments (SDC, Materials and Methods for further information, http://links.lww.com/TP/B381).
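To make the resampling scheme above concrete, the following is a minimal sketch of out-of-bag bootstrap evaluation. The study used Weka; here scikit-learn, the column name graft_failure_30d, and a fully numeric, already-imputed pandas feature matrix are stand-in assumptions rather than details of the actual implementation.

```python
# Sketch of the bootstrap / out-of-bag evaluation scheme (illustrative, not the study code).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def out_of_bag_aucs(df: pd.DataFrame, outcome: str = "graft_failure_30d",
                    n_repeats: int = 1000, seed: int = 0) -> list:
    """AUC-ROC on the out-of-bag cases for each repeated bootstrap sample."""
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_repeats):
        # Sample with replacement: roughly 63% of unique cases end up in training.
        train_idx = rng.choice(df.index, size=len(df), replace=True)
        test_idx = df.index.difference(train_idx)          # out-of-bag transplants
        train, test = df.loc[train_idx], df.loc[test_idx]
        if test[outcome].nunique() < 2:                    # AUC needs both classes present
            continue
        model = RandomForestClassifier(n_estimators=5000, max_features="sqrt")
        model.fit(train.drop(columns=[outcome]), train[outcome])
        prob = model.predict_proba(test.drop(columns=[outcome]))[:, 1]
        aucs.append(roc_auc_score(test[outcome], prob))
    return aucs
```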

First, random forest algorithms and artificial neural networks were trained using all available characteristics for the 1000 bootstrapped samples.

Next, all the characteristics were ranked per training sample using an AUC-ROC-based characteristic ranking method, which is suitable for data sets with a high number of factors, missing values, and imbalanced class sizes.20,21 The implementation in the "party" package for the R statistical software22 was used for this task. By scoring the characteristics according to their importance in each sample, over the 1000 samples, we determined the overall ranks of the characteristics for our training data.
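The ranking was performed with the R party package; as a rough stand-in only, the sketch below uses scikit-learn permutation importance scored by AUC-ROC on a single bootstrap sample (pandas DataFrames and pre-handled missing values are assumed, and this is not the same algorithm as the party implementation). In the study, per-sample scores were then aggregated over the 1000 samples to obtain overall ranks.

```python
# Approximate stand-in for the AUC-ROC-based characteristic ranking (the study used
# the R "party" package; this scikit-learn version is only an illustration).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

def rank_characteristics(X_train, y_train, X_test, y_test, seed: int = 0):
    """Order characteristics by AUC-based permutation importance for one bootstrap sample."""
    model = RandomForestClassifier(n_estimators=500, random_state=seed)
    model.fit(X_train, y_train)
    result = permutation_importance(model, X_test, y_test,
                                    scoring="roc_auc", n_repeats=10, random_state=seed)
    order = np.argsort(result.importances_mean)[::-1]
    return [X_train.columns[i] for i in order]
```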

As the next step, random forests and artificial neural networks were trained and evaluated using the top 15 factors of each sample. Fifteen was chosen as the number of factors to be considered based on clinical utility. When training random forests, the following standard parameters were used23: 5000 trees, with the square root of the number of available factors as the number of randomly selected factors considered at each decision point. Two hidden layers were used when training the artificial neural networks.
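In scikit-learn terms (an illustrative equivalent only; the study used Weka, and the hidden-layer sizes below are placeholders because the text does not report them), these settings correspond roughly to:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Random forest with the standard parameters described above:
# 5000 trees, sqrt(number of factors) candidate factors considered at each split.
rf = RandomForestClassifier(n_estimators=5000, max_features="sqrt")

# Artificial neural network with two hidden layers (layer sizes are placeholders,
# not reported in the paper).
ann = MLPClassifier(hidden_layer_sizes=(20, 10), max_iter=2000)
```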

Random forests and artificial neural networks with the overall top 15 ranked characteristics were used to determine the performance with the validation data.

Outcome Parameters

The primary outcome parameter used to develop and evaluate the prediction model was graft failure or primary nonfunction, defined as death or retransplantation within 30 days of the transplant. As a secondary outcome parameter, the performance of the developed model in predicting graft failure at 3 months was evaluated, using a separate validation data set.

DRI

As a comparative predictor of outcome, the DRI was calculated using the definition provided by Feng et al.4 In the data set, some factors required to calculate the DRI for a particular donor may not have been recorded. The DRI was considered missing for a record if any of the factors used in the DRI were missing: age, cause of death (stroke, anoxia, trauma, other), whether the organ offer was after brain death or cardiac death, height, race (white, African American, other), donor hospital location (local, regional, national), cold ischemia time, and partial/split liver. The actual cold ischemia time recorded was used in the calculations. Donor hospital location was assigned as follows: offers from hospitals in the Melbourne metropolitan area as local, offers from within Victoria state as regional, and all others as national. Logistic regression was used to evaluate the performance of the samples with DRI.
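A minimal sketch of the completeness rule and donor hospital location assignment described above follows; field names are hypothetical, and the DRI formula itself (the published Feng et al coefficients) is deliberately left as a stub rather than reproduced here.

```python
import math

# Factors required for DRI; a record with any of these missing gets no DRI value.
DRI_FIELDS = ["age", "cause_of_death", "donation_type", "height", "race",
              "donor_hospital_location", "cold_ischemia_time", "partial_split"]

def donor_hospital_location(region: str) -> str:
    """Melbourne metropolitan -> local, elsewhere in Victoria -> regional, otherwise national."""
    if region == "melbourne_metro":
        return "local"
    if region == "victoria":
        return "regional"
    return "national"

def compute_dri(record: dict) -> float:
    # Placeholder: apply the published Feng et al coefficients here (not reproduced in this sketch).
    raise NotImplementedError

def dri_or_missing(record: dict) -> float:
    """Return the DRI, or NaN if any required factor is missing, per the rule above."""
    if any(record.get(field) is None for field in DRI_FIELDS):
        return math.nan
    return compute_dri(record)
```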

DRI +/− MELD by Random Forest

The coefficients of the factors in the DRI were derived from a Cox regression analysis of a large data set from the United States.4 It is possible that if the coefficients were recalculated, or used to develop a nonlinear model, the factors considered in the DRI could be made more specific to the local Australian context. Therefore, a random forest algorithm was developed using the DRI factors to assess their predictive capability.

A further random forest algorithm was developed using the factors required to calculate the DRI and the MELD score. This was an attempt to consider both donor and recipient factors in their contribution to transplant outcome.
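In outline, this amounts to fitting a random forest on the DRI input factors plus the recipient MELD score; a brief sketch under the same assumptions as above (hypothetical column names, numerically encoded features):

```python
from sklearn.ensemble import RandomForestClassifier

# DRI input factors plus the recipient MELD score (column names are illustrative).
DRI_PLUS_MELD = ["age", "cause_of_death", "donation_type", "height", "race",
                 "donor_hospital_location", "cold_ischemia_time", "partial_split", "meld"]

def fit_dri_meld_forest(X_train, y_train):
    """Fit a random forest on the DRI factors plus MELD to predict 30-day graft failure."""
    model = RandomForestClassifier(n_estimators=5000, max_features="sqrt")
    model.fit(X_train[DRI_PLUS_MELD], y_train)
    return model
```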

SOFT Score

We calculated the SOFT score as another comparative predictor of the outcome of interest, using the definition provided by Rana et al7 (Table S1, SDC, http://links.lww.com/TP/B381). Portal bleed within 48 hours pretransplant was removed from the formula due to its unavailability in the data set. The SOFT score was considered missing for a record if any of the 18 factors used in its calculation were missing. Due to the high proportion of records with a missing SOFT score (56%), performance with the SOFT score was evaluated using random forests.

Statistical Analysis

The predictive performance of all the models was assessed using AUC-ROC analysis, a measurement of the discriminative ability of a model that is especially suited to imbalanced class classification.24-26 AUC-ROC values vary from 0 to 1, where greater than 0.9 is considered excellent discrimination, greater than 0.75 is considered good discrimination, and 0.5 is equivalent to random guessing.24 AUC-ROC values were computed for each of the 1000 sample training/testing data sets, and 95% confidence intervals (CIs) were determined.
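The per-sample AUC-ROC values can then be summarized as a mean with a 95% CI; the sketch below uses a normal-approximation interval on the mean of the 1000 out-of-bag AUCs, which is one reasonable choice (the paper does not state the exact interval construction used).

```python
import numpy as np

def summarize_aucs(aucs) -> dict:
    """Mean AUC-ROC and a 95% CI over the 1000 bootstrap test samples."""
    aucs = np.asarray(aucs, dtype=float)
    mean = aucs.mean()
    half_width = 1.96 * aucs.std(ddof=1) / np.sqrt(len(aucs))
    return {"mean": mean, "ci_low": mean - half_width, "ci_high": mean + half_width}
```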

RESULTS

Data Set Characteristics

The final data set had 180 transplants, including 16 retransplants, with 11 graft failures (6.1%) within 30 days. A total of 276 available donor and recipient characteristics (95 dichotomous, 25 nondichotomous, 156 numerical) were included for characteristic selection; 32% of the values in the data set were missing. One hundred seventy-three donor characteristics, including demographic, clinical, and logistical information, were included. The recipient characteristics used in the study comprised 103 demographic and pretransplant clinical variables. A summary of the donor and recipient demographic and clinical characteristics is shown in Table 1, and the full list of characteristics is given in the SDC, Data Set, http://links.lww.com/TP/B381.

TABLE 1.

Summary of donor and recipient characteristics


Algorithm Performances

The ranks of the factors were determined from the sample training data sets using random forest characteristic importance method and the overall top 15 predictive donor and recipient factors were selected.

The donor factors were: cause of death (stroke, anoxia, trauma, other), serum albumin level, donation after brain or cardiac death, the state in which the donor hospital is located, alcohol consumption (no, unknown quantity, <1, 2-4, >4 drinks per day), hemoglobin (Hb) level, total protein level, insulin usage, age, previous surgery, whether the pancreas was retrieved concurrently, and donor cytomegalovirus (CMV) status.

The recipient factors were: disease category, medical status at activation (home, frequent hospital care, hospital bound, intensive care unit, ventilated), and serum herpes simplex antibodies. Table 2 provides the ranking of the overall top 15 factors with their percentages of missing values in the study and validation data sets. It is noteworthy that most of these top predictors have lower percentages of missing values than the overall average of 32%.

TABLE 2.

Overall top 15 predictors with the percentage of missing values in the study data and validation data


Without characteristic selection, neural networks had an average AUC-ROC of 0.734 (95% CI, 0.729-0.739), whereas random forests achieved 0.787 (95% CI, 0.782-0.793). By comparison, when using the top 15 factors of each sample for 30-day graft failure, the predictive ability had an average AUC-ROC value of 0.818 (95% CI, 0.812-0.824) with random forests and 0.835 (95% CI, 0.831-0.840) with neural networks.

The validation data set contained 90 transplants with 3 graft failures within 3 months, which was selected as the outcome for validation due to the lack of graft failures within 30 days. When the performance of the final model with the overall top 15 factors, trained for graft failure at 30 days, was assessed in its prediction ability for graft failure at 3 months, random forests achieved an average AUC-ROC value of 0.715 (95% CI, 0.705-0.724), whereas neural networks yielded 0.559 (95% CI, 0.548-0.569).

DRI, SOFT Score, and DRI +/− MELD by Random Forest Performance

For comparison, the DRI for each donor in our data set was calculated, with a mean value of 1.56 (±0.37). DRI predicted graft failure within 30 days with an average AUC-ROC value of 0.680 (95% CI, 0.669-0.690). When DRI, trained for graft failure at 30 days, was used to predict graft failure at 3 months in the validation data set, the average AUC-ROC value was 0.595 (95% CI, 0.587-0.602).

Using the same factors that are used in the DRI, we developed a model using random forests. This model achieved an average AUC-ROC of 0.697 (95% CI, 0.688-0.705). When MELD scores were added to the DRI factors for random forest modeling, a predictive average AUC-ROC of 0.764 (95% CI, 0.756-0.771) was observed.

The SOFT score was also assessed and had a mean value of 5.5 (±4.3). As a predictor of 30-day graft failure, it had an average AUC-ROC of 0.638 (95% CI, 0.632-0.645).

A comparison of all the results with the study data set is given in Table 3 and Figure 1.

TABLE 3.

Comparison of AUC-ROC values of different models created during the study


FIGURE 1.

ROC curve comparison of different models created during the study.

DISCUSSION

This study is a proof of concept that machine-learning algorithms can be an invaluable tool supporting the decision-making process for liver transplant organ allocation. This is particularly relevant in the current high-stakes environment, where suboptimal organ utilization leads to either increased waiting list mortality or patient mortality after transplantation.

The results of this study revealed that the 15 top-ranking donor and recipient variables available before transplantation were the best predictors of outcome, with an average AUC-ROC of 0.818 with the random forest algorithm and 0.835 with artificial neural networks. Both machine-learning techniques showed significant improvements in AUC-ROC with characteristic selection. These were followed by the random forest classifier trained with the variables used to calculate DRI plus the MELD score (AUC-ROC, 0.764). Using the random forest classifier with the factors used to calculate DRI improved the discrimination of DRI from 0.680 to 0.697. The SOFT score achieved an average AUC-ROC of 0.638. When the predictive accuracy of the final models with the top 15 factors, trained for the 30-day outcome, was assessed for graft failure at 3 months, the AUC-ROC value decreased from 0.818 to 0.715 with random forests and from 0.835 to 0.559 with neural networks. By comparison, DRI prediction of 3-month graft failure was 0.595.

There are many machine-learning paradigms, of which 2 of the most widely used are artificial neural networks and random forest classifiers. In a recent landmark article in which 179 different machine-learning classifiers were evaluated across 121 data sets, representing the entire University of California Irvine Machine Learning Repository, random forest classifiers were found to be the most accurate.27 There are 4 reports in the literature using artificial neural networks to predict transplant outcome.28-31 The present study is the first report using a random forest machine-learning algorithm for predicting outcome after liver transplantation.

There are multiple theoretical advantages to the use of random forest algorithms in this application. It is well known in the machine-learning literature that artificial neural networks are prone to overfitting and to learning noise in the data, resulting in unstable models with poor generalization ability.32-35 By design, random forest classifiers are less prone to overfitting, producing more stable models.36-38 In medical data sets, there is frequently a large degree of missing data because the data are often not collected for research purposes, and some tests are not routinely performed even though they may be highly prognostic (eg, donor liver biopsy for assessment of steatosis). Simply excluding these cases may bias the results because the "missingness" of the data is not completely at random.39,40 Random forest algorithms are superior in handling data sets missing a significant proportion of input data, as in this study.41 Furthermore, whereas artificial neural networks are essentially a "black box" into which data are input and from which a prediction is output, the characteristic importance measure of random forests can indicate the importance of each variable in the data set, thereby improving the transparency of the algorithm.38,41,42

Myriad factors, including donor, recipient, and locally specific transplant factors, interact to influence liver transplant outcome. There have been many attempts in the literature to predict graft failure after liver transplant.7,8,43-48 Some studies looked at predicting graft failure using either donor factors, recipient factors,43 or a combination of both.7,8,45-48 However, these approaches have all failed to gain wider adoption because they were developed from patient populations that may not be generalizable to other centers due to regional differences in patient, donor, or process factors, or to changes in practice since their development.5,6 Furthermore, they are calculated from simple multiple regression statistical models, which assume a linear influence of the different variables. A predictive model required to enable effective organ allocation needs to be locally and temporally applicable and to account for the complex interactions within the data available before transplantation.

Currently, decisions for organ allocation are largely subjective or based on a recipient "sickest-first" or "waiting-time" approach rather than an outcome-based approach. Machine-learning algorithms are increasingly used for modern clinical decision making. Compared with current methods, they are data driven, able to accommodate numerous interdependent variables, and specific to the population on which they were trained. In addition, compared with static indices, they are dynamic, able to "learn" case by case as the training set expands.

Using the characteristic importance measure, the most influential donor and recipient variables were determined. Most of these factors, such as donor age, whether the offer is after brain death or cardiac death, donor cause of death, donor hospital state (geographical distance), donor alcohol consumption, recipient disease category, and medical status at activation, are already known to be important factors.4,45,49,50 Donor Hb, protein level, and insulin usage were also top-ranking predictive characteristics, which makes sense clinically. Donor CMV and recipient herpes simplex virus status were also predictive and, although less intuitive, have been shown to be associated with acute viral infection and rejection.51,52 Interestingly, the decision to retrieve the pancreas for islet cell or whole organ transplant was also a top-ranking factor, although the decisions to retrieve kidneys, lungs, or heart were not significant factors. This is likely because the decision for pancreas retrieval is usually more stringent, requiring more ideal donor conditions.

This study highlights the importance of characteristic selection and tailoring in predictive modeling. The predictive accuracy of the well-known DRI was improved when tailored to the specific influences at the Austin Health Liver Transplant Unit. Accuracy was further improved with the addition of the recipient MELD score, with the best accuracy found with the application of a unit-specific random forest algorithm using the top-ranking predictive factors.

The main limitation of machine-learning algorithms is that they are best suited to predicting outcome in the environment from which they were derived. Conversely, this limitation is also a strength, in that such an algorithm is highly specific to the peculiarities of a particular transplant center, enabling the best decision for each individual transplant. Therefore, although it is not ideal to export a trained algorithm from 1 transplant center to the next, the approach, with an algorithm tailored to each transplant center, is certainly possible. A further limitation of this algorithm is that, although it is trained to predict 30-day graft failure, its predictive accuracy may not extend to other important liver transplant outcomes, such as 3-, 6-, or 12-month graft failure, early graft dysfunction, acute/chronic rejection, infections, immunosuppression, or late biliary strictures. Each of these outcomes might require a separately trained algorithm.

A limitation of this study is that the machine-learning algorithm was derived from an observational database. Although the bootstrapping-with-replacement methodology is well validated for the development of robust predictive machine-learning models,53,54 and our attempts to predict 3-month graft failure in a separate validation data set look promising, prospective validation for 30-day graft failure would be valuable to confirm the predictive ability.

This study confirms that machine-learning algorithms based on donor and recipient variables which are known before organ allocation can be utilized to predict transplant outcomes. This approach may be used as a tool for transplant surgeons to improve organ allocation decisions. The ability to quantify risk may allow for improved confidence with the use of marginal organs and better outcome after transplantation.

ACKNOWLEDGMENTS

The authors gratefully acknowledge Angela Li, and the staff of the Liver Transplant Unit at Austin Hospital for their invaluable support for this study.

Footnotes

L.L. and Y.K. are joint first authors.

V.M. and J.B. are joint last authors.

The authors were supported in part by the Royal Australasian College of Surgeons Surgeon Scientist Research Scholarship, the Avant Doctor in Training Research Scholarship and the Australian Postgraduate Award.

The authors declare no conflicts of interest.

L.L. participated in research design, data collection, and article writing. Y.K. participated in research design, data analysis, and article writing. B.R. participated in research design, data analysis, and article revision. R.J. participated in research design and article revision. C.C. participated in research design and article revision. V.M. participated in research design and article revision. J.B. participated in research design, data analysis, and article revision.

Correspondence: Lawrence Lau, MBBS, FRACS, Department of Surgery, Austin Hospital Heidelberg, Melbourne, Australia. (thelau@gmail.com).

Supplemental digital content (SDC) is available for this article. Direct URL citations appear in the printed text, and links to the digital files are provided in the HTML text of this article on the journal’s Web site (www.transplantjournal.com).

The authors report an algorithm, based on 15 of the top-ranking donor and recipient variables available before transplantation, for predicting outcome following liver transplantation using random forest machine learning. Supplemental digital content is available in the text.

REFERENCES

1. Busuttil RW, Tanaka K. The utility of marginal donors in liver transplantation. Liver Transpl. 2003;9:651–663.
2. Tector AJ, Mangus RS, Chestovich P. Use of extended criteria livers decreases wait time for liver transplantation without adversely impacting posttransplant survival. Ann Surg. 2006;244:439–450.
3. Volk ML, Roney M, Merion RM. Systematic bias in surgeons' predictions of the donor-specific risk of liver transplant graft failure. Liver Transpl. 2013;19:987–990.
4. Feng S, Goodrich N, Bragg-Gresham JL. Characteristics associated with liver graft failure: the concept of a Donor Risk Index. Am J Transplant. 2006;6:783–790.
5. Mataya L, Aronsohn A, Thistlethwaite JR. Decision making in liver transplantation—limited application of the liver Donor Risk Index. Liver Transpl. 2014;20:831–837.
6. Briceño J, Ciria R, de la Mata M. Donor-recipient matching: myths and realities. J Hepatol. 2013;58:811–820.
7. Rana A, Hardy MA, Halazun KJ. Survival outcomes following liver transplantation (SOFT) score: a novel method to predict patient survival following liver transplantation. Am J Transplant. 2008;8:2537–2546.
8. Halldorson JB, Bakthavatsalam R, Fix O. D-MELD, a simple predictor of post liver transplant mortality for optimization of donor/recipient matching. Am J Transplant. 2009;9:318–326.
9. Croome K, Marotta P, Wall W. Should a lower quality organ go to the least sick patient? Model for End-Stage Liver Disease score and Donor Risk Index as predictors of early allograft dysfunction. Transplant Proc. 2012;44:1303–6.
10. Feyyad U. Data mining and knowledge discovery: making sense out of data. IEEE. 1996;11:20–25.
11. Kaur M, Gulati H, Kundra H. Data mining in agriculture on crop price prediction: techniques and applications. Intl J Comput Appl. 2014;99:1–3.
12. Joachims T. Optimizing search engines using clickthrough data. Paper presented at: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2002.
13. Langley P. Machine learning for adaptive user interfaces. Paper presented at: KI-97: Advances in Artificial Intelligence; 1997.
14. Kononenko I. Machine learning for medical diagnosis: history, state of the art and perspective. Artif Intell Med. 2001;23:89–109.
15. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
16. Liaw A, Wiener M. Classification and regression by Random Forest. R News. 2002;2:18–22.
17. Bellazzi R, Zupan B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform. 2008;77:81–97.
18. Breiman L. Out-of-bag estimation. Berkeley, CA: Statistics Department, University of California, Berkeley; 1996.
19. Efron B. Estimating the error rate of a prediction rule: improvement on cross-validation. J Am Stat Assoc. 1983;78:316–331.
20. Janitza S, Strobl C, Boulesteix AL. An AUC-based permutation variable importance measure for random forests. BMC Bioinformatics. 2013;14:119.
21. Hapfelmeier A, Hothorn T, Ulm K. A new variable importance measure for random forests with missing data. Stat Comput. 2014;24:21–34.
22. Hothorn T, Hornik K, Strobl C, et al. Party: a laboratory for recursive partytioning. 2010.
23. Lunetta KL, Hayward LB, Segal J. Screening large-scale association study data: exploiting interactions using random forests. BMC Genet. 2004;5:32.
24. Ray P, Le Manach Y, Riou B. Statistical evaluation of a biomarker. Anesthesiology. 2010;112:1023–1040.
25. Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recog. 1997;30:1145–1159.
26. Steyerberg EW, Vickers AJ, Cook NR. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology. 2010;21:128.
27. Fernández-Delgado M, Cernadas E, Barro S. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res. 2014;15:3133–3181.
28. Dvorchik I, Subotin M, Marsh W. Performance of multi-layer feedforward neural networks to predict liver transplantation outcome. Methods Inf Med. 1996;35:12–18.
29. Matis S, Doyle H, Marino I, et al. Use of neural networks for prediction of graft failure following liver transplantation. Paper presented at: Proceedings of the Eighth IEEE Symposium on Computer-Based Medical Systems; 1995.
30. Briceño J, Cruz-Ramírez M, Prieto M. Use of artificial intelligence as an innovative donor-recipient matching model for liver transplantation: results from a multicenter Spanish study. J Hepatol. 2014;61:1020–1028.
31. Cruz-Ramírez M, Hervás-Martínez C, Fernández JC. Predicting patient survival after liver transplantation using evolutionary multi-objective artificial neural networks. Artif Intell Med. 2013;58:37–49.
32. Cheng B, Titterington DM. Neural networks: a review from a statistical perspective. Statistical Science. 1994:2–30.
33. Gardner MW, Dorling S. Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmos Environ. 1998;32:2627–2636.
34. Adya M, Collopy F. How effective are neural networks at forecasting and prediction? A review and evaluation. J Forecasting. 1998;17:481–495.
35. Zhang GP. Neural networks for classification: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews. 2000;30:451–462.
36. Anaissi A, Kennedy PJ, Goyal M. A balanced iterative random forest for gene selection from microarray data. BMC Bioinformatics. 2013;14:261.
37. Amaratunga D, Cabrera J, Lee YS. Enriched random forests. Bioinformatics. 2008;24:2010–2014.
38. Cutler DR, Edwards TC, Beard KH. Random forests for classification in ecology. Ecology. 2007;88:2783–2792.
39. Acuna E, Rodriguez C. The treatment of missing values and its effect on classifier accuracy. In: Classification, Clustering, and Data Mining Applications. Springer; 2004:639–647.
40. Schafer JL, Graham JW. Missing data: our view of the state of the art. Psychol Methods. 2002;7:147.
41. Pantanowitz A, Marwala T. Missing data imputation through the use of the Random Forest Algorithm. In: Advances in Computational Intelligence. Springer; 2009:53–62.
42. Ball RL, Tissot P, Zimmer B, et al. Comparison of random forest, artificial neural network, and multi-linear regression: a water temperature prediction case. Paper presented at: Seventh Conference on Artificial Intelligence and its Applications to the Environmental Sciences; 2009; New Orleans, LA.
43. Desai NM, Mange KC, Crawford MD. Predicting outcome after liver transplantation: utility of the model for end-stage liver disease and a newly derived discrimination function. Transplantation. 2004;77:99–106.
44. Avolio AW, Siciliano M, Barbarino R. Donor Risk Index and organ patient index as predictors of graft survival after liver transplantation. Transplant Proc. 2008;40:1899–902.
45. Ioannou GN. Development and validation of a model predicting graft survival after liver transplantation. Liver Transpl. 2006;12:1594–1606.
46. Amin MG, Wolf MP, TenBrook JA. Expanded criteria donor grafts for deceased donor liver transplantation under the MELD system: a decision analysis. Liver Transpl. 2004;10:1468–1475.
47. Avolio AW, Cillo U, Salizzoni M. Balancing donor and recipient risk factors in liver transplantation: the value of D-MELD with particular reference to HCV recipients. Am J Transplant. 2011;11:2724–2736.
48. Dutkowski P, Oberkofler CE, Slankamenac K. Are there better guidelines for allocation in liver transplantation? A novel score targeting justice and utility in the model for end-stage liver disease era. Ann Surg. 2011;254:745–753.
49. Mateo R, Cho Y, Singh G. Risk factors for graft survival after liver transplantation from donation after cardiac death donors: an analysis of OPTN/UNOS data. Am J Transplant. 2006;6:791–796.
50. Moore DE, Feurer ID, Speroff T. Impact of donor, technical, and recipient risk factors on survival and quality of life after liver transplantation. Arch Surg. 2005;140:273–277.
51. Linares L, Sanclemente G, Cervera C. Influence of cytomegalovirus disease in outcome of solid organ transplant patients. Transplant Proc. 2011;43:2145–8.
52. Pedersen M, Seetharam A. Infections after orthotopic liver transplantation. J Clin Exp Hepatol. 2014;4:347–360.
53. Breiman L. Out-of-bag estimation. Citeseer; 1996.
54. Austin PC, Tu JV. Bootstrap methods for developing predictive models. Am Statist. 2004;58:131–137.
