PLOS ONE. 2020 Nov 5;15(11):e0242008. doi: 10.1371/journal.pone.0242008

Measuring adverse events following hip arthroplasty surgery using administrative data without relying on ICD-codes

Martin Magnéli 1,*, Maria Unbeck 2, Cecilia Rogmark 3, Olof Sköldenberg 1, Max Gordon 1
Editor: Liza N Van Steenbergen 4
PMCID: PMC7644076  PMID: 33152055

Abstract

Introduction

Measuring and monitoring adverse events (AEs) following hip arthroplasty is challenging. The aim of this study was to create a model for measuring AEs after hip arthroplasty using administrative data, such as length of stay and readmissions, with equal or better precision than an ICD-code based model.

Materials and methods

This study included 1 998 patients operated with an acute or elective hip arthroplasty in a national multi-centre study. We collected AEs within 90 days following surgery with retrospective record review. Additional data came from the Swedish Hip Arthroplasty Register, the Swedish National Patient Register and the Swedish National Board of Health and Welfare. We made a 2:1 split of the data into a training and a holdout set. We used the training set to train different machine learning models to predict if a patient had sustained an AE or not. After training and cross-validation we tested the best performing model on the holdout-set. We compared the results with an established ICD-code based measure for AEs.

Results

The best performing model was a logistic regression model with four natural age splines. The variables included in the model were as follows: length of stay at the orthopaedic department, discharge to acute care, age, number of readmissions and ED visits. The sensitivity and specificity of the new model were 23% and 90% for AEs within 30 days, compared with 5% and 94% for the ICD-code based model. For AEs within 90 days, the sensitivity and specificity were 31% and 89%, compared with 16% and 92% for the ICD-code based model.

Conclusion

We conclude that a prediction model for AEs following hip arthroplasty surgery that relies on administrative data without ICD-codes is more accurate than a model based on ICD-codes.

Introduction

Hip arthroplasty surgery improves the quality of life for more than one million patients each year worldwide and is generally considered a safe procedure [1]. However, some patients will sustain adverse events (AEs) during or following the surgery. Rates of AEs following hip arthroplasty surgery are between 3% and 27%, depending on patient selection, measuring method and AE definition [2–4].

AEs cause both suffering for the patients and expenses for the healthcare system. In a study by Culler et al., the mean cost of a hip arthroplasty surgery without an AE was $15 600 and of a surgery with any AE was $19 000 [5]. The cost of a surgery with ≥ 3 AEs was $42 900. As bundled payments and pay-per-performance are becoming more commonplace, adequate AE identification becomes vital from more than just a patient perspective.

Identifying and monitoring AEs is challenging. In Sweden, there have been attempts at comparing hospitals using incidence of AEs and other quality indicators. The AEs have been measured through ICD-10 codes related to readmissions in the National Patient Register (NPR). As we have previously shown, the accuracy of this method is very low [6]. In addition, the reliance on ICD-codes risks introducing a coding bias in the national databases. This is well known when diagnostic and procedural codes are connected to reimbursement [7]. In the Medicare system, self-reporting of hospital-acquired infections was biased by upcoding (misreporting of AEs to increase reimbursement or avoid penalties, also known as DRG-creep) when the reporting of many infections would lead to financial penalties [8].

The aim of this study was to create a model for measuring AEs after hip arthroplasty relying on administrative data, such as length of stay and readmissions, with equal or better precision than an ICD-code based model.

Patients and methods

Setting and study population

This is a retrospective multi-centre cohort study on prospectively collected data from medical records and registry data [6]. The study population consisted of all patients aged ≥ 18 and operated with a hip arthroplasty due to osteoarthritis, hip fractures and other forms of degenerative joint disease that are registered in the Swedish Hip Arthroplasty Register (SHAR) between 2009 and 2011 (n = 21 774).

Ethical approval

Ethical approval was provided by the Regional Ethics Committee of Gothenburg (516–13 and T732-13). Permission for data access for the reviewers was granted by the head of each respective unit. The patients did not provide an informed consent to the record review, and the need for informed consent was waived by the regional ethics committee.

Data sources

From the SHAR we collected data on the primary procedures, which were cross-linked with data from the NPR using the Swedish personal identity numbers. From the NPR, we collected data on all admissions from the primary procedure until 90 days post-operatively. The National Board of Health and Welfare furthermore supplied aggregated data on length of stay in Swedish hospitals during the study period.

We performed retrospective record review (RRR) using the Swedish version [9] of the Global Trigger Tool (GTT) [10] for all inpatient and unplanned outpatient hospital care up to 90 days after surgery. The review process is described in detail elsewhere [6, 11, 12].

Study cohort

The study cohort consisted of 2 000 patients with both acute and elective hip arthroplasty surgery. The patients underwent surgery in one of four major regions in Sweden (Stockholm County Council, Region Västra Götaland, Region Skåne and Västerbotten County Council).

To increase the probability of selecting medical records with an AE and to avoid excess RRR on records without AEs, we used a weighted sample. Twenty different selection groups, ten each for acute and elective arthroplasties, were created as follows (Table 1).

Table 1. Selection groups used for the weighted sample.

| ICD-10 code indicating an AE in the NPR | Selection criterion | Stratum | Acute: population | Acute: sample | Elective: population | Elective: sample |
| --- | --- | --- | --- | --- | --- | --- |
| With a predefined code | Percentiles of length of stay | 0–55% | 194 | 11 | 95 | 22 |
| | | 56–80% | 148 | 16 | 58 | 33 |
| | | 81–100% | 302 | 25 | 235 | 49 |
| | Readmission | 2–30 days | 274 | 98 | 356 | 196 |
| | | 31–90 days | 199 | 98 | 204 | 195 |
| Without a predefined code | Percentiles of length of stay | 0–55% | 2859 | 44 | 9769 | 86 |
| | | 56–80% | 1167 | 65 | 2070 | 131 |
| | | 81–100% | 766 | 97 | 1781 | 197 |
| | Readmission | 2–30 days | 294 | 147 | 337 | 295 |
| | | 31–90 days | 341 | 66 | 325 | 129 |
| Total | | | 6544 | 667 | 15230 | 1333 |

ICD-10, the 10th revision of the International Classification of Diseases.

  1. We constructed three groups with lengths of primary stay in percentiles divided as 0–55%, 56–80% and 81–100%. The three groups were further divided based on whether there was an ICD-10 code indicating an AE in the NPR (Table 2). Overall, six groups were generated.

  2. A selection was made for patients who had readmissions in the NPR. The readmission groups were divided in readmission within 2–30 days and within 31–90 days after surgery. The two groups were further divided based on whether there was an ICD-10 code indicating an AE in the NPR, generating a total of four groups.

Table 2. Set of ICD-10 codes used in the selection of patients.

As main diagnosis
All I codes Diseases of the circulatory system
J819 Pulmonary oedema
J13 Pneumonia due to Streptococcus pneumoniae
J15 Bacterial pneumonia, not elsewhere classified
J18 Pneumonia, organism unspecified
R33 Retention of urine
As main or secondary diagnosis
I803 Phlebitis and thrombophlebitis of lower extremities, unspecified
I269 Pulmonary embolism without mention of acute cor pulmonale
L899 Decubitus ulcer and pressure area, unspecified
M243 Pathological dislocation and subluxation of joint, not elsewhere classified
M244 Recurrent dislocation and subluxation of joint
S730 Dislocation, sprain and strain of joint and ligaments of hip
T810 Haemorrhage and haematoma complicating a procedure, not elsewhere classified
T813 Disruption of operation wound, not elsewhere classified
T814 Infection following a procedure, not elsewhere classified
T840 Mechanical complication of internal joint prosthesis
T845 Infection and inflammatory reaction due to internal joint prosthesis
T933 Sequelae of dislocation, sprain and strain of lower limb

ICD-10, the 10th revision of the International Classification of Diseases.

This created 10 selection groups and we sampled according to the table, both from acute and elective patients.

Definitions

The index admission was defined as the orthopaedic admission when the patient had hip arthroplasty surgery. If the patient was discharged directly to a geriatric or rehabilitation department, this admission was also considered a part of the index admission.

An AE in the RRR was defined as suffering, physical harm or disease, as well as death, related to the index admission, and as a condition that was not an inevitable consequence of the patient's disease or treatment. If an adverse event affects a patient, there is in most cases also suffering, i.e. something subjectively unpleasant, involved. If, for example, a patient is affected by a deep wound infection with a long hospital stay and reoperations, there is inevitably suffering involved along with the physical harm. Suffering is thus closely connected to the physical harm or disease, and death, parts of the AE definition used in our study. That the condition is not an inevitable consequence means that the adverse event is associated with healthcare-related omissions or commissions rather than an underlying disease or injury of the patient.

The outcome was at least one AE of any type or severity.

Data set

Two patients were excluded: one did not have an available medical record and the other had not had hip arthroplasty surgery and was presumed to have been incorrectly registered in the SHAR. The final study cohort consisted of 1 998 patients. Of these patients, 1 171 had at least one AE (the high proportion of AEs was due to the selection of the cohort, which targeted groups with a high probability of AEs; see the Study cohort section above). Predictor variables included gender, age, length of primary stay (LOS) for both the orthopaedic admission and the index admission, acute or elective procedure, type of hospital for the primary surgery (university, central county council, county council or private), number of readmissions, number of emergency department (ED) visits and whether the patient was discharged to an acute care ward (surgical, internal medicine, cardiology, infection or intensive care). We cross-linked the patient data with aggregated data matching the patient's age, gender, acute or elective care, year of surgery and type of hospital. The aggregated data included the 50th, 75th, 90th and 95th percentiles, mean and standard deviation of LOS for that patient and hospital category. As a result, each patient had, in addition to their own LOS, aggregated LOS data for patients with the same characteristics. We used the mean of these 50th percentiles to calculate the LOS trends in Sweden during the study period.

Reference model based on ICD-codes (code-model)

The model used by both the SHAR and the National Board of Health and Welfare is based on a set of ICD-10 codes (Table 3). According to the model, a patient has sustained an AE if any of the codes is present in the NPR during readmissions. This model was used as our reference model.

Table 3. The set of ICD-10 codes used by the instrument to define an adverse event.

As main diagnosis
All I codes Diseases of the circulatory system
J819 Pulmonary oedema
J13 Pneumonia due to Streptococcus pneumoniae
J15 Bacterial pneumonia, not elsewhere classified
J18 Pneumonia, organism unspecified
R33 Retention of urine
As main or secondary diagnosis
L899 Decubitus ulcer and pressure area, unspecified
S730 Dislocation, sprain and strain of joint and ligaments of hip
T810 Haemorrhage and haematoma complicating a procedure, not elsewhere classified
T813 Disruption of operation wound, not elsewhere classified
T814 Infection following a procedure, not elsewhere classified
T840 Mechanical complication of internal joint prosthesis
T845 Infection and inflammatory reaction due to internal joint prosthesis
T933 Sequelae of dislocation, sprain and strain of lower limb

ICD-10, the 10th revision of the International Classification of Diseases.
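The rule behind the code-model can be illustrated with a short sketch. The authors worked in R and do not publish their implementation; the Python below is only an illustration under stated assumptions. The record layout (a list of readmissions, each a dict with a `main` code and an optional list of `secondary` codes), the function names, and the prefix matching of codes written without dots are all hypothetical.

```python
# Hypothetical sketch of the code-model rule described above (Table 3).
# Assumptions: codes are upper-case strings without dots, and each readmission
# is a dict {"main": str, "secondary": [str, ...]}. None of this layout comes
# from the paper.

# Codes that count only when recorded as the main diagnosis.
MAIN_ONLY_PREFIXES = ("I", "J819", "J13", "J15", "J18", "R33")

# Codes that count as either main or secondary diagnosis.
ANY_POSITION_PREFIXES = (
    "L899", "S730", "T810", "T813", "T814", "T840", "T845", "T933",
)


def readmission_flags_ae(readmission):
    """Return True if a single readmission carries a Table 3 code."""
    main = readmission["main"]
    secondary = readmission.get("secondary", [])
    if main.startswith(MAIN_ONLY_PREFIXES):
        return True
    return any(code.startswith(ANY_POSITION_PREFIXES) for code in [main, *secondary])


def code_model_predicts_ae(readmissions):
    """The code-model flags an AE if any readmission carries a listed code."""
    return any(readmission_flags_ae(r) for r in readmissions)
```

In this sketch a readmission with main diagnosis T845 is flagged, while a pulmonary embolism code (I269) recorded only as a secondary diagnosis is not, because Table 3 lists the I codes under main diagnosis only.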

Model development

The full technical description of the model development is available in the S1 Appendix. We made a 2:1 split of the data into a training and a holdout dataset. The training data was used to train a set of machine learning algorithms (random forests, logistic regression with and without natural splines, support vector machines and neural networks with three different structures). We used 10-fold cross-validation to evaluate model accuracy, fine-tune hyperparameters and control for over-fitting. During training, we started with all variables included in the model and did a stepwise removal of variables. We also split the dataset into acute and elective surgery and trained two separate models for each set. The best performing and fastest model was a logistic regression model with four natural age splines. Fig 1 shows a flow chart of the model development.
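As a rough illustration of the data partitioning described above, the following Python sketch makes a 2:1 split and generates 10 cross-validation folds. It is not the authors' code (they used R, see the S1 Appendix). The function names, the fixed seed and the use of a simple unstratified random split are assumptions; the paper does not state whether the split was stratified.

```python
import random


def split_two_to_one(ids, seed=0):
    """Shuffle ids and return (training, holdout) in a 2:1 ratio.

    Assumption: a plain random split; the seed is arbitrary.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n, k=10):
    """Yield (train_idx, validation_idx) for k contiguous folds over range(n)."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


# With the 1 998 patients of the final cohort this reproduces the set sizes
# reported in the paper: 1 332 for training and 666 for holdout.
training, holdout = split_two_to_one(range(1998))
folds = list(k_fold_indices(len(training), k=10))
```

Each fold serves once as the validation set while the remaining nine are used for fitting, so every training patient contributes to exactly one validation estimate.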

Fig 1.

Validation of models

In the final test, we trained the model on the whole training set and used the trained model to make predictions for the holdout set. We made a prediction for each selection group. The sensitivity and specificity in each group were multiplied by the group proportion (group size in the population/total population) and summed, which yielded the adjusted sensitivity and specificity. The sensitivity, specificity and Youden's index of the code-based model on the holdout data were calculated using the same method as on the training data.
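The adjustment for the weighted sample can be sketched as follows. This is a minimal Python illustration of the weighting described above, not the authors' R code. The data layout, the function names and the example numbers are hypothetical, and treating a group without positive (or negative) cases as contributing 0 is an assumption that the paper does not address.

```python
def confusion_rates(y_true, y_pred):
    """Return (sensitivity, specificity) for paired boolean outcomes and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    # Assumption: an undefined rate (no cases of that class) is counted as 0.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def adjusted_rates(groups):
    """Weight each group's rates by its share of the study population and sum."""
    total = sum(g["population"] for g in groups)
    adj_sens = adj_spec = 0.0
    for g in groups:
        sens, spec = confusion_rates(g["y_true"], g["y_pred"])
        weight = g["population"] / total
        adj_sens += weight * sens
        adj_spec += weight * spec
    return adj_sens, adj_spec


# Hypothetical example with two selection groups (numbers are made up).
example_groups = [
    {"population": 900, "y_true": [1, 1, 0, 0], "y_pred": [1, 0, 0, 0]},
    {"population": 100, "y_true": [1, 1, 0, 0], "y_pred": [1, 1, 1, 0]},
]
adj_sens, adj_spec = adjusted_rates(example_groups)
```

In the made-up example the large group dominates the result: its sensitivity of 0.5 and specificity of 1.0 receive a weight of 0.9, so the adjusted values are 0.55 and 0.95.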

Performance metrics

We compared the models with the code-model by measuring sensitivity, specificity and Youden's index (sensitivity + specificity − 1) [13]. For comparisons between models, we relied on the area under the receiver operating characteristic curve (AUC). The receiver operating characteristic (ROC) curve is created by plotting the true positive rate against the false positive rate at different classification thresholds. The AUC is the two-dimensional area under this curve. This curve could not be calculated for the code-model because the results from this model are dichotomous and do not involve any threshold. We therefore used AUC during the model training and Youden's index for the validation of the final model.
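Both metrics are simple to compute, as the following Python sketch shows. It is an illustration only and does not reproduce the authors' R code. The AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half), which is equivalent to the area under the ROC curve. The function names are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's index: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t]
    negatives = [s for t, s in zip(y_true, scores) if not t]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Applied to the values the paper reports for all patients, `youden_index(0.230, 0.906)` gives 0.136 for the top model at 30 days and `youden_index(0.314, 0.894)` gives 0.208 at 90 days, matching Table 6.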

Software and packages

We used R 3.5.1 for all statistics. We used the stats package for logistic regressions and the rms package (v.5.1–2) and the contrast function for calculating odds ratio and 95% confidence interval (CI) for age and LOS. The graphs were created using ggplot2 (v. 3.0.0). The packages used for the different models are available in the S1 Appendix.

Results

One third of the participants in the study cohort were treated due to hip fractures (acute group) and two thirds due to degenerative joint disease (elective group). The acute patient group consisted of more women, with a higher median age and longer LOS (orthopaedic admission with following rehabilitation admission) than the elective patients (Table 4). There were no large differences in median age, AE proportion, gender, median LOS or acute or elective operation between the training and holdout set (Table 5).

Table 4. Demographics.

| Characteristic | All patients (n = 1 998) | Acute group (n = 667) | Elective group (n = 1 331) |
| --- | --- | --- | --- |
| Female, n (%) | 1 250 (62.6) | 444 (66.6) | 806 (60.6) |
| Male, n (%) | 748 (37.4) | 223 (33.4) | 525 (39.4) |
| Age, median (IQR) | 77 (68–84) | 84 (79–89) | 73 (64–80) |
| LOS, median (IQR) | 7 (4–13) | 14 (9–20) | 5 (4–8) |
| Type of hospital, n (%) | | | |
| University | 630 (31.5) | 295 (44.2) | 335 (25.2) |
| Central county council | 556 (27.8) | 180 (27.0) | 376 (28.2) |
| County council | 531 (26.6) | 109 (16.3) | 422 (31.7) |
| Private | 281 (14.1) | 83 (12.4) | 198 (14.9) |

LOS, Length of stay; IQR, Interquartile range.

Note: Because the sample is weighted, the values are not representative of average Swedish orthopaedic care.

Table 5. Demographics for the training and holdout set.

| Characteristic | Training set (n = 1 332) | Holdout set (n = 666) |
| --- | --- | --- |
| Age, median (IQR) | 77 (67–84) | 78 (69–85) |
| AEs, n (%) | 781 (58.6) | 390 (58.6) |
| Female, n (%) | 840 (63.1) | 410 (61.6) |
| Male, n (%) | 492 (36.9) | 256 (38.4) |
| LOS, median (IQR) | 6 (4–9) | 6 (4–8) |
| Acute, n (%) | 438 (32.9) | 229 (34.4) |
| Elective, n (%) | 894 (67.1) | 437 (65.6) |

LOS, Length of stay; IQR, Interquartile range.

Training results

The performance difference between the models was negligible. The AUCs from training were similar for all the models. Most models had a slightly higher AUC when we included ICD-codes than without codes (Fig 2). For the three different configurations of neural networks, no configuration was superior, and all neural networks had inferior performance compared to the traditional machine learning models (Fig 3).

Fig 2.

Fig 3.

Best performing model results

The best performing model, logistic regression with four natural age splines (henceforth: top model), had higher sensitivity and an equal or higher Youden's index than the code model at both 30 and 90 days when tested on the acute, elective and all patients, at the cost of a somewhat lower specificity (Table 6). We started with all variables in the model and then removed them one by one. The best performance was observed using length of stay at the orthopaedic department, discharge to acute care, age, number of readmissions and ED visits.

Table 6. Results comparing the reference code model with the top performing logistic regression model.

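| Patient group | Model | Sensitivity (30 days) | Specificity (30 days) | Youden's index (30 days) | Sensitivity (90 days) | Specificity (90 days) | Youden's index (90 days) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All patients | Code model | 0.054 | 0.942 | -0.005 | 0.164 | 0.915 | 0.079 |
| | Top model | 0.230 | 0.906 | 0.136 | 0.314 | 0.894 | 0.208 |
| Acute patients | Code model | 0.107 | 0.865 | -0.028 | 0.277 | 0.767 | 0.044 |
| | Top model | 0.319 | 0.779 | 0.098 | 0.464 | 0.758 | 0.221 |
| Elective patients | Code model | 0.035 | 0.976 | 0.010 | 0.050 | 0.962 | 0.012 |
| | Top model | 0.073 | 0.937 | 0.010 | 0.090 | 0.937 | 0.028 |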

The precision was higher for all patients than for both acute and elective patients. We analysed the relative importance of the variables in the top model and readmission and number of ED visits were the two most important variables (Table 7). We weighted the top model to include more negative cases to match the specificity of the code-based model. This way the precision could be compared using the sensitivity. We also tried the top model with ICD-codes but this weakened the results instead of improving them.

Table 7. Importance of the variables in the logistic regression model.

| Variable | Estimate | Standard error | z value | Pr(>|z|) |
| --- | --- | --- | --- | --- |
| Intercept | -0.649 | 0.940 | -0.690 | 0.490 |
| LOS | 0.058 | 0.022 | 2.629 | 0.009 |
| Readmissions | 0.567 | 0.077 | 7.361 | <0.001 |
| ED visits | 0.685 | 0.117 | 5.846 | <0.001 |
| Discharge to acute care | 1.854 | 0.677 | 2.738 | 0.006 |
| Age spline 1 | -1.102 | 0.857 | -1.286 | 0.198 |
| Age spline 2 | -0.427 | 0.616 | -0.694 | 0.488 |
| Age spline 3 | -0.635 | 1.951 | -0.325 | 0.745 |
| Age spline 4 | 0.496 | 0.916 | 0.542 | 0.588 |

LOS, Length of stay; ED, emergency department.
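The estimates in Table 7 are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The Python sketch below does this for the non-spline terms and adds a Wald-type 95% interval from the reported standard errors. It is an illustration only: the authors derived their odds ratios with the rms package in R, the helper name is hypothetical, and reading the LOS, readmission and ED visit coefficients as per-day, per-readmission and per-visit effects is an assumption.

```python
import math

# Estimates and standard errors copied from Table 7 (log-odds scale).
# The "per day" / "per readmission" / "per visit" units are assumed.
coefficients = {
    "LOS (per day)": (0.058, 0.022),
    "Readmissions (per readmission)": (0.567, 0.077),
    "ED visits (per visit)": (0.685, 0.117),
    "Discharge to acute care": (1.854, 0.677),
}


def odds_ratio_with_ci(estimate, standard_error, z=1.96):
    """Return (odds ratio, lower, upper) using a Wald-type interval."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * standard_error),
        math.exp(estimate + z * standard_error),
    )


odds_ratios = {
    name: odds_ratio_with_ci(beta, se) for name, (beta, se) in coefficients.items()
}
```

Under these assumptions the point estimates correspond to odds ratios of roughly 1.06 for LOS, 1.76 for readmissions, 1.98 for ED visits and 6.39 for discharge to acute care. The age spline terms are left out because their basis functions are not given in the text and cannot be interpreted one coefficient at a time.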

Other analyses

An increased LOS was associated with an increased risk of having a registered AE (Fig 4). Also increased age was associated with an increased risk of having a registered AE (Fig 5).

Fig 4.

Fig 5.

Discussion

Key results

Our alternative model without any ICD-codes outperformed the reference code-based model. It was able to attain a similar specificity while having 2–3 times the sensitivity of the code-based model. The strongest indicators for the occurrence of an AE were the number of readmissions and ED visits.

We found that the risk of a registered AE increases with longer LOS and increased age. LOS is naturally dependent on how the healthcare is organized. In Sweden, the median LOS for hip fracture patients in the orthopaedic ward is 7 days, after which the patient is transferred to a geriatric ward, a nursing home, or home with or without home healthcare or social care. We used LOS for the orthopaedic stay and not the combined LOS of the orthopaedic and geriatric stay because this improved the model accuracy. This is logical considering that most AEs occurred during the orthopaedic stay. In other healthcare systems the patient may stay for a shorter time in the orthopaedic ward and be discharged to step-down facilities, or for a longer time if the rehabilitation is done in the orthopaedic ward. In these systems, the occurrence of an AE may not affect LOS as much as in the Swedish system.

There was barely any performance difference between the tested models. We suspect that this is because the amount of information in the administrative data is limited and can be adequately captured using standard statistical models. This was supported by the fact that the most complex models, such as neural networks, performed even worse than the simpler models.

Depending on the purpose of the top model, it can be adjusted to gain a higher sensitivity or specificity by changing the cut point or the case weights. A model for economic purposes might be adjusted to elevate the specificity to ensure a low false positive rate.
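The effect of moving the cut point can be shown with a small Python sketch. The predicted probabilities and outcomes below are made up for illustration and do not come from the study, and the function name is hypothetical. A case is classified as an AE when its predicted probability is at or above the cut point.

```python
def rates_at_cut_point(y_true, probabilities, cut_point):
    """Return (sensitivity, specificity) when predicting AE for p >= cut_point."""
    y_pred = [p >= cut_point for p in probabilities]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: three patients with an AE (1) and three without (0).
example_outcomes = [1, 1, 1, 0, 0, 0]
example_probabilities = [0.9, 0.6, 0.3, 0.7, 0.4, 0.1]

low_cut = rates_at_cut_point(example_outcomes, example_probabilities, 0.2)
mid_cut = rates_at_cut_point(example_outcomes, example_probabilities, 0.5)
high_cut = rates_at_cut_point(example_outcomes, example_probabilities, 0.8)
```

In this toy example, raising the cut point from 0.2 to 0.8 lowers the sensitivity from 1.0 to one third while raising the specificity from one third to 1.0, which is the trade-off a user of the model would tune to the intended purpose.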

Strengths and limitations

The main strength of this study is the use of a large multi-centre data set with high quality data and, probably most important, that all the AEs were validated with RRR. RRR with GTT is the method that will detect most AEs [14–16], but it is still limited to the information recorded in the records. Also, RRR with GTT is both time and resource consuming. The variables used in the model are robust and easy to measure. An interesting finding is that the model with fewer variables performed better than when all variables were included. The variety of different AEs is wide; however, they seem to affect only a few variables found in administrative data. We interpret this as a sign that this dataset is not complex, and this is an explanation why the more advanced machine learning models did not outperform basic statistical models. If reimbursement to hospitals is based on short LOS and few readmissions instead of ICD-codes, this might stimulate hospitals to improve these variables, which will unlock resources for other patients. Compared to upcoding, the side effects of weighting in LOS are much more positive.

This study explores the use of many different machine learning methods. The use of neural networks was not more accurate than the other methods. Neural networks have become very popular in recent years, especially the use of convolutional neural networks for image classification [17]. The result of this study can be a healthy reminder that this method may not always be the best choice for all types of prediction and that it can be worthwhile to try different methods.

The use of a weighted sample has the advantage of recording many AEs with minimum record review and it will generate a dataset that is more balanced concerning the outcome. However, the results have to be adjusted to represent the results in the study population.

Notably, legislation on confidentiality also has to be considered when designing models to monitor AEs. Our study was delayed by the bureaucracy involved in acquiring all records in this national multi-centre study. The lack of a unanimous definition of what should be considered an AE hinders comparisons between studies and countries.

This study only includes limited patient demographic data (age and gender) and lacks some important characteristics such as comorbidities, smoking status and BMI, which are often found in the medical record but only sometimes in the administrative data.

Even though the accuracy of the top model is higher than the code model, it is limited and there is room for improvement. To improve it there is probably a need to add data beyond the NPR. The improvement might come from adding data that might correspond to certain individual AEs. One possibility would be to add data from the Swedish Prescribed Drug Registry that collects data on all prescribed drugs that are delivered to Swedish patients. If a patient is prescribed antibiotics or high-dose anticoagulants following surgery, this could be a proxy for infection or thrombosis that could be included and improve the model.

Interpretation

Risk adjusted prolonged length of stay (RAPLOS) as a measure for AEs following colon resection, coronary artery bypass graft and hip arthroplasty has been studied by Fry et al. [18]. The authors concluded that RAPLOS was a better measure for AEs than codes. However, Lyman et al. [19] studied RAPLOS as a measure for AEs following elective hip and knee arthroplasty surgery and concluded that RAPLOS was not superior to a measure based on ICD-codes. That study did not rely on RRR for measuring AEs, and our model uses more variables, which makes comparison difficult.

Generalizability

The administrative data used in the model is universal and easily available through hospital administration systems, which would enable use of the model worldwide. AEs cause prolonged LOS [20–23], readmissions are correlated with AEs [24], and unplanned readmission can be used as a proxy for AEs [25]. Based on this knowledge, a model based on LOS and readmissions is probably applicable to other types of surgery, and developing such models for other types of surgery could probably be done with fewer patients than in this study.

Conclusion

We conclude that a prediction model for AEs following hip arthroplasty surgery based on administrative data without ICD-codes is more accurate than a model based on ICD-codes. In addition to the accuracy, variables such as LOS, readmissions, gender and age are robust and objective. Therefore, they are not prone to bias in the way that ICD-codes can be. We consider this less-is-more model superior to ICD-code based models.

Patient involvement

This is a register and record-based retrospective study with no patient involvement.

Supporting information

S1 Appendix

(DOCX)

Acknowledgments

The authors thank Marie Ax, Susanne Hansson, Ammar Jobory, Zara Hedlund, Mirta Stupin, Tim Hansson, Lovisa Hult-Ericson and Christina Jansson for valuable help in carrying out the study. We would also like to thank all department managers for access to the medical records and Per Nydert for help with the study database.

Data Availability

There are legal restrictions on uploading the dataset. However, researchers interested in the dataset can contact forskning.ortopedkliniken@sll.se and will, after review and agreement to maintain patient confidentiality, be given access to the dataset. Because full anonymization is difficult, this restricted form of access is required. The Regional Ethical Review Board in Gothenburg: Regionala etikprövningsnämnden i Göteborg, Box 401, 405 30 Göteborg; Email: registrator@etikprovning.se; Phone: +4610-475 08 00.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Pivec R, Johnson AJ, Mears SC, et al. Hip arthroplasty. The Lancet 2012;380:1768–77. doi: 10.1016/S0140-6736(12)60607-2
  • 2. Wolf BR, Lu X, Li Y, et al. Adverse Outcomes in Hip Arthroplasty: Long-Term Trends. J Bone Joint Surg Am 2012;94:e103. doi: 10.2106/JBJS.K.00011
  • 3. Huddleston JI, Wang Y, Uquillas C, et al. Age and obesity are risk factors for adverse events after total hip arthroplasty. Clin Orthop Relat Res 2012;470:490–6. doi: 10.1007/s11999-011-1967-y
  • 4. Millstone DB, Perruccio AV, Badley EM, et al. Factors Associated with Adverse Events in Inpatient Elective Spine, Knee, and Hip Orthopaedic Surgery. J Bone Joint Surg-Am Vol 2017;99. doi: 10.2106/JBJS.16.00843
  • 5. Culler SD, Jevsevar DS, Shea KG, et al. The Incremental Hospital Cost and Length-of-Stay Associated With Treating Adverse Events Among Medicare Beneficiaries Undergoing THA During Fiscal Year 2013. The Journal of Arthroplasty 2016;31:42–8. doi: 10.1016/j.arth.2015.07.037
  • 6. Magnéli M, Unbeck M, Rogmark C, et al. Validation of adverse events after hip arthroplasty: a Swedish multi-centre cohort study. BMJ Open 2019;9:e023773. doi: 10.1136/bmjopen-2018-023773
  • 7. Pongpirul K, Robinson C. Hospital manipulations in the DRG system: a systematic scoping review. 2013;7:10.
  • 8. Bastani H, Goh J, Bayati M. Evidence of Upcoding in Pay-for-Performance Programs. Rochester, NY: Social Science Research Network; 2015. https://papers.ssrn.com/abstract=2630454 (accessed 25 Sep 2019).
  • 9. Swedish Association of Local Authorities and Regions (SALAR). Marker based record review to identify and measure harm in healthcare (in Swe: Markörbaserad Journalgranskning–för att identifiera och mäta skador i vården). Stockholm 2012. Stockholm: Sveriges kommuner och landsting; 2012.
  • 10. Griffin F, Resar R. IHI Global Trigger Tool for Measuring Adverse Events (Second Edition). Cambridge, MA: Institute for Healthcare Improvement; 2009.
  • 11. Magnéli M, Unbeck M, Samuelsson B, et al. Only 8% of major preventable adverse events after hip arthroplasty are filed as claims: a Swedish multi-center cohort study on 1,998 patients. Acta Orthop 2020;91:20–5. doi: 10.1080/17453674.2019.1677382
  • 12. Hommel A, Magnéli M, Samuelsson B, et al. Exploring the incidence and nature of nursing-sensitive orthopaedic adverse events: A multicenter cohort study using Global Trigger Tool. Int J Nurs Stud 2020;102:103473. doi: 10.1016/j.ijnurstu.2019.103473
  • 13. Youden WJ. Index for rating diagnostic tests. Cancer 1950;3:32–5.
  • 14. Unbeck M, Muren O, Lillkrona U. Identification of adverse events at an orthopaedics department in Sweden. Acta Orthop 2008;79:396–403. doi: 10.1080/17453670710015319
  • 15. Classen DC, Resar R, Griffin F, et al. ‘Global Trigger Tool’ Shows That Adverse Events In Hospitals May Be Ten Times Greater Than Previously Measured. Health Aff 2011;30:581–9. doi: 10.1377/hlthaff.2011.0190
  • 16. Naessens JM, Campbell CR, Huddleston JM, et al. A comparison of hospital adverse events identified by three widely used detection methods. Int J Qual Health Care 2009;21:301–7. doi: 10.1093/intqhc/mzp027
  • 17. Gordon M. Tech-trends in orthopaedics 2018. Acta Orthop 2018;89:475–6. doi: 10.1080/17453674.2018.1518806
  • 18. Fry DE, Pine M, Jones BL, et al. Adverse outcomes in surgery: redefinition of postoperative complications. The American Journal of Surgery 2009;197:479–84. doi: 10.1016/j.amjsurg.2008.07.056
  • 19. Lyman S, Fields KG, Nocon AA, et al. Prolonged Length of Stay Is Not an Acceptable Alternative to Coded Complications in Assessing Hospital Quality in Elective Joint Arthroplasty. J Arthroplasty 2015;30:1863–7. doi: 10.1016/j.arth.2015.05.019
  • 20. Grigor EJM, Ivanovic J, Anstee C, et al. Impact of Adverse Events and Length of Stay on Patient Experience After Lung Cancer Resection. Ann Thorac Surg 2017;104:382–8. doi: 10.1016/j.athoracsur.2017.05.025
  • 21. Hoogervorst-Schilp J, Langelaan M, Spreeuwenberg P, et al. Excess length of stay and economic consequences of adverse events in Dutch hospital patients. BMC Health Serv Res 2015;15. doi: 10.1186/s12913-015-1205-5
  • 22. Ricciardi R, Roberts PL, Read TE, et al. Which adverse events are associated with mortality and prolonged length of stay following colorectal surgery? J Gastrointest Surg 2013;17:1485–93. doi: 10.1007/s11605-013-2224-3
  • 23. Zhang Z, Mostofian F, Ivanovic J, et al. All grades of severity of postoperative adverse events are associated with prolonged length of stay after lung cancer resection. J Thorac Cardiovasc Surg 2018;155:798–807. doi: 10.1016/j.jtcvs.2017.09.094
  • 24. Nandan AR, Bohnen JD, Chang DC, et al. The impact of major intraoperative adverse events on hospital readmissions. Am J Surg 2017;213:10–7. doi: 10.1016/j.amjsurg.2016.03.018
  • 25. Minhas SV, Kester BS, Lovecchio FC, et al. Nationwide 30-Day Readmissions After Elective Orthopaedic Surgery: Reasons and Implications. The Journal for Healthcare Quality (JHQ) 2017;39:34. doi: 10.1097/JHQ.0000000000000045

Decision Letter 0

Susan Hepp

20 Aug 2020

PONE-D-20-09967

Developing and validating a model for measuring adverse events following hip arthroplasty surgery using administrative data without ICD-codes

PLOS ONE

Dear Dr. Magneli,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Your manuscript has been reviewed by two experts in the field, who request more information and clarification mostly about specifics on modelling.

Please submit your revised manuscript by Oct 03 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Susan Hepp

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. You state that "the patients did not provide an informed consent to the record review, which was accepted by the local ethics committee". Please clarify if the need for consent was waived by the ethics committee. Alternatively, please discuss whether all data were fully anonymized before you accessed them.

3. We suggest you thoroughly copyedit your manuscript for language usage, spelling, and grammar. If you do not know anyone who can help you do this, you may wish to consider employing a professional scientific editing service.

Whilst you may use any professional scientific editing service of your choice, PLOS has partnered with both American Journal Experts (AJE) and Editage to provide discounted services to PLOS authors. Both organizations have experience helping authors meet PLOS guidelines and can provide language editing, translation, manuscript formatting, and figure formatting to ensure your manuscript meets our submission guidelines. To take advantage of our partnership with AJE, visit the AJE website (http://learn.aje.com/plos/) for a 15% discount off AJE services. To take advantage of our partnership with Editage, visit the Editage website (www.editage.com) and enter referral code PLOSEDIT for a 15% discount off Editage services.  If the PLOS editorial team finds any language issues in text that either AJE or Editage has edited, the service provider will re-edit the text for free.

Upon resubmission, please provide the following:

  • The name of the colleague or the details of the professional service that edited your manuscript

  • A copy of your manuscript showing your changes by either highlighting them or using track changes (uploaded as a *supporting information* file)

  • A clean copy of the edited manuscript (uploaded as the new *manuscript* file)

4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

5. Your ethics statement must appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please also ensure that your ethics statement is included in your manuscript, as the ethics section of your online submission will not be published alongside your manuscript.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Review of the manuscript for PLOS ONE by Magnéli et al., developing and validating a model for measuring adverse events following hip arthroplasty surgery using administrative data without ICD-codes. The research group hypothesized that they could use administrative data independent of ICD-codes to create a new model with equal or better ability to measure adverse events after hip arthroplasty. They concluded that a prediction model for AEs following hip arthroplasty surgery based on administrative data without ICD-codes is more accurate than a model based on ICD-codes.

This is a well-written paper with a clear message. The research group did a tremendous job by combining the datasets of the SHAR and the National Patient Register, additionally using aggregated data from the National Board of Health and Welfare and performing retrospective record review using the Global Trigger Tool. They performed sound statistics with a large variety of testing models, which increases the strength of this research.

However, there are some minor points of attention.

What about missing data? What are the proportions of missing data for each variable? Do the medical files contain close to all variables for each patient? Is the missing data missing at random?

Several different models were tested, with rather similar outcomes on sensitivity and specificity. As you speculate in the discussion, this might be caused by the limited number of variables included. Could the model(s) be improved by adding any additional and often available variables? Which variables do you expect to improve the model most?

Length of stay at the orthopaedic department is an important variable in your best model, especially in the Swedish situation as you describe in your discussion. Can you elaborate on this ‘length of stay’ variable? Does it possibly contain several unmeasured factors bundled as 'length of stay' at the orthopaedic department? And therefore, how generalizable is your model to other countries?

Do you have any suggestions as how to use the results of your study to improve orthopaedic care? Can you elaborate on that?

I am pleased with Figure 1, which clearly shows the data and analysis steps. I would suggest using the same y-axis scale in figures 4 and 5 and clearly stating in the title that it concerns risk of adverse events associated with age/LOS. In my opinion figure 6 is not so informative and these results could be mentioned in the text of the Results section (and/or as a figure in an appendix).

Reviewer #2: I would like to thank the editorial office for giving me the chance to review

"Developing and validating a model for measuring adverse events following hip arthroplasty surgery using administrative data without ICD-codes" by Magnéli et al.

I interpret this study as an exploratory (diagnostic) study on retrospective data.

The overall idea with the study is of utmost importance, and I would recommend acceptance of the manuscript.

I do however have comments to the manuscript, which I believe would improve the readability.

General comments:

It is a very technical paper, and it is hard to follow the red thread from introduction to conclusion. Many models are evaluated on the same data set and commented on throughout the manuscript, but this study is not about the performance of different AI models; it is about whether administrative data captures AE better than ICD codes (to my understanding, or am I missing something?). This also needs to be reflected in the title: is it model development, administrative data vs. ICD testing, or perhaps both that you look into?

I would suggest revision of the entire manuscript with focus on a single chosen model (the best performing) and simply explain, perhaps in an appendix, that the model was chosen based on sensitivity analysis of many models, and explain the models in the appendix. And then focus the manuscript on this model's performance against the gold standard of an ICD code model, which is what I believe is the authors' purpose?

The statistical modeling is complex, and beyond the abilities of this reviewer to evaluate in specifics. A note on the experience of the author group in DL modeling would thus be nice. And also a passage on how easily the model can be constructed for use in other healthcare systems?

It is not clear why the individual administrative data was selected for modelling. Were they simply available, or did the authors reflect a priori on this? This needs to be addressed.

I would like the authors to state their opinion on what is next: is this model only applicable to the Swedish system, or should it be validated locally before application? And what is the clinical perspective of the model? Is it only for use in administrative settings, such as comparison between centers, or can clinicians locally or nationally benefit from this model?

A discussion of whether the lower specificity of the new model compared to the ICD model is actually beneficial in the context of use by administrators to oversee surgeons.

Do the authors plan to perform a validation of the model in a low-prevalence AE group as well?

Specific comments:

line 36: must be 30 days instead of 90 days?

Line 69: I do not like the word "hypothesize" in a retrospective, exploratory study

Line 94-106 it is very difficult to comprehend how and why the groups were constructed, and to what extent this influences the outcome. Why did you want to increase the probability of AE in the sample? Other causes than just for ease of journal review? Also, later on, discuss how the prevalence of AE in the selected sample influences the accuracy of your models. It is a highly selected dataset, which could influence the conclusion. I do not understand aggregated LOS data. So not individual LOS data? You discuss this later in the text but it is not easily understood.

line 98 add "on a dedicated sample of patients" after surgery and before the period sign.

line 112: "suffering" needs to be defined. In general this is a very loose description of AE, which makes it very difficult for others to replicate your study if desired. Needs to be more clearly stated.

line 113-114. - please define "inevitable consequences". See line 112 above.

line 115-117 why is the definition of index admission placed under AE definition subheading?

line 123 & line 130 I do not understand the difference in the context of this study. Why is this necessary?

Line 134 why is LOS trends valuable information in this study?

Line 161-162 rephrase to a more scientific description than "we tried to"....

Line 204-205 Does this not mean that we should use BOTH administrative data AND ICD codes in future models, instead of only either? Needs to be discussed.

Line 220 Only 2/3 were elective, non-fracture patients. Needs to address the consequences - sensitivity analysis. : line 220 "The precision was higher for all patients than for both acute and elective patients." what does this mean for your model and for external validity?

Line 226-227: Needs to be discussed in the discussion section as to potential pitfalls of the model.

Line 248 "We found that the risk for sustaining an AE increased with longer LOS" - I would not use the words "risk for sustaining". I would say that the risk of having a registered AE occur increases with longer LOS. Sustaining an AE during admission will lead to longer LOS; LOS does not increase the risk of sustaining an AE during admission in your data set. This also follows from line 253, which states: This is logical considering that most AE occurred during the orthopaedic stay.

line 268 "the AEs were collected with RRR." Could AEs occur without being registered. I would use term "validated" instead of collected, and add sentence about missing potential AEs.

line 262 and line 269 "An interesting finding is that the model with less variables performed better than when all variables were included. ". You place this under strengths of your study, but you lack any further discussion of this. This goes as well for the complexity in the model, which gives a worse performance. These two findings have also been reported by others in similar AI fields (Lauritsen et al. 2020 Artificial Intelligence in Medicine.). The manuscript would benefit from a discussion of this.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Jeppe Lange

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

 

PLoS One. 2020 Nov 5;15(11):e0242008. doi: 10.1371/journal.pone.0242008.r002

Author response to Decision Letter 0


21 Sep 2020

Editor's comments

Comment: Please provide additional details regarding participant consent. You state that "the patients did not provide an informed consent to the record review, which was accepted by the local ethics committee". Please clarify if the need for consent was waived by the ethics committee. Alternatively, please discuss whether all data were fully anonymized before you accessed them.

Answer: The need for written consent was waived by the ethics committee. We have added this in the manuscript.

C: We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly.

A: There are legal restrictions on uploading the dataset. However, researchers interested in the dataset can contact forskning.ortopedkliniken@sll.se and will, after review and an agreement to maintain patient confidentiality, be given access to the dataset. Because full anonymization is difficult, this restricted form of access is required. The Regional Ethical Review Board in Gothenburg: Regionala etikprövningsnämnden i Göteborg Box 401, 405 30 Göteborg; Email: registrator@etikprovning.se; Phone: +4610-475 08 00.

C: Your ethics statement must appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please also ensure that your ethics statement is included in your manuscript, as the ethics section of your online submission will not be published alongside your manuscript.

A: We have moved the ethics statement to the methods section.

Reviewer 1

1 Question: What about missing data? What are the proportions of missing data for each variable? Do the medical files contain close to all variables for each patient? Is the missing data missing at random?

Answer: Thank you for this important question. We had no missing data for the variables used in the model (LOS, discharge to acute care, age, number of readmissions and ED visits) since these variables came from the national patient registry, which is very complete. All these variables were also available in the medical records. In the aggregated data there were a few missing data points (n=10). The explanation is not incompleteness of the registry data; rather, some younger patients were the only patient in their age/sex/hospital-type group during that specific year. These patients can therefore be considered outliers in age and not completely missing at random. However, since the patients with missing data were only a small fraction (0.5%), we decided to exclude them in the initial model training and testing. We added these patients back when we discovered that the aggregated data variables did not improve the model.

One patient had no available medical record (missing at random) and one did not have arthroplasty surgery (but had osteosynthesis for a hip fracture), and we assume that the latter was incorrectly registered in the SHAR. Both patients were excluded from the study. For the AE data (outcome data) we used retrospective record review (RRR), which is considered to be the most reliable method for measuring AEs. Of course, even RRR is limited by the information in the medical records.
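The per-variable proportions asked about in this question can be tabulated with a few lines of code. The following is only a minimal sketch, not the authors' procedure: the record layout and the variable names (`age`, `los`, `aggregated_los`) are hypothetical, and the toy cohort simply mirrors the figures quoted above (10 patients with a missing aggregated value, out of the 1 998 patients stated in the abstract).

```python
def missing_proportions(records, variables):
    """Return, per variable, the share of records whose value is missing (None)."""
    total = len(records)
    if total == 0:
        raise ValueError("no records")
    return {
        var: sum(1 for r in records if r.get(var) is None) / total
        for var in variables
    }

# Hypothetical toy cohort: 1 998 patients, 10 of whom lack the aggregated value.
cohort = [{"age": 70, "los": 4, "aggregated_los": 5.0} for _ in range(1988)]
cohort += [{"age": 35, "los": 3, "aggregated_los": None} for _ in range(10)]

shares = missing_proportions(cohort, ["age", "los", "aggregated_los"])
for name, share in shares.items():
    print(f"{name}: {share:.1%} missing")
```

With these assumed counts, 10 of 1 998 rounds to the 0.5% mentioned in the answer.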

2 Q: Several different models were tested, with rather similar outcomes on sensitivity and specificity. As you speculate in the discussion, this might be caused by the limited number of variables included. Could the model(s) be improved by adding any additional and often available variables? Which variables do you expect to improve the model most?

A: There might be different ways of improving the AE measure for certain AEs. If a patient is prescribed drugs like antibiotics or high-dose anticoagulants, this could be a proxy for infection or thrombotic events. One way to extract this proxy is from the Swedish Prescribed Drug Registry, which contains data on all prescribed drugs delivered to patients from Swedish pharmacies. This could be a possible addition to our model.

3 Q: Length of stay at the orthopaedic department is an important variable in your best model, especially in the Swedish situation as you describe in your discussion. Can you elaborate on this ‘length of stay’ variable? Does it possibly contain several unmeasured factors bundled as 'length of stay' at the orthopaedic department? And therefore, how generalizable is your model to other countries?

A: This is a very important concern, and we would like to elaborate on our thoughts. The bundling of multiple factors has been one of our central hypotheses when planning this study. Length of stay is a complex factor with multiple causes, and we believe that in some countries a minimum stay is still required for reimbursement. In the vast majority of countries, however, length of stay is kept to an absolute minimum, as hospital beds are expensive and an obvious place where money can be saved. This development has continued since the onset of the study, and today a stay beyond the standard number of days is most likely even more strongly associated with some kind of unmeasured factor that has been bundled into this single measure. The benefits of length of stay are that (1) it is highly available in existing data collection and (2) it is difficult to manipulate. Attempting to retrieve the possibly unmeasured factors is most likely only feasible as a research project, while length of stay is already available on a nationwide scale. We believe that this effect of length of stay is not specific to the Swedish setting and that the results should be highly generalizable, although the model may need to be fine-tuned for each country. Fine-tuning is most likely also required for the Swedish setting, as the mean length of stay may have changed somewhat since data collection. Fine-tuning generally requires fewer data points, however, and could thus be feasible with a much smaller data set.

4 Q: Do you have any suggestions as how to use the results of your study to improve orthopaedic care? Can you elaborate on that?

A: Using easily accessible and reliable data on AE can improve the quality of care at hospitals. Swedish National Registers have practiced open reporting at hospital level for more than a decade. We believe this will initiate a competitive drive to be “best in class”, as long as participating centers actually trust the results.

Our model would penalize hospitals for lengthy admissions and readmissions, and we believe that the orthopedic care would be improved if both of these were reduced:

(1) As adverse events most likely increase length of stay, the indirect effect should be that hospitals increase their efforts to reduce these as well as other reasons for prolonged stay. (2) Similarly, there will be an incentive to make sure that patients are optimally prepared for surgery, e.g. a diabetes patient with poor glycemic control will more likely be readmitted early on than one whose diabetes is under control.

5 Q: I am pleased with the Figure 1 showing clearly the data and analysis steps. I would suggest to use the same y-axis scale in figure 4 and 5 and clearly state in the title that it concerns Risk of adverse events associated with age/LOS.

A: Thank you for this constructive comment, we have adjusted the mentioned figures.

6 Q: In my opinion figure 6 is not so informative and these results could be mentioned in the text of the Results section (and/or as a figure in an appendix).

A: We agree that this figure might be superfluous and that the time period that the data covers is too short to spot any real trends and have decided to omit the LOS trends from the paper.

Reviewer #2:

General comments:

1 Question: It is a very technical paper, and it is hard to follow the red thread from introduction to conclusion. Many models are evaluated on the same data set and commented on throughout the manuscript, but this study is not about the performance of different AI models; it is about whether administrative data captures AE better than ICD codes (to my understanding, or am I missing something?). This also needs to be reflected in the title: is it model development, administrative data vs. ICD testing, or perhaps both that you look into?

Answer:

This is correct: the primary aim of the paper is to show that we do not have to rely on ICD-codes for measuring adverse events at a hospital level. We agree that the paper has a technical feel to it, with a large set of models that can be somewhat overwhelming to readers. It is, however, also important that all these models perform similarly. This is what we would expect if there were a simple underlying truth, and it connects to the earlier point that we can use non-ICD administrative data to measure the adverse events. We have revised the text accordingly.

2 Q: I would suggest revision of the entire manuscript with focus on a single chosen model (the best performing) and simply explain, perhaps in an appendix, that the model was chosen based on sensitivity analysis of many models, and explain the models in the appendix. And then focus the manuscript on this model's performance against the gold standard of an ICD code model, which is what I believe is the authors' purpose?

A: See the answer above. We agree that focusing on a single model would perhaps be beneficial, but it would at the same time leave unanswered one of the central questions that most readers with a machine learning background will have: “does the data contain more complexity than meets the eye?”. As this paper will most likely be of most interest to an audience more heavily invested in statistics than the regular orthopedic surgeon, we would prefer to retain the current, slightly wider focus. We have revised the text accordingly, keeping only the most important details about the model development and moving the full technical description to an appendix.

3 Q: The statistical modeling is complex, and beyond the abilities of this reviewer to evaluate in specifics. A note on the experience of the author group in DL modeling would thus be nice. And also a passage on how easily the model can be constructed for use in other healthcare systems?

A: We have been working with machine learning, especially deep learning, since 2014. Prior to that, MG had been developing non-linear models for other complex statistical problems together with the Swedish Hip Arthroplasty Registry. Machine learning is a complex topic and there is a vast number of models that most clinicians will not be familiar with. The models in this paper were, however, rather basic, since the complexity of the input data is nowhere near that of our deep learning research. For example, a single image fed into our deep learning network for radiographs has 256 x 256 values, i.e. more than 60 000 values, while here we had only 16 values to work with. This was expected. Put simply, we used the models to evaluate whether there were (1) higher-level interactions and (2) advanced non-linearities. As we mentioned earlier, the data behaved in a rather straightforward manner without much complexity, and thus the statistical model that we suggest is a slightly more advanced logistic regression.
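The size comparison in this answer is simple arithmetic, sketched below in Python for concreteness. The two input sizes are taken from the answer itself; the count of pairwise interactions is an added illustration of how small the search space for interactions is with 16 inputs, and is not a figure reported by the authors.

```python
from math import comb

image_inputs = 256 * 256   # values in one radiograph fed to the deep learning network
tabular_inputs = 16        # values available per patient in this study

print(image_inputs)                    # total pixel values per image
print(image_inputs // tabular_inputs)  # how many times more inputs an image has
print(comb(tabular_inputs, 2))         # possible pairwise interactions among 16 inputs
```

An image therefore carries several thousand times as many input values as one patient record here, which is consistent with the authors' point that a comparatively simple model can be sufficient.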

4 Q: It is not clear why the individual administrative data was selected for modelling? Were they simply available or did the authors reflect a priori on this. Needs to be addressed.

A: The choice of individual administrative data was made on purpose. We never considered using non-individual data, e.g. averages over large groups, as we have individual data in our national registries. Furthermore, when designing the study, we were aware of the limitations of diagnostic codes: for example, faulty coding, a multitude of possible codes for common AEs such as renal failure, difficulty separating codes given for unchanged chronic conditions from codes given for an acute deterioration, and DRG creep. Therefore, we looked at which more reliable variables were available for all patients in easily accessible registers and designed our models accordingly.

5 Q: I would like if the authors state their opinion on what is next - is this model only applicable to Swedish system or should it be validated locally before application. And what is the clinical perspective of the model, is it only for use in administrative settings - such as comparison between centers, or can clinician locally or nationally benefit from this model?

A: We agree that a clinical perspective should always guide the researcher. Please see our answer to Reviewer 1 on his 3rd and 4th question. We are hoping that the concept of the model, possibly with some fine-tuning, can be implemented at a national level and allow us to evaluate in a few years whether the number of adverse events drops.

6 Q: A discussion of whether the lower specificity of the new model compared to the ICD model is actually beneficial in the context of use by administrators to oversee surgeons.

A: Yes, the slightly lower specificity could be beneficial, as surgeons will not know exactly what will affect the length of stay and readmissions and thus will have to apply a wider battery of improvements than just targeting single ICD-codes. We also hope that decoupling ICD-codes from penalties/reimbursements will improve surgeons' ICD-coding over time, as this will help them to understand addressable causes of increased length of stay.
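To make the sensitivity/specificity trade-off concrete, the sketch below converts the 90-day figures reported in the article abstract (31%/89% for the administrative model, 16%/92% for the ICD-code based model) into expected counts per 1 000 patients. The prevalence value is a purely hypothetical assumption chosen for illustration, and the function name is not from the study.

```python
def expected_flags(n_patients, prevalence, sensitivity, specificity):
    """Expected true/false positives and positive predictive value."""
    with_ae = n_patients * prevalence
    without_ae = n_patients - with_ae
    true_pos = with_ae * sensitivity
    false_pos = without_ae * (1 - specificity)
    flagged = true_pos + false_pos
    ppv = true_pos / flagged if flagged else 0.0
    return {"true_positives": true_pos, "false_positives": false_pos, "ppv": ppv}


# Hypothetical AE prevalence, NOT a figure from the study.
ASSUMED_PREVALENCE = 0.20

# Sensitivity and specificity for AEs within 90 days, from the abstract.
admin_model = expected_flags(1000, ASSUMED_PREVALENCE, 0.31, 0.89)
icd_model = expected_flags(1000, ASSUMED_PREVALENCE, 0.16, 0.92)

for name, result in (("administrative", admin_model), ("ICD-based", icd_model)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under this assumed prevalence the administrative model flags more patients without an AE, but it also finds roughly twice as many true AEs. The balance shifts with prevalence, so the numbers should be recomputed for any real setting.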

7 Q: Do the authors plan to perform a validation of the model in a low prevalent AE group as well?

A: This is an excellent suggestion. However, the record review is very time consuming and involves a lot of administrative work that none of the authors have any plans to endure once more. We have presented the model to the Swedish Board of Health and Welfare and they might have the capacity to validate the model.

Specific comments:

8 Question: line 36: must be 30 days instead of 90 days?

Answer: Thank you for pointing this error out. It should say 30 days, we have revised the abstract accordingly.

9 Q: Line 69: I do not like the word "hypothesize" in a retrospective, exploratory study

A: We agree that this wording might be reserved for prospective studies. We have rephrased the sentence.

10 Q: Line 94-106 it is very difficult to comprehend how, and why the groups were constructed, and to what extent this influences the outcome. Why did you want to increase the probability of AE in the sample? Other causes than just for ease of journal review?

A: This is due to the improved ability to build models when the outcome is balanced. If we had only 5% adverse events, our models would be unable to beat the most basic model that always suggests “no adverse event”. By balancing the number of adverse events, we could review fewer charts and test more advanced models.
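The point about the "most basic model" can be illustrated with a toy cohort in which 5% of patients have an adverse event. The data below are synthetic and the helper name is hypothetical; this is a sketch of the general class-imbalance argument, not of the study's pipeline.

```python
def summarize(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = AE)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Synthetic cohort: 5 patients with an AE, 95 without (5% prevalence).
labels = [1] * 5 + [0] * 95

# The trivial model that always predicts "no adverse event".
always_no_ae = [0] * len(labels)

baseline = summarize(labels, always_no_ae)
print(baseline)
```

The trivial model reaches high accuracy while detecting no adverse events at all, which is why accuracy alone is a weak yardstick on unbalanced data and why a sample enriched for AEs makes model comparison more informative.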

11 Q: Also later on discuss how the prevalence of AE in the selected sample influence the accuracy of your models. It is a highly selected dataset which could influence the conclusion. I do not understand Aggregated LOS data? So not individual LOS data?? You discuss later in text but it is not easily understood.

A: The aggregated data came from the Swedish National Board of Health and Welfare. It consists of the different percentiles of LOS stratified by type of hospital, sex, fracture and year. For each patient we added an aggregated LOS; for example, a 90-year-old female with a fracture, treated in a university hospital in 2011, had a median LOS of x days. These variables were not used in the final model.
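The idea of attaching a stratum-level LOS to each patient can be sketched as follows. In the study the aggregates were supplied by the Swedish National Board of Health and Welfare; here they are computed from a handful of made-up records purely to show the mechanics. All field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Made-up patient records; not study data.
records = [
    {"hospital_type": "university", "sex": "F", "fracture": True, "year": 2011, "los_days": 7},
    {"hospital_type": "university", "sex": "F", "fracture": True, "year": 2011, "los_days": 9},
    {"hospital_type": "university", "sex": "F", "fracture": True, "year": 2011, "los_days": 12},
    {"hospital_type": "county", "sex": "M", "fracture": False, "year": 2012, "los_days": 3},
    {"hospital_type": "county", "sex": "M", "fracture": False, "year": 2012, "los_days": 4},
]


def stratum_key(record):
    """Stratum defined by type of hospital, sex, fracture and year."""
    return (record["hospital_type"], record["sex"], record["fracture"], record["year"])


def median_los_by_stratum(rows):
    """Median LOS within each stratum."""
    groups = defaultdict(list)
    for row in rows:
        groups[stratum_key(row)].append(row["los_days"])
    return {key: median(values) for key, values in groups.items()}


def attach_aggregate(rows, medians):
    """Return copies of the rows with the stratum median added as a new field."""
    return [dict(row, stratum_median_los=medians[stratum_key(row)]) for row in rows]


medians = median_los_by_stratum(records)
enriched = attach_aggregate(records, medians)
print(enriched[0])
```

Other percentiles could be added in the same way, for instance with `statistics.quantiles`, giving each patient a reference value against which their own LOS can be compared.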

12 Q: line 98 add "on a dedicated sample of patients" after surgery and before the period sign.

A: We apologize, but we cannot find the word surgery in line 98, could this suggestion be for another line?

13 Q: line 112: "suffering" needs to be defined. In general this is a very loose description of AE, which makes it very difficult for others to replicate your study if desired. Needs to be more clearly stated.

A: The word suffering is included in the national patient safety terminology developed by the Swedish National Board of Health and Welfare. The Swedish adverse event definition, according to the National Board of Health and Welfare, is suffering, physical or mental harm or disease as well as death affecting a patient. If adverse events, i.e. harm, occur in a patient, there is in most cases also suffering involved. If, for example, a patient is affected by a deep wound infection with a long hospital stay and reoperations, there is inevitably suffering involved along with the physical harm. Suffering is closely connected to the physical harm or disease, and death, parts of the definition used in our study.

The adverse event definition used in this study is based on national definitions and laws (Patient Safety Act, SFS 2010:659 and Management System for Quality and Patient safety in Healthcare, SOSFS 2005:12) and on those from many other studies that use a structured record review method. In Sweden nearly all acute care hospitals have carried out Global Trigger Tool reviews on a monthly basis since 2013 (around 100 000 reviews so far), and they use nearly the same definition as we have in this study, so we believe that the study can be replicated by others.

14 Q: line 113-114. - please define "inevitable consequences". See line 112 above.

A: The word inevitable is also based on the definition in the law “Management System for Quality and Patient safety in Healthcare”, SOSFS 2005:12, but also from the one recommended by WHO in their term Healthcare-associated harm, and the one used in the Harvard Medical Practice Study (HMPS) and its subsequent nation-wide studies around the world. The HMPS record review methodology was used in the national adverse event study carried out by the Swedish National Board of Health and Welfare in 2007. It means that the harm is associated with healthcare-related omissions or commissions rather than an underlying disease, treatment or injury.

15 Q: line 115-117 why is the definition of index admission placed under AE definition subheading?

A: Thank you for pointing this out. We agree and have revised the subheading to simply "Definitions".

16 Q: line 123 & line 130 I do not understand the difference in the context of this study. Why is this necessary?

A: Please see our answer concerning the aggregated data.

17 Q: Line 134 why is LOS trends valuable information in this study?

A: LOS has been decreasing steadily in Sweden and is now probably about as short as it can be (or at least in some form of plateau phase). A patient who sustains an AE will most likely stay longer (this is truer for serious AEs), and because of the compressed LOS this can be used as a proxy variable for AEs.

The idea behind the LOS trends was to highlight the decreasing LOS trend in the available data. However, since we only had data available for three years, this might be too short a period to spot trends. We have decided to omit the LOS trends from the paper.

18 Q: Line 161-162 rephrase to a more scientific description than "we tried to"....

A: Thank you for this improving remark, we have rephrased according to your suggestion.

19 Q: Line 204-205 Does this not mean that we should use BOTH administrative data AND ICD codes in future models, instead of only either? Needs to be discussed.

A: When the ICD codes were included as a variable in the model, the accuracy was lower than with only the administrative variables. A plausible explanation is that the ICD code model has a very low sensitivity, and the AE cases that are correctly identified by the codes are severe cases that also have a prolonged LOS and readmissions; the codes will thus only add noise and not strengthen the model. Our recommendation is that only administrative data be used.

20 Q: Line 220 Only 2/3 were elective, non-fracture patients. Needs to address the consequences - sensitivity analysis. : line 220 "The precision was higher for all patients than for both acute and elective patients." what does this mean for your model and for external validity?

A: We conclude that this effect is due to the size of the data. The model trained on the whole training set has higher accuracy than the two models trained on the two subsets. The two cohorts apparently have more similarities than differences, and therefore the models trained on 1/3 and 2/3 of the data will have lower accuracy.

21 Q: Line 226-227: Needs to be discussed in the discussion section as to potential pitfalls of the model.

A: Please see our answer above, concerning ICD codes included in the model.

22 Q: Line 248 "We found that the risk for sustaining an AE increased with longer LOS" - I would not use the word "risk of sustaining". I would use risk for having an registered AE occur increases with longer LOS. Sustaining an AE during admission will lead to longer LOS, LOS does not increase the risk of sustaining an AE during admission in your data set. also since line 253 states: This is logical considering that most AE occurred during the orthopaedic stay.

A: Thank you for this valuable comment. We agree that it is the AE that prolongs the LOS and not the prolonged LOS that increases the risk of an AE, and this is what we tried to formulate. We have rephrased the wording according to your suggestion.

23 Q: line 268 "the AEs were collected with RRR." Could AEs occur without being registered. I would use term "validated" instead of collected, and add sentence about missing potential AEs.

A: Thank you for this constructive remark. We agree that RRR can only catch the AEs that are mentioned in the medical record, and although most AEs are probably mentioned, we will never know how many are not. This is mentioned later in the discussion and we have now moved that sentence to follow the passage mentioned in your question. We have rephrased according to your suggestion.

24 Q: line 262 and line 269 "An interesting finding is that the model with less variables performed better than when all variables were included. ". You place this under strength of your study, but you lack any further discussion into this. This goes as well for the complexity in the model, which gives a worse performance. These two findings has also been found by others in similar AI fields (Lauritsen et al. 2020 Artificial Intelligence in Medicine.). The manuscript would benefit with a discussion of this.

A: Thank you for highlighting this. We agree that adding a comment on this in the discussion will improve this paper. We have revised accordingly.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Liza N Van Steenbergen

16 Oct 2020

PONE-D-20-09967R1

Measuring adverse events following hip arthroplasty surgery using administrative data without relying on ICD-codes

PLOS ONE

Dear Dr. Magneli,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 30 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Liza N. van Steenbergen

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript is adjusted according to my comments. I consider this manuscript ready for publication.

Reviewer #2: I agree with the authors in relation to my Q12. No further change needed in the revised manuscript.

In relation to line 121 in revised manuscript. I would ask the authors to add the definition of "suffering" to the text as defined in A to my Q13, and likewise with "inevitable consequences" as defined in A to Q14.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Liza van Steenbergen

Reviewer #2: Yes: Jeppe Lange

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0242008. doi: 10.1371/journal.pone.0242008.r004

Author response to Decision Letter 1


20 Oct 2020

Question: 6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript is adjusted according to my comments. I consider this manuscript ready for publication.

Reviewer #2: I agree with the authors in relation to my Q12. No further change needed in the revised manuscript.

In relation to line 121 in revised manuscript. I would ask the authors to add the definition of "suffering" to the text as defined in A to my Q13, and likewise with "inevitable consequences" as defined in A to Q14.

Answer: Thank you for this constructive comment. We have added the suggested definitions.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 2

Liza N Van Steenbergen

26 Oct 2020

Measuring adverse events following hip arthroplasty surgery using administrative data without relying on ICD-codes

PONE-D-20-09967R2

Dear Dr. Magneli,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Liza N. Van Steenbergen

Guest Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Liza N Van Steenbergen

28 Oct 2020

PONE-D-20-09967R2

Measuring adverse events following hip arthroplasty surgery using administrative data without relying on ICD-codes

Dear Dr. Magneli:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Liza N. Van Steenbergen

Guest Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    There are legal restrictions on uploading the dataset. However, researchers interested in the dataset can contact forskning.ortopedkliniken@sll.se and will, after review and agreement to maintain patient confidentiality, be given access to the dataset. Due to the difficulty of full anonymization, this restricted form of access is required. The Regional Ethical Review Board in Gothenburg: Regionala etikprövningsnämnden i Göteborg Box 401, 405 30 Göteborg; Email: registrator@etikprovning.se; Phone: +4610-475 08 00.


    Articles from PLoS ONE are provided here courtesy of PLOS
