Indian Journal of Orthopaedics
2023 Mar 2;57(5):653–665. doi: 10.1007/s43465-023-00845-2

Patient Perspectives on Artificial Intelligence in Healthcare Decision Making: A Multi-Center Comparative Study

Matthew W Parry 1,2, Jonathan S Markowitz 5, Cara M Nordberg 3, Aalpen Patel 4, Wesley H Bronson 5, Edward M DelSole 1,2
PMCID: PMC9979110  PMID: 37122674

Abstract

Objective

To investigate patient opinions on the use of Artificial Intelligence (AI) in orthopaedics.

Methods

A total of 397 orthopaedic patients from a large urban academic center and a rural health system completed a 37-component survey querying patient demographics and perspectives on clinical scenarios involving AI. An average comfort score was calculated from thirteen Likert-scale questions (1, not comfortable; 10, very comfortable). Secondary outcomes requested a binary opinion on whether it is acceptable for patient healthcare data to be used to create AI (yes/no), the impact of AI on orthopaedic care (positive/negative) and on healthcare cost (increase/decrease), and whether patients would refuse healthcare if cost increased (yes/no). Bivariate and multivariable analyses were employed to identify characteristics that influenced patient perspectives.

Results

The average comfort score across the population was 6.4, with significant bivariate differences by age (p = 0.0086), gender (p = 0.0001), education (p = 0.0029), experience with AI/ML (p < 0.0001), survey format (p < 0.0001), and each of the four binary outcomes (p < 0.05). When controlling for age and education, multivariable regression identified significant relationships between comfort score and experience with AI/ML (p = 0.0018) and each of the four binary outcomes (p < 0.05). In the final multivariable model, gender, survey format, perceived impact of AI on orthopaedic care, and the decision to refuse care if it were to increase cost remained significantly associated with the average AI comfort score (p < 0.05). Additionally, patients were less comfortable undergoing surgery performed entirely by a robot under distant physician supervision than under close supervision.

Conclusion

The orthopaedic patient appears comfortable with AI joining the care team.

Keywords: Artificial intelligence, Patient perspectives, Patient comfort, Multi-center

Introduction

Artificial intelligence (AI) refers to the engineering of intelligent machines [1]. It is used to describe both symbolic artificial intelligence, such as a computer that can play chess, and a subtype of AI known as machine learning (ML), which utilizes computer algorithms to learn a decision-making process from datasets [2, 3]. AI/ML have been applied with success in the fields of pharmaceutical development, diagnostic radiology, and oncology [4, 5]. AI has an increasingly promising role in the practice of orthopaedic surgery, with the anticipated impact being an improvement in patient care in areas such as diagnosis, management, research, and systems analysis [6].

At present, the patient opinion regarding the implementation of AI/ML into the healthcare team remains poorly defined. Only one center has previously published findings regarding patient perspectives on AI/ML implementation in orthopaedic surgery [7]; however, those results are based on a smaller sample size in a single clinical environment. The success of ML depends upon having large volumes of clean data upon which the machine is trained to interpret future data and predict an outcome [8]. The data for healthcare-related applications must come directly from patient care, and thus patients indirectly participate in the creation of accurate ML. Therefore, patient participation, either direct or indirect, is critical for the success of these endeavors.

This general lack of understanding of the patient perspective has major implications for the construction of ML models because patients witness media reports of technology companies amassing large volumes of patient data without consent [9]. Conduct of this nature may cause distrust in data-gathering efforts and prevent accurate data acquisition. This study attempts to characterize the general patient opinion and identify factors that influence these opinions concerning the implementation of AI/ML in orthopaedic surgery across multiple academic and community health centers.

Methods

A 37-component questionnaire was developed and approved by our institutional review boards. Initially, the survey was distributed in person to patients at an orthopaedic clinic using either paper and pencil or a tablet for electronic data collection (Table 1). This was interrupted by the onset of the COVID-19 pandemic. The survey was subsequently distributed electronically via REDCap survey software due to the temporary closure of our practice. Survey language was changed to eliminate a redundant question (Table 2). The survey was then delivered via e-mail to a random sample of 2101 orthopaedic surgery patients at a large rural integrated health system and 1684 orthopaedic surgery patients at a large, urban academic medical center. Survey invitations were secure and unique for each participant so that only one response could be recorded per patient. Patients younger than 18 years of age were not enrolled, and responses were omitted from analysis if age was unknown due to an incomplete questionnaire. All surveys were completed within an 18-month window from March 2020 to August 2021. Of note, participants were only aware that the survey was to assess their opinions of AI/ML in healthcare; no supplemental educational material was provided at the time of survey administration.

Table 1.

Thirteen questions for average AI comfort score calculation, administered by tablet survey or paper/pencil format*

ID On a scale of 1 to 10, how comfortable would you be with: Average SD n
1 A computer telling your surgeon the best treatment option for your condition 4.5 3.2 28
2 A computer telling your surgeon your predicted surgical outcome (i.e., good or bad prognosis)? 3.9 3.2 28
3 Your surgeon consulting a computer algorithm to determine your predicted surgical outcome (i.e., good or bad prognosis)? 4.2 3.0 27
4 Your surgeon using a robot to assist during your surgery 5.3 3.6 27
5 Your surgeon using artificial intelligence or machine learning to predict complications with your surgery 4.2 3.0 27
6 Your surgeon using artificial intelligence or machine learning to predict your chance of surgical failure/disease recurrence 4.0 3.1 27
7 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of a broken bone 4.0 3.2 27
8 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of a sports injury 4.1 3.0 27
9 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of a spine condition causing neck or back pain 3.8 3.0 27
10 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of spine or neck pain 3.9 3.2 26
11 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of shoulder, elbow, or hand pain 4.4 3.2 26
12 Undergoing surgery performed entirely by a robot, under the supervision of your surgeon, with your surgeon in the same room 3.8 3.3 25
13 Undergoing surgery performed entirely by a robot, under the supervision of your surgeon, with your surgeon outside the room and monitoring your operation from a distance 2.8 2.7 25

*This iteration of the survey was halted and then delivered electronically via REDCap due to the onset of the COVID-19 pandemic. Question 9 was identified as redundant with Question 10 and was replaced in the second iteration of the survey

Table 2.

Modified survey of average AI comfort score calculation, administered through REDCap

ID On a scale of 1 to 10, how comfortable would you be with: Average SD n
1 Your surgeon consulting a computer algorithm to diagnose your condition 6.9 2.5 365
2 A computer telling your surgeon the best treatment option for your condition 6.8 2.5 395
3 Your surgeon consulting a computer algorithm to determine your predicted surgical outcome (i.e., good or bad prognosis)? 6.8 2.6 391
4 Your surgeon using a robot to assist during your surgery 6.9 2.6 391
5 Your surgeon using artificial intelligence or machine learning to predict complications with your surgery 6.9 2.6 391
6 Your surgeon using artificial intelligence or machine learning to predict your chance of surgical failure/disease recurrence 6.8 2.7 392
7 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of a broken bone 6.7 2.8 379
8 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of a sports injury 6.7 2.7 378
9 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of hip or knee pain 6.9 2.6 355
10 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of spine or neck pain 6.8 2.7 356
11 Your surgeon using artificial intelligence or machine learning in your care if your primary reason for visiting was because of shoulder, elbow, or hand pain 6.8 2.8 380
12 Undergoing surgery performed entirely by a robot, under the supervision of your surgeon, with your surgeon in the same room 5.7 3.2 387
13 Undergoing surgery performed entirely by a robot, under the supervision of your surgeon, with your surgeon outside the room and monitoring your operation from a distance 3.7 2.9 384

The primary outcome measure was the average level of comfort with AI/ML in the context of orthopaedic surgery, determined using 13 questions asking patients to rate their level of comfort with a clinical scenario on a Likert scale, with 1 being not comfortable at all and 10 being very comfortable. Each patient’s average response to the 13 questions was calculated, quantifying the average comfort level. A binary indicator was also created, with a value of 0 if the patient-level average was less than or equal to the cohort median and a value of 1 if greater than the median.
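The scoring procedure above can be sketched as follows. This is a minimal Python illustration (the study's analyses were performed in SAS), and the handling of blank items is an assumption, as the paper does not state its missing-item rule:

```python
from statistics import mean, median

def comfort_summaries(responses):
    """Average the 13 Likert items (1-10) for each patient, then split at
    the cohort median: 0 if at or below the median, 1 if above.
    `responses` is a list of per-patient lists of item scores; None marks
    a blank item and is dropped before averaging (an assumption)."""
    averages = [mean(r for r in patient if r is not None) for patient in responses]
    cohort_median = median(averages)
    indicator = [0 if a <= cohort_median else 1 for a in averages]
    return averages, cohort_median, indicator

# Hypothetical three-patient example: per-patient averages of 7, 4, and 9
# give a cohort median of 7 and indicator values 0, 0, 1
avgs, med, flags = comfort_summaries([[7] * 13, [4] * 13, [9] * 13])
```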

Secondary outcomes included categorical measures of perceived impact of AI in orthopaedic care, perceived impact of AI on healthcare costs, whether patients would refuse AI if it increased healthcare costs, and whether patients believe it is acceptable for a physician to profit from the distribution of personal health data to a third party for the purpose of developing AI.

Descriptive statistics were calculated for demographic data, orthopaedic data, technology use data, and data relating to patients’ perspectives on the use of AI in healthcare. For categorical predictors, bivariate analyses were performed with the continuous average AI comfort level outcome. Kruskal–Wallis and Student’s t tests were used for bivariate analyses where appropriate. The chi-square test was employed to compare the response rates from the two academic centers.

Generalized linear multivariable regression models were run to predict differences in average AI/ML comfort level. A base model was first built from demographic variables that were significantly associated with the outcome in bivariate analyses. Then, the remaining predictors that were significantly associated with the outcome in bivariate analyses were added one at a time to the base model. Those predictors that remained significantly associated with the outcome were added together in a final multivariable model.
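The model-building loop described above amounts to a simple forward-selection routine. The Python sketch below is illustrative only (the study used SAS), and `fit_p_value` is a hypothetical callback standing in for refitting the generalized linear model:

```python
ALPHA = 0.05  # significance threshold used throughout the study

def build_final_model(base, candidates, fit_p_value):
    """Add each candidate predictor to the demographic base model one at a
    time; retain those whose p value stays below ALPHA, then pool the
    survivors with the base model. `fit_p_value(predictors)` is a
    hypothetical callback that refits the regression and returns the
    p value of the last-added predictor."""
    kept = [c for c in candidates if fit_p_value(base + [c]) < ALPHA]
    return base + kept

# Hypothetical p values standing in for actual model refits
toy_p = {"ai_experience": 0.0018, "survey_format": 0.0001, "cell_use": 0.20}
final = build_final_model(
    ["age", "gender", "education"],
    list(toy_p),
    lambda predictors: toy_p[predictors[-1]],
)
# final pools the base model with ai_experience and survey_format;
# cell_use (p = 0.20) is dropped
```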

Bivariate analyses were also performed between the secondary outcomes and demographic, orthopaedic, and technology use predictors. Kruskal–Wallis and Student’s t test were performed where appropriate. For all secondary outcomes, associations with all other predictors were evaluated using chi-squared tests. A multivariable model was built for secondary outcomes using logistic regression.

A sensitivity analysis was conducted using a weighted analysis approach. Clinical site-specific response weights were calculated based on age, gender, and race. Within each site, response rates were calculated by dividing the cell count among responders by the cell count among all recruited patients for each age, gender, and race cross-tabulation cell. To reduce small cell sizes, gender was operationalized as female or not female, and race was operationalized as white or not white. Five age categories were used: 18–<45, 45–<55, 55–<65, 65–<75, and 75+. Response weights were calculated as the inverses of the response rates. Finally, site-specific relative response weights were calculated by dividing each response weight by the lowest observed response weight within the respective site, so that each site’s minimum relative response weight was 1.0. These relative response weights were used in the weighted analyses. All analyses were performed in SAS Enterprise Guide v8.2 [10]. All patients provided informed consent, and the study protocol was approved by the institutional review board of both academic centers where patient data were collected. Data transfer agreements were also approved by each center’s review committees.
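The response-weight construction can be made concrete with a short sketch. This is a minimal Python illustration (the analyses were run in SAS Enterprise Guide), and the cell keys and counts below are hypothetical:

```python
def relative_response_weights(responders, recruited):
    """Per-cell response rate = responders / recruited; response weight =
    inverse of the rate; relative weight = each weight divided by the
    site's smallest weight, so the site's minimum relative weight is 1.0.
    Both arguments map an (age, gender, race) cell to a count."""
    rates = {cell: responders[cell] / recruited[cell] for cell in recruited}
    weights = {cell: 1.0 / rate for cell, rate in rates.items()}
    min_weight = min(weights.values())
    return {cell: w / min_weight for cell, w in weights.items()}

# Hypothetical two-cell site: the cell with half the response rate of the
# best-responding cell receives twice the relative weight
young = ("18-<45", "female", "white")
older = ("75+", "female", "white")
rel = relative_response_weights(
    {young: 10, older: 20},
    {young: 100, older: 100},
)
# rel[young] == 2.0 and rel[older] == 1.0
```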

Results

At the rural health center, 32 patients completed surveys in the clinic (21 via tablet, 11 via paper and pencil) and 152 out of 2,101 (7.2%) responded via REDCap, for a combined 184 rural health center responses. A significantly higher response rate was observed at the urban health center: 213 patients out of 1,688 invitations responded via REDCap (12.6%; p = 0.001) (Fig. 1). A total of 23 patients were missing data for average AI comfort level and thus were excluded, leaving 387 patients for analysis of the primary outcome variable. The average AI comfort level among the analytical population was 6.4 (SD: 2.4). Patients responded with an average comfort score of 5.7 out of 10 when asked to consider undergoing surgery performed entirely by a robot under the supervision of the surgeon in the same room. Conversely, they responded with an average comfort score of 3.7 when asked to consider undergoing surgery performed entirely by a robot with the surgeon outside of the room monitoring from a distance (Table 2).

Fig. 1.

Fig. 1

Recruitment diagram

Average AI comfort level differed significantly (Table 3) between age groups (p = 0.032), with the 55–<65 and 65–<75 age groups having the highest average comfort level (mean: 6.8; SD: 2.4) and the 18–<45 and 45–<55 groups having the lowest (mean: 6.0; SD: 2.4 and 2.3, respectively). Average AI comfort level also differed significantly by gender (p = 0.0001), with males having the highest comfort (mean: 7.1; SD: 2.2), and by education (p = 0.0029), with patients with a graduate degree being most comfortable (mean: 6.9; SD: 2.1) and patients with less than a high school diploma or unknown education being least comfortable (mean: 5.3; SD: 2.5). Average AI comfort level differed significantly by survey format (p < 0.0001), with patients who completed the survey using REDCap having the highest average (mean: 6.6; SD: 2.2) and patients who completed the survey using a tablet having the lowest average (mean: 4.0; SD: 2.7). Patients’ experience with AI was significantly associated with average AI comfort level (p < 0.0001), with patients who responded that they do not know what the terms “AI” or “ML” mean or have unknown experience with AI/ML having the lowest average AI comfort level (mean: 5.6; SD: 2.6).

Table 3.

Demographic, orthopaedic, and technology use descriptive statistics and stratified AI comfort level

Characteristic Frequency/Mean Average AI comfort level
Mean (SD) p value
Age (years), mean (SD) 57.7 (15.0)
Age category, n (%) 0.0315
 18–< 45 83 (20.9) 6.0 (2.4)
 45–< 55 77 (19.4) 6.0 (2.3)
 55–< 65 90 (22.7) 6.8 (2.4)
 65–< 75 96 (24.2) 6.8 (2.4)
 75 +  51 (12.8) 6.3 (2.2)
Gender, n (%) 0.0001
 Female 247 (62.2) 6.1 (2.4)
 Male 141 (35.5) 7.1 (2.2)
 Other/Unknown 9 (2.3) 5.6 (3.4)
Education, n (%) 0.0029
 Less than HS/Unknown 25 (6.3) 5.3 (2.5)
 High school/equivalent 97 (24.4) 6.0 (2.7)
 Associate degree 52 (13.1) 5.8 (2.3)
 Bachelor’s degree 99 (24.9) 6.8 (2.2)
 Graduate degree 124 (31.2) 6.9 (2.1)
Race/ethnicity, n (%) 0.3139
 Non-Hispanic White 302 (76.1) 6.5 (2.3)
 Non-white/Hispanic 74 (18.6) 6.2 (2.3)
 Unknown 21 (5.3) 5.7 (3.2)
Religion, n (%) 0.1477
 Catholicism 118 (29.7) 6.4 (2.5)
 Protestantism 70 (17.6) 6.6 (2.1)
 Other 93 (23.4) 6.3 (2.3)
 No religion 80 (20.2) 6.9 (2.3)
 Unknown 36 (9.1) 5.6 (2.6)
Occupation, n (%) 0.4252
 Healthcare 68 (17.1) 6.5 (2.2)
 Management 40 (10.1) 6.4 (2.8)
 Skilled trade 20 (5.0) 6.2 (2.6)
 Education 27 (6.8) 6.7 (2.2)
 Other/Multiple fields 134 (33.8) 6.6 (2.2)
 Retired 74 (18.6) 6.6 (2.2)
 Student/Unemployed/Unknown 34 (8.6) 5.2 (3.0)
Survey format, n (%)  < .0001
 REDCap 348 (87.7) 6.6 (2.2)
 Tablet 30 (7.6) 4.0 (2.7)
 Paper 19 (4.8) 6.4 (2.8)
Survey site, n (%) 0.4096
 Rural integrated health system 184 (46.4) 6.3 (2.5)
 Urban academic health system 213 (53.6) 6.5 (2.3)
Orthopaedic complaint, n (%) 0.5191
 Neck/back pain 181 (45.6) 6.6 (2.2)
 Foot/ankle/hand/wrist pain 33 (8.3) 6.3 (2.3)
 Knee pain 71 (17.9) 6.0 (2.3)
 Hip pain 36 (9.1) 6.8 (2.2)
 Shoulder/elbow pain 34 (8.6) 6.1 (3.1)
 Other 16 (4.0) 6.2 (2.2)
 Unknown 26 (6.6) 6.3 (3.0)
Prior orthopaedic surgery, n (%) 0.0814
 No 173 (43.6) 6.4 (2.3)
 Yes, one 102 (25.7) 6.4 (2.4)
 Yes, multiple 106 (26.7) 6.7 (2.4)
 Unknown 16 (4.0) 5.0 (2.3)
Cell phone use frequency, n (%) 0.1983
  > 1 h/day 300 (75.6) 6.5 (2.3)
  < 1 h/day 63 (15.9) 6.6 (2.3)
 Only to make calls 23 (5.8) 5.7 (2.7)
 Never/Do not own a cell phone 4 (1.0) 6.0 (2.8)
 Unknown 7 (1.8) 4.2 (2.3)
Computer use frequency, n (%) 0.0535
  > 3 h/day 184 (46.4) 6.7 (2.2)
 1–3 h/day 107 (27.0) 6.6 (2.3)
  < 1 h/day 50 (12.6) 6.1 (2.6)
 Weekly 21 (5.3) 6.4 (2.4)
 Monthly 14 (3.5) 5.4 (2.7)
 Never/Do not own a computer 15 (3.8) 4.9 (2.9)
 Unknown 6 (1.5) 4.4 (3.4)
Experience with AI/ML, n (%)  < .0001
 Work(ed) in a field relevant/directly related to AI/ML 22 (5.5) 7.1 (1.9)
 Understand how AI/ML function 64 (16.1) 7.4 (1.8)
 Have researched terms 44 (11.1) 7.1 (2.0)
 Have heard of terms 174 (43.8) 6.3 (2.4)
 Do not know AI/ML/Unknown 93 (23.4) 5.6 (2.6)
 Average AI comfort level, mean (SD) 6.4 (2.4)
Average AI comfort level, n (%)
  ≤  Median (6.8) 192 (48.4)
  > Median (6.8) 195 (49.1)
Unknown 10 (2.5)

Average AI comfort level did not differ significantly between the rural health care center and the urban health care center. Additionally, comfort did not differ significantly by race, religion, occupation, primary orthopaedic complaint, surgical history, frequency of cell phone use, or frequency of computer use in the unweighted analysis.

Secondary outcomes relating to patients’ perspectives on AI in healthcare were all significantly associated with average AI comfort level in the expected directions. Average comfort levels were highest among those who perceived AI to have a positive impact on orthopaedic care (mean: 7.7; SD: 1.7; p < 0.0001), those who perceived AI would decrease healthcare costs (mean: 7.4; SD: 1.9; p < 0.0001), those who would not refuse AI if it were to increase healthcare costs (mean: 7.8; SD: 1.9; p < 0.0001), and those who felt it acceptable for a physician to sell health data to a technology company (mean: 7.3; SD: 2.1; p < 0.0001).

Administrative factors analyzed for impact on AI comfort level revealed a significant relationship (p < 0.001) between survey format and comfort level. Participants who took the survey by REDCap had the highest comfort level (mean: 6.6; SD: 2.2), while those who took the survey on a tablet in the clinic had the lowest comfort level (mean: 4.0; SD: 2.7).

Age, gender, and education level were the only demographic variables significantly associated with average AI comfort level in bivariate analyses and so served as the base model for multivariable regression model building. Other predictors that remained significantly associated with average AI comfort level when controlling for age, gender, and education were the patient’s experience with AI/ML (p = 0.0018), perceived impact of AI in orthopaedic care (p < 0.0001), perceived impact of AI on healthcare costs (p = 0.0013), whether patients would refuse AI if it increased healthcare costs (p < 0.0001), whether patients believe it is acceptable for a physician to sell health data to a third party for the purpose of building intelligent computers for use in healthcare (p < 0.0001), and survey format (p < 0.0001) (Table 4).

Table 4.

Multivariable model controlling for age, gender, and education

Multivariable model1 Beta coeff. (95% CI) p value
Base model
 Age (continuous) 0.016 (0.00020, 0.031) 0.0471
 Gender  < .0001
 Female Reference
 Male 1.06 (0.58, 1.53)  < .0001
 Other/unknown 0.30 (− 1.48, 2.09) 0.7389
Education 0.0018
 Less than HS/unknown − 0.73 (− 1.79, 0.34) 0.1812
 High school/equivalent Reference
 Associate degree 0.0052 (− 0.78, 0.79) 0.9896
 Bachelor’s degree 0.81 (0.17, 1.46) 0.0136
 Graduate degree 0.85 (0.24, 1.46) 0.0065
Base model +  experience with AI/ML
 Experience with AI/ML 0.0018
 Work(ed) in a field relevant/directly related to AI/ML 0.56 (− 0.46, 1.58) 0.2784
 Understand how AI/ML function 1.06 (0.40, 1.72) 0.0016
  Have researched terms 0.64 (− 0.11, 1.39) 0.0935
 Have heard of terms Reference
 Unknown/do not know what AI/ML mean − 0.43 (− 1.02, 0.16) 0.1515
Base model + perceived impact of AI in orthopaedic care
 Perceived impact of AI in orthopaedic care  < .0001
  Positive Reference
  Negative − 4.70 (− 5.42, − 3.98)  < .0001
 Not sure/unknown − 2.39 (− 2.77, − 2.01)  < .0001
Base model + perceived impact of AI on healthcare costs
 Perceived impact of AI on healthcare costs 0.0013
  Increase Reference
  Decrease 1.20 (0.55, 1.84) 0.0003
 Not sure/unknown 0.43 (− 0.080, 0.93) 0.0987
Base model + Would refuse AI if increased healthcare costs
 Would refuse AI if increased healthcare costs  < .0001
  Yes Reference
  No 2.16 (1.57, 2.76)  < .0001
 Not sure/unknown 1.00 (0.49, 1.52) 0.0001
Base model + acceptable for doctor to sell health data to a third party for building intelligent computers for healthcare
 Acceptable for doctor to sell health data to a third party for building intelligent computers for healthcare  < .0001
  Yes 1.28 (0.71, 1.85)  < .0001
  No Reference
 Not sure/unknown 0.67 (0.15, 1.20) 0.0116
Base model + survey format
 Survey format  < .0001
  REDCap Reference
  Tablet − 2.36 (− 3.26, − 1.47)  < .0001
  Paper − 0.24 (− 1.30, 0.82) 0.6526

Base model includes the three significantly associated variables, age, gender, and education. Subsequent regressions were performed with the base model and one additional predictor for statistical testing

1Except for the base model, results for the demographic control variables are not shown

All nine predictors were included together in a final multivariable generalized linear regression model (Table 5). In this model, gender, perceived impact of AI in orthopaedic care, refusal of AI if it were to increase healthcare costs, and survey format remained significantly associated with the AI comfort score. Patients who identified as male had a higher average AI comfort level compared to those who identified as female (beta: 0.50; 95% CI: 0.13, 0.88; p = 0.0085). Compared to those who perceived AI to have a positive impact in orthopaedic care, patients who perceived AI to have a negative impact had an average comfort level 4.01 units lower (beta: − 4.01; 95% CI: − 4.8, − 3.3; p < 0.0001) and patients who were unsure or missing a response had an average comfort level 2.0 units lower (beta: − 2.0; 95% CI: − 2.4, − 1.6; p < 0.0001). Patients who would not refuse AI if it were to increase healthcare costs had a higher average AI comfort level compared to patients who would refuse it (beta: 1.0; 95% CI: 0.53, 1.5; p < 0.0001). Compared to those who completed the survey using REDCap, patients who completed the survey using a tablet had an average comfort level 1.5 units lower (beta: − 1.5; 95% CI: − 2.2, − 0.82; p < 0.0001). Those who completed the survey on paper did not differ significantly from those who completed the survey using REDCap (Table 5).

Table 5.

Nine-variable multivariable generalized linear regression model for AI comfort level

Predictor Beta coeff. (95% CI) p value
Age (continuous) 0.0084 (− 0.0036, 0.020) 0.1708
Gender 0.0260
 Female Reference
 Male 0.50 (0.13, 0.88) 0.0085
Other/Unknown − 0.18 (− 1.52, 1.16) 0.7957
Education 0.8069
 Less than HS/unknown − 0.31 (− 1.11, 0.49) 0.4503
 High school/equivalent Reference
 Associate degree − 0.24 (− 0.84, 0.36) 0.4256
 Bachelor’s degree 0.046 (− 0.45, 0.54) 0.8564
 Graduate degree − 0.12 (− 0.61, 0.38) 0.6395
Experience with AI/ML 0.6608
 Work(ed) in a field relevant/directly related to AI/ML 0.27 (− 0.49, 1.04) 0.4819
 Understand how AI/ML function 0.38 (− 0.12, 0.88) 0.1379
 Have researched terms 0.10 (− 0.47, 0.67) 0.7358
 Have heard of terms Reference
 Unknown/do not know what AI/ML mean 0.055 (− 0.40, 0.51) 0.8129
Perceived impact of AI in orthopaedic care  < .0001
 Positive Reference
 Negative − 4.01 (− 4.75, − 3.27)  < .0001
 Not sure/Unknown − 2.02 (− 2.42, − 1.62)  < .0001
Perceived impact of AI on healthcare costs 0.2275
 Increase Reference
 Decrease 0.37 (− 0.12, 0.86) 0.1367
 Not sure/unknown 0.28 (− 0.11, 0.67) 0.1623
Would refuse AI if increased healthcare costs 0.0002
 Yes Reference
 No 1.02 (0.53, 1.52)  < .0001
 Not sure/unknown 0.37 (− 0.061, 0.79) 0.0930
Acceptable for doctor to sell health data to third party for building intelligent computers for healthcare 0.2045
 Yes 0.40 (− 0.047, 0.84) 0.0793
 No Reference
 Not sure/unknown 0.20 (− 0.22, 0.61) 0.3515
Survey format 0.0001
 REDCap Reference
 Tablet − 1.52 (− 2.22, − 0.82)  < .0001
 Paper − 0.053 (− 0.86, 0.76) 0.8968

The sensitivity analysis (weighted analysis) supported all the findings from the primary analysis (unweighted). Congruent with the primary analysis, average AI comfort level differed significantly by age, gender, education, and patient experience with AI/ML. In addition, the sensitivity analysis demonstrated that comfort level differed significantly by occupation (p = 0.0356), cell phone use frequency (p = 0.0005), and computer use frequency (p = 0.0148) independently; however, these three predictors were not significant predictive factors in the weighted multivariable regression model.

Compared to patients who had merely heard of the terms AI and/or ML, patients who reported understanding how AI/ML function had higher odds of perceiving a positive impact of AI in orthopaedic care (OR: 2.3; 95% CI: 1.2, 4.4), and patients with unknown experience or who do not know what AI/ML mean had lower odds of perceiving a positive impact (OR: 0.38; CI: 0.21, 0.68). Having a graduate degree (vs. high school/equivalent; OR: 2.7; 95% CI: 1.1, 6.5) and having researched the terms AI/ML (vs. merely having heard of the terms; OR: 2.5; 95% CI: 1.0, 6.5) were both predictive of perceiving that AI would decrease healthcare costs compared to believing that it would increase costs. Male gender (vs. female; OR: 2.3; 95% CI: 1.3, 4.2) and understanding how AI/ML function (vs. merely having heard of the terms; OR: 3.0; 95% CI: 1.4, 6.5) were both predictive of patients saying they would not refuse AI if it were to increase healthcare costs, compared to patients saying they would refuse it. Male gender (vs. female) was also predictive of patients saying it is acceptable for a doctor to sell health data to a third party (OR: 2.2; 95% CI: 1.3, 3.8). Patients with unknown experience with AI/ML or who have not heard those terms were more likely to be unsure whether it’s acceptable for a doctor to sell health data to a third party (OR: 2.3; 95% CI: 1.3, 4.2).

Discussion

The development and implementation of ML require large, accurate datasets to train predictive algorithms. Data are culled from real-world patient samples in electronic medical records and used as the foundation of algorithm training. Gathering these data demands the implicit or explicit consent of patients, the ethics of which have not been fully defined. Presumably, a patient enthusiastic about the prospects of AI and ML would be more likely to give explicit consent for such data gathering, and many factors may contribute to that willingness. Therefore, the primary goal of this study was to characterize an orthopaedic surgery population’s perspective on the use of AI/ML in their care. A patient’s age, gender, education, experience with AI, and preconceived perceptions of the impact of AI/ML on healthcare quality and cost affect the patient’s comfort with AI/ML in their healthcare team. On average, patients appear comfortable with AI joining the care team.

The present study suggests that patients are generally comfortable with AI, as illustrated by the 6.4 average comfort score across the entire study population. It also suggests that even among patients who are most comfortable with AI, there are still reservations, which is consistent with previously published findings [7].

This study suggests that age impacts a patient’s comfort level with AI/ML, with younger patients being less comfortable. This is an interesting finding that contradicts previously published results [7]. York et al. published data suggesting that younger individuals had a higher comfort score compared to older patients. Their results are based on a two-question survey from a single urban center, and there is no indication of a controlled logistic analysis or a weighted sensitivity analysis. Young adults have grown up in a society where technological and computational integration has been rapid and widely accepted. It is possible that this life-long relationship with computers has left younger patients skeptical of AI/ML, or that youth is associated with a less advanced understanding of AI and ML. Further investigation is warranted on this topic.

The patient’s gender was also a significant variable. Patients who identified as male reported significantly higher comfort across the population compared to those who identified as female and those who identified as other or did not respond. This is similar to previously published results, demonstrating the repeatability of this finding [7].

Patients with a higher level of education were more likely to approve of AI in the care team, which is also similar to prior findings [7]. There may be multiple reasons for this. Educated patients may have a deeper knowledge of technology or work with technology daily. The patient’s level of understanding of the terms “artificial intelligence” and “machine learning” was an important factor, as familiarity with AI increased overall comfort with AI. Increased familiarity with AI also positively impacted patients’ perception of AI in orthopaedic care and their belief that AI would decrease healthcare costs, and decreased the number of individuals who would refuse care if AI were to increase healthcare cost. These data suggest that future patient education may positively influence patient perception of AI and ML in healthcare.

Survey format impacted patient comfort level in bivariate and controlled multivariable analyses. Our data demonstrate that individuals who completed the survey in the clinic on a tablet had an average comfort score roughly 2.3 units lower than patients who completed the survey via REDCap and those who completed a paper survey in the office setting. Previous studies have identified that email/mail surveys allow patients to participate from home, which is often a more comfortable environment than a stressful office encounter [11]. However, there is little understanding of how the mode of completion affects survey results, reliability, and validity [12]. Our study is the first, to our knowledge, to provide evidence that the environment and modality of survey completion may have an impact on how patients respond to survey questions [12]. These results should be considered in the design, implementation, and interpretation of future surveys. The REDCap survey, delivered by e-mail, may have been more convenient, or the ability to obtain and complete it by e-mail and web browser may have introduced a selection bias toward patients who are at baseline more comfortable with technology [11]. For example, an increased frequency of computer use was associated with greater AI comfort in our data. This information is valuable, as it suggests that it may be preferable to request data usage for future data-gathering efforts when the patient is not at the clinical site experiencing external stressors. Further research may elucidate these unknowns.

The primary analysis demonstrates that patient ethnicity, religion, and clinical setting (urban vs rural) do not have an identifiable impact on the patient’s comfort level. Our sample of nearly 400 patients is large; however, it is possible that a larger sample size could identify a relationship not observed here. Understanding differences in opinion and perspectives among racial, religious, and regional groups is a critical aspect of providing equitable AI-driven healthcare. The implications of distrust among a specific group are important for providing appropriate healthcare in the ML era; if a specific population were to refuse data gathering for the build of a predictive algorithm, the outputs of that algorithm may not apply to that group, thus creating an unintended health disparity.

Patients appear to distrust fully autonomous machines, as evidenced by the small number of patients willing to undergo robotic surgery without the immediate supervision of a human physician. This point should be clarified by further research to guide the future of robot-assisted joint arthroplasty [13, 14] and other automated surgeries.

As expected, patients who thought AI would have a positive effect on their care reported higher comfort levels than those who expected a negative impact. Perceived impact on the cost of healthcare also had a significant effect on average comfort level, with 28.8% of respondents reporting that they would refuse the integration of AI if it increased the cost of care. Patients with higher comfort levels also felt comfortable with a physician (or, by proxy, a healthcare entity) participating in the sale of health data to third parties in an effort to improve the quality of AI; the majority of patients, however, did not feel comfortable with this. These data should be considered by physicians, hospital systems, and technology companies when designing future data-gathering efforts.

One of the first studies to consider the patient opinion on AI was an interview with a small cohort of patients undergoing radiologic imaging [15]. Those patients expressed a desire for proof of concept, reliability, efficiency, accountability, and education on the general process before they would be comfortable accepting AI into their radiology care [15]. York et al. found that patients strongly prefer the opinion of their physician over that of an AI machine [7]. A subsequent study surveyed patient opinion regarding AI implementation in radiology, concluding that patients tended to moderately distrust machines assuming the diagnostic tasks of a radiologist [16]. Our results support these previous studies: patients harbor some distrust of AI, but patient education may improve comfort with AI in healthcare. Patient comfort is important because anxiety, catastrophic thinking, and somatization have been linked to negative perioperative events and poor post-operative outcomes in elective orthopaedic surgery [17, 18]. Thus, a positive patient attitude may not only favor surgical outcomes in cases where intelligent machines are involved, but also improve patient opinion of AI, which could subsequently increase patient consent for participation in critical data-gathering efforts.

The clinical implications of comfort level with AI reach far beyond the obvious patient emotional response to technology. The nature of the “learning” dataset is mission-critical for ML; the machine must be trained on data representative of the patients to whom it is applied. For this reason, it is highly relevant that age, gender, and education level impact patient comfort with the technology, for if these patients do not contribute data to the training sets, the algorithm will be biased against them. In this study, females, younger patients, and less-educated patients were not comfortable with the idea of the technology being utilized in their care. If, because of their discomfort, these patients’ data were excluded from the training dataset, the resultant algorithm could not be appropriately applied to them in the future should their opinion change. This would ultimately lead to an unconscious bias within the algorithm, making it poorly generalizable and potentially prone to generating health disparities based upon age, gender, and education. The algorithm can only make predictions based upon what it has seen; therefore, the “training” population should be representative of the “treatment” population. Practically speaking, this should generate an impetus for improving comfort with AI among these groups via directed education or other modalities.

Although the present study was designed to investigate the patient perspective on AI and a subset of AI known as ML in healthcare, the results of our study may be beneficial to the development of other forms of AI. ML algorithms such as neural networks (NNs) are the primary modality driving the development of precision medicine, an approach that tailors disease treatment to individuals based on patient genetics, environment, and lifestyle [4, 19]. A more complex form of ML called deep learning (DL) utilizes NNs to make sense of its input in a layered fashion; practically, it can be used to identify abnormal findings within a radiology study, for example a potentially malignant lesion in the brain [5, 19]. Another distinct type of AI is Natural Language Processing (NLP), which is already widely integrated into modern society in forms as common as spell check, Amazon’s Alexa, Google search, and web-based assistants [19]. NLP can be used in healthcare to converse with patients and triage patient requests, among other tasks [19, 20]. Although these tools are already in use and undergoing continuous refinement, patient opinion has not been adequately queried. One major area of AI that has not been broached in healthcare is autonomous physical robots (APRs); although surgical robots are widely used, they still require human direction [19, 20]. Barriers to autonomous machines are not necessarily technological but, as emphasized by our results, societal [19]. Patients in our study were uncomfortable with the idea of autonomous robots compared to assistive devices, consistent with prior findings suggesting that the limitation to this next step of APRs is societal in nature. The future of AI is a topic of great debate, notably the ramifications of the “singularity”, the moment a machine reaches the capability of human-level thinking, also referred to as artificial general intelligence (AGI).
Although whether AGI will ever be achieved is a contentious topic [21], multiple facets of AI will continue to become integrated with healthcare, and understanding both how to improve that integration and the degree of integration with which patients are comfortable is paramount.

This study has several limitations. Data derived from patient surveys may be confounded by complex influences and are difficult to validate; however, several methods can be used to mitigate these uncertainties, such as utilizing multiple Likert scales and easily comprehensible language in the survey [7, 22]. Although several Likert scales were utilized, the Likert scales used in this study have not yet been validated. Furthermore, although prior studies have computed an average comfort score from multiple Likert scales, this may not be a valid method and should be considered a limitation of this study [7]. In addition, the language used in framing the questions may have provoked implicit biases among the patients. For example, replacing the term “sell” with “share” in the question, “Is it acceptable for a doctor to sell health data to a third party for the purpose of building intelligent computers for use in healthcare?” could potentially change a patient’s response.
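The scoring approach described in the Methods, a single average over thirteen 1–10 Likert items, can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code; the example responses are hypothetical.

```python
def average_comfort_score(responses):
    """Mean of 1-10 Likert responses; unanswered items (None) are skipped."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered Likert items")
    return sum(answered) / len(answered)

# A hypothetical respondent's thirteen Likert answers:
likert = [7, 8, 6, 9, 5, 7, 8, 6, 7, 5, 6, 8, 7]
print(round(average_comfort_score(likert), 1))  # → 6.8
```

Note that simply averaging ordinal items treats the scale as interval data, which is precisely the validity concern raised above.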

Intrinsic bias may be present in the data based on the medium through which this survey was distributed. The COVID-19 pandemic interrupted in-person survey administration, limiting the in-clinic population to only 11 responses via paper and pencil and 21 responses via tablet. Thus, comparisons between survey formats should be interpreted with caution and repeated with a larger population. We were also limited by survey response rate: of the 3789 patients who were emailed the survey, only 365 (9.6%) responded. This low response rate was accounted for in the weighted analysis and directly compared to the non-weighted analysis. Although no significant difference was observed between rural and urban populations in either the non-weighted or weighted analysis, the rural site experienced a significantly lower response rate than the urban center. Prior studies have also observed lower response rates from rural populations compared to urban populations [23, 24]. It is well documented that fewer than 20% of patients will respond to a survey that is emailed or mailed to them [11, 25–27]. Response rates can be increased by providing the survey in the clinic with the physician [11]; however, this was not possible due to the COVID-19 pandemic-associated office closure during the collection period. Current research suggests the most effective way to increase response rate is to request survey completion several times through more than one mode of request (i.e., via email and then phone call) [12]. In this study, multiple requests were delivered to patients who had not completed the surveys, with some improvement. It is well established in medicine that implicit biases exist among care practitioners towards individuals of different races and religions, and that these biases do impact health outcomes [28–30].
Involving a diverse patient population in the development of AI/ML applications is of paramount importance, as a lack of diversity could build implicit bias into the algorithm itself and in effect limit the applicability of AI/ML to the general population. Algorithms may be an effective tool only for those populations whose data were used to build them. Therefore, the study could be improved by increasing the demographic diversity of survey respondents. Notably, the sensitivity analysis demonstrated no major significant differences when applying weighted statistics to these results, which may suggest generalizability. The data collection could be improved by utilizing an internally validated and standardized patient questionnaire, such as the model described by Ongena et al. [16].

Conclusions

Orthopaedic surgery patients on average appear comfortable with the use of AI in their care. A patient’s overall comfort level appears to be influenced by age, education level, knowledge of AI, and perceptions of the effects of the technology on clinical outcomes and healthcare costs. Presently, patients do not appear comfortable with autonomous AI-driven surgical robots. Further research is needed across more heterogeneous populations to demonstrate the generalizability of these results.

Funding

An educational grant was awarded for medical student research from Geisinger Commonwealth School of Medicine.

Declarations

Conflict of interest

The investigation was performed at Geisinger Community Medical Center, Scranton, Pennsylvania, and The Mount Sinai Hospital, New York, New York. The authors declare no conflicts of interest.

Ethical approval

This article does not contain any studies with human or animal subjects performed by any of the authors.

IRB Approval

Geisinger: 2019-1080; Mount Sinai: 20-01793.

Informed consent

For this type of study informed consent is not required.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Russell SJ, Norvig P. (2020) Artificial Intelligence: A Modern Approach. 4th ed. Pearson
  • 2.Connor CW. Artificial intelligence and machine learning in anesthesiology. Anesthesiology. 2019;131(6):1346–1359. doi: 10.1097/ALN.0000000000002694. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Mitchell T. Machine Learning. Published March 1, 1997. Accessed November 8, 2020. http://www.cs.cmu.edu/~tom/mlbook.html
  • 4.Freedman DH. Hunting for new drugs with AI. Nature. 2019;576(7787):S49–S53. doi: 10.1038/d41586-019-03846-0. [DOI] [PubMed] [Google Scholar]
  • 5.Reardon S. Rise of robot radiologists. Nature. 2019;576(7787):S54–S58. doi: 10.1038/d41586-019-03847-z. [DOI] [PubMed] [Google Scholar]
  • 6.Panchmatia JR, Visenio MR, Panch T. The role of artificial intelligence in orthopaedic surgery. British Journal of Hospital Medicine. 2018;79(12):676–681. doi: 10.12968/hmed.2018.79.12.676. [DOI] [PubMed] [Google Scholar]
  • 7.York T, Jenney H, Jones G. Clinician and computer: a study on patient perceptions of artificial intelligence in skeletal radiography. BMJ Health Care Inform. 2020;27(3):100233. doi: 10.1136/BMJHCI-2020-100233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Tekkeşin Aİ. Artificial intelligence in healthcare: past present and future. Anatolian Journal of Cardiology. 2019;22(Suppl 2):8–9. doi: 10.14744/AnatolJCardiol.2019.28661. [DOI] [PubMed] [Google Scholar]
  • 9.Google’s ‘Project Nightingale’ Gathers Personal Health Data on Millions of Americans. WSJ. Accessed October 15, 2020. https://www.wsj.com/articles/google-s-secret-project-nightingale-gathers-personal-health-data-on-millions-of-americans-11573496790
  • 10.SAS Enterprise Guide. SAS Support. Accessed June 22, 2020. https://support.sas.com/en/software/enterprise-guide-support.html
  • 11.Booker QS, Austin JD, Balasubramanian BA. Survey strategies to increase participant response rates in primary care research studies. Family Practice. 2021;38(5):699–702. doi: 10.1093/FAMPRA/CMAB070. [DOI] [PubMed] [Google Scholar]
  • 12.National Research Council, Division of Behavioral and Social Sciences and Education, Committee on National Statistics, Panel on a Research Agenda for the Future of Social Science Data Collection. Nonresponse in Social Science Surveys: A Research Agenda. Google Books. Accessed March 8, 2022. https://books.google.com/books?hl=en&lr=&id=mg51AgAAQBAJ&oi=fnd&pg=PR9&ots=Hz9SCFUheZ&sig=8XkU9TsUkapjknUM2s2MDZy0huA#v=onepage&q&f=false
  • 13.van der List JP, Chawla H, Joskowicz L, Pearle AD. Current state of computer navigation and robotics in unicompartmental and total knee arthroplasty: a systematic review with meta-analysis. Knee Surgery, Sports Traumatology, Arthroscopy. 2016;24(11):3482–3495. doi: 10.1007/s00167-016-4305-9. [DOI] [PubMed] [Google Scholar]
  • 14.Lonner JH, Kerr GJ. Low rate of iatrogenic complications during unicompartmental knee arthroplasty with two semiautonomous robotic systems. The Knee. 2019;26(3):745–749. doi: 10.1016/j.knee.2019.02.005. [DOI] [PubMed] [Google Scholar]
  • 15.Haan M, Ongena YP, Hommes S, Kwee TC, Yakar D. A qualitative study to understand patient perspective on the use of artificial intelligence in radiology. Journal of the American College of Radiology. 2019;16(10):1416–1419. doi: 10.1016/j.jacr.2018.12.043. [DOI] [PubMed] [Google Scholar]
  • 16.Ongena YP, Haan M, Yakar D, Kwee TC. Patients’ views on the implementation of artificial intelligence in radiology: development and validation of a standardized questionnaire. European Radiology. 2020;30(2):1033–1040. doi: 10.1007/s00330-019-06486-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Gil JA, Goodman AD, Mulcahey MK. Psychological factors affecting outcomes after elective shoulder surgery. Journal of American Academy of Orthopaedic Surgeons. 2018;26(5):e98–e104. doi: 10.5435/JAAOS-D-16-00827. [DOI] [PubMed] [Google Scholar]
  • 18.Flanigan DC, Everhart JS, Glassman AH. Psychological factors affecting rehabilitation and outcomes following elective orthopaedic surgery. Journal of the American Academy of Orthopaedic Surgeons. 2015;23(9):563–570. doi: 10.5435/JAAOS-D-14-00225. [DOI] [PubMed] [Google Scholar]
  • 19.Davenport T, Kalakota R. DIGITAL TECHNOLOGY the potential for artificial intelligence in healthcare. Future Healthc J. 2019;6(2):94–102. doi: 10.7861/futurehosp.6-2-94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Malik P, Pathania M, Kumar RV. Overview of artificial intelligence in medicine. Published online. 2019 doi: 10.4103/jfmpc.jfmpc_440_19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.When will singularity happen? 1700 expert opinions of AGI [2023]. Accessed January 4, 2023. https://research.aimultiple.com/artificial-general-intelligence-singularity-timing/
  • 22.Bennett C, Khangura S, Brehaut JC, et al. Reporting guidelines for survey research: an analysis of published guidance and reporting practices. PLoS Medicine. 2011;8(8):e1001069. doi: 10.1371/JOURNAL.PMED.1001069. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Carman KG, Chandra A, Weilant S, Miller C, Tait M. 2018 National Survey of Health Attitudes: Appendix B: Survey Results Comparing Urban and Rural Populations. Published online 2018. Accessed March 8, 2022. www.rand.org/t/RR2876
  • 24.Edelman LS, Yang R, Guymon M, Olson LM. Survey methods and response rates among rural community dwelling older adults. Nursing Research. 2013;62(4):286–291. doi: 10.1097/NNR.0B013E3182987B32. [DOI] [PubMed] [Google Scholar]
  • 25.Ricci-Cabello I, Avery AJ, Reeves D, Kadam UT, Valderas JM. Measuring patient safety in primary care: the development and validation of the “patient reported experiences and outcomes of safety in primary care” (PREOS-PC). Annals of Family Medicine. 2016;14(3):253–261. doi: 10.1370/AFM.1935. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Medina-Lara A, Grigore B, Lewis R, et al. Cancer diagnostic tools to aid decision-making in primary care: mixed-methods systematic reviews and cost-effectiveness analysis. Health Technology Assessment. 2020;24(66):1–366. doi: 10.3310/HTA24660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Warwick H, Hutyra C, Politzer C, et al. Small social incentives did not improve the survey response rate of patients who underwent orthopaedic surgery: a randomized trial. Clinical Orthopaedics and Related Research. 2019;477(7):1648–1656. doi: 10.1097/CORR.0000000000000732. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Peterson K, Anderson J, Boundy E, Ferguson L, McCleery E, Waldrip K. Mortality disparities in racial/ethnic minority groups in the veterans health administration: an evidence review and Map. American Journal of Public Health. 2018;108(3):e1–e11. doi: 10.2105/AJPH.2017.304246. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Walker RJ, Strom Williams J, Egede LE. Influence of race, ethnicity and social determinants of health on diabetes outcomes. American Journal of the Medical Sciences. 2016;351(4):366–373. doi: 10.1016/j.amjms.2016.01.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Breathett K, Yee E, Pool N, et al. Does race influence decision making for advanced heart failure therapies? Journal of the American Heart Association. 2019 doi: 10.1161/JAHA.119.013592. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Indian Journal of Orthopaedics are provided here courtesy of Indian Orthopaedic Association
