BMC Psychology. 2025 Aug 17;13:925. doi: 10.1186/s40359-025-03283-x

Turkish adaptation of the artificial intelligence ethics scale (EAI): a validity and reliability study for nursing students

Ferhat Onur Agaoglu 1, Sinan Tarsuslu 2, Daimi Koçak 3, Murat Baş 1
PMCID: PMC12359871  PMID: 40820222

Abstract

Objective

This study was designed to culturally adapt the “Attitude towards Artificial Intelligence Ethics (EAI)” scale into Turkish and to evaluate its validity and reliability in the Turkish population.

Methods

This study was designed as methodological research to adapt the Attitudes Towards EAI Scale into Turkish and to evaluate its psychometric properties. The linguistic and cultural adaptation of the scale was carried out using the translation-back translation method. The study sample consisted of 656 undergraduate nursing students studying at a university in Türkiye. Participants were selected by simple random sampling, and participation was voluntary.

Results

The results of the Exploratory Factor Analysis (EFA) showed that the original five-factor structure of the scale (transparency, harmlessness, privacy, responsibility, and fairness) was largely preserved. The Confirmatory Factor Analysis (CFA) results showed that the five-factor model had good fit values. The scale's internal consistency was evaluated with the Cronbach's alpha (α) coefficient, and α values for all sub-dimensions ranged between 0.85 and 0.95, indicating a high level of reliability. In addition, composite reliability (CR) values above 0.70 and average variance extracted (AVE) values above 0.50 supported the convergent and discriminant validity of the scale. Discriminant validity was further confirmed by MSV (< 0.31) and ASV (< 0.18) values, and EFA results from the FACTOR program (RMSEA = 0.047, CFI = 0.993, NNFI = 0.985) also supported the five-factor structure. Measurement invariance across gender was established at all levels (∆CFI < 0.01).

Conclusion

The results show that the EAI is an appropriate tool for assessing university students’ attitudes toward artificial intelligence ethics in Turkish society.

Implications

Testing the scale in different professional groups, age groups, and cultural contexts in the future may expand its generalizability and range of use. In addition, such scales can make important contributions to educational programs and policy development processes aimed at increasing ethical awareness of today's rapidly developing artificial intelligence technologies.

Supplementary Information

The online version contains supplementary material available at 10.1186/s40359-025-03283-x.

Keywords: Nursing students, Artificial intelligence, Ethics, Validity, Reliability

Introduction

The rapid development of artificial intelligence (AI) technologies has created excitement in both industry and academia and raised important questions about how the ethical aspects of these technologies should be handled. AI systems make decisions based on patterns learned from large datasets, raising ethical issues such as bias, discrimination, security weaknesses, and privacy violations [1, 2]. This situation requires ethical principles to be addressed with a comprehensive and interdisciplinary approach, from the design of AI to the end user. AI ethics has therefore become a multidimensional field of study that includes not only technical issues but also legal, sociological, and philosophical dimensions [3]. According to Rodríguez et al. (2023), reliable AI applications rest on ten technical requirements: the human factor, control, robustness, confidentiality, transparency, diversity, non-discrimination, justice, social and environmental welfare, and accountability [4]. Dignum (2018) likewise states that the ethical implications of AI systems are critical in building trust, respecting human rights, and protecting social understanding [5], and argues that frameworks that harmonize technological developments with social, legal, and moral values should be established to address these ethical issues.

While the literature on AI ethics initially focused on the potential risks brought by technological developments, in recent years it has moved toward establishing a more comprehensive ethical framework. The AI ethics literature explicitly aims to protect fundamental ethical principles such as human rights, privacy, justice, responsibility, and transparency in all processes, from the design of AI to its implementation [1, 3]; these are among the most prominent principles in the field.

Transparency and accountability aim to make the justifications for decisions taken by AI systems explainable and to answer questions about which actors are responsible for those decisions [6]. Especially in high-risk areas (e.g., medical and legal practice), the understandability of the algorithmic process is seen as a critical factor in protecting the rights of affected individuals [2]. At the same time, ensuring transparency must be balanced against organizational competition and security dynamics.

The principle of fairness includes measures to prevent an AI system from producing biased results against different demographic or cultural groups. Algorithms trained on large datasets can reinforce inequalities that already exist in society or reproduce them in new contexts [7]. For example, Vlasceanu and Amodio (2022) found that although AI algorithms are used to increase objectivity in decision-making, they may contain systemic social biases, and that gender bias in online algorithms reflects and reinforces social inequalities [8]. Their study also emphasized the need for an integrated ethical AI model that considers human psychology. In this respect, eliminating diversity and representation problems in data collection is important for constructing a fair AI system [9].

The principle of liability, on the other hand, involves ethical and legal debates on who will be held responsible for harm caused by an AI application; these debates become even more complex where the boundaries of autonomous systems or machine learning models are unclear [10, 11]. Bottomley and Thaldar (2023) emphasized that it has become unclear who or what is responsible for damage caused by AI in health services, and argued that traditional concepts of responsibility need to be re-adapted or reconciled [12].

The principle of harmlessness is also widely discussed in AI ethics. According to the principle of nonmaleficence, AI systems should be designed and implemented so that they do not harm any individual or group in society. This is especially critical in preventing systemic biases and discrimination in health, justice, and education [13, 14].

Privacy is also an important dimension of AI ethics. In processing users’ data, developing regulatory frameworks and policies for data security and the protection of personal rights has become increasingly important [15]. In this direction, compliance with international standards and regional legislation on AI is not only limited to technical requirements but is also considered a reflection of ethical responsibility [3, 16].

These fundamentals indicate that measurement tools and scales developed for AI ethics are becoming increasingly important, because such tools can produce valuable information for policymakers and institutions by enabling a systematic examination of attitudes and perceptions about AI ethics [17]. In particular, conducting scale adaptation studies in different countries and cultures can provide a holistic picture of how the understanding of AI ethics is shaped on a global scale [18]. In this way, many actors, from technology companies to decision-makers and from civil society organizations to academic researchers, can act together so that AI minimizes ethical risks while increasing social benefit.

The Ethics of Artificial Intelligence (EAI) scale, developed by Jang et al. (2022) [17], is grounded in a theoretical model built on five basic principles for evaluating the ethical aspects of AI technologies: transparency, non-maleficence, privacy, responsibility, and fairness. These principles are consistent with the core ethical values frequently emphasized in international AI ethics guidelines. The model offers conceptual integrity in the ethical evaluation of AI applications at both the individual and institutional levels. In this context, adapting the scale to Turkish culture, and especially to nursing students studying in the field of health, is important both for increasing ethical awareness and for enabling ethically grounded evaluation of the use of AI technologies in health care.

As a result, the AI ethics literature requires a more comprehensive examination of the innovations brought about by technological advances and the ethical responsibilities that accompany them. Both international and local studies provide theoretical and methodological foundations for systematically evaluating the ethical dimensions of AI applications. In this context, the Turkish adaptation of the AI ethics scale stands out as a valuable measurement tool for both academic and practical studies. It is expected that the adaptation of the EAI will contribute to considering AI ethics from a broader perspective and to examining its applicability in various sectors, especially through future research in Turkish nursing education and practice. The adaptation and widespread use of the AI ethics scale is therefore also a strategic step that can shape the direction of future research in human-oriented sectors such as education and health. Accordingly, testing the question "Is the Turkish adaptation of the EAI valid for Turkish society?" was determined as the final goal of the research.

Method

Study design

This study used an instrumental design, as it was carried out to test the psychometric properties of a psychological measurement tool [19]. Based on this design, the study examined the psychometric properties of the Turkish version of the EAI scale in a sample of nursing students.

Participants

The participants consisted of 656 undergraduate students studying in the nursing department of a university in Türkiye. Inclusion criteria were: being enrolled as an undergraduate student in the nursing department of the relevant university at the time of data collection, voluntarily agreeing to participate in the study, and completing the questionnaire in full. Exclusion criteria were: not being enrolled in the nursing department, being absent during data collection, and submitting an incomplete questionnaire. After permission was obtained from the university administration, the questionnaires were hand-distributed to the students in the classroom by the first author and collected after completion. In the first part of the questionnaire, participants were informed about the purpose of the study, that participation was voluntary, and that their answers would remain confidential. Of the 656 students, 70.7% were female, 48.8% were aged between 18 and 20 years, 34% were second-year students, 42.5% rated their knowledge of technology as moderate, and 54.6% had experience with AI education (Table 1).

Table 1.

Descriptive characteristics of the participants

Demographics Number (n) Percent (%)
Gender
 Female 464 70.7
 Male 192 29.3
Age
 18–20 320 48.8
 21–23 218 33.2
 24–26 111 16.9
 ≥ 27 7 1.1
Year of study
 First 153 23.3
 Second 223 34.0
 Third 170 25.9
 Fourth 110 16.8
How would you rate your knowledge of technology?
 Very low 10 1.5
 Low 32 4.9
 Moderate 279 42.5
 High 217 33.1
 Very high 118 18.0
Have experience with AI education
 Yes 358 54.6
 No 298 45.4

Procedure

The EAI scale developed by Jang et al. (2022) [17], originally in English, was translated into Turkish following the translation-back translation procedure proposed by Brislin et al. (1973) [18]. The scale was first translated into Turkish independently by two native Turkish linguists fluent in both English and Turkish. In the second step, the scale was translated back from Turkish to English by two linguists fluent in both languages and cultures. In the third step, an expert group was consulted, consisting of two linguists who had not been involved in the previous translations and who were fluent in both languages and cultures, together with three academics who had conducted research in the social sciences. After the experts' feedback was incorporated, the original and translated versions of the scale were checked by a linguist fluent in both languages. The Turkish translation was judged appropriate after it was checked and approved by two English-fluent academics who had conducted research in the social sciences.

In order to evaluate the comprehensibility and applicability of the scale, a pilot study was conducted with 40 undergraduate students studying in the Department of Health Management at Erzincan Binali Yıldırım University. As a result of the study, the participants stated that there were no incomprehensible, inappropriate, or disturbing expressions in the scale items.

Sample size

The EAI scale has five dimensions with a total of 17 items. There is no consensus among researchers about the sample size needed to represent a population. Some researchers argue that the sample should cover at least 20% of the population (roughly 400 when the population is very large) [20], while others recommend at least 10 times the number of scale items [21]. With 17 items, the latter criterion requires at least 170 participants; our sample of 656 comfortably satisfies both criteria.

Data collection tools

The questionnaire consists of two parts. The first part informed participants about the purpose of the study and the confidentiality of personal data, followed by questions on demographic characteristics (gender, age, year of study, knowledge of technology, and experience with AI education). The second part contained the 17 items of the five-dimensional EAI scale developed by Jang et al. (2022) [17]. Four items measure transparency (e.g., "It is essential for AI to explain the reasons for its decisions."). Four items measure non-maleficence (e.g., "Individuals who utilize AI technology should make an effort to put it to good use."). Three items measure privacy (e.g., "The government should regulate developers and companies that create AI to ensure that they do not freely use people's personal information."). Three items measure responsibility (e.g., "In the event of any problems caused by AI, determining precisely who is to be held responsible is difficult; therefore, social consensus on who should compensate and how is required."). Three items measure fairness (e.g., "If the government develops AI robots to assist students in their studies and provides them for free, it should also provide them to elderly people."). Participants responded on a 5-point Likert scale ranging from (1) strongly disagree to (5) strongly agree.
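Given this structure, scoring is straightforward: each subscale score is the mean (or sum) of its items. The following is a minimal Python sketch of that scoring, assuming hypothetical item column names; the actual labels in the study's dataset are not published here.

import pandas as pd

# Hypothetical item labels; the real dataset's column names may differ.
SUBSCALES = {
    "transparency":    [f"transparency{i}" for i in range(1, 5)],     # 4 items
    "non_maleficence": [f"non_maleficence{i}" for i in range(1, 5)],  # 4 items
    "privacy":         [f"privacy{i}" for i in range(1, 4)],          # 3 items
    "responsibility":  [f"responsibility{i}" for i in range(1, 4)],   # 3 items
    "fairness":        [f"fairness{i}" for i in range(1, 4)],         # 3 items
}

def score_eai(responses: pd.DataFrame) -> pd.DataFrame:
    """Mean of each 5-point Likert subscale, one row per respondent."""
    return pd.DataFrame({name: responses[items].mean(axis=1)
                         for name, items in SUBSCALES.items()})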

Data analysis plan

To test the construct validity of the EAI scale, EFA was conducted using the FACTOR 12.06.08 program. The EFA supported the scale's 17-item, five-dimensional structure. Subsequently, a series of CFAs was conducted in AMOS 24.0 to examine the distinctness of the scale's dimensions. Model fit was assessed using the chi-square/degrees of freedom ratio (χ²/df), comparative fit index (CFI), Tucker-Lewis index (TLI), root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR). We compared the base model (a five-factor model consisting of transparency, non-maleficence, privacy, responsibility, and fairness) with four alternative models. We then used SPSS 27.0 to examine the correlations between the scale's dimensions and the reliability (Cronbach's alpha) of each dimension.

Result

Validation of the instrument ethics of artificial intelligence

Exploratory factor analysis

EFA was performed first to evaluate the construct validity of the data obtained from the sample, using the FACTOR program. Although many studies consider it appropriate to rely on the KMO and Bartlett tests in EFA conducted with SPSS, some researchers argue that SPSS's default options (the "Little Jiffy" combination of the Kaiser criterion and Varimax rotation) are not sufficient for exploratory factor analysis [22, 23, 24]. These researchers [23, 24] regard FACTOR as the most appropriate program for EFA [25, 26]. Accordingly, we conducted the EFA using FACTOR, determining the number of factors with the parallel analysis procedure and rotating factors with the PROMIN method [27], which allows correlation between factors. The results are given in Table 2.

Table 2.

Results of EFA

Items Dimensions Initial eigenvalues
1 2 3 4 5 FDI E % of variance Cumulative %
Transparency1 .961 .929 5.60 32.92 32.92
Transparency2 .968
Transparency3 .884
Transparency4 .817
Non-maleficence1 .908 .941 4.05 23.83 56.75
Non-maleficence2 .935
Non-maleficence3 .942
Non-maleficence4 .842
Privacy1 .864 .960 2.40 14.14 70.89
Privacy2 .923
Privacy3 .872
Responsibility1 .840 .944 1.59 9.33 80.22
Responsibility2 .905
Responsibility3 .782
Fairness1 .859 .968 1.11 6.54 86.76
Fairness2 .917
Fairness3 .926
Kaiser-Meyer-Olkin Measure of Sampling Adequacy 0.787
Bartlett's Test of Sphericity Approx. Chi-Square 7477.6
df 136
Sig. 0.000
Bootstrap 95% CI of KMO 0.205, 0.851
Robust goodness of fit statistics after LOSEFER correction RMSEA 0.047
NNFI 0.985
CFI 0.993

Note: N = 656; FDI = Factor Determinacy Index; E = Eigenvalue; df = degrees of freedom; CI = confidence interval; RMSEA = Root Mean Square Error of Approximation; NNFI = Non-Normed Fit Index; CFI = Comparative Fit Index

According to the EFA results, the KMO coefficient (KMO = 0.787, 95% CI 0.205-0.851) was acceptable and Bartlett's test of sphericity (7477.6; df = 136; p < .001) was significant. The EAI items grouped under five factors with eigenvalues greater than 1, and these five factors together explained 86.763% of the total variance. Finally, the robust goodness-of-fit statistics after the LOSEFER correction were all appropriate and supported the five-factor structure (RMSEA = 0.047, NNFI = 0.985, CFI = 0.993).
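For readers unfamiliar with the retention procedure used above, the following is a minimal sketch of Horn's parallel analysis in Python. It only approximates what FACTOR does (FACTOR's implementation includes several refinements): observed eigenvalues of the item correlation matrix are compared against a percentile of eigenvalues from random normal data of the same shape.

import numpy as np

def parallel_analysis(data: np.ndarray, n_sims: int = 500,
                      percentile: float = 95, seed: int = 0) -> int:
    """Retain factors whose observed eigenvalue exceeds the chosen
    percentile of eigenvalues obtained from random data."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    # Eigenvalues of the observed correlation matrix, descending.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_sims, k))
    for s in range(n_sims):
        sim = rng.standard_normal((n, k))
        random_eigs[s] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    retained = 0
    for obs, thr in zip(observed, threshold):
        if obs > thr:
            retained += 1
        else:
            break
    return retained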

Confirmatory factor analysis

Using AMOS 24.0, we conducted a series of CFAs to test whether the five-factor EAI model fit the data. In line with the Standards for Educational and Psychological Testing, these analyses (i.e., EFA, CFA) provide evidence based on the internal structure of the scale rather than discriminant evidence in the broader validity framework [28]. The model was estimated using maximum likelihood. In these analyses, the goodness-of-fit values of the five-factor measurement model were compared with alternative models. TLI and CFI values of 0.90 and above indicate an acceptable level of fit, and values of 0.95 and above indicate good fit. An RMSEA value below 0.06 indicates excellent fit and a value below 0.08 acceptable fit; values between 0.08 and 0.10 indicate mediocre fit, and values above 0.10 indicate poor fit. For SRMR, values below 0.08 are considered acceptable and values below 0.05 good [29]. The goodness-of-fit values of the measurement and alternative models are given in Table 3.

Table 3.

Comparison of measurement models

Model χ²(df) χ²/df TLI CFI RMSEA SRMR ∆χ² (∆df)
Model 1, five-factor model 404 (104) 3.88 0.96 0.97 0.07 0.04 -
Model 2, four-factor modela 1827 (108) 16.92 0.75 0.81 0.16 0.10 1423 (4)
Model 3, three-factor modelb 2979 (111) 26.84 0.60 0.67 0.20 0.16 2575 (7)
Model 4, two-factor modelc 3957 (113) 35.02 0.47 0.56 0.23 0.19 3553 (9)
Model 5, one-factor modeld 5008 (114) 43.93 0.34 0.44 0.26 0.21 4604 (10)

Note. N = 656; all models are significant at p < .05; χ², chi-square discrepancy; df, degrees of freedom; RMSEA, root mean square error of approximation; CFI, comparative fit index; TLI, Tucker-Lewis index; SRMR, standardized root mean square residual; ∆χ², difference in chi-square; ∆df, difference in degrees of freedom. aFour-factor model: transparency and non-maleficence combined into a single factor; bthree-factor model: transparency, non-maleficence, and privacy combined into a single factor; ctwo-factor model: transparency, non-maleficence, privacy, and responsibility combined into a single factor; done-factor model: all variables combined into a single factor.
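For reference, the indices in Table 3 follow standard definitions (see, e.g., Kline [29]). In one common formulation, with χ²_M and df_M for the tested model, χ²_B and df_B for the baseline (independence) model, and sample size N:

\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\,0)}{df_M\,(N-1)}},\qquad
\mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\,0)}{\max(\chi^2_B - df_B,\,0)},\qquad
\mathrm{TLI} = \frac{\chi^2_B/df_B - \chi^2_M/df_M}{\chi^2_B/df_B - 1}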

The results showed that the five-factor EAI model (transparency, non-maleficence, privacy, responsibility, and fairness) fit the data better than the alternative four-factor, three-factor, two-factor, and one-factor models (Model 1: χ²(104) = 404, χ²/df = 3.88, CFI = 0.97, TLI = 0.96, RMSEA = 0.07, SRMR = 0.04). The model comparisons showed that the five-factor model fit significantly better than the four-factor model (∆χ²(4) = 1423, p < .001). These findings support the idea that the EAI scale measures five separate constructs. However, CFA results alone may not provide sufficient evidence for the internal structure of the scale. Therefore, the internal consistency and convergent validity of each factor were evaluated with additional analyses: composite reliability (CR) and average variance extracted (AVE) values were calculated for each factor. Table 4 shows the CR and AVE values together with the average shared squared variance (ASV) and maximum shared squared variance (MSV) for each factor.

Table 4.

Means, standard deviations, CR, AVE, ASV, and MSV

Dimensions M SD CR AVE ASV MSV 1 2 3 4 5
Transparency 3.53 1.23 0.94 0.78 0.18 0.30 (0.89)
Non-maleficence 3.63 0.93 0.91 0.71 0.17 0.30 0.56 (0.84)
Privacy 3.86 0.91 0.91 0.77 0.14 0.35 0.11 0.18 (0.88)
Responsibility 3.87 0.90 0.91 0.78 0.12 0.35 0.03 0.02 0.35 (0.88)
Fairness 3.62 1.09 0.89 0.74 0.10 0.30 0.30 0.04 0.09 0.29 (0.86)

Note. N = 656. ASV = average shared squared variance; MSV = maximum shared squared variance. The square root of each construct's AVE is shown in parentheses on the diagonal.

CR values should be above 0.70, AVE values above 0.50, and MSV and ASV less than AVE [20, 21]. The analysis showed that all CR values were above 0.70 and all AVE values above 0.50. These results confirm the reliability and internal consistency of the EAI model.
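Both statistics are simple functions of the standardized factor loadings: CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / k. A short sketch with hypothetical loadings (not the study's actual estimates):

import numpy as np

def cr_ave(loadings):
    """Composite reliability and average variance extracted from
    standardized loadings of one factor."""
    lam = np.asarray(loadings, dtype=float)
    error_var = 1.0 - lam**2                 # standardized item error variances
    cr = lam.sum()**2 / (lam.sum()**2 + error_var.sum())
    ave = (lam**2).mean()
    return cr, ave

# Example with hypothetical loadings for a four-item factor:
cr, ave = cr_ave([0.90, 0.88, 0.85, 0.82])
print(f"CR = {cr:.2f} (> 0.70), AVE = {ave:.2f} (> 0.50)")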

Discriminant validity of the dimensions of the EAI scale was then assessed using the criteria of Fornell and Larcker [30]. Under this criterion, the square root of each construct's AVE should be higher than that construct's correlations with the other constructs [30]. The square root of the AVE of each construct is given in parentheses in Table 4; inspection shows that these values are larger than the corresponding inter-construct correlation coefficients. As a complementary check, an HTMT (heterotrait-monotrait ratio) value below 0.90 is commonly taken to indicate adequate discriminant validity between similar constructs, and none of the HTMT values exceeded this threshold. Together, the Fornell-Larcker and HTMT analyses confirm that the proposed model meets the discriminant validity criteria.
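A minimal sketch of both checks, reusing the hypothetical SUBSCALES mapping from the scoring example above (construct and item names are assumptions for illustration):

import numpy as np
import pandas as pd

def fornell_larcker_ok(ave: dict, construct_corr: pd.DataFrame) -> bool:
    """sqrt(AVE) of each construct must exceed its correlations with all others."""
    for c in construct_corr.columns:
        others = construct_corr[c].drop(c).abs()
        if np.sqrt(ave[c]) <= others.max():
            return False
    return True

def htmt(items: pd.DataFrame, set_a: list, set_b: list) -> float:
    """Heterotrait-monotrait ratio for two constructs' item sets."""
    r = items.corr().abs()
    hetero = r.loc[set_a, set_b].to_numpy().mean()
    mono_a = r.loc[set_a, set_a].to_numpy()[np.triu_indices(len(set_a), k=1)].mean()
    mono_b = r.loc[set_b, set_b].to_numpy()[np.triu_indices(len(set_b), k=1)].mean()
    return hetero / np.sqrt(mono_a * mono_b)   # < 0.90 supports discriminant validity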

The reliability (internal consistency) of the EAI scale was assessed with Cronbach's alpha, the technique most commonly used in the social sciences. As shown in Table 5, the Cronbach's alpha values for the factors of the scale ranged between 0.862 and 0.951.

Table 5.

Reliability of the five dimensions in EAI

Dimension No. of items n Corrected item-total correlation Cronbach's alpha
Transparency 4 656 0.835-0.917 0.951
Non-maleficence 4 656 0.582-0.869 0.902
Privacy 3 656 0.735-0.801 0.884
Responsibility 3 656 0.695-0.762 0.862
Fairness 3 656 0.734-0.810 0.888

Inter-item correlation values were also analyzed to determine the relationships between the scale items; these values indicate the extent to which scores on one item relate to scores on all other items in the scale. As Table 5 shows, the corrected item-total correlations ranged from 0.582 to 0.917. High Cronbach's alpha and item-total correlation values mean that the internal consistency reliability of the EAI scale is high [31].
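Both statistics in Table 5 can be reproduced from raw item scores. A minimal sketch, assuming a DataFrame holding one subscale's items, using the standard formula α = k/(k−1) · (1 − Σσ²_item / σ²_total):

import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of items (columns = items, rows = respondents)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of the remaining items."""
    return pd.Series(
        {col: items[col].corr(items.drop(columns=col).sum(axis=1))
         for col in items.columns}
    )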

Equivalence of measurement

The measurement equivalence of the five-factor EAI scale in terms of male and female groups was tested using the AMOS 24 program. The results are reported in Table 6.

Table 6.

Measurement equivalence results

Model χ²(df) χ²/df CFI TLI RMSEA SRMR ∆χ² (∆df) ∆CFI
1. Unconstrained 564 (208) 2.71 0.96 0.95 0.05 0.04 - -
2. Factor loadings 581 (220) 2.64 0.96 0.95 0.05 0.04 17 (12) 0.00
3. Intercepts 598 (237) 2.52 0.96 0.95 0.05 0.04 17 (17) 0.00
4. Structural covariances 613 (252) 2.43 0.96 0.96 0.05 0.05 15 (15) 0.00
5. Measurement residuals 693 (274) 2.53 0.96 0.95 0.05 0.05 80 (22) 0.00

Measurement equivalence was tested in five stages. In the first stage, structural (configural) equivalence was tested with the unconstrained model, in which no parameters were constrained to be equal. The goodness-of-fit values of this model (χ²/df = 2.71, CFI = 0.96, TLI = 0.95, RMSEA = 0.05, SRMR = 0.04) met the acceptable thresholds, so the scale can be said to have structural equivalence. In the second stage, metric equivalence was tested by constraining the factor loadings of the items to be equal across groups and comparing the resulting multi-group CFA with the structural model. For this comparison, the CFI difference test is used, and the CFI difference (∆CFI) between the compared models should be less than 0.01 [32]. As Table 6 shows, the ∆CFI between the structural and metric models is less than 0.01, indicating metric equivalence; the factor loadings of the scale items are therefore equivalent across groups. In the third stage, scalar invariance was tested by constraining the intercepts of the scale items to be equal across groups and comparing the resulting CFI value with that of the metric model; the comparison showed that the scalar and metric models were equivalent (∆CFI < 0.01). In the next stage, the structural covariance model was compared with the scalar model, and the results confirmed their equivalence (∆CFI < 0.01). Finally, strict equivalence (residual invariance) was tested by comparing the measurement residuals model with the structural covariance model; these two models were also equivalent (∆CFI < 0.01). Based on these findings, the EAI scale can be said to have measurement equivalence across gender.
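The decision rule applied at each step reduces to a small computation. A sketch of CFI and the ∆CFI check follows; the baseline chi-square used in the example is illustrative, not the value from the analysis:

def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Comparative fit index from model (M) and baseline (B) chi-squares."""
    return 1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, 0)

def invariance_holds(cfi_less_constrained: float,
                     cfi_more_constrained: float) -> bool:
    """∆CFI rule: a drop of less than 0.01 supports the added constraints."""
    return (cfi_less_constrained - cfi_more_constrained) < 0.01

# Example with an assumed baseline model (chi2_b = 9000, df_b = 272):
cfi_unconstrained = cfi(564, 208, 9000, 272)
cfi_metric = cfi(581, 220, 9000, 272)
print(invariance_holds(cfi_unconstrained, cfi_metric))  # True if ∆CFI < 0.01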

Discussion

In this study, the attitudes towards EAI scale developed by Jang et al. (2022) [17] was adapted into Turkish, and its psychometric reliability and validity were examined. The findings showed that the five-factor structure of the scale (transparency, harmlessness, privacy, responsibility, and fairness) is maintained in the Turkish context. Both EFA and CFA revealed that the original measurement model fit well in the Turkish sample, which makes an important contribution to the cross-cultural validity of the scale. Scale adaptation studies in nursing and other health sciences in recent years have likewise produced reliable results using similar methodological approaches and statistical techniques. In this literature, consistently following the methodological steps (translation-back translation, expert opinion, pilot application, and factor analyses) plays a critical role in the validity and reliability of cultural adaptations.

The fact that the sub-dimensions (transparency, harmlessness, privacy, responsibility, fairness) of the Turkish version of the EAI scale exhibit patterns similar to the original indicates that the basic dimensions of AI ethics contain universal elements regardless of cultural context. Moreover, as rapid technological development, the increasing use of AI applications in health care, and AI-supported decision-support systems transform the future of health services [33, 34], using this scale with healthcare professionals and students will also help measure the ethical requirements of the field. Awareness is growing in nursing, medicine, and other health disciplines of the effects of AI on clinical decision-making, patient care quality, accountability, and transparency [35], and on the reduction of inequalities and ethical risks [36]. At this point, measuring individuals' ethical attitudes towards AI applications with the EAI scale can guide the organization of educational curricula and the shaping of institutional policies.

High Cronbach's alpha (α) values and item-total correlation coefficients supported the internal consistency reliability of the Turkish version of the EAI scale [31]. The subscale alpha values of the Turkish EAI scale (α = 0.862-0.951) are consistent with those of the original (α = 0.753-0.865) [17], and they exceed the acceptable threshold (α > 0.600) that indicates satisfactory internal consistency [37]. The CR and AVE values also exceeded their acceptable thresholds [30]: in the original form of the scale, the CR across all seventeen items was 0.756 and the AVE was 0.516, while in this study the corresponding values were 0.896 and 0.707. Since these CR and AVE values exceeded the acceptable thresholds, adequate convergent validity was confirmed [38]. Regarding the EFA, the KMO value of the original form of the scale was 0.848 and Bartlett's test of sphericity was significant (χ² = 3093.183, df = 153, p < .001), indicating that the data were suitable for factor analysis [17]. In our study, the KMO value was 0.787 and Bartlett's test of sphericity was significant (χ² = 7477.6, df = 136, p < .001); because the KMO value was above 0.60 and Bartlett's test was significant at p < .05, factor analysis could be performed [37, 39]. These findings show that the Turkish form of the scale is a valid and reliable measurement tool.

In addition, the fact that the participants were nursing students suggests that the results may have practical benefit in the health disciplines. The ethical dimensions of AI are critical for future healthcare professionals, especially regarding patient rights, privacy, data management, patient autonomy, confidentiality, transparency, trust, responsibility, data quality, and the role of technology in sharing responsibility [40, 41, 42, 43, 44].

In addition to confirming the validity of the Turkish version of the EAI scale, it is also valuable to consider how this adaptation aligns with similar efforts in other cultural contexts. In the original study by Jang et al. (2022) [17], the EAI scale demonstrated consistent psychometric properties in the South Korean cultural context. Likewise, the AI Ethical Reflection Scale (AIERS), developed by Wang et al. (2025) [45], was validated in an education-focused psychometric study on a sample of 730 university students; CFA supported its three-dimensional structure, providing internal structure-based validity, while discriminant validity was indicated by the weak positive correlation between awareness and AI literacy. That study provides a strong methodological example of the cultural adaptability of AI-related scales in educational contexts.

In this context, the study by Moreira-Choez et al. (2025) [46], which validates an educational model through AI algorithms in Ecuadorian higher education, presents a valuable methodological parallel. Their work demonstrates that AI-supported tools can be effectively employed for instrument validation and educational assessment in culturally diverse settings, reinforcing the methodological rigor and cross-cultural applicability seen in this study's adaptation of the EAI scale.

The current research contributes to this growing cross-cultural literature by offering evidence from the Turkish health education context. The fact that the Turkish version preserved the original factor structure supports the view that the basic ethical concerns surrounding AI in healthcare have universal resonance, while also highlighting the necessity for cultural sensitivity in ethical measurement tools. Additionally, the work of Moreira-Choez et al. (2024) [47] emphasizes the relevance of assessing digital competencies in university faculty members within the broader framework of AI ethics. Their multimodal approach highlights the importance of integrating ethical reasoning with digital skill sets, thereby offering a complementary theoretical basis for employing the EAI scale in university-level training programs.

Moreover, the practical utility of the EAI scale in health education and training can be enhanced through targeted strategies aimed at both learners and professionals. For educators, the scale may be integrated into nursing and medical ethics curricula as both an evaluative and pedagogical tool. For example, students’ responses to subdimensions such as transparency or privacy could inform case-based discussions or simulation exercises involving AI-assisted clinical scenarios. The EAI scale may also guide the development of instructional modules that strengthen students’ awareness of algorithmic decision-making and the ethical challenges it entails.

For healthcare professionals, the instrument can serve as a diagnostic tool to assess ethical readiness and digital attitudes, thereby informing in-service training and continuous professional development. Institutional leaders and curriculum developers can leverage the results obtained through the scale to align teaching content with evolving ethical requirements in AI-enhanced healthcare systems. Given the growing reliance on decision-support systems powered by AI, especially in clinical diagnostics, patient monitoring, and administrative workflows, the ability to measure and shape ethical perspectives through tools like the EAI becomes increasingly crucial.

In these ways, the EAI scale not only contributes to empirical research but also facilitates ethically informed practices in rapidly evolving AI-integrated healthcare environments. Its application can serve as a foundation for interdisciplinary dialogue, curriculum innovation, and policy development that prioritizes both technological advancement and human-centered care.

The fact that the students in the pilot study, which evaluated the applicability and comprehensibility of the scale, reported that they could understand the scale items also supports the usability of the scale in educational settings. However, considering the continuous development of AI technologies, it is important to test the scale in the future with students or health professionals in other departments (e.g., medicine, dentistry, physiotherapy). This will clarify the generalizability of the measurement tool and whether ethical attitudes differ across disciplines. In addition, cultural norms, different attitudes towards professional ethics, and digital literacy levels may create variance in the sub-dimensions of the scale. For this reason, follow-up studies involving nursing students and large participant groups from different cultures with different levels of professional experience are needed.

In conclusion, this study confirmed the theoretical integrity and psychometric properties of the Turkish version of the EAI scale. The scale, which enables a holistic evaluation of ethical attitudes towards AI in the field of health, can be widely used in education and future research. The findings are expected to provide guidance on ethical decision-making and the protection of professional standards, especially in health services where AI technologies are becoming increasingly diverse and intensive.

Limitations

The fact that the participants were nursing students studying at a single university limits the applicability of the findings to other universities, departments, or professional groups. In addition, the sample size, the participants' varying levels of technology use and of AI knowledge, awareness, and utilization, and the educational competencies of nursing students in these areas may have affected the interpretation of the results. In particular, the fact that almost half of the nursing students reported a moderate level of technology use and almost half lacked knowledge and experience of AI raises questions about their understanding of EAI. These limitations may affect the reproducibility of the scale's factor structure, because differing levels of knowledge, awareness, and experience may lead to inconsistencies in nursing students' responses when evaluating EAI. Although the validity and reliability analyses of the EAI scale were conducted comprehensively in this study, test-retest reliability, which concerns the temporal stability of the scale, was not evaluated; this constitutes a limitation on the external validity of the scale. In future studies, the long-term usability of the scale can be supported more strongly by determining, through test-retest analyses, the degree to which measurement results remain consistent over time. The sample size used in this study (N = 656) is sufficient to support both EFA and CFA. However, it is frequently noted in the literature that splitting the sample into two subgroups (e.g., n ≈ 328) may provide insufficient statistical power for analyses requiring structural equation modeling, such as CFA [21, 29]. Furthermore, splitting the sample can lead to parameter deviations that could compromise the structural integrity of the subdimensions [24]. Finally, under today's conditions, where cultural and technological factors can change rapidly, it will be important to retest the EAI scale in different periods and samples (e.g., different professional groups) to keep the validity and reliability findings up to date.

Conclusion

In this study, the EAI scale, developed to measure the ethical dimensions of AI applications, was adapted into Turkish, and its validity and reliability were examined. The results revealed that the five-dimensional structure of the scale (transparency, harmlessness, privacy, responsibility, and fairness) showed a high goodness of fit in the Turkish sample. Both the psychometric indicators and the statistical analyses (EFA, CFA, CR, AVE, ASV, MSV) demonstrated that the scale is scientifically usable. In particular, the exploratory factor analysis performed with the FACTOR program and the measurement invariance tests across gender confirmed the robustness and generalizability of the scale structure. These findings indicate that the EAI is a reliable tool for assessing nursing students' ethical perceptions of AI. Accordingly, the EAI scale is expected to be a valuable tool for evaluating the ethical attitudes of individuals in health, education, and other disciplines in which AI technologies are increasingly involved. Moreover, recent studies emphasize that AI-supported scale development and the integration of digital competencies and ethical awareness in education can further enhance the relevance and applicability of such tools [46, 47]. Examining whether the scale maintains the same measurement properties in different demographic and cultural environments will enable comparisons with both national and international literature and thus continue to contribute to the field of AI ethics. Finally, all processes, analyses, and findings of the study are summarized in Table 7.

Table 7.

Artificial intelligence ethics scale (EAI) – Turkish adaptation process

Stage 1. Preparation & permission
 Description: identification of the target group (nursing students); literature review on AI ethics; permission obtained from the original authors
 Method/tools: literature synthesis; author correspondence
 Output: study plan and ethical permissions completed

Stage 2. Translation process
 Description: forward translation by two bilingual experts; back translation by two different experts; review by an expert panel
 Method/tools: Brislin's translation-back translation model; expert review
 Output: draft Turkish version (v1.0)

Stage 3. Content validity
 Description: evaluation by 2 linguists and 3 social science experts for cultural and semantic relevance
 Method/tools: content validity check; conceptual and semantic adjustments
 Output: revised Turkish version (v2.0)

Stage 4. Pilot study
 Description: pilot tested with 40 health management students; evaluation of clarity and appropriateness
 Method/tools: descriptive statistics; qualitative feedback
 Output: final Turkish version (v3.0)

Stage 5. Construct validity (EFA)
 Description: EFA confirmed the five-factor structure; total variance explained: 86.76%; KMO = 0.787, Bartlett's p < .001
 Method/tools: FACTOR software; maximum likelihood with PROMIN rotation; parallel analysis
 Output: five-factor structure statistically supported

Stage 6. Confirmatory analysis (CFA)
 Description: five-factor model tested against alternatives; fit indices: CFI = 0.97, TLI = 0.96, RMSEA = 0.07, SRMR = 0.04
 Method/tools: AMOS 24; structural equation modeling; model comparison
 Output: five-factor model validated

Stage 7. Reliability testing
 Description: Cronbach's α = 0.862–0.951; CR > 0.70, AVE > 0.50; MSV < AVE, ASV < AVE
 Method/tools: internal consistency analysis; CR, AVE, MSV, ASV; HTMT evaluation
 Output: high reliability and discriminant validity confirmed

Stage 8. Measurement invariance
 Description: measurement invariance tested across gender (five steps); ∆CFI < 0.01 at all levels
 Method/tools: multi-group CFA; invariance tests (loadings, intercepts, covariances, residuals)
 Output: measurement invariance confirmed

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (17.7KB, docx)

Acknowledgements

We thank all the students who participated in the study voluntarily and gave their time.


Author contributions

FOA: Conceptualization, Methodology, Investigation, Writing – original draft, Writing – review & editing. ST: Conceptualization, Data curation, Writing – review & editing, Project administration. DK: Conceptualization, Investigation, Software. MB: Conceptualization, Investigation.

Funding

This research did not receive any funding or special financial support from public, private, or non-profit organizations.

Data availability

The corresponding author can provide the datasets created and analysed for this study upon reasonable request.

Declarations

Ethics approval and consent to participate

Erzincan Binali Yıldırım University's Ethics Committee (Approval Number: E-88012460-050.04-415726) approved the study. All stages of the study were conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all individual participants included in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines. Nat Mach Intell. 2019;1(9):389–99. 10.1038/s42256-019-0088-2. [Google Scholar]
  • 2.Morley J, Cowls J, Taddeo M, Floridi L. Ethical guidelines for COVID-19 tracing apps. Nature. 2020;582(7810):29–31. 10.1038/d41586-020-01578-0. [DOI] [PubMed] [Google Scholar]
  • 3.Floridi L. Translating principles into practices of digital ethics: five risks of being unethical. Philos Technol. 2019;32(2):185–93. 10.1007/s13347-019-00354-x. [Google Scholar]
  • 4.Rodríguez N, Ser J, Coeckelbergh M, De Prado M, Herrera-Viedma E, Herrera F. Connecting the Dots in trustworthy artificial intelligence: from AI principles, ethics, and key requirements to responsible AI systems and regulation. Inf Fusion. 2023;99:101896. 10.48550/arXiv.2305.02231. [Google Scholar]
  • 5.Dignum V. Ethics in artificial intelligence: introduction to the special issue. Ethics Inf Technol. 2018;20:1–3. 10.1007/s10676-018-9450-z. [Google Scholar]
  • 6.Doshi-Velez F, Kim B. Towards a rigorous science of interpretable machine learning. ArXiv Preprint. 2017. 10.48550/arXiv.1702.08608. arXiv:1702.08608. [Google Scholar]
  • 7.Challen R, Danon L. Clinical decision-making and algorithmic inequality. BMJ Qual Saf. 2023;32:495–7. 10.1136/bmjqs-2022-015874. [DOI] [PubMed] [Google Scholar]
  • 8.Vlasceanu M, Amodio D. Propagation of societal gender inequality by internet search algorithms. Proc Natl Acad Sci USA. 2022;119. 10.1073/pnas.2204529119. [DOI] [PMC free article] [PubMed]
  • 9.Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A. A survey on bias and fairness in machine learning. ACM Comput Surv (CSUR). 2021;54(6):1–35. 10.1145/3457607. [Google Scholar]
  • 10.Li S, Faure M, Havu K. Liability rules for AI-Related harm: law and economics lessons for a European approach. Eur J Risk Regul. 2022;13:618–34. 10.1017/err.2022.26. [Google Scholar]
  • 11.La Diega G, Bezerra L. Can there be responsible AI without AI liability? Incentivising generative AI safety through ex-post tort liability under the EU AI liability directive. Int J Law Inform Technol. 2024. 10.1093/ijlit/eaae021. [Google Scholar]
  • 12.Bottomley D, Thaldar D. Liability for harm caused by AI in healthcare: an overview of the core legal concepts. Front Pharmacol. 2023;14. 10.3389/fphar.2023.1297353. [DOI] [PMC free article] [PubMed]
  • 13.Bajgar O, Horenovsky J. Negative human rights as a basis for long-term AI safety and regulation. J Artif Intell Res. 2022. 10.1613/jair.1.14020. [Google Scholar]
  • 14.Salhab W, Ameyed D, Jaafar F, Mcheick H. A systematic literature review on AI safety: identifying trends, challenges, and future directions. IEEE Access. 2024;12:131762–84. 10.1109/ACCESS.2024.3440647. [Google Scholar]
  • 15.EU A. European Commission white paper on artificial intelligence: a European approach to excellence and trust. 2021. https://ec.europa.eu/info/sites/info/files/commissionwhitepaper-artificial-intelligence-feb2020_en.pdf.
  • 16.Helberger N, Diakopoulos N. The European AI act and how it matters for research into AI in media and journalism. Digit Journalism. 2022;11:1751–60. 10.1080/21670811.2022.2082505. [Google Scholar]
  • 17.Jang Y, Choi S, Kim H. Development and validation of an instrument to measure undergraduate students’ attitudes towards the ethics of artificial intelligence (AT-EAI) and analysis of its difference by gender and experience of AI education. Educ Inform Technol. 2022;27(8):11635–67. 10.1007/s10639-022-11086-5. [Google Scholar]
  • 18.Brislin RW, Lonner WJ, Thorndike RM. Cross-cultural research methods. John Wiley; 1973.
  • 19.Ato M, Lopez JJ, Benavente A. A classification system for research designs in psychology. Anales De Psicología. 2013;29(3):1038–59. [Google Scholar]
  • 20.Krejcie RV, Morgan DW. Determining sample size for research activities. Educ Psychol Meas. 1970;30(3):607–10. 10.1177/001316447003000308. [Google Scholar]
  • 21.Hair JF, Black WC, Babin BJ, Anderson RE. Multivariate data analysis. Pearson Education; 2010.
  • 22.Ferrando PJ, Anguiano-Carrasco C. Factor analysis as a research technique in psychology. Papeles Del Psicólogo. 2010;31(1):18–33. [Google Scholar]
  • 23.Izquierdo I, Olea J, Abad FJ. Exploratory factor analysis in validation studies: uses and recommendations. Psicothema. 2014;26(3):395–400. [DOI] [PubMed]
  • 24.Lloret S, Ferreres A, Hernández A, Tomás I. The exploratory factor analysis of items: guided analysis based on empirical data and software. Anales De Psicología. 2017;33(2):417–32. [Google Scholar]
  • 25.Lorenzo-Seva U, Ferrando PJ. FACTOR: a computer program to fit the exploratory factor analysis model. Behav Res Methods. 2006;38:88–91. 10.3758/BF03192753. [DOI] [PubMed] [Google Scholar]
  • 26.Lorenzo-Seva U, Ferrando PJ. FACTOR 9.2. A comprehensive program for fitting exploratory and semiconfirmatory factor analysis and IRT models. Appl Psychol Meas. 2013;37:497–8. 10.1177/0146621613487794. [Google Scholar]
  • 27.Lorenzo-Seva U. Promin: a method for oblique factor rotation. Multivar Behav Res. 1999;34:347–56. [Google Scholar]
  • 28.American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association; 2014. [Google Scholar]
  • 29.Kline RB. Principles and practice of structural equation modeling. 4th ed. Guilford Press; 2016.
  • 30.Fornell C, Larcker DF. Evaluating structural equation models with unobservable variables and measurement error. J Mark Res. 1981;18(1):39–50. 10.1177/002224378101800104. [Google Scholar]
  • 31.George D, Mallery P. SPSS for Windows step by step: a simple guide and reference. 11.0 update. 4th ed. Boston: Allyn & Bacon; 2003. [Google Scholar]
  • 32.Gürbüz S. Amos Ile Yapısal Eşitlik modellemesi. Ankara: Seçkin yayıncılık; 2019. [Google Scholar]
  • 33.Noorbakhsh-Sabet N, Zand R, Zhang Y, Abedi V. Artificial intelligence transforms the future of health care. Am J Med. 2019. 10.1016/j.amjmed.2019.01.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, Wang Y, Dong Q, Shen H, Wang Y. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2:230–43. 10.1136/svn-2017-000101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Lysaght T, Lim H, Xafis V, Ngiam K. AI-Assisted Decision-making in healthcare. Asian Bioeth Rev. 2019;11:299–314. 10.1007/s41649-019-00096-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Wiens J, Spector-Bagdady K, Mukherjee B. Toward realising the promise of AI in precision health across the spectrum of care. Annu Rev Genom Hum Genet. 2024. 10.1146/annurev-genom-010323-010230. [DOI] [PubMed] [Google Scholar]
  • 37.Gürbüz S, Şahin F. Research methods in social sciences. Ankara: Seçkin Publishing; 2014. p. 271. [Google Scholar]
  • 38.Gefen D, Straub D, Boudreau MC. Structural equation modelling and regression: guidelines for research practice. Commun Association Inform Syst. 2000;4(1):7. 10.17705/1CAIS.00407. [Google Scholar]
  • 39.Kalaycı Ş. SPSS applied multivariate statistical techniques. Volume 5. Türkiye: Asil Yayın Dağıtım; 2010. p. 359. [Google Scholar]
  • 40.Jeyaraman M, Balaji S, Jeyaraman N, Yadav S. Unraveling the ethical enigma: artificial intelligence in healthcare. Cureus. 2023;15. 10.7759/cureus.43262. [DOI] [PMC free article] [PubMed]
  • 41.Murdoch B. Privacy and artificial intelligence: challenges for protecting health information in a new era. BMC Med Ethics. 2021. 10.1186/s12910-021-00687-3. 22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Sheth S, Baker H, Prescher H, Strelzow J. Ethical considerations of artificial intelligence in health care: examining the role of generative pretrained Transformer-4. J Am Acad Orthop Surg. 2024. 10.5435/JAAOS-D-23-00787. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Kuwaiti A, Nazer K, Al-Reedy A, Al-Shehri S, Al-Muhanna A, Subbarayalu A, Muhanna D, Al-Muhanna F. A review of the role of artificial intelligence in healthcare. J Personalised Med. 2023;13. 10.3390/jpm13060951. [DOI] [PMC free article] [PubMed]
  • 44.Naik N, Hameed B, Shetty D, Swain D, Shah M, Paul R, Aggarwal K, Ibrahim S, Patil V, Smriti K, Shetty S, Rai B, Chłosta P, Somani B. Legal and ethical consideration in artificial intelligence in healthcare: who takes responsibility? Front Surg. 2022;9. 10.3389/fsurg.2022.862322. [DOI] [PMC free article] [PubMed]
  • 45.Wang Z, Chai CS, Li J, Lee VWY. Assessment of AI ethical reflection: the development and validation of the AI ethical reflection scale (AIERS) for university students. Int J Educational Technol High Educ. 2025;22(1):19. [Google Scholar]
  • 46.Moreira-Choez JS, Gómez Barzola KE, Lamus de Rodríguez TM, Sabando-García AR, Mendoza C, J. C., Cedeño Barcia LA. Validation of a teaching model instrument for university education in Ecuador through an artificial intelligence algorithm. Front Educ. 2025;10:1473524. 10.3389/feduc.2025.1473524. [Google Scholar]
  • 47.Moreira-Choez JS, Gómez Barzola KE, Lamus de Rodríguez TM, Sabando-García AR, Mendoza C, J. C., Cedeño Barcia LA. Assessment of digital competencies in higher education faculty: A multimodal approach within the framework of artificial intelligence. Front Educ. 2024;9:1425487. 10.3389/feduc.2024.1425487. [Google Scholar]
