Abstract
Background
Medical communication is a core task of physicians, and its quality affects both patients and physicians. Hence, the teaching and measurement of medical communication competence (MCC) are crucial during medical studies. This study aims to explore the factorial and construct validity of the Video-Based Assessment of Medical Communication Competence (VA-MeCo), an online situational judgement test (SJT) designed to measure three aspects of medical students’ communication competence in patient encounters: advancing content, providing structure, and building relationship.
Methods
We conducted an online survey with N = 395 medical students. Factorial validity was tested by confirmatory factor analysis (CFA). To investigate convergent and discriminant aspects of construct validity, we tested correlations of participants’ VA-MeCo sub-test scores with scores on relevant cognitive variables, patient-interaction variables, and general personality traits.
Results
The CFA confirmed the expected three-dimensional factorial structure, showing good model fit and highly correlated dimensions. A McDonald’s ω of .94 for the complete test and > .81 for the sub-scales indicated high reliability. Regarding construct validity, the directions of the correlations were in line with the theoretically assumed associations; correlation sizes partly deviated from our expectations.
Conclusions
The results support that MCC can be validly measured using the VA-MeCo. The CFA results align with theory and previous studies that proposed three distinguishable dimensions of MCC. The findings on convergent and discriminant validity demonstrate that the test measures MCC as a specific construct that is positively related to patient-interaction; moreover, it can be distinguished from related but more generic constructs (e.g., social competence) and is not overly confounded with construct-irrelevant traits (e.g., intelligence). Notwithstanding the need for further validation, the results indicate that the VA-MeCo is a useful test for assessing MCC in medical education.
Background
Patient encounters are a central and frequent task of physicians [1,2]. Effective communication between physicians and patients serves as the foundation for successful treatment and the provision of optimal care [3–6]. Research shows that high-quality medical communication is beneficial for both parties [7], fosters patients’ well-being, and supports recovery [3,8,9]. Therefore, training medical communication competence (MCC) has become an important—even obligatory—part of medical education in several countries, and both training and assessment methods are receiving increasing research attention [10–13].
Teaching MCC involves the challenge of measuring it as a learning outcome. Reliable and valid assessment tools are required [14–16] not only to gauge learners’ progress and provide them with feedback [16] but also to evaluate training interventions [17] and teaching methods [18]. Standardised patient encounters in objective structured clinical examinations (OSCEs) [19] or multiple mini-interviews [20] are frequently used and often considered the gold standard in the teaching and assessment of MCC [19,20]. However, they are particularly resource-intensive to design and implement. In addition, available MCC assessment instruments differ in focus and in the aspects of communication competence they capture, which limits comparability and standardisation. To address such challenges, there is growing interest in using situational judgement tests (SJTs) as assessment tools for a wide range of competences [21]. In the medical context, SJTs have successfully been developed to measure relevant aspects of communicative and social competences (e.g., empathy [22], communication-related factual and procedural knowledge [23,24]) as well as other characteristics (e.g., safety performance [25]). Previous studies have shown that carefully designed SJTs can yield good reliability and validity [26–28]. Compared with the above-mentioned methods, it has been argued that SJTs optimally balance efficient test implementation with validity [28,29].
Despite this growing interest, the number of SJTs developed specifically to assess MCC in medical education remains very limited, and most currently provide only preliminary evidence for their validity. The present study contributes to the further validation of the Video-Based Assessment of Medical Communication Competence (VA-MeCo) [30], a computer-administered SJT designed to measure basic aspects of MCC in medical students. While earlier research provided evidence of the VA-MeCo’s curricular and content validity [30], other important aspects, such as its factorial structure and (convergent and discriminant) construct validity, have not yet been evaluated. The present study aimed to contribute to closing this gap.
Below, we first elaborate on the SJT method and its use for assessing traits relevant to medical education. We then describe the VA-MeCo’s test design and the available evidence for its reliability and validity.
The SJT method and SJTs in medical education
SJTs were originally developed for personnel selection and have continuously gained acceptance as effective measures in other fields [28,29,31]. SJTs present examinees with multiple problem scenarios, which represent typical or critical work-related situations. For each problem scenario, examinees need to evaluate a set of specific response options related to the task at hand [28]. Participants’ answers can then be compared with a scoring key, which is often based on expert judgements [32]. Hence, SJTs assess individual abilities to provide knowledge-based and situation-specific judgements in authentic scenarios. That is, they can evaluate an individual’s ability to apply professional knowledge to analyse tasks in authentic scenarios, such as clinical situations, and to make judgements about available courses of action [27,33]. Research on competence assessment highlights that such situation-specific skills provide a crucial bridge between individuals’ basic dispositions for achievement (e.g., foundational knowledge) and their actual performance in real-world tasks [34]. For modelling the situational context, video-based problem scenarios have proven particularly useful [26,27,35]. In addition to conceptual benefits, such as enhancing authenticity, videos can foster face validity as well as participants’ test acceptance and motivation [36,37].
Because of their advantages, SJTs are increasingly used in medical education to measure a multitude of different constructs, such as professionalism [38,39], social skills [40], hygiene competence [41], and preclinical interprofessional collaboration [42]. The use of SJTs in medical education has been associated with favourable psychometric properties [24,43–45]. Moreover, the ease of administration, enhanced authenticity, and high acceptance among participants [22,27,46] make video-based SJTs a promising tool for measuring MCC in an online test setting, especially for large groups [35,43]. SJTs can therefore help bridge the gap between knowledge-based measures of MCC and behavioural assessments, such as OSCEs. By requiring judgements grounded in authentic scenarios, they indirectly capture procedural knowledge and attitudinal dispositions (e.g., empathy, respect, professionalism), thereby contributing to a more comprehensive evaluation of MCC [34].
The VA-MeCo test of medical communication competence
Conceptual basis and test structure.
The VA-MeCo is a video-based online SJT for assessing MCC in medical education with a theoretical basis in (medical) communication theory and established curricular standards of patient-physician communication [47–50]. The Calgary-Cambridge Guide [48] informed the selection and design of the tasks and the setting of quality standards for the test items. Moreover, the test structure draws upon the Munich Model of Professional Conversation Competence, an empirically validated model that has previously been used to guide curriculum and assessment development in medical education [49,50]. This model conceptualises professional communication competence as a hierarchical, multidimensional construct in which the top-level factor of general medical communication competence subsumes three correlated sub-dimensions: (i) facilitating joint problem solving, (ii) organising conversations in a proactive and transparent way, and (iii) fostering a strong working relationship. This dimensional structure has been empirically corroborated in several studies [50,51].
The VA-MeCo was designed to represent this theoretical structure and thus allow a differentiated assessment of MCC. The test measures MCC using three scales: (i) advancing the content of the conversation, (ii) structuring the conversation, and (iii) building a relationship with the patient. The advancing content scale is based on the joint problem solving dimension of the Munich Model of Professional Conversation Competence [49,50]. Both emphasise the importance of a shared understanding between patient and physician. However, the focus of the advancing content scale is less on problem solving and more on the progression of the conversation, as described below.
The advancing content scale assesses participants’ ability to advance the content level of the conversation effectively, for example by gathering information and by explaining and planning [48,52,53]. The providing structure scale addresses examinees’ ability to actively structure a conversation by guiding the patient through it in a comprehensible and systematic manner. This involves the systematic use of meta-communication, such as providing summaries and managing transitions [48,52,53]. Finally, the building relationship scale refers to the relational dimension by addressing participants’ ability to establish and maintain good working relationships with patients throughout the conversation. This includes aspects such as empathy and consideration of the patient’s needs [48,52,53]. Hence, at the basic level, the test delivers sub-scale scores on the three dimensions of MCC. These sub-scales can be aggregated into a total score representing the top-level construct of MCC. Therefore, depending on the intended purpose of the assessment, the VA-MeCo allows a global assessment of participants’ total achievement as well as detailed reports on their differential achievement across the three sub-dimensions (e.g., for providing feedback).
Test tasks and procedure.
The VA-MeCo’s test tasks have a standardised structure and focus on critical parts of physician–patient communication. S1 Fig in the online supplementary material shows an example. Each task begins with a written description of the scenario that provides background on the patient and the general situation. All scenarios focus on medical history taking during an initial encounter between a physician and a patient. This initial description is followed by a short video sequence of the encounter (~60 s) that stops at a critical point in the conversation. The respondents are then provided with a medical communication goal that the physician wants to achieve as well as 3–5 answer options featuring statements to continue the conversation.
The participants’ task is to assess the effectiveness of each answer option in achieving the stated communication goal on a 6-point rating scale (1 = very ineffective, 6 = very effective), with the answer options differing in their effectiveness. These ratings cover all three basic dimensions of MCC, so that participants judge each answer option in terms of its effectiveness in advancing the content, providing structure, and building or maintaining a positive relationship. Hence, the test has a nested structure comprising 11 patient encounter scenarios (“tasks”), each featuring 3–5 embedded answer options, and three ratings per option. These ratings constitute the test items (overall 117 items).
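To illustrate this nested structure, the following minimal sketch enumerates the resulting items; the per-task split of answer options is hypothetical, as the test specifies only that each of the 11 tasks has 3–5 options and that there are 117 items overall:

```python
# Hypothetical split of answer options across the 11 tasks (3-5 options each);
# any split summing to 39 options yields the reported 117 items.
options_per_task = [4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4]
dimensions = ["advancing content", "providing structure", "building relationship"]

# Each item is one rating: (task, answer option, MCC dimension).
items = [
    (task, option, dim)
    for task, n_options in enumerate(options_per_task, start=1)
    for option in range(1, n_options + 1)
    for dim in dimensions
]
print(len(items))  # 117 = 39 answer options x 3 ratings each
```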
Test scores are obtained by comparing participants’ effectiveness ratings with an expert-based key using so-called raw consensus scoring, a common method of scoring SJTs [54]. The scoring key was developed in a previous study with a sample of experts in medical communication training who demonstrated strong rater agreement [30]. Performance scores are calculated for each item by determining the squared difference between the participant’s answer and the respective expert rating [55]. The scores are then reversed so that higher values express higher levels of competence, and summed per sub-dimension.
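As a concrete illustration of this scoring procedure, here is a minimal sketch of raw consensus scoring with squared distances; the expert key and participant ratings are hypothetical, and the reversal constant of 25 (the maximum squared distance on a 6-point scale) is one plausible implementation rather than the VA-MeCo’s documented formula:

```python
import numpy as np

# Hypothetical expert key and one participant's ratings for four items of one
# sub-dimension, each on the 6-point effectiveness scale (1-6).
expert_key = np.array([5, 2, 6, 3])
participant = np.array([4, 2, 5, 5])

# Raw consensus scoring: squared distance from the expert rating per item
# (0 = perfect agreement with the experts).
squared_distance = (participant - expert_key) ** 2

# Reverse so that higher values express higher competence; 25 = (6 - 1)^2 is
# the largest possible squared distance on the 6-point scale (assumed here).
item_scores = 25 - squared_distance

# Sub-dimension score: sum of the reversed item scores.
print(item_scores.sum())  # 94 out of a possible 100 in this toy example
```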
Existing evidence on psychometric quality criteria.
Previous studies provided preliminary evidence on the VA-MeCo’s usability, reliability, and content validity based on two expert studies, cognitive pretesting, and a preliminary test evaluation with medical students (N = 117) [30]. Results suggested good reliability for the total test score (Cronbach’s α > .93) and for the sub-test scores on the three dimensions of MCC (α > .83). The two expert studies supported curricular and content validity by confirming the medical correctness of the content and its relevance for measuring medical students’ MCC. Finally, both expert and student data supported the high acceptance and perceived usability of the VA-MeCo as an assessment tool. Although these data provided initial supportive evidence, further psychometric evaluation is required.
Validation strategy of the present study
General approach.
This study aimed to gain additional evidence on the VA-MeCo’s psychometric properties, focusing on (a) factorial validity and reliability and (b) convergent and discriminant construct validity. According to the standards for educational and psychological testing [56], collecting evidence on different validity aspects is essential to argue for the interpretability of test scores. Evaluating these validity aspects strengthens the available evidence that the VA-MeCo measures empirically discernible and theoretically relevant dimensions of MCC.
- a) Factorial validity: We sought to test whether the empirical structure of the test scores reflects the theoretically assumed three-dimensional structure of MCC. Given these a-priori assumptions, confirmatory factor analysis (CFA) was the appropriate method for evaluating factorial validity [57,58]. Because issues of dimensionality are closely related to reliability [59], we also examined whether the good test reliability found in a previous study [30] was replicated in the present study.
- b) Construct validity: To assess convergent and discriminant validity, we followed the classical construct validation approach, which involves examining a network of correlations between test scores and other variables [57,60]. According to this approach, construct validity is supported when the empirical correlations of the test scores match theoretical expectations. For convergent validity, we tested expected associations of the three VA-MeCo sub-scales (i.e., advancing content, providing structure, and building relationship) with similar measures and criterion-related external variables; for discriminant validity, we tested whether correlations with expectedly unrelated or distal measures were indeed low (see below). In these analyses, we focussed on the VA-MeCo’s sub-test scores rather than the overall MCC score to gain a more differentiated evaluation of convergent and discriminant validity.
Investigated measures and hypotheses for construct validation.
For investigating construct validity, we selected relevant validation variables by considering substantive research on MCC [3,6,49,61] on the one hand, and general research on the SJT method (e.g., [29,62]) on the other. Table 1 presents the selected variables, categorised into (i) domain-specific and generic cognitive variables (i.e., prior knowledge and experience; intelligence), (ii) variables relevant to patient-interaction (i.e., empathy; patient orientation), and (iii) general personality traits (i.e., social competence and the Big Five personality traits). To measure these constructs, we employed well-established existing instruments that have a sound psychometric basis (see Measures). The assumed correlations with the VA-MeCo’s sub-scales are also summarised in Table 1 and elaborated below.
Table 1. Summary of external variables, instruments, and hypotheses for evaluating construct validity in relation to VA-MeCo.
| Variable | Sub-scales | Hypotheses on correlations with the VA-MeCo |
|---|---|---|
| Cognitive variables | | |
| Prior knowledge and experience [63] | MCC-courses taken, study progress | Positive correlations (r > 0) |
| Intelligence (DESIGMA A+ [64]) | — | At maximum medium correlations (r ≤ .3) |
| Patient-interaction variables | | |
| Empathy (IRI [65]) | Perspective taking, empathic concern | Positive correlations with building relationship (r > 0); zero or negative correlations with advancing content and providing structure (r ≤ 0) |
| Patient orientation (PPOS-D12 [66]) | Sharing, caring | Positive correlations (r > 0) |
| General personality traits | | |
| Social competence (ISK-K [67]) | Social orientation, offensiveness, self-organisation, reflexibility | At maximum low positive correlations (r < .1) |
| Personality (BFI-K [68]) | Extraversion, agreeableness, conscientiousness, neuroticism, openness | At maximum low positive correlations (r < .1) |

DESIGMA A+ = Design a Matrix – Advanced+; IRI = Interpersonal Reactivity Index; PPOS-D12 = Patient Provider Orientation Scale – German version; ISK-K = Inventory of Social Competences – short version; BFI-K = Big Five Inventory – short version. Statements about the expected sizes of the correlations refer to Cohen’s [69] standards (r ~ .1 = low; ~ .3 = medium; ~ .5 = large). Note that we made predictions about the sizes of the correlations only when doing so could be based on sufficient evidence from prior research.
- a) Cognitive variables: Working on the VA-MeCo’s tasks requires knowledge-based judgement of the answer options in relation to the specified communication goal. Therefore, we assumed positive convergent correlations between the participants’ test scores and their prior knowledge and experience in medical communication as well as their study progress (i.e., semester). Moreover, we assumed at maximum medium-sized correlations with intelligence. While intelligence is a construct distal to MCC, meta-analytic evidence shows that performance on SJTs in general correlates low-to-moderately with intelligence, with an estimated mean population correlation of ρ = .32 [26]. Hence, correlations of the VA-MeCo’s sub-tests with intelligence of r ≤ .3 would provide evidence for discriminant validity.
- b) Patient-interaction variables: A good working relationship with the patient, as a central dimension of MCC, is a crucial aspect of patient-oriented communication. The concept of patient centredness/patient orientation implies an egalitarian doctor–patient relationship, that is, shared power and responsibility between both parties [70]. Another important aspect of patient-centred communication is an empathic response to the emotional needs of patients [71,72]. Hence, we assumed positive convergent correlations between empathy and the building relationship dimension, but zero or negative correlations with the advancing content and providing structure dimensions. For the association between test performance and patient orientation, we expected positive correlations.
- c) General personality traits: Both communication skills in general and MCC in particular are content- and domain-specific [73]. This implies that MCC is a more specific construct than general social competence, even though the two are certainly related. Social competence is defined as a person’s knowledge, skills and abilities to promote the quality of their own social behaviour [74]. Hence, social competence helps individuals achieve their goals in specific situations while maintaining social acceptance [74]. Given their different scopes, we assumed that low positive correlations of the VA-MeCo’s sub-scales with social competence would provide discriminant evidence that MCC is discernible from general social competence.
Communicative behaviour is also influenced by general personality traits, such as agreeableness, openness, and conscientiousness [75–77]. Moreover, there is evidence that the Big Five personality traits have a general impact on participants’ performance in SJTs, regardless of the specific trait being assessed. The meta-analysis by McDaniel et al. reported average correlations with conscientiousness (ρ = .27), emotional stability (ρ = .22), and agreeableness (ρ = .25), and smaller correlations with extraversion (ρ = .14) and openness to experience (ρ = .14) [26]. Hence, we assumed that at maximum low positive correlations of the VA-MeCo’s sub-tests with the Big Five personality traits would provide evidence for discriminant validity.
Methods
Participants
This study was implemented as an online survey with medical students covering all phases of medical education and was conducted between June 25, 2020, and December 1, 2020. The study was approved by the medical ethics commission of the Technical University of Munich [14/20S]. The students’ participation in the study was voluntary. They were informed in advance and explicitly provided written informed consent. Participation took approximately 1.5 hours on average, with an average of 45 minutes for completing the VA-MeCo. Only participants who provided informed consent, satisfied a minimum time-on-task criterion, and completed at least five tasks of the VA-MeCo were included in the final analysis. No other data exclusions were made. Note that due to the decentralised recruitment strategy, the rate of and reasons for non-participation could not be determined.
The final sample consisted of N = 395 medical students, of whom 70.33% were female, 29.04% male, and 0.63% diverse. This gender distribution aligns with official statistics for German medical students [78]. The sample covered all phases of German medical education, albeit with a focus on the pre-clinical and clinical phases, where medical communication courses predominantly take place (age: M = 23.47, SD = 3.86; semester of study: M = 5.77, SD = 3.55; stage of study: 41.12% pre-clinical, 51.60% clinical, 6.57% practical year, 0.71% other). The resulting sample size enabled testing correlations as low as r = .15 with more than 90% statistical power in the analyses on construct validity and is also sufficiently large for the CFAs [58].
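The stated power can be checked with the standard Fisher z approximation for correlation tests; a minimal sketch, assuming a one-sided test at α = .05 (consistent with the directional hypotheses in Table 1):

```python
from math import atanh, sqrt
from scipy.stats import norm

def power_pearson_r(r: float, n: int, alpha: float = .05, one_sided: bool = True) -> float:
    """Approximate power for testing H0: rho = 0 via the Fisher z transform."""
    z_crit = norm.ppf(1 - alpha if one_sided else 1 - alpha / 2)
    noncentrality = atanh(r) * sqrt(n - 3)  # expected test statistic under rho = r
    return norm.cdf(noncentrality - z_crit)

print(round(power_pearson_r(r=.15, n=395), 3))  # ~0.91, i.e., > 90% power
```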
Measures
All participants first completed the VA-MeCo, followed by the questionnaires and achievement tests related to the validation variables. The study was conducted individually online: participants received written instructions at the beginning and were then guided through the procedure by the survey software, without personal assistance. Instruments for the validation variables were chosen to align conceptually with our research objectives, focusing on well-established measures that have been psychometrically evaluated and supported by prior research. A short description of the selected instruments, along with their reliabilities, is provided in S1 Table. For more detailed information, readers are referred to the original publications and test documentations (see references in S1 Table).
Analysis
General statistical analyses were conducted using IBM SPSS 28; CFA modelling was performed in Mplus 8.3 [79] using robust full information maximum likelihood estimation. To reduce the CFA’s model complexity arising from the abovementioned nested test structure (i.e., ratings of answer options nested in test tasks), we aggregated items per task for each of the three dimensions. Thus, the CFAs represent the test structure at the task level. We proceeded by first estimating separate unidimensional CFA models for each MCC dimension (M1–M3) and then combining them in a three-dimensional model (M4). Because each task was rated on all three dimensions of MCC, this final model contained expected residual correlations of the same tasks across the three dimensions (for a path diagram, see S2 Fig). Reliability was assessed using McDonald’s ω, which is preferable to Cronbach’s α [59] and can be interpreted in the same manner. Construct validity was analysed using Pearson correlations.
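For reference, McDonald’s ω can be computed directly from the standardised loadings of a unidimensional CFA; a minimal sketch with purely illustrative loadings (the actual task-level loadings are reported in S2 Table):

```python
import numpy as np

def mcdonalds_omega(loadings: np.ndarray) -> float:
    """McDonald's omega for a unidimensional model with standardised loadings:
    (sum of loadings)^2 / [(sum of loadings)^2 + sum of residual variances]."""
    common = loadings.sum() ** 2
    residual = (1 - loadings ** 2).sum()
    return common / (common + residual)

# Illustrative standardised loadings for 11 task-level indicators (hypothetical)
loadings = np.array([.65, .58, .71, .62, .45, .68, .60, .66, .59, .63, .57])
print(round(mcdonalds_omega(loadings), 2))  # ~.87 for these example values
```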
Results
Factorial validity
The three separate unidimensional CFAs for the VA-MeCo sub-scales (M1–M3) resulted in good model fit, as evidenced by the χ²-tests and standard fit indices (Table 2). For the dimensions advancing content (M1) and building relationship (M3), the χ²-tests were non-significant despite the large sample size. All other fit indices were good. For the providing structure dimension (M2), the χ²-test was statistically significant, pointing to some degree of misfit; however, all other fit indices indicated good model fit.
Table 2. Model fit of the CFA models.
| Model | χ²-test | RMSEA [90% CI] | CFI | SRMR |
|---|---|---|---|---|
| Unidimensional models for the VA-MeCo sub-scales | | | | |
| M1 Content | χ²(44) = 56.802, p = .093 | .027 [.000, .046] | .971 | .045 |
| M2 Structure | χ²(44) = 64.690, p = .023 | .035 [.013, .052] | .967 | .050 |
| M3 Relationship | χ²(44) = 59.609, p = .058 | .030 [.000, .048] | .977 | .050 |
| Complete models of MCC including all VA-MeCo sub-scales | | | | |
| M4 Three-dimensional model | χ²(459) = 733.697, p < .001 | .039 [.034, .044] | .921 | .065 |
| M5 One-dimensional model | χ²(462) = 867.470, p < .001 | .047 [.042, .052] | .883 | .066 |
RMSEA = root mean square error of approximation; CFI = comparative fit index; SRMR = standardised root mean square residual; CI = confidence interval; MCC = medical communication competence.
Based on these results, we proceeded with the estimation of the combined three-dimensional model of MCC (M4). Again, this model showed an overall acceptable model fit despite the somewhat diminished value of the comparative fit index. The standardised factor loadings were of a substantial size (i.e., ~ .4 or larger; [59]), with only Task 5 having a relatively low loading on the content dimension (see S2 Table).
The results from M4 revealed substantial positive correlations among the three dimensions (r ≥ .8; S2 Table). Given these high correlations, one might argue that a one-dimensional model (i.e., a single general factor of MCC) would provide a more parsimonious representation of the data. To test this, we compared M4 to a one-dimensional model (M5), in which all tasks loaded onto a single MCC factor. For completeness, we also evaluated three two-dimensional models (M6–M8), each combining two sub-scales into a single factor (e.g., one factor for relationship and another one combining content and structure). The results of the χ2-difference tests indicated that M4 provided a significantly better fit than all alternative models (S3 Table). This was corroborated by information criteria that consistently showed M4 as the best-fitting model among the candidates. Therefore, we selected M4 as the final model.
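For illustration, the χ²-difference test between the nested models M5 and M4 can be reproduced from the values in Table 2; note that this naive computation ignores the scaling corrections (e.g., a Satorra-Bentler-type adjustment) that a robust-ML analysis such as the one reported here would normally apply:

```python
from scipy.stats import chi2

# Chi-square statistics and degrees of freedom as reported in Table 2
chi2_m4, df_m4 = 733.697, 459  # M4: three-dimensional model
chi2_m5, df_m5 = 867.470, 462  # M5: one-dimensional model

# Naive chi-square difference test for the nested comparison (see caveat above)
delta_chi2 = chi2_m5 - chi2_m4  # 133.773
delta_df = df_m5 - df_m4        # 3
p_value = chi2.sf(delta_chi2, delta_df)
print(f"delta chi2({delta_df}) = {delta_chi2:.3f}, p = {p_value:.2e}")
# p << .001: the three-dimensional model fits significantly better
```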
In the final step, we analysed the reliabilities of the test’s sub-scores and total score, which proved to be high throughout (total score: ω = .94; advancing content: ω = .82; providing structure: ω = .88; building relationship: ω = .88).
Construct validity
The Pearson correlations of the investigated measures with the three dimensions of MCC are displayed as a heatmap in Fig 1. Regarding the cognitive variables, there were significant positive correlations of all VA-MeCo sub-scales with prior knowledge and experience in the field of medical communication, as well as with study progress, in line with our hypothesis. Moreover, we found low positive correlations with intelligence, which supports our expectations.
Fig 1. Heatmap for Pearson correlations of the measures with the three dimensions of MCC.
The correlations in boldface are in line with the stated hypotheses (see Table 1). * = p ≤ .05 (two-tailed). ** = p ≤ .01 (two-tailed).
Regarding the patient-interaction variables, as expected, both sub-scales contained in the empathy measure correlated more strongly with the VA-MeCo’s relationship sub-scale than with its content and structure sub-scales. In addition, corroborating our hypothesis, there were significant positive correlations of all VA-MeCo sub-scales with the sharing and caring scales contained in the patient orientation instrument. However, these correlations were somewhat below the expected value of .30 for the content dimension.
Concerning the general personality traits, the correlations for the four sub-scales of the ISK-K as a measure of social competence ranged from low negative values for offensiveness and reflexibility to low positive values for social orientation and self-organisation. These findings are partially in line with our hypothesis. For social orientation, the correlations were small and positive, as expected, although the correlation with building relationship was marginally higher than those with the other VA-MeCo sub-scales. For reflexibility, the correlations were even lower and not significantly different from zero. The correlations with self-organisation were somewhat higher than expected, particularly for building relationship. Finally, contrary to expectations, the correlations with offensiveness were small and negative, although not significantly different from zero.
For the five sub-scales of the Big Five personality traits, we found correlations close to zero for extraversion and neuroticism. For agreeableness, conscientiousness, and openness, there were low positive correlations for each of the three dimensions, especially for the building relationship dimension. As with social competence, these findings are consistent with our hypotheses in terms of the sign of the associations, although partly not in terms of size.
Discussion
This study aimed to extend the existing validity evidence for the VA-MeCo, a video-based SJT of MCC [30]. To this end, we first focused on factorial validity and investigated whether the empirical structure of the test scores reflects the theoretically assumed one and whether the resulting scales are sufficiently reliable. Second, we investigated construct validity by testing a network of assumed associations between the VA-MeCo sub-scales and a set of convergent and discriminant measures, following the classical construct validation approach [56,57].
Regarding factorial validity, the CFA results corroborated that the test adequately reflects the underlying theoretical model of MCC and aligns with previous findings on its dimensional structure [49,50]. Our results also replicated preliminary evidence of the VA-MeCo’s strong reliability [30]. Although the high inter-factor correlations might be seen as a limitation, they are consistent with the hierarchical structure of MCC as described in the Munich Model of Professional Conversation Competence, where the three dimensions form a higher-order construct of general MCC [50]. We deliberately chose not to test a hierarchical CFA model with a second-order MCC factor because a second-order factor model with three first-order factors is statistically equivalent to a model with three correlated factors (cf. M4) [58]. Since such models cannot be distinguished statistically, estimating an additional hierarchical model would not have provided meaningful insights. Moreover, the comparisons with the alternative models (M5–M8) provide further evidence that the three dimensions are not only theoretically but also empirically discernible aspects of MCC. Thus, test result interpretation can focus on both the overall MCC score as a comprehensive measure and the respective sub-scale scores. The latter may be preferable, for instance, for providing detailed feedback to learners [80]. Although the high inter-factor correlations suggest that the dimensions are closely related, they do not imply redundancy. Rather, each sub-scale captures a distinct yet interconnected facet of MCC, allowing for nuanced feedback that can highlight specific strengths and areas for improvement [38].
Regarding construct validity, we tested hypothesised correlations with measures of (a) cognitive variables as well as (b) patient-interaction variables and (c) more general personality traits. For the cognitive variables, significant positive correlations with prior knowledge and study progress provided convergent evidence that the VA-MeCo differentiates between groups with different levels of MCC. Overall, correlations with prior knowledge were notably smaller than those with study progress. Although the questions assessing these variables were based on prior research, we acknowledge that they are most likely imperfect proxies for students’ actual prior knowledge. Regarding discriminant validity, the identified low positive correlations with intelligence suggest that test performance does not depend excessively on intelligence. As discussed above, meta-analytic evidence indicates that SJTs exhibit, on average, medium-sized correlations with intelligence [26]. The observed correlations between the VA-MeCo’s sub-scales and intelligence were lower than expected based on this research. Therefore, we conclude that, although test performance in the VA-MeCo depends on intelligence to some degree—as is likely the case with any standardised achievement test—the measurement of MCC is clearly distinguishable from intelligence.
In terms of patient-interaction variables, significant correlations with empathy and patient orientation—especially regarding the building relationship dimension—were found, as expected. This outcome provides evidence of convergent validity. These correlations align with earlier research emphasising that a crucial aspect of patient-centred communication is displaying empathy in response to the emotional needs of patients [70,81]. Furthermore, these positive correlations reflect that building a good working relationship with the patient is a vital aspect of patient-centredness [70,82]. They are also consistent with findings on the importance of an egalitarian doctor–patient relationship, which focuses on the patient, for facilitating effective conversations between patients and physicians [70].
With respect to general personality traits, some of the observed correlations with social competence differed to some degree from our expectations. We had assumed at maximum small positive correlations (i.e., r ≤ .1). Results showed that the correlation between social orientation and the building relationship sub-scale was slightly higher than expected but could still be considered essentially small. This finding might be explained by the fact that people with a high social orientation generally possess a more positive attitude towards others and, thus, are better at empathising and socially connecting with others [67]. Additionally, contrary to expectations, small negative correlations were observed between offensiveness and all three VA-MeCo sub-scales. However, these correlations did not differ significantly from zero and thus pose no threat to discriminant validity. One potential reason for the descriptively negative correlations may be that people with high offensiveness effectively advocate for their own interests, which is not in line with a patient-centred orientation [67]. Finally, the correlations between self-organisation and all three dimensions of MCC were slightly higher than expected. However, the differences were negligible for the content and structure sub-scales and still small for the building relationship sub-scale (i.e., Δr = .164 between the expected and the observed correlation). These somewhat higher correlations might be explained by the fact that people with high self-organisation tend to maintain emotional balance, act calmly and with control, and adapt flexibly to changing conditions. All of these factors are important when conducting a physician-patient conversation. In summary, we believe that the discussed deviations from our expectations are of marginal size and, thus, pose no substantial concern about discriminant validity against social competence.
Finally, the pattern of findings largely corroborated our expectations for the correlations with the Big Five personality traits. As a slight deviation, some of the correlations of conscientiousness and agreeableness with the VA-MeCo were marginally higher than r = .1; however, they can still be judged as essentially low correlations in terms of Cohen’s [69] standards. As mentioned previously, the Big Five often show medium correlations with SJT performance [26]. The correlations found in this study were smaller throughout, indicating that personality plays a lesser role for performance on the VA-MeCo than is typically expected for SJTs.
In summary, despite the slight deviations discussed, the analyses provided a pattern of findings that was largely in line with our expectations (cf. Table 1) and delivers further evidence of the VA-MeCo’s validity in measuring MCC [56,60]. The results support the conjecture that MCC is a hierarchical, multidimensional construct [49,50] that is related to, but discernible from, other factors frequently considered important in medical education (such as empathy and patient orientation); its measurement with the VA-MeCo thus provides added value.
Limitations and future research
We acknowledge several limitations of the present study. First, regarding factorial validity, some tasks (particularly Task 5) showed relatively low factor loadings. We retained these tasks because they represent important steps in medical communication and excluding them would have compromised the instrument’s content validity. Nonetheless, as the overall factor solution is well-defined, theoretically coherent, and yields reliable scales, we consider it defensible, while noting that these tasks may warrant refinement in future research.
Second, we used a classic construct validity assessment approach [56,57,60], which has at least two limitations: (a) the selection of relevant validation variables and (b) the interpretation of correlation patterns, which is ultimately judgmental and somewhat subjective [83]. Regarding the first limitation, we aimed to select the most relevant validation measures by drawing on substantive theory and research on communication competence, particularly in the medical field, as well as general research on the SJT method. Although we were able to include a fairly wide range of cognitive and non-cognitive variables, this selection is necessarily incomplete. However, the inclusion of further validation measures—particularly further achievement tests—would have meant an even greater workload for the participants, potentially compromising the quality of the data. For the same reason, and given the sample size requirements for the analyses of the present study, the collection of additional behavioural measures of MCC (e.g., in an OSCE) was impractical and beyond the scope of the study.
Regarding the interpretation of correlations, we attempted to reduce subjectivity by stating a priori hypotheses about the expected directions and, where possible, the sizes of the correlations (Table 1), and by following Cohen’s [69] standards for interpreting effect sizes. However, these recommendations should not be taken as universally valid and unambiguous cut-off values. For example, in Fig 1, we categorised the correlation coefficients for social competence and the Big Five personality variables as deviating from the hypotheses when their absolute values exceeded r = .1 (i.e., a small effect). However, as noted above, many of the correlations found should still be interpreted as small (e.g., r = .16 between agreeableness and relationship). Moreover, the present data do not allow us to explore the causes of deviations in correlation sizes from the expected values. Future research should investigate whether such patterns replicate across different validation measures.
Furthermore, although our study provides an in-depth evaluation of the factorial and construct validity of the VA-MeCo, further validation is needed, particularly studies assessing its predictive validity for communicative quality in actual patient encounters. Further research should examine whether students who score well on the VA-MeCo communicate better in clinical settings. Similarly, studies should investigate the relationship between the VA-MeCo and performance in simulated interactions (e.g., OSCEs), which was not feasible in the present study. Such research should account for the fact that SJTs reflect knowledge-based judgement rather than actual behaviour. Therefore, we expect the VA-MeCo to be most predictive of situation-specific judgement and decision making, such as recognising meaningful patterns, identifying critical moments in conversations, evaluating courses of action, and analysing and evaluating communicative situations [34]. Furthermore, as the VA-MeCo primarily addresses cognitive aspects, further research is needed to examine how these interact with procedural and attitudinal dimensions of MCC (such as verbalising and regulating emotions to convey empathy) and to develop combined approaches that capture these facets more directly.
Additionally, the stability of MCC over time and its sensitivity to instructional interventions should be investigated. Longitudinal research could assess medical students with the VA-MeCo at multiple points during their studies to track changes in MCC and evaluate the instrument’s sensitivity to communication training. Moreover, examining external and predictive validity by testing associations with performance in both simulated and real patient interactions would further strengthen the instrument.
Finally, a common criticism of SJTs is their potential susceptibility to socially desirable responding [84]. Such response tendencies can bias correlations with other variables [57], potentially affecting our analyses of convergent and discriminant validity. However, we believe that the VA-MeCo’s test format largely mitigates social desirability bias. In each task, rather than selecting a single best response, participants must rate the effectiveness of every presented answer option on three dimensions for achieving a given communication goal. This format prevents participants from simply choosing the most socially desirable option. Furthermore, even if participants’ effectiveness ratings are influenced by social desirability considerations, they can only earn test points if their responses match the experts’ effectiveness ratings. Overall, these design features enhance the robustness of the VA-MeCo against social desirability bias and strengthen the validity of the results. Nonetheless, future research should also examine other potential threats to validity, such as group-related biases (e.g., differential item functioning by gender or ethnic background). Investigating such aspects was beyond the scope of the present study due to its sample size requirements.
Practical implications
The present study adds to the existing evidence for the validity of the VA-MeCo in measuring medical students’ MCC in patient encounters. Due to its digital implementation and ease of use, the test can be administered effectively to large groups in medical education. The VA-MeCo is versatile and can be used both for assessing and for teaching MCC. In the classroom, faculty can use the test to monitor student progress, provide feedback, and raise awareness of important medical communication issues. In many teaching scenarios, the VA-MeCo can be effectively combined with more resource-intensive methods, such as OSCEs and standardised patients, at different stages of the curriculum. Test results can provide valuable feedback to faculty, supporting curricular quality development. Finally, given its close alignment with standards for medical communication training [47,48], the VA-MeCo’s materials may serve as inspiration for creating new learning opportunities or refining existing ones.
Supporting information
S1 Fig. Example of a VA-MeCo test task. (TIF)
S2 Fig. Path diagram of the three-dimensional CFA model (M4). (TIF)
S1 Table. Instruments used to measure the validation variables, including reliabilities. (DOCX)
S2 Table. Standardised factor loadings and factor correlations of the final model (M4). (DOCX)
S3 Table. Comparisons of the alternative CFA models. (DOCX)
Acknowledgments
We thank all the medical students who participated in this study as well as our student assistants for their help in gathering the data.
Data Availability
All data files are available from the PsychArchives data repository (DOI: https://doi.org/10.23668/psycharchives.5337). All test materials are available from the OpenTestArchive repository (DOI: https://doi.org/10.23668/psycharchives.14472).
Funding Statement
Initials of authors who received the award: SR, LS, LJ, SIDP, ED, KS, POB, MG, JB. This work was supported by the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) under Grant number 16DHB2134. URL to the sponsor's website: https://www.bmbf.de/EN/Home/home_node.html. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Silverman J, Kurtz S, Draper J. Skills for communicating with patients. 3rd ed. Boca Raton: CRC Press; 2013.
- 2. Jünger J. Medical communication: practice book for the master plan for medical studies 2020. Stuttgart: Schattauer; 2018.
- 3. Ha JF, Longnecker N. Doctor-patient communication: a review. Ochsner J. 2010;10(1):38–43.
- 4. Cuffy C, Hagiwara N, Vrana S, McInnes BT. Measuring the quality of patient-physician communication. J Biomed Inform. 2020;112:103589. doi: 10.1016/j.jbi.2020.103589
- 5. Dowson J. Transferring knowledge into practice? Exploring the feasibility of action learning for improving knowledge, skills and confidence in clinical communication skills. BMC Med Educ. 2019;19(1):37. doi: 10.1186/s12909-019-1467-4
- 6. Maguire P, Pitceathly C. Key communication skills and how to acquire them. BMJ. 2002;325(7366):697–700. doi: 10.1136/bmj.325.7366.697
- 7. Boissy A, Windover AK, Bokar D, Karafa M, Neuendorf K, Frankel RM, et al. Communication skills training for physicians improves patient satisfaction. J Gen Intern Med. 2016;31(7):755–61. doi: 10.1007/s11606-016-3597-2
- 8. Kelley JM, Kraft-Todd G, Schapira L, Kossowsky J, Riess H. The influence of the patient-clinician relationship on healthcare outcomes: a systematic review and meta-analysis of randomized controlled trials. PLoS One. 2014;9(4):e94207. doi: 10.1371/journal.pone.0094207
- 9. Matusitz J, Spear J. Effective doctor-patient communication: an updated examination. Soc Work Public Health. 2014;29(3):252–66. doi: 10.1080/19371918.2013.776416
- 10. Medizinischer Fakultätentag. German national competence-based catalogue of learning objectives in medicine. Berlin: Medizinischer Fakultätentag; 2021. https://www.nklm.de
- 11. Swanson DB, van der Vleuten CPM. Assessment of clinical skills with standardized patients: state of the art revisited. Teach Learn Med. 2013;25 Suppl 1:S17–25. doi: 10.1080/10401334.2013.842916
- 12. Fischer F, Helmer S, Rogge A, Arraras JI, Buchholz A, Hannawa A, et al. Outcomes and outcome measures used in evaluation of communication training in oncology – a systematic literature review, an expert workshop, and recommendations for future research. BMC Cancer. 2019;19(1):808. doi: 10.1186/s12885-019-6022-5
- 13. Brown J. How clinical communication has become a core part of medical education in the UK. Med Educ. 2008;42(3):271–8. doi: 10.1111/j.1365-2923.2007.02955.x
- 14. Zill JM, Christalle E, Müller E, Härter M, Dirmaier J, Scholl I. Measurement of physician-patient communication – a systematic review. PLoS One. 2014;9(12):e112637. doi: 10.1371/journal.pone.0112637
- 15. Radziej K, Loechner J, Engerer C, Niglio de Figueiredo M, Freund J, Sattel H, et al. How to assess communication skills? Development of the rating scale ComOn Check. Med Educ Online. 2017;22(1):1392823. doi: 10.1080/10872981.2017.1392823
- 16. Burdick WP, Boulet JR, LeBlanc KE. Can we increase the value and decrease the cost of clinical skills assessment? Acad Med. 2018;93(5):690–2. doi: 10.1097/ACM.0000000000001867
- 17. Bauer J, Gartmeier M, Wiesbeck AB, Moeller GE, Karsten G, Fischer MR, et al. Differential learning gains in professional conversation training: a latent profile analysis of competence acquisition in teacher-parent and physician-patient communication. Learning and Individual Differences. 2018;61:1–10. doi: 10.1016/j.lindif.2017.11.002
- 18. Sanson-Fisher R, Hobden B, Waller A, Dodd N, Boyd L. Methodological quality of teaching communication skills to undergraduate medical students: a mapping review. BMC Med Educ. 2018;18(1):151. doi: 10.1186/s12909-018-1265-4
- 19. Cömert M, Zill JM, Christalle E, Dirmaier J, Härter M, Scholl I. Assessing communication skills of medical students in objective structured clinical examinations (OSCE) – a systematic review of rating scales. PLoS One. 2016;11(3):e0152717. doi: 10.1371/journal.pone.0152717
- 20. Lane C, Rollnick S. The use of simulated patients and role-play in communication skills training: a review of the literature to August 2005. Patient Educ Couns. 2007;67(1–2):13–20. doi: 10.1016/j.pec.2007.02.011
- 21. Kepes S, Keener SK, Lievens F, McDaniel MA. An integrative, systematic review of the situational judgment test literature. Journal of Management. 2025;51(6):2278–319. doi: 10.1177/01492063241288545
- 22. Graupe T, Fischer MR, Strijbos J-W, Kiessling C. Development and piloting of a Situational Judgement Test for emotion-handling skills using the Verona Coding Definitions of Emotional Sequences (VR-CoDES). Patient Educ Couns. 2020;103(9):1839–45. doi: 10.1016/j.pec.2020.04.001
- 23. Ludwig S, Behling L, Schmidt U, Fischbeck S. Development and testing of a summative video-based e-examination in relation to an OSCE for measuring communication-related factual and procedural knowledge of medical students. GMS J Med Educ. 2021;38(3):Doc70. doi: 10.3205/zma001466
- 24. Kiessling C, Bauer J, Gartmeier M, Iblher P, Karsten G, Kiesewetter J, et al. Development and validation of a computer-based situational judgement test to assess medical students’ communication skills in the field of shared decision making. Patient Educ Couns. 2016;99(11):1858–64. doi: 10.1016/j.pec.2016.06.006
- 25. Heier L, Gambashidze N, Hammerschmidt J, Riouchi D, Geiser F, Ernstmann N. Development and testing of the situational judgement test to measure safety performance of healthcare professionals: an explorative cross-sectional study. Nurs Open. 2022;9(1):684–91. doi: 10.1002/nop2.1119
- 26. McDaniel MA, Hartman NS, Whetzel DL, Grubb WL III. Situational judgment tests, response instructions, and validity: a meta-analysis. Personnel Psychology. 2007;60(1):63–91. doi: 10.1111/j.1744-6570.2007.00065.x
- 27. Patterson F, Zibarras L, Ashworth V. Situational judgement tests in medical education and training: research, theory and practice: AMEE Guide No. 100. Med Teach. 2016;38(1):3–17. doi: 10.3109/0142159X.2015.1072619
- 28. Whetzel D, Sullivan T, McCloy R. Situational judgment tests: an overview of development practices and psychometric characteristics. Personnel Assessment and Decisions. 2020;6(1). doi: 10.25035/pad.2020.01.001
- 29. McDaniel MA, Nguyen NT. Situational judgment tests: a review of practice and constructs assessed. Int J Selection Assessment. 2001;9(1–2):103–13. doi: 10.1111/1468-2389.00167
- 30. Reiser S, Schacht L, Thomm E, Figalist C, Janssen L, Schick K, et al. A video-based situational judgement test of medical students’ communication competence in patient encounters: development and first evaluation. Patient Educ Couns. 2022;105(5):1283–9. doi: 10.1016/j.pec.2021.08.020
- 31. Campion MC, Ployhart RE, MacKenzie WI Jr. The state of research on situational judgment tests: a content analysis and directions for future research. Human Performance. 2014;27(4):283–310. doi: 10.1080/08959285.2014.929693
- 32. Bergman ME, Drasgow F, Donovan MA, Henning JB, Juraska SE. Scoring situational judgment tests: once you get the data, your troubles begin. Int J Selection Assessment. 2006;14(3):223–35. doi: 10.1111/j.1468-2389.2006.00345.x
- 33. Lievens F, Patterson F, Corstjens J, Martin S, Nicholson S. Widening access in selection using situational judgement tests: evidence from the UKCAT. Med Educ. 2016;50(6):624–36. doi: 10.1111/medu.13060
- 34. Blömeke S, Gustafsson J-E, Shavelson RJ. Beyond dichotomies. Zeitschrift für Psychologie. 2015;223(1):3–13. doi: 10.1027/2151-2604/a000194
- 35. Golubovich J, Seybert J, Martin-Raugh M, Naemi B, Vega RP, Roberts RD. Assessing perceptions of interpersonal behavior with a video-based situational judgment test. International Journal of Testing. 2017;17(3):191–209. doi: 10.1080/15305058.2016.1194275
- 36. Richman-Hirsch WL, Olson-Buchanan JB, Drasgow F. Examining the impact of administration medium on examinee perceptions and attitudes. J Appl Psychol. 2000;85(6):880–7. doi: 10.1037/0021-9010.85.6.880
- 37. Bardach L, Rushby JV, Kim LE, Klassen RM. Using video- and text-based situational judgement tests for teacher selection: a quasi-experiment exploring the relations between test format, subgroup differences, and applicant reactions. European Journal of Work and Organizational Psychology. 2021;30(2):251–64. doi: 10.1080/1359432x.2020.1736619
- 38. Goss BD, Ryan AT, Waring J, Judd T, Chiavaroli NG, O’Brien RC, et al. Beyond selection: the use of situational judgement tests in the teaching and assessment of professionalism. Acad Med. 2017;92(6):780–4. doi: 10.1097/ACM.0000000000001591
- 39. Cullen MJ, Zhang C, Marcus-Blank B, Braman JP, Tiryaki E, Konia M, et al. Improving our ability to predict resident applicant performance: validity evidence for a situational judgment test. Teach Learn Med. 2020;32(5):508–21. doi: 10.1080/10401334.2020.1760104
- 40. Mielke I, Breil SM, Amelung D, Espe L, Knorr M. Assessing distinguishable social skills in medical admission: does construct-driven development solve validity issues of situational judgment tests? BMC Med Educ. 2022;22(1):293. doi: 10.1186/s12909-022-03305-x
- 41. Heininger SK, Baumgartner M, Zehner F, Burgkart R, Söllner N, Berberat PO, et al. Measuring hygiene competence: the picture-based situational judgement test HygiKo. BMC Med Educ. 2021;21(1):410. doi: 10.1186/s12909-021-02829-y
- 42. Black EW, Schrock B, Prewett MS, Blue AV. Design of a situational judgment test for preclinical interprofessional collaboration. Acad Med. 2021;96(7):992–6. doi: 10.1097/ACM.0000000000004117
- 43. Fröhlich M, Kahmann J, Kadmon M. Development and psychometric examination of a German video-based situational judgment test for social competencies in medical school applicants. Int J Selection Assessment. 2017;25(1):94–110. doi: 10.1111/ijsa.12163
- 44. Husbands A, Rodgerson MJ, Dowell J, Patterson F. Evaluating the validity of an integrity-based situational judgement test for medical school admissions. BMC Med Educ. 2015;15:144. doi: 10.1186/s12909-015-0424-0
- 45. Schwibbe A, Lackamp J, Knorr M, Hissbach J, Kadmon M, Hampe W. Selection of medical students: measurement of cognitive abilities and psychosocial competencies. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2018;61(2):178–86. doi: 10.1007/s00103-017-2670-2
- 46. Koczwara A, Patterson F, Zibarras L, Kerrin M, Irish B, Wilkinson M. Evaluating cognitive ability, knowledge tests and situational judgement tests for postgraduate selection. Med Educ. 2012;46(4):399–408. doi: 10.1111/j.1365-2923.2011.04195.x
- 47. Frank JR, Danoff D. The CanMEDS initiative: implementing an outcomes-based framework of physician competencies. Med Teach. 2007;29(7):642–7. doi: 10.1080/01421590701746983
- 48. Kurtz S, Silverman J, Benson J, Draper J. Marrying content and process in clinical method teaching: enhancing the Calgary-Cambridge guides. Acad Med. 2003;78(8):802–9. doi: 10.1097/00001888-200308000-00011
- 49. Gartmeier M, Bauer J, Fischer MR, Karsten G, Prenzel M. Modelling and assessment of teachers’ professional communication skills in teacher-parent interviews. In: Zlatkin-Troitschanskaia O. Stationen empirischer Bildungsforschung. 1st ed. Wiesbaden: VS Verlag für Sozialwissenschaften; 2011. p. 412–24.
- 50. Gartmeier M, Bauer J, Fischer MR, Hoppe-Seyler T, Karsten G, Kiessling C, et al. Fostering professional communication skills of future physicians and teachers: effects of e-learning with video cases and role-play. Instr Sci. 2015;43(4):443–62. doi: 10.1007/s11251-014-9341-6
- 51. Wiesbeck AB, Bauer J, Gartmeier M, Kiessling C, Möller GE, Karsten G, et al. Simulated conversations for assessing professional conversation competence in teacher-parent and physician-patient conversations. J Educ Res Online. 2017;3:82–101. doi: 10.25656/01:15302
- 52. Frank JR, Snell L, Sherbino J. CanMEDS 2015 physician competency framework. Ottawa: Royal College of Physicians and Surgeons of Canada; 2015.
- 53. de Haes H, Bensing J. Endpoints in medical communication research, proposing a framework of functions and outcomes. Patient Educ Couns. 2009;74(3):287–94. doi: 10.1016/j.pec.2008.12.006
- 54. Weng Q, Yang H, Lievens F, McDaniel MA. Optimizing the validity of situational judgment tests: the importance of scoring methods. Journal of Vocational Behavior. 2018;104:199–209. doi: 10.1016/j.jvb.2017.11.005
- 55. Reiser S, Schacht L, Thomm E, Janssen L, Schick K, Berberat PO, et al. VA-MeCo. Video-based assessment of medical communication competence. Trier: ZPID (Leibniz Institute for Psychology) – Open Test Archive; 2022. doi: 10.23668/psycharchives.14472
- 56. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington: American Educational Research Association; 2014.
- 57. Furr RM. Psychometrics: an introduction. 4th ed. California: SAGE; 2022.
- 58. Brown TA. Confirmatory factor analysis for applied research. 2nd ed. New York: The Guilford Press; 2015.
- 59. Raykov T, Marcoulides GA. Introduction to psychometric theory. New York: Routledge; 2011.
- 60. Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychol Bull. 1955;52(4):281–302. doi: 10.1037/h0040957
- 61. Kiessling C, Dieterich A, Fabry G, Hölzer H, Langewitz W, Mühlinghaus I, et al. Communication and social competencies in medical education in German-speaking countries: the Basel consensus statement. Results of a Delphi survey. Patient Educ Couns. 2010;81(2):259–66. doi: 10.1016/j.pec.2010.01.017
- 33.Lievens F, Patterson F, Corstjens J, Martin S, Nicholson S. Widening access in selection using situational judgement tests: evidence from the UKCAT. Med Educ. 2016;50(6):624–36. doi: 10.1111/medu.13060 [DOI] [PubMed] [Google Scholar]
- 34.Blömeke S, Gustafsson J-E, Shavelson RJ. Beyond Dichotomies. Zeitschrift für Psychologie. 2015;223(1):3–13. doi: 10.1027/2151-2604/a000194 [DOI] [Google Scholar]
- 35.Golubovich J, Seybert J, Martin-Raugh M, Naemi B, Vega RP, Roberts RD. Assessing Perceptions of Interpersonal Behavior with a Video-Based Situational Judgment Test. International Journal of Testing. 2017;17(3):191–209. doi: 10.1080/15305058.2016.1194275 [DOI] [Google Scholar]
- 36.Richman-Hirsch WL, Olson-Buchanan JB, Drasgow F. Examining the impact of administration medium on examine perceptions and attitudes. J Appl Psychol. 2000;85(6):880–7. doi: 10.1037/0021-9010.85.6.880 [DOI] [PubMed] [Google Scholar]
- 37. Bardach L, Rushby JV, Kim LE, Klassen RM. Using video- and text-based situational judgement tests for teacher selection: a quasi-experiment exploring the relations between test format, subgroup differences, and applicant reactions. European Journal of Work and Organizational Psychology. 2021;30(2):251–64. doi: 10.1080/1359432x.2020.1736619
- 38. Goss BD, Ryan AT, Waring J, Judd T, Chiavaroli NG, O’Brien RC, et al. Beyond selection: the use of situational judgement tests in the teaching and assessment of professionalism. Acad Med. 2017;92(6):780–4. doi: 10.1097/ACM.0000000000001591
- 39. Cullen MJ, Zhang C, Marcus-Blank B, Braman JP, Tiryaki E, Konia M, et al. Improving our ability to predict resident applicant performance: validity evidence for a situational judgment test. Teach Learn Med. 2020;32(5):508–21. doi: 10.1080/10401334.2020.1760104
- 40. Mielke I, Breil SM, Amelung D, Espe L, Knorr M. Assessing distinguishable social skills in medical admission: does construct-driven development solve validity issues of situational judgment tests? BMC Med Educ. 2022;22(1):293. doi: 10.1186/s12909-022-03305-x
- 41. Heininger SK, Baumgartner M, Zehner F, Burgkart R, Söllner N, Berberat PO, et al. Measuring hygiene competence: the picture-based situational judgement test HygiKo. BMC Med Educ. 2021;21(1):410. doi: 10.1186/s12909-021-02829-y
- 42. Black EW, Schrock B, Prewett MS, Blue AV. Design of a situational judgment test for preclinical interprofessional collaboration. Acad Med. 2021;96(7):992–6. doi: 10.1097/ACM.0000000000004117
- 43. Fröhlich M, Kahmann J, Kadmon M. Development and psychometric examination of a German video-based situational judgment test for social competencies in medical school applicants. Int J Selection Assessment. 2017;25(1):94–110. doi: 10.1111/ijsa.12163
- 44. Husbands A, Rodgerson MJ, Dowell J, Patterson F. Evaluating the validity of an integrity-based situational judgement test for medical school admissions. BMC Med Educ. 2015;15:144. doi: 10.1186/s12909-015-0424-0
- 45. Schwibbe A, Lackamp J, Knorr M, Hissbach J, Kadmon M, Hampe W. Selection of medical students: measurement of cognitive abilities and psychosocial competencies. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2018;61(2):178–86. doi: 10.1007/s00103-017-2670-2
- 46. Koczwara A, Patterson F, Zibarras L, Kerrin M, Irish B, Wilkinson M. Evaluating cognitive ability, knowledge tests and situational judgement tests for postgraduate selection. Med Educ. 2012;46(4):399–408. doi: 10.1111/j.1365-2923.2011.04195.x
- 47. Frank JR, Danoff D. The CanMEDS initiative: implementing an outcomes-based framework of physician competencies. Med Teach. 2007;29(7):642–7. doi: 10.1080/01421590701746983
- 48. Kurtz S, Silverman J, Benson J, Draper J. Marrying content and process in clinical method teaching: enhancing the Calgary-Cambridge guides. Acad Med. 2003;78(8):802–9. doi: 10.1097/00001888-200308000-00011
- 49. Gartmeier M, Bauer J, Fischer MR, Karsten G, Prenzel M. Modelling and assessment of teachers’ professional communication skills in teacher-parent interviews. In: Zlatkin-Troitschanskaia O, editor. Stationen empirischer Bildungsforschung. 1st ed. Wiesbaden: VS Verlag für Sozialwissenschaften. 2011. p. 412–24.
- 50. Gartmeier M, Bauer J, Fischer MR, Hoppe-Seyler T, Karsten G, Kiessling C, et al. Fostering professional communication skills of future physicians and teachers: effects of e-learning with video cases and role-play. Instr Sci. 2015;43(4):443–62. doi: 10.1007/s11251-014-9341-6
- 51. Wiesbeck AB, Bauer J, Gartmeier M, Kiessling C, Möller GE, Karsten G, et al. Simulated conversations for assessing professional conversation competence in teacher-parent and physician-patient conversations. J Educ Res Online. 2017;3:82–101. doi: 10.25656/01:15302
- 52. Frank JR, Snell L, Sherbino J. CanMEDS 2015 physician competency framework. Ottawa: Royal College of Physicians and Surgeons of Canada. 2015.
- 53. de Haes H, Bensing J. Endpoints in medical communication research, proposing a framework of functions and outcomes. Patient Educ Couns. 2009;74(3):287–94. doi: 10.1016/j.pec.2008.12.006
- 54. Weng Q, Yang H, Lievens F, McDaniel MA. Optimizing the validity of situational judgment tests: the importance of scoring methods. Journal of Vocational Behavior. 2018;104:199–209. doi: 10.1016/j.jvb.2017.11.005
- 55. Reiser S, Schacht L, Thomm E, Janssen L, Schick K, Berberat PO, et al. VA-MeCo. Video-Based Assessment of Medical Communication Competence. Trier: ZPID (Leibniz Institute for Psychology) – Open Test Archive. 2022. doi: 10.23668/psycharchives.14472
- 56. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington: American Educational Research Association. 2014.
- 57. Furr RM. Psychometrics: an introduction. 4th ed. California: SAGE. 2022.
- 58. Brown TA. Confirmatory factor analysis for applied research. 2nd ed. New York: The Guilford Press. 2015.
- 59. Raykov T, Marcoulides GA. Introduction to psychometric theory. New York: Routledge. 2011.
- 60. Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychol Bull. 1955;52(4):281–302. doi: 10.1037/h0040957
- 61. Kiessling C, Dieterich A, Fabry G, Hölzer H, Langewitz W, Mühlinghaus I, et al. Communication and social competencies in medical education in German-speaking countries: the Basel consensus statement. Results of a Delphi survey. Patient Educ Couns. 2010;81(2):259–66. doi: 10.1016/j.pec.2010.01.017
- 62. Christian MS, Edwards BD, Bradley JC. Situational judgment tests: constructs assessed and a meta-analysis of their criterion-related validities. Personnel Psychology. 2010;63(1):83–117. doi: 10.1111/j.1744-6570.2009.01163.x
- 63. Bauer J, Gartmeier M, Gröschl B. ProfKom - Professionalisation of future doctors and teachers in the field of communication competence: scale documentation. Munich (Germany): Technical University of Munich. 2013.
- 64. Becker N, Spinath FM. Design a Matrix – Advanced. A distractor-free matrix test for assessing general intelligence: manual. Göttingen (Germany): Hogrefe. 2014.
- 65. Davis MH. A multidimensional approach to individual differences in empathy. JSAS Catal Sel Doc Psychol. 1980;10:85.
- 66. Kiessling C, Fabry G, Rudolf Fischer M, Steiner C, Langewitz WA. German translation and construct validation of the Patient-Provider-Orientation Scale (PPOS-D12). Psychother Psychosom Med Psychol. 2014;64(3–4):122–7. doi: 10.1055/s-0033-1341455
- 67. Kanning UP. Inventory for measuring social competences in self-perception and external perception (ISK-360°). Göttingen: Hogrefe. 2009.
- 68. Rammstedt B, John OP. Short version of the Big Five Inventory (BFI-K). Diagnostica. 2005;51(4):195–206. doi: 10.1026/0012-1924.51.4.195
- 69. Cohen J. Statistical power analysis for the behavioral sciences. New York: Routledge. 1988.
- 70. Mead N, Bower P. Patient-centredness: a conceptual framework and review of the empirical literature. Soc Sci Med. 2000;51(7):1087–110. doi: 10.1016/s0277-9536(00)00098-8
- 71. Norfolk T, Birdi K, Walsh D. The role of empathy in establishing rapport in the consultation: a new model. Med Educ. 2007;41(7):690–7. doi: 10.1111/j.1365-2923.2007.02789.x
- 72. Santana MJ, Manalili K, Jolley RJ, Zelinsky S, Quan H, Lu M. How to practice person-centred care: a conceptual framework. Health Expect. 2018;21(2):429–40. doi: 10.1111/hex.12640
- 73. Baig LA, Violato C, Crutcher RA. Assessing clinical communication skills in physicians: are the skills context specific or generalizable. BMC Med Educ. 2009;9:22. doi: 10.1186/1472-6920-9-22
- 74. Kanning UP. Social competence: definition, structures, and processes. Zeitschrift für Psychologie / Journal of Psychology. 2002;210(4):154–63. doi: 10.1026//0044-3409.210.4.154
- 75. Zohoorian Z, Zeraatpishe M, Khorrami N. Willingness to communicate, Big Five personality traits, and empathy: how are they related? Can J Educ Soc Sci. 2022;2(5):17–27. doi: 10.53103/cjess.v2i5.69
- 76. Sims CM. Do the Big-Five personality traits predict empathic listening and assertive communication? International Journal of Listening. 2017;31(3):163–88. doi: 10.1080/10904018.2016.1202770
- 77. Manuel RS, Borges NJ, Gerzina HA. Personality and clinical skills: any correlation? Acad Med. 2005;80(10 Suppl):S30–3. doi: 10.1097/00001888-200510001-00011
- 78. Hachmeister CD. What do women study? What do men study? – Students and first-year students by gender. CHE DataCHECK. 2025. https://hochschuldaten.che.de/was-studieren-frauen-was-studieren-maenner/
- 79. Muthén LK, Muthén BO. Mplus User’s Guide. 8th ed. Los Angeles: Muthén & Muthén. 1998-2017.
- 80. Meijer RR, Boevé AJ, Tendeiro JN, Bosker RJ, Albers CJ. The use of subscores in higher education: when is this useful? Front Psychol. 2017;8:305. doi: 10.3389/fpsyg.2017.00305
- 81. Constand MK, MacDermid JC, Dal Bello-Haas V, Law M. Scoping review of patient-centered care approaches in healthcare. BMC Health Serv Res. 2014;14:271. doi: 10.1186/1472-6963-14-271
- 82. Scholl I, Zill JM, Härter M, Dirmaier J. An integrative model of patient-centeredness - a systematic review and concept analysis. PLoS One. 2014;9(9):e107828. doi: 10.1371/journal.pone.0107828
- 83. Furr RM, Bacharach VR. Psychometrics: an introduction. 2nd ed. California: SAGE. 2013.
- 84. Kaminski K, Felfe J, Schäpers P, Krumm S. A closer look at response options: is judgment in situational judgment tests a function of the desirability of response options? Int J Selection Assessment. 2019;27(1):72–82. doi: 10.1111/ijsa.12233