PLOS One. 2025 Sep 23;20(9):e0332957. doi: 10.1371/journal.pone.0332957

Online assessment of medical students’ communication competence in patient encounters: Validation of the VA-MeCo situational judgement test

Sabine Reiser 1,2, Laura Schacht 1, Eva Thomm 1,2, Laura Janssen 3, Sylvia Irene Donata Pittroff 3, Eva Dörfler 4, Kristina Schick 3,5, Pascal O Berberat 3, Martin Gartmeier 3, Johannes Bauer 1,2,*
Editor: Ayesha Fahim
PMCID: PMC12456786  PMID: 40986534

Abstract

Background

Medical communication is a core task of physicians, and its quality affects both patients and physicians. Hence, the teaching and measurement of medical communication competence (MCC) are crucial during medical studies. This study aims to explore the factorial and construct validity of the Video-Based Assessment of Medical Communication Competence (VA-MeCo), an online situational judgement test (SJT) designed to measure three aspects of medical students’ communication competence in patient encounters: advancing content, providing structure, and building relationship.

Methods

We conducted an online survey with N = 395 medical students. Factorial validity was tested by confirmatory factor analysis (CFA). To investigate convergent and discriminant aspects of construct validity, we tested correlations of participants’ VA-MeCo sub-test scores with scores on relevant cognitive variables, patient-interaction variables, and general personality traits.

Results

The CFA confirmed the expected three-dimensional factorial structure showing good model fit and highly correlated dimensions. A McDonald’s ω of .94 for the complete test and > .81 for the sub-scales indicated high reliability. Regarding construct validity, the directions of the correlations were in line with the theoretically assumed associations; correlation sizes partly deviated from our expectations.

Conclusions

The results support that MCC can be validly measured using the VA-MeCo. The CFA results align with theory and previous studies that proposed three distinguishable dimensions of MCC. The findings on convergent and discriminant validity demonstrate that the test measures MCC as a specific construct that is positively related to patient interaction; moreover, it can be distinguished from related but more generic constructs (e.g., social competence) and is not overly confounded with construct-irrelevant traits (e.g., intelligence). Notwithstanding the need for further validation, the results indicate that the VA-MeCo is a useful test for assessing MCC in medical education.

Background

Patient encounters are a central and frequent task of physicians [1,2]. Effective communication between physicians and patients serves as the foundation for successful treatment and the provision of optimal care [3–6]. Research shows that high-quality medical communication is beneficial for both parties [7], fosters patients’ well-being, and supports recovery [3,8,9]. Therefore, training medical communication competence (MCC) has become an important—even obligatory—part of medical education in several countries, and training as well as assessment methods are receiving increasing research attention [10–13].

Teaching MCC involves the challenge of measuring it as a learning outcome. Reliable and valid assessment tools are required [14–16] not only to gauge learners’ progress and provide them with feedback [16] but also to evaluate training interventions [17] and teaching methods [18]. Standardised patient encounters in objective structured clinical examinations (OSCEs) [19] or multiple mini-interviews [20] are frequently used and often considered the gold standard in the teaching and assessment of MCC [19,20]. However, they are particularly resource-intensive to design and implement. In addition, available MCC assessment instruments differ in focus and in the aspects of communication competence they capture, which limits comparability and standardisation. To address such challenges, there is growing interest in using situational judgement tests (SJTs) as assessment tools for a wide range of competences [21]. In the medical context, SJTs have successfully been developed to measure relevant aspects of communicative and social competences (e.g., empathy [22], communication-related factual and procedural knowledge [23,24]) as well as other characteristics (e.g., safety performance [25]). Previous studies have shown that carefully designed SJTs can yield good reliability and validity [26–28]. It has been argued that, compared with the abovementioned methods, SJTs optimally balance efficient test implementation with validity [28,29].

Despite this growing interest, the number of SJTs developed specifically to assess MCC in medical education remains very limited, and most currently provide only preliminary evidence for their validity. The present study contributes to the further validation of the Video-Based Assessment of Medical Communication Competence (VA-MeCo) [30], a computer-administered SJT designed to measure basic aspects of MCC in medical students. While earlier research provided evidence of the VA-MeCo’s curricular and content validity [30], other important aspects, such as its factorial structure and (convergent and discriminant) construct validity, have not yet been evaluated. The present study aimed to contribute to closing this gap.

Below, we first elaborate on the SJT method and its use for assessing traits relevant to medical education. Afterwards, we describe the VA-MeCo’s test design and the available evidence for its reliability and validity.

The SJT method and SJTs in medical education

SJTs were originally developed for personnel selection and have continuously gained acceptance as effective measures in other fields [28,29,31]. SJTs present examinees with multiple problem scenarios, which represent typical or critical work-related situations. For each problem scenario, examinees need to evaluate a set of specific response options related to the task at hand [28]. Participants’ answers can then be compared with a scoring key, which is often based on expert judgements [32]. Hence, SJTs assess individual abilities to provide knowledge-based and situation-specific judgements in authentic scenarios. That is, they can evaluate an individual’s ability to apply professional knowledge to analyse tasks in authentic scenarios, such as clinical situations, and to make judgements about available courses of action [27,33]. Research on competence assessment highlights that such situation-specific skills provide a crucial bridge between individuals’ basic dispositions for achievement (e.g., foundational knowledge) and their actual performance in real-world tasks [34]. For modelling the situational context, video-based problem scenarios have proven particularly useful [26,27,35]. In addition to conceptual benefits, such as enhancing authenticity, videos can foster face validity as well as participants’ test acceptance and motivation [36,37].

Because of their advantages, SJTs are increasingly used in medical education to measure a multitude of different constructs, such as professionalism [38,39], social skills [40], hygiene competence [41], and preclinical interprofessional collaboration [42]. The use of SJTs in medical education has been associated with favourable psychometric properties [24,43–45]. Moreover, the ease of administration, enhanced authenticity, and high acceptance among participants [22,27,46] make video-based SJTs a promising tool for measuring MCC in an online test setting, especially for large groups [35,43]. SJTs can therefore help bridge the gap between knowledge-based measures of MCC and behavioural assessments, such as OSCEs. By requiring judgements grounded in authentic scenarios, they indirectly capture procedural knowledge and attitudinal dispositions (e.g., empathy, respect, professionalism), thereby contributing to a more comprehensive evaluation of MCC [34].

The VA-MeCo test of medical communication competence

Conceptual basis and test structure.

The VA-MeCo is a video-based online SJT for assessing MCC in medical education with a theoretical basis in (medical) communication theory and established curricular standards of patient–physician communication [47–50]. The Calgary-Cambridge Guide [48] informed the selection and design of the tasks and the setting of quality standards for the test items. Moreover, the test structure draws upon the Munich Model of Professional Conversation Competence, an empirically validated model that has previously been used to guide curriculum and assessment development in medical education [49,50]. The Munich Model conceptualises professional communication competence as a hierarchical, multidimensional construct in which the top-level factor of general medical communication competence subsumes three correlated sub-dimensions: (i) facilitating joint problem solving, (ii) organising conversations in a proactive and transparent way, and (iii) fostering a strong working relationship. This dimensional structure has been empirically corroborated in several studies [50,51].

The VA-MeCo was designed to represent this theoretical structure and to allow a differentiated assessment of MCC. The test measures MCC on three scales: (i) advancing the content of the conversation, (ii) structuring the conversation, and (iii) building a relationship with the patient. The advancing content scale is based on the joint problem-solving dimension of the Munich Model of Professional Conversation Competence [49,50]. Both emphasise the importance of a shared understanding between patient and physician. However, the focus of the advancing content scale is less on problem solving and more on the progression of the conversation, as described below.

The advancing content scale assesses participants’ ability to advance the content level of the conversation effectively, for example by gathering information, explaining, and planning [48,52,53]. The providing structure scale addresses examinees’ ability to actively structure a conversation by guiding the patient through it in a comprehensible and systematic manner. This involves the systematic use of meta-communication, such as providing summaries and managing transitions [48,52,53]. Finally, the building relationship scale refers to the relational dimension by addressing participants’ ability to establish and maintain good working relationships with patients throughout the conversation. This includes aspects such as empathy and consideration of the patient’s needs [48,52,53]. Hence, at the basic level, the test delivers sub-scale scores on the three basic dimensions of MCC. These sub-scales can be aggregated into a total score representing the top-level construct of MCC. Therefore, depending on the intended purpose of the assessment, the VA-MeCo allows a global assessment of participants’ total achievement as well as detailed reports of their differential achievement on the three sub-dimensions (e.g., for providing feedback).

Test tasks and procedure.

The VA-MeCo’s test tasks have a standardised structure and focus on critical parts of physician–patient communication. S1 Fig in the online supplementary material shows an example. Each task begins with a written description of the scenario that provides background on the patient and the general situation. All scenarios focus on medical history-taking during an initial encounter between a physician and a patient. This initial description is followed by a short video sequence of the encounter (~60 s) that stops at a critical point in the conversation. The respondents are then provided with a medical communication goal that the physician wants to achieve, as well as 3–5 answer options featuring statements to continue the conversation.

The participants’ task is to assess the effectiveness of each answer option in achieving the stated communication goal on a 6-point rating scale (1 = very ineffective, 6 = very effective), with the answer options differing in their effectiveness. These ratings cover all three basic dimensions of MCC, so that participants judge each answer option in terms of its effectiveness in advancing the content, providing structure, and building or maintaining a positive relationship. Hence, the test has a nested structure comprising 11 patient encounter scenarios (“tasks”), each featuring 3–5 embedded answer options, and three ratings per option. These ratings constitute the test items: the 11 tasks comprise 39 answer options in total, which, at three ratings each, yield 117 items.

Test scores are obtained by comparing participants’ effectiveness ratings with an expert-based key using so-called raw consensus scoring, a common method of scoring SJTs [54]. The scoring key was developed in a previous study with a sample of experts in medical communication training who demonstrated strong rater agreement [30]. Performance scores are calculated for each item by determining the squared difference between the participant’s answer and the respective expert rating [55]. The scores are then reversed so that higher values express higher levels of competence, and they are summed per sub-dimension.
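To make this procedure concrete, the sketch below implements squared-difference consensus scoring in Python. The paper does not specify how exactly the scores are reversed; subtracting each squared difference from the largest possible squared difference on the 6-point scale is one plausible reading and is assumed here, as are all variable names.

```python
import numpy as np

def item_scores(participant_ratings, expert_ratings, scale_min=1, scale_max=6):
    """Raw consensus scoring: per-item squared distance to the expert key,
    reversed so that higher values indicate higher competence."""
    p = np.asarray(participant_ratings, dtype=float)
    e = np.asarray(expert_ratings, dtype=float)
    sq_diff = (p - e) ** 2                 # squared deviation from the expert key
    max_sq = (scale_max - scale_min) ** 2  # worst possible deviation (here: 25)
    return max_sq - sq_diff                # reversed: higher = closer to the experts

# Illustrative ratings for three items of one sub-dimension (not real test data):
scores = item_scores([5, 2, 6], [6, 1, 6])
print(scores, scores.sum())  # [24. 24. 25.] 73.0 -> sub-dimension score is the sum
```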

Existing evidence on psychometric quality criteria.

Previous studies provided preliminary evidence on the VA-MeCo’s usability, reliability, and content validity based on two expert studies, cognitive pretesting, and a preliminary test evaluation with medical students (N = 117) [30]. Results suggested good reliability for the total test score (Cronbach’s α > .93) and for the sub-test scores on the three dimensions of MCC (α > .83). The two expert studies supported curricular and content validity by confirming the medical correctness of the content and its relevance for measuring medical students’ MCC. Finally, both expert and student data supported the high acceptance and perceived usability of the VA-MeCo as an assessment tool. Although these data provided initial supportive evidence, further psychometric evaluation is required.

Validation strategy of the present study

General approach.

This study aimed to gain additional evidence on the VA-MeCo’s psychometric properties with a focus on (a) factorial validity and reliability and (b) convergent and discriminant construct validity. According to the standards for educational and psychological testing [56], collecting evidence on different validity aspects is essential to argue for the interpretability of test scores. Evaluating the mentioned validity aspects enhances the available evidence that the VA-MeCo measures empirically discernible and theoretically relevant dimensions of MCC.

a) Factorial validity: We sought to test whether the empirical structure of the test scores reflects the theoretically assumed three-dimensional structure of MCC. Given these a priori assumptions, confirmatory factor analysis (CFA) was the appropriate method for evaluating factorial validity [57,58]. Because issues of dimensionality are closely related to reliability [59], we also examined whether the good test reliability found in a previous study [30] was replicated in the present study.

b) Construct validity: To assess convergent and discriminant validity, we followed the classical construct validation approach, which involves examining a network of correlations between test scores and other variables [57,60]. According to this approach, construct validity is supported when the empirical correlations of the test scores match theoretical expectations. For convergent validity, we tested expected associations of the three VA-MeCo sub-scales (i.e., advancing content, providing structure, and building relationship) with similar measures and criterion-related external variables; for discriminant validity, we tested whether correlations with expectedly unrelated or distal measures were indeed low (see below). In these analyses, we focused on the VA-MeCo’s sub-test scores rather than the overall MCC score to gain a more differentiated evaluation of convergent and discriminant validity.

Investigated measures and hypotheses for construct validation.

For investigating construct validity, we selected relevant validation variables by considering substantive research on MCC [3,6,49,61] on the one hand and general research on the SJT method (e.g., [29,62]) on the other. Table 1 presents the selected variables, categorised into (i) domain-specific and generic cognitive variables (i.e., prior knowledge and experience; intelligence), (ii) variables relevant to patient interaction (i.e., empathy; patient orientation), and (iii) general personality traits (i.e., social competence and the Big Five personality traits). To measure these constructs, we employed well-established existing instruments that have a sound psychometric basis (see Measures). The assumed correlations with the VA-MeCo’s sub-scales are also summarised in Table 1 and elaborated below.

Table 1. Summary of external variables, instruments, and hypotheses for evaluating construct validity in relation to the VA-MeCo.

| Variable (instrument) | Sub-scales | Hypotheses on correlations with the VA-MeCo |
| --- | --- | --- |
| Cognitive variables | | |
| Prior knowledge and experience [63] | MCC courses taken, study progress | Positive correlations (r > 0) |
| Intelligence (DESIGMA A+ [64]) | | At maximum medium correlations (r ≤ .3) |
| Patient-interaction variables | | |
| Empathy (IRI [65]) | Perspective taking, empathic concern | Positive correlations with building relationship (r > 0); zero or negative correlations with advancing content and providing structure (r ≤ 0) |
| Patient orientation (PPOS-D12 [66]) | Sharing, caring | Positive correlations (r > 0) |
| General personality traits | | |
| Social competence (ISK-K [67]) | Social orientation, offensiveness, self-organisation, reflexibility | At maximum low positive correlations (r < .1) |
| Personality (BFI-K [68]) | Extraversion, agreeableness, conscientiousness, neuroticism, openness | At maximum low positive correlations (r < .1) |

DESIGMA A+ = Design a Matrix – Advanced+; IRI = Interpersonal Reactivity Index; PPOS-D12 = Patient–Provider Orientation Scale – German version; ISK-K = Inventory of Social Competences – short version; BFI-K = Big Five Inventory – short version. Statements about the expected sizes of the correlations refer to Cohen’s [69] standards (r ~ .1 = low; ~ .3 = medium; ~ .5 = large). Note that we made predictions about the sizes of the correlations only when doing so could be based on sufficient evidence from prior research.

a) Cognitive variables: Working on the VA-MeCo’s tasks requires knowledge-based judgement of the answer options in relation to the specified communication goal. Therefore, we assumed positive convergent correlations of participants’ test scores with their prior knowledge and experience in medical communication as well as with study progress (i.e., semester). Moreover, we assumed at maximum medium-sized correlations with intelligence. While intelligence is a construct distal to MCC, meta-analytic evidence shows that performance on SJTs in general has a low-to-moderate correlation with intelligence, with an estimated mean population correlation of ρ = .32 [26]. Hence, correlations of the VA-MeCo’s sub-tests with intelligence of r ≤ .3 would provide evidence for discriminant validity.

b) Patient-interaction variables: A good working relationship with the patient, as a central dimension of MCC, is a crucial aspect of patient-oriented communication. The concept of patient-centredness/patient orientation implies an egalitarian doctor–patient relationship, that is, shared power and responsibility between both parties [70]. Another important aspect of patient-centred communication is an empathic response to the emotional needs of patients [71,72]. Hence, we assumed positive convergent correlations between empathy and the building relationship dimension, but no positive correlations with the advancing content and providing structure dimensions. For the association between test performance and patient orientation, we expected positive correlations.

c) General personality traits: Communication skills in general, and MCC in particular, are content- and domain-specific [73]. This implies that MCC is a more specific construct than general social competence, even though the two are certainly related. Social competence is defined as a person’s knowledge, skills, and abilities that promote the quality of their own social behaviour [74]. Hence, social competence helps individuals achieve their goals in specific situations while maintaining social acceptance [74]. Given their different scopes, we assumed that low positive correlations of the VA-MeCo’s sub-scales with social competence would provide discriminant evidence that MCC is discernible from general social competence.

Communicative behaviour is also influenced by general personality traits, such as agreeableness, openness, and conscientiousness [75–77]. Moreover, there is evidence that the Big Five personality traits have a general impact on participants’ performance in SJTs, regardless of the specific trait being assessed. The meta-analysis by McDaniel et al. reported average correlations with conscientiousness (ρ = .27), emotional stability (ρ = .22), and agreeableness (ρ = .25), and smaller correlations with extraversion (ρ = .14) and openness to experience (ρ = .14) [26]. Hence, we assumed that at maximum low positive correlations of the VA-MeCo’s sub-tests with the Big Five personality traits would provide evidence for discriminant validity.

Methods

Participants

This study was implemented as an online survey of medical students from all phases of medical education and was conducted between June 25, 2020, and December 1, 2020. The study was approved by the medical ethics commission of the Technical University of Munich [14/20S]. Participation was voluntary; students were informed in advance and explicitly provided written informed consent. Participation took approximately 1.5 hours on average, with an average of 45 minutes for completing the VA-MeCo. Only participants who provided informed consent, satisfied a minimum time-on-task criterion, and completed at least five tasks of the VA-MeCo were included in the final analysis. No other data exclusions were made. Note that, due to the decentralised recruitment strategy, the number of and reasons for non-participation could not be determined.

The final sample consisted of N = 395 medical students, of whom 70.33% were female, 29.04% male, and 0.63% diverse. This gender distribution aligns with official statistics for German medical students [78]. The sample covered all phases of German medical education, albeit with a focus on the pre-clinical and clinical phases, where medical communication courses predominantly take place (age: M = 23.47, SD = 3.86; semester of study: M = 5.77, SD = 3.55; stage of study: 41.12% pre-clinical, 51.60% clinical, 6.57% practical year, 0.71% other). The resulting sample size enabled testing correlations as low as r = .15 with more than 90% statistical power in the analyses on construct validity and is also sufficiently large for the CFAs [58].
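As a plausibility check on this power claim, the Fisher z approximation below reproduces the figure. The sketch is ours, not the authors’ calculation, and assumes a directional test (one-tailed α = .05), consistent with the directional hypotheses in Table 1.

```python
import numpy as np
from scipy.stats import norm

def power_pearson_r(r, n, alpha=0.05, tails=1):
    """Approximate power for detecting a population correlation r with
    sample size n, via the Fisher z transformation."""
    z_r = np.arctanh(r)                   # Fisher z of the hypothesised correlation
    se = 1.0 / np.sqrt(n - 3)             # standard error of z
    z_crit = norm.ppf(1 - alpha / tails)  # critical value under H0
    return norm.sf(z_crit - z_r / se)     # probability of rejecting H0

# Directional hypothesis (r > 0) as in Table 1:
print(round(power_pearson_r(0.15, 395), 2))  # ~0.91
```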

Measures

All participants first completed the VA-MeCo, followed by the questionnaires and achievement tests related to the validation variables. The study was conducted individually online: participants received written instructions at the beginning and were then guided through the procedure by the survey software, without personal assistance. Instruments for the validation variables were chosen to align conceptually with our research objectives, focusing on well-established measures that have been psychometrically evaluated and supported by prior research. A short description of the selected instruments, along with their reliabilities, is provided in S1 Table. For more detailed information, readers are referred to the original publications and test documentations (see references in S1 Table).

Analysis

General statistical analyses were conducted using IBM SPSS 28, and CFA modelling was performed in Mplus 8.3 [79] using robust full-information maximum likelihood estimation. To reduce the CFAs’ model complexity arising from the abovementioned nested test structure (i.e., ratings of answer options nested in test tasks), we aggregated items per task for each of the three dimensions. Thus, the CFAs represent the test structure at the task level. We proceeded by first estimating separate unidimensional CFA models for each MCC dimension (M1–M3) and then combining them in a three-dimensional model (M4). Because each task was rated on all three dimensions of MCC, this final model contained expected residual correlations of the same tasks across the three dimensions (for a path diagram, see S2 Fig). Reliability was assessed using McDonald’s ω, which is preferable to Cronbach’s α [59] and can be interpreted in the same manner. Construct validity was analysed using Pearson correlations.
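For readers unfamiliar with ω: for a unidimensional scale it can be obtained directly from the standardized CFA solution. The sketch below uses the standard congeneric-model formula, ω = (Σλ)² / ((Σλ)² + Σθ); the loadings are illustrative placeholders, not the estimates reported in S2 Table.

```python
import numpy as np

def mcdonalds_omega(std_loadings):
    """McDonald's omega from standardized CFA loadings of a unidimensional
    scale; residual variances are theta_i = 1 - lambda_i**2."""
    lam = np.asarray(std_loadings, dtype=float)
    theta = 1.0 - lam ** 2   # standardized residual variances
    common = lam.sum() ** 2  # variance attributable to the common factor
    return common / (common + theta.sum())

# Illustrative loadings for 11 task-level indicators (placeholders):
print(round(mcdonalds_omega([0.6] * 11), 2))  # 0.86
```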

Results

Factorial validity

The three separate unidimensional CFAs for the VA-MeCo sub-scales (M1–M3) showed good model fit, as evidenced by the χ²-tests and standard fit indices (Table 2). For the advancing content (M1) and building relationship (M3) dimensions, the χ²-tests were non-significant despite the large sample size, and all other fit indices were good. For the providing structure dimension (M2), the χ²-test was statistically significant, pointing to some degree of misfit; however, all other fit indices indicated good model fit.

Table 2. Model fit of the CFA models.

| Model | χ²-test | RMSEA [90% CI] | CFI | SRMR |
| --- | --- | --- | --- | --- |
| Unidimensional models for the VA-MeCo sub-scales | | | | |
| M1 Content | χ²(44) = 56.802, p = .093 | .027 [.000, .046] | .971 | .045 |
| M2 Structure | χ²(44) = 64.690, p = .023 | .035 [.013, .052] | .967 | .050 |
| M3 Relationship | χ²(44) = 59.609, p = .058 | .030 [.000, .048] | .977 | .050 |
| Complete models of MCC including all VA-MeCo sub-scales | | | | |
| M4 Three-dimensional model | χ²(459) = 733.697, p < .001 | .039 [.034, .044] | .921 | .065 |
| M5 One-dimensional model | χ²(462) = 867.470, p < .001 | .047 [.042, .052] | .883 | .066 |

RMSEA = root mean square error of approximation; CFI = comparative fit index; SRMR = standardised root mean square residual; CI = confidence interval; MCC = medical communication competence.

Based on these results, we proceeded with the estimation of the combined three-dimensional model of MCC (M4). Again, this model showed an overall acceptable model fit despite the somewhat diminished value of the comparative fit index. The standardised factor loadings were of a substantial size (i.e., ~ .4 or larger; [59]), with only Task 5 having a relatively low loading on the content dimension (see S2 Table).

The results from M4 revealed substantial positive correlations among the three dimensions (r ≥ .8; S2 Table). Given these high correlations, one might argue that a one-dimensional model (i.e., a single general factor of MCC) would provide a more parsimonious representation of the data. To test this, we compared M4 to a one-dimensional model (M5), in which all tasks loaded onto a single MCC factor. For completeness, we also evaluated three two-dimensional models (M6–M8), each combining two sub-scales into a single factor (e.g., one factor for relationship and another one combining content and structure). The results of the χ2-difference tests indicated that M4 provided a significantly better fit than all alternative models (S3 Table). This was corroborated by information criteria that consistently showed M4 as the best-fitting model among the candidates. Therefore, we selected M4 as the final model.
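To make the model comparison concrete, the χ²-difference test for M4 against M5 can be reproduced from the values in Table 2. One caveat: with robust ML estimation, a scaled (Satorra–Bentler-type) difference test is required in practice, so the unscaled computation below is only an illustrative sketch.

```python
from scipy.stats import chi2

# Naive chi-square difference test for the nested models M4 vs. M5
# (values from Table 2); robust ML would require a scaled test statistic.
chi2_m4, df_m4 = 733.697, 459  # three-dimensional model
chi2_m5, df_m5 = 867.470, 462  # one-dimensional model

delta_chi2 = chi2_m5 - chi2_m4  # 133.773
delta_df = df_m5 - df_m4        # 3
p = chi2.sf(delta_chi2, delta_df)
print(f"Delta chi2({delta_df}) = {delta_chi2:.3f}, p = {p:.2e}")
```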

In the final step, we analysed the reliabilities of the test’s sub-scores and total score, which proved to be high throughout (total score: ω = .94; advancing content: ω = .82; providing structure: ω = .88; building relationship: ω = .88).

Construct validity

The Pearson correlations of the investigated measures with the three dimensions of MCC are displayed as a heatmap in Fig 1. Regarding the cognitive variables, there were significant positive correlations of all VA-MeCo sub-scales with prior knowledge and experience in the field of medical communication, as well as with study progress, in line with our hypothesis. Moreover, we found low positive correlations with intelligence, which supports our expectations.

Fig 1. Heatmap of the Pearson correlations of the measures with the three dimensions of MCC. The correlations in boldface are in line with the stated hypotheses (see Table 1). * p ≤ .05 (two-tailed). ** p ≤ .01 (two-tailed).
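As an aside for readers who want to build a Fig 1-style display for their own data, the following minimal sketch produces a correlation heatmap with pandas and seaborn. The data and column names are synthetic and purely illustrative; they are not the study’s variables or results.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data: three sub-scale scores plus two external variables.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(395, 5)),
                  columns=["content", "structure", "relationship",
                           "perspective_taking", "sharing"])

subscales = ["content", "structure", "relationship"]
external = ["perspective_taking", "sharing"]

# Pearson correlations of each external variable with each MCC dimension
corr = df.corr().loc[external, subscales]

sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            center=0, vmin=-1, vmax=1)
plt.title("Correlations with the three MCC dimensions")
plt.tight_layout()
plt.show()
```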

Regarding the patient-interaction variables, as expected, both sub-scales of the empathy measure correlated more highly with the VA-MeCo’s relationship sub-scale than with its content and structure sub-scales. In addition, corroborating our hypothesis, there were significant positive correlations of all VA-MeCo sub-scales with the sharing and caring scales of the patient orientation instrument. However, these correlations were somewhat below the expected value of .30 for the content dimension.

Concerning the general personality traits, the correlations for the four sub-scales of the ISK-K as a measure of social competence ranged from low negative correlations for offensiveness and reflexibility to low positive correlations for social orientation and self-organisation. These findings are partially in line with our hypothesis. For social orientation, the correlations were, as expected, positive and small, although the correlation with building relationship was marginally higher than those with the other VA-MeCo sub-scales. For reflexibility, the correlations were even lower and not significantly different from zero. The correlations with self-organisation were somewhat higher than expected, particularly for building relationship. Finally, contrary to expectations, the correlations with offensiveness were small and negative, although not significantly different from zero.

For the five sub-scales of the Big Five personality traits, we found correlations close to zero for extraversion and neuroticism. For agreeableness, conscientiousness, and openness, there were low positive correlations for each of the three dimensions, especially for the building relationship dimension. As with social competence, these findings are consistent with our hypotheses in terms of the sign of the associations, although partly not in terms of size.

Discussion

This study aimed to extend the existing validity evidence for the VA-MeCo, a video-based SJT of MCC [30]. To this end, we first focused on factorial validity and investigated whether the empirical structure of the test scores reflects the theoretically assumed structure and whether the resulting scales are sufficiently reliable. Second, we investigated construct validity by testing a network of assumed associations between the VA-MeCo sub-scales and a set of convergent and discriminant measures, following the classical construct validation approach [56,57].

Regarding factorial validity, the CFA results corroborated that the test adequately reflects the underlying theoretical model of MCC and aligns with previous findings on its dimensional structure [49,50]. Our results also replicated preliminary evidence of the VA-MeCo’s strong reliability [30]. Although the high inter-factor correlations might be seen as a limitation, they are consistent with the hierarchical structure of MCC as described in the Munich Model of Professional Conversation Competence, where the three dimensions form a higher-order construct of general MCC [50]. We deliberately chose not to test a hierarchical CFA model with a second-order MCC factor because, in general, a second-order factor model with three first-order factors is statistically equivalent to a model with three correlated factors (cf. M4) [58]. Since such models cannot be distinguished statistically, estimating an additional hierarchical model would not have provided meaningful insights. Moreover, the comparisons with the alternative models (M5–M8) provide further evidence that the three dimensions are not only theoretically but also empirically discernible aspects of MCC. Thus, the interpretation of test results can focus both on the overall MCC score as a comprehensive measure and on the respective sub-scale scores. The latter may be preferable, for instance, for providing detailed feedback to learners [80]. Although the high inter-factor correlations suggest that the dimensions are closely related, they do not imply redundancy. Rather, each sub-scale captures a distinct yet interconnected facet of MCC, allowing for nuanced feedback that can highlight specific strengths and areas for improvement [38].

Regarding construct validity, we tested hypothesised correlations with measures of (a) cognitive variables as well as (b) patient-interaction variables and (c) more general personality traits. For the cognitive variables, significant positive correlations with prior knowledge and study progress provided convergent evidence that the VA-MeCo differentiates between groups with different levels of MCC. Overall, correlations with prior knowledge were notably smaller than those with study progress. Although we based our questions about these variables on prior research, we acknowledge that they are most likely imperfect proxies for students’ actual prior knowledge. Regarding discriminant validity, the identified low positive correlations with intelligence suggest that test performance does not depend excessively on intelligence. As discussed above, meta-analytic evidence indicates that SJTs exhibit, on average, medium-sized correlations with intelligence [26]. The observed correlations between the VA-MeCo’s sub-scales and intelligence were lower than expected based on this research. Therefore, we conclude that, although test performance on the VA-MeCo depends on intelligence to some degree—as is likely the case with any standardised achievement test—the measurement of MCC is clearly distinguishable from intelligence.

In terms of patient-interaction variables, significant correlations with empathy and patient orientation—especially regarding the building relationship dimension—were found, as expected. This outcome provides evidence of convergent validity. These correlations align with earlier research emphasising that a crucial aspect of patient-centred communication is displaying empathy in response to the emotional needs of patients [70,81]. Furthermore, these positive correlations reflect that building a good working relationship with the patient is a vital aspect of patient-centredness [70,82]. They are also consistent with findings on the importance of an egalitarian doctor–patient relationship, which focuses on the patient, for facilitating effective conversations between patients and physicians [70].

With respect to general personality traits, some of the observed correlations with social competence differed to some degree from our expectations. We had assumed at maximum small positive correlations (i.e., r ≤ .1). Results showed that the correlation between social orientation and the building relationship sub-scale was slightly higher than expected but could still be considered essentially small. This finding might be explained by the fact that people with a high social orientation generally possess a more positive attitude towards others and, thus, are better at empathising and socially connecting with others [67]. Additionally, contrary to expectations, small negative correlations were observed between offensiveness and all three VA-MeCo sub-scales. However, these correlations did not differ significantly from zero and thus pose no threat to discriminant validity. One potential reason for the descriptively negative correlations may be that people with high offensiveness effectively advocate for their own interests, which is not in line with a patient-centred orientation [67]. Finally, the correlations between self-organisation and all three dimensions of MCC were slightly higher than expected. However, the differences were negligible for the content and structure sub-scales and still small for the building relationship sub-scale (i.e., Δr = .164 between the expected and the observed correlation). These somewhat higher correlations might be explained by the fact that people with high self-organisation tend to maintain emotional balance, act calmly and with control, and adapt flexibly to changing conditions, all of which are important when conducting a physician–patient conversation. In summary, we believe that the discussed deviations from our expectations are of marginal size and, thus, pose no substantial concern about discriminant validity against social competence.

Finally, the pattern of findings largely corroborated our expectations for the correlations with the Big Five personality traits. As a slight deviation, some of the correlations of conscientiousness and agreeableness with the VA-MeCo were marginally higher than r = .1; however, they can still be judged as essentially low correlations in terms of Cohen’s [69] standards. As mentioned previously, the Big Five often show medium correlations with SJT performance [26]. The correlations found in this study were smaller throughout, indicating that personality plays a lesser role for performance on the VA-MeCo than is typically expected for SJTs.

In summary, despite the slight deviations discussed, the analyses provided a pattern of findings that was largely in line with our expectations (cf. Table 1) and delivers further evidence of the VA-MeCo’s validity in measuring MCC [56,60]. The results support the conjecture that MCC is a hierarchical, multidimensional construct [49,50] that is related to but discernible from other factors that are frequently considered important in medical education (such as empathy and patient orientation) and, thus, provides added value.

Limitations and future research

We acknowledge several main limitations of the present study. First, regarding factorial validity, some tasks (particularly Task 5) showed relatively low factor loadings. We retained these tasks because they represent important steps in medical communication, and excluding them would have compromised the instrument’s content validity. Nonetheless, as the overall factor solution is well-defined, theoretically coherent, and yields reliable scales, we consider it defensible, while noting that these tasks may warrant refinement in future research.

Second, we used a classic construct validity assessment approach [56,57,60], which has at least two limitations: (a) the selection of relevant validation variables and (b) the interpretation of correlation patterns, which is ultimately judgmental and somewhat subjective [83]. Regarding the first problem, we aimed to select the most relevant validation measures by drawing on substantive theory and research on communication competence, particularly in the medical field, as well as general research on the SJT method. Although we were able to include a fairly wide range of cognitive and non-cognitive variables, this selection is necessarily incomplete. However, the inclusion of further validation measures—particularly further achievement tests—would have meant an even greater workload for the participants, potentially compromising the quality of the data. For the same reason, and given the sample size requirements of the present study’s analyses, the collection of additional behavioural measures of MCC (e.g., in an OSCE) was impractical and beyond the scope of the study.

Regarding the interpretation of correlations, we attempted to reduce subjectivity by stating a priori hypotheses about the expected directions and, where possible, the sizes of the correlations (Table 1), and by following Cohen’s [69] standards for interpreting effect sizes. However, these recommendations should not be taken as universally valid and unambiguous cut-off values. For example, in Fig 1, we categorised the correlation coefficients for social competence and the Big Five personality variables as deviating from the hypotheses when their absolute values exceeded r = .1 (i.e., a small effect). However, as noted above, many of the correlations found should still be interpreted as small (e.g., r = .16 between agreeableness and relationship). Moreover, the present data do not allow us to explore the causes of deviations in correlation sizes from the expected values. Future research should investigate whether such patterns replicate across different validation measures.

Furthermore, although our study provides an in-depth evaluation of the factorial and construct validity of the VA-MeCo, further validation is needed, particularly studies assessing its predictive validity for communicative quality in actual patient encounters. Further research should examine whether students who score well on the VA-MeCo communicate better in clinical settings. Similarly, studies should investigate the relationship between the VA-MeCo and performance in simulated interactions (e.g., OSCEs), which was not feasible in the present study. Such research should account for the fact that SJTs reflect knowledge-based judgement rather than actual behaviour. Therefore, we expect the VA-MeCo to be most predictive of situation-specific judgement and decision making, such as recognising meaningful patterns, identifying critical moments in conversations, evaluating courses of action, and analysing and evaluating communicative situations [34]. Furthermore, as the VA-MeCo primarily addresses cognitive aspects, further research is needed to examine how these interact with procedural and attitudinal dimensions of MCC (such as verbalising and regulating emotions to convey empathy) and to develop combined approaches that capture these facets more directly.

Additionally, the stability of MCC over time and its sensitivity to instructional interventions should be investigated. Longitudinal research could assess medical students with the VA-MeCo at multiple points during their studies to track changes in MCC and evaluate the instrument’s sensitivity to communication training. Moreover, examining external and predictive validity by testing associations with performance in both simulated and real patient interactions would further strengthen the instrument.

Finally, a common criticism of SJTs is their potential susceptibility to socially desirable responses [84]. Such response tendencies can bias correlations with other variables [57], potentially affecting our analyses of convergent and discriminant validity. However, we believe that the VA-MeCo’s test format largely mitigates social desirability bias. In each task, rather than selecting a single best response, participants must rate the effectiveness of every presented answer option on three dimensions for achieving a given communication goal. This format prevents participants from simply choosing the most socially desirable option. Furthermore, even if participants’ effectiveness ratings are influenced by social desirability considerations, they can only earn test points if their responses match the experts’ effectiveness ratings. Overall, these design features enhance the robustness of the VA-MeCo against social desirability bias and strengthen the validity of the results. Nonetheless, future research should also examine other potential threats to validity, such as group-related biases (e.g., differential item functioning by gender or ethnic background). Investigating such aspects was beyond the scope of the present study due to its sample size requirements.

Practical implications

The present study adds to the existing evidence for the validity of the VA-MeCo in measuring medical students’ MCC in patient encounters. Due to its digital implementation and ease of use, the test can be administered effectively to large groups in medical education. The VA-MeCo is versatile and can be used both for assessing and for teaching MCC. In the classroom, faculty can use the test to monitor student progress, provide feedback, and raise awareness of important medical communication issues. In many teaching scenarios, the VA-MeCo can be effectively combined with more resource-intensive methods, such as OSCEs and standardised patients, at different stages of the curriculum. Test results can provide valuable feedback to faculty, supporting curricular quality development. Finally, given its close alignment with standards for medical communication training [47,48], the VA-MeCo’s materials may serve as inspiration for creating new learning opportunities or refining existing ones.

Supporting information

S1 Fig. Example of a test task. (TIF)

S2 Fig. Path diagram with standardized results for M4. (TIF)

S1 Table. Details of the instruments used for measuring validation variables. (DOCX)

S2 Table. Standardised factor loadings and factor correlations of the final three-dimensional CFA model (M4). (DOCX)

S3 Table. Model fit of alternative one- and two-dimensional CFA models compared to the three-dimensional target model of MCC (M4). (DOCX)

Acknowledgments

We thank all the medical students who participated in this study, as well as our student assistants for their help in gathering the data.

Data Availability

All data files are available from the PsychArchives data repository (DOI: https://doi.org/10.23668/psycharchives.5337). All test materials are available from the OpenTestArchive repository (DOI: https://doi.org/10.23668/psycharchives.14472).

Funding Statement

Initials of authors who received the award: SR, LS, LJ, SIDP, ED, KS, POB, MG, JB. This work was supported by the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) under Grant number 16DHB2134. URL to the sponsor’s website: https://www.bmbf.de/EN/Home/home_node.html. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Silverman J, Kurtz S, Draper J. Skills for communicating with patients. 3rd ed. Boca Raton: CRC Press. 2013. [Google Scholar]
  • 2.Jünger J. Medical communication: practice book for the master plan for medical studies 2020. Stuttgart: Schattauer. 2018. [Google Scholar]
  • 3.Ha JF, Longnecker N. Doctor-patient communication: a review. Ochsner J. 2010;10(1):38–43. [PMC free article] [PubMed] [Google Scholar]
  • 4.Cuffy C, Hagiwara N, Vrana S, McInnes BT. Measuring the quality of patient-physician communication. J Biomed Inform. 2020;112:103589. doi: 10.1016/j.jbi.2020.103589 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Dowson J. Transferring knowledge into practice? Exploring the feasibility of action learning for improving knowledge, skills and confidence in clinical communication skills. BMC Med Educ. 2019;19(1):37. doi: 10.1186/s12909-019-1467-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Maguire P, Pitceathly C. Key communication skills and how to acquire them. BMJ. 2002;325(7366):697–700. doi: 10.1136/bmj.325.7366.697 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Boissy A, Windover AK, Bokar D, Karafa M, Neuendorf K, Frankel RM, et al. Communication Skills Training for Physicians Improves Patient Satisfaction. J Gen Intern Med. 2016;31(7):755–61. doi: 10.1007/s11606-016-3597-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Kelley JM, Kraft-Todd G, Schapira L, Kossowsky J, Riess H. The influence of the patient-clinician relationship on healthcare outcomes: a systematic review and meta-analysis of randomized controlled trials. PLoS One. 2014;9(4):e94207. doi: 10.1371/journal.pone.0094207 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Matusitz J, Spear J. Effective doctor-patient communication: an updated examination. Soc Work Public Health. 2014;29(3):252–66. doi: 10.1080/19371918.2013.776416 [DOI] [PubMed] [Google Scholar]
  • 10.Medizinischer Fakultätentag. German national competence-based catalogue of learning objectives in medicine. Berlin: Medizinischer Fakultätentag. 2021. https://www.nklm.de [Google Scholar]
  • 11.Swanson DB, van der Vleuten CPM. Assessment of clinical skills with standardized patients: state of the art revisited. Teach Learn Med. 2013;25 Suppl 1:S17-25. doi: 10.1080/10401334.2013.842916 [DOI] [PubMed] [Google Scholar]
  • 12.Fischer F, Helmer S, Rogge A, Arraras JI, Buchholz A, Hannawa A, et al. Outcomes and outcome measures used in evaluation of communication training in oncology - a systematic literature review, an expert workshop, and recommendations for future research. BMC Cancer. 2019;19(1):808. doi: 10.1186/s12885-019-6022-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Brown J. How clinical communication has become a core part of medical education in the UK. Med Educ. 2008;42(3):271–8. doi: 10.1111/j.1365-2923.2007.02955.x [DOI] [PubMed] [Google Scholar]
  • 14.Zill JM, Christalle E, Müller E, Härter M, Dirmaier J, Scholl I. Measurement of physician-patient communication--a systematic review. PLoS One. 2014;9(12):e112637. doi: 10.1371/journal.pone.0112637 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Radziej K, Loechner J, Engerer C, Niglio de Figueiredo M, Freund J, Sattel H, et al. How to assess communication skills? Development of the rating scale ComOn Check. Med Educ Online. 2017;22(1):1392823. doi: 10.1080/10872981.2017.1392823 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Burdick WP, Boulet JR, LeBlanc KE. Can We Increase the Value and Decrease the Cost of Clinical Skills Assessment?. Acad Med. 2018;93(5):690–2. doi: 10.1097/ACM.0000000000001867 [DOI] [PubMed] [Google Scholar]
  • 17.Bauer J, Gartmeier M, Wiesbeck AB, Moeller GE, Karsten G, Fischer MR, et al. Differential learning gains in professional conversation training: A latent profile analysis of competence acquisition in teacher-parent and physician-patient communication. Learning and Individual Differences. 2018;61:1–10. doi: 10.1016/j.lindif.2017.11.002 [DOI] [Google Scholar]
  • 18.Sanson-Fisher R, Hobden B, Waller A, Dodd N, Boyd L. Methodological quality of teaching communication skills to undergraduate medical students: a mapping review. BMC Med Educ. 2018;18(1):151. doi: 10.1186/s12909-018-1265-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Cömert M, Zill JM, Christalle E, Dirmaier J, Härter M, Scholl I. Assessing Communication Skills of Medical Students in Objective Structured Clinical Examinations (OSCE)--A Systematic Review of Rating Scales. PLoS One. 2016;11(3):e0152717. doi: 10.1371/journal.pone.0152717 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Lane C, Rollnick S. The use of simulated patients and role-play in communication skills training: a review of the literature to August 2005. Patient Educ Couns. 2007;67(1–2):13–20. doi: 10.1016/j.pec.2007.02.011 [DOI] [PubMed] [Google Scholar]
  • 21.Kepes S, Keener SK, Lievens F, McDaniel MA. An Integrative, Systematic Review of the Situational Judgment Test Literature. Journal of Management. 2025;51(6):2278–319. doi: 10.1177/01492063241288545 [DOI] [Google Scholar]
  • 22.Graupe T, Fischer MR, Strijbos J-W, Kiessling C. Development and piloting of a Situational Judgement Test for emotion-handling skills using the Verona Coding Definitions of Emotional Sequences (VR-CoDES). Patient Educ Couns. 2020;103(9):1839–45. doi: 10.1016/j.pec.2020.04.001 [DOI] [PubMed] [Google Scholar]
  • 23.Ludwig S, Behling L, Schmidt U, Fischbeck S. Development and testing of a summative video-based e-examination in relation to an OSCE for measuring communication-related factual and procedural knowledge of medical students. GMS J Med Educ. 2021;38(3):Doc70. doi: 10.3205/zma001466 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Kiessling C, Bauer J, Gartmeier M, Iblher P, Karsten G, Kiesewetter J, et al. Development and validation of a computer-based situational judgement test to assess medical students’ communication skills in the field of shared decision making. Patient Educ Couns. 2016;99(11):1858–64. doi: 10.1016/j.pec.2016.06.006 [DOI] [PubMed] [Google Scholar]
  • 25.Heier L, Gambashidze N, Hammerschmidt J, Riouchi D, Geiser F, Ernstmann N. Development and testing of the situational judgement test to measure safety performance of healthcare professionals: An explorative cross-sectional study. Nurs Open. 2022;9(1):684–91. doi: 10.1002/nop2.1119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.McDaniel MA, Hartman NS, Whetzel DL, Grubb WL III. Situational judgment tests, response instructions, and validity: a meta‐analysis. Personnel Psychology. 2007;60(1):63–91. doi: 10.1111/j.1744-6570.2007.00065.x [DOI] [Google Scholar]
  • 27.Patterson F, Zibarras L, Ashworth V. Situational judgement tests in medical education and training: Research, theory and practice: AMEE Guide No. 100. Med Teach. 2016;38(1):3–17. doi: 10.3109/0142159X.2015.1072619 [DOI] [PubMed] [Google Scholar]
  • 28.Whetzel D, Sullivan T, McCloy R. Situational Judgment Tests: An Overview of Development Practices and Psychometric Characteristics. PAD. 2020;6(1). doi: 10.25035/pad.2020.01.001 [DOI] [Google Scholar]
  • 29.McDaniel MA, Nguyen NT. Situational Judgment Tests: A Review of Practice and Constructs Assessed. Int J Selection Assessment. 2001;9(1–2):103–13. doi: 10.1111/1468-2389.00167 [DOI] [Google Scholar]
  • 30.Reiser S, Schacht L, Thomm E, Figalist C, Janssen L, Schick K, et al. A video-based situational judgement test of medical students’ communication competence in patient encounters: Development and first evaluation. Patient Educ Couns. 2022;105(5):1283–9. doi: 10.1016/j.pec.2021.08.020 [DOI] [PubMed] [Google Scholar]
  • 31.Campion MC, Ployhart RE, MacKenzie WI Jr. The State of Research on Situational Judgment Tests: A Content Analysis and Directions for Future Research. Human Performance. 2014;27(4):283–310. doi: 10.1080/08959285.2014.929693 [DOI] [Google Scholar]
  • 32.Bergman ME, Drasgow F, Donovan MA, Henning JB, Juraska SE. Scoring Situational Judgment Tests: Once You Get the Data, Your Troubles Begin. Int J Selection Assessment. 2006;14(3):223–35. doi: 10.1111/j.1468-2389.2006.00345.x [DOI] [Google Scholar]
  • 33.Lievens F, Patterson F, Corstjens J, Martin S, Nicholson S. Widening access in selection using situational judgement tests: evidence from the UKCAT. Med Educ. 2016;50(6):624–36. doi: 10.1111/medu.13060 [DOI] [PubMed] [Google Scholar]
  • 34.Blömeke S, Gustafsson J-E, Shavelson RJ. Beyond Dichotomies. Zeitschrift für Psychologie. 2015;223(1):3–13. doi: 10.1027/2151-2604/a000194 [DOI] [Google Scholar]
  • 35.Golubovich J, Seybert J, Martin-Raugh M, Naemi B, Vega RP, Roberts RD. Assessing Perceptions of Interpersonal Behavior with a Video-Based Situational Judgment Test. International Journal of Testing. 2017;17(3):191–209. doi: 10.1080/15305058.2016.1194275 [DOI] [Google Scholar]
  • 36.Richman-Hirsch WL, Olson-Buchanan JB, Drasgow F. Examining the impact of administration medium on examine perceptions and attitudes. J Appl Psychol. 2000;85(6):880–7. doi: 10.1037/0021-9010.85.6.880 [DOI] [PubMed] [Google Scholar]
  • 37.Bardach L, Rushby JV, Kim LE, Klassen RM. Using video- and text-based situational judgement tests for teacher selection: a quasi-experiment exploring the relations between test format, subgroup differences, and applicant reactions. European Journal of Work and Organizational Psychology. 2021;30(2):251–64. doi: 10.1080/1359432x.2020.1736619 [DOI] [Google Scholar]
  • 38.Goss BD, Ryan AT, Waring J, Judd T, Chiavaroli NG, O’Brien RC, et al. Beyond Selection: The Use of Situational Judgement Tests in the Teaching and Assessment of Professionalism. Acad Med. 2017;92(6):780–4. doi: 10.1097/ACM.0000000000001591 [DOI] [PubMed] [Google Scholar]
  • 39.Cullen MJ, Zhang C, Marcus-Blank B, Braman JP, Tiryaki E, Konia M, et al. Improving Our Ability to Predict Resident Applicant Performance: Validity Evidence for a Situational Judgment Test. Teach Learn Med. 2020;32(5):508–21. doi: 10.1080/10401334.2020.1760104 [DOI] [PubMed] [Google Scholar]
  • 40.Mielke I, Breil SM, Amelung D, Espe L, Knorr M. Assessing distinguishable social skills in medical admission: does construct-driven development solve validity issues of situational judgment tests?. BMC Med Educ. 2022;22(1):293. doi: 10.1186/s12909-022-03305-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Heininger SK, Baumgartner M, Zehner F, Burgkart R, Söllner N, Berberat PO, et al. Measuring hygiene competence: the picture-based situational judgement test HygiKo. BMC Med Educ. 2021;21(1):410. doi: 10.1186/s12909-021-02829-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Black EW, Schrock B, Prewett MS, Blue AV. Design of a Situational Judgment Test for Preclinical Interprofessional Collaboration. Acad Med. 2021;96(7):992–6. doi: 10.1097/ACM.0000000000004117 [DOI] [PubMed] [Google Scholar]
  • 43.Fröhlich M, Kahmann J, Kadmon M. Development and psychometric examination of a German video‐based situational judgment test for social competencies in medical school applicants. Int J Selection Assessment. 2017;25(1):94–110. doi: 10.1111/ijsa.12163 [DOI] [Google Scholar]
  • 44.Husbands A, Rodgerson MJ, Dowell J, Patterson F. Evaluating the validity of an integrity-based situational judgement test for medical school admissions. BMC Med Educ. 2015;15:144. doi: 10.1186/s12909-015-0424-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Schwibbe A, Lackamp J, Knorr M, Hissbach J, Kadmon M, Hampe W. Selection of medical students: Measurement of cognitive abilities and psychosocial competencies. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2018;61(2):178–86. doi: 10.1007/s00103-017-2670-2 [DOI] [PubMed] [Google Scholar]
  • 46.Koczwara A, Patterson F, Zibarras L, Kerrin M, Irish B, Wilkinson M. Evaluating cognitive ability, knowledge tests and situational judgement tests for postgraduate selection. Med Educ. 2012;46(4):399–408. doi: 10.1111/j.1365-2923.2011.04195.x [DOI] [PubMed] [Google Scholar]
  • 47.Frank JR, Danoff D. The CanMEDS initiative: implementing an outcomes-based framework of physician competencies. Med Teach. 2007;29(7):642–7. doi: 10.1080/01421590701746983 [DOI] [PubMed] [Google Scholar]
  • 48.Kurtz S, Silverman J, Benson J, Draper J. Marrying content and process in clinical method teaching: enhancing the Calgary-Cambridge guides. Acad Med. 2003;78(8):802–9. doi: 10.1097/00001888-200308000-00011 [DOI] [PubMed] [Google Scholar]
  • 49.Gartmeier M, Bauer J, Fischer MR, Karsten G, Prenzel M. Modelling and assessment of teachers’ professional communication skills in teacher-parent interviews. In: Zlatkin-Troitschanskaia O, editor. Stationen empirischer Bildungsforschung. 1st ed. Wiesbaden: VS Verl für Sozialwiss; 2011. p. 412–24. [Google Scholar]
  • 50.Gartmeier M, Bauer J, Fischer MR, Hoppe-Seyler T, Karsten G, Kiessling C, et al. Fostering professional communication skills of future physicians and teachers: effects of e-learning with video cases and role-play. Instr Sci. 2015;43(4):443–62. doi: 10.1007/s11251-014-9341-6 [DOI] [Google Scholar]
  • 51.Wiesbeck AB, Bauer J, Gartmeier M, Kiessling C, Möller GE, Karsten G, et al. Simulated conversations for assessing professional conversation competence in teacher-parent and physician-patient conversations. J Educ Res Online. 2017;3:82–101. doi: 10.25656/01:15302 [DOI] [Google Scholar]
  • 52.Frank JR, Snell L, Sherbino J. CanMEDS 2015. Physician competency framework. Ottawa: Royal College of Physicians and Surgeons of Canada. 2015. [Google Scholar]
  • 53.de Haes H, Bensing J. Endpoints in medical communication research, proposing a framework of functions and outcomes. Patient Educ Couns. 2009;74(3):287–94. doi: 10.1016/j.pec.2008.12.006 [DOI] [PubMed] [Google Scholar]
  • 54.Weng Q, Yang H, Lievens F, McDaniel MA. Optimizing the validity of situational judgment tests: The importance of scoring methods. Journal of Vocational Behavior. 2018;104:199–209. doi: 10.1016/j.jvb.2017.11.005 [DOI] [Google Scholar]
  • 55.Reiser S, Schacht L, Thomm E, Janssen L, Schick K, Berberat PO, et al. VA-MeCo. Video-Based Assessment of Medical Communication Competence. Trier: ZPID (Leibniz Institute for Psychology) – Open Test Archive. 2022. doi: 10.23668/psycharchives.14472 [DOI] [Google Scholar]
  • 56.American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington: American Educational Research Association. 2014. [Google Scholar]
  • 57.Furr RM. Psychometrics: an introduction. 4th ed. California: SAGE. 2022. [Google Scholar]
  • 58.Brown TA. Confirmatory factor analysis for applied research. 2nd ed. New York: The Guilford Press. 2015. [Google Scholar]
  • 59.Raykov T, Marcoulides GA. Introduction to psychometric theory. New York: Routledge. 2011. [Google Scholar]
  • 60.Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychol Bull. 1955;52(4):281–302. doi: 10.1037/h0040957 [DOI] [PubMed] [Google Scholar]
  • 61.Kiessling C, Dieterich A, Fabry G, Hölzer H, Langewitz W, Mühlinghaus I, et al. Communication and social competencies in medical education in German-speaking countries: the Basel consensus statement. Results of a Delphi survey. Patient Educ Couns. 2010;81(2):259–66. doi: 10.1016/j.pec.2010.01.017 [DOI] [PubMed] [Google Scholar]
  • 62.Christian MS, Edwards BD, Bradley JC. Situational judgment tests: constructs assessed and a meta-analysis of their criterion-related validities. Personnel Psychology. 2010;63(1):83–117. doi: 10.1111/j.1744-6570.2009.01163.x [DOI] [Google Scholar]
  • 63.Bauer J, Gartmeier M, Gröschl B. ProfKom - Professionalisation of future doctors and teachers in the field of communication competence: scale documentation. Munich (Germany): Technical University of Munich. 2013. [Google Scholar]
  • 64.Becker N, Spinath FM. Design a Matrix – Advanced. A distractor-free matrix test for assessing general intelligence: manual. Göttingen (Germany): Hogrefe. 2014. [Google Scholar]
  • 65.Davis MH. A multidimensional approach to individual differences in empathy. JSAS Catal Sel Doc Psychol. 1980;10:85. [Google Scholar]
  • 66.Kiessling C, Fabry G, Fischer MR, Steiner C, Langewitz WA. German translation and construct validation of the Patient-Provider-Orientation Scale (PPOS-D12). Psychother Psychosom Med Psychol. 2014;64(3–4):122–7. doi: 10.1055/s-0033-1341455 [DOI] [PubMed] [Google Scholar]
  • 67.Kanning UP. Inventory for measuring social competences in self-perception and external perception (ISK-360°). Göttingen: Hogrefe. 2009. [Google Scholar]
  • 68.Rammstedt B, John OP. Kurzversion des Big Five Inventory (BFI-K) [Short version of the Big Five Inventory (BFI-K)]. Diagnostica. 2005;51(4):195–206. doi: 10.1026/0012-1924.51.4.195 [DOI] [Google Scholar]
  • 69.Cohen J. Statistical power analysis for the behavioral sciences. New York: Routledge. 1988. [Google Scholar]
  • 70.Mead N, Bower P. Patient-centredness: a conceptual framework and review of the empirical literature. Soc Sci Med. 2000;51(7):1087–110. doi: 10.1016/s0277-9536(00)00098-8 [DOI] [PubMed] [Google Scholar]
  • 71.Norfolk T, Birdi K, Walsh D. The role of empathy in establishing rapport in the consultation: a new model. Med Educ. 2007;41(7):690–7. doi: 10.1111/j.1365-2923.2007.02789.x [DOI] [PubMed] [Google Scholar]
  • 72.Santana MJ, Manalili K, Jolley RJ, Zelinsky S, Quan H, Lu M. How to practice person-centred care: A conceptual framework. Health Expect. 2018;21(2):429–40. doi: 10.1111/hex.12640 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Baig LA, Violato C, Crutcher RA. Assessing clinical communication skills in physicians: are the skills context specific or generalizable. BMC Med Educ. 2009;9:22. doi: 10.1186/1472-6920-9-22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Kanning UP. Soziale Kompetenz - Definition, Strukturen und Prozesse [Social competence: definition, structures, and processes]. Zeitschrift für Psychologie / Journal of Psychology. 2002;210(4):154–63. doi: 10.1026//0044-3409.210.4.154 [DOI] [Google Scholar]
  • 75.Zohoorian Z, Zeraatpishe M, Khorrami N. Willingness to communicate, Big Five personality traits, and empathy: how are they related? Can J Educ Soc Sci. 2022;2(5):17–27. doi: 10.53103/cjess.v2i5.69 [DOI] [Google Scholar]
  • 76.Sims CM. Do the Big-Five Personality Traits Predict Empathic Listening and Assertive Communication? International Journal of Listening. 2017;31(3):163–88. doi: 10.1080/10904018.2016.1202770 [DOI] [Google Scholar]
  • 77.Manuel RS, Borges NJ, Gerzina HA. Personality and clinical skills: any correlation? Acad Med. 2005;80(10 Suppl):S30-3. doi: 10.1097/00001888-200510001-00011 [DOI] [PubMed] [Google Scholar]
  • 78.Hachmeister CD. What do women study? What do men study? – Students and first-year students by gender. CHE DataCHECK. 2025. https://hochschuldaten.che.de/was-studieren-frauen-was-studieren-maenner/ [Google Scholar]
  • 79.Muthén LK, Muthén BO. Mplus User’s Guide. 8th ed. Los Angeles: Muthén & Muthén. 1998-2017. [Google Scholar]
  • 80.Meijer RR, Boevé AJ, Tendeiro JN, Bosker RJ, Albers CJ. The Use of Subscores in Higher Education: When Is This Useful? Front Psychol. 2017;8:305. doi: 10.3389/fpsyg.2017.00305 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Constand MK, MacDermid JC, Dal Bello-Haas V, Law M. Scoping review of patient-centered care approaches in healthcare. BMC Health Serv Res. 2014;14:271. doi: 10.1186/1472-6963-14-271 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Scholl I, Zill JM, Härter M, Dirmaier J. An integrative model of patient-centeredness - a systematic review and concept analysis. PLoS One. 2014;9(9):e107828. doi: 10.1371/journal.pone.0107828 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Furr RM, Bacharach VR. Psychometrics: an introduction. 2nd ed. California: SAGE. 2013. [Google Scholar]
  • 84.Kaminski K, Felfe J, Schäpers P, Krumm S. A closer look at response options: Is judgment in situational judgment tests a function of the desirability of response options? Int J Selection Assessment. 2019;27(1):72–82. doi: 10.1111/ijsa.12233 [DOI] [Google Scholar]

Decision Letter 0

Ayesha Fahim

22 May 2025

Dear Dr. Bauer,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 06 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ayesha Fahim

Academic Editor

PLOS ONE

Journal requirements: 

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Yes

Reviewer #4: Yes

Reviewer #5: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

Reviewer #5: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

Reviewer #5: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

Reviewer #5: Yes

**********

Reviewer #1: As the authors identify, professionalism and in particular communication skills are some of the most important and yet notoriously difficult skills to teach to medical students.

Perhaps the first step to teaching or learning anything is to define what it is that we are attempting to teach or learn. This study goes a long way towards clarifying what it is we mean when we say "Medical Communication Competence".

Any step that brings us closer to a means of accurately measuring these skills without bias provides feedback that can be used to improve students' communication skills and teachers' ability to teach these skills. By validating the constructs that underlie MCC, this study supports the future development of more scaffolded approaches to teaching and learning MCC.

I would note that the article was information dense and that it took me several reads to unpack the study itself and the conclusions the authors had drawn. I would suggest incorporating simple visual aids to help readers understand the core concepts more easily... Perhaps radar diagrams or the like, showing the factors and their correlation coefficients, might be useful.

Reviewer #2: In the background, include a paragraph that addresses what is currently happening with the measurement of communication competence in health sciences students, that is, whether current teaching methods and/or didactics are effective, why they present a challenge, where they are failing, or why current methods are not working.

Also, justify why a scale or test allows for the measurement of communication competence, which is an aspect that requires procedural and attitudinal skills, not just cognitive skills, decision-making, or judgment.

Improve the wording of lines 84 and 92, as it currently appears contradictory. First, they state that there is a growing interest in the use of situational judgment tests (SJTs), and then in line 92, they mention "still only a few SJTs."

Methodology

It is suggested to include the number of students who did not participate and to indicate the reasons.

Please clarify whether the 45 minutes were spent administering only the VA-MeCo or all the tests. Please also indicate how many questions each test comprised and the total number of questions across all tests. Although the authors mention that the information is reported according to the cited references, please also state whether everything was self-administered or whether a tutor guided participants through all the tests or only the VA-MeCo.

All this information is necessary to allow other appropriately trained researchers to fully replicate your study.

Results

This section provides part of the discussion, as it not only describes the results but also briefly analyzes them. This is permitted by the journal; however, it is suggested that you either create a combined section (commonly labeled "Results and Discussion") or leave it as results only, without providing additional interpretation or expressing your opinion.

Discussion

Include references between lines 385 and 407 that support or contrast with your results. For example, it is not clear why the test allows for better feedback or how it relates to feedback; evidence that supports this claim or establishes its importance in the process is missing.

Limitations/Conclusion

Include in one of these sections that communication competence is a procedural and attitudinal competency in which students must verbalize and manage their emotions in order to empathize. The current proposal is therefore a cognitive approach, and the interaction of these facets requires further research and other types of instruments/methodologies for its measurement.

Reviewer #3: This manuscript, "Assessment of medical communication competence", presents a carefully conducted empirical study. Its strengths lie in its conceptual clarity and its systematic approach to empirical analysis. It is also well written.

The authors describe an empirical analysis of the factorial and construct validity of an online situational judgement test (VA-MeCo), measuring medical students' communication competence. They provide a detailed and well-structured background on the test and its conceptual foundations. The research question is clearly defined, and the hypotheses are quantified based on empirical findings. The methodology of confirmatory factor analysis seems appropriately selected and transparently described. Construct validity is operationalised and tested using convergent and discriminant variables. The discussion section effectively integrates the results with relevant literature and outlines implications in a differentiated manner.

However, the conclusion section stands out from the rest of the manuscript in terms of focus and content. Rather than providing a concise summary of the study’s aims, methods, and main findings, it introduces new aspects that have not been sufficiently addressed in earlier sections. While these points appear clearly relevant and valuable, they would benefit from being integrated into the introduction and discussion to ensure a coherent overall argument and to support the internal consistency of the manuscript.

The study addresses a topic of clear relevance within the field. It enriches the ongoing academic discourse and opens avenues for further research. Overall, the manuscript represents a valuable scholarly contribution and demonstrates a high standard of academic rigor.

Reviewer #4: The authors report a rigorous extension of the characterization of an instrument to measure an individual's skills in communication with a patient. The manuscript significantly expands the characterization beyond previous reports. The article would have more impact if it reported changes in learners' communication skills in response to an intervention. In terms of the quality of the validation, two tasks (tasks 1 and 5) fail to show standardized factor loadings above 0.5 (a commonly used cut-off) for any of the three subscales.
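For context on the 0.5 cut-off Reviewer #4 applies: McDonald's Ω, the reliability coefficient reported in the abstract, aggregates standardized factor loadings into a single estimate, so low-loading tasks directly depress it. The following is a minimal LaTeX sketch of the standard single-factor (congeneric) formula, using illustrative notation not taken from the manuscript (standardized loadings λ_i and residual variances θ_i across k tasks):

% McDonald's Omega for one factor with standardized indicators;
% here theta_i = 1 - lambda_i^2 (illustrative notation, not the paper's own).
\[
  \Omega = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}
                {\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \theta_i}
\]
% A task with lambda_i < 0.5 shares less than 25% of its variance with the
% factor (lambda_i^2 < 0.25), which is why such cut-offs are commonly applied.

This also illustrates why the reported Ω can remain high despite two weaker tasks: the summed loadings of the remaining tasks dominate the numerator.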

Reviewer #5: Reviewer comment and suggestions

This study on the Video-Based Assessment of Medical Communication Competence (VA-MeCo) presents valuable insights into measuring medical communication competence (MCC) among medical students. However, a few issues need improvement.

Strengths:

• Relevance of the Topic: The study addresses a critical area in medical education, emphasizing the importance of effective communication in patient encounters, which is well-documented in literature.

• Methodological Rigor: The use of confirmatory factor analysis (CFA) to assess factorial validity adds a strong methodological foundation, providing statistical support for the three-dimensional structure of MCC.

• Reliability: The high McDonald’s Ω values indicate that the VA-MeCo test is reliable, reinforcing its potential utility in educational settings.

• Construct Validity Exploration: Evaluating both convergent and discriminant validity enhances the understanding of how MCC relates to other cognitive and personality traits, suggesting that the VA-MeCo measures a specific and relevant construct.

• Implications for Medical Education: The findings support the integration of structured assessments like VA-MeCo into medical curricula, highlighting their role in improving the competence of future physicians.

Areas for Improvement:

• Sample Diversity: Although the sample size of 395 is substantial, the study should discuss the demographic diversity of participants. If the sample lacks variety in terms of age, gender, ethnicity, or educational background, this may limit the generalizability of the findings.

• Exploration of Deviation in Correlations: While the study mentions that correlation sizes partly deviated from expectations, it does not thoroughly explore the implications of this finding. A deeper analysis could provide insights into why certain relationships were weaker than anticipated and how this might inform future research.

• Longitudinal Validation: The study recommends further validation, but it would strengthen its claims to propose specific longitudinal or cross-validation studies. Demonstrating stability of MCC over time and its predictive validity concerning actual patient interactions would be beneficial.

• Practical Application: Though the study indicates that VA-MeCo is useful for assessing MCC, it could provide more detailed recommendations on how the findings can be implemented in medical training programs. For example, how should educators modify curricula based on VA-MeCo results?

• Potential Confounders: While the study suggests that VA-MeCo isn't confounded by traits like intelligence, it could be worthwhile to explore other potential confounding variables that may influence MCC and how these were controlled in the study.

NB:

Generally, this study is a strong contribution to the field of medical education, offering a rigorously validated instrument for assessing communication skills. Addressing the areas for improvement could enhance its robustness and applicability further, making it a more powerful tool for educators seeking to improve medical communication competence among students.

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

If you choose "no", your identity will remain anonymous but your review may still be made public. If published, this will include your full peer review and any attached files.

Reviewer #1: Yes:  Francesco Bolstad

Reviewer #2: Yes:  Claudia Bugueno Araya

Reviewer #3: No

Reviewer #4: No

Reviewer #5: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org.

PLoS One. 2025 Sep 23;20(9):e0332957. doi: 10.1371/journal.pone.0332957.r002

Author response to Decision Letter 0


30 Aug 2025

A detailed point-by-point response to the editor’s and reviewers’ comments can be found in the document Response to Reviewers.

Attachment

Submitted filename: Reiser_etal_Response to Reviewers.docx

pone.0332957.s007.docx (133.8KB, docx)

Decision Letter 1

Ayesha Fahim

7 Sep 2025

Online assessment of medical students' communication competence in patient encounters: Validation of the VA-MeCo situational judgement test.

PONE-D-25-19255R1

Dear Dr. Bauer,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at https://www.editorialmanager.com/pone/ and clicking the 'Update My Information' link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ayesha Fahim

Academic Editor

PLOS ONE


Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Example of a test task.

    (TIF)

    pone.0332957.s001.tif (426KB, tif)
    S2 Fig. Path diagram with standardized results for M4.

    (TIF)

    pone.0332957.s002.tif (123.7KB, tif)
    S1 Table. Details of the instruments used for measuring validation variables.

    (DOCX)

    pone.0332957.s003.docx (26.2KB, docx)
    S2 Table. Standardised factor loadings and factor correlations of the final three-dimensional CFA-model (M4).

    (DOCX)

    pone.0332957.s004.docx (23.2KB, docx)
    S3 Table. Model fit of alternative one- and two-dimensional CFA models compared to the three-dimensional target model of MCC (M4).

    (DOCX)

    pone.0332957.s005.docx (26.6KB, docx)
    Attachment

    Submitted filename: Reiser_etal_Response to Reviewers.docx

    pone.0332957.s007.docx (133.8KB, docx)

    Data Availability Statement

    All data files are available from the PsychArchives data repository (DOI: https://doi.org/10.23668/psycharchives.5337). All test materials are available from the OpenTestArchive repository (DOI: https://doi.org/10.23668/psycharchives.14472).


    Articles from PLOS One are provided here courtesy of PLOS
