PLoS One. 2023 Jan 26;18(1):e0280493. doi: 10.1371/journal.pone.0280493

Validity of constructed-response situational judgment tests in training programs for the health professions: A systematic review and meta-analysis protocol

Sara Mortaz Hejri 1, Jordan L Ho 2, Xuan Pan 1, Yoon Soo Park 3, Amir H Sam 4, Haykaz Mangardich 1, Alexander MacIntosh 1,*
Editor: Somayeh Delavari
PMCID: PMC9879421  PMID: 36701397

Abstract

Background

Situational judgment tests have been increasingly used to help training programs for the health professions incorporate professionalism attributes into their admissions processes. While such tests have strong psychometric properties for assessing professional attributes and are feasible to implement in high-volume, high-stakes selection, little is known about constructed-response situational judgment tests and their validity.

Methods

We will conduct a systematic review of primary published or unpublished studies reporting on the association between scores on constructed-response situational judgment tests and scores on other tests that measure personal, interpersonal, or professional attributes in training programs for the health professions. In addition to searching electronic databases, we will contact academics and researchers and undertake backward and forward searching. Two reviewers will independently screen the papers and decide on their inclusion, first based on the titles and abstracts of all citations and then according to the full texts. Data extraction will be done independently by two reviewers using a data extraction form to chart study details and key findings. Studies will be assessed for risk of bias and quality by two reviewers using the “Quality In Prognosis Studies” tool. To synthesize the evidence, we will test for statistical heterogeneity and conduct a psychometric meta-analysis using a random-effects model. If adequate data are available, we will explore whether the meta-analytic correlation varies across different subgroups (e.g., race, gender).

Discussion

The findings of this study will inform best practices for admission and selection of applicants for training programs for the health professions and encourage further research on constructed-response situational judgment tests, in particular their validity.

Trial registration

The protocol for this systematic review has been registered in PROSPERO [CRD42022314561]. https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022314561.

Introduction

Training programs for the health professions (e.g., medicine, nursing, physician assistant, physical and occupational therapy) have traditionally prioritized applicants’ academic knowledge and skills, such as their grade point average (GPA) and Medical College Admission Test (MCAT) scores, as indicators of their future success [1]. However, this emphasis on academic metrics has a major limitation: it curtails the selection of applicants who are not just ‘book-smart’ but who also possess the professionalism and interpersonal attributes required for success as healthcare practitioners [2,3]. The newfound importance of evaluating personal and professional characteristics in admissions (e.g., empathy, integrity) has sparked a rise in holistic review by programs seeking valid and legally defensible assessment methods [4].

Situational judgment tests (SJTs) have been increasingly used to address the gap in programs’ ability to incorporate attributes outside of academic metrics into their admissions processes. SJTs are assessments that present applicants with a series of hypothetical scenarios and assess their responses to those situations, with the goal of evaluating how someone is likely to react or behave in a given setting [5]. SJTs are favorably received by applicants [6], are cost-effective and feasible to implement in high-volume, high-stakes assessments, and thus can easily be used in the earlier stages of selection [7]. One meta-analysis showed that SJTs have internal consistency coefficients ranging from 0.43 to 0.94 [8]. However, it is often argued that lower internal consistency is found where an SJT has a heterogeneous structure representing many different factors [9,10]. Several studies have also explored SJTs’ validity—defined as the test’s ability to differentiate among people on other variables that measure the same or conceptually relevant constructs, behaviours, and performances, and operationalized as the correlation between scores on SJTs and scores on relevant outcomes [11–15]. While some studies have found evidence to support the interpretation that SJT scores can explain unique variance in relevant outcomes not explained by other measures [11–13], more recent studies have reported varying levels of validity depending on the type of SJT and outcome. One recent systematic review found a moderate level of validity for SJTs (pooled estimate of 0.32, p < 0.0001) [14], while another found that the SJT subtest of the University Clinical Aptitude Test (UCAT) was a weak predictor of professional behaviour during medical school [15].

One possible explanation for these conflicting findings is that moderating characteristics influence the magnitude of SJT validity estimates. Past meta-analyses have shown that moderators such as test construction (e.g., consultation of subject matter experts, level of detail in scenarios), study design (e.g., concurrent vs. predictive), and response instructions (e.g., knowledge vs. behavioral tendency) can affect the validity of SJTs [9]. However, little attention has been directed toward one potential moderator in particular: the SJT’s response format. It is unclear, for instance, how the validity of constructed-response SJTs (i.e., where applicants give free-text answers) compares to that of selected-response formats (i.e., multiple-choice). This distinction between the two formats is crucial. Previous studies have demonstrated that, while selected-response questions can give a false impression of students’ competence, constructed-response exams have advantages in terms of reliability, validity, and distinguishing between high and low performers [16,17]. Specifically regarding SJTs, constructed-response formats are less susceptible to faking by applicants [7] and better discriminate between applicants than selected-response questions [18]. These differences suggest that constructed-response SJTs may have greater validity than selected-response formats. With the widespread and growing use of constructed-response SJTs by training programs for the health professions [19–21], investigating the validity of this format has strong practical implications.

Another important aspect of the admission process is the impact of selection tools on applicant diversity. When using SJTs, it is important to know the extent to which comparable inferences can be made across different demographic subgroups (e.g., gender, race, ethnicity, and socioeconomic status) [22,23]. Research has demonstrated, for instance, that subgroup differences were significantly smaller or reversed for constructed-response SJTs, not only in comparison to other selection measures such as GPA and the MCAT [21] but also when compared with selected-response SJTs [23]. Nonetheless, the magnitude of subgroup differences is not equivalent to comparable validity inferences across groups. The present research will therefore seek to contribute new knowledge by directly examining the validity of constructed-response SJTs across demographic groups and identity characteristics.

To address the above-mentioned literature gaps, the present systematic review and meta-analysis will address the following primary question:

  • What is the magnitude of the relationship between constructed-response SJT scores and other measures of personal, interpersonal, and professional attributes assessed either concurrently or in the future in training programs for the health professions?

Depending on data availability, the present systematic review will also investigate the following secondary question:

  • To what extent are the relations between constructed-response SJT scores and other measures of personal, interpersonal, and professional attributes assessed either concurrently or in the future moderated by demographic characteristics (e.g., gender, race, socioeconomic status, language) and methodological variables (e.g., test construction, study design, response instruction)?

To align with modern perspectives on validity, we will frame our study using Messick’s framework, which emphasizes a unified conceptualization of validity evidence including content, response process, relations to other variables, internal structure, and consequences, as operationalized in the Standards for Educational and Psychological Testing [24]. In this study, we will synthesize the collected evidence regarding ‘relations to other variables’. In this sense, we will use a definition of validity that includes relations to both concurrent and subsequent measures.

Methods

The protocol for this systematic review has been registered in PROSPERO [CRD42022314561].

Eligibility criteria

Primary studies reporting on the association between scores on constructed-response SJTs and other measures of personal, interpersonal, or professional attributes assessed within the context of training programs for the health professions will be eligible for inclusion. Studies will be selected according to the following criteria, which are also summarized in Table 1.

Table 1. Summary of inclusion and exclusion criteria.

Participants
  • Inclusion: Any persons participating in undergraduate or postgraduate training programs for the health professions
  • Exclusion: Participants in training programs other than the health professions; participants in continuing health professions development or faculty development programs

Instruments
  • Inclusion: Constructed-response SJTs; assessment tools that, partially or completely, aim to assess personal, interpersonal, and professional characteristics, either upon admission or in the future
  • Exclusion: Selected-response SJTs; assessments that mainly measure the cognitive domain (e.g., GPA, MCAT)

Outcomes
  • Inclusion: Validity of constructed-response SJTs; validity of constructed-response SJTs across demographic characteristics

Study type
  • Inclusion: Published or unpublished primary studies
  • Exclusion: Non-empirical literature (e.g., letters, commentaries, editorials) and secondary studies (e.g., meta-analyses, reviews)

Participants

We will include studies conducted on students applying to, or training in, a program for the health professions. This could include undergraduate or postgraduate programs in medicine, nursing, dentistry, pharmacology, physical therapy, occupational therapy, chiropractic medicine, veterinary medicine, optometry, nutrition, radiology technology, blood banking, laboratory medicine, and others.

Instruments

We will include studies reporting on the association of a constructed-response SJT with at least one other measure that, partially or completely, aims to assess personal, interpersonal, or professional attributes. To ensure that the SJT and the outcome measure at least partially assess the same or a similar underlying construct, the intended aspects of each evaluation will be recorded. For instance, the descriptions of both the SJT and the outcome measure will need to reference at least one of the same aspects of professionalism or situational judgment. These aspects may be components of professionalism definitions or descriptions from health professions governing bodies (e.g., CanMEDS, ACGME) and can include, but are not limited to, interpersonal and communication skills, leadership, reliability, and dependability. If there are no overlapping aspects between the SJT and the outcome measure, the study will be excluded (illustrated in the sketch below). The outcome measures may be administered at any point during the trainee’s education, either concurrently with or after completion of the SJT. They will thus include, but not be limited to, the Multiple Mini Interview (MMI), the Objective Structured Clinical Examination (OSCE), workplace-based evaluations, and personality tests. The purpose of using the SJT may be to select applicants for admission to a program or to evaluate students’ performance after admission. We note that we do not require instruments to be the same in order to examine effects across studies.
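The overlap rule can be made concrete with a small sketch: a study is retained only if the recorded aspect sets of the SJT and the outcome measure intersect. The aspect labels below are hypothetical examples drawn from professionalism frameworks (e.g., CanMEDS), not the review's actual coding labels.

```python
# Minimal sketch of the overlap check described above; aspect labels
# are hypothetical illustrations, not the registered coding scheme.
sjt_aspects = {"communication", "leadership", "reliability"}
outcome_aspects = {"communication", "clinical reasoning"}

shared = sjt_aspects & outcome_aspects  # set intersection of recorded aspects
include_study = bool(shared)            # exclude when no aspects overlap
print(include_study, shared)            # -> True {'communication'}
```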

Outcomes

We will include studies that have reported the magnitude of the relation between constructed-response SJT scores and other measures of personal, interpersonal, and professional attributes. These are expected to include univariate and multivariate techniques such as linear and logistic regression analyses, classification techniques, relative risk estimates (e.g., odds ratios), and risk predictions, similar to methods observed in evaluating clinical prediction rules [25]. We will also include relations across demographic characteristics and methodological variables where available (race, gender, socioeconomic status, language proficiency, level of training, profession).
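Because included studies may report results as odds ratios or regression coefficients rather than correlations, effect sizes will need conversion to a common metric before pooling. The sketch below illustrates one standard conversion chain (odds ratio to standardized mean difference to Pearson r, following conversions in Borenstein et al. [33]); it is a hedged illustration under stated assumptions, not the registered analysis plan, and the function name and inputs are hypothetical.

```python
import math

def odds_ratio_to_r(odds_ratio: float) -> float:
    """Convert an odds ratio to a Pearson r via the logit method.

    OR -> standardized mean difference d (logit method), then
    d -> r assuming roughly equal group sizes (Borenstein et al. [33]).
    """
    d = math.log(odds_ratio) * math.sqrt(3) / math.pi
    return d / math.sqrt(d ** 2 + 4)

# Hypothetical example: a study reporting OR = 2.0 contributes r ~ 0.19
print(round(odds_ratio_to_r(2.0), 2))
```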

Study type

We will have no restrictions on study design, publication date, or language. Non-empirical literature, letters, commentaries, editorials, meta-analyses, studies without any original data, and reviews will be excluded. We will include unpublished studies as well as published papers to identify as much relevant evidence as possible. While grey literature may raise concerns about methodological quality due to the absence of peer review, the inclusion of unpublished data could reduce the risk of publication bias (the “file-drawer problem”), enrich the power of the findings, and reduce research waste [26]. To mitigate this concern, we will follow the direction of the Cochrane Handbook, which recommends that the review team bring at least the level of expertise a journal peer reviewer would apply when appraising unpublished studies [27].

Information sources

To ensure the comprehensiveness of our search, we will use several approaches to identify relevant studies. To find eligible published studies, we will explore the following electronic databases: MEDLINE (via Ovid, 1948 onwards), EMBASE (via Ovid, 1980 onwards), CINAHL, ERIC, Scopus, and Web of Science.

To find grey literature, we will search OpenGrey, ProQuest Dissertations & Theses, and the Electronic Theses Online Service (EThOS). We will also contact academics and researchers and invite them to share raw or summary association data, whitepaper reports, technical quality assurance reports, or conference papers for inclusion in the study. Finally, we will undertake backward and forward searching by reviewing the reference lists and citations of the included articles: backward searching identifies the works cited in an article, and forward searching identifies works that have cited it since publication.

Search strategy

We will first develop the MEDLINE search strategy. We will focus on two concepts integral to our research question: the population (i.e., applicants to, and students in, training programs for the health professions) and the instruments of interest (i.e., constructed-response SJTs). We will not include outcomes of interest in our search strategy, as doing so would reduce search sensitivity and decrease the number of retrieved articles. Search terms will be identified by reviewing known relevant papers, including BEME Guide No. 50 [28] and BEME Guide No. 52 [29] for relevant populations of interest, as well as Patterson et al. [6] and Webster et al. [14] for relevant assessments, tests, and instruments. Additional terms will be identified using a MeSH analysis grid [30]. The MEDLINE search strategy will be reviewed by a librarian with expertise in systematic review searching and revised based on their comments and suggestions. The final MEDLINE search strategy will then be adapted to the syntax and subject headings of the other databases.

A draft search strategy has been included in S1 Table in S1 File.

Study records

Data management

We will use Covidence [31], online software that supports screening, full-text review, and extraction and export of data, to coordinate the study selection process. All retrieved citations and their metadata (i.e., abstract, author names, journal) will be imported into Covidence, where duplicates will first be identified and removed.

Selection process

To select studies for inclusion, two reviewers will independently screen the papers in two rounds. The initial screening will be performed on the titles and abstracts of all citations via Covidence. In the second round, the full texts of the records will be assessed against the inclusion and exclusion criteria. Studies will be included if both reviewers agree on their relevance; if both reviewers agree to exclude a paper, it will be rejected. In case of disagreement, the reviewers will resolve the issue by discussion, and any remaining conflicts will be addressed by a third reviewer. The whole team will then meet to discuss and finalize the inclusion of papers.

Data collection process

To extract data from the included studies, we will design a data extraction form, outlined in the next section. The form will be revised after pilot testing on two studies. All extractions will be done independently and in duplicate by two reviewers using the final data extraction form. In case of disagreement, the coders will first discuss the issue; if it remains unresolved, a third reviewer will independently extract the data and then discuss it with the two coders to reach consensus.

Data items

We will extract details of the citation, study aim, design, setting, and methodology, as well as the assessment tools used and their characteristics. We will also code the key findings and summary notes. If information is not available, it will be indicated as “not reported”.

A draft of the data extraction form has been included in S2 Table in S1 File.
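To make the planned extraction concrete, the following is a minimal sketch of what one extraction record might look like, expressed as a Python dictionary. The field names are illustrative assumptions mirroring the data items listed above; the registered form (S2 Table in S1 File) is authoritative.

```python
# Hypothetical extraction record; field names are assumptions based on
# the data items described in the protocol, not the registered form.
extraction_record = {
    "citation": {"authors": None, "year": None, "source": None},
    "study_aim": None,
    "design": None,             # e.g., concurrent vs. predictive
    "setting": None,            # program type, country, level of training
    "sjt_instrument": None,     # constructed-response SJT and its characteristics
    "outcome_measure": None,    # e.g., MMI, OSCE, workplace-based evaluation
    "construct_overlap": None,  # high vs. low, per the planned coding scheme
    "effect_size": {"r": None, "n": None},
    "subgroup_data": None,      # correlations reported by gender, race, etc.
    "key_findings": None,
    "summary_notes": "not reported",  # default when information is unavailable
}
```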

Outcomes and prioritization

Primary outcomes

  • Validity of constructed-response SJTs, operationalized as the correlation between SJT scores and measures of personal, interpersonal, and professional attributes

Secondary outcomes

  • Validity of constructed-response SJTs across demographic characteristics

  • Validity of constructed-response SJTs across methodological moderators

Risk of bias in individual studies

Studies will be assessed for risk of bias and quality, independently by two reviewers, using the “Quality In Prognosis Studies” tool [32]. This tool comprises six domains with 31 items rated on a four-grade scale; the overall risk is expressed on a three-grade scale (high, moderate, or low), and free-text comments will be used to justify the ratings. The review team will meet to discuss and resolve any discrepancies until consensus is achieved. This information will feed into the data synthesis and allow a deeper interpretation of the findings. No study will be excluded based on the quality and risk of bias assessment, but all bias and quality ratings will be reported.

Data synthesis

We will first describe the characteristics, settings, and contexts of the included studies. This descriptive synthesis will serve as the basis for the evidence synthesis that addresses the review questions. In attempting to answer the review questions, we will synthesize the findings to discuss the validity of constructed-response SJT scores in training programs for the health professions. To improve reporting transparency and mitigate the risk of instrument heterogeneity, a coding scheme will be used to organize outcome measures by their degree of alignment with the underlying construct assessed by the constructed-response SJT. Results from instruments with high overlap (where more than half of the measure’s score is derived from aspects assessed by the SJT) will be grouped together; results from instruments with low overlap (where more than half of the measure’s score is derived from aspects not assessed by the SJT, e.g., academic or procedural knowledge or clinical skills) will likewise be grouped together. We will test for statistical heterogeneity by calculating the I² value and examining the width of 80% credibility intervals. If statistical heterogeneity is observed, we will conduct psychometric meta-analyses using a random-effects model. Pearson correlation coefficients between SJT scores and measures of personal, interpersonal, and professional attributes will serve as the effect size index and will be weighted by sampling precision (e.g., sample size). All correlation coefficients will be transformed to Fisher’s z scale before analysis [33]. We will also report the study findings narratively and undertake a rich, exploratory descriptive synthesis of the evidence to explain the findings.
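As a concrete illustration of the planned pooling, the sketch below computes the Fisher’s z transform, Cochran’s Q and I² for heterogeneity, a between-study variance estimate, the random-effects pooled correlation, and an 80% credibility interval. It is a minimal sketch under stated assumptions (each study contributes one Pearson r and sample size n; the DerSimonian–Laird estimator stands in for whichever random-effects method is ultimately used), not the registered analysis code.

```python
import numpy as np

def random_effects_meta(r, n):
    """Pool correlations via Fisher's z under a random-effects model."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    z = np.arctanh(r)        # Fisher's z transform of each correlation
    v = 1.0 / (n - 3.0)      # approximate sampling variance of z
    w = 1.0 / v              # fixed-effect (inverse-variance) weights

    # Cochran's Q and I^2 quantify statistical heterogeneity.
    z_fixed = np.sum(w * z) / np.sum(w)
    q = np.sum(w * (z - z_fixed) ** 2)
    df = len(z) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects pooled estimate, back-transformed to the r metric.
    w_re = 1.0 / (v + tau2)
    z_re = np.sum(w_re * z) / np.sum(w_re)
    pooled_r = np.tanh(z_re)

    # 80% credibility interval around the pooled z, reflecting tau^2.
    half_width = 1.2816 * np.sqrt(tau2)
    cred_80 = (np.tanh(z_re - half_width), np.tanh(z_re + half_width))
    return pooled_r, i2, tau2, cred_80

# Hypothetical inputs: three studies reporting (r, n) pairs.
print(random_effects_meta([0.25, 0.40, 0.18], [120, 85, 200]))
```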

It is possible that one study may report on several assessment tools or include more than one cohort year of a program. Where unique data for each outcome assessment and participant cohort (year) are available, these will be treated as separate studies.

If adequate data are available, subgroup analyses will be conducted to explore whether different subgroups demonstrate different results. First, all data included in the meta-analysis will be split into subgroups based on applicant characteristics (e.g., gender, race, socioeconomic status, language, profession, level of training) and study characteristics (e.g., study design, publication status). A meta-analysis will then be conducted on each subgroup. Sensitivity analyses will also be performed to explore the impact of sample size and risk of bias by removing studies judged to have small samples or to be at high risk of bias.
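The subgroup step could be sketched as follows, reusing the random_effects_meta function from the synthesis sketch above. The study records and the 'design' attribute are hypothetical stand-ins for whichever coded characteristics end up supporting a split.

```python
# Assumes random_effects_meta from the synthesis sketch above is in scope.
studies = [  # hypothetical coded study records
    {"r": 0.25, "n": 120, "design": "concurrent"},
    {"r": 0.40, "n": 85,  "design": "predictive"},
    {"r": 0.18, "n": 200, "design": "concurrent"},
]
for level in sorted({s["design"] for s in studies}):
    sub = [s for s in studies if s["design"] == level]
    if len(sub) >= 2:  # skip subgroups too small to pool
        rs = [s["r"] for s in sub]
        ns = [s["n"] for s in sub]
        print(level, random_effects_meta(rs, ns))
```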

Discussion

The findings of this study will inform best practices for the admission and selection of applicants to training programs for the health professions. The association between admission metrics and students’ future performance is an important question for medical educators, selection committees, and stakeholders. Determining the validity of constructed-response SJTs and their ability to support similar inferences across demographic groups can facilitate defensible and scalable holistic admissions practices. The findings of this review will also encourage further research on constructed-response SJTs, in particular their validity. We expect this review to identify knowledge gaps in this field and suggest areas for future research.

Supporting information

S1 Checklist. PRISMA-P (Preferred Reporting Items for Systematic review and Meta-Analysis Protocols) 2015 checklist: Recommended items to address in a systematic review protocol.

(DOCX)

S1 File

(DOCX)

Acknowledgments

We thank Jacqueline Kreller-Vanderkooy (Learning & Curriculum Support Librarian at the University of Guelph) for her assistance in reviewing and refining our search strategy.

Data Availability

No datasets were generated or analysed during the current study. All relevant data from this study will be made available upon study completion.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Ferguson E. Factors associated with success in medical school: Systematic review of the literature. BMJ. 2002;324(7343):952–7. doi: 10.1136/bmj.324.7343.952
  • 2. Papadakis MA, Teherani A, Banach MA, Knettler TR, Rattner SL, Stern DT, et al. Disciplinary action by medical boards and prior behavior in medical school. New England Journal of Medicine. 2005;353(25):2673–82. doi: 10.1056/NEJMsa052596
  • 3. Siu E, Reiter HI. Overview: What’s worked and what hasn’t as a guide towards predictive admissions tool development. Advances in Health Sciences Education. 2009;14(5):759–75. doi: 10.1007/s10459-009-9160-8
  • 4. Association of American Medical Colleges. Roadmap to excellence: Key concepts for evaluating the impact of medical school holistic admissions. 2013 [cited 2022 Mar 17]. Available from: https://store.aamc.org/downloadable/download/sample/sample_id/198/.
  • 5. Patterson F, Knight A, Dowell J, Nicholson S, Cousans F, Cleland J. How effective are selection methods in medical education? A systematic review. Medical Education. 2015;50(1):36–60.
  • 6. Patterson F, Ashworth V, Zibarras L, Coan P, Kerrin M, O’Neill P. Evaluations of situational judgement tests to assess non-academic attributes in selection. Medical Education. 2012;46(9):850–68. doi: 10.1111/j.1365-2923.2012.04336.x
  • 7. Lievens F, Peeters H, Schollaert E. Situational judgment tests: A review of recent research. Personnel Review. 2008;37(4):426–41.
  • 8. McDaniel MA, Morgeson FP, Finnegan EB, Campion MA, Braverman EP. Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology. 2001;86(4):730–40. doi: 10.1037/0021-9010.86.4.730
  • 9. Catano VM, Brochu A, Lamerson CD. Assessing the reliability of situational judgment tests used in high-stakes situations. International Journal of Selection and Assessment. 2012;20:334–46.
  • 10. Tiffin PA, Paton LW, O’Mara D, MacCann C, Lang JWB, Lievens F. Situational judgement tests for selection: Traditional vs construct-driven approaches. Medical Education. 2020;54(2):105–15. doi: 10.1111/medu.14011
  • 11. Clevenger J, Pereira GM, Wiechmann D, Schmitt N, Harvey VS. Incremental validity of situational judgment tests. Journal of Applied Psychology. 2001;86(3):410–7.
  • 12. Lievens F, Buyse T, Sackett PR. The operational validity of a video-based situational judgment test for medical college admissions: Illustrating the importance of matching predictor and criterion construct domains. Journal of Applied Psychology. 2005;90(3):442–52. doi: 10.1037/0021-9010.90.3.442
  • 13. Patterson F, Baron H, Carr V, Plint S, Lane P. Evaluation of three short-listing methodologies for selection into postgraduate training in general practice. Medical Education. 2009;43(1):50–7. doi: 10.1111/j.1365-2923.2008.03238.x
  • 14. Webster ES, Paton LW, Crampton PE, Tiffin PA. Situational judgement test validity for selection: A systematic review and meta-analysis. Medical Education. 2020;54(10):888–902. doi: 10.1111/medu.14201
  • 15. Bala L, Pedder S, Sam AH, Brown C. Assessing the predictive validity of the UCAT—a systematic review and narrative synthesis. Medical Teacher. 2021:1–9. doi: 10.1080/0142159X.2021.1998401
  • 16. Sam AH, Field SM, Collares CF, van der Vleuten CP, Wass VJ, Melville C, et al. Very-short-answer questions: Reliability, discrimination and acceptability. Medical Education. 2018;52(4):447–55. doi: 10.1111/medu.13504
  • 17. Sam AH, Westacott R, Gurnell M, Wilson R, Meeran K, Brown C. Comparing single-best-answer and very-short-answer questions for the assessment of applied medical knowledge in 20 UK medical schools: Cross-sectional study. BMJ Open. 2019;9(9). doi: 10.1136/bmjopen-2019-032550
  • 18. Funke U, Schuler H. Validity of stimulus and response components in a video test of social competence. International Journal of Selection and Assessment. 1998;6(2):115–23.
  • 19. Dore KL, Reiter HI, Kreuger S, Norman GR. Casper, an online pre-interview screen for personal/professional characteristics: Prediction of national licensure scores. Advances in Health Sciences Education. 2016;22(2):327–36. doi: 10.1007/s10459-016-9739-9
  • 20. Shipper ES, Mazer LM, Merrell SB, Lin DT, Lau JN, Melcher ML. Pilot evaluation of the computer-based assessment for Sampling Personal Characteristics Test. Journal of Surgical Research. 2017;215:211–8. doi: 10.1016/j.jss.2017.03.054
  • 21. Juster FR, Baum RC, Zou C, Risucci D, Ly A, Reiter H, et al. Addressing the diversity–validity dilemma using situational judgment tests. Academic Medicine. 2019;94(8):1197–203. doi: 10.1097/ACM.0000000000002769
  • 22. Downing SM. Validity: On the meaningful interpretation of assessment data. Medical Education. 2003;37(9):830–7.
  • 23. Lievens F, Sackett PR, Dahlke JA, Oostrom JK, De Soete B. Constructed response formats and their effects on minority–majority differences and validity. Journal of Applied Psychology. 2019;104(5):715–26. doi: 10.1037/apl0000367
  • 24. Standards for educational and psychological testing. Washington, DC: American Educational Research Association; 2014.
  • 25. Cowley LE, Farewell DM, Maguire S, et al. Methodological standards for the development and evaluation of clinical prediction rules: A review of the literature. Diagnostic and Prognostic Research. 2019;3:16. doi: 10.1186/s41512-019-0060-y
  • 26. Korevaar DA, Salameh JP, Vali Y, Cohen JF, McInnes MD, Spijker R, et al. Searching practices and inclusion of unpublished studies in systematic reviews of diagnostic accuracy. Research Synthesis Methods. 2020;11(3):343–53. doi: 10.1002/jrsm.1389
  • 27. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al., editors. Cochrane Handbook for Systematic Reviews of Interventions, version 6.3 (updated February 2022). Cochrane; 2022. Available from: www.training.cochrane.org/handbook.
  • 28. Simone K, Ahmed RA, Konkin J, Campbell S, Hartling L, Oswald AE. What are the features of targeted or system-wide initiatives that affect diversity in health professions trainees? A BEME systematic review: BEME Guide No. 50. Medical Teacher. 2018;40(8):762–80. doi: 10.1080/0142159X.2018.1473562
  • 29. Maudsley G, Taylor D, Allam O, Garner J, Calinici T, Linkman K. A Best Evidence Medical Education (BEME) systematic review of: What works best for health professions students using mobile (hand-held) devices for educational support on clinical placements? BEME Guide No. 52. Medical Teacher. 2018;41(2):125–40.
  • 30. Hocking R. Yale MeSH Analyzer. Journal of the Canadian Health Libraries Association / Journal de l’Association des bibliothèques de la santé du Canada. 2017;38(3).
  • 31. Covidence systematic review software. Veritas Health Innovation, Melbourne, Australia. Available from: www.covidence.org.
  • 32. Hayden JA, van der Windt DA, Cartwright JL, Côté P, Bombardier C. Assessing bias in studies of prognostic factors. Annals of Internal Medicine. 2013;158(4):280. doi: 10.7326/0003-4819-158-4-201302190-00009
  • 33. Borenstein M, Hedges LV, Higgins JPT, Rothstein H. Introduction to meta-analysis. Hoboken, NJ: John Wiley & Sons; 2021.

Decision Letter 0

Hanna Landenmark

21 Jul 2022

PONE-D-22-08111: Predictive validity of constructed-response situational judgment tests in health professions education programs: A systematic review and meta-analysis protocol. PLOS ONE.

Dear Dr. Mortaz Hejri,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Two reviewers have assessed the manuscript. They have raised some overlapping concerns about some of the difficulties you may encounter in this analysis, and have provided suggestions for revisions that can be made to the study.

Please submit your revised manuscript by Sep 03 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Hanna Landenmark

Staff Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Competing Interests section: 

YSP and AHS have no disclosures to declare. SMH, JLH, XP, and AM disclose they are salaried employees of Altus Assessments which administers a situational judgment test called Casper. The authors receive no reimbursements, fees, or funding related to this study or its outcomes.

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to  PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests).  If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared. 

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

3. We note that you have referenced (ie. Bewick et al. [5]) which has currently not yet been accepted for publication. Please remove this from your References and amend this to state in the body of your manuscript: (ie “Bewick et al. [Unpublished]”) as detailed online in our guide for authors

http://journals.plos.org/plosone/s/submission-guidelines#loc-reference-style 

4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does the manuscript provide a valid rationale for the proposed study, with clearly identified and justified research questions?

The research question outlined is expected to address a valid academic problem or topic and contribute to the base of knowledge in the field.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Is the protocol technically sound and planned in a manner that will lead to a meaningful outcome and allow testing the stated hypotheses?

The manuscript should describe the methods in sufficient detail to prevent undisclosed flexibility in the experimental procedure or analysis pipeline, including sufficient outcome-neutral conditions (e.g. necessary controls, absence of floor or ceiling effects) to test the proposed hypotheses and a statistical power analysis where applicable. As there may be aspects of the methodology and analysis which can only be refined once the work is undertaken, authors should outline potential assumptions and explicitly describe what aspects of the proposed analyses, if any, are exploratory.

Reviewer #1: Partly

Reviewer #2: Partly

**********

3. Is the methodology feasible and described in sufficient detail to allow the work to be replicable?

Descriptions of methods and materials in the protocol should be reported in sufficient detail for another researcher to reproduce all experiments and analyses. The protocol should describe the appropriate controls, sample size calculations, and replication needed to ensure that the data are robust and reproducible.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Have the authors described where all data underlying the findings will be made available when the study is complete?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception, at the time of publication. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above and, if applicable, provide comments about issues authors must address before this protocol can be accepted for publication. You may also include additional comments for the author, including concerns about research or publication ethics.

You may also provide optional suggestions and comments to authors that they might find helpful in planning their study.

(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I appreciated the clarity and detail in the protocol provided. My primary questions are with regard to feasibility of a meta-analysis on this topic.

1. There are few published empirical studies or dissertations on constructed response SJTs (my quick search identified less than 25) across any setting and even fewer that are in Health Professionals Education programs. Within that, finding multiple studies that look at the same instrument or that look at the same outcome in order to look at cumulative effects seems highly unlikely to me.

2. A second feasibility issue is with regard to the strategy for assessing moderated validity (i.e., differences in predictive validity by subgroup). Such comparisons would require having correlations between predictors and outcomes reported separately by subgroup -- the strategy outlined in the protocol suggests such a split would be possible, but it is rare to see this type of information reported in studies (published or unpublished) because of small Ns. Thus, it does not seem addressing that key question is feasible.

3. It seems that the key predictive validity questions would be focused on outcomes such as performance in classes, graduation rates, etc... Thus, it was unclear to me why looking at a variety of other types of assessments (e.g, personality measures) would be considered an examination of predictive validity rather than convergent validity. Further, because such assessments may be focused on specific constructs that are not those assessed by a given SJT it is unclear what such data will indicate. Perhaps an analysis that focuses on SJTs designed to measure certain attributes to measures designed to measure those same attributes would be a better focus.

4. It was surprising to see the note that SJTs have acceptable internal consistency as a common critique of SJTs are that they are multidimensional at the item level and often do not have high internal consistency.

5. One question regarding constructed responses to assessments are the concerns over how scored -- for example, an oral SJT with constructed responses is akin to a situational interview. It seems looking at dimensions of scoring and who the assessors/raters are would be important aspects to code.

Reviewer #2: I think that meta-analyzing constructed-response SJTs is a worthwhile goal and I would enjoy reading that meta-analysis.

However, I think that to maximize the utility of findings for practitioners, they would need to know more than simply the average predictive validity of constructed-response (C-R) SJTs for selection into Health Professions Education. Based on the broader research on SJTs, we know that this is a method of measurement whose validity can vary greatly as a function of various methodological choices. So I'd expect a practitioner to want to know what methodological choices they should make when designing their C-R SJT in order to maximize its validity. In the case that the average validity is actually not very good, it would be especially important to be able to explain what makes some C-R SJTs more effective than others, rather than the take-home message ending up that these tools are not very useful.

As such, I would recommend that the authors pull all the available studies on C-R SJTs (within Health Professions or even more broadly than that) and determine potential moderators of validity to code and examine in their meta-analysis. The authors can look to the broader SJT research (outside of the Health Professions) for the types of methodological moderators other meta-analyses/reviews have considered (e.g., Campion, Ployhart, & MacKenzie, 2014; Christian, Edwards, & Bradley, 2010; McDaniel, Morgeson, Finnegan, Campion, & Braverman, 2001). While some moderators considered in those studies may not translate to C-R SJTs, others certainly do.

Related to the above, one potential moderator of validity I'd like to highlight is the validity study design (concurrent vs predictive). The authors indicated that they only intend to examine predictive validity designs, but that seems unnecessarily restrictive to me. I don't know about the Health Professions Education field in particular, but from the broader research on SJTs, do know that C-R SJTs constitute a small minority of SJTs in general, and predictive validity studies--as a much less common validity study design than concurrent--further restrict the possible sample of studies to include in a meta-analysis. As a side note, from the authors' description of their plans, it was not entirely clear to me that they definitely plan to only examine predictive validity studies. For example, I did not understand this sentence (lines 137-138): "The measures may be administered at any point throughout the trainee's education to the completion of an SJT." And at the end of that same paragraph, this also doesn't sound like a predictive validity design: "The purpose of using the SJT may be for ... evaluation of students’ performance after admission."

Also, I would expect raters who score responses to C-R SJTs (e.g., their training) to also have validity-moderating potential. I noticed that the example of potential moderators I bring up are actually already reflected in the protocol for what the researchers planned to code about the articles they find (Appendix 2), but the data extraction form may need further refinement to capture anything additional about the research methodology that might be relevant. Again, I believe that the more validity moderators the authors can point to, the more practical utility their research will have.

As a final note, it was unclear to me how the authors plan to examine the effect of possible publication bias (lower quality studies remaining in a file drawer) on their findings (lines 237-238). They may find these sources helpful:

Duval SJ, Tweedie RL. (2000). A non-parametric “trim and fill” method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association, 95, 89–98.

Duval SJ, Tweedie RL. (2000). Trim and fill: A simple funnel plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56, 276–284.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Decision Letter 1

Somayeh Delavari

19 Oct 2022

PONE-D-22-08111R1: Predictive validity of constructed-response situational judgment tests in health professions education programs: A systematic review and meta-analysis protocol. PLOS ONE.

Dear Dr. Hejri,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 03 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Somayeh Delavari, Ph.D.,

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does the manuscript provide a valid rationale for the proposed study, with clearly identified and justified research questions?

The research question outlined is expected to address a valid academic problem or topic and contribute to the base of knowledge in the field.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Is the protocol technically sound and planned in a manner that will lead to a meaningful outcome and allow testing the stated hypotheses?

The manuscript should describe the methods in sufficient detail to prevent undisclosed flexibility in the experimental procedure or analysis pipeline, including sufficient outcome-neutral conditions (e.g. necessary controls, absence of floor or ceiling effects) to test the proposed hypotheses and a statistical power analysis where applicable. As there may be aspects of the methodology and analysis which can only be refined once the work is undertaken, authors should outline potential assumptions and explicitly describe what aspects of the proposed analyses, if any, are exploratory.

Reviewer #1: No

Reviewer #2: Partly

Reviewer #3: Partly

**********

3. Is the methodology feasible and described in sufficient detail to allow the work to be replicable?

Descriptions of methods and materials in the protocol should be reported in sufficient detail for another researcher to reproduce all experiments and analyses. The protocol should describe the appropriate controls, sample size calculations, and replication needed to ensure that the data are robust and reproducible.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors described where all data underlying the findings will be made available when the study is complete?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception, at the time of publication. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above and, if applicable, provide comments about issues authors must address before this protocol can be accepted for publication. You may also include additional comments for the author, including concerns about research or publication ethics.

You may also provide optional suggestions and comments to authors that they might find helpful in planning their study.

(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: While the author(s) did address comments, there was insufficient compelling evidence that there is sufficient literature available for a review

Reviewer #2: I think the revised proposal addresses some of the issues with the original, but there are still a few points that could use clarification.

1. In using Messick's validity framework, I think the authors should also adjust their terminology and refer to the test's "validity" or "construct validity," as opposed to its "predictive validity," which has a narrower definition and can be confusing to readers as referring to study designs where the SJT predicts an outcome measure separated in time from the SJT (predictor).

2. I don't think the other reviewer's concern has been adequately addressed regarding the authors being able to find multiple studies that look at the same instrument or that look at the same outcome in order to be able to consider cumulative effects. The authors say this is not a requirement for them to be able to do their analyses, but I did not see a rationale in the response to reviewers or the revised proposal. It's unclear to me what the analyses would indicate without attending to the constructs assessed by the instruments involved.

3. While the authors acknowledge in their responses that it is important to code potential moderators of validity in their review of the literature, there is no research question posed in the proposal intro to look at methodological moderators of validity as part of their analyses.

Reviewer #3: The authors are planning to assess the predictive validity of constructed response situational judgement tests for professional performance in the healthcare setting in an upcoming systematic review, the protocol for which is presented here. The issue of the validity of university enrollment tests, especially those that assess humanistic tendencies is a concerning one and the question has been formulated well. The protocol has also been written very well with adequate methodological details. There are only a few remarks and suggestions that the authors may wish to consider before publication of the protocol and conduction of the systematic review.

1. The term "health professions education programs" was a bit confusing to me at the beginning and made me think of programs that aim to train educators rather than health professionals. Thus, I suggest that the authors replace it with another more clear term like health sciences/professional training programs.

2. The outcomes and the acceptable assessment tools that assess “other measures of personal, interpersonal, and professional attributes” warrant further discussions and clarifications because these are the tests against which your main test is being tested and needs to be validated. Do the authors have pre-specified inclusion criteria for these tests? Is any comparison considered acceptable? I assume this issue can impact the validity of the final results because these measures will vary in their discriminative and diagnostic features. I think in this case, your study can also be formulated into a diagnostic question where these measures may represent the “reference standard”. The same case goes for clinical prediction rules which are similar to this study.

3. While the currently suggested risk-of-bias tool seems appropriate and meets general standards, I suggest that the authors also consider PROBAST (Prediction model Risk Of Bias Assessment Tool), which may be better suited to their question, a mixture of diagnosis and prognosis.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Author response to Decision Letter 1


5 Dec 2022

Please see the attached 'Response to Reviewers' document for a point-by-point response with the associated clean and tracked changes versions of the manuscript.

Attachment

Submitted filename: PONE-D-22-08111R1 - Response to Reviewers - 20221202.docx

Decision Letter 2

Somayeh Delavari

2 Jan 2023

Validity of constructed-response situational judgment tests in training programs for the health professions: A systematic review and meta-analysis protocol

PONE-D-22-08111R2

Dear Dr. MacIntosh,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Somayeh Delavari, Ph.D.,

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does the manuscript provide a valid rationale for the proposed study, with clearly identified and justified research questions?

The research question outlined is expected to address a valid academic problem or topic and contribute to the base of knowledge in the field.

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

2. Is the protocol technically sound and planned in a manner that will lead to a meaningful outcome and allow testing the stated hypotheses?

The manuscript should describe the methods in sufficient detail to prevent undisclosed flexibility in the experimental procedure or analysis pipeline, including sufficient outcome-neutral conditions (e.g. necessary controls, absence of floor or ceiling effects) to test the proposed hypotheses and a statistical power analysis where applicable. As there may be aspects of the methodology and analysis which can only be refined once the work is undertaken, authors should outline potential assumptions and explicitly describe what aspects of the proposed analyses, if any, are exploratory.

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

3. Is the methodology feasible and described in sufficient detail to allow the work to be replicable?

Descriptions of methods and materials in the protocol should be reported in sufficient detail for another researcher to reproduce all experiments and analyses. The protocol should describe the appropriate controls, sample size calculations, and replication needed to ensure that the data are robust and reproducible.

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

4. Have the authors described where all data underlying the findings will be made available when the study is complete?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception, at the time of publication. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above and, if applicable, provide comments about issues authors must address before this protocol can be accepted for publication. You may also include additional comments for the author, including concerns about research or publication ethics.

You may also provide optional suggestions and comments to authors that they might find helpful in planning their study.

(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: I think the authors have addressed the reviewers' comments. I don't have further feedback at this time.

Reviewer #3: Good luck to the authors in conducting their well-designed review. Thank you for responding to my concerns in detail.

Reviewer #4: Thank you for choosing such an interesting topic. The search syntax, along with its NNR, should be added for at least one database.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

Reviewer #4: Yes: Manoosh Mehrabi

**********

Acceptance letter

Somayeh Delavari

18 Jan 2023

PONE-D-22-08111R2

Validity of constructed-response situational judgment tests in training programs for the health professions: A systematic review and meta-analysis protocol

Dear Dr. MacIntosh:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Somayeh Delavari

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Checklist. PRISMA-P (Preferred Reporting Items for Systematic review and Meta-Analysis Protocols) 2015 checklist: Recommended items to address in a systematic review protocol.

    (DOCX)

    S1 File

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers - Final.docx

    Attachment

    Submitted filename: PONE-D-22-08111R1 - Response to Reviewers - 20221202.docx

    Data Availability Statement

    No datasets were generated or analysed during the current study. All relevant data from this study will be made available upon study completion.

