2020 Apr 24;9:91. doi: 10.1186/s13643-020-01311-y

Table 6.

Details of studies which have used and validated six other tools identified as lower quality by this review for use in evaluating EBM teaching in medical education

Source instrument name and date	Instrument development, number of participants, level of expertise	EBM learning domains	Instrument description	EBM steps	Psychometric properties with results of validity and reliability assessment
Educational Prescription-David Feldstein [19]	20 residents	Knowledge and skills	Educational prescription (EP): web-based tool that guides learners through the four As of EBM. Learners use the EP to define a clinical question, document a search strategy, appraise the evidence, report the results and apply evidence to the particular patient	Asking, acquiring, appraising, applying	Predictive validity; interrater reliability. Interrater reliability on the 20 EPs showed fair agreement for question formation (k = 0.22); moderate agreement for overall competence (k = 0.57) and evaluation of evidence (k = 0.44); and substantial agreement for searching (k = 0.70) and application of evidence (k = 0.72)
BACES-Barlow [23]	Postgraduate medical trainees/residents (150 residents)	Knowledge, skills	Biostatistics and Clinical Epidemiology Skills (BACES) assessment for medical residents: 30 multiple-choice questions written to focus on interpreting clinical epidemiological and statistical methods	Appraisal: interpreting clinical epidemiology and statistical methods	Content validity was assessed through a four-person expert review. Item Response Theory (IRT) makes it flexible to use subsets of questions for other cohorts of residents (novice, intermediate and advanced). 26 items fit a two-parameter logistic IRT model and correlated well with their comparable classical test theory (CTT) values
David Feldstein-EBM test [18]	48 internal medicine residents	Knowledge and skills	EBM test of 25 mcqs covering seven EBM focus areas: (a) asking clinical questions, (b) searching, (c) EBM resources, (d) critical appraisal of therapeutic and diagnostic evidence, (e) calculating ARR, NNT and RRR, (f) interpreting diagnostic test results and (g) interpreting confidence intervals	Asking, acquiring and appraising: asking clinical questions, searching, EBM resources, critical appraisal, calculations of ARR, NNT, RRR, interpreting diagnostic test results and interpreting confidence intervals	Construct validity; responsive validity. EBM experts scored significantly higher on the EBM test than PGY-1 residents (p < 0.001), who in turn scored higher than 1st year students (p < 0.004). Responsiveness of the test was also demonstrated with 16 practising clinicians: mean difference in fellows’ pre-test to post-test EBM scores was 5.8 points (95% CI 4.2, 7.4)
Frohna-OSCE [22]	Medical students (n = 26) who tried the paper-based test during the pilot phase. A web-based station was then developed for full implementation (n = 140).	Skills	A web-based 20-min OSCE: a specific case scenario in which students asked a structured clinical question, generated effective MEDLINE search terms and selected the most appropriate of 3 abstracts	Ask, acquire, appraise and apply	Face validity; interrater reliability. Literature review and expert consensus. Between three scorers, there was good interrater reliability with 84, 94 and 96% agreement (k = 0.64, 0.82 and 0.91)
Tudiver-OSCE [21]	Residents: first year and second year	Skills	OSCE stations	Ask, acquire, appraise and apply	Content validity; construct validity (p = 0.43); criterion validity (p < 0.001); interrater reliability (ICC 0.96); internal reliability (Cronbach’s alpha 0.58)
Mendiola-mcq [20]	Fifth year medical students	Knowledge	MCQ (100 questions)	Appraise	Reliability of the mcq: Cronbach’s alpha 0.72 in the M5 group and 0.83 in the M6 group. Effect size (Cohen’s d) for the main outcome, the knowledge score comparison of M5 EBM vs M5 non-EBM, was 3.54

mcq multiple choice question, OSCE objective structured clinical examination, ICC intraclass correlation, NNT number needed to treat, ARR absolute risk reduction, RRR relative risk reduction

Assessment aims: formative
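
The EBM test in the table asks learners to calculate ARR, NNT and RRR. As a reminder of how these quantities relate, the following is a minimal sketch, not taken from any of the instruments above. The function name `risk_measures` and the trial counts are hypothetical, and the outcome is assumed to be an adverse event that the treatment is meant to prevent.

```python
import math
from fractions import Fraction


def risk_measures(control_events, control_total, treated_events, treated_total):
    """Return (ARR, RRR, NNT) for an adverse outcome the treatment should prevent.

    Exact fractions are used so that rounding NNT up is not affected by
    floating-point error.
    """
    control_rate = Fraction(control_events, control_total)
    treated_rate = Fraction(treated_events, treated_total)

    arr = control_rate - treated_rate   # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction: NNT is undefined for ARR <= 0")
    rrr = arr / control_rate            # relative risk reduction
    nnt = math.ceil(1 / arr)            # number needed to treat, rounded up
    return arr, rrr, nnt


# Hypothetical trial: 20/100 events in the control arm, 15/100 in the treated arm.
arr, rrr, nnt = risk_measures(20, 100, 15, 100)
print(f"ARR = {float(arr):.2f}, RRR = {float(rrr):.2f}, NNT = {nnt}")
```

With these made-up counts, the control event rate is 0.20 and the treated event rate is 0.15, so the ARR is 0.05, the RRR is 0.25 and the NNT is 20.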