Performance Validity Test Failure in the Clinical Population: A Systematic Review and Meta-Analysis of Prevalence Rates

Jeroen J Roor; Maarten J V Peters; Brechje Dandachi-FitzGerald; Rudolf W H M Ponds

2023 Mar 6;34(1):299–319. doi: 10.1007/s11065-023-09582-7

PMCID: PMC10920461 PMID: 36872398

Abstract

Performance validity tests (PVTs) are used to measure the validity of the obtained neuropsychological test data. However, when an individual fails a PVT, the likelihood that failure truly reflects invalid performance (i.e., the positive predictive value) depends on the base rate in the context in which the assessment takes place. Therefore, accurate base rate information is needed to guide interpretation of PVT performance. This systematic review and meta-analysis examined the base rate of PVT failure in the clinical population (PROSPERO number: CRD42020164128). PubMed/MEDLINE, Web of Science, and PsycINFO were searched to identify articles published up to November 5, 2021. Main eligibility criteria were a clinical evaluation context and utilization of stand-alone and well-validated PVTs. Of the 457 articles scrutinized for eligibility, 47 were selected for systematic review and meta-analyses. Pooled base rate of PVT failure for all included studies was 16%, 95% CI [14, 19]. High heterogeneity existed among these studies (Cochran's Q = 697.97, p < .001; I² = 91%; τ² = 0.08). Subgroup analysis indicated that pooled PVT failure rates varied across clinical context, presence of external incentives, clinical diagnosis, and utilized PVT. Our findings can be used for calculating clinically applied statistics (i.e., positive and negative predictive values, and likelihood ratios) to increase the diagnostic accuracy of performance validity determination in clinical evaluation. Future research is necessary with more detailed recruitment procedures and sample descriptions to further improve the accuracy of the base rate of PVT failure in clinical practice.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11065-023-09582-7.

Keywords: Prevalence, Base rate, Performance validity test, Invalid performance, Meta-analysis, Clinical assessments

Introduction

Neuropsychological assessment guides diagnostics and treatment in a wide range of clinical conditions (e.g., traumatic brain injury, epilepsy, functional neurological disorder, attention deficit hyperactivity disorder, multiple sclerosis, or mild cognitive impairment). Therefore, it is important that neuropsychological test results accurately represent a patient’s actual cognitive abilities. However, personal factors such as a lack of task engagement or malingering can invalidate a patient’s test performance (Schroeder & Martin, 2022). When invalid performance is not properly identified, clinicians risk attributing abnormally low scores to cognitive impairment, potentially leading to misdiagnosis and ineffective or even harmful treatments (e.g., Roor et al., 2016; van der Heide et al., 2020). Consequently, performance invalidity is not only relevant to diagnostics, but also extends to treatment efficacy (Roor et al., 2022).

Various tests are available for determining invalid performance on cognitive tests (for an overview, see Soble et al., 2022). Performance validity tests (PVTs) can be specifically designed to measure performance validity (i.e., stand-alone PVTs), or empirically derived from standard cognitive tests (i.e., embedded indicators). Overall, the psychometric properties of stand-alone PVTs have been found to be superior in comparison to embedded PVTs (Miele et al., 2012; Soble et al., 2022). Using well-researched stand-alone PVTs, the meta-analyses of Sollman and Berry (2011) found their aggregated mean specificity to be 0.90, with a mean sensitivity of 0.69. This finding is typical for stand-alone PVTs, for which empirical cutoff scores are chosen at a specificity of ≥ 0.90 to minimize the misclassification of a valid cognitive test performance as non-valid (i.e., a maximum 10% false positive rate).

Importantly, sensitivity and specificity should never be interpreted in isolation from other clinical metrics like base rates (Lange & Lippa, 2017). To determine the positive and negative predictive value of a PVT score, the base rate of the condition (here: performance invalidity) needs to be considered (Richards et al., 2015). Using Bayes' rule, the likelihood that PVT failure is indeed indicative of performance invalidity can be calculated based upon: 1) the base rate of invalid performance in the specific population of that individual; 2) the score of a PVT; 3) the sensitivity; and 4) the specificity of the utilized PVT (Dandachi-FitzGerald & Martin, 2022; Tiemens et al., 2020). Ignoring Bayes' rule potentially leads to overdiagnosis of invalid performance when the base rate of invalidity is low and to underdiagnosis when the base rate is high. Therefore, it is essential that base rate information is available for each PVT in a specific clinical context and, ideally, for specific clinical patient groups (Schroeder et al., 2021a).
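To make the role of the base rate concrete, the following minimal Python sketch applies Bayes' rule to a dichotomous pass/fail PVT outcome. The function name `predictive_values` is illustrative and not taken from the cited sources. The example reuses the aggregated sensitivity (0.69) and specificity (0.90) reported by Sollman and Berry (2011); the two base rates (8% and 30%) are assumptions chosen only for illustration.

```python
def predictive_values(base_rate, sensitivity, specificity):
    """Return (PPV, NPV) for a pass/fail test via Bayes' rule.

    Illustrative helper (hypothetical name); all inputs are proportions
    strictly between 0 and 1.
    """
    true_pos = sensitivity * base_rate               # invalid performers who fail
    false_pos = (1 - specificity) * (1 - base_rate)  # valid performers who fail
    true_neg = specificity * (1 - base_rate)         # valid performers who pass
    false_neg = (1 - sensitivity) * base_rate        # invalid performers who pass
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed base rates, for illustration only.
for base_rate in (0.08, 0.30):
    ppv, npv = predictive_values(base_rate, sensitivity=0.69, specificity=0.90)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumptions, the positive predictive value of the same failing score roughly doubles when the base rate rises from 8% to 30%, which is why a single PVT failure in a low base rate setting warrants more cautious interpretation.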

Early surveys among non-forensic clinical neuropsychologists reported an expectation that only 8% of general clinical referrals would produce invalid test results (Mittenberg et al., 2002). Over the last two decades, research on validity issues in clinical practice has increased significantly, and neuropsychologists have become more aware of the need to identify invalid test performance (Merten & Dandachi-FitzGerald, 2022; Sweet et al., 2021). These factors probably contributed to the nearly doubled median reported base rate of 15% across clinical contexts and settings in a more recent survey (Martin & Schroeder, 2020). However, research examining empirically derived base rates of invalidity in clinical settings has lagged behind.

To address this issue, McWhirter et al. (2020) undertook a systematic review to examine PVT failure in clinical populations. Their main finding was that PVT failure was common, with rates exceeding 25% for some PVTs and clinical groups. However, their study has been criticized on several aspects. First, as noted by Kemp and Kapur (2020), no distinction was made between stand-alone and (psychometrically inferior) embedded PVTs. Second, McWhirter et al. (2020) included studies that examined PVT failure in patients with dementia and intellectual disabilities, two groups in which PVTs are strongly discouraged due to unacceptably high false-positive rates when using the standard cutoffs (Larrabee et al., 2020; Lippa, 2018; Merten et al., 2007). Third, although studies in which ≥ 50% of the patient sample was involved in litigation or seeking welfare benefits were excluded, other types and lower rates of external gain incentives were not characterized. Therefore, external incentives that increase PVT failure rates in patients engaged in standard clinical evaluations (Schroeder et al., 2021b) may have contributed to their reported PVT failure rates. Importantly, McWhirter et al. (2020) summarized data on PVT failure based upon the literature search and data extraction performed by one author, without considering the quality of included studies or calculating a weighted average to get a more precise estimate.

The current meta-analysis is designed to address these gaps to improve the quality of reported PVT failure findings in clinical patient groups. The main aim of the present study is to provide comprehensive information regarding the base rate of PVT failure to facilitate its interpretation in clinical practice. We calculated pooled estimates of the base rate of PVT failure across the type of clinical context, distinct clinical patient groups, the potential for external incentives, and per PVT.

Methods

Search Strategy

This meta-analysis was conducted in accordance with the updated Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA; Page et al., 2021). A review protocol was registered at inception on PROSPERO (ID: CRD42020164128). The protocol was slightly modified to further improve the quality of included studies. Specifically, only stand-alone PVTs that met the restrictive selection criteria of Sollman and Berry (2011) were included, and one additional database was searched. In addition, we chose to focus solely on the base rate of PVT failure without also addressing its impact on treatment outcome. Electronic databases (PubMed/MEDLINE, Web of Science, and PsycINFO) were comprehensively searched using multiple terms for performance validity and neuropsychological assessment (see Online Resource 1 for detailed search strategies). The final search was conducted on November 5, 2021.

Study Selection

All studies in this systematic review and meta-analysis were performed in a clinical evaluation context of adult patients (18+ years of age), using the standard (per manual) administration procedures and cutoffs for the five stand-alone performance validity tests (PVTs) from Sollman and Berry (2011). These five PVTs are: the Word Memory Test (WMT; Green, 2003), the Medical Symptom Validity Test (MSVT; Green, 2004), the Test of Memory Malingering (TOMM; Tombaugh, 1996), the Victoria Symptom Validity Test (VSVT; Slick et al., 1997), and the Letter Memory Test (LMT; Inman et al., 1998). Based upon Grote et al. (2000), a higher cutoff was used for the hard items of the VSVT in patients with medically intractable epilepsy. All studies were original, peer-reviewed, and published in English. Studies were excluded if they examined PVT failure rate in a non-clinical context (i.e., forensic/medico-legal context, data generated for research purposes). Studies that only addressed PVT failure in a sample already selected upon initially passing/failing a PVT were also excluded (typically a known-groups design). Studies performed on patients diagnosed with intellectual disability or dementia were excluded, as well as studies with a small (sub)sample size (N < 20). Finally, we chose to exclude studies of Veterans/military personnel since the distinction between clinical and forensic evaluations is difficult to make within the context of the Veterans Affairs (VA) system (Armistead-Jehle & Buican, 2012).

Unique patient samples were ensured by carefully screening for similar samples used in different studies. When multiple studies examined the same patient sample, the study with the largest sample size was included or, when sample sizes were equal, the most recent paper.

Data Collection and Extraction

References resulting from the searches in PubMed/MEDLINE, Web of Science, and PsycINFO were imported into a reference manager (EndNote X8). After automatic duplicate removal, one of the investigators (JR) manually removed the remaining duplicate references. First, a single rater (JR) screened all titles and abstracts for broad suitability and eligibility. Doubtful references were discussed with a second rater (MP). If doubts remained, references were included for full-text scrutiny. Second, two independent raters (MP and JR) reviewed the remaining full texts based on the mentioned inclusion and exclusion criteria, for which the online systematic review tool Rayyan (Ouzzani et al., 2016) was used. The interrater reliability was substantial (Cohen’s κ = 0.63), with 89.83% agreement. A sizable number of studies failed to clearly state information used for inclusion in the current study, which contributed to the suboptimal agreement between the two independent raters. Therefore, corresponding authors were contacted when additional information was required (e.g., regarding clinical context, utilized PVT cutoff, number of subjects that were provided and failed a PVT, or language/version of the utilized PVT). Non-responders were reminded twice, and if no author response was elicited, studies were excluded. Discrepancies were resolved by discussion with a third and fourth reviewer (BD and RP). Finally, one investigator (JR) extracted relevant information from the included full-text articles, such as setting, sample size, mean age, and utilized PVT(s) according to a standardized data collection form (see Online Resource 2).
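Because raw percentage agreement and Cohen's kappa can diverge noticeably, a short sketch of how kappa corrects for chance agreement may be helpful. The function name and the counts below are hypothetical and are not the screening data of this review; they only illustrate that agreement near 90% can coexist with a kappa near 0.6 when most articles are excluded by both raters.

```python
def cohens_kappa(both_include, only_rater_a, only_rater_b, both_exclude):
    """Return (observed agreement, Cohen's kappa) for two raters
    making include/exclude decisions on the same set of articles."""
    total = both_include + only_rater_a + only_rater_b + both_exclude
    observed = (both_include + both_exclude) / total
    # Chance agreement is derived from each rater's marginal inclusion rate.
    a_include = (both_include + only_rater_a) / total
    b_include = (both_include + only_rater_b) / total
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not the study's data).
observed, kappa = cohens_kappa(both_include=10, only_rater_a=5,
                               only_rater_b=5, both_exclude=80)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With skewed marginals such as these, a large share of the observed agreement is expected by chance alone, so kappa is considerably lower than the raw agreement.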

Statistical Analyses

Statistical analysis was performed using MetaXL version 5.3 (www.epigear.com), a freely available add-in for meta-analysis in Microsoft Excel. Independence of effect sizes, a critical assumption in random-effects meta-analyses, was examined by checking if and how many studies used multiple, potentially inter-correlated PVTs from the same patient sample (Cheung, 2019). The frequencies of PVT failure from the individual studies were pooled in the meta-analysis using a double arcsine transformation. Back transformation was performed to report the pooled prevalence rates. We chose this transformation method to stabilize variance in the analysis. The double arcsine transformation has been shown to be preferable to the logit transformation or no transformation in the calculation of pooled prevalence rates (Barendregt et al., 2013). All analyses were performed using the random-effects model since it allows for between-study variation in PVT failure. Forest plots were used to visualize the pooled prevalence of PVT failure, with 95% confidence intervals (CIs). Where possible, subgroup analyses were performed to examine whether the base rate of PVT failure was related to specific clinical contexts, distinct patient groups, utilized PVT, and the presence of potential external gain. To further establish the generalizability of our study findings, the consistency across the included studies was assessed using Cochran’s Q-test (Higgins et al., 2003). For the Q-test, a p-value < 0.10 was considered to indicate statistically significant heterogeneity between studies. Because the number of included studies impacts the Q-test, we additionally evaluated the inconsistency index I² (Higgins & Thompson, 2002). An I² value over 75% would tentatively be classified as a “high” degree of between-study variance (Higgins et al., 2003). Since I² is a relative measure of heterogeneity and its value depends on the precision of included studies, we also calculated Tau squared (τ²). This measure quantifies the variance of the true effect sizes underlying our data, with larger values suggesting greater between-study variance (Borenstein et al., 2017).
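The pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reimplementation of MetaXL: it uses the Freeman-Tukey double arcsine transformation with variance 1/(n + 0.5), a DerSimonian-Laird estimate of τ² (on the transformed scale), and a simplified back-transformation, sin²(t/2), that may differ slightly from the one MetaXL applies. The function names are hypothetical. The example input is an arbitrary handful of rows from Table 1, so the output is not expected to match the pooled estimates reported in this review.

```python
import math


def double_arcsine(events, n):
    """Freeman-Tukey double arcsine transform and its approximate variance."""
    t = (math.asin(math.sqrt(events / (n + 1)))
         + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (n + 0.5)


def back_transform(t):
    """Simplified back-transformation (assumption): p = sin^2(t / 2)."""
    t = min(max(t, 0.0), math.pi)  # keep the mapping monotonic
    return math.sin(t / 2) ** 2


def pool_prevalence(studies):
    """Pool (events, n) pairs with a DerSimonian-Laird random-effects model."""
    ts, vs = zip(*(double_arcsine(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]                      # inverse-variance weights
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum(w)
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]          # random-effects weights
    pooled_t = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": back_transform(pooled_t),
        "ci_low": back_transform(pooled_t - 1.96 * se),
        "ci_high": back_transform(pooled_t + 1.96 * se),
        "q": q, "i2": i2, "tau2": tau2,
    }


# Arbitrary subset of Table 1 rows as (failures, sample size), for illustration.
example = [(24, 100), (40, 104), (22, 404), (49, 615), (104, 326)]
result = pool_prevalence(example)
print({key: round(value, 3) for key, value in result.items()})
```

The sketch also shows why I² and τ² complement Q: Q grows with the number and precision of studies, I² rescales the excess of Q over its degrees of freedom to a percentage, and τ² expresses the between-study variance in the units of the transformed effect size.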

Study Quality

An adapted version of the Prevalence Critical Appraisal Tool of the Joanna Briggs Institute (Munn et al., 2015) was used to rate the quality of all included studies. Amongst the currently available tools, it addresses the most important items related to the methodological quality when determining prevalence (Migliavaca et al., 2020). Three study quality domains were assessed: selection bias (items 1, 2, and 4), sample size/statistics (items 3 and 5), and attrition bias (item 6; see Online Resource 3 for a detailed description).

The Doi plot and the LFK index are relatively new graphical and quantitative methods that were used for detecting publication bias (Furuya-Kanamori et al., 2018). These analyses were also implemented using MetaXL. In contrast to the scatter plot of precision used in the more standard funnel plot, the Doi plot uses a quantile plot, which provides a better visual representation of normality (Wilk & Gnanadesikan, 1968). A symmetric inverted funnel is created with a Z-score closest to zero at its tip if the trials are not affected by publication bias. The LFK index then quantifies the two areas under the Doi plot. The interpretation is based on the a priori concern about positive or negative publication bias. Since we were concerned about possible positive publication bias, an LFK index > 1 was considered consistent with positive publication bias. Even when the number of included studies is limited, the LFK index has better sensitivity than the more standard Egger's test (Furuya-Kanamori et al., 2018).

Results

Literature Search

Figure 1 gives an overview of the search and selection process. Of the 13,587 identified abstracts, 457 (3.4%) were included for full-text scrutiny. We contacted the first author of 37 studies for additional information, and 30 authors responded. This resulted in 47 observational studies of PVT failure in the clinical context, with a total sample size of n = 6,484.

Fig. 1 — PRISMA flow chart of study selection

Characterization of Included Studies

Table 1 reports study characteristics, including clinical context, clinical patient group, and sample size. Most studies were performed in a medical hospital (k = 25), with others in an epilepsy clinic (k = 7), psychiatric institute (k = 6), rehabilitation clinic (k = 4), and private practice (k = 2). Three studies (6.4%) did not specify clinical context. In 15/47 (31.9%) of the studies, prevalence of PVT failure was reported for heterogeneous patient samples. The majority of the studies (32/47; 68.1%) reported PVT failure rates for one or multiple diagnostic subgroups. The diagnostic (sub)groups consisted of patients with traumatic brain injury (TBI) in most studies (k = 10), followed by patients with epilepsy (k = 9), patients with psychogenic non-epileptic seizures (PNES; k = 5), patients that were seen for attention deficit hyperactivity disorder (ADHD) assessment (k = 4), patients with mild cognitive impairment (MCI; k = 4), patients with multiple sclerosis (MS; k = 2), and patients with Parkinson’s disease (k = 2). Severity of TBI was not always specified or was poorly defined. The remaining diagnostic (sub)groups (i.e., sickle cell disease, Huntington’s disease, patients with substance-use related disorders (SUD), inpatients with depression, memory complaints) were examined in single studies. In more than half of the included studies (25/47; 53.2%), the language (proficiency) of the included patient sample was not reported. Potential external gain was not mentioned in 12/47 (25.5%) studies, and the remaining studies varied greatly in how they addressed its presence. Of the remaining 35 studies, only seven (i.e., 20%) specified how external gain was examined (Domen et al., 2020; Eichstaedt et al., 2014; Galioto et al., 2020; Grote et al., 2000; Rhoads et al., 2021a; Williamson et al., 2012; Wodushek & Domen, 2020). In most studies (28/35; 80%), the way the authors examined this variable (e.g., by checking the patient’s medical record or querying patients about potential incentives being present during the assessment procedure) was not specified. Moreover, in only 4/35 (11.4%) studies, subjects were excluded when external gain incentives (e.g., workers compensation claim) were present (Dandachi-FitzGerald et al., 2020; Davis & Millis, 2014; Merten et al., 2007; Wodushek & Domen, 2020).

Table 1.

Summary details for individual studies that reported the prevalence of PVT failure in clinical patients

Study	Type of clinical context	Sample type	Sample (n)	Mean age (SD)	Mean education (SD)	Country	Language	External incentive	PVT	Administration	PVT failure n (%)^*
Cragar et al. (2006)	Epilepsy clinic	Epilepsy	41	36.0 (9.6)	11.9 (3.2)	United States	Not mentioned	'48% on or seeking disability'. Table 1, p. 559	TOMM T2 or Retention	Standard	1 (2.4)
		Epilepsy	41	36.0 (9.6)	11.9 (3.2)			'48% on or seeking disability'. Table 1, p. 559	LMT	Computerized	7 (17.1)
		PNES	21	40.8 (10.3)	13.7 (6.1)			'35% on or seeking disability'. Table 1, p. 559	TOMM T2 or Retention	Standard	3 (14.3)
		PNES	21	40.8 (10.3)	13.7 (6.1)			'35% on or seeking disability'. Table 1, p. 559	LMT	Computerized	5 (23.8)
Czornik et al. (2021)	Medical hospital	MCI	28	66.8 (9.9)^a	12.5 (4.2)^a	Austria	Not mentioned	'Information about possible secondary gain was not available.' p. 272	WMT IR, DR, or CNS	Computerized German version	2 (7.1)
Dandachi-FitzGerald et al. (2020)	Medical hospital	MCI	41	78.0 (7.2)	[70.7% medium education. Table 1, p. 317]	Netherlands	Dutch	'Involvement in juridical procedures (e.g., litigation)' as exclusion criterion, p. 315	TOMM T2	Dutch version	3/38 (7.9)^g
Dandachi-FitzGerald et al. (2020)	Medical hospital	Parkinson's disease	41	63.7 (8.1)	[56.1% medium education. Table 1, p. 317]	Netherlands	Dutch		TOMM T2	Dutch version	1/40 (2.5)^g
Davis and Millis (2014)	Medical hospital	Heterogeneous neurological	87	42.9 (12.8)^a	14.0 (2.3)^a	United States	English, with '19% of the sample reported history of English as a second language', p. 202	'Subjects with potential external incentives were excluded in subgroup analysis.' p. 204	WMT IR, DR, or CNS	Standard	12/58 (20.7)^h
Deloria et al. (2021)	Private practice	Heterogeneous	181	58.0 (15.7)	13.7 (2.5)	United States	Not mentioned	'7.2% of the sample had indication of involvement in disability or litigation claims.' p. 3	TOMM T2 or Retention	Standard	7/38 (18.4)^i
Dodrill (2008)	Epilepsy clinic	Epilepsy	65	35.2 (12.9)	12.0 (3.0)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version	16 (25.0)
Dodrill (2008)	Epilepsy clinic	PNES	32	42.2 (11.6)	12.7 (2.4)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version	9 (28.0)
Domen et al. (2020)	Medical hospital	MS	84^b	46.5 (12.9)^b	14.9 (2.6)^b	United States	English	'Of note, 16.67% of the analyzed sample endorsed currently applying for disability, and this information was unknown for another 29.76%.' p. 512	MSVT IR, DR, or CNS	Standard	13/108 (12.0)
Donders and Strong (2011)	Rehabilitation clinic	TBI	100	37.5 (13.8)^a	13.28 (2.3)^a	United States	English	'(n = 28) were involved in disputed financial compensation seeking at the time of the neuropsychological assessment'. p. 176	WMT IR, DR, or CNS	Standard	24 (24.0)
Dorociak et al. (2018)	Medical hospital	Sickle cell disease	54	40.61 (12.3)	13.13 (2.3)	United States	Not mentioned	'None of the subjects was applying for disability or had other known financial incentives related to cognitive status'. p. 85	TOMM T2	Standard	1 (1.9)
Drane et al. (2006)	Epilepsy clinic	Epilepsy	41^a	36.9 (14.4)	12.6 (2.3)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version	3/37 (8.1)
Drane et al. (2006)	Epilepsy clinic	PNES	43^a	40.6 (10.2)	12.4 (2.6)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version	22 (51.2)
Eichstaedt et al. (2014)	Medical hospital	Epilepsy	26	37.8 (11.6)	12.7 (2.6)	United States	English	'Five participants with LTLE reported receiving disability benefits at the time of evaluation, and none failed the WMT.' p. 947	WMT IR, DR, or CNS	Standard	6 (23.1)
Erdodi et al. (2018)	Medical hospital	TBI	104	38.8 (16.7)	13.7 (2.6)	Not mentioned	Not mentioned	'No data were available on litigation status'. p. 848	WMT IR, DR, or CNS	Standard	40 (38.5)
Galioto et al. (2020)	Medical hospital	MS	102	47.2 (11.4)	14.4 (2.6)	United States	English	For MS patients only: 27.9% no seeking disability, 38.5% seeking disability, 33.7% already receiving disability, Table 1, p. 1031	VSVT hard items^f	Standard	15 (14.4)
		Epilepsy	102	47.2 (11.8)	14.3 (2.5)						6 (5.8)
		mTBI	50	42.7 (13.5)	14.3 (2.1)						10 (20.4)
Gorissen et al. (2005)	Mental healthcare institute	Schizophrenia spectrum	64	[Between 18–65. p. 201]	[Less than 6 years of education as exclusion criterion. p. 201]	Netherlands & Spain	Dutch & Spanish	Not mentioned	WMT IR, DR, or CNS	Dutch and Spanish oral versions	46 (72.0)
		Psychiatric (heterogeneous)	63								16 (25.0)
		Neurological (heterogeneous)	20								2 (10.0)
Grote et al. (2000)	Epilepsy clinic	Epilepsy	30	33.4 (10.6)	14.0 (2.6)	Not mentioned	Not mentioned	'They were not seeking compensation at the time of their neuropsychological evaluation. 8 (26.7%) were receiving disability at the time of evaluation because of their seizure disorders.'. p. 711	VSVT hard items	Standard	0
Haber and Fichtenberg (2006)	Rehabilitation clinic	TBI	22	36.4 (13.9)	12.2 (1.4)	United States	Not mentioned	'Subjects were not involved in litigation or workers’ compensation cases.' p. 526	TOMM T2	Standard	0
Haggerty et al. (2007)	Medical hospital	Heterogeneous	300	44.7 (13.0)	13.8 (2.5)	United States	Not mentioned	'Approximately 16% of the sample was involved in litigation and/or seeking compensation for an illness or injury (e.g., workers’ compensation, disability) at the time of their evaluations.' p. 921	VSVT hard items	Standard	24 (8.0)
Harrison and Armstrong (2020)	Mental healthcare institute	ADHD	245	20.4 (1.8)	[All participants were students. '57.1% in their first or second year.' p. 316]	Canada	Not mentioned	Not mentioned	MSVT IR, DR, or CNS	Standard	49 (20.0)
Harrison et al. (2021)	Mental healthcare institute	ADHD	2463	21.8 (5.9)	['All students were high school graduates or equivalent, with their college or university program in progress.' p. 2]	Canada	Not mentioned	'All were seeking a diagnosis to allow access to disability supports and services'. p. 3	MSVT IR, DR, or CNS	Standard	57/648 (8.8)
Harrison et al. (2021)	Mental healthcare institute	ADHD	2463	21.8 (5.9)		Canada	Not mentioned		WMT IR, DR, or CNS	Standard	206/1810 (11.4)
Hoskins et al. (2010)	Epilepsy clinic	Epilepsy	31	38.5 (12.0)	12.7 (2.7)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version (n = 30); computerized version (n = 31)	7 (22.6)^j
Hoskins et al. (2010)	Epilepsy clinic	PNES	30	38.5 (12.0)	12.7 (2.7)	United States	English	Not mentioned	WMT IR, DR, or CNS	Oral version (n = 30); computerized version (n = 31)	11 (36.7)^j
Jennette et al. (2021)	Medical hospital	Heterogeneous	128	45.7 (16.4)	14.0 (2.6)	United States	English,14% bilingual English. Table 1, p. 4	14% compensation seeking. Table 1, p. 4	MSVT IR, DR, or CNS	Standard	32 (25.0)
Keary et al. (2013)	Epilepsy clinic	Epilepsy	404	38.5 (12.1)	13.3 (2.1)	United States	Not mentioned	'None of the patients in this clinically referred sample were known to be involved in litigation regarding their medical status or seeking financial compensation at the time of their neuropsychological evaluations.' p. 315	VSVT hard items^f	Standard	22 (5.4)
Krishnan and Donders (2011)	Rehabilitation clinic	TBI	115	40.71 (13.3)	12.86 (2.2)	United States	Not mentioned	32% seeking financial compensation. Table 1, p. 179	TOMM T2 or Retention	Standard	3/39 (8)
Krishnan and Donders (2011)	Rehabilitation clinic	TBI	115	40.71 (13.3)	12.86 (2.2)	United States	Not mentioned	32% seeking financial compensation. Table 1, p. 179	WMT IR, DR, or CNS	Standard	25/81 (31)
Leppma et al. (2018)	Mental healthcare institute	ADHD	350	22.6	[22.9% Graduate Students. p. 213]	United States	Not mentioned	Not mentioned	NV-MSVT IR, DR, or CNS	Standard	68 (21.1)
Locke et al. (2008)	Medical hospital	Acquired brain injury	87	36.3 (12.2)	13.4 (2.5)	United States	Not mentioned	'76% of the sample was on disability at the time of the evaluation.' p. 275	TOMM T2	Standard	19 (21.8)
Loring et al. (2007)	Medical hospital	Neurological (heterogeneous)	27	47.4 (13.2)	13.9 (2.4)	United States	Not mentioned	'No known external financial incentive'. p. 524	VSVT hard items	Standard	2 (7.0)
		Memory complaints	163	51.8 (13.0)	13.8 (2.6)						16 (10.0)
		TBI	49	36.7 (10.8)	12.7 (1.9)						6 (12.0)
Loring et al. (2005)	Medical hospital	Epilepsy	120	34.5 (11.1)	12.6 (2.2)	United States	Not mentioned	'Not actively screened for compensation status.' (p. 611)	VSVT hard items^f	Standard	14 (11.7)
Marshall et al. (2016)	Mental healthcare institute	ADHD	428	26.4 (7.8)	14.4 (1.9)	United States	Not mentioned	Not mentioned	WMT IR, DR, or CNS	Standard	53/174 (30.4)^k
Martins and Martins (2010)	Not specified	MCI	21	71.2 (2.0)	[71.4% had less than 6 years of education. Table 1, p. 178]	Portugal	Portuguese	'None of these patients had any identifiable secondary gain. All patients were retired and without ongoing legal processes.' p. 178	WMT IR, DR, or CNS	Portuguese computerized version	14 (67.0)
Merten et al. (2007)	Medical hospital	Heterogeneous	48	56.4 (13.1)^a	[Minimum of 8 years of formal schooling as inclusion criterion. p. 309]^a	Germany	German	'Involvement in litigation' as exclusion criterion. p. 309	TOMM T2 or Retention	German version	1/24 (4.2)^c
Merten et al. (2007)	Medical hospital	Heterogeneous	48	56.4 (13.1)^a		Germany	German	'Involvement in litigation' as exclusion criterion. p. 309	WMT IR, DR, or CNS	Oral, German version	2/24 (8.3)
Meyers et al. (2014)	Not specified	Heterogeneous	255	34.5 (12.1)	12.5 (1.8)	United States	Not mentioned	'76 [subjects] were in litigation' p. 225	WMT IR, DR, or CNS	Standard	98 (38.4)
Moore and Donders (2004)	Rehabilitation clinic	TBI	132	35.8 (14.2)	12.3 (2.6)	United States	Not mentioned	'Those seeking financial compensation (n = 26) were not excluded.' p. 977	TOMM T2	Standard	11 (8.3)
Neale et al. (2022)	Medical hospital	Heterogeneous	147	46.4 (14.5)	13.2 (2.2)	United States	Not mentioned	20/147 (13.6%) were seeking compensation. Table 2, p. 5	MSVT IR, DR, or CNS	Standard	30/145 (20.7)^l
Rees et al. (2001)	Medical hospital	Inpatients with depression	26	40.4 (11.2)	4.9 (2.8)	Canada	English	Not mentioned	TOMM T2 or Retention	Standard	0
Rees et al. (2001)	Medical hospital	TBI	24	40.4 (14)	13.6 (2.7)	Canada	English	Not mentioned	TOMM T2 or Retention	Standard	0
Resch et al. (2021)	Not specified	Heterogeneous	88	31.7 (10.2)	15.4 (2.3)	United States	English	Not mentioned	TOMM T2	Standard	10/33 (30.0)
Rhoads et al. (2021a)	Medical hospital	Heterogeneous	132	45.1 (16.3)	14.0 (2.6)	United States	English	'Finally, 15% of patients (n = 20) reported being concurrently compensation-seeking (e.g., disability) at the time of their clinical evaluation' p. 135–136	MSVT IR, DR, or CNS	Standard	28 (21.2)^m
Rhoads et al. (2021b)	Medical hospital	Heterogeneous	112	60.6 (15.9)	8.1 (4.5)	United States	Spanish	n = 20 (17.9%) compensation seeking. Table 3, p. 272	TOMM T2	Spanish version	20/86 (23.3)
Sabelli et al. (2021)	Private practice	mTBI	326	39.5 (11.8)	12.1 (2.6)	Canada	Not mentioned	Not mentioned	WMT IR, DR, or CNS	Standard	104 (31.9)
Schroeder et al. (2019)	Medical hospital	Heterogeneous	162	46.4 (13.2)	13.6 (2.3)	United States	Not mentioned	'Roughly 65% of the sample had known or suspected secondary gain associated with the evaluation. The secondary gain was most commonly related to a pursuit of: disability, civil litigation, or workers compensation.', p. 468	TOMM T2 or Retention	Standard	25 (15.0)
Sharland et al. (2018)	Medical hospital	Heterogeneous	615	43.4 (12.8)	12.6 (2.6)	United States	English	'Additionally, while the sample was clinical in nature, it is possible a proportion of participants were also involved in litigation, applying for disability, or on workers compensation'. p. 105	TOMM T2 or Retention	Standard	49 (7.9)^n
Sieck et al. (2013)	Medical hospital	Huntington disease	36	46.1 (12.6)	13.6 (2.2)	United States	Not mentioned	Not mentioned	TOMM T2	Standard	3 (8.2)
Silverberg et al. (2017)	Medical hospital	mTBI	80	40.8 (12.0)	[n = 37 (46.3%) with postsecondary degree. Table 1, p. 2143]	United States	English	n = 71 (86.4%) receiving or seeking injury compensation. Table 1, p. 2143	MSVT IR, DR, or CNS	Standard	20 (25.0)
Teichner and Wagner (2004)	Medical hospital	Cognitive impairment, not demented	36	70.6 (8.1)	14.2 (3.2)	United States	Not mentioned	Not mentioned	TOMM T2 or Retention	Standard	3 (8.3)
Teichner and Wagner (2004)	Medical hospital	Cognitively intact	21	65.6 (8.6)	14.2 (3.6)	United States	Not mentioned	Not mentioned	TOMM T2 or Retention	Standard	0
Vilar-López et al. (2021)	Mental healthcare institute	Substance-use related disorders (SUD)	77	43.2 (8.2)^d	[patients with primary schooling (n = 35) constituted the largest category. Table 1, p. 257]	Spain	Spanish	' The second group, made up of SUD patients with compensation seeking (n = 36), completed a neuropsychological evaluation in order to apply for economic compensation due to their disability (according to their disability level, participants could obtain a monthly payment for life or a payment reviewable at 4 years).' p. 256–257	TOMM T2 and Retention	Spanish version	1 (1.3)
Walter et al. (2014)	Medical hospital	MCI	31	66.0 (8.0)	14.7 (2.1)	United States	Not mentioned	'No participant was involved in litigation at the time of the evaluation or had a substantial external incentive to perform poorly.' p. 1200	TOMM T2	Standard	3 (9.7)
Wodushek and Domen (2020)	Medical hospital	Parkinson's disease	55	65.2 (8.9)	14.9 (2.8)	United States	English	'Cases were excluded from analysis if the patient reported being involved in litigation, or if there was an obvious external or secondary gain issue, such as a disability application (n = 1).' p. 11	MSVT IR, DR, or CNS	Standard	5/51 (9.8)
Williamson et al. (2012)	Epilepsy clinic	PNES	90	39.0 (8.3)^e	13.2 (2.0)^e	United States	English	'The presence of financial incentives was determined on the basis of patient report. Patients were classified as having financial incentives if they were currently receiving or applying for disability benefits or other forms of financial compensation (e.g., worker’s compensation)'. p. 591	WMT IR, DR, or CNS	Oral version	32 (35.5)

ADHD attention deficit hyperactivity disorder; CNS consistency score; DR delayed recognition; IR immediate recognition; LMT letter memory test; MCI mild cognitive impairment; MS multiple sclerosis; (m)TBI (mild) traumatic brain injury; (nv)MSVT (non-verbal) medical symptom validity test; PNES psychogenic non-epileptic seizures; T2 trial 2; TOMM test of memory malingering; VSVT Victoria symptom validity test; WMT word memory test

^* In case not all subjects in the patient sample received a given PVT, the proportion is shown in this column (…/…)

^a The authors provided demographics for the total patient sample, not for the (sub)samples for which PVT failure rates were reported

^b The authors only provided demographics for the final sample, after excluding cases with missing data. The initial total sample that was provided with a PVT was larger than the sample size detailed in the article

^c Results of the "no clinically obvious cognitive impairment" subgroup (n = 24)

^d Demographics for the total sample (n = 77) were not mentioned. Therefore, we choose to display the demographics of the non-compensation subgroup (n = 41)

^e Demographics for the total sample (n = 90) were not mentioned. Therefore, we choose to display the demographics of the WMT fail subgroup (n = 32)

^f VSVT hard items cutoff per Grote et al., 2000 in epilepsy (sub)sample

^g B. Dandachi-Fitzgerald (personal communication, February 4, 2022)

^h J. Davis (personal communication, February 26, 2021)

^i A. Kivisto (personal communication, February 9, 2022)

^j D. Drane (personal communication, June 30, 2021)

^k P. Marshall (personal communication, February 26, 2021)

^l A. Neale (personal communication, February 5, 2022)

^m T. Rhoads (personal communication, December 4, 2021)

^n M. Sharland (personal communication, February 4, 2022)

The TOMM was the most frequently administered PVT (k = 18), followed by the WMT (k = 17), the MSVT (k = 9), the VSVT (k = 6), and the LMT (k = 1). Only 4/47 (8.5%) studies employed two PVTs (none used > 2 PVTs that fulfilled the inclusion and exclusion criteria). The other 43/47 (91.5%) studies used one PVT. In two of the four studies reporting two PVTs, the same PVTs were not administered to all participants. Harrison et al. (2021) administered the MSVT to 648 patients and the WMT to 1810 patients, and Krishnan and Donders (2011) administered the TOMM to 39 patients and the WMT to 81 patients. Inclusion in these studies was – amongst others – based upon failing one PVT. Furthermore, these studies did not report the number of subjects that were provided with both PVTs. Therefore, it is unclear to what extent the reported PVT failure rates in these studies are influenced by potential dependence. In the two other studies reporting two PVTs (i.e., Cragar et al., 2006; Merten et al., 2007), all patients were administered both PVTs. The total number of subjects in these two studies that reported two likely dependent effect sizes was n = 76. This is 1.2% of the total of n = 6487 patients from all 47 studies. We therefore argue that the reported effect sizes from the 47 included studies are (largely) independent.

Methodological Quality Assessment

A summary of the methodological quality of the included studies for determining prevalence is provided in Online Resource 4. No study was rated as having high quality; all had limitations in at least one of the three prespecified domains (selection bias, attrition bias, and sample size/statistical analyses). Most studies had a study sample that addressed the target population (k = 41, 87.2%), whereas only a minority described relevant assessment and patient characteristics (k = 15, 31.9%). The majority of included studies failed to clearly state how patients were recruited (k = 27, 57.4%). Eleven studies (23.4%) had an inadequate response rate. The majority of the studies used appropriate statistical analyses (k = 41, 87.2%), but most also had inappropriate sample sizes (k = 39, 83.0%).

The shape of the Doi plot showed slight asymmetry (see Online Resource 5), and the results of the LFK index (1.09) revealed minor asymmetry indicative of potential positive publication bias.

Base Rate of PVT Failure in Clinical Patients

The pooled prevalence of PVT failure of all (n = 47) included studies was 16%, 95% CI [14, 19]. Significant between-study heterogeneity and high between-study variability existed (Cochran's Q = 697.97, p < 0.001; I² = 91%; τ² = 0.08) as revealed by the large 95% CIs (see Fig. 2). The high I² statistic indicates that the variation in reported PVT failure is likely a result of true heterogeneity rather than chance.

Fig. 2. Forest plot of the 47 included studies estimating the pooled prevalence of PVT failure in the clinical setting. CI = confidence interval. Note: Weights are from random effects analysis

Subgroup Analyses based upon Clinically Relevant Characteristics

To facilitate the interpretation of PVT failure in clinical practice, subgroup analyses were performed for clinically relevant characteristics associated with performance validity (Table 2). It is important to emphasize that some of these findings are based upon relatively small numbers of studies (i.e., k = 2 or 4), potentially impacting the stability of the reported estimates.

Table 2.

Pooled prevalence of PVT failure in clinical patients, stratified by false-positive scrutinization, clinical context, external gain incentives, clinical diagnosis, and PVT

Clinical characteristics	n/k	Pooled PVT failure rate (%)	95% CI	I²(%)	τ²
Overall	6,484/47	16	14–19	91	0.08
False-positive scrutinization
Probable risk of false-positive PVT failure classification	85/2	70	60–80	0	0.00
Probable no risk of false-positive PVT failure classification	6,399/46	15	13–18	89	0.07
Clinical setting*
Private practice	364/2	27	15–40	66	0.03
Epilepsy clinic	824/7	19	10–29	91	0.17
Mental healthcare institute	1,577/6	15	10–24	92	0.04
Medical hospital	3,057/25	12	10–15	81	0.05
Rehabilitation clinic	293/4	13	4–25	88	0.10
Subjects with potential external gain incentives excluded?*
Yes	211/4	10	5–15	45	0.02
No	6,188/42	16	13–19	90	0.07
Clinical diagnosis*
PNES	216/5	33	24–43	53	0.03
ADHD	1,417/4	17	11–23	94	0.03
(m)TBI	926/10	17	10–25	89	0.09
MS	210/2	13	9–18	0	0.00
Epilepsy	856/9	11	6–16	79	0.05
MCI	97/3	9	4–16	0	0.00
Parkinson’s disease	91/2	6	1–15	45	0.02
PVT*
WMT	1,482/13	25	19–32	93	0.10
(nv)MSVT	1,891/9	18	13–23	85	0.03
TOMM	1,759/18	9	6–12	80	0.05
VSVT	1,347/6	9	7–12	64	0.01

ADHD attention deficit hyperactivity disorder; CI confidence interval; MCI mild cognitive impairment; MS multiple sclerosis; (m)TBI (mild) traumatic brain injury; (nv)MSVT (non-verbal) medical symptom validity test; PNES psychogenic non-epileptic seizures; PVT performance validity test; TOMM test of memory malingering; VSVT Victoria symptom validity test; WMT word memory test. * after exclusion of the subsamples of subjects with a probable risk of false-positive PVT failure classification (i.e., k = 46)

False-Positive Scrutinization

Although we excluded studies that examined PVT failure rates in patients with dementia or intellectual disability a priori, the included studies might still comprise patient samples with other conditions or combinations of characteristics that make them highly susceptible to false-positive PVT failure classification. Therefore, and in line with clinical guidelines (Sweet et al., 2021), we first examined included studies for the risk of unacceptably low specificity rates when applying standard PVT cutoffs, and two studies were identified. First, PVT performance in the subsample of severely ill, mostly inpatient schizophrenia spectrum patients from Gorissen et al. (2005) was significantly correlated with negative symptoms and general psychopathology. Second, the MCI subjects from Martins and Martins (2010) were of advanced age, Spanish speaking, and had the lowest formal schooling of all included studies (i.e., 71.4% had less than 6 years of formal education). These cultural/language factors in combination with low formal schooling are associated with unacceptably low specificity rates when applying standard PVT cutoffs (Robles et al., 2015; Ruiz et al., 2020). Exclusion of the subsample of patients with schizophrenia in the Gorissen et al. (2005) study and of the Martins and Martins (2010) study led to a pooled prevalence of PVT failure of 15% (95% CI [13, 18]; Cochran's Q = 573.73, p < 0.01; I² = 89%; τ² = 0.07). However, after exclusion of these patient samples, between-study heterogeneity and between-study variability were still high, as indicated by a significant Cochran's Q statistic and a high I² statistic. Further subgroup analyses were performed in the remaining studies (k = 46; see Table 1).

Clinical Context

The pooled prevalence of PVT failure was the highest in the context of a private practice (27%, 95% CI [15, 40]; Cochran's Q = 2.98, p = 0.08, I² = 66%; τ² = 0.03), followed by the epilepsy clinic (19%, 95% CI [10, 29]; Cochran's Q = 128.07, p < 0.001, I² = 91%; τ² = 0.17), the psychiatric institute (15%, 95% CI [10, 21]; Cochran's Q = 92.30, p < 0.001, I² = 92%; τ² = 0.04), the medical hospital (12%, 95% CI [10, 15]; Cochran's Q = 160.51, p < 0.001, I² = 81%; τ² = 0.05) and the rehabilitation clinic (13%, 95% CI [4, 25]; Cochran's Q = 31.07, p < 0.001, I² = 88%; τ² = 0.10). As can be seen, heterogeneity of pooled PVT failure rates was significant and between-study variability was moderately-high to high for all types of clinical context.

Clinical Diagnoses

The pooled prevalence of PVT failure was the highest for patients with PNES (33%, 95% CI [24, 43]; Cochran's Q = 10.65, p = 0.06; I² = 53%; τ² = 0.03), followed by subjects seen for ADHD assessment (17%, 95% CI [11, 23]; Cochran's Q = 68.80, p < 0.01; I² = 94%; τ² = 0.03), (m)TBI (17%, 95% CI [10, 25]; Cochran's Q = 89.57, p < 0.01; I² = 89%; τ² = 0.09), MS (13%, 95% CI [9, 18]; Cochran's Q = 0.32, p = 0.57; I² = 0%; τ² = 0.00), epilepsy (11%, 95% CI [6, 16]; Cochran's Q = 42.21, p < 0.001; I² = 79%; τ² = 0.05), MCI (9%, 95% CI [4, 16]; Cochran's Q = 0.11, p = 0.95; I² = 0%; τ² = 0.00), and Parkinson's disease (6%, 95% CI [1, 15]; Cochran's Q = 1.81, p = 0.18; I² = 45%; τ² = 0.02). Based upon Cochran's Q, heterogeneity of pooled PVT failure rates was significant in patients with PNES, (m)TBI, epilepsy, and subjects seen for ADHD assessment. Non-significant heterogeneity in pooled PVT failure rates was found in patients with Parkinson's disease, MCI, and MS. Based upon the I² statistic, variability of base rate estimates of PVT failure was low in patients with MCI and MS, and (moderately) high for the other diagnostic patient groups. This suggests that for studies in patients with MCI and MS, the pooled PVT failure rates are more homogeneous. However, since these calculations are based upon small numbers of studies, these findings should be interpreted with caution (von Hippel, 2015).
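To make the relation between Cochran's Q and the I² statistic concrete, the following minimal Python sketch applies the commonly used definition I² = max(0, (Q − df)/Q) × 100 (Higgins & Thompson, 2002) to three of the diagnostic subgroups above. It assumes that each study contributes exactly one effect size, so that df = k − 1. That assumption is ours and may not hold for every subgroup in Table 2; the function name `i_squared` is illustrative and is not taken from the software used in this review.

```python
def i_squared(q: float, n_effect_sizes: int) -> float:
    """Return I² (in %) from Cochran's Q, assuming df = n_effect_sizes - 1."""
    df = n_effect_sizes - 1
    if q <= 0:
        return 0.0
    # I² is truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, (q - df) / q) * 100.0


# Q values and k taken from the text and Table 2 (one effect size per study assumed).
subgroups = {
    "Parkinson's disease": (1.81, 2),
    "MS": (0.32, 2),
    "MCI": (0.11, 3),
}

for name, (q, k) in subgroups.items():
    print(f"{name}: I² = {i_squared(q, k):.0f}%")
# Parkinson's disease: I² = 45%
# MS: I² = 0%
# MCI: I² = 0%
```

Under this assumption the sketch reproduces the I² values reported for these three groups. Because Q has so few degrees of freedom when k = 2 or 3, a single additional study could shift I² considerably, which illustrates why these estimates warrant caution.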

External Gain Incentives

In the four studies where patients with potential external gain incentives were excluded from analysis, the pooled prevalence of PVT failure was as low as 10% (95% CI [5, 15]; Cochran's Q = 9.17, p = 0.10; I² = 45%; τ² = 0.02). For the 42 remaining studies that did not report to have actively excluded clinical patients with potential external gain incentives before reporting PVT failure, however, the pooled prevalence of reported PVT failure was 16% (95% CI [13, 19]; Cochran's Q = 560.93, p < 0.001; I² = 90%; τ² = 0.07). Although Cochran's Q statistic indicated that heterogeneity of pooled PVT failure rates in both groups was high, inconsistency was lower in the studies where patients with external gain were excluded from analysis.

PVT

The pooled prevalence of PVT failure was the highest for patients examined with the WMT (25%, 95% CI [19, 32]; Cochran's Q = 253.52, p < 0.001; I² = 93%; τ² = 0.10), followed by the (nv)MSVT (18%, 95% CI [13, 23]; Cochran's Q = 55.04, p < 0.001; I² = 85%; τ² = 0.03), the TOMM (9%, 95% CI [6, 12]; Cochran's Q = 103.35, p < 0.001; I² = 80%; τ² = 0.05), and the hard items of the VSVT (9%, 95% CI [7, 12]; Cochran's Q = 24.97, p < 0.001; I² = 64%; τ² = 0.01). Heterogeneity of pooled PVT failure rates was significant across studies examining the same PVT, whereas the between-study variability was moderately-high for studies using the VSVT and high in studies using other PVTs.

Discussion

This systematic review and meta-analysis examined the prevalence of PVT failure in the context of routine clinical care. Based on extracted data from all 47 studies involving 6,484 patients seen for clinical assessment, the pooled prevalence of PVT failure was 16%, 95% CI [14, 19]. Excluding two studies that likely represented patients for whom standard PVT cutoff application would probably lead to false-positive classification resulted in a pooled PVT failure rate of 15%, 95% CI [13, 18]. This number corresponds with the median estimated base rate of invalid performance in clinical settings reported in a recent survey amongst 178 adult-focused neuropsychologists (Martin & Schroeder, 2020). Our empirical findings confirm PVT failure in a sizeable minority of patients seen for clinical neuropsychological assessment.

Another key finding is that reported PVT failure rates vary significantly amongst the included studies (i.e., 0–52.2%). This variability is likely due to (1) sample characteristics, such as clinical setting, clinical diagnosis, and potential external incentives, and (2) the sensitivity and specificity of the PVT used. Pooled PVT failure was found to be highest (i.e., 27%, 95% CI [15, 40]) in patients seen in private practice. The pooled PVT failure rates for the other settings (i.e., epilepsy clinic, psychiatric institute, medical hospital, and rehabilitation clinic) varied between 13–19%. The Sabelli et al. (2021) study had the largest private practice sample (N = 326), consisting of relatively young mTBI patients referred for neuropsychological evaluation. Since only 2/47 of the included studies were conducted in the private practice setting, the Sabelli et al. (2021) study, with a PVT failure rate of 31.9%, was a major contributor to the higher pooled PVT failure rate in a private practice setting. Of interest, potential external incentives were not mentioned in that study. Therefore, potential external gain incentives may have been present and impacted the relatively high level of PVT failure rather than assessment context per se. Unsurprisingly, but now clearly objectified, studies that excluded patients with potential external gain incentives had a significantly lower pooled PVT failure rate compared to studies where these subjects (potentially) remained in the analysis (i.e., 10%, 95% CI [5, 15] versus 16%, 95% CI [13, 19] respectively). However, although it is known that the presence of external incentive links directly to PVT failure in clinical evaluations (e.g., Schroeder et al., 2021b), a little over a quarter of the included studies failed to mention the presence of external gain incentives. Moreover, even when external gain incentives were known to be present, only a minority of studies excluded these subjects from further analyses.

Pooled PVT failure rates were highest for patients diagnosed with PNES (i.e., 33%, 95% CI [24, 43]), patients seen for ADHD assessment (i.e., 17%, 95% CI [11, 23]), and (m)TBI (i.e., 17%, 95% CI [10, 25]), with pooled PVT failure rates ranging between 6–13% for the other diagnostic groups (i.e., MS, epilepsy, MCI, and Parkinson's disease). These findings contrast with McWhirter et al. (2020), who reported that PVT failure rates in subjects with functional neurological disorders (such as PNES) are no higher than in those with MCI or epilepsy. Likely, our strict inclusion and exclusion criteria, inclusion of only well-validated stand-alone PVTs, and meta-analysis application led to a more precise estimate of PVT failure across diagnostic groups. Our findings also indicate that pooled PVT failure rates for MCI, MS, and Parkinson's disease diagnostic groups are more homogeneous than those of PNES, (m)TBI, and patients seen for ADHD assessment. The higher levels of heterogeneity in these latter groups could indicate that other factors that likely impact PVT failure were present, such as external gain incentives, variation in diagnostic criteria, and bias in patient selection. Finally, pooled failure rates varied across the utilized PVTs in line with their respective sensitivity/specificity ratios in correctly identifying invalid performance. The WMT is known for its relatively high sensitivity (Sollman & Berry, 2011), which likely resulted in the highest pooled failure rate amongst the examined stand-alone PVTs. The lowest pooled failure rate for the TOMM is probably related to its high specificity (Martin et al., 2020).

Our findings indicate that in addition to PVT psychometric properties (i.e., sensitivity and specificity), the clinical setting, the presence of external gain incentives, and the clinical diagnosis impact pooled PVT failure rates. The clinician should therefore consider these factors when interpreting PVT results. Consider, for example, a well-researched stand-alone PVT with a sensitivity of 0.69 and specificity of 0.90, administered to two different clinical patients. The first patient is diagnosed with epilepsy and wants to get approved to return to work (i.e., no external gain incentives for invalid performance). If the mentioned PVT were failed in the context of this patient without external gain incentives (base rate PVT failure of 10%, see Table 2), the likelihood that PVT failure was indeed a true positive (i.e., positive predictive value, PPV) would be 43%. The second patient is also diagnosed with epilepsy but has a pending disability application because the patient does not believe he/she is able to return to work (i.e., potential external gain incentive for invalid performance). If the same PVT were failed in the context of this patient with potential external gain incentive (base rate PVT failure of 16%, see Table 2), PPV would be 57%.
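The two predictive values in this example follow directly from Bayes' theorem. The minimal Python sketch below reproduces them. The function names are illustrative, and the sketch follows the example above in treating the pooled PVT failure rate from Table 2 as the base rate of invalid performance.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              base_rate: float) -> float:
    """P(invalid performance | PVT failed), via Bayes' theorem."""
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


def negative_predictive_value(sensitivity: float, specificity: float,
                              base_rate: float) -> float:
    """P(valid performance | PVT passed), via Bayes' theorem."""
    true_negatives = specificity * (1.0 - base_rate)
    false_negatives = (1.0 - sensitivity) * base_rate
    return true_negatives / (true_negatives + false_negatives)


# PVT characteristics and base rates taken from the worked example in the text.
SENSITIVITY, SPECIFICITY = 0.69, 0.90
scenarios = {
    "no external gain incentive": 0.10,
    "potential external gain incentive": 0.16,
}

for label, base_rate in scenarios.items():
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, base_rate)
    print(f"{label}: base rate = {base_rate:.2f}, PPV = {ppv:.2f}")
# no external gain incentive: base rate = 0.10, PPV = 0.43
# potential external gain incentive: base rate = 0.16, PPV = 0.57
```

The same PVT, with identical sensitivity and specificity, thus yields a noticeably different PPV depending solely on the base rate assumed for the assessment context.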

Of importance, although a PPV increase from 0.43 to 0.57 is substantial, the latter is still not sufficient to determine performance validity. Therefore, in line with general consensus, multiple independent validity tests should be employed (Sherman et al., 2020; Sweet et al., 2021). By chaining the positive likelihood ratios (LRs) of multiple failed PVTs, the diagnostic probability of invalid performance (or PPV) is increased and the diagnostic error is decreased (for an explanation of how to chain likelihood ratios see Larrabee, 2014; Larrabee, 2022). Note that while considerable weight should be placed on the psychometric evaluation of performance validity, the clinician should also include other test and extra-test information (e.g., degree of PVT failure, (in)consistency of the clinical presentation) to draw conclusions about the validity of an individual patient's neuropsychological assessment (Dandachi-FitzGerald & Martin, 2022; Larrabee, 2022; Sherman et al., 2020).
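As a rough illustration of chaining, the Python sketch below converts the base rate to pretest odds, multiplies these by the positive likelihood ratio of each failed PVT, and converts the result back to a probability. It assumes that the PVTs are conditionally independent and, purely for illustration, that a hypothetical second PVT has the same sensitivity (0.69) and specificity (0.90) as the PVT in the example above. Neither assumption comes from the included studies, and the function names are ours.

```python
from typing import Iterable


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def chained_posterior(base_rate: float, likelihood_ratios: Iterable[float]) -> float:
    """Posterior probability after multiplying pretest odds by each LR.

    Only valid if the tests are (approximately) conditionally independent.
    """
    odds = base_rate / (1.0 - base_rate)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


lr_plus = positive_likelihood_ratio(0.69, 0.90)  # about 6.9

# Base rate of 16% (potential external gain incentive, Table 2).
one_failure = chained_posterior(0.16, [lr_plus])
two_failures = chained_posterior(0.16, [lr_plus, lr_plus])  # hypothetical second PVT

print(f"LR+ = {lr_plus:.1f}")
print(f"one failed PVT:  {one_failure:.2f}")   # 0.57, identical to the PPV above
print(f"two failed PVTs: {two_failures:.2f}")  # 0.90
```

Under these assumptions, a second independent failure raises the probability of invalid performance from 0.57 to about 0.90. Correlated PVTs would yield a smaller gain than this sketch suggests.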

Strengths of the present study are its strict inclusion/exclusion criteria, which ensure accurate PVT results. Unfortunately, none of the studies fulfilled all components of the three pre-defined quality criteria (selection bias, attrition bias, and adequate sample size/statistics) for determining prevalence. Although we excluded studies with < 20 subjects, most of the remaining studies still had relatively small sample sizes, increasing the likelihood of sampling bias and heterogeneity. Moreover, only 20/47 studies reported an appropriate recruitment method (e.g., consecutive referrals or a good census) necessary for determining the base rate of PVT failure. Additionally, diagnostic criteria varied across studies, limiting the generalizability of their calculated PVT failure base rates. Also, the way potential external gain incentives were examined and defined varied significantly. Surprisingly, in just over one quarter of included studies, potential external gain incentives were not mentioned at all, and potential external gain incentives may have been present. Finally, although language (proficiency) and cultural factors relate to PVT failure (Robles et al., 2015; Ruiz et al., 2020), these factors were not mentioned in more than half of the included studies in our meta-analysis.

Additional empirical research is necessary to advance knowledge of performance validity test failure in clinical populations. An important first step in future research should be to provide comprehensive details regarding study design, such as recruitment procedure, clinical setting, and demographic/descriptive information (e.g., cultural factors, age, language and language proficiency, level of education). A second improvement would be to form comparable and homogeneous patient samples by specifying diagnostic criteria and providing a detailed specification of how external gain incentives were examined (e.g., querying the patient for potential external gain incentives, such as pending litigation or disability procedures; Schroeder et al., 2021a). Since administration of multiple PVTs is recommended (Sweet et al., 2021), future studies and specifically meta-analyses should consider using advanced statistical techniques (e.g., three-level meta-analyses) in handling non-independent effect sizes (Cheung, 2019).

In conclusion, the current meta-analysis demonstrates that PVT failure occurs in a substantial minority of patients seen for routine clinical care. Type of clinical context, patient characteristics, presence of external gain incentives, and psychometric properties of the utilized PVT are found to impact the rate of PVT failure. Our findings can be used for calculating clinically applied statistics (i.e., PPV/NPV, and LRs) in everyday practice to increase the diagnostic accuracy of performance validity determination. Future studies using detailed recruitment procedures and sample characteristics, such as external gain incentives and language (proficiency), are needed to further improve and refine knowledge about the base rates of PVT failure in clinical assessments.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 13 KB)

Supplementary file2 (DOCX 16 KB)

Supplementary file3 (DOCX 16 KB)

Supplementary file4 (DOCX 20 KB)

Supplementary file5 (DOCX 19 KB)

Author Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Jeroen Roor and Maarten Peters. The first draft of the manuscript was written by Jeroen Roor and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Data Availability

Data are available upon reasonable request from the first author.

Declarations

Ethical Approval

Not Applicable.

Consent to Participate

Not Applicable.

Consent for Publication

Not Applicable.

Conflicts of Interest

The authors declare no potential conflicts of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Armistead-Jehle P, Buican B. Evaluation context and symptom validity test performances in a U.S. military sample. Archives of Clinical Neuropsychology. 2012;27(8):828–839. doi: 10.1093/arclin/acs086. [DOI] [PubMed] [Google Scholar]
Barendregt JJ, Doi SA, Lee YY, Norman RE, Vos T. Meta-analysis of prevalence. Journal of Epidemiology and Community Health. 2013;67(11):974–978. doi: 10.1136/jech-2013-203104. [DOI] [PubMed] [Google Scholar]
Borenstein M, Higgins JP, Hedges LV, Rothstein HR. Basics of meta-analysis: I2 is not an absolute measure of heterogeneity. Research Synthesis Methods. 2017;8(1):5–18. doi: 10.1002/jrsm.1230. [DOI] [PubMed] [Google Scholar]
Cheung MW. A guide to conducting a meta-analysis with non-independent effect sizes. Neuropsychology Review. 2019;29(4):387–396. doi: 10.1007/s11065-019-09415-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
Cragar DE, Berry DT, Fakhoury TA, Cibula JE, Schmitt FA. Performance of patients with epilepsy or psychogenic non-epileptic seizures on four measures of effort. The Clinical Neuropsychologist. 2006;20(3):552–566. doi: 10.1080/13854040590947380. [DOI] [PubMed] [Google Scholar]
Czornik M, Merten T, Lehrner J. Symptom and performance validation in patients with subjective cognitive decline and mild cognitive impairment. Applied Neuropsychology: Adult. 2021;28(3):269–281. doi: 10.1080/23279095.2019.1628761. [DOI] [PubMed] [Google Scholar]
Dandachi-FitzGerald B, Martin PhK. Clinical judgement and clinically applied statistics: Description, benefits, and potential dangers when relying on either one individually in clinical practice. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 107–125. [Google Scholar]
Dandachi-FitzGerald B, Duits AA, Leentjens A, Verhey F, Ponds R. Performance and symptom validity assessment in patients with apathy and cognitive impairment. Journal of the International Neuropsychological Society. 2020;26(3):314–321. doi: 10.1017/S1355617719001139. [DOI] [PubMed] [Google Scholar]
Davis JJ, Millis SR. Examination of performance validity test failure in relation to number of tests administered. The Clinical Neuropsychologist. 2014;28(2):199–214. doi: 10.1080/13854046.2014.884633. [DOI] [PubMed] [Google Scholar]
Deloria, R., Kivisto, A. J., Swier-Vosnos, A., & Elwood, L. (2021). Optimal per test cutoff scores and combinations of failure on multiple embedded performance validity tests in detecting performance invalidity in a mixed clinical sample. Applied Neuropsychology: Adult, 1–11. Advance online publication. 10.1080/23279095.2021.1973005 [DOI] [PubMed]
Dodrill CB. Do patients with psychogenic nonepileptic seizures produce trustworthy findings on neuropsychological tests? Epilepsia. 2008;49(4):691–695. doi: 10.1111/j.1528-1167.2007.01457.x. [DOI] [PubMed] [Google Scholar]
Domen CH, Greher MR, Hosokawa PW, Barnes SL, Hoyt BD, Wodushek TR. Are established embedded performance validity test cut-offs generalizable to patients with multiple sclerosis? Archives of Clinical Neuropsychology. 2020;35(5):511–516. doi: 10.1093/arclin/acaa016. [DOI] [PubMed] [Google Scholar]
Donders J, Strong CA. Embedded effort indicators on the California Verbal Learning Test - Second Edition (CVLT-II): an attempted cross-validation. The Clinical Neuropsychologist. 2011;25(1):173–184. doi: 10.1080/13854046.2010.536781. [DOI] [PubMed] [Google Scholar]
Dorociak KE, Schulze ET, Piper LE, Molokie RE, Janecek JK. Performance validity testing in a clinical sample of adults with sickle cell disease. The Clinical Neuropsychologist. 2018;32(1):81–97. doi: 10.1080/13854046.2017.1339830. [DOI] [PubMed] [Google Scholar]
Drane DL, Williamson DJ, Stroup ES, Holmes MD, Jung M, Koerner E, Chaytor N, Wilensky AJ, Miller JW. Cognitive impairment is not equal in patients with epileptic and psychogenic nonepileptic seizures. Epilepsia. 2006;47(11):1879–1886. doi: 10.1111/j.1528-1167.2006.00611.x. [DOI] [PubMed] [Google Scholar]
Eichstaedt KE, Clifton WE, Vale FL, Benbadis SR, Bozorg AM, Rodgers-Neame NT, Schoenberg MR. Sensitivity of Green's Word Memory Test genuine memory impairment profile to temporal pathology: A study in patients with temporal lobe epilepsy. The Clinical Neuropsychologist. 2014;28(6):941–953. doi: 10.1080/13854046.2014.942374. [DOI] [PubMed] [Google Scholar]
Erdodi LA, Abeare CA, Medoff B, Seke KR, Sagar S, Kirsch NL. A single error is one too many: The forced choice recognition trial of the CVLT-II as a measure of performance validity in adults with TBI. Archives of Clinical Neuropsychology. 2018;33(7):845–860. doi: 10.1093/acn/acx110. [DOI] [PubMed] [Google Scholar]
Furuya-Kanamori L, Barendregt JJ, Doi S. A new improved graphical and quantitative method for detecting bias in meta-analysis. International Journal of Evidence-Based Healthcare. 2018;16(4):195–203. doi: 10.1097/XEB.0000000000000141. [DOI] [PubMed] [Google Scholar]
Galioto R, Dhima K, Berenholz O, Busch R. Performance validity testing in multiple sclerosis. Journal of the International Neuropsychological Society. 2020;26(10):1028–1035. doi: 10.1017/S1355617720000466. [DOI] [PubMed] [Google Scholar]
Gorissen M, Sanz JC, Schmand B. Effort and cognition in schizophrenia patients. Schizophrenia Research. 2005;78(2–3):199–208. doi: 10.1016/j.schres.2005.02.016. [DOI] [PubMed] [Google Scholar]
Green P. Manual for the word memory test. Green’s Publishing; 2003. [Google Scholar]
Green P. Manual for the medical symptom validity test. Green’s Publishing; 2004. [Google Scholar]
Grote CL, Kooker EK, Garron DC, Nyenhuis DL, Smith CA, Mattingly ML. Performance of compensation seeking and non-compensation seeking samples on the Victoria symptom validity test: Cross-validation and extension of a standardization study. Journal of Clinical and Experimental Neuropsychology. 2000;22(6):709–719. doi: 10.1076/jcen.22.6.709.958.
Haber AH, Fichtenberg NL. Replication of the Test of Memory Malingering (TOMM) in a traumatic brain injury and head trauma sample. The Clinical Neuropsychologist. 2006;20(3):524–532. doi: 10.1080/13854040590967595.
Haggerty KA, Frazier TW, Busch RM, Naugle RI. Relationships among Victoria symptom validity test indices and personality assessment inventory validity scales in a large clinical sample. The Clinical Neuropsychologist. 2007;21(6):917–928. doi: 10.1080/13854040600899724.
Harrison AG, Armstrong IT. Differences in performance on the test of variables of attention between credible vs. noncredible individuals being screened for attention deficit hyperactivity disorder. Applied Neuropsychology: Child. 2020;9(4):314–322. doi: 10.1080/21622965.2020.1750115.
Harrison, A. G., Beal, A. L., & Armstrong, I. T. (2021). Predictive value of performance validity testing and symptom validity testing in psychoeducational assessment. Applied Neuropsychology: Adult, 1–15. Advance online publication. doi: 10.1080/23279095.2021.1943396
Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21(11):1539–1558. doi: 10.1002/sim.1186.
Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ (Clinical Research Ed.). 2003;327(7414):557–560. doi: 10.1136/bmj.327.7414.557.
Hoskins LL, Binder LM, Chaytor NS, Williamson DJ, Drane DL. Comparison of oral and computerized versions of the word memory test. Archives of Clinical Neuropsychology. 2010;25(7):591–600. doi: 10.1093/arclin/acq060.
Inman TH, Vickery CD, Berry DT, Lamb DG, Edwards CL, Smith GT. Development and initial validation of a new procedure for evaluating adequacy of effort given during neuropsychological testing: The letter memory test. Psychological Assessment. 1998;10(2):128. doi: 10.1037/1040-3590.10.2.128.
Jennette, K. J., Williams, C. P., Resch, Z. J., Ovsiew, G. P., Durkin, N. M., O'Rourke, J., Marceaux, J. C., Critchfield, E. A., & Soble, J. R. (2021). Assessment of differential neurocognitive performance based on the number of performance validity tests failures: A cross-validation study across multiple mixed clinical samples. The Clinical Neuropsychologist, 1–19. Advance online publication. doi: 10.1080/13854046.2021.1900398
Keary TA, Frazier TW, Belzile CJ, Chapin JS, Naugle RI, Najm IM, Busch RM. Working memory and intelligence are associated with Victoria symptom validity test hard item performance in patients with intractable epilepsy. Journal of the International Neuropsychological Society. 2013;19(3):314–323. doi: 10.1017/S1355617712001397.
Kemp, S. & Kapur, N. (2020). Response to McWhirter et al. https://jnnp.bmj.com/content/91/9/945.responses#response-to-mcwhirter-et-al
Krishnan M, Donders J. Embedded assessment of validity using the continuous visual memory test in patients with traumatic brain injury. Archives of Clinical Neuropsychology. 2011;26(3):176–183. doi: 10.1093/arclin/acr010.
Lange RT, Lippa SM. Sensitivity and specificity should never be interpreted in isolation without consideration of other clinical utility metrics. The Clinical Neuropsychologist. 2017;31(6–7):1015–1028. doi: 10.1080/13854046.2017.1335438.
Larrabee GJ. Aggregating across multiple indicators improves the detection of malingering: Relationship to likelihood-ratios. The Clinical Neuropsychologist. 2014;22(4):666–679. doi: 10.1080/13854040701494987.
Larrabee GJ. Synthesizing data to reach clinical conclusion regarding validity status. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 193–210.
Larrabee, G. J., Boone, K. B., Bianchini, K. J., Rohling, M. L., & Sherman, E. M. (2020). Response to McWhirter et al (2020). https://jnnp.bmj.com/content/91/9/945.responses#response-to-mcwhirter-et-al
Leppma M, Long D, Smith M, Lassiter C. Detecting symptom exaggeration in college students seeking ADHD treatment: Performance validity assessment using the NV-MSVT and IVA-plus. Applied Neuropsychology: Adult. 2018;25(3):210–218. doi: 10.1080/23279095.2016.1277723.
Lippa SM. Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature. The Clinical Neuropsychologist. 2018;32(3):391–421. doi: 10.1080/13854046.2017.1406146.
Locke DE, Smigielski JS, Powell MR, Stevens SR. Effort issues in post-acute outpatient acquired brain injury rehabilitation seekers. NeuroRehabilitation. 2008;23(3):273–281. doi: 10.3233/NRE-2008-23310.
Loring DW, Lee GP, Meador KJ. Victoria symptom validity test performance in non-litigating epilepsy surgery candidates. Journal of Clinical and Experimental Neuropsychology. 2005;27(5):610–617. doi: 10.1080/13803390490918471.
Loring DW, Larrabee GJ, Lee GP, Meador KJ. Victoria symptom validity test performance in a heterogenous clinical sample. The Clinical Neuropsychologist. 2007;21(3):522–531. doi: 10.1080/13854040600611384.
Marshall PS, Hoelzle JB, Heyerdahl D, Nelson NW. The impact of failing to identify suspect effort in patients undergoing adult attention-deficit/hyperactivity disorder (ADHD) assessment. Psychological Assessment. 2016;28(10):1290–1302. doi: 10.1037/pas0000247.
Martin PK, Schroeder RW. Base rates of invalid test performance across clinical non-forensic contexts and settings. Archives of Clinical Neuropsychology. 2020;35(6):717–725. doi: 10.1093/arclin/acaa017.
Martin PK, Schroeder RW, Olsen DH, Maloy H, Boettcher A, Ernst N, Okut H. A systematic review and meta-analysis of the Test of Memory Malingering in adults: Two decades of deception detection. The Clinical Neuropsychologist. 2020;34(1):88–119. doi: 10.1080/13854046.2019.1637027.
Martins M, Martins IP. Memory malingering: Evaluating WMT criteria. Applied Neuropsychology. 2010;17(3):177–182. doi: 10.1080/09084281003715709.
McWhirter L, Ritchie CW, Stone J, Carson A. Performance validity test failure in clinical populations-a systematic review. Journal of Neurology, Neurosurgery, and Psychiatry. 2020;91(9):945–952. doi: 10.1136/jnnp-2020-323776.
Merten T, Dandachi-FitzGerald B. Symptom and performance validity assessment: European trends in research and practice. Psychological Injury and Law. 2022;15:113–115. doi: 10.1007/s12207-022-09454-0.
Merten T, Bossink L, Schmand B. On the limits of effort testing: symptom validity tests and severity of neurocognitive symptoms in nonlitigant patients. Journal of Clinical and Experimental Neuropsychology. 2007;29(3):308–318. doi: 10.1080/13803390600693607.
Meyers JE, Miller RM, Thompson LM, Scalese AM, Allred BC, Rupp ZW, Dupaix ZP, Junghyun Lee A. Using likelihood ratios to detect invalid performance with performance validity measures. Archives of Clinical Neuropsychology. 2014;29(3):224–235. doi: 10.1093/arclin/acu001.
Miele AS, Gunner JH, Lynch JK, McCaffrey RJ. Are embedded validity indices equivalent to free-standing symptom validity tests? Archives of Clinical Neuropsychology. 2012;27(1):10–22. doi: 10.1093/arclin/acr084.
Migliavaca CB, Stein C, Colpani V, Munn Z, Falavigna M, Prevalence Estimates Reviews – Systematic Review Methodology Group (PERSyst). Quality assessment of prevalence studies: a systematic review. Journal of Clinical Epidemiology. 2020;127:59–68. doi: 10.1016/j.jclinepi.2020.06.039.
Mittenberg W, Patton C, Canyock EM, Condit DC. Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology. 2002;24(8):1094–1102. doi: 10.1076/jcen.24.8.1094.8379.
Moore BA, Donders J. Predictors of invalid neuropsychological test performance after traumatic brain injury. Brain Injury. 2004;18(10):975–984. doi: 10.1080/02699050410001672350.
Munn Z, Moola S, Lisy K, Riitano D, Tufanaru C. Methodological guidance for systematic reviews of observational epidemiological studies reporting prevalence and cumulative incidence data. International Journal of Evidence-Based Healthcare. 2015;13(3):147–153. doi: 10.1097/XEB.0000000000000054.
Neale AC, Ovsiew GP, Resch ZJ, Soble JR. Feigning or forgetfulness: The effect of memory impairment severity on word choice test performance. The Clinical Neuropsychologist. 2022;36(3):584–599. doi: 10.1080/13854046.2020.1799076.
Ouzzani, M., Hammady, H., Fedorowicz, Z., & Elmagarmid, A. (2016). Rayyan—a web and mobile app for systematic reviews. Systematic Reviews, 5, 210. doi: 10.1186/s13643-016-0384-4
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., et al. (2021). The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. The BMJ, 372, n71. doi: 10.1136/bmj.n71
Rees LM, Tombaugh TN, Boulay L. Depression and the Test of Memory Malingering. Archives of Clinical Neuropsychology. 2001;16(5):501–506. doi: 10.1093/arclin/16.5.501.
Resch ZJ, Soble JR, Ovsiew GP, Castillo LR, Saladino KF, DeDios-Stern S, Schulze ET, Song W, Pliskin NH. Working memory, processing speed, and memory functioning are minimally predictive of Victoria symptom validity test performance. Assessment. 2021;28(6):1614–1623. doi: 10.1177/1073191120911102.
Rhoads T, Resch ZJ, Ovsiew GP, White DJ, Abramson DA, Soble JR. Every second counts: A comparison of four dot counting test scoring procedures for detecting invalid neuropsychological test performance. Psychological Assessment. 2021;33(2):133–141. doi: 10.1037/pas0000970.
Rhoads T, Leib SI, Resch ZJ, Basurto KS, Castillo LR, Jennette KJ, Soble JR. Relative rates of invalidity for the test of memory malingering and the dot counting test among Spanish-speaking patients residing in the USA. Psychological Injury and Law. 2021;14(4):269–280. doi: 10.1007/s12207-021-09423-z.
Richards PM, Geiger JA, Tussey CM. The dirty dozen: 12 Sources of bias in forensic neuropsychology with ways to mitigate. Psychological Injury and Law. 2015;8:265–280. doi: 10.1007/s12207-015-9235-1.
Robles L, López E, Salazar X, Boone KB, Glaser DF. Specificity data for the b test, dot counting test, Rey-15 item plus recognition, and Rey word recognition test in monolingual Spanish-speakers. Journal of Clinical and Experimental Neuropsychology. 2015;37(6):614–621. doi: 10.1080/13803395.2015.1039961.
Roor JJ, Dandachi-FitzGerald B, Ponds RW. A case of misdiagnosis of mild cognitive impairment: The utility of symptom validity testing in an outpatient memory clinic. Applied Neuropsychology: Adult. 2016;23(3):172–178. doi: 10.1080/23279095.2015.1030018.
Roor JJ, Dandachi-FitzGerald B, Peters M, Knoop H, Ponds R. Performance validity and outcome of cognitive behavior therapy in patients with chronic fatigue syndrome. Journal of the International Neuropsychological Society. 2022;28(5):473–482. doi: 10.1017/S1355617721000643.
Ruiz I, Raugh IM, Bartolomeo LA, Strauss GP. A Meta-Analysis of neuropsychological effort test performance in psychotic disorders. Neuropsychology Review. 2020;30(3):407–424. doi: 10.1007/s11065-020-09448-2.
Sabelli AG, Messa I, Giromini L, Lichtenstein JD, May N, Erdodi LA. Symptom versus performance validity in patients with mild TBI: Independent sources of non-credible responding. Psychological Injury and Law. 2021;14(1):17–36. doi: 10.1007/s12207-021-09400-6.
Schroeder RW, Martin PK. Explanations of performance validity test failure in clinical settings. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 11–30.
Schroeder RW, Martin PK, Heinrichs RJ, Baade LE. Research methods in performance validity testing studies: Criterion grouping approach impacts study outcomes. The Clinical Neuropsychologist. 2019;33(3):466–477. doi: 10.1080/13854046.2018.1484517.
Schroeder RW, Boone KB, Larrabee GJ. Design methods in neuropsychological performance validity, symptom validity, and malingering research. In: Boone KB, editor. Assessment of feigned cognitive impairment. 2nd ed. The Guilford Press; 2021. pp. 11–33.
Schroeder, R. W., Clark, H. A., & Martin, P. K. (2021b). Base rates of invalidity when patients undergoing routine clinical evaluations have social security disability as an external incentive. The Clinical Neuropsychologist, 1–13. Advance online publication. doi: 10.1080/13854046.2021.1895322
Sharland MJ, Waring SC, Johnson BP, Taran AM, Rusin TA, Pattock AM, Palcher JA. Further examination of embedded performance validity indicators for the Conners' continuous performance test and brief test of attention in a large outpatient clinical sample. The Clinical Neuropsychologist. 2018;32(1):98–108. doi: 10.1080/13854046.2017.1332240.
Sherman E, Slick DJ, Iverson GL. Multidimensional malingering criteria for neuropsychological assessment: A 20-year update of the malingered neuropsychological dysfunction criteria. Archives of Clinical Neuropsychology. 2020;35(6):735–764. doi: 10.1093/arclin/acaa019.
Sieck BC, Smith MM, Duff K, Paulsen JS, Beglinger LJ. Symptom validity test performance in the Huntington disease clinic. Archives of Clinical Neuropsychology. 2013;28(2):135–143. doi: 10.1093/arclin/acs109.
Silverberg ND, Iverson GL, Panenka W. Cogniphobia in mild traumatic brain injury. Journal of Neurotrauma. 2017;34(13):2141–2146. doi: 10.1089/neu.2016.4719.
Slick DJ, Hopp G, Strauss E, Thompson GB. Victoria symptom validity test: Professional manual. Psychological Assessment Resources; 1997.
Soble JR, Webber TA, Bailey KC. An overview of common performance validity tests for practicing clinicians. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 126–149.
Sollman MJ, Berry DT. Detection of inadequate effort on neuropsychological testing: A meta-analytic update and extension. Archives of Clinical Neuropsychology. 2011;26(8):774–789. doi: 10.1093/arclin/acr066.
Sweet JJ, Heilbronner RL, Morgan JE, Larrabee GJ, Rohling ML, Boone KB, Kirkwood MW, Schroeder RW, Suhr JA, Conference Participants. American Academy of Clinical Neuropsychology (AACN) 2021 consensus statement on validity assessment: Update of the 2009 AACN consensus conference statement on neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist. 2021;35(6):1053–1106. doi: 10.1080/13854046.2021.1896036.
Teichner G, Wagner MT. The test of memory malingering (TOMM): Normative data from cognitively intact, cognitively impaired, and elderly patients with dementia. Archives of Clinical Neuropsychology. 2004;19(3):455–464. doi: 10.1016/S0887-6177(03)00078-7.
Tiemens B, Wagenvoorde R, Witteman C. Why every clinician should know Bayes’ rule. Health Professions Education. 2020;6(3):320–324. doi: 10.1016/j.hpe.2020.05.002.
Tombaugh TN. Test of memory malingering. MultiHealth Systems; 1996.
van der Heide D, Boskovic I, van Harten P, Merckelbach H. Overlooking feigning behavior may result in potential harmful treatment interventions: Two case reports of undetected malingering. Journal of Forensic Sciences. 2020;65(4):1371–1375. doi: 10.1111/1556-4029.14320.
Vilar-López R, Daugherty JC, Pérez-García M, Piñón-Blanco A. A pilot study on the adequacy of the TOMM in detecting invalid performance in patients with substance use disorders. Journal of Clinical and Experimental Neuropsychology. 2021;43(3):255–263. doi: 10.1080/13803395.2021.1912298.
von Hippel PT. The heterogeneity statistic I² can be biased in small meta-analyses. BMC Medical Research Methodology. 2015;15:35. doi: 10.1186/s12874-015-0024-z.
Walter J, Morris J, Swier-Vosnos A, Pliskin N. Effects of severity of dementia on a symptom validity measure. The Clinical Neuropsychologist. 2014;28(7):1197–1208. doi: 10.1080/13854046.2014.960454.
Wilk MB, Gnanadesikan R. Probability plotting methods for the analysis of data. Biometrika. 1968;55(1):1–17.
Williamson DJ, Holsman M, Chaytor N, Miller JW, Drane DL. Abuse, not financial incentive, predicts non-credible cognitive performance in patients with psychogenic non-epileptic seizures. The Clinical Neuropsychologist. 2012;26(4):588–598. doi: 10.1080/13854046.2012.670266.
Wodushek TR, Domen CH. Comparing two models of performance validity assessment in patients with Parkinson's disease who are candidates for deep brain stimulation surgery. Applied Neuropsychology: Adult. 2020;27(1):9–21. doi: 10.1080/23279095.2018.1473251.

Associated Data

Supplementary Materials

Supplementary file1 (DOCX 13 KB)

Supplementary file2 (DOCX 16 KB)

Supplementary file3 (DOCX 16 KB)

Supplementary file4 (DOCX 20 KB)

Supplementary file5 (DOCX 19 KB)

Data Availability Statement

Data are available upon reasonable request from the first author.

[CR1] Armistead-Jehle P, Buican B. Evaluation context and symptom validity test performances in a U.S. military sample. Archives of Clinical Neuropsychology. 2012;27(8):828–839. doi: 10.1093/arclin/acs086. [DOI] [PubMed] [Google Scholar]

[CR2] Barendregt JJ, Doi SA, Lee YY, Norman RE, Vos T. Meta-analysis of prevalence. Journal of Epidemiology and Community Health. 2013;67(11):974–978. doi: 10.1136/jech-2013-203104. [DOI] [PubMed] [Google Scholar]

[CR3] Borenstein M, Higgins JP, Hedges LV, Rothstein HR. Basics of meta-analysis: I2 is not an absolute measure of heterogeneity. Research Synthesis Methods. 2017;8(1):5–18. doi: 10.1002/jrsm.1230. [DOI] [PubMed] [Google Scholar]

[CR4] Cheung MW. A guide to conducting a meta-analysis with non-independent effect sizes. Neuropsychology Review. 2019;29(4):387–396. doi: 10.1007/s11065-019-09415-6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR5] Cragar DE, Berry DT, Fakhoury TA, Cibula JE, Schmitt FA. Performance of patients with epilepsy or psychogenic non-epileptic seizures on four measures of effort. The Clinical Neuropsychologist. 2006;20(3):552–566. doi: 10.1080/13854040590947380. [DOI] [PubMed] [Google Scholar]

[CR6] Czornik M, Merten T, Lehrner J. Symptom and performance validation in patients with subjective cognitive decline and mild cognitive impairment. Applied Neuropsychology: Adult. 2021;28(3):269–281. doi: 10.1080/23279095.2019.1628761. [DOI] [PubMed] [Google Scholar]

[CR7] Dandachi-FitzGerald B, Martin PhK. Clinical judgement and clinically applied statistics: Description, benefits, and potential dangers when relying on either one individually in clinical practice. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 107–125. [Google Scholar]

[CR8] Dandachi-FitzGerald B, Duits AA, Leentjens A, Verhey F, Ponds R. Performance and symptom validity assessment in patients with apathy and cognitive impairment. Journal of the International Neuropsychological Society. 2020;26(3):314–321. doi: 10.1017/S1355617719001139. [DOI] [PubMed] [Google Scholar]

[CR9] Davis JJ, Millis SR. Examination of performance validity test failure in relation to number of tests administered. The Clinical Neuropsychologist. 2014;28(2):199–214. doi: 10.1080/13854046.2014.884633. [DOI] [PubMed] [Google Scholar]

[CR10] Deloria, R., Kivisto, A. J., Swier-Vosnos, A., & Elwood, L. (2021). Optimal per test cutoff scores and combinations of failure on multiple embedded performance validity tests in detecting performance invalidity in a mixed clinical sample. Applied Neuropsychology: Adult, 1–11. Advance online publication. 10.1080/23279095.2021.1973005 [DOI] [PubMed]

[CR11] Dodrill CB. Do patients with psychogenic nonepileptic seizures produce trustworthy findings on neuropsychological tests? Epilepsia. 2008;49(4):691–695. doi: 10.1111/j.1528-1167.2007.01457.x. [DOI] [PubMed] [Google Scholar]

[CR12] Domen CH, Greher MR, Hosokawa PW, Barnes SL, Hoyt BD, Wodushek TR. Are established embedded performance validity test cut-offs generalizable to patients with multiple sclerosis? Archives of Clinical Neuropsychology. 2020;35(5):511–516. doi: 10.1093/arclin/acaa016. [DOI] [PubMed] [Google Scholar]

[CR13] Donders J, Strong CA. Embedded effort indicators on the California Verbal Learning Test - Second Edition (CVLT-II): an attempted cross-validation. The Clinical Neuropsychologist. 2011;25(1):173–184. doi: 10.1080/13854046.2010.536781. [DOI] [PubMed] [Google Scholar]

[CR14] Dorociak KE, Schulze ET, Piper LE, Molokie RE, Janecek JK. Performance validity testing in a clinical sample of adults with sickle cell disease. The Clinical Neuropsychologist. 2018;32(1):81–97. doi: 10.1080/13854046.2017.1339830. [DOI] [PubMed] [Google Scholar]

[CR15] Drane DL, Williamson DJ, Stroup ES, Holmes MD, Jung M, Koerner E, Chaytor N, Wilensky AJ, Miller JW. Cognitive impairment is not equal in patients with epileptic and psychogenic nonepileptic seizures. Epilepsia. 2006;47(11):1879–1886. doi: 10.1111/j.1528-1167.2006.00611.x. [DOI] [PubMed] [Google Scholar]

[CR16] Eichstaedt KE, Clifton WE, Vale FL, Benbadis SR, Bozorg AM, Rodgers-Neame NT, Schoenberg MR. Sensitivity of Green's Word Memory Test genuine memory impairment profile to temporal pathology: A study in patients with temporal lobe epilepsy. The Clinical Neuropsychologist. 2014;28(6):941–953. doi: 10.1080/13854046.2014.942374. [DOI] [PubMed] [Google Scholar]

[CR17] Erdodi LA, Abeare CA, Medoff B, Seke KR, Sagar S, Kirsch NL. A single error is one too many: The forced choice recognition trial of the CVLT-II as a measure of performance validity in adults with TBI. Archives of Clinical Neuropsychology. 2018;33(7):845–860. doi: 10.1093/acn/acx110. [DOI] [PubMed] [Google Scholar]

[CR18] Furuya-Kanamori L, Barendregt JJ, Doi S. A new improved graphical and quantitative method for detecting bias in meta-analysis. International Journal of Evidence-Based Healthcare. 2018;16(4):195–203. doi: 10.1097/XEB.0000000000000141. [DOI] [PubMed] [Google Scholar]

[CR19] Galioto R, Dhima K, Berenholz O, Busch R. Performance validity testing in multiple sclerosis. Journal of the International Neuropsychological Society. 2020;26(10):1028–1035. doi: 10.1017/S1355617720000466. [DOI] [PubMed] [Google Scholar]

[CR20] Gorissen M, Sanz JC, Schmand B. Effort and cognition in schizophrenia patients. Schizophrenia Research. 2005;78(2–3):199–208. doi: 10.1016/j.schres.2005.02.016. [DOI] [PubMed] [Google Scholar]

[CR21] Green P. Manual for the word memory test. Green’s Publishing; 2003. [Google Scholar]

[CR22] Green P. Manual for the medical symptom validity test. Green’s Publishing; 2004. [Google Scholar]

[CR23] Grote CL, Kooker EK, Garron DC, Nyenhuis DL, Smith CA, Mattingly ML. Performance of compensation seeking and non-compensation seeking samples on the Victoria symptom validity test: Cross-validation and extension of a standardization study. Journal of Clinical and Experimental Neuropsychology. 2000;22(6):709–719. doi: 10.1076/jcen.22.6.709.958. [DOI] [PubMed] [Google Scholar]

[CR24] Haber AH, Fichtenberg NL. Replication of the Test of Memory Malingering (TOMM) in a traumatic brain injury and head trauma sample. The Clinical Neuropsychologist. 2006;20(3):524–532. doi: 10.1080/13854040590967595. [DOI] [PubMed] [Google Scholar]

[CR25] Haggerty KA, Frazier TW, Busch RM, Naugle RI. Relationships among Victoria symptom validity test indices and personality assessment inventory validity scales in a large clinical sample. The Clinical Neuropsychologist. 2007;21(6):917–928. doi: 10.1080/13854040600899724. [DOI] [PubMed] [Google Scholar]

[CR26] Harrison AG, Armstrong IT. Differences in performance on the test of variables of attention between credible vs. noncredible individuals being screened for attention deficit hyperactivity disorder. Applied Neuropsychology: Child. 2020;9(4):314–322. doi: 10.1080/21622965.2020.1750115. [DOI] [PubMed] [Google Scholar]

[CR27] Harrison, A. G., Beal, A. L., & Armstrong, I. T. (2021). Predictive value of performance validity testing and symptom validity testing in psychoeducational assessment. Applied Neuropsychology: Adult, 1–15. Advance online publication. 10.1080/23279095.2021.1943396 [DOI] [PubMed]

[CR28] Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21(11):1539–1558. doi: 10.1002/sim.1186. [DOI] [PubMed] [Google Scholar]

[CR29] Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ (Clinical Research Ed.) 2003;327(7414):557–560. doi: 10.1136/bmj.327.7414.557. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR30] Hoskins LL, Binder LM, Chaytor NS, Williamson DJ, Drane DL. Comparison of oral and computerized versions of the word memory test. Archives of Clinical Neuropsychology. 2010;25(7):591–600. doi: 10.1093/arclin/acq060. [DOI] [PubMed] [Google Scholar]

[CR31] Inman TH, Vickery CD, Berry DT, Lamb DG, Edwards CL, Smith GT. Development and initial validation of a new procedure for evaluating adequacy of effort given during neuropsychological testing: The letter memory test. Psychological Assessment. 1998;10(2):128. doi: 10.1037/1040-3590.10.2.128. [DOI] [Google Scholar]

[CR32] Jennette, K. J., Williams, C. P., Resch, Z. J., Ovsiew, G. P., Durkin, N. M., O'Rourke, J., Marceaux, J. C., Critchfield, E. A., & Soble, J. R. (2021). Assessment of differential neurocognitive performance based on the number of performance validity tests failures: A cross-validation study across multiple mixed clinical samples. The Clinical Neuropsychologist, 1–19. Advance online publication. 10.1080/13854046.2021.1900398 [DOI] [PubMed]

[CR33] Keary TA, Frazier TW, Belzile CJ, Chapin JS, Naugle RI, Najm IM, Busch RM. Working memory and intelligence are associated with victoria symptom validity test hard item performance in patients with intractable epilepsy. Journal of the International Neuropsychological Society. 2013;19(3):314–323. doi: 10.1017/S1355617712001397. [DOI] [PubMed] [Google Scholar]

[CR34] Kemp, S. & Kapur, N. (2020). Response to McWhirter et al. https://jnnp.bmj.com/content/91/9/945.responses#response-to-mcwhirter-et-al

[CR35] Krishnan M, Donders J. Embedded assessment of validity using the continuous visual memory test in patients with traumatic brain injury. Archives of Clinical Neuropsychology. 2011;26(3):176–183. doi: 10.1093/arclin/acr010. [DOI] [PubMed] [Google Scholar]

[CR36] Lange RT, Lippa SM. Sensitivity and specificity should never be interpreted in isolation without consideration of other clinical utility metrics. The Clinical Neuropsychologist. 2017;31(6–7):1015–1028. doi: 10.1080/13854046.2017.1335438. [DOI] [PubMed] [Google Scholar]

[CR37] Larrabee GJ. Aggregating across multiple indicators improves the detection of malingering: Relationship to likelihood-ratios. The Clinical Neuropsychologist. 2014;22(4):666–679. doi: 10.1080/13854040701494987. [DOI] [PubMed] [Google Scholar]

[CR38] Larrabee GJ. Synthesizing data to reach clinical conclusion regarding validity status. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice; Evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 193–210. [Google Scholar]

[CR39] Larrabee, G. J., Boone, K. B., Bianchini, K. J., Rohling, M. L., & Sherman, E. M. (2020). Response to McWhirter et al (2020). https://jnnp.bmj.com/content/91/9/945.responses#response-to-mcwhirter-et-al

[CR40] Leppma M, Long D, Smith M, Lassiter C. Detecting symptom exaggeration in college students seeking ADHD treatment: Performance validity assessment using the NV-MSVT and IVA-plus. Applied Neuropsychology: Adult. 2018;25(3):210–218. doi: 10.1080/23279095.2016.1277723. [DOI] [PubMed] [Google Scholar]

[CR41] Lippa SM. Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature. The Clinical Neuropsychologist. 2018;32(3):391–421. doi: 10.1080/13854046.2017.1406146. [DOI] [PubMed] [Google Scholar]

[CR42] Locke DE, Smigielski JS, Powell MR, Stevens SR. Effort issues in post-acute outpatient acquired brain injury rehabilitation seekers. NeuroRehabilitation. 2008;23(3):273–281. doi: 10.3233/NRE-2008-23310. [DOI] [PubMed] [Google Scholar]

[CR43] Loring DW, Lee GP, Meador KJ. Victoria symptom validity test performance in non-litigating epilepsy surgery candidates. Journal of Clinical and Experimental Neuropsychology. 2005;27(5):610–617. doi: 10.1080/13803390490918471. [DOI] [PubMed] [Google Scholar]

[CR44] Loring DW, Larrabee GJ, Lee GP, Meador KJ. Victoria symptom validity test performance in a heterogenous clinical sample. The Clinical Neuropsychologist. 2007;21(3):522–531. doi: 10.1080/13854040600611384. [DOI] [PubMed] [Google Scholar]

[CR45] Marshall PS, Hoelzle JB, Heyerdahl D, Nelson NW. The impact of failing to identify suspect effort in patients undergoing adult attention-deficit/hyperactivity disorder (ADHD) assessment. Psychological Assessment. 2016;28(10):1290–1302. doi: 10.1037/pas0000247. [DOI] [PubMed] [Google Scholar]

[CR46] Martin PK, Schroeder RW. Base rates of invalid test performance across clinical non-forensic contexts and settings. Archives of Clinical Neuropsychology. 2020;35(6):717–725. doi: 10.1093/arclin/acaa017. [DOI] [PubMed] [Google Scholar]

[CR47] Martin PK, Schroeder RW, Olsen DH, Maloy H, Boettcher A, Ernst N, Okut H. A systematic review and meta-analysis of the Test of Memory Malingering in adults: Two decades of deception detection. The Clinical Neuropsychologist. 2020;34(1):88–119. doi: 10.1080/13854046.2019.1637027. [DOI] [PubMed] [Google Scholar]

[CR48] Martins M, Martins IP. Memory malingering: Evaluating WMT criteria. Applied Neuropsychology. 2010;17(3):177–182. doi: 10.1080/09084281003715709.

[CR49] McWhirter L, Ritchie CW, Stone J, Carson A. Performance validity test failure in clinical populations-a systematic review. Journal of Neurology, Neurosurgery, and Psychiatry. 2020;91(9):945–952. doi: 10.1136/jnnp-2020-323776.

[CR50] Merten T, Dandachi-FitzGerald B. Symptom and performance validity assessment: European trends in research and practice. Psychological Injury and Law. 2022;15:113–115. doi: 10.1007/s12207-022-09454-0.

[CR51] Merten T, Bossink L, Schmand B. On the limits of effort testing: symptom validity tests and severity of neurocognitive symptoms in nonlitigant patients. Journal of Clinical and Experimental Neuropsychology. 2007;29(3):308–318. doi: 10.1080/13803390600693607.

[CR52] Meyers JE, Miller RM, Thompson LM, Scalese AM, Allred BC, Rupp ZW, Dupaix ZP, Junghyun Lee A. Using likelihood ratios to detect invalid performance with performance validity measures. Archives of Clinical Neuropsychology. 2014;29(3):224–235. doi: 10.1093/arclin/acu001.

[CR53] Miele AS, Gunner JH, Lynch JK, McCaffrey RJ. Are embedded validity indices equivalent to free-standing symptom validity tests? Archives of Clinical Neuropsychology. 2012;27(1):10–22. doi: 10.1093/arclin/acr084.

[CR54] Migliavaca CB, Stein C, Colpani V, Munn Z, Falavigna M, Prevalence Estimates Reviews – Systematic Review Methodology Group (PERSyst) Quality assessment of prevalence studies: a systematic review. Journal of Clinical Epidemiology. 2020;127:59–68. doi: 10.1016/j.jclinepi.2020.06.039.

[CR55] Mittenberg W, Patton C, Canyock EM, Condit DC. Base rates of malingering and symptom exaggeration. Journal of Clinical and Experimental Neuropsychology. 2002;24(8):1094–1102. doi: 10.1076/jcen.24.8.1094.8379.

[CR56] Moore BA, Donders J. Predictors of invalid neuropsychological test performance after traumatic brain injury. Brain Injury. 2004;18(10):975–984. doi: 10.1080/02699050410001672350.

[CR57] Munn Z, Moola S, Lisy K, Riitano D, Tufanaru C. Methodological guidance for systematic reviews of observational epidemiological studies reporting prevalence and cumulative incidence data. International Journal of Evidence-Based Healthcare. 2015;13(3):147–153. doi: 10.1097/XEB.0000000000000054.

[CR58] Neale AC, Ovsiew GP, Resch ZJ, Soble JR. Feigning or forgetfulness: The effect of memory impairment severity on word choice test performance. The Clinical Neuropsychologist. 2022;36(3):584–599. doi: 10.1080/13854046.2020.1799076.

[CR59] Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan—a web and mobile app for systematic reviews. Systematic Reviews. 2016;5:210. doi: 10.1186/s13643-016-0384-4.

[CR60] Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. The BMJ. 2021;372:n71. doi: 10.1136/bmj.n71.

[CR61] Rees LM, Tombaugh TN, Boulay L. Depression and the Test of Memory Malingering. Archives of Clinical Neuropsychology. 2001;16(5):501–506. doi: 10.1093/arclin/16.5.501.

[CR62] Resch ZJ, Soble JR, Ovsiew GP, Castillo LR, Saladino KF, DeDios-Stern S, Schulze ET, Song W, Pliskin NH. Working memory, processing speed, and memory functioning are minimally predictive of victoria symptom validity test performance. Assessment. 2021;28(6):1614–1623. doi: 10.1177/1073191120911102.

[CR63] Rhoads T, Resch ZJ, Ovsiew GP, White DJ, Abramson DA, Soble JR. Every second counts: A comparison of four dot counting test scoring procedures for detecting invalid neuropsychological test performance. Psychological Assessment. 2021;33(2):133–141. doi: 10.1037/pas0000970.

[CR64] Rhoads T, Leib SI, Resch ZJ, Basurto KS, Castillo LR, Jennette KJ, Soble JR. Relative rates of invalidity for the test of memory malingering and the dot counting test among Spanish-speaking patients residing in the USA. Psychological Injury and Law. 2021;14(4):269–280. doi: 10.1007/s12207-021-09423-z.

[CR65] Richards PM, Geiger JA, Tussey CM. The dirty dozen: 12 Sources of bias in forensic neuropsychology with ways to mitigate. Psychological Injury and Law. 2015;8:265–280. doi: 10.1007/s12207-015-9235-1.

[CR66] Robles L, López E, Salazar X, Boone KB, Glaser DF. Specificity data for the b test, dot counting test, Rey-15 item plus recognition, and Rey word recognition test in monolingual Spanish-speakers. Journal of Clinical and Experimental Neuropsychology. 2015;37(6):614–621. doi: 10.1080/13803395.2015.1039961.

[CR67] Roor JJ, Dandachi-FitzGerald B, Ponds RW. A case of misdiagnosis of mild cognitive impairment: The utility of symptom validity testing in an outpatient memory clinic. Applied Neuropsychology: Adult. 2016;23(3):172–178. doi: 10.1080/23279095.2015.1030018.

[CR68] Roor JJ, Dandachi-FitzGerald B, Peters M, Knoop H, Ponds R. Performance validity and outcome of cognitive behavior therapy in patients with chronic fatigue syndrome. Journal of the International Neuropsychological Society. 2022;28(5):473–482. doi: 10.1017/S1355617721000643.

[CR69] Ruiz I, Raugh IM, Bartolomeo LA, Strauss GP. A Meta-Analysis of neuropsychological effort test performance in psychotic disorders. Neuropsychology Review. 2020;30(3):407–424. doi: 10.1007/s11065-020-09448-2.

[CR70] Sabelli AG, Messa I, Giromini L, Lichtenstein JD, May N, Erdodi LA. Symptom versus performance validity in patients with mild TBI: Independent sources of non-credible responding. Psychological Injury and Law. 2021;14(1):17–36. doi: 10.1007/s12207-021-09400-6.

[CR71] Schroeder RW, Martin PK. Explanations of performance validity test failure in clinical settings. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice: Evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 11–30.

[CR72] Schroeder RW, Martin PK, Heinrichs RJ, Baade LE. Research methods in performance validity testing studies: Criterion grouping approach impacts study outcomes. The Clinical Neuropsychologist. 2019;33(3):466–477. doi: 10.1080/13854046.2018.1484517.

[CR73] Schroeder RW, Boone KB, Larrabee GJ. Design methods in neuropsychological performance validity, symptom validity, and malingering research. In: Boone KB, editor. Assessment of feigned cognitive impairment. 2nd ed. The Guilford Press; 2021. pp. 11–33.

[CR74] Schroeder RW, Clark HA, Martin PK. Base rates of invalidity when patients undergoing routine clinical evaluations have social security disability as an external incentive. The Clinical Neuropsychologist. 2021b;1–13. Advance online publication. doi: 10.1080/13854046.2021.1895322.

[CR75] Sharland MJ, Waring SC, Johnson BP, Taran AM, Rusin TA, Pattock AM, Palcher JA. Further examination of embedded performance validity indicators for the conners' continuous performance test and brief test of attention in a large outpatient clinical sample. The Clinical Neuropsychologist. 2018;32(1):98–108. doi: 10.1080/13854046.2017.1332240.

[CR76] Sherman E, Slick DJ, Iverson GL. Multidimensional malingering criteria for neuropsychological assessment: A 20-year update of the malingered neuropsychological dysfunction criteria. Archives of Clinical Neuropsychology. 2020;35(6):735–764. doi: 10.1093/arclin/acaa019.

[CR77] Sieck BC, Smith MM, Duff K, Paulsen JS, Beglinger LJ. Symptom validity test performance in the huntington disease clinic. Archives of Clinical Neuropsychology. 2013;28(2):135–143. doi: 10.1093/arclin/acs109.

[CR78] Silverberg ND, Iverson GL, Panenka W. Cogniphobia in mild traumatic brain injury. Journal of Neurotrauma. 2017;34(13):2141–2146. doi: 10.1089/neu.2016.4719.

[CR79] Slick DJ, Hopp G, Strauss E, Thompson GB. Victoria symptom validity test: Professional manual. Psychological Assessment Resources; 1997.

[CR80] Soble JR, Webber TA, Bailey KC. An overview of common performance validity tests for practicing clinicians. In: Schroeder RW, Martin PK, editors. Validity assessment in clinical neuropsychological practice: Evaluating and managing noncredible performance. The Guilford Press; 2022. pp. 126–149.

[CR81] Sollman MJ, Berry DT. Detection of inadequate effort on neuropsychological testing: A meta-analytic update and extension. Archives of Clinical Neuropsychology. 2011;26(8):774–789. doi: 10.1093/arclin/acr066.

[CR82] Sweet JJ, Heilbronner RL, Morgan JE, Larrabee GJ, Rohling ML, Boone KB, Kirkwood MW, Schroeder RW, Suhr JA, Participants Conference. American Academy of Clinical Neuropsychology (AACN) 2021 consensus statement on validity assessment: Update of the 2009 AACN consensus conference statement on neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist. 2021;35(6):1053–1106. doi: 10.1080/13854046.2021.1896036.

[CR83] Teichner G, Wagner MT. The test of memory malingering (TOMM): Normative data from cognitively intact, cognitively impaired, and elderly patients with dementia. Archives of Clinical Neuropsychology. 2004;19(3):455–464. doi: 10.1016/S0887-6177(03)00078-7.

[CR84] Tiemens B, Wagenvoorde R, Witteman C. Why every clinician should know Bayes’ rule. Health Professions Education. 2020;6(3):320–324. doi: 10.1016/j.hpe.2020.05.002.

[CR85] Tombaugh TN. Test of memory malingering. MultiHealth Systems; 1996.

[CR86] van der Heide D, Boskovic I, van Harten P, Merckelbach H. Overlooking feigning behavior may result in potential harmful treatment interventions: Two case reports of undetected malingering. Journal of Forensic Sciences. 2020;65(4):1371–1375. doi: 10.1111/1556-4029.14320.

[CR87] Vilar-López R, Daugherty JC, Pérez-García M, Piñón-Blanco A. A pilot study on the adequacy of the TOMM in detecting invalid performance in patients with substance use disorders. Journal of Clinical and Experimental Neuropsychology. 2021;43(3):255–263. doi: 10.1080/13803395.2021.1912298.

[CR88] von Hippel PT. The heterogeneity statistic I² can be biased in small meta-analyses. BMC Medical Research Methodology. 2015;15:35. doi: 10.1186/s12874-015-0024-z.

[CR89] Walter J, Morris J, Swier-Vosnos A, Pliskin N. Effects of severity of dementia on a symptom validity measure. The Clinical Neuropsychologist. 2014;28(7):1197–1208. doi: 10.1080/13854046.2014.960454.

[CR90] Wilk MB, Gnanadesikan R. Probability plotting methods for the analysis of data. Biometrika. 1968;55(1):1–17.

[CR91] Williamson DJ, Holsman M, Chaytor N, Miller JW, Drane DL. Abuse, not financial incentive, predicts non-credible cognitive performance in patients with psychogenic non-epileptic seizures. The Clinical Neuropsychologist. 2012;26(4):588–598. doi: 10.1080/13854046.2012.670266.

[CR92] Wodushek TR, Domen CH. Comparing two models of performance validity assessment in patients with Parkinson's disease who are candidates for deep brain stimulation surgery. Applied Neuropsychology: Adult. 2020;27(1):9–21. doi: 10.1080/23279095.2018.1473251.
