PLOS ONE. 2019 Dec 30;14(12):e0227113. doi: 10.1371/journal.pone.0227113

Indicators to distinguish symptom accentuators from symptom producers in individuals with a diagnosed adjustment disorder: A pilot study on inconsistency subtypes using SIMS and MMPI-2-RF

Cristina Mazza 1,#, Graziella Orrù 2, Franco Burla 1, Merylin Monaro 3, Stefano Ferracuti 1, Marco Colasanti 1,#, Paolo Roma 1,*,#
Editor: Stephan Doering
PMCID: PMC6936836  PMID: 31887214

Abstract

In the context of legal damage evaluations, evaluees may exaggerate or simulate symptoms in an attempt to obtain greater economic compensation. To date, practitioners and researchers have focused on detecting malingering behavior as an exclusively unitary construct. However, we argue that there are two types of inconsistent behavior that speak to possible malingering—accentuating (i.e., exaggerating symptoms that are actually experienced) and simulating (i.e., fabricating symptoms entirely)—each with its own unique attributes; thus, it is necessary to distinguish between them. The aim of the present study was to identify objective indicators to differentiate symptom accentuators from symptom producers and consistent participants. We analyzed the Structured Inventory of Malingered Symptomatology (SIMS) scales and the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) validity scales of 132 individuals with a diagnosed adjustment disorder with mixed anxiety and depressed mood who had undergone assessment for psychiatric/psychological damage. The results indicated that the SIMS Total Score, Neurologic Impairment, and Low Intelligence scales and the MMPI-2-RF Infrequent Responses (F-r) and Response Bias (RBS) scales successfully discriminated among symptom accentuators, symptom producers, and consistent participants. Machine learning analysis was used to identify the most efficient parameter for classifying these three groups, recognizing the SIMS Total Score as the best indicator.

Introduction

Psychic damage (or psychological/psychiatric damage) can be defined as an alteration of psychic integrity (i.e., a qualitative and quantitative change in psychic elements, including primary mental abilities, affectivity, defense mechanisms, and mood) [1]. Although it is considered biological damage, it is not limited to medically assessable pathology; rather, it involves both objective and subjective elements, linked to an individual’s unique personal history [2]. In the context of legal damage evaluations, individuals obtain economic compensation based on an estimation of damage: the higher the estimate, the higher the indemnity received. A psychopathological condition that is often presented in this context is adjustment disorder. The 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) lists the following diagnostic criteria for this disorder: presentation of emotional or behavioral symptoms within 3 months of a specific stressor; experience of more stress than would normally be expected in response to the stressful life event, and/or stress that causes significant problems in one’s relationships, either at work or at school; and symptoms that are not the result of another mental health disorder or associated with healthy grieving [3]. In the medico-legal context, disorders associated with depression and anxiety, such as chronic adjustment disorder with mixed anxiety and depressed mood, are the most frequently simulated [4], at a rate of over 50% [5].

Forensic practitioners are trained to evaluate whether participants might simulate or accentuate distress or a psychic disorder in order to unjustly obtain greater compensation [6]. The DSM-5 defines such behavior as malingering and describes it as the “intentional production of false or grossly exaggerated physical or psychological symptoms, motivated by external incentives” (p. 726) [3]. Overall, malingering is a serious social problem that increases costs for society [7,8]. The precise incidence of the phenomenon is largely unknown; Young [9,10] pointed out that many studies relied on overinclusive criteria leading to an increase in prevalence, but in forensic and disability samples its prevalence has been reported to be around 15 ± 15%.

Lipman [11] indicated four types of malingering: invention, or completely fabricating symptoms; perseveration, or fraudulently presenting formerly present symptoms that have since ceased; exaggeration, or making existing symptoms appear worse than they really are; and transference, or attributing genuine symptoms to an unrelated cause or accident. Similarly, Resnick [12] described three potential subtypes of malingering: pure malingering, which involves completely fabricating symptoms; partial malingering, which involves exaggerating existing symptoms; and false imputation, which involves attributing symptoms to a cause that has little or no relationship to their development. Despite these various classifications of malingering, however, the behavior has mostly been considered a unitary construct, both theoretically and empirically, conflating the aspects of invention and exaggeration. The literature has focused on the differences between simulators and honest respondents, while research focusing on exaggerators and their unique attributes has not been conducted, even though exaggerating behavior is thought to be much more frequent than invention and false imputation [13].

Researchers have developed instruments specifically designed to detect malingering and have included scales with the same purpose in personality inventories. Examples include the Structured Interview of Reported Symptoms (SIRS) [14,15] and its second edition (SIRS-2) [16], the latter being a 172-item test designed for the assessment of feigning that also includes a scale to assess defensiveness. The SIRS has been considered the “gold standard” in assessing psychiatric malingering [17] and has received extensive validation [18]. A study [19] on the SIRS-2 indicated that, compared to the SIRS, it reaches a higher specificity (94.3% vs. 92.0%) but a lower sensitivity (36.8% vs. 47.4% among forensic patients). Another example is the Test of Memory Malingering (TOMM) [20], a 50-item visual recognition test designed to help distinguish malingerers from individuals with true memory impairments. Its usefulness in discriminating between bona fide individuals and malingerers has been evidenced by a series of studies conducted with different types of participants (college students, patients with traumatic brain injury, and hospital outpatients), different research designs (simulation and known-group), and different procedures for stimulus presentation (paper-and-pencil and computer) [21]. A further example is the Inventory of Problems-29 (IOP-29) [22], a 29-item, easy-to-use measure of non-credible mental and cognitive symptoms. Studies using this instrument have yielded encouraging results in the detection of malingering [23,24] and indicated that it can be used in a multimethod symptom validity assessment along with the TOMM [25]. Another example is the Structured Inventory of Malingered Symptomatology (SIMS) [26], a 75-item multi-axial self-report questionnaire validated with clinical-forensic, psychiatric, and non-clinical populations. In the literature, many authors have used the SIMS to discriminate between honest respondents and simulators, confirming its usefulness for this task [27–33].
Regarding the SIMS, scale sensitivity partly depends on the feigned condition. A study by Edens, Otto, and Dwyer [27] evaluating the sensitivity of the SIMS scales for different conditions (e.g., depression, psychosis, cognitive impairment) indicated that, while most scales (except the Psychosis scale) were sensitive to malingering regardless of the specific symptomatology, the Total Score was overall the most sensitive indicator, correctly identifying 96.4% of all malingered protocols. A last example is the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) [34], a 338-item personality questionnaire that includes subscales designed to detect overreporting and response bias (Infrequent Responses, F-r; Infrequent Psychopathology Responses, Fp-r; Symptom Validity, FBS-r; Infrequent Somatic Responses, Fs; Response Bias, RBS). A recent meta-analysis [35] indicated that the most sensitive scale for feigned mental disorders is RBS (.93, cut-off ≥ 80), followed by FBS-r (.84, cut-off ≥ 80) and F-r (.71, cut-off ≥ 100); for feigned cognitive impairment, the most sensitive scale is FBS-r (.88, cut-off ≥ 80), followed by RBS (.84, cut-off ≥ 80) and Fs (.66, cut-off ≥ 80); lastly, for feigned medical complaints, the most sensitive scale is RBS (.94, cut-off ≥ 80), followed by FBS-r (.69, cut-off ≥ 80) and F-r (.59, cut-off ≥ 100).

In an attempt to develop a strategy to distinguish between symptom accentuators and producers, we employed a promising tool in the field of lie detection: machine learning (ML). ML can be defined as “the study and construction of algorithms that can learn information from a set of data (called a training set) and make predictions for a new set of data (called a test set). In other words, it consists of training one or more algorithms to predict outcomes without being explicitly programmed and only uses the information learned from the training set.” In the literature, ML models have recently been used to discriminate between honest respondents and fakers in a variety of settings [36–38], with extremely promising accuracy, indicating that ML can, in fact, outperform traditional statistical methods.

The main purpose of this study was to identify helpful criteria to distinguish accentuators from symptom producers and consistent individuals. We hypothesized that symptom accentuators would obtain higher scores than consistent participants, but lower scores than symptom producers, on selected SIMS and MMPI-2-RF validity scales.

Materials and method

Participants and procedure

Participants were 150 Italian individuals who underwent a mental health examination by a judge’s order, between January and December 2018, in the context of a lawsuit involving psychological injury. All were referred to the Laboratory of Clinical Psychology at the Human Neuroscience Department, Faculty of Medicine and Dentistry, Sapienza University of Rome, an academic reference center for psychiatric evaluation and psychological assessment. The inclusion criteria were: (a) having been born and raised in Italy, and (b) having a clinical diagnosis of chronic adjustment disorder with mixed anxiety and depressed mood. The exclusion criteria were: (a) a psychiatric history prior to the accident, and (b) comorbidity with another psychiatric disorder. Forty percent of these diagnoses followed a road accident, 30% followed work-related accidents, 20% followed workplace harassment and stalking episodes (in equal proportion), and 10% originated from domestic violence.

The assignment of participants (N = 150) to groups and the evaluation of their psycho-diagnostic profiles were conducted in three phases.

In the first phase, participants underwent a psychiatric interview and a psychological-clinical interview, each conducted blind. At the end of these interviews, the psychiatrist and the clinical psychologist, still blind to each other’s assessment, completed an information sheet establishing the following: (a) congruence between the documentation submitted and considered suitable (e.g., psychopharmacological prescriptions, psychotherapeutic treatment certifications, illness certificates for work) and the diagnosis provided by a mental health professional with the required training to diagnose chronic adjustment disorder with mixed anxiety and depressed mood; (b) congruence between the manifestation of clinical and emotional symptoms during the interview (e.g., lowering of mood, crying or hopelessness, nervousness, agitation) and the diagnosis of chronic adjustment disorder with mixed anxiety and depressed mood; and (c) congruence between the reported symptomatology and the reported impairment in day-to-day functioning in the participants’ social, working, and other important areas (e.g., absenteeism from work, changes and difficulties in interpersonal relationships, complications regarding illness and treatments such as extension of hospital stay and decreased compliance with the recommended treatment regimen). The determination for each of these criteria was “congruent” or “incongruent”. The information sheets were then delivered directly to the research coordinator; until the end of this evaluation step, neither mental health professional had any knowledge of the assessment made by the other on the same participant (i.e., a blind procedure). Whenever there was disagreement on one or more conclusions, the experts were required to justify their choice and reach an agreement. Initially, the experts disagreed 11 times on the second criterion (b), while in 23 cases they reached different conclusions on the third criterion (c).
Participants for whom agreement was impossible (N = 12 out of 34 divergences) were excluded from the study. A further six participants were excluded because they did not consent to the research. In the second phase, examinees were assigned to one of three groups on the basis of the experts’ conclusions: (a) Consistent Participants (CP), comprising individuals judged congruent on all three criteria; (b) Symptom Accentuators (SA), comprising examinees judged congruent on criterion 1 but incongruent on either criterion 2 or 3; and (c) Symptom Producers (SP), comprising members judged incongruent on at least two criteria. In the third phase, participants completed a test battery, administered blind with the help of specialized technical staff. Test scoring was performed via computer software.

The final sample comprised 132 participants (Table 1). The three groups differed in age [F (2, 129) = 8.373, p < .001] and educational level [F (2, 129) = 4.240, p = .016], but not gender composition [F (2, 129) = 1.775, p = .191].

Table 1. Demographic composition of the three research groups.

                              Consistent        Symptom           Symptom
                              Participants      Accentuators      Producers
                              (n = 49)          (n = 44)          (n = 39)
Gender, M (n)                 42                31                29
Gender, F (n)                 7                 13                10
Age, M (SD)                   48.82 (6.84)      44.55 (13.30)     39.59 (10.77)
Education (years), M (SD)     13.08 (3.05)      11.50 (2.28)      12.00 (2.61)

The study was carried out with written informed consent by all participants, in accordance with the Declaration of Helsinki. It was approved by the local ethics committee (Board of the Department of Human Neuroscience, Faculty of Medicine and Dentistry, Sapienza University of Rome).

Materials

Structured Inventory of Malingered Symptomatology (SIMS) [26,39]. The SIMS is comprised of 75 items that describe implausible, rare, atypical, or extreme symptoms that bona fide patients tend not to present. The response options are on a dichotomous scale (“True” vs. “False”), and the measure aims at detecting feigned psychopathology [40]. The item responses are grouped into five main scales, addressing the validity of symptoms related to Psychosis (P; evaluates the degree to which participants report unusual and bizarre psychotic symptoms that are not typically encountered in psychiatric populations), Low Intelligence (LI; assesses the degree to which participants simulate or exaggerate intellectual deficits through low performance on simple items), Neurological Impairment (NI; evaluates illogical or atypical neurological symptoms), Affective Disorders (AF; evaluates the degree to which participants present atypical symptoms of depression and anxiety), and Amnestic Disorders (AM; evaluates the degree to which participants report memory deficits that are inconsistent with patterns of impairment seen in brain dysfunction or injury). The total number of implausible symptoms endorsed by the subject represents the Total Score (TS), which is the main symptom validity scale of the SIMS. Indeed, the five SIMS subscales were not designed to detect the overreporting of mental health problems, but to determine which types of psychopathology respondents tend to overreport when the SIMS Total Score is above the cut-off value [39]. Different cut-off values have been used in the literature (i.e., ≥ 14 [25]; ≥ 17 [39]; ≥ 19 [32], and even ≥ 24 [41]). A recent meta-analytic study encompassing 4,180 protocols supported the claim that the specificity of the SIMS may be unsatisfactory when the traditional cut-offs (i.e., ≥ 15 and ≥ 17) are adopted [40]. The Italian version of the SIMS was translated by La Marca, Rigoni, Sartori, and Lo Priore [42].
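The scoring logic described above can be sketched in a few lines. This is a minimal illustration only: the item-to-scale key below is hypothetical, not the published SIMS key.

```python
# Sketch of SIMS-style scoring: the Total Score (TS) is the count of endorsed
# implausible symptoms, and each subscale counts endorsements among its items.
# The item-to-scale mapping below is HYPOTHETICAL, for illustration only.
SCALE_ITEMS = {
    "P":  [1, 6, 11],   # Psychosis
    "LI": [2, 7, 12],   # Low Intelligence
    "NI": [3, 8, 13],   # Neurological Impairment
    "AF": [4, 9, 14],   # Affective Disorders
    "AM": [5, 10, 15],  # Amnestic Disorders
}

def score_sims(endorsed_items):
    """Return subscale scores and the Total Score from the set of endorsed items."""
    endorsed = set(endorsed_items)
    scores = {scale: len(endorsed & set(items)) for scale, items in SCALE_ITEMS.items()}
    scores["TS"] = sum(scores.values())
    return scores

scores = score_sims([1, 2, 3, 8])   # respondent endorsed four implausible items
flagged = scores["TS"] >= 14        # one of the cut-off values cited in the literature
```

With this toy key, endorsing items 1, 2, 3, and 8 yields NI = 2 and TS = 4, below any of the cut-offs cited above.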

Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) [34]. The MMPI-2-RF is a 51-scale measure of personality and psychopathology with 338 items, selected from the 567 items on the complete MMPI-2. The response options are on a dichotomous scale (“True” vs. “False”). The MMPI-2-RF is comprised of: nine validity scales, most of which are revised versions of the MMPI-2 validity scales; nine Restructured Clinical (RC) scales, developed by Tellegen et al. and released in 2003 [43]; three Higher Order (HO) scales, derived from factor analyses to identify the basic domains of affect, thought, and behavior; 23 Specific Problem (SP) scales, intended to highlight important characteristics associated with particular RC scales; and revised versions of the Personality Psychopathology Five (PSY-5) scales, which link the MMPI-2-RF to a five-factor model of personality pathology [34]. The present study considered the following MMPI-2-RF validity scales: F-r, Fp-r, Fs, FBS-r, RBS, and K-r. The Italian version was translated by Sirigatti and Faravelli [44].

Statistical analysis and machine learning models

A first multivariate analysis of variance with covariates (MANCOVA) was run using the three research groups (Consistent Participants, Symptom Accentuators, Symptom Producers) as the independent variable and the SIMS Total Score and subscale scores as dependent measures. A second MANCOVA was run using the three research groups as the independent variable and the MMPI-2-RF validity scale T-scores as dependent measures. Both analyses controlled for age and educational level. The Bonferroni correction was applied to adjust confidence intervals; SPSS-25 software (SPSS Inc., Chicago, IL) automatically corrected the p-value for the number of comparisons. Scheffé’s [45] method was used to assess post hoc pairwise differences (p < 0.05). The effect sizes of the score differences between groups were also measured; partial η² values of .02, .13, and .26 were considered indicative of small, medium, and large effects, respectively [46]. The SPSS-25 statistical package was used for all analyses. ML analyses were run using WEKA 3.8 [47].
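For the one-way case, the effect-size benchmark used here (partial η²) reduces to SSB / (SSB + SSW). A minimal sketch with made-up numbers (the study’s actual analyses were MANCOVAs with covariates, run in SPSS):

```python
def one_way_anova(groups):
    """Return (F, partial eta squared) for a one-way design.

    In the one-way case, partial eta^2 = SSB / (SSB + SSW), where SSB is the
    between-group and SSW the within-group sum of squares.
    """
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    f = (ssb / df_between) / (ssw / df_within)
    eta_p2 = ssb / (ssb + ssw)
    return f, eta_p2

# Three small made-up groups; eta_p2 = 0.5 would count as a large effect
# against the .02 / .13 / .26 benchmarks cited above.
f_stat, eta = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```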

Results

SIMS

A 3 x 6 MANCOVA (groups x SIMS scales) showed a significant effect of group on the selected SIMS scales, V = .550, F (12, 246) = 7.777, p < .001, partial η² = .275. In more detail, the results for SIMS showed that the three research groups (Consistent Participants, Symptom Accentuators, Symptom Producers) obtained significantly different scores on the NI, LI, and TS scales. Furthermore, on the AF scale, there was a significant difference in scores between Consistent Participants and the other two groups. Lastly, there was a significant difference between Symptom Producers and the other participants on the P and AM scales. Table 2 shows the descriptive values of SIMS scores for each group.

Table 2. Comparison between consistent participants, accentuators, and symptom producers on SIMS mean scores.

SIMS scale                   Consistent         Symptom            Symptom            F       p        partial η²
                             Participants       Accentuators       Producers
                             (n = 49)           (n = 44)           (n = 39)
                             M (SD)             M (SD)             M (SD)
Neurologic Impairment (NI)   1.31 (.94) A       2.61 (2.26) B      3.92 (2.21) C      16.83   < .001   .210
Affective Disorder (AF)      5.73 (2.01) A      8.14 (3.25) B      8.46 (3.32) B      11.24   < .001   .150
Psychosis (P)                .90 (.90) A        1.55 (1.39) A      2.67 (2.18) B      11.20   < .001   .150
Low Intelligence (LI)        1.06 (1.20) A      2.11 (1.73) B      4.31 (2.23) C      32.92   < .001   .341
Amnestic Disorder (AM)       1.35 (1.17) A      2.16 (1.84) A      3.82 (2.78) B      14.19   < .001   .183
Total Score (TS)             10.35 (4.05) A     16.50 (5.77) B     23.15 (6.29) C     50.44   < .001   .443

Note. Within each row, different letters indicate a significant difference between groups.

MMPI-2-RF

A 3 x 6 MANCOVA (groups x MMPI-2-RF selected scales) showed a significant effect of group on the MMPI-2-RF selected validity scales, V = .377, F (12, 246) = 4.758, p < .001, partial η² = .188. In more detail, results for the MMPI-2-RF showed that the three groups (Consistent Participants, Symptom Accentuators, Symptom Producers) obtained significantly different scores on the F-r and RBS scales. Furthermore, on the Fp-r and Fs scales, a significant difference was found between the Symptom Producers and the other two groups. On the FBS-r scale, scores of the consistent group significantly differed from those of the accentuating and producing groups. On the K-r scale, there was a significant difference between the consistent group and the producing group, but not between the accentuating group and either of the other groups. Table 3 shows the descriptive values of the MMPI-2-RF scales for each group.

Table 3. Comparison between consistent participants, accentuators, and symptom producers on MMPI-2-RF selected validity scale mean scores.

MMPI-2-RF scale   Consistent         Symptom            Symptom            F       p        partial η²
                  Participants       Accentuators       Producers
                  (n = 49)           (n = 44)           (n = 39)
                  M (SD)             M (SD)             M (SD)
F-r               63.16 (8.18) A     70.73 (12.64) B    84.41 (13.98) C    29.10   < .001   .314
Fp-r              58.84 (8.94) A     62.86 (9.99) A     72.64 (16.21) B    11.56   < .001   .154
Fs                61.35 (12.06) A    68.84 (17.57) A    82.51 (19.26) B    15.38   < .001   .195
FBS-r             56.33 (13.09) A    65.41 (16.27) B    72.74 (14.27) B    11.71   < .001   .156
RBS               61.59 (8.79) A     70.89 (15.13) B    82.41 (16.65) C    20.61   < .001   .245
K-r               44.16 (7.92) A     40.80 (6.59) A,B   38.62 (9.40) B     3.08    .049     .046

Note. Within each row, different letters indicate a significant difference between groups.

Feature selection and machine learning models

The recent focus on the lack of replicability in behavioral experiments has suggested that the discipline is facing a “replicability crisis.” One potential source of this problem is the frequent use of inferential statistics with misunderstood p values and underpowered experiments [48]. Recent methodological discussions relate to procedures that guarantee replicable results [49]. In summarizing their assessment of replicability, Szucs and Ioannidis [50] concluded that: “Assuming a realistic range of prior probabilities for null hypotheses, false report probability is likely to exceed 50% for the whole literature. In light of our findings, the recently reported low replication success in psychology is realistic, and worse performance may be expected for cognitive neuroscience” (p. 1). The replication of experimental results may be distinguished according to exact versus broad replication [51]. Exact replication follows the same procedure as the original experiment; cross-validation can be viewed as a form of exact replication carried out within a single dataset. Cross-validation is generally a very good procedure for estimating the replicability of a given result: while it does not prevent model overfitting, it yields a more honest estimate of true out-of-sample performance.

To avoid overfitting, cross-validation is regarded as a compulsory step in ML analysis; nonetheless, its use is very limited in the analysis of psychological experiments. There are a number of cross-validation procedures, but one that consistently yields dependable estimates is the so-called k-fold method (or k-fold cross-validation). K-fold cross-validation is a technique used to evaluate predictive models by repeatedly partitioning the original sample into a training set, used to train the model, and a validation set, used to evaluate it. Specifically, in this paper, we adopted a 10-fold cross-validation procedure, in which the original sample was randomly partitioned into 10 equal-size subsamples, the folds. Of the 10 subsamples, a single subsample is retained as validation data for testing the model, and the remaining 10 − 1 = 9 subsamples are used as training data. This process is repeated 10 times, with each of the 10 folds used exactly once as validation data. The results from the 10 folds are then averaged to produce a single estimate of prediction accuracy. Most psychometric investigations do not address the problem of generalization outside the sample used to develop the model. Clearly, the avoidance of cross-validation yields results that are overoptimistic and that may not replicate when the model is applied to out-of-sample data. This was recently confirmed by Bokhari and Hubert [52] when they re-analysed the results of the MacArthur Violence Risk Assessment Study using ML tree models and cross-validation. Likewise, Pace et al. (2019) [53], in discussing the results of the b test [54] (a test for detecting malingered cognitive symptoms), observed that a decision rule developed on the whole dataset yielded a classification accuracy of 88% on that same dataset; after 10-fold cross-validation, however, the accuracy dropped to 66%.
For the reasons reported above, in the present study, all ML analyses were conducted using 10-fold cross-validation methods that previous research had shown to be robust in replication studies.
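The 10-fold procedure described above can be sketched as follows; the data and the single-threshold “classifier” below are synthetic stand-ins, not the study’s models or scores.

```python
import random

def k_fold_cv(xs, ys, k=10, seed=0):
    """Average out-of-fold accuracy: each fold serves exactly once as test data."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]       # k roughly equal folds
    accs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [j for j in idx if j not in held_out]
        # "Training": place a threshold midway between the two class means,
        # computed on the training folds only (a toy stand-in for a real model)
        class0 = [xs[j] for j in train if ys[j] == 0]
        class1 = [xs[j] for j in train if ys[j] == 1]
        thr = (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2
        preds = [1 if xs[j] >= thr else 0 for j in test_idx]
        accs.append(sum(p == ys[j] for p, j in zip(preds, test_idx)) / len(test_idx))
    return sum(accs) / len(accs)

# Synthetic scores: low values mimic consistent respondents, high values producers
xs = [8, 9, 10, 11, 12, 9, 10, 11, 10, 12, 21, 22, 23, 24, 25, 22, 23, 24, 23, 25]
ys = [0] * 10 + [1] * 10
cv_accuracy = k_fold_cv(xs, ys)
```

Because each fold’s threshold is fitted only on the other nine folds, the averaged accuracy estimates out-of-sample performance rather than in-sample fit.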

The identification of the most informative attributes (or features, or predictors), called “feature selection,” is a widely used procedure in ML [55]. Feature selection is a very powerful means of building a classification model that can detect accentuators and symptom producers as accurately as possible. In fact, it permits the removal of redundant and irrelevant features, increasing model generalization and reducing overfitting and noise in the data. In order to identify the most discriminating features for classification, we ran a trial-and-error procedure using random forest as the model. This model consists of many decision trees, each built from a random extraction of observations from the dataset and a random extraction of features. The random extraction is repeated many times, finally selecting the set of features that maximizes the model’s accuracy. The features selected at the top of the trees are generally more important than those selected at end nodes, because the top splits typically produce larger information gains. Based on this analysis, the predictors used to develop the ML models were age, neurological impairment, affective disorders, psychosis, SIMS Total Score (TS), low intelligence (LI), amnestic disorders (AM), F-r, Fp-r, Fs, and RBS.
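The underlying idea of ranking candidate predictors by discriminative power can be illustrated with a much simpler stand-in for the random-forest procedure used here: score each feature by the best accuracy a single threshold on it can reach. The feature names and values below are synthetic.

```python
def best_split_accuracy(values, labels):
    """Best accuracy achievable by thresholding one feature (either direction)."""
    best = 0.0
    for t in sorted(set(values)):
        preds = [1 if v >= t else 0 for v in values]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        best = max(best, acc, 1 - acc)   # 1 - acc covers the reversed rule
    return best

def rank_features(feature_table, labels):
    """Rank features by their single-threshold accuracy, most discriminating first."""
    scores = {name: best_split_accuracy(vals, labels)
              for name, vals in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)

labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # 0 = consistent, 1 = producer (toy data)
features = {
    "sims_total": [10, 11, 9, 12, 10, 22, 24, 23, 25, 21],   # well separated
    "age":        [48, 50, 45, 47, 52, 44, 49, 40, 51, 46],  # heavily overlapping
}
ranking = rank_features(features, labels)
```

On these toy data the well-separated feature tops the ranking, mirroring how more discriminating predictors rise toward the top splits of a tree.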

These 11 features were entered into different ML classifiers, which were trained (using a 10-fold cross-validation procedure) to classify every subject as belonging to one of the three groups of interest (Consistent Participants, Symptom Accentuators, Symptom Producers). To ensure that the results are stable across different classifiers and do not depend on specific model assumptions, we selected the following classifiers as representative of different categories (from regression to classification trees to Bayesian methods): naïve Bayes, logistic regression, simple logistic, support vector machine, and random forest (WEKA Manual for Version 3.7.8) [56]. The results and accuracies of the different classifiers, as measured by the percentage of participants correctly classified, AUC, and F1 score, are reported in Table 4.

Table 4. Accuracies of the five ML classifiers as measured by percentage of participants correctly classified, AUC, and F1.

Classifier               Accuracy (%)   AUC    F1
Naïve Bayes              71.79          0.85   0.71
Logistic Regression      70.94          0.84   0.71
Simple Logistic          66.67          0.83   0.66
Support Vector Machine   69.23          0.81   0.69
Random Forest            71.79          0.86   0.72

Note. Perfect classification would be equivalent to AUC = 1 and the F1 score = 1.

AUC stands for area under the curve in ROC analysis. The F1 score is defined as the weighted harmonic mean of the precision and recall of the test, where precision (p) is the number of correct positive results divided by the number of all positive results returned by the classifier, and recall (r) is the number of correct positive results divided by the number of all relevant samples, i.e., all samples that should have been identified as positive.
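The quantities defined in this note can be computed directly. Below is a minimal sketch for a three-group case with a toy prediction vector; per-class F1 scores are combined using support weights, one common convention for a multi-class F1.

```python
def prf1(preds, labels, positive):
    """Precision, recall, and F1 for one class treated as 'positive'."""
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def weighted_f1(preds, labels):
    """Support-weighted average of the per-class F1 scores."""
    total = len(labels)
    return sum(labels.count(c) / total * prf1(preds, labels, c)[2]
               for c in sorted(set(labels)))

# Toy predictions for the three groups of interest
labels = ["CP", "SA", "SA", "SP", "SA", "CP"]
preds  = ["CP", "CP", "SA", "SP", "SA", "SP"]
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
wf1 = weighted_f1(preds, labels)
```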

All classifiers were based on different assumptions and were representative of different classes of classifiers. However, they all yielded similarly accurate results (in the range of 66.67–71.79%). ML models such as those reported above are difficult to interpret: the operations computed by the algorithm to identify a single participant as a Consistent Participant, Symptom Accentuator, or Symptom Producer are unclear. To better understand the logic on which the classification results are based, a simpler model, called OneR, was run [57]. This classifier is far more transparent in the operations it computes, making the classification logic easy to inspect. The accuracy of this model was 66.67%, and it applied the following rules sequentially:

  • if the SIMS Total Score is < 13.5, then the subject is a consistent participant;

  • if the score is ≥ 13.5 and < 18.5, or if it is ≥ 34.5, then the subject is an accentuator; and

  • if the score is ≥ 18.5 and < 34.5, then the subject is a symptom producer.

According to the classification process followed by the OneR algorithm, amongst the parameters considered, the SIMS Total Score emerged as the feature on which the algorithm based its classification efficiency. Indeed, according to the aforementioned classification rules, OneR identified cut-offs for the SIMS score (i.e., 13.5, 18.5, and 34.5) to distinguish symptom producers from symptom accentuators and consistent participants.
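Read sequentially, the reported OneR rule set reduces to a few comparisons on the SIMS Total Score; a minimal sketch:

```python
def classify_by_sims(total_score):
    """Apply the reported OneR cut-offs sequentially to a SIMS Total Score."""
    if total_score < 13.5:
        return "consistent participant"
    if total_score < 18.5 or total_score >= 34.5:
        return "symptom accentuator"
    return "symptom producer"        # i.e., 18.5 <= score < 34.5
```

With the group means reported in Table 2, a consistent participant’s mean TS (10.35) falls in the first band, an accentuator’s (16.50) in the second, and a producer’s (23.15) in the third.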

Discussion

In forensic damage evaluations, it is not always easy to determine whether a given symptom presentation is bona fide or non-credible. For this reason, researchers have designed tests to detect feigning and have included, in personality and psychopathological inventories, validity scales to identify what Paulhus defined as “responding bias” [58]—the systematic tendency to answer self-report items in a way that interferes with accurate self-presentation. However, such measures are unable to distinguish between persons who exaggerate existing symptoms and persons who completely fabricate symptoms. In the context of damage evaluations, the ability to differentiate between consistent participants, accentuators, and symptom producers can assist courts in determining the appropriate rates of damage and proportional indemnity. Thus, the present study sought to identify criteria for identifying accentuators. We analyzed differences in the SIMS and MMPI-2-RF validity scale scores among participants previously classified as consistent participants, accentuators, and symptom producers. The SIMS is a widely used tool for identifying feigning, while the MMPI-2-RF validity scales are used to investigate responding bias.

The results for the SIMS indicated that the TS, NI, and LI scales were able to distinguish among consistent participants, accentuators, and symptom producers. In contrast, the AF scale could discriminate between consistent participants and symptom producers but was unable to identify whether feigners were exaggerating or fabricating their symptoms. The P and AM scales were able to distinguish between symptom producers and consistent participants/accentuators. Finally, the TS provided an overall estimate of the likelihood that a respondent was fabricating or exaggerating symptoms of psychiatric or cognitive dysfunction. These results not only confirm the findings reported in the literature [26,27,31,33] that TS is one of the best overall indicators of feigning, but they also suggest that TS is capable of distinguishing between accentuators and symptom producers.

In this research, we used the traditional cut-off value of ≥ 14, which has been shown to have remarkably high sensitivity [26,27] and good specificity. We observed that accentuators obtained scores just over the cut-off value (M = 16.50), whereas symptom producers obtained significantly higher scores (M = 23.15). We argue that the different optimal cut-off values presented in the literature (≥ 14 [26]; ≥ 19 [33]; and even ≥ 24 [41]) might reflect a difference between these two subtypes of inconsistent behavior that speak to possible malingering. Considering that a lower cut-off value (e.g., ≥ 14 or ≥ 16) might incorrectly identify an accentuator as a symptom producer, whereas a higher cut-off value (e.g., ≥ 19 or ≥ 24) might incorrectly identify an accentuator as a consistent respondent, it would be better to use two different cut-offs to identify accentuators and symptom producers, respectively.
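The two-cutoff logic described above can be sketched as a simple screening rule. This is a minimal illustration, not a validated decision procedure: the thresholds of 14 and 19 are example values drawn from the published cut-offs cited above, and the function name and labels are hypothetical.

```python
def classify_sims_ts(total_score, lower_cutoff=14, upper_cutoff=19):
    """Illustrative two-cutoff screening rule for the SIMS Total Score.

    The thresholds are example values taken from published cut-offs
    (>= 14 and >= 19); they are not validated decision criteria.
    """
    if total_score < lower_cutoff:
        # Below both cut-offs: presentation consistent with a credible report.
        return "consistent"
    elif total_score < upper_cutoff:
        # Between cut-offs: exaggeration of genuine symptoms suspected.
        return "possible accentuator"
    else:
        # At or above the upper cut-off: fabrication suspected.
        return "possible symptom producer"
```

Under this sketch, the group means reported in the study fall into the expected bands: accentuators (M = 16.50) land between the two cut-offs, and symptom producers (M = 23.15) land above the upper one.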

The NI scale reflects the degree to which a respondent endorses illogical or highly atypical neurological symptoms. Our results showed that the NI scale was useful not only in discriminating between feigners and consistent respondents, but also in differentiating between the two subtypes of inconsistent behavior that speak to possible malingering: accentuators’ scores (M = 2.61) were quite close to the cut-off value recommended by Smith and Burger (≥ 2), whereas symptom producers obtained significantly higher scores (M = 3.92). The LI scale reflects the degree to which a respondent endorses cognitive incapacity or intellectual deficits inconsistent with the capacities and knowledge typically observed in individuals with genuine cognitive or intellectual deficits. In the present study, the LI scale distinguished between consistent respondents (M = 1.06), accentuators (M = 2.11), and symptom producers (M = 4.31) when the recommended cut-off value (≥ 2) was used. Again, the LI scale was not only able to identify feigners, but it was also able to distinguish between accentuators (whose scores were distributed around the cut-off value) and symptom producers (who obtained significantly higher scores). In summary, it would be useful to set two cut-off values for the aforementioned SIMS scales for use in identifying accentuators.

The P scale reflects the degree to which a respondent endorses unusual psychotic symptoms that are not typically present in actual psychiatric patients. In this study, using the recommended cut-off value (≥ 1), we found that the P scale could not distinguish between consistent respondents (M = 0.90) and accentuators (M = 1.55); however, it could distinguish both of these groups from symptom producers (M = 2.67). It is worth mentioning that, even though the P scale has been found to identify up to 91.5% of participants who malingered psychotic symptoms [27], individuals with an alleged adjustment disorder associated with depression and anxiety obtained scores above the cut-off value, indicating endorsement of psychotic symptoms. This result is in line with the literature [40], which indicates that malingerers tend to overgeneralize their symptoms.

The AM scale reflects the degree to which a respondent endorses symptoms of memory impairment that are inconsistent with patterns of impairment seen in brain dysfunction or injury. The scale demonstrates high sensitivity for its target psychopathology [33,59,60]. In the present study, we used the recommended cut-off value (≥ 2) and found that the AM scale behaved like the P scale: it could not distinguish between consistent respondents (M = 1.35) and accentuators (M = 2.16); however, it could separate both of these groups from symptom producers (M = 3.82).

The AF scale reflects the degree to which a respondent endorses atypical feelings and symptoms of depression and anxiety. This scale also shows high sensitivity for its target psychopathology [33,59,60], correctly identifying up to 100% of individuals who malinger depression when the recommended cut-off value (≥ 5) is used [27]. In our study, using a cut-off value of ≥ 5, AF successfully distinguished between consistent participants (M = 5.73) and feigners, but it was unable to discriminate between accentuators (M = 8.14) and symptom producers (M = 8.46). Despite its high sensitivity, the scale has been criticized because it overlaps with genuine depressive symptoms [61], thus increasing the risk of false positives [27], as observed in our study.

Our results on the MMPI-2-RF validity scales indicated that the F-r and RBS scales successfully discriminated between consistent individuals, accentuators, and symptom producers. The F-r scale comprises 32 items designed to detect unusual or infrequent responses in the normative population. High scores indicate the overreporting of a wide range of psychological, cognitive, and somatic symptoms. According to Sellbom and Bagby [62], F-r was designed to more broadly identify infrequent responses across populations; thus, it is likely to work better when less severe psychopathology is overreported, as was the case in the present study, in which participants tended to amplify anxious and depressive symptoms. The RBS scale comprises 28 items that measure overreporting as an unusual mix of responses associated with non-credible memory complaints [34]. For both scales, we observed lower scores in consistent participants relative to symptom producers, with the largest effect size in discriminating between accentuators and symptom producers. These results are consistent with the findings of Wygant et al. [63,64], which indicated that F-r and RBS perform best in predicting malingering criteria.

The Fs and Fp-r scales were able to distinguish between symptom producers and consistent respondents/accentuators. The Fp-r scale’s 21 items analyze infrequent responses within psychiatric inpatient samples. An elevated score indicates unfavorable self-reporting and exaggerated psychopathology. In particular, the scale focuses on identifying symptoms that are rarely reported among bona fide patients with mental illness [65]. The Fs scale comprises 16 items describing somatic symptoms that are infrequently observed in medical patient populations. A high score suggests feigning. Both the Fp-r and Fs scales differentiated symptom producers from the other two groups, showing that, unlike symptom producers, accentuators did not feign psychotic psychopathology and did not inflate somatic symptoms more frequently than consistent participants did.

The FBS-r scale could distinguish between consistent respondents and feigners, regardless of whether they exaggerated or fabricated their symptoms. The scale was designed for application in a forensic, rather than a clinical, context, and it comprises 31 items that describe somatic and cognitive symptoms rarely reported by personal injury claimants. A high score is associated with overreporting. Specifically, Fs and FBS-r focus on detecting non-credible somatic and/or neurocognitive complaints [66]. Our classification results were consistent with the interpretative guidelines recommended by Ben-Porath and Tellegen [67,68] with regard to suspected symptom exaggeration at T-scores of 90 for F-r, 70 for Fp-r, and 80 for Fs and FBS-r. Similar conclusions were reached by Wygant et al. [69] and Gervais et al. [70] for an RBS cut-off value of 80.
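The interpretive T-score thresholds cited above can be collected into a single screening check. This is a minimal sketch under stated assumptions: the threshold values come from the guidelines referenced in the text (90 for F-r, 70 for Fp-r, 80 for Fs, FBS-r, and RBS), while the function name and the dictionary-based interface are illustrative, not part of any published scoring procedure.

```python
# Interpretive T-score thresholds for suspected symptom exaggeration,
# as cited in the text (Ben-Porath & Tellegen; Wygant et al.; Gervais et al.).
OVERREPORTING_THRESHOLDS = {"F-r": 90, "Fp-r": 70, "Fs": 80, "FBS-r": 80, "RBS": 80}

def flag_elevated_scales(t_scores):
    """Return the validity scales whose T-score meets or exceeds its threshold.

    `t_scores` maps scale names to T-scores; scales missing from the
    input are simply treated as not elevated.
    """
    return [scale for scale, cutoff in OVERREPORTING_THRESHOLDS.items()
            if t_scores.get(scale, 0) >= cutoff]
```

For example, a hypothetical profile of {"F-r": 95, "Fp-r": 60, "Fs": 82, "FBS-r": 75, "RBS": 80} would flag F-r, Fs, and RBS as elevated, prompting closer scrutiny of the symptom presentation rather than an automatic malingering determination.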

Finally, the K-r scale could discriminate between consistent participants and symptom producers but could not differentiate either of these groups from accentuators.

Conclusion

This preliminary study yielded encouraging results, highlighting that some scales of the SIMS (TS, NI, and LI) and MMPI-2-RF (F-r and RBS) were able to discriminate between consistent participants, accentuators, and symptom producers. The idea behind this research was to identify objective indicators not only to discriminate between consistent and inconsistent test-takers, but also to distinguish between different degrees of inconsistency (exaggeration vs. fabrication), with the aim of offering practitioners and clinicians an empirically based tool for their assessments.

One of the main limitations of the study is the sample size: a small sample increases the likelihood of a type II error skewing the results. Another limitation is that the mental health professionals’ evaluations used in this study involve a degree of subjectivity; therefore, they could partly reflect the biases and beliefs of the professionals who took part in the research. A further limitation concerns the SIMS and the rationale behind its creation: the tool was developed for forensic screening, and it thereby covers a number of feigned dysfunctions commonly encountered in criminal proceedings (e.g., intellectual disability, psychotic disorder, amnestic syndromes), wherein defendants might aim at obtaining a diminished capacity plea. Accordingly, the SIMS concerns extreme dysfunctions with lower base rates outside of criminal settings (e.g., a damage evaluation setting), wherein milder and more moderate impairments that are not specifically addressed by the SIMS are more common [40].

Future research should use a computerized version of these tests, enabling researchers to also record behavioral indicators (e.g., reaction time, mouse trajectories); such data have proven useful in faking-good research [38,71–79]. Moreover, future research could implement experimental designs using other tests and questionnaires, exploring empirical differences in scenarios concerning not only feigned mental disorders but also feigned cognitive impairment.

Data Availability

A repository for the data has been created in Zenodo. It can be accessed via this link: https://doi.org/10.5281/zenodo.3548270

Funding Statement

The authors received no specific funding for this work.

References

  • 1.Capri P, Giannini A, Torbidone E, Del Vecchio S, Iecher F, Cesari G, et al. Linee guida per l'accertamento e la valutazione psicologico-giuridica del danno alla persona. Ordine degli Psicologi del Lazio. 2012 [Google Scholar]
  • 2.Cimino L, Vasapollo D. Considerazioni in merito all’uso dei test mentali nella quantificazione del danno biologico di natura psichica. Rivista di Criminologia, Vittimologia e Sicurezza. 2009;3: 49–59. [Google Scholar]
  • 3.American Psychiatric Association. Diagnostic and statistical manual of mental disorder, 5th edition. 2013.
  • 4.Mittenberg W, Patton C, Canyock EM, Condit DC. Base rates of malingering and symptom exaggeration. J Clin Exp Neuropsychol. 2002;24: 1094–1102. 10.1076/jcen.24.8.1094.8379 [DOI] [PubMed] [Google Scholar]
  • 5.Santamaría P, Ramírez PC, Ordi HG. Prevalencia de simulación en incapacidad temporal: Percepción de los profesionales de la salud. Clin Salud. 2013;24: 139–151. [Google Scholar]
  • 6.Sartori G, Zangrossi A, Orrù G, Monaro M. Detection of malingering in psychic damage ascertainment In: Ferrara S, editor. P5 medicine and justice. Springer; 10.1007/978-3-319-67092-8_21 (2017) [DOI] [Google Scholar]
  • 7.Chafetz M, Underhill J. Estimated costs of malingered disability. Arch Clin Neuropsychol. 2013;28: 633–639. 10.1093/arclin/act038 [DOI] [PubMed] [Google Scholar]
  • 8.Knoll J, Resnick PJ. The detection of malingered post-traumatic stress disorder. Psychiatr. Clin. North Am. 2006;29: 629–647. 10.1016/j.psc.2006.04.001 [DOI] [PubMed] [Google Scholar]
  • 9.Young G. Resource material for ethical psychological assessment of symptom and performance validity, including malingering. Psychological Injury and Law. 2014;7(3): 206–235. [Google Scholar]
  • 10.Rogers R, Bender S D. Clinical assessment of malingering and deception. 4th ed. Guilford Publications; 2018 [Google Scholar]
  • 11.Lipman FD. Malingering in personal injury cases. Temple L Rev. 1962;35: 141–162. [Google Scholar]
  • 12.Resnick PJ. The malingering of posttraumatic disorders In: Rogers R, editor. Clinical assessment of malingering and deception, second edition. New York, NY: Guildford Press; 1997. pp. 84–103. [Google Scholar]
  • 13.Halligan PW, Bass C, Oakley DA (Eds.). Malingering and illness deception. New York, NY: Oxford University Press; 2003. [Google Scholar]
  • 14.Rogers R. (1992). Structured interview of reported symptoms. Odessa, FL: Psychological Assessment Resources. [Google Scholar]
  • 15.Rogers R., Bagby R. M., & Dickens S. E. (1992). Structured interview of reported symptoms (SIRS) and professional manual. Odessa, FL: Psychological Assessment Resources. [Google Scholar]
  • 16.Rogers R, Sewell KW, Gillard ND. Structured Interview of Reported Symptoms (2nd ed.). Odessa, FL: Psychological Assessment Resources; 2010. [Google Scholar]
  • 17.Green D, Rosenfeld B. Evaluating the gold standard: a review and meta- analysis of the structured interview of reported symptoms. Psychological Assessment. 2011; 23: 95–107. 10.1037/a0021149 [DOI] [PubMed] [Google Scholar]
  • 18.Rogers R, Gillis J R, & Bagby R M. The SIRS as a measure of malingering: A validation study with a correctional sample. Behavioral Sciences and the Law. 1990; 8: 85–92. [Google Scholar]
  • 19.Green D, Rosenfeld B, Belfi B. New and improved? A comparison of the original and revised versions of the Structured Interview of Reported Symptoms. Assessment. 2013; 20: 210–218. 10.1177/1073191112464389 [DOI] [PubMed] [Google Scholar]
  • 20.Tombaugh TN. Test of Memory Malingering (TOMM). New York, NY: Multi Health Systems; 1996. [Google Scholar]
  • 21.Rees LM, Tombaugh TN, Gansler DA, Moczynski NP. Five validation experiments of the Test of Memory Malingering (TOMM). Psychol. Assess. 1998,10:10–20 [Google Scholar]
  • 22.Viglione DJ, Giromini L, Landis P. The development of the Inventory of Problems–29: A brief self-administered measure for discriminating bona fide from feigned psychiatric and cognitive complaints. J Pers Assess. 2017;99: 534–544. 10.1080/00223891.2016.1233882 [DOI] [PubMed] [Google Scholar]
  • 23.Giromini L, Viglione DJ, Pignolo C, Zennaro A. A Clinical Comparison, Simulation Study Testing the Validity of SIMS and IOP-29 with an Italian Sample. Psychological Injury and Law. 2018, 11: 340–350. [Google Scholar]
  • 24.Roma P, Giromini L, Burla F, Ferracuti S, Viglione DJ, & Mazza C. Ecological Validity of the Inventory of Problems-29 (IOP-29): an Italian Study of Court-Ordered, Psychological Injury Evaluations Using the Structured Inventory of Malingered Symptomatology (SIMS) as Criterion Variable. Psychol. Inj. and Law. 2019; 1–9. 10.1007/s12207-019-09368-4 [DOI] [Google Scholar]
  • 25.Giromini L, Barbosa F, Coga G, Azeredo A, Viglione DJ, Zennaro A. Using the inventory of problems– 29 (IOP-29) with the Test of Memory Malingering (TOMM) in symptom validity assessment: A study with a Portuguese sample of experimental feigners. Applied Neuropsychology: Adult. 2019, 1–13. 10.1080/23279095.2019.1570929 [DOI] [PubMed] [Google Scholar]
  • 26.Smith GP, Burger GK. Detection of malingering: Validation of the Structured Inventory of Malingered Symptoms (SIMS). J Am Acad Psychiatry Law. 1997;25: 183–189. [PubMed] [Google Scholar]
  • 27.Edens JF, Otto RK, Dwyer T. Utility of the Structured lnventory of Malingered Symptomatology in identifying persons motivated to malinger psychopathology. J Am Acad Psychiatry Law. 1999;27: 387–396. [PubMed] [Google Scholar]
  • 28.Rogers R, Jackson RL, Kaminski PL. Factitious psychological disorders: The overlooked response style in forensic evaluations. J Forensic Psychol Pract. 2005;5: 21–41. [Google Scholar]
  • 29.Jelicic M, Hessels A, Merckelebach H. Detection of feigned psychosis with the Structured Inventory of Malingered Symptomatology (SIMS): A study of coached and uncoached simulators. J Psychopathol Behav Assess. 2006;8: 19–22. [Google Scholar]
  • 30.Edens JF, Campbell JS, Weir JM. Youth psychopathy and criminal recidivism: A meta-analysis of the psychopathy checklist measures. Law Hum Behav. 2007;31: 53–75. 10.1007/s10979-006-9019-y [DOI] [PubMed] [Google Scholar]
  • 31.Poythress NG, Edens JF, Watkins MM. The relationship between psychopathic personality features and malingering symptoms of major mental illness. Law Hum Behav. 2001;25: 567–582. 10.1023/a:1012702223004 [DOI] [PubMed] [Google Scholar]
  • 32.Heinze MC, Purisch AD. Beneath the mask: Use of psychological tests to detect and subtype malingering in criminal defendants. J Forensic Psychol Pract. 2001;1: 23–52. [Google Scholar]
  • 33.Clegg C, Fremouw W, Mogge N. Utility of the Structured Inventory of Malingered Symptomatology (SIMS) and the Assessment of Depression Inventory (ADI) in screening for malingering among outpatients seeking to claim disability. J Forens Psychiatry Psychol. 2009;20: 239–254. [Google Scholar]
  • 34.Ben-Porath YS, Tellegen A. The Minnesota Multiphasic Personality Inventory–2 Restructured Form: Manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press; 2008. [Google Scholar]
  • 35.Sharf AJ, Rogers R, Williams MM, Henry SA. The Effectiveness of the MMPI-2-RF in Detecting Feigned Mental Disorders and Cognitive Deficits: a MetaAnalysis. Journal of Psychopathology and Behavioral Assessment. 2017; 39: 441–455. [Google Scholar]
  • 36.Monaro M, Gamberini L, Zecchinato F, Sartori G. False identity detection using complex sentences. Front Psychol. 2018;9: 283 10.3389/fpsyg.2018.00283 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Monaro M, Toncini A, Ferracuti S, Tessari G, Vaccaro MG, De Fazio P, et al. The detection of malingering: A new tool to identify made-up depression. Front Psychiatry. 2018;9: 249 10.3389/fpsyt.2018.00249 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Mazza C, Monaro M, Orrù G, Burla F, Colasanti M, Ferracuti S and Roma P. Introducing machine learning to detect personality faking-good in a male sample: A new model based on Minnesota Multiphasic Personality Inventory-2 restructured form scales and reaction times. Front. Psychiatry. 2019;10: 389 10.3389/fpsyt.2019.00389 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Widows MR, Smith GP. Structured Inventory of Malingered Symptomatology. Odessa, FL: Psychological Assessment Resources; 2005. [Google Scholar]
  • 40.van Impelen A, Merckelbach H, Jelicic M, Merten T. The Structured Inventory of Malingered Symptomatology (SIMS): A systematic review and meta-analysis. Clin Neuropsychol. 2014;28: 1336–1365. 10.1080/13854046.2014.984763 [DOI] [PubMed] [Google Scholar]
  • 41.Wisdom NM, Callahan JL, Shaw TG. Diagnostic utility of the Structured Inventory of Malingered Symptomatology to detect malingering in a forensic sample. Arch Clin Neuropsychol. 2010;25: 118–125. 10.1093/arclin/acp110 [DOI] [PubMed] [Google Scholar]
  • 42.La Marca S, Rigoni D, Sartori G, Lo Priore C. Structured Inventory of Malingered Symptomatology (SIMS): manuale Adattamento italiano. Firenze: Giunti O.S.; 2011. [Google Scholar]
  • 43.Tellegen A, Ben-Porath YS, McNulty JL, Arbisi PA, Graham JR, Kaemmer B. The MMPI–2 Restructured Clinical (RC) scales: Development, validation and interpretation. Minneapolis: University of Minnesota Press; 2003 [Google Scholar]
  • 44.Sirigatti S, Faravelli C. MMPI-2 RF: Adattamento italiano Taratura, proprietà psicometriche e correlati empirici. Firenze: Giunti O.S.; 2012. [Google Scholar]
  • 45.Scheffé H. Analysis of variance. London: John Wiley & Sons; 1959. [Google Scholar]
  • 46.Pierce CA, Block RA, Aguinis H. Cautionary note on reporting eta-squared values from multifactor ANOVA designs. Educ Psychol Meas. 2004;64: 916–924. [Google Scholar]
  • 47.Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten I. The WEKA data mining software: An update. SIGKDD Explor Newsl. 2009;11: 10–18. [Google Scholar]
  • 48.Baker M. Reproducibility: Seek out stronger science. Nature. 2016;537: 703–704. [Google Scholar]
  • 49.Browne MW. Cross-validation methods. J Math Psychol. 2000;44: 108–132. 10.1006/jmps.1999.1279 [DOI] [PubMed] [Google Scholar]
  • 50.Szucs D, Ioannidis JPA. Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biol. 2017;15(3). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Cumming G. Replication and p intervals: p values predict the future only vaguely, but confidence intervals do much better. Perspect Psychol Sci. 2008;3: 286–300. 10.1111/j.1745-6924.2008.00079.x [DOI] [PubMed] [Google Scholar]
  • 52.Bokhari E, Hubert L. The lack of cross-validation can lead to inflated results and spurious conclusions: A re-analysis of the MacArthur violence risk assessment study. J Classif. 2018;35: 147–171. [Google Scholar]
  • 53.Pace G, Orrù G, Monaro M, Gnoato F, Vitaliani R, Boone KB, Gemignani A, Sartori G. Malingering detection of cognitive impairment with the b test is boosted using machine learning. Front. Psychol. 2019;10: 1650 10.3389/fpsyg.2019.01650 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Boone KB, Lu P, Herzberg D. The b-test manual. Los Angeles: Western Psychological Service; 2002. [Google Scholar]
  • 55.Hall MA. Correlation-based Feature Selection for Machine Learning. The University of Waikato, Hamilton, 1999. [Google Scholar]
  • 56.Bouckaert RR, Frank E, Hall M, Kirkby R, Reutemann P, Seewald A, et al. WEKA manual for version 3-7-8; 2013. [Google Scholar]
  • 57.Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn. 1993;11: 63–90. [Google Scholar]
  • 58.Paulhus DL. Socially desirable responding: The evolution of a construct In: Braun HI, Jackson DN, Wiley DE, editors. The role of constructs in psychological and educational measurement. Mahwah, NJ: Lawrence Erlbaum; 2002. pp. 49–69. [Google Scholar]
  • 59.Benge JF, Wisdom NM, Collins RL, Franks R, Lemaire A, Chen DK. Diagnostic utility of the Structured Inventory of Malingered Symptomatology for identifying psychogenic non-epileptic events. Epilepsy Behav. 2012;24: 439–444. 10.1016/j.yebeh.2012.05.007 [DOI] [PubMed] [Google Scholar]
  • 60.Giger P, Merten T, Merckelbach H, Oswald M. Detection of feigned crime-related amnesia: A multi–method approach. J Forensic Psychol Pract. 2010;10: 440–463. [Google Scholar]
  • 61.Widder B. Beurteilung der beschwerdenvalidität In: Widder B, Gaidzik PW, editors. Begutachtung in der neurologie (2nd ed.). Stuttgart: Thieme; 2011. pp. 64–92. [Google Scholar]
  • 62.Sellbom M, Bagby MR. Detection of overreported psychopathology with the MMPI-2-RF form validity scales. Psychol Assess. 2010;22: 757–767. 10.1037/a0020825 [DOI] [PubMed] [Google Scholar]
  • 63.Wygant DB, Ben-Porath YS, Arbisi PA. Development and initial validation of a scale to detect infrequent somatic complaints. Poster session presented at the 39th Annual Symposium on Recent Developments of the MMPI–2/MMPI–A, Minneapolis, MN. 2004 (May).
  • 64.Wygant DB, Anderson JL, Sellbom M, Rapier JL, Allgeier LM, Granacher RP. Association of the MMPI–2 restructured form (MMPI–2–RF) validity scales with structured malingering criteria. Psychol Inj Law. 2011;4: 13–23. [Google Scholar]
  • 65.Rogers R. Clinical assessment of malingering and deception (3rd ed.). New York, NY: Guilford Press; 2008. [Google Scholar]
  • 66.Wygant DB, Ben-Porath YS, Arbisi PA, Berry DTR, Freeman DB, Heilbronner RL. Examination of the MMPI–2 Restructured Form (MMPI–2-RF) validity scales in civil forensic settings: Findings from simulation and known-group samples. Arch Clin Neuropsychol. 2009;24: 671–680. 10.1093/arclin/acp073 [DOI] [PubMed] [Google Scholar]
  • 67.Ben-Porath YS. Interpreting the MMPI-2-RF. University of Minnesota Press; 2012, 245–250. [Google Scholar]
  • 68.Ben-Porath YS, Tellegen A. MMPI-2-RF Manuale di istruzioni Adattamento italiano a cura di Sirigatti, S., & Casale, S. (2012), Giunti OS, Firenze; 2011, 49–56. [Google Scholar]
  • 69.Wygant DB, Sellbom M, Gervais RO, Ben-Porath YS, Stafford KP, Freeman DB, et al. Further validation of the MMPI-2 and MMPI-2-RF response bias scale: Findings from disability and criminal forensic settings. Psychol Assess. 2010;22: 745–756. 10.1037/a0020042 [DOI] [PubMed] [Google Scholar]
  • 70.Gervais RO, Ben-Porath YS, Wygant DB, Sellbom M. Incremental validity of the MMPI-2-RF over-reporting scales and RBS in assessing the veracity of memory complaints. Arch Clin Neuropsychol. 2010;25: 274–284. 10.1093/arclin/acq018 [DOI] [PubMed] [Google Scholar]
  • 71.Roma P, Verrocchio MC, Mazza C, Marchetti D, Burla F, Cinti ME, et al. Could time detect a faking-good attitude? A study with the MMPI-2-RF. Front. Psychol. 2018;9: 1064 10.3389/fpsyg.2018.01064 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Roma P, Mazza C, Mammarella S, Mantovani B, Mandarelli G, Ferracuti S. Faking-good behavior in self-favorable scales of the MMPI-2. Eur J Psychol Assess. 2019; 1–9. [Google Scholar]
  • 73.Roma P, Mazza C, Ferracuti G, Cinti ME, Ferracuti S, Burla F. Drinking and driving relapse: Data from BAC and MMPI-2. PLoS ONE. 2019;14: 1 10.1371/journal.pone.0209116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Mazza C, Burla F, Verrocchio MC, Marchetti D, Di Domenico A, Ferracuti S, Roma P. MMPI-2-RF Profiles in Child Custody Litigants. Front. Psychiatry. 2019;10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Burla F, Mazza C, Cosmo C, Barchielli B, Marchetti D, Verrocchio MC, & Roma P. Use of the Parents Preference Test in Child Custody Evaluations: Preliminary Development of Conforming Parenting Index. Mediterranean Journal of Clinical Psychology. 2019;7(3). [Google Scholar]
  • 76.Roma P, Piccinni E, Ferracuti S. Using MMPI-2 in forensic assessment. Rassegna Italiana di Criminologia. 2016;10(2): 116–122. [Google Scholar]
  • 77.Roma P, Pazzelli F, Pompili M, Girardi P, Ferracuti S. Shibari: double hanging during consensual sexual asphyxia. Archives of Sexual Behavior. 2013;42(5): 895–900. 10.1007/s10508-012-0035-3 [DOI] [PubMed] [Google Scholar]
  • 78.Roma P, Ricci F, Kotzalidis GD, et al. MMPI-2 in child custody litigation: A comparison between genders. Eur J Psychol Assess. 2014;30(2): 110–116. 10.1027/1015-5759/a000192 [DOI] [Google Scholar]
  • 79.Verrocchio MC, Marchetti D, Roma P, Ferracuti S. Relational and psychological features of high-conflict couples who engaged in parental alienation. Ricerche di Psicologia. 2018;41(4): 679–692. [Google Scholar]

Decision Letter 0

Stephan Doering

1 Oct 2019

PONE-D-19-22331

SEARCHING FOR INDICATORS TO DISTINGUISH ACCENTUATORS FROM SIMULATORS: A PRELIMINARY STUDY ON MALINGERING SUBTYPES USING THE SIMS AND MMPI-2-RF

PLOS ONE

Dear Dr Roma,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Nov 15 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Stephan Doering, M.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

1. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Malingering detection is an important issue in clinical and forensic settings that should be based, not on clinical impressions or judgements, but on the exigencies of a reliable technique grounded on replicable empirical findings. In the evaluation, a multimethod approach is required combining clinical interviews and psychometric instruments, of which the MMPI is the most extensively used (Graham, 2011, Greene, 2011, McDermott, 2012, Osuna et al. 2015, Rogers et al., 2003, Vilariño et al. 2013). So, the aim of this study is very interesting. As the authors comment, it is a preliminary study. However, before it can be published it would be desirable to introduce important changes to improve this paper.

It would be desirable to modify the title since it is excessively generic and the study was conducted in individuals diagnosed with adjustment disorder with mixed anxiety and depressed mood.

The introduction requires better organization to maintain a logical sequence in explaining malingering. For example, in my opinion the paragraph on page 5, line 93 should be on page 4, line 79. Also, a better explanation of the instruments used to detect malingering should be included, with the corresponding references.

Also, a better structure is required for the materials and methods section. There are several questions not clarified in the paper. Information on the participants, sample size, aetiology of psychological damage, and reasons why they were assessed in the Laboratory of Clinical Psychology must be included. Logically, the distribution into each of the three established groups is subject to bias. Table 1 can be deleted as it does not provide more information than is included in the text.

The conclusions section should be improved and not repeating the results obtained.

Minor revisions:

In the results section, decimals must always be represented with points (Tables 3 and 4).

Page 17, line 371: To delete the initial ‘.

Reviewer #2: The manuscript entitled, "Searching for indicators to distinguish accentuators from simulators: a preliminary study using the SIMS and the MMPI 2 RF," uses an interesting methodology and advanced statistics to discern possible test markers of partial and full malingering. There are some difficulties with the terminology used that need to be addressed. The English is generally well written, but I do make some suggestions below. I have major concerns about the methods section as written, the results, and the conclusions, but all should be correctable.

Title. Difficulties that I have with the title illustrate some of the major points of correction for the paper. Remove "Searching for." Calling the symptom exaggeration group accentuators is fine. Simulators makes sense as presented, but the research design in this field uses simulators compared to controls, with simulators given instructions to feign exaggeration for monetary gain. Therefore, it might be best to find another term, such as symptom producers. If so, perhaps the accentuators could be called symptom accentuators. Better yet, try partial symptom accentuators compared to full symptom producers. I would not use partial and full malingerers as the terms for the following reason. It is not clear to me that the study is actually getting at malingering. There is no valid basis for attributing malingering in full or in part based on the types of inconsistencies used to differentiate the groups. Of course, the goal is to have the study lead to better indicators of full and partial malingerers, and that could go as the long-term objective of the research program along the indicated lines, but this first study is aimed at getting initial test markers of different degrees of inconsistency as defined operationally by the study. Granted, there are three groups of patients in the study. The first is called presumably honest. I suggest referring to this group as Fully Consistent or Consistent, and the other two groups as Moderately Inconsistent and Excessively Inconsistent. Continuing with the title, use pilot study, not preliminary study. Use inconsistency subtypes rather than malingering subtypes.

Abstract. Aside from changes occasioned by the above comments, use "two types of inconsistent behavior that speak to possible malingering." Use "differentiate the groups of [your 3 new labels]." Use undergone instead of requested. Use psychiatric/psychological damage instead of psychic damage. Psychic has a particular meaning in English that does not apply here. Use "scales discriminated among [your 3 labels]."

Introduction. Aside from whatever else applies related to my comments about the title and abstract, for line 55, use "Forensic evaluators are trained to evaluate whether evaluees." For that paragraph, the DSM-5 definition of malingering is not the standard of practice one because of its mention of antisocial PD and medico-legal context. You can safely remove lines 59-62 on that. For line 69, I would also check Rogers 2018 book, mentioned in the discussion. Here is a major correction. For the paragraph beginning on line 70, the literature presented is biased toward finding an elevated base rate of malingering in the forensic disability and related context. The Rogers 2018 book referred to the review by Young on the matter in a complementary way. That article was written in 2014 in Psychological Injury and Law, referring to the base rate in the field as 15 +/- 15 % instead of the exaggerated 40 +/- 10%. If you do not want to cite that article, at least cite the references therein that led to the conclusions offered. Start a new paragraph at line 79, and it should begin with "Researchers have." On line 87, refer to the SIRS-2, as well as the SIRS. Line 93. "classifications of and tests related to malingering." Line 110. There are no hypotheses. This seems a serious oversight. Acknowledge such and the reasons why. Presumably the tests were chosen because they were the best available for the task, and there were inklings about which scales would work best. The SIMS total scale makes sense in this regard. As far as I know, the most sensitive scale for the question at hand for the MMPI 2 RF should be the Fp-r, unlike what is suggested later in the paper. There should be a paragraph on the sensitivity of these different scale scores and which ones were expected to be most discriminative. If there were no inklings this way, then indicate such. Were any stats used that have to consider 2-tailed vs. one-tailed testing?

Methods. You refer to Participants but also subjects later on. Check the APA publication manual on this. PS The new one is coming out in October. Line 116. Certified. Use clinical. Line 122 and throughout. Explain exactly what you mean by blind, and in a separate sentence. Line 123. Confirming. Use establishing. First. Second. Third. Add (a) (b) (c). Line 124. Expert on. Use mental health professional with the required training to diagnose." In the English legal world, expert refers to a court designated expert on a matter. Congruences a, b, c. I am requesting a major addition. These congruences need to be explained much better. Were there pre-established criteria; give examples, how often were there disagreements that had to be resolved, etc. Line 131. Conclusion. Use determination. Line 133. conclusions. Line 193. T not t. Here and throughout. Line 194. What exactly was the p after the Bonferroni adjustment?

Results. Line 204. Should the p not stop at 0? And throughout. Table 3. SD not DS. Asterisks a, b, c are not defined. Sometimes the European comma is used, sometimes the American period. Use the period. The use of the Total SIMS score in the same analysis as the five subscales introduces collinearity. Perhaps run checking stats or run another analysis, adjusting p accordingly. Table 4, aside from what has been mentioned for Table 3: L-r is included here, but not mentioned in the text to this point. Moreover, as mentioned in the discussion, it is not part of an expected parameter for the question at hand. Remove and redo the analysis. Line 238. Page number for the quote. Line 252. Reference for the b test, please. Line 259. Awkward title. Line 262. KPI not defined. Line 265. because, not as. Line 265. Larger, not bigger. Line 267-268. Explain which tests the predictors come from; for example, psychosis; also, is it not amnestic memory rather than memory disorder? Line 269. entrained instead of trained, not sure about that though. Line 274. F1 not defined. Line 283. I am not an expert on ML and I suspect the same for many readers. Please explain how predictors and classification efficiency variables are chosen in ML. How is it possible that the SIMS variables were best for the latter? How much better were they than the MMPI 2 RF ones, given that this test is the better one to use in forensic disability assessments? Generally, for all statistical testing, were distributions of variables tested for normality, homoscedasticity, etc., and were the tests robust to these considerations?

Discussion. Line 302. The Paulus definition should go at the beginning. It illustrates that all that may be investigated is degree of response bias, rather than anything like malingering per se. The authors should check Rogers and Bender (2018) for all relevant terms and indicate which applies best. Also, check their design suggestions and comment in a future research section, which was not attempted. Line 304. Among, not between. Line 306. Here, the attribution of malingering is given directly. The evidence does not support using this term for those groups, as indicated right from the start of my review. Line 313. The 14 cut-off value for the SIMS in the manual should no longer be used because of its poorer psychometric properties compared to 17. I believe two studies in 2014 showed that. You had referred to other possible cut-offs too in your literature review of this test. Moreover, you do not mention cut-offs in any way on this test in the design or results section, so I am confused. You might have to rerun analyses. Line 315. Same problem. Perhaps under instead of over. Line 320. Better to use two. Line 324. but also. And throughout. Differentiated. Use in differentiating. Line 325. There's the malingering word again. Line 334. accentuators from simulators, or whatever new labels you use. Line 338. distinguish, not separate. Line 385. Fs, not FS. Line 388. Inappropriate cut-scores. Check the manual carefully. These levels are suspect only at the second or third levels, not the first. I know you want to help catch people who take advantage of the system, like we all do. But it should not be at the expense of potentially honest respondents by suggesting modification of more stringent criteria, even indirectly. Please revise accordingly. Elaborate more on the limitations of the study. How can future research carry on with the question at hand? Overall, many suggestions, but all doable. Thanks for submitting.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: GERALD YOUNG

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2019 Dec 30;14(12):e0227113. doi: 10.1371/journal.pone.0227113.r002

Author response to Decision Letter 0


23 Nov 2019

Reviewer 1

Malingering detection is an important issue in clinical and forensic settings that should be based, not on clinical impressions or judgements, but on the exigencies of a reliable technique grounded on replicable empirical findings. In the evaluation, a multimethod approach is required combining clinical interviews and psychometric instruments, of which the MMPI is the most extensively used (Graham, 2011, Greene, 2011, McDermott, 2012, Osuna et al. 2015, Rogers et al., 2003, Vilariño et al. 2013). So, the aim of this study is very interesting. As the authors comment, it is a preliminary study. However, before it can be published it would be desirable to introduce important changes to improve this paper.

Thank you for reviewing our manuscript. In the following points, you will find our answers to your suggestions. We have also highlighted our revisions in the text in yellow.

1. It would be desirable to modify the title since it is excessively generic and the study has been conducted in individuals diagnosed of adjustment disorder with mixed anxiety and depressed mood.

1. In accordance with your suggestion and the advice of Reviewer 2, we have changed the title as follows: “Indicators to distinguish symptom accentuators from symptom producers in individuals with a diagnosed adjustment disorder: A pilot study on inconsistency subtypes using SIMS and MMPI-2-RF”.

2. The introduction requires a better organization to keep a sequence in explaining malingering. For example, in my opinion paragraph in page 5, line 93 should be on page 4 line 79. Also, a better explanation on instruments to detect malingering should be included with the correspondent references.

2. We thank the reviewer for this comment. We agree that the suggested organization is clearer in offering an explanation of malingering and we have re-organized the introduction accordingly. We also added an explanation of instruments to detect malingering, as requested.

3. Also, a better structure is required for the materials and methods section. There are several questions not clarified in the paper. Information on the participants, sample size, aetiology of psychological damage, and reasons why they were assessed in the Laboratory of Clinical Psychology must be included. Logically, the distribution into each of the three established groups is subject to bias.

3. We have added all the required information, especially about the participants, in the Materials and Methods section.

4. Table 1 can be deleted as it does not provide more information than is included in the text.

4. We have deleted Table 1, as suggested.

5. The conclusions section should be improved and not repeating the results obtained.

5. We agree with the reviewer and have improved the conclusions section by including the study’s limitations and future directions.

Minor revisions

1. In the results section, at all times the decimals must be represented with points (Tables 3 and 4).

1. We apologize for the mistake. We’ve replaced the commas with points in Tables 3 and 4.

2. Page 17, line 371: To delete the initial ‘.

2. We have deleted the initial ‘ on page 17 (line 371); thank you.

Reviewer 2

The manuscript entitled, "Searching for indicators to distinguish accentuators from simulators: a preliminary study using the SIMS and the MMPI 2 RF," uses an interesting methodology and advanced statistics to discern possible test markers of partial and full malingering. There are some difficulties with the terminology used that need to be addressed. The English is generally well written, but I do make some suggestions below. I have major concerns about the methods section as written, the results, and the conclusions, but all should be correctable.

Thank you very much for reviewing our manuscript and for appreciating our preliminary work. We have revised the manuscript to address your suggestions and concerns. In the following points, you will find our answers. We have also highlighted our revisions in the text in yellow.

1. Title

Difficulties that I have with the title illustrate some of the major points of correction for the paper. Remove "Searching for." Calling the symptom exaggeration group accentuators is fine. Simulators makes sense as presented, but the research design in this field uses simulators compared to controls, with simulators given instructions to feign exaggeration for monetary gain. Therefore, it might be best to find another term, such as symptom producers. If so, perhaps the accentuators could be called symptom accentuators. Better yet, try partial symptom accentuators compared to full symptom producers. I would not use partial and full malingerers as the terms for the following reason. It is not clear to me that the study is actually getting at malingering. There is no valid basis for attributing malingering in full or in part based on the types of inconsistencies used to differentiate the groups. Of course, the goal is to have the study lead to better indicators of full and partial malingerers, and that could go as the long-term objective of the research program along the indicated lines, but this first study is aimed at getting initial test markers of different degrees of inconsistency as defined operationally by the study. Granted, there are three groups of patients in the study. The first is called presumably honest. I suggest referring to this group as Fully Consistent or Consistent, and the other two groups as Moderately Inconsistent and Excessively Inconsistent. Continuing with the title, use pilot study, not preliminary study. Use inconsistency subtypes rather than malingering subtypes.

1. We thank the reviewer for this valuable suggestion. We changed the title accordingly: “Indicators to distinguish symptom accentuators from symptom producers in individuals with a diagnosed adjustment disorder: A pilot study on inconsistency subtypes using SIMS and MMPI-2-RF”. Following your advice, we also adapted the running head: “Distinguish symptom accentuators from symptom producers”.

2. Abstract

Aside from changes occasioned by the above comments, use "two types of inconsistent behavior that speak to possible malingering." Use "differentiate the groups of [your 3 new labels]." Use undergone instead of requested. Use psychiatric/psychological damage instead of psychic damage. Psychic has a particular meaning in English that does not apply here. Use "scales discriminated among [your 3 labels]."

2. We adjusted the abstract following your suggestions. We used “symptom accentuators,” “symptom producers,” and “consistent participants” as labels, in order to facilitate readers’ identification of the research groups.

3. Introduction

3.1 Aside from whatever else applies related to my comments about the title and abstract, for line 55, use "Forensic evaluators are trained to evaluate whether evaluees."

3.2 For that paragraph, the DSM-5 definition of malingering is not the standard of practice one because of its mention of antisocial PD and medico-legal context. You can safely remove lines 59-62 on that.

3.1-3.2 We adjusted the Introduction section accordingly.

3.3 For line 69, I would also check Rogers 2018 book, mentioned in the discussion. Here is a major correction.

3.4 For the paragraph beginning on line 70, the literature presented is biased toward finding an elevated base rate of malingering in the forensic disability and related context. The Rogers 2018 book referred to the review by Young on the matter in a complementary way. That article was written in 2014 in Psychological Injury and Law, referring to the base rate in the field as 15 +/- 15 % instead of the exaggerated 40 +/- 10%. If you do not want to cite that article, at least cite the references therein that led to the conclusions offered.

3.3-3.4 We thank the reviewer for pointing this out. We checked Rogers’ book (2018) for both epidemiology and labels.

3.5 Start a new paragraph at line 79, and it should begin with "Researchers have."

3.6 On line 87, refer to the SIRS-2, as well as the SIRS.

3.7 Line 93. "classifications of and tests related to malingering."

3.5-3.7 We have re-organized the introduction section. We referred to the SIRS-2, as well as the SIRS.

3.8 Line 110. There are no hypotheses. This seems a serious oversight. Acknowledge such and the reasons why.

3.8 The reviewer is right. We did not specify hypotheses in order to give our research a more exploratory framework. However, following your suggestion, we have detailed H1 and H2 at the end of the Introduction section.

3.9 Presumably the tests were chosen because they were the best available for the task, and there were inklings about which scales would work best. The SIMS total scale makes sense in this regard. As far as I know, the most sensitive scale for the question at hand for the MMPI 2 RF should be the Fp-r, unlike what is suggested later in the paper. There should be a paragraph on the sensitivity of these different scale scores and which ones were expected to be most discriminative. If there were no inklings this way, then indicate such. Were any stats used that have to consider 2-tailed vs. one-tailed testing?

3.9 We thank the reviewer for this suggestion. We added paragraphs with these data for both instruments.

4. Methods

4.1 You refer to Participants but also subjects later on. Check the APA publication manual on this. PS The new one is coming out in October.

4.1 We thank the reviewer; we checked the difference between “subjects” and “participants” on the APA website (https://apastyle.apa.org/learn/faqs/subjects-and-participants) and changed the labels accordingly.

4.2 Line 116. Certified. Use clinical.

4.3 Line 122 and throughout. Explain exactly what you mean by blind, and in a separate sentence.

4.4 Line 123. Confirming. Use establishing. First. Second. Third. Add (a) (b) (c).

4.5 Line 124. Expert on. Use mental health professional with the required training to diagnose." In the English legal world, expert refers to a court designated expert on a matter.

4.2-4.5 We explained what we mean by “blind”: “Until the end of this evaluation step, the two mental health professionals did not have any knowledge of the assessment made by the other colleague on the same participant (i.e., blind procedure).” We also modified the Methods section in accordance with the other suggestions.

4.6 Congruences a, b, c. I am requesting a major addition. These congruences need to be explained much better. Were there pre-established criteria; give examples, how often were there disagreements that had to be resolved, etc.

4.6 We agree with the reviewer’s observation and have expanded the description of the congruence evaluation process with examples. We also gave more details about the frequency of disagreements between our two experts.

4.7 Line 131. Conclusion. Use determination.

4.8 Line 133. conclusions.

4.9 Line 193. T not t. Here and throughout.

4.7-4.9 We changed the Methods section accordingly.

4.10 Line 194. What exactly was the p after the Bonferroni adjustment?

4.10 We have inserted a column with the p after the Bonferroni adjustment for each F test, both for Table 3 and Table 4 (MANCOVA analyses).

4.11 Results. Line 204. Should no the p not stop at 0? And throughout.

4.12 Table 3. SD not DS. Asterisks a, b, c are not defined. Sometimes the European comma is used, sometimes the American period. Use the period.

4.11-4.12 We corrected the mistakes; we apologize for these oversights.

4.13 The use of the Total SIMS score in the same analysis as the five subscales introduces collinearity. Perhaps run checking stats or run another analysis, adjusting p accordingly.

4.13 We thank the reviewer for raising this concern. We checked the multicollinearity assumption using the correlation matrix, for both one- and two-tailed tests (please find the SPSS outputs attached). For every correlation, the results were below the suggested cutoff of .90 (Tabachnick & Fidell, 2012) and even below the more conservative value of .80.
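The check described here amounts to computing pairwise Pearson correlations among the scale scores and flagging any pair at or above the cutoff. A minimal sketch of that logic (the scale names and data below are hypothetical; the actual check was run in SPSS):

```python
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def collinearity_flags(scores, cutoff=0.90):
    """Return pairs of scales whose absolute correlation meets the cutoff.

    `scores` maps scale name -> list of participant scores (hypothetical data).
    """
    return [
        (a, b)
        for a, b in combinations(scores, 2)
        if abs(pearson_r(scores[a], scores[b])) >= cutoff
    ]
```

An empty result from `collinearity_flags` corresponds to the outcome reported above: no pair of predictors exceeds the .90 cutoff.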

4.14 Table 4, aside from what has been mentioned for Table 3, L-r is included here, but not mentioned in the text to this point. Moreover, as mentioned in the discussion, it is not part of an expected parameter for the question at hand. Remove and redo the analysis.

4.14 We agree with the reviewer’s remark. We have redone the analysis without the L-r scale.

4.15 Line 238. Page number for the quote.

4.16 Line 252. Reference for the b test, please.

4.17 Line 259. Awkward title.

4.18 Line 262. KPI not defined.

4.19 Line 265. because, not as.

4.20 Line 265. Larger, not bigger.

4.15-4.20 We made all the requested changes: we inserted the page number for the quote at line 238, added a reference for the b test, deleted the title at line 259 and changed the title at line 230, defined KPI, and used “because” and “larger” at line 265.

4.21 Line 267-268. Explain from which tests the predictors come from; for example, psychosis; also is it not amnestic memory and not memory disorder.

4.21 We inserted a detailed explanation of predictor selection in the Machine Learning Models paragraph.

4.22 Line 269. entrained instead of trained, not sure about that though.

4.22 The term “trained” is correct; it is standard machine learning terminology.

4.23 Line 274. F1 not defined.

4.23 We have added an extensive definition of F1 to the manuscript (see the note to Table 5).

4.24 Line 283. I am not an expert on ML and I suspect the same for many readers. Please explain how predictors and classification efficiency variables are chosen in ML. How is it possible that the SIMS variables were best for the latter? How much better were they than the MMPI 2 RF ones, given that this test is the better one to use in forensic disability assessments? Generally, for all statistical testing, were distributions of variables tested for normality, homoscedasticity, etc., and were the tests robust to these considerations?

4.24 Many thanks for your comment. Thanks to your observation, we noticed that we had mistakenly omitted the SIMS score from the list of variables selected by the feature selection procedure. It was an oversight; the total number of selected features was 11, including the SIMS score. This mistake generated the confusion between the predictors and the classification efficiency variables. Moreover, as we agree that the ML logic is complex for a non-expert reader, we have tried to clarify what ML is and how it works, including a deeper explanation of cross-validation and predictor selection. Please find below the most relevant changes we made in the manuscript to clarify the ML concepts:

“ML can be defined as the study and construction of algorithms that can learn information from a set of data (called a training set) and make predictions for a new set of data (called a test set). In other words, it consists of training one or more algorithms to predict outcomes without being explicitly programmed, using only the information learned from the training set.”

“K-fold cross-validation is a technique used to evaluate predictive models by repeatedly partitioning the original sample into a training set to train the model and a validation set to evaluate it. Specifically, in this paper we adopted a 10-fold cross-validation procedure: the original sample was randomly partitioned into 10 equal-sized subsamples, the folds. Of the 10 subsamples, a single subsample was retained as validation data for testing the model, and the remaining nine subsamples were used as training data. This process was repeated 10 times, with each of the 10 folds used exactly once as validation data. The results from the 10 folds were then averaged to produce a single estimate of prediction accuracy.”
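The 10-fold partitioning described in this passage can be sketched as follows (a minimal illustration only; the fold-assignment scheme and seed are hypothetical, not the exact procedure used in the paper):

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k near-equal-sized folds and
    yield (train_indices, validation_indices) pairs, one per fold, so that
    each fold serves exactly once as validation data."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal-size subsamples
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

The model is then trained and scored once per yielded split, and the k accuracy estimates are averaged into a single figure, as the passage describes.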

“The identification of the most informative attributes (or features, or predictors), called “feature selection,” is a widely used procedure in ML. Feature selection is a very powerful means of building a classification model that can detect accentuators and simulators as accurately as possible. In fact, it permits the removal of redundant and irrelevant features, increasing model generalization and reducing overfitting and noise in the data. In order to identify the most discriminating features for classification, we ran a trial-and-error procedure using a random forest model. This model consists of many decision trees, each built from a random extraction of observations from the dataset and a random extraction of features. The random extraction was repeated many times, finally selecting the set of features that maximized the model’s accuracy. The features selected at the top of the trees are generally more important than those selected at the end nodes, as the top splits typically produce larger information gains. Following this, we listed the most important features for classification accuracy.”

“ML models, such as those reported above, are difficult to interpret: the operations computed by the algorithm to identify a single participant as an accentuator, a simulator, or honest are unclear. To better understand the logic on which the classification results are based, a simpler model called OneR was run. This classifier is clearer in terms of the transparency of the operations computed by the algorithm, and it makes it easy to highlight the classification logic.”
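The OneR logic mentioned in this passage — a single rule built on the one feature that best predicts the class — can be sketched as follows (the feature names, values, and labels below are hypothetical, not the study’s actual variables):

```python
from collections import Counter, defaultdict


def one_r(rows, labels):
    """Fit a OneR classifier: for each feature, build a rule that maps each
    observed feature value to its majority class, then keep the single
    feature whose rule makes the fewest training errors.

    `rows` is a list of dicts (feature name -> discrete value);
    `labels` is the corresponding list of class labels.
    Returns a function classifying a new row with the winning rule.
    """
    best = None  # (feature, rule, error count)
    for feature in rows[0]:
        value_counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            value_counts[row[feature]][label] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in value_counts.items()}
        errors = sum(rule[row[feature]] != label
                     for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    feature, rule, _ = best
    return lambda row: rule.get(row[feature])
```

Because the whole classifier is one value-to-class lookup on one feature, the classification logic is fully transparent, which is the point the passage makes.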

5. Discussion

5.1 Line 302. The Paulus definition should go at the beginning. It illustrates that all that may be investigated is degree of response bias than anything like malingering, per se. The authors should check Rogers and Bender (2018) for all relevant terms and indicate which applies best. Also, check their design suggestions and comment in a future research section, which was not attempted.

5.1 We thank the reviewer for sharing the work of Rogers & Bender (2018), “Clinical Assessment of Malingering and Deception, 4th edition.” We found many interesting insights and ideas that we would like to implement in future research, as we have explained in the Conclusion section. Moreover, reading the section on “basic concepts and definitions” led us to choose the term feigning instead of malingering, as it is the correct word to describe the type of response style observed.

5.2 Line 304. Among, not between.

5.2 We complied with your request.

5.3 Line 306. Here, the attribution of malingering is given directly. The evidence does not support using this term for those groups, as indicated right from the start of my review.

5.3 See answer #5.1

5.4 Line 313. The 14 cut-off value for the SIMS in the manual should no longer be used because of its poorer psychometric properties compared to 17. I believe 2 studies in 2014 showed that. You had referred to other possible cut-offs too in your literature review of this test. Moreover, you do not mention cut-offs in any way on this test in the design or results section, so I am confused. You might have to rerun analyses.

5.5 Line 315. Same problem. Perhaps under instead of over.

5.4-5.5 Although we are aware of previous research findings in clinical settings [van Impelen et al. (2014); Rogers et al. (2014)], we decided to use the cut-off value of 14 because participants’ evaluations were conducted in a medico-legal setting in which we had to refer to the SIMS Italian technical manual [La Marca, S., Rigoni, D., Sartori, G., & Lo Priore, C. (2011), Giunti OS, Firenze, p. 17]. We decided to report the other cut-off scores in order to make readers aware of the existence and use of SIMS cut-offs higher than 14.

5.6 Line 320. Better to use two.

5.7 Line 324. but also. And throughout. Differentiated. Use in differentiating.

5.8 Line 325. There's the malingering word again.

5.9 Line 334. accentuators from simulators, or whatever new labels you use.

5.10 Line 338. distinguish, not separate.

5.11 Line 385. Fs, not FS.

5.6-5.11 We followed all of these suggestions and changed the terminology as clarified above.

5.12 Line 388. Inappropriate cut-scores. Check the manual carefully. These levels are suspect only and the second or third levels, not the first. I know you want to help catch people who take advantage of the system, like we all do. But it should not be at the expense of potentially honest respondents by suggesting modification of more stringent criteria, even indirectly. Please revise accordingly.

5.12 We checked these references again: a) Ben-Porath, Y.S. (2012), Interpreting the MMPI-2-RF. University of Minnesota Press, pp. 245-250, and b) Ben-Porath, Y.S., & Tellegen, A. (2011), MMPI-2-RF Manuale di istruzioni. Adattamento italiano a cura di Sirigatti, S., & Casale, S. (2012), Giunti OS, Firenze, pp. 49-56 [the Italian technical manual].

5.13 Elaborate more on the limitations of the study. How can future research carry on with the question at hand?

5.13 We thank the reviewer for this suggestion. We improved the Conclusions section by including the study’s limitations and future research directions.

Overall, many suggestions, but all doable. Thanks for submitting.

Thank you very much for all your inspiring and helpful suggestions.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Stephan Doering

6 Dec 2019

PONE-D-19-22331R1

INDICATORS TO DISTINGUISH SYMPTOM ACCENTUATORS FROM SYMPTOM PRODUCERS IN INDIVIDUALS WITH A DIAGNOSED ADJUSTMENT DISORDER: A PILOT STUDY ON INCONSISTENCY SUBTYPES USING SIMS AND MMPI-2-RF

PLOS ONE

Dear Dr. Roma,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by January 7, 2020. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Stephan Doering, M.D.

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper has been improved following the recommendations and I consider it suitable for publication.

Reviewer #2: the authors have responded to most of my points. There are a few left. Line 34; producers, and. 39; damage (or psychological/ psychiatric damage). 96. A study by. 96. scales'. 97; e.g., 98. sensitive. 106. FBS-r. 108. FBS-r. 114 set." 120. have on SIMS. 126. judge's order,. 128. Rome, which. 133-4; 30% were a. 135; episodes, and. 135; originated from domestic. 141; e.g., . 149; e.g., . 153; did not. 154; i.e., . 157; times, while; 157; 23 cases involved mental; 157; reaching. 158; on the third. Table 1; still there; 219; Bonferroni not explained; 237 and 253; letters still not clear; I did figure it out, though; 274; paper, we; 275; procedure, in which the; 277; sub-samples are used; 278; folds are then; 291; selection,". 291; means. 295; ran; 297; extraction is. 298; minimizes; 300; list. 318; the FI. 319; note that. 326; Accentuator, . 327; based, . 328; model, called OneR, was run. 329; to highlight easily. 335; score emerged the; 336; identified. 337; it seems the results mentioned in the prior sentence are not provided. And would these results change anything in the discussion? 347; accentuators, and. 350; scores among participants. 428; feign. 448; vs. . 452; subjectivity; therefore, they.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Gerald Young

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2019 Dec 30;14(12):e0227113. doi: 10.1371/journal.pone.0227113.r004

Author response to Decision Letter 1


9 Dec 2019

Reviewer 1

The paper has been improved following the recommendations and I consider it suitable for publication.

Thank you for reviewing our manuscript a second time and for considering it suitable for publication.

Reviewer 2

The authors have responded to most of my points. There are a few left.

Thank you for reviewing our manuscript again. In the following points, you will find our answers to your suggestions. We have also highlighted our revisions in the text in yellow.

Line 34; producers, and.

39; damage (or psychological/ psychiatric damage).

96. A study by.

96. scales'.

97; e.g.,

98. sensitive.

106. FBS-r.

108. FBS-r.

114 set."

120. have on SIMS.

126. judge's order,.

128. Rome, which.

133-4; 30% were a.

135; episodes, and.

135; originated from domestic.

141; e.g., .

149; e.g., .

153; did not.

154; i.e., .

157; times, while;

157; 23 cases involved mental;

157; reaching.

158; on the third. Table 1; still there;

219; Bonferroni not explained

237 and 253; letters still not clear; I did figure it out, though;

274; paper, we;

275; procedure, in which the;

277; sub-samples are used;

278; folds are then;

291; selection,".

291; means.

295; ran;

297; extraction is.

298; minimizes;

300; list.

318; the FI

319; note that.

326; Accentuator, .

327; based, .

328; model, called OneR, was run.

329; to highlight easily.

335; score emerged the;

336; identified.

337; it seems the results mentioned in the prior sentence are not provided. And would these results change anything in the discussion?

347; accentuators, and.

350; scores among participants.

428; feign.

448; vs. .

452; subjectivity; therefore, they.

We have made all the requested changes, thank you.

Line 158. We deleted Table 1 (“Group’s composition according to the criteria set out”) during the first revision, according to Reviewer 1’s suggestion. The table that is now entitled Table 1 describes the “Demographic Composition of the Three Research Groups” (line 172).

Line 219. Regarding the Bonferroni correction, SPSS automatically computes the p-value after the adjustment: “Confidence interval adjustment. Select least significant difference (LSD), Bonferroni, or Sidak adjustment to the confidence intervals and significance” (IBM SPSS Advanced Statistics 26, p. 15). Moreover, we have added this sentence to the “Statistical Analysis” section: “The Bonferroni correction was applied to adjust confidence intervals; SPSS-25 software (SPSS Inc., Chicago, IL) automatically corrected the p-value for the number of comparisons.”
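For readers working outside SPSS, the adjustment described above can be reproduced in a few lines. This is an illustrative sketch of the standard Bonferroni formula only, not the authors’ code or the SPSS implementation; the function name is hypothetical:

```python
def bonferroni(p_values):
    """Bonferroni-adjust raw p-values: multiply each by the number
    of comparisons, capping the result at 1.0."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

# With three comparisons, a raw p of 0.01 becomes 0.03, and values
# that would exceed 1 after adjustment are capped at 1.0
# (results subject to ordinary floating-point rounding).
print(bonferroni([0.01, 0.2, 0.5]))
```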

Line 337. For greater clarity, we have rephrased the sentence as follows: “According to the aforementioned classification rules, OneR identified cut-offs for the SIMS score (i.e., 13.5, 18.5, and 34.5) to distinguish symptom producers from symptom accentuators and consistent participants.”
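As an aside for readers unfamiliar with OneR, a single-feature threshold rule of the kind described above can be sketched as follows. The cut-off values are those reported in the response; the function name and the band-to-group mapping are hypothetical assumptions for illustration, not the published classification rule:

```python
def one_r_classify(score, cutoffs, labels):
    """OneR-style rule on a single feature: return the label of the
    first band whose upper cut-off the score does not exceed; the
    final label covers scores above all cut-offs."""
    for cutoff, label in zip(cutoffs, labels):
        if score <= cutoff:
            return label
    return labels[-1]

# SIMS Total Score cut-offs from the response; the mapping of bands
# to groups below is an assumption for illustration only.
cutoffs = [13.5, 18.5, 34.5]
labels = ["consistent", "accentuator", "accentuator", "producer"]
print(one_r_classify(12.0, cutoffs, labels))  # prints "consistent"
print(one_r_classify(40.0, cutoffs, labels))  # prints "producer"
```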

Attachment

Submitted filename: Responses to Reviewers 2nd revision.docx

Decision Letter 2

Stephan Doering

13 Dec 2019

INDICATORS TO DISTINGUISH SYMPTOM ACCENTUATORS FROM SYMPTOM PRODUCERS IN INDIVIDUALS WITH A DIAGNOSED ADJUSTMENT DISORDER: A PILOT STUDY ON INCONSISTENCY SUBTYPES USING SIMS AND MMPI-2-RF

PONE-D-19-22331R2

Dear Dr. Roma,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Stephan Doering, M.D.

Academic Editor

PLOS ONE

Acceptance letter

Stephan Doering

18 Dec 2019

PONE-D-19-22331R2

Indicators to distinguish symptom accentuators from symptom producers in individuals with a diagnosed adjustment disorder: A pilot study on inconsistency subtypes using SIMS and MMPI-2-RF  

Dear Dr. Roma:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Stephan Doering

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Responses to Reviewers 2nd revision.docx

    Data Availability Statement

    A repository for the data has been created in Zenodo. It can be accessed via this link: https://doi.org/10.5281/zenodo.3548270

