Abstract
Background
Patient Reported Outcomes Measurement Information System Depression Short-Form 6a (PROMIS-D-SF) and Patient Reported Outcomes Measurement Information System Anxiety Short-Form 6a (PROMIS-A-SF) are brief self-administered questionnaires designed to assess anxious and depressive symptoms in healthy and clinical populations. Their validity and their usefulness as screening tools to identify patients have yet to be examined, which the present paper aims to do.
Methods
Patients in the Mental Health Service Outpatient Clinics and healthy volunteers were invited to complete a survey that included the Danish translation of the PROMIS-D-SF, the PROMIS-A-SF, the Beck Depression Inventory, second edition (BDI-II), and the Beck Anxiety Inventory (BAI). We conducted a confirmatory factor analysis of the instruments’ previously proposed single-factor structures. We furthermore evaluated the construct validity of the PROMIS-D-SF and the PROMIS-A-SF by means of their relationship with the BDI-II and the BAI, respectively. Finally, we evaluated the utility of the PROMIS-D-SF and PROMIS-A-SF in identifying patient status by conducting receiver operating characteristic (ROC) curve analyses.
Results
Seventy healthy volunteers and 62 patients completed the instruments. Both the PROMIS-D-SF and the PROMIS-A-SF had a poor fit to a single-factor structure. Cronbach's alpha and McDonald’s omega showed good internal reliability for both instruments. PROMIS-D-SF score was positively correlated with the BDI-II (r=.89), and the PROMIS-A-SF score was positively correlated with the BAI (r=.90). ROC-analyses for both scales demonstrated good accuracy in detection of patient status.
Conclusions
The level of self-reported anxious and depressive symptoms is higher in patients with psychiatric illness than in healthy volunteers. Patients with borderline personality disorder had the most elevated depressive symptoms. Our data suggest that the PROMIS-D-SF and PROMIS-A-SF are not unidimensional measures, but that both instruments have good accuracy in detection of patient status. Examination of psychometric properties in patient populations with somatic disorders could be a natural next step.
Clinical trial number
Not applicable.
Keywords: PROMIS-D-SF, PROMIS-A-SF, Anxiety, Depression, Personality disorders, PRO
Background
The US National Institutes of Health (NIH) funded the Patient Reported Outcomes Measurement Information System (PROMIS) in 2004. PROMIS was developed to provide a license-free, psychometrically sound system of patient-reported outcomes (PROs) [1]. It is a battery of PROs designed to measure a wide range of symptoms and different aspects of functioning in adult as well as pediatric populations. The items of the PROMIS instruments originate from other instruments used to assess the construct in question (e.g., depression), which in PROMIS terminology are known as “legacy measures”. In contrast to legacy measures, which are often specific to a single diagnosis, PROMIS instruments are designed to be applicable across diagnoses. The PROMIS bank includes more than 300 instruments designed to measure symptoms and functioning, which relate to the domains of physical, mental, and social health [1].
Pilkonis et al. (2011) [2] developed the 28-item PROMIS Depression (PROMIS-D) and 29-item PROMIS Anxiety (PROMIS-A). The PROMIS-D also exists in several derivative short forms (4a, 6a, 8a, and 8b) with four to eight items (PROMIS-D-SF), and the PROMIS-A likewise exists in several derivative short forms (4a, 6a, 7a, and 8a) with four to eight items (PROMIS-A-SF). These instruments are all part of the larger Adult Self-Reported Health framework in PROMIS. The PROMIS-D-SF 6a and the PROMIS-A-SF 6a are the focus of this paper, and for simplicity we will from here on refer to them as the PROMIS-D-SF and the PROMIS-A-SF.
To the best of our knowledge, neither the PROMIS-D-SF, the PROMIS-A-SF, nor any other form of these instruments has been formally validated in a patient population.
When implementing patient-reported outcome measures (PROMs) clinically, reducing the number of items in each psychometric instrument can possibly help to secure adherence. It is therefore relevant to examine the psychometric properties of brief instruments such as the PROMIS-D-SF and the PROMIS-A-SF for future clinical use. Further, developing PROMs that quantify subjective distress, such as depressive and anxiety symptoms, which are transdiagnostic, i.e., observed in a range of mental health disorders as well as many somatic disorders, will strengthen research on the patient of tomorrow, who is characterized by comorbidity or multi-morbidity [3].
International guidelines call for screening for emotional disorders in primary care settings in order to improve patients’ quality of life and contain health care costs [4, 5]. To do so, primary care providers need access to tools that are valid, reliable, brief, and easily and freely accessible [6].
Mulvaney-Day et al. (2018) [7] conducted a systematic review of instruments designed to screen for common mental health illness in a primary care general practice setting. They found that the Patient Health Questionnaire, 9 items (PHQ-9) [8, 9] and the Generalized Anxiety Disorder scale, 7 items (GAD-7), had good to excellent abilities to screen for depression and anxiety when measured against a clinical interview.
Pilkonis et al. (2011) [2], however, suggested that the PROMIS Depression and Anxiety instruments should be explored for their utility as screening tools. In this paper, we investigate the ability of the PROMIS-D-SF and the PROMIS-A-SF to predict patient status, as well as the psychometric properties of the instruments. We do so in a Danish population of patients with emotional disorders and healthy adults. We first establish the psychometric validity of the Danish translations of the instruments, then explore and compare levels of emotional distress in these populations, and lastly examine the instruments’ sensitivity and specificity in identifying patient status.
To do so, we report (1) the PROMIS symptom scores in a psychiatric population with emotional disorders compared with healthy volunteers; (2) a confirmatory factor analysis of the new PROMIS scales and their respective fit to the proposed single-factor model; (3) the internal consistency reliability of the instruments; (4) the agreement between the PROMIS-D-SF and the legacy measure Beck Depression Inventory, second edition (BDI-II), and the agreement between the PROMIS-A-SF and the Beck Anxiety Inventory (BAI); and (5) the instruments’ utility for identifying patient status.
Methods
Setting and procedure
We included patients in outpatient secondary care clinics run by Region Zealand Mental Health Services (MHS). These clinics treat patients who have failed to respond to at least one line of treatment in the primary care sector. Patients are referred to these clinics by either general practitioners or private practice psychiatrists. Upon intake, the patients are assessed by a psychiatrist or specialist psychologist and diagnosed according to the ICD-10. Patients are thereafter offered treatment.
In these clinics, we recruited patients through posters in the waiting rooms of four outpatient clinics. They then participated through an online survey, from which the reported data was drawn.
The healthy volunteers were partly recruited from the staff of Region Zealand MHS, and partly through an online survey on social media.
The sample also contributed data for a validation of the Danish PROMIS Fatigue Short Form [10].
A priori hypothesis
We hypothesized, prior to data analysis, that psychiatric patients would report more symptoms of anxiety and depression than healthy volunteers, that both the PROMIS-D-SF and the PROMIS-A-SF would have a single-factor structure and good psychometric properties, and that both scales would be able to discriminate patients from healthy volunteers. We further hypothesized that the PROMIS-D-SF would show good agreement with the legacy instrument BDI-II, and that the PROMIS-A-SF would show good agreement with the BAI.
Online survey
The survey collected self-reported information regarding psychiatric diagnosis, age, and sex. After that, participants filled in questionnaires, among them the PROMIS-D-SF, PROMIS-A-SF, BDI-II, and the BAI. Data were collected between September 2021 and July 2022.
Ethical considerations
The study was, in accordance with local regulations, registered with the Danish Data Protection Agency Region Zealand (REG-048-2021). According to local guidelines and regulations, the survey study did not need approval by the Region Zealand Ethics Committee. Patients and healthy volunteers gave digital informed consent on the first page of the online survey. The study was carried out in accordance with the Declaration of Helsinki.
Instruments
PROMIS depression and anxiety short form 6a (PROMIS-D-SF and PROMIS-A-SF)
The PROMIS instruments are designed to capture valid, reliable, responsive, and precise patient-reported outcome (PRO) measures, with instruments that are freely accessible and designed to be used across disorders. The scales have a recall period of seven days. Both instruments have six items, each of which is rated on a five-point Likert scale ranging from 1 = Never to 5 = Always. The questionnaire score is reported as a T-score, with higher scores indicating greater symptom severity. We used a translation of the instrument by researchers at the Section of Social Medicine, Department of Public Health, University of Copenhagen [11]. The Danish version can be obtained from the translators of the instrument. Examples of PROMIS-D-SF items include “I felt worthless” and “I felt unhappy”, and examples of PROMIS-A-SF items include “I felt nervous” and “I felt afraid”.
The Beck Depression Inventory, second edition (BDI-II)
The Beck Depression Inventory, second edition (BDI-II) is an instrument specifically designed to assess the severity of symptoms associated with affective disorders. The second edition was developed in response to the American Psychiatric Association’s publication of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), which changed many of the diagnostic criteria for Major Depressive Disorder. The second edition is thus in line with depression as understood in the DSM-IV [12]. The BDI-II contains 21 items that are self-reported and are designed to be administered to adolescents ages 17 and older as well as adults. Each of the individual items is scored on a four-point scale from 0 to 3. Scores can range from 0 to 63 points, with higher total scores indicating more severe depressive symptoms [13]. We used a Danish translation provided by the publisher [14].
The Beck Anxiety Inventory (BAI)
The Beck Anxiety Inventory (BAI) is an instrument specifically designed to assess the severity of the symptoms associated with anxiety disorders [15]. It includes 21 items and is designed to be self-reported and to be administered to adolescents ages 17 and older as well as adults. Severity of anxiety symptoms is rated for the past week (including the day the instrument is completed). Each of the individual items is scored on a four-point Likert scale from 0 = Not at all to 3 = It bothered me a lot. Scores can range from 0 to 63 points, with higher total scores indicating more pronounced symptoms of anxiety [15]. We used a Danish translation by the publisher [16].
Statistical analyses
We undertook all data processing and analyses in R 4.3.0 (“Already Tomorrow”) and RStudio 2022.07.2+576 [17], using the psych 2.1.9 [18], lavaan 0.6–9 [19], and cutpointr [20] packages. We performed largely the same analyses as in our previous paper [10].
First, we calculated descriptive statistics. We transformed the simple sum scores on the PROMIS-D-SF and PROMIS-A-SF into T-scores by manually converting each sum score with the use of the conversion table provided in the official manual of each instrument [21, 22]. We tested for differences in scores between groups with t-tests, and calculated the mean difference (MD) and standard deviation (SD) between groups (Table 2).
Table 2.
Descriptive statistics
| | N | PROMIS-D-SF sum score mean [SD] | PROMIS-D-SF T-score mean [SD] | PROMIS-A-SF sum score mean [SD] | PROMIS-A-SF T-score mean [SD] |
|---|---|---|---|---|---|
| Total sample | 130 | 13.25 [6.85] | 53.68 [11.27] | 13.47 [6.45] | 55.74 [10.67] |
| Healthy volunteers | 66 | 8.96 [4.06] | 46.79 [8.28] | 9.24 [3.32] | 50.9 [7.27] |
| Patient population | 62 | 17.95 [6.16] | 61.25 [6.85] | 18.11 [5.82] | 63.01 [8.95] |
| Diagnosis | |||||
| Borderline PD | 16 | 21.31 [5.1] | 66.06 [9.05] | 19.38 [4.19] | 65.19 [5.57] |
| Anxiety disorders | 19 | 19.58 [5.4] | 63.78 [6.83] | 20.95 [4.47] | 68.6 [5.98] |
| Depression | 17 | 14.88 [5.85] | 56.67 [9.05] | 14.81 [6.44] | 58.14 [10.03] |
| Other | 11 | 14.91 [6.59] | 56.54 [10.82] | 16.18 [6.55] | 59.47 [11.49] |
Square brackets [] indicate standard deviations
Secondly, we examined ceiling and floor effects on a scale level, as suggested by McHorney and Tarlov (1995) [23]. We chose a proportion of ≥ 15% of respondents at either end of the scale as evidence of a ceiling or a floor effect [23].
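The 15% criterion can be illustrated with a minimal sketch. It is not the authors' R code; the function name `floor_ceiling` and the example scores are hypothetical, and the score range of 6 to 30 is assumed from six items each rated 1 to 5.

```python
from typing import Sequence


def floor_ceiling(sum_scores: Sequence[int], min_score: int = 6,
                  max_score: int = 30, threshold: float = 0.15) -> dict:
    """Proportion of respondents at the lowest and highest possible sum score.

    An effect is flagged when the proportion reaches the threshold
    (15% by default, following the criterion described in the text).
    """
    n = len(sum_scores)
    if n == 0:
        raise ValueError("sum_scores must not be empty")
    floor_prop = sum(1 for s in sum_scores if s == min_score) / n
    ceiling_prop = sum(1 for s in sum_scores if s == max_score) / n
    return {
        "floor_proportion": floor_prop,
        "ceiling_proportion": ceiling_prop,
        "floor_effect": floor_prop >= threshold,
        "ceiling_effect": ceiling_prop >= threshold,
    }


# Hypothetical sum scores for illustration only (not study data).
example_scores = [6, 6, 9, 12, 14, 15, 18, 21, 24, 30]
result = floor_ceiling(example_scores)
print(result)
```

In this invented example two of ten respondents score the minimum, so a floor effect would be flagged, while the single respondent at the maximum stays below the threshold.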
Thirdly, we performed confirmatory factor analyses (CFA) of the PROMIS-D-SF and PROMIS-A-SF to evaluate their respective fits to the single-factor model proposed for the unabridged versions of the PROMIS Anxiety and Depression [2]. To do so, we utilized the lavaan R package [19]. We used the WLSMV estimator and treated the data as ordered categorical variables. We further carried out multigroup analyses with two subgroups (patient status yes or no). As recommended by Kline (2015), we calculated the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), and the degrees of freedom (df) [24]. To evaluate these fit indices, we utilized the criteria set forth by Hu and Bentler, which suggest that an RMSEA smaller than 0.06, an SRMR smaller than 0.08, and a CFI and TLI larger than 0.95 indicate relatively good model-data fit [25]. The chi-square fit statistic is usually evaluated as the ratio of the chi-square statistic to its degrees of freedom (χ²/df) [26]; we interpreted a ratio smaller than 2 as indicating a superior fit to the data [27].
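As an illustration of how the cutoffs above are applied, the following sketch checks a set of fit indices against them. It only evaluates already estimated indices and does not fit a CFA; the function name `evaluate_fit` is hypothetical, and the input values are the single-group PROMIS-D-SF indices reported in the Results.

```python
def evaluate_fit(cfi: float, tli: float, rmsea: float, srmr: float,
                 chi2: float, df: int) -> dict:
    """Compare fit indices with the cutoffs described in the text."""
    ratio = chi2 / df
    return {
        "CFI > 0.95": cfi > 0.95,
        "TLI > 0.95": tli > 0.95,
        "RMSEA < 0.06": rmsea < 0.06,
        "SRMR < 0.08": srmr < 0.08,
        "chi2/df": round(ratio, 2),
        "chi2/df < 2": ratio < 2,
    }


# Single-group PROMIS-D-SF indices as reported in the Results section.
fit = evaluate_fit(cfi=0.939, tli=0.898, rmsea=0.209, srmr=0.038,
                   chi2=60.396, df=9)
for criterion, outcome in fit.items():
    print(f"{criterion}: {outcome}")
```

With these inputs only the SRMR criterion is met, which is in line with the description of a poor fit on most fit indices.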
Fourth, in order to evaluate the internal consistency reliability of the PROMIS-D-SF and the PROMIS-A-SF, we calculated Cronbach’s alpha (α), McDonald’s hierarchical omega (ωh), and the total omega (ωtotal). We first calculated these measures for the complete sample and then for the two subsamples (healthy volunteers and patients). We considered α above 0.70 [28, 29], ωh above 0.65, and ωtotal above 0.80 [30] as satisfactory.
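For readers unfamiliar with the coefficient, Cronbach's alpha can be sketched as below using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). This is an illustration with the Python standard library, not the psych package used in the study; the function name and the response matrix are hypothetical, and the omega coefficients, which require a factor model, are not covered.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of the respondents' sum scores.
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (4 respondents x 3 items), for illustration only.
example = [
    [1, 2, 1],
    [2, 3, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(example)
print(f"alpha = {alpha:.2f}, satisfactory: {alpha > 0.70}")
```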
Fifth, to evaluate the convergent validity of the PROMIS-D-SF, we calculated its correlation with a legacy measure of depression, the BDI-II total score. For the PROMIS-A-SF, we calculated its correlation with the BAI total score. We first calculated this for the complete sample and then for the two subsamples (healthy volunteers and patients). Following Cohen (1988), we considered correlations less than 0.30 as weak, correlations between 0.30 and 0.49 as moderate, and correlations greater than 0.49 as strong [31].
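The correlation step and the interpretation bands above can be sketched as follows. The confidence interval uses the Fisher z-transformation, which is an assumption made here for illustration; the text does not state how the reported intervals were computed, so the output need not match the reported values exactly. All function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, level: float = 0.95) -> Tuple[float, float]:
    """Approximate confidence interval for r via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def cohen_label(r: float) -> str:
    """Interpretation bands as described in the text (weak/moderate/strong)."""
    size = abs(r)
    if size < 0.30:
        return "weak"
    if size <= 0.49:
        return "moderate"
    return "strong"


# Example: a correlation of .87 in a sample of 128 respondents.
low, high = fisher_ci(0.87, 128)
print(cohen_label(0.87), round(low, 2), round(high, 2))
```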
Finally, we graphed receiver operating characteristic (ROC) curves and calculated the area under the curve (AUC) to test how well the PROMIS-D-SF and the PROMIS-A-SF discriminate patient status (defined as the participant reporting affiliation with mental health services). We utilized the “cutpointr” function to calculate the cutoff point with the best overall sensitivity and specificity. We interpreted an AUC above 0.9 as excellent, > 0.8 as good, > 0.7 as fair, and < 0.7 as poor [32].
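The ROC step can be illustrated with a standard-library sketch. It does not reproduce the cutpointr package; the optimality criterion used here (maximizing the sum of sensitivity and specificity) is an assumption about what "best overall" means, and the function names and T-scores are hypothetical.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a patient outscores a non-patient (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def best_cutpoint(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Cutoff c (positive if score >= c) maximizing sensitivity + specificity."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= c)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (c, sens, spec)
    return best


def interpret_auc(auc: float) -> str:
    """Interpretation bands as described in the text."""
    if auc > 0.9:
        return "excellent"
    if auc > 0.8:
        return "good"
    if auc > 0.7:
        return "fair"
    return "poor"


# Hypothetical T-scores for illustration only (0 = healthy volunteer, 1 = patient).
t_scores = [41.0, 45.2, 48.9, 50.3, 52.1, 57.0, 49.5, 56.0, 58.4, 61.2, 63.9, 66.1]
status = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
auc = roc_auc(t_scores, status)
cutoff, sensitivity, specificity = best_cutpoint(t_scores, status)
print(round(auc, 2), interpret_auc(auc), cutoff, round(sensitivity, 2), round(specificity, 2))
```

When several cutoffs tie on the criterion, this sketch keeps the lowest one; a dedicated package may resolve ties differently.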
Results
Descriptive statistics
One hundred and thirty-two individuals agreed to participate in the online survey. Four had missing data and were excluded from the analysis. Of the resulting sample (N = 128) 62 were patients in the psychiatric clinics, and 66 were healthy adults.
The sample was predominantly middle-aged (mean age 39.5 years, SD = 11.7), and most participants identified as female (82.0%). The 66 healthy volunteers had a mean age of 42.8 [SD = 10.8] years, and 78.8% identified as female. The 62 patients had a mean age of 37.7 [SD = 12.0] years, and 84.1% identified as female. The largest group of patients reported that they were receiving treatment for an anxiety disorder (30.2%). See Table 1 for the patients’ distribution according to diagnosis, and Table 2 for PROMIS-D-SF and PROMIS-A-SF scores for the total sample, healthy adults, and patients.
Table 1.
Descriptive statistics
| | Total sample N = 128 | Patients N = 62 | Healthy volunteers N = 66 |
|---|---|---|---|
| Female, n (%) | 105 (82.0%) | 53 (84.1%) | 52 (78.8%) |
| Age, mean [SD] | 39.5 [11.65] | 37.65 [11.99] | 42.83 [10.81] |
| Diagnosis, n (%) | | | |
| Borderline PD | | 17 (27.4%) | |
| Anxiety disorders | | 19 (30.2%) | |
| Depression | | 16 (25.8%) | |
| Other | | 11 (17.5%) | |
Data are presented as n (%) or mean [standard deviation]
Patients had a higher score on the PROMIS-D-SF than healthy adults (p < 0.01, MD = 14.46, SD = 10.75). Patients with borderline personality disorder had a higher score on the PROMIS-D-SF than patients with depression (p < 0.01, MD = 9.39, SD = 12.80) and patients with other diagnoses (p = .01, MD = 9.53, SD = 14.11), but did not differ from patients with anxiety.
Patients had a higher score on the PROMIS-A-SF than healthy adults (p < 0.01, MD = 12.11, SD = 11.53). Patients with borderline personality disorder had higher scores on the PROMIS-A-SF than patients with depression (p = .03, MD = 7.05, SD = 11.47), but did not differ from patients with anxiety or other diagnoses.
Ceiling and floor effects
PROMIS-D-SF
6.4% of the included patients scored the lowest possible score and 1.6% scored the highest possible score. We therefore found no evidence of a ceiling/floor effect on the PROMIS-D-SF.
PROMIS-A-SF
4.8% of the included patients scored the lowest possible score and 1.6% scored the highest possible score, i.e., no evidence of a ceiling/floor effect on the PROMIS-A-SF.
Factor structure
PROMIS-D-SF
The result of the CFA showed that the PROMIS-D-SF had a poor fit to the single-factor model previously suggested in the literature for the unabridged version of the instrument on most fit indices (CFI = 0.939, TLI = 0.898, RMSEA [CI] = 0.209 [0.161, 0.260], SRMR = 0.038, df = 9, χ² = 60.396). The chi-square/df value for the single-factor model was 6.7.
The result of the multi-group CFA showed that the PROMIS-D-SF had a poor fit to the single-factor model previously suggested in the literature for the unabridged version of the instrument on most fit indices (CFI = 0.902, TLI = 0.836, RMSEA [CI] = 0.218 [0.168, 0.271], SRMR = 0.062, df = 18, χ² = 73.825). The chi-square/df value for the single-factor model was 4.1.
PROMIS-A-SF
The result of the CFA showed that the PROMIS-A-SF had a poor fit to the single-factor model previously suggested in the literature on most fit indices (CFI = 0.956, TLI = 0.922, RMSEA [CI] = 0.179 [0.130, 0.231], SRMR = 0.028, df = 9, χ² = 46.347). The chi-square/df value for the single-factor model was 5.14.
The result of the multi-group CFA also showed that the PROMIS-A-SF had a poor fit to the single-factor model previously suggested in the literature on most fit indices (CFI = 0.946, TLI = 0.910, RMSEA [CI] =. [0.092, 0.204], SRMR = 0.043, df = 18, χ² = 43.536). The chi-square/df value for the single-factor model was 2.42.
Internal consistency reliability
PROMIS-D-SF
Internal consistency reliability was found to be good for the PROMIS-D-SF (α = 0.95, 95% CI [0.94, 0.97]; ωh = 0.86, ωtotal = 0.98) for the total sample. Subgroup-analysis also showed good internal consistency reliability for healthy volunteers (α = 0.9, 95% CI [0.85, 0.93]; ωh = 0.74, ωtotal = 0.95) and for the patient sample (α = 0.93, 95% CI [0.90, 0.93]; ωh = 0.88, ωtotal = 0.96).
PROMIS-A-SF
Internal consistency reliability was found to be good for the PROMIS-A-SF (α = 0.95, 95% CI [0.94, 0.96]; ωh = 0.91, ωtotal = 0.97) for the total sample. Subgroup-analysis also showed good internal consistency reliability for healthy volunteers (α = 0.88, 95% CI [0.82, 0.91]; ωh = 0.78, ωtotal = 0.91) and for the patient sample (α = 0.93, 95% CI [0.89, 0.95]; ωh = 0.87, ωtotal = 0.96).
Convergent validity
PROMIS-D-SF
The PROMIS-D-SF total score correlated strongly with the BDI-II total score (r = .87, 95% CI [0.81, 0.9], p < 0.001) for the total sample. Subgroup analysis also showed a strong correlation for the healthy volunteers only (r = .69, 95% CI [0.54, 0.8], p < 0.001) and the patient sample only (r = .82, 95% CI [0.72, 0.9], p < 0.001).
PROMIS-A-SF
The PROMIS-A-SF total score correlated strongly with the BAI total score (r = .90, 95% CI [0.87, 0.93], p < 0.001) for the total sample. Subgroup analysis also showed a strong correlation for the healthy volunteers only (r = .72, 95% CI [0.58, 0.82], p < 0.001) and the patient sample only (r = .82, 95% CI [0.72, 0.89], p < 0.001).
External validity
PROMIS-D-SF
ROC curves testing for the sensitivity of the PROMIS-D-SF for patient status (N = 128) are presented in Fig. 1. The optimal cutoff point was found to be 55.9 and yielded a sensitivity of 81% and a specificity of 86% with good accuracy, as indicated by an AUC=0.88.
Fig. 1.

ROC curve testing for the sensitivity of the PROMIS-D-SF for patient status
PROMIS-A-SF
ROC curves testing for sensitivity of the PROMIS-A-SF for patient status (N = 128) are presented in Fig. 2. The optimal cutoff point was found to be 51.3 and yielded a sensitivity of 85% and a specificity of 90% with good accuracy, as indicated by an AUC=0.89.
Fig. 2.

ROC curve testing for sensitivity of the PROMIS-A-SF for patient status
Discussion
As expected, patients with emotional disorders reported significantly higher symptom scores on both of the new short-form PROMIS scales than the sample of healthy volunteers. Patients with borderline personality disorder reported the highest levels of depressive symptoms, higher than those of patients with clinical depression, and a level of anxiety symptoms that did not differ from that of patients diagnosed with anxiety disorders.
We examined the factor structure of the scales. Both had a poor fit to the proposed single-factor structures [2]. A possible explanation could be that the instruments have been assembled from many different items, which in legacy instruments have been associated with major affective disorder and anxiety disorders, respectively. More specifically, PROMIS Anxiety and Depression were developed from an item pool consisting of the items from legacy instruments identified in comprehensive database searches. Items that were considered disease-specific, confusing, or otherwise redundant were excluded. The items were presented to patients in focus groups, and items that resonated with the patients were standardized to reflect the same time frame and the same possible responses. The items were administered to the general population and patients in a computerized format and selected for the final versions of the PROMIS Depression and Anxiety based on their responses. We speculate that the instruments should be considered as indices of depressive or anxiety symptoms, respectively, rather than as covering one uniform concept.
The Danish versions of the PROMIS-D-SF and the PROMIS-A-SF had good internal consistency. We found no previous reports on this. We found no evidence of any ceiling or floor effects of the instruments.
The PROMIS-D-SF correlated as expected with the legacy instrument BDI-II, and the PROMIS-A-SF likewise correlated highly with the BAI. This suggests that the two PRO instruments capture largely the same construct as the legacy instruments and have good convergent validity. Compared to the two legacy instruments, the PROs can do so with markedly fewer items. Therefore, the PRO instruments can be regarded as more feasible to use in research where the participants need to report their level of symptoms on many occasions [33].
Lastly, we examined the specificity and sensitivity of the PROMIS-D-SF and PROMIS-A-SF for detecting patient status. We found that both instruments had good potential to do so.
Mulvaney-Day et al. (2018) [7] suggested criteria for the evaluation of screening tools for emotional disorders in the primary care sector. The first criterion is whether the instrument is designed to screen for one or more emotional disorders. Both the PROMIS-D-SF and the PROMIS-A-SF, as well as the PHQ-9 and the GAD-7, are designed to capture symptoms of only one disorder. They can therefore be regarded as equal. The PROMIS-SF instruments are, however, easily combined if one wants to screen for both anxiety and depression, which must be regarded as an advantage. In comparison, the PHQ-9 should be administered as part of the complete parent instrument, the Patient Health Questionnaire (PHQ), if one were to screen for several disorders [7]. The second criterion is the number of items in the instruments. The two PROMIS-SFs are both shorter than the PHQ-9 and the GAD-7. This makes them easier and more efficient to administer than the comparator instruments, and they can therefore be regarded as superior on this criterion.
Mulvaney-Day et al. (2018) [7] suggest that it is important that screening tests employed in the primary care sector have high specificity, so that providers may be confident that patients who screen negative do not need follow-up. As this is the first investigation of the PROMIS-SF instruments as screening tools, we calculated the optimal cutoff point and used this in our comparison with other available instruments. We further reported sensitivity and specificity for any emotional disorder, and not for depression or anxiety only, as have the studies to which we compare our findings in the following. We chose this approach as our sample size was insufficient to meaningfully calculate sensitivity and specificity for individual diagnostic categories. However, this is also in the spirit of PROMIS, as its instruments are designed to be used transdiagnostically and are not meant to be disease-specific [1].
The investigated PROMIS instruments had excellent specificity (> 85%) at the optimal cutoff point. In comparison, similar research on the PHQ-9 found the instrument to have a specificity from 88% [8] to 91% [9] (for score > 10). It is, however, also important to have good sensitivity [7], as patients who screen positive will need additional follow-up and diagnostic assessment. Our PROMIS tools showed a good sensitivity of 81% and 85% at the optimal cutoff point, for depression and anxiety respectively. In comparison, the PHQ-9 was found to have a sensitivity of between 74% [9] and 88% [8]. The GAD-7 has also been subject to similar research. Spitzer et al. (2006) [34] examined its effectiveness in identifying generalized anxiety disorder patient status established by a clinical interview. At a threshold of 10, the GAD-7 had a sensitivity of 89% and a specificity of 82%. This is similar to the PROMIS-A-SF, and overall, this comparison suggests that the scales have properties similar to those of these two very established instruments.
We would like to address the limitations of the present study. First, the present validation of the Danish PROMIS-D-SF and PROMIS-A-SF was carried out with self-reported data regarding diagnosis. Hence, the reported diagnoses, as well as reports of having no diagnosis, could be erroneous, as they are based on the participants’ understanding of their situation. Secondly, the findings are based on data from relatively few subjects.
Future research could investigate larger samples to examine the ability of the PROMIS-SF instruments to screen for patient status in individual groups of patients with emotional disorders. Future research could also utilize these instruments in a clinical trial with patients with depression and anxiety and see if they are equally sensitive to changes in symptoms. Lastly, future research could explore the factor structure of the instruments with larger samples.
In conclusion, the findings of the present study indicate that the Danish PROMIS-D-SF and PROMIS-A-SF do not have the proposed single-factor structure, but should possibly rather be regarded as indices. These findings should be replicated in a larger sample. The instruments perform well in the identification of patient status, with sensitivity and specificity comparable to the widely applied PHQ-9 and GAD-7.
Acknowledgements
We thank the patients and staff at the Psychotherapeutic Clinics in Maribo, Næstved, Køge, Roskilde, and Slagelse for participating in the study. The study was conducted in Region Zealand Mental Health Services.
Author contributions
SA and RKO conceived the project. RKO collected data. CM prepared the data, and ORH carried out statistical calculations. RKO wrote the first draft manuscript, ORH was responsible for writing the second draft manuscript. SA and RKO contributed with significant comments. All authors have discussed, reviewed, and approved the manuscript.
Funding
Open access funding provided by Copenhagen University. Psychiatry West, Central- and West Zealand Hospital, Slagelse, Denmark and The COVID Research Fund, Department of Clinical Medicine, Copenhagen University funded the project.
Data availability
The datasets used and analyzed during the current study are available from the last author on reasonable request.
Declarations
Ethics approval and consent to participate
The study was in accordance with local regulations registered with the Danish Data Protection Agency Region Zealand (REG-048-2021). The survey study did not, as per local guidelines and regulations, need approval by the Region Zealand Ethics Committee. Informed consent was taken from participants to participate in the study.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
References
- 1.(NIH). National Institute of Health. Intro to PROMIS®. [cited 2024 March 4th]; Available from: https://www.healthmeasures.net/explore-measurement-systems/promis/intro-to-promis
- 2.Pilkonis PA, Choi SW, Reise SP, Stover AM, Riley WT, Cella D; PROMIS Cooperative Group. Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS®): depression, anxiety, and anger. Assessment. 2011;18(3):263–283. [DOI] [PMC free article] [PubMed]
- 3.Kingston A, Robinson L, Booth H, Knapp M, Jagger C; MODEM Project. Projections of multi-morbidity in the older population in England to 2035: estimates from the Population Ageing and Care Simulation (PACSim) model. Age Ageing. 2018;47(3):374–380. [DOI] [PMC free article] [PubMed]
- 4.Joffres M, Jaramillo A, Dickinson J, Lewin G, Pottie K, Shaw E, Connor Gorber S, Tonelli M. Canadian task force on preventive health care., recommendations on screening for depression in adults. CMAJ. 2013;185(9):775 – 82., 2013. [DOI] [PMC free article] [PubMed]
- 5.Siu AL; US Preventive Services Task Force (USPSTF), Bibbins-Domingo K, Grossman DC, Baumann LC, Davidson KW, Ebell M, García FA, Gillman M, Herzstein J, Kemper AR, Krist AH, Kurth AE, Owens DK, Phillips WR, Phipps MG, Pignone MP. Screening for depression in adults: US Preventive Services Task Force recommendation statement. JAMA. 2016;315(4):380–387. [DOI] [PubMed]
- 6.Lakkis NA, Mahmassani DM. Screening instruments for depression in primary care: a concise review for clinicians. Postgrad Med. 2015;127(1):99–106. [DOI] [PubMed]
- 7.Mulvaney-Day N, Marshall T, Downey Piscopo K, Korsen N, Lynch S, Karnell LH, Moran GE, Daniels AS, Ghose SS. Screening for behavioral health conditions in primary care settings: a systematic review of the literature. J Gen Intern Med. 2018;33(3):335–346. [DOI] [PMC free article] [PubMed]
- 8.Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13.
- 9.Arroll B, Goodyear-Smith F, Crengle S, Gunn J, Kerse N, Fishman T, Falloon K, Hatcher S. Validation of PHQ-2 and PHQ-9 to screen for major depression in the primary care population. Ann Fam Med. 2010;8(4):348–53.
- 10.Klein Olsen R, Hovmand OR, Madsen C, Arnfred S. High fatigue levels among psychiatric outpatients: the validity of the Danish Patient Reported Outcomes Measurement Information System Fatigue Short-Form (PROMIS-F-SF). In review at Patients Reported Outcomes. 2024.
- 11.Schnohr CW, Rasmussen CL, Langberg H, Bjørner JB. Danish translation of a physical function item bank from the patient-reported outcome measurement information system (PROMIS). Pilot Feasibility Stud. 2017;3:29.
- 12.Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An inventory for measuring depression. Arch Gen Psychiatry. 1961;4:561–71.
- 13.Beck AT, Steer RA, Brown GK. Manual for the Beck Depression Inventory-II. San Antonio, TX: Psychological Corporation; 1996.
- 14.Beck AT, Steer RA, Brown GK. Beck Depression Inventory–Second Edition (BDI-II). Dansk oversættelse. Pearson; 1996.
- 15.Beck AT, Epstein N, Brown G, Steer RA. An inventory for measuring clinical anxiety: psychometric properties. J Consult Clin Psychol. 1988;56(6):893–7.
- 16.Beck AT, Steer RA. BAI. Beck Anxiety Index. Dansk Oversættelse. Pearson; 1990.
- 17.RStudio Team. RStudio: Integrated Development for R. Boston, MA: RStudio, Inc.; 2019.
- 18.Revelle W. psych: Procedures for personality and psychological research. 2007.
- 19.Rosseel Y. lavaan: an R package for structural equation modeling. J Stat Softw. 2012;48(2):1–36.
- 20.Thiele C. An introduction to cutpointr. 2022.
- 21.PROMIS. PROMIS Depression Scoring Manual. 2021.
- 22.PROMIS. PROMIS Anxiety Scoring Manual. 2021.
- 23.McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res. 1995;4(4):293–307.
- 24.Kline RB. Principles and practice of structural equation modeling. Guilford; 2015.
- 25.Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6(1):1–55.
- 26.Wheaton B, Muthén B, Alwin DF, Summers GF. Assessing reliability and stability in panel models. Sociol Methodol. 1977;8:84–136.
- 27.Cole DA. Utility of confirmatory factor analysis in test validation research. J Consult Clin Psychol. 1987;55(4):584–94.
- 28.Bland JM, Altman DG. Statistics notes: Cronbach’s alpha. BMJ. 1997;314(7080):572.
- 29.Hair JF, Hult GTM, Ringle CM, Sarstedt M. A primer on partial least squares structural equation modeling (PLS-SEM). Thousand Oaks, CA: SAGE; 2016.
- 30.Nájera R. Population classification and weighting in multidimensional poverty measurement: a Monte Carlo study. Soc Indic Res. 2019;142:887–910.
- 31.Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale (NJ): Lawrence Erlbaum Associates; 1988.
- 32.University of Nebraska Medical Center. The area under an ROC curve [Internet]. Available from: https://gim.unmc.edu/dxtests/roc3.htm
- 33.MacKrill K, Groom KM, Petrie KJ. The effect of symptom-tracking apps on symptom reporting. Br J Health Psychol. 2020;25(4):1074–85. doi:10.1111/bjhp.12459.
- 34.Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–7.
Data Availability Statement
The datasets used and analyzed during the current study are available from the last author on reasonable request.
