Validity and reliability of a self-administered foot evaluation questionnaire (SAFE-Q)

Hisateru Niki; Shinobu Tatsunami; Naoki Haraguchi; Takafumi Aoki; Ryuzo Okuda; Yasunori Suda; Masato Takao; Yasuhito Tanaka

doi:10.1007/s00776-012-0337-2

2013 Jan 9;18(2):298–320.

PMCID: PMC3607735 PMID: 23299996

Abstract

Background

The Japanese Society for Surgery of the Foot (JSSF) is developing a QOL questionnaire instrument for use in pathological conditions related to the foot and ankle. The main body of the outcome instrument (the Self-Administered Foot Evaluation Questionnaire, SAFE-Q version 2) consists of 34 questionnaire items, which provide five subscale scores (1: Pain and Pain-Related; 2: Physical Functioning and Daily Living; 3: Social Functioning; 4: Shoe-Related; and 5: General Health and Well-Being). In addition, the instrument has nine optional questionnaire items that provide a Sports Activity subscale score. The purpose of this study was to evaluate the test-retest reliability of the SAFE-Q.

Patients and methods

Version 2 of the SAFE-Q was administered to 876 patients and 491 non-patients, and the test-retest reliability was evaluated for 131 patients. In addition, the SF-36 questionnaire and the JSSF Scale scoring form were administered to all of the participants. Subscale scores were scaled such that the final sum of scores ranged from zero (least healthy) to 100 (healthiest).

Results

The intraclass correlation coefficients were larger than 0.7 for all of the scores. The means of the five subscale scores were between 60 and 75. The five subscales easily separated patients from non-patients. The coefficients for the correlations of the subscale scores with the scores on the JSSF Scale and the SF-36 subscales were all highly statistically significantly greater than zero (p < 0.001). The means for the five JSSF Scale classification groups fell within a relatively narrow range, indicating that the SAFE-Q labels are sufficiently similar to permit their use for all of the JSSF Scale classifications.

Conclusion

The present study revealed that the test-retest reliability is high for each subscale. Consequently, the SAFE-Q is valid and reliable. In the future, it will be beneficial to test the responsiveness of the SAFE-Q.

Introduction

Patient-based outcome instruments, which are used to measure changes in health status over time, have become increasingly popular. The four basic types of patient-based outcome instruments are generic, disease-specific, region-specific, and patient-specific. A region-specific instrument contains items specific to only one body part and can be used with several different disease states affecting a specific region. The Japanese Society for Surgery of the Foot (JSSF) is developing a QOL questionnaire for use in individuals with pathological conditions related to the foot and ankle as a region-specific outcome instrument.

The questionnaire, named the Self-Administered Foot Evaluation Questionnaire (hereafter referred to as the “SAFE-Q”) version 1, was subjected to an initial field test [1], after which it was revised to a second version [2]. The main body of the SAFE-Q version 2 consists of 34 questionnaire items, providing five subscale scores (1: Pain and Pain-Related; 2: Physical Functioning and Daily Living; 3: Social Functioning; 4: Shoe-Related; and 5: General Health and Well-Being). In addition, the instrument has nine optional questionnaire items that provide a Sports Activity score.

The SAFE-Q version 2 was subjected to a limited field test. Tentative scores for the five subscales were compared to their corresponding scales in the Short Form 36 Health Survey, version 2.0 (SF-36) [3] and the JSSF Scale score [4, 5], and the results obtained were reasonable [2]. Therefore, based upon its favorable performance in the previous field test [2], the JSSF decided to evaluate the second version of the SAFE-Q further by applying it to a larger sample of patients with foot and ankle disorders as well as a control sample of healthy teenagers and adults.

Because the factor structure of the responses to the instrument was valid in the former study, the primary aim of the present field survey was to evaluate the test-retest reliability. A secondary aim was to test the influence of background factors such as region-specific classification, age group, and gender on the subscale scores. This report provides an analysis of the data gathered in this second field test of the second version of the SAFE-Q.

Patients and methods

Study group

In the present field survey, the SAFE-Q version 2 was administered to 876 patients with pathological conditions related to the foot and ankle. A total of 491 non-patients consisting of healthy teenagers and adult volunteers were also analyzed. Both patients and non-patients had been registered in a total of 99 institutions in Japan.

Although the SAFE-Q version 1 has already been presented in a previous article [1], we have provided the SAFE-Q version 2 in “Appendix 1” for the sake of reader convenience. In addition, the manual for the SAFE-Q is shown in “Appendix 2.”

Among the 876 patients, 131 with stable pathological symptoms took part in the test-retest reliability evaluation. These patients answered the same questionnaire form twice in succession. The interval between the first and second tests was a minimum of eight weeks. When the test was first administered, the subjects also answered an SF-36 questionnaire form, and a physician recorded the JSSF Scale scoring form.

Ethical issue

This study was approved by the Life Ethics Committee of St. Marianna University School of Medicine in 2007 (no. 1192). The extension of the research period until 2014 was approved in 2012.

Statistical analysis

EFA and CFA

An exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were performed. These were done to determine whether the factor structure was stable, given that the patient population in this field test comprised a wide variety of pathologies. Response data from the patients during the first administration (but not the retest) of this field test were subjected to the same EFA and CFA as used in the first field test of the second version.

Computation of subscale scores

Subscale scores were computed for each of the five subscales. For each respondent and each subscale, the average of the non-missing values of the items contributing to the subscale was computed. Prior to averaging, VAS items were rescaled to conform to the ranges of the categorical items. Averages were then rescaled so that the final sum of scores ranged between zero (least healthy) and 100 (healthiest), inclusive.
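The scoring rule described above can be sketched in a few lines of Python. This is an illustration only: the item coding (categorical items coded 0 to 4 with higher values meaning healthier, VAS items recorded on 0 to 100) and the function names `rescale_vas` and `subscale_score` are assumptions made for the example, not details taken from the SAFE-Q manual.

```python
from typing import Optional, Sequence

# Assumed coding (hypothetical): categorical items are 0..4, VAS items are 0..100,
# and higher values always mean "healthier".

def rescale_vas(vas: float, vas_max: float = 100.0, item_max: float = 4.0) -> float:
    """Map a VAS reading on [0, vas_max] onto the categorical range [0, item_max]."""
    return vas / vas_max * item_max

def subscale_score(items: Sequence[Optional[float]],
                   item_max: float = 4.0) -> Optional[float]:
    """Average the non-missing item values, then rescale the average to 0-100."""
    answered = [v for v in items if v is not None]
    if not answered:
        return None  # no item answered: the subscale score is left missing
    return sum(answered) / len(answered) / item_max * 100.0

# One respondent: three categorical answers, one skipped item, one VAS item.
responses = [4, 3, None, 2, rescale_vas(75.0)]
score = subscale_score(responses)  # mean of 4, 3, 2, 3 -> 3.0 -> 75.0
```

Averaging only the answered items means a skipped question does not pull the score toward zero, which appears to be the intent of the non-missing rule.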

Test-retest reliability

Each subscale’s scores were subjected to a random-effects linear regression with test-retest as a categorical predictor. The intraclass correlation coefficient (ICC) was computed as the index of reliability for each scale. Ninety-five percent confidence intervals (95 % CIs) for ICCs were computed by parametric bootstrapping [6] using 100 bootstrap samples of patients with scores for the scale for both test and retest administrations of the questionnaire.
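As a rough illustration of the reliability index, the sketch below computes an ICC for paired test and retest scores and a bootstrap interval around it. Two substitutions are assumed and should be read as such: the ICC is obtained from two-way ANOVA mean squares (a consistency-type ICC with the administration treated as a fixed effect) rather than from a fitted random-effects regression, and the interval uses a nonparametric percentile bootstrap rather than the parametric bootstrap cited in the text. The function names and the example scores are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

def icc_consistency(pairs: Sequence[Tuple[float, float]]) -> float:
    """Consistency ICC for (test, retest) pairs from two-way ANOVA mean squares."""
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    subj_means = [(a + b) / k for a, b in pairs]
    sess_means = [sum(p[j] for p in pairs) / n for j in range(k)]
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_sess = n * sum((m - grand) ** 2 for m in sess_means)
    ss_total = sum((x - grand) ** 2 for p in pairs for x in p)
    ms_subj = ss_subj / (n - 1)
    ms_err = (ss_total - ss_subj - ss_sess) / ((n - 1) * (k - 1))
    return (ms_subj - ms_err) / (ms_subj + (k - 1) * ms_err)

def bootstrap_ci(pairs: Sequence[Tuple[float, float]],
                 n_boot: int = 100, seed: int = 0) -> Tuple[float, float]:
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    stats: List[float] = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        try:
            stats.append(icc_consistency(sample))
        except ZeroDivisionError:
            continue  # degenerate resample with no variance at all
    if not stats:
        raise ValueError("every bootstrap resample was degenerate")
    stats.sort()
    lo = stats[int(0.025 * (len(stats) - 1))]
    hi = stats[int(0.975 * (len(stats) - 1))]
    return lo, hi

# Hypothetical subscale scores for eight patients (first test, retest).
scores = [(60, 62), (75, 70), (40, 45), (90, 88),
          (55, 50), (80, 85), (30, 35), (70, 66)]
icc = icc_consistency(scores)
ci_low, ci_high = bootstrap_ci(scores)
```

Only patients with scores at both administrations can enter `pairs`, which mirrors the restriction stated for the bootstrap samples.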

Comparison with JSSF Scale scores

Spearman’s rank correlation coefficients were computed between the scores for each of the five SAFE-Q subscales and the JSSF Scale scores (which were only taken from patient responders during the first administration of the questionnaire).

Comparison with SF-36 scores

Spearman’s rank correlation coefficients were computed between the scores for each of the five SAFE-Q subscales and those for each of the eight SF-36 subscales. Scores for each of the eight SF-36 subscales were computed using the Japanese norm-based scoring method as prescribed in the commercial instrument’s documentation [3]. Again, QOL scores were only taken from patient responders during the first administration of the questionnaire.
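For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a minimal standard-library version; the function names and the example scores are hypothetical, and it is not the software used in the study.

```python
import math
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical paired scores: a SAFE-Q subscale and an SF-36 subscale.
safe_q = [66.0, 42.0, 88.0, 75.0, 50.0]
sf36 = [44.0, 30.0, 55.0, 41.0, 38.0]
rho = spearman_rho(safe_q, sf36)
```

Because only ranks enter the calculation, the different scaling of the two instruments (0 to 100 versus norm-based scores) does not affect the coefficient.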

Comparison of scores for the Pain and Pain-Related subscale and the SF-36 Bodily Pain subscale

We compared the patients’ scores for the Pain and Pain-Related subscale with the scores for the SF-36 Bodily Pain subscale. For this purpose, we extracted the values for the Pain subscale scores from the JSSF Scale scores. On the JSSF Pain subscale, 0, 20, 30, or 40 points are assigned to patients with diseases of the ankle and hindfoot, midfoot, hallux, or lesser toe, and 0, 10, 20, or 30 points are assigned to patients with rheumatoid arthritis. We then computed the Spearman’s rank correlation coefficient between the JSSF Pain score and the Pain and Pain-Related score or SF-36 Bodily Pain score for each of the patient groups.

Background factors

The following patient characteristics were assessed using scores from the first administration of the questionnaire: patient group in the JSSF Scale classification, age group, and gender. Patient groups in the JSSF Scale classification were as follows: ankle and hindfoot, midfoot, hallux, lesser toe, and rheumatoid arthritis. Respondents were grouped by age as follows: 16–39, 40–64, and 65 and older, inclusive. For the patient groups and patient-age groups, each of the five subscales was assessed by means of one-way analysis of variance (ANOVA). Gender comparisons were made by means of Student’s t test in each subgroup of patients classified by patient group and age group. Dunnett’s multiple comparisons test was performed afterward to compare patient groups. In order to stabilize the variances in the presence of floor and ceiling effects, the data were arcsine square-root transformed prior to performing ANOVA or other tests.
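The variance-stabilizing step can be illustrated as follows. The sketch assumes that each 0 to 100 score is first divided by 100 to give a proportion before the arcsine square-root transform is applied (the text does not state the scaling), and it computes the one-way ANOVA F statistic by hand. The function names and group values are hypothetical.

```python
import math
from typing import Sequence

def arcsine_sqrt(score: float) -> float:
    """Arcsine square-root transform of a 0-100 score (assumed scaling: score / 100)."""
    return math.asin(math.sqrt(score / 100.0))

def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical subscale scores for three age groups, some at the ceiling of 100.
age_groups = [[100.0, 92.0, 85.0, 96.0],
              [88.0, 75.0, 100.0, 70.0],
              [60.0, 72.0, 55.0, 81.0]]
transformed = [[arcsine_sqrt(s) for s in g] for g in age_groups]
f_stat = one_way_anova_f(transformed)
```

The transform stretches values near 0 and 100, which is why it is commonly used when many responses pile up at the floor or ceiling of a bounded scale.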

Patient versus non-patient comparison

Scores for each of the five subscales were compared between patients (first administration of the questionnaire) and non-patients by means of the Mann–Whitney test. This nonparametric test was used for this comparison due to concern over the large proportion of ceiling responses in the healthy group.
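The Mann–Whitney comparison rests on the U statistic, which simply counts how often a score from one group exceeds a score from the other, with ties counted as one half. The sketch below shows that counting directly; it omits the p value calculation, and the function names and example scores are hypothetical.

```python
from typing import Sequence

def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U for x: number of (x_i, y_j) pairs with x_i > y_j, ties counting 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

def prob_superiority(x: Sequence[float], y: Sequence[float]) -> float:
    """U scaled to [0, 1]: chance that a random x exceeds a random y (ties split)."""
    return mann_whitney_u(x, y) / (len(x) * len(y))

# Hypothetical scores: patients spread out, non-patients clustered at the ceiling.
patients = [47.0, 70.0, 84.0, 55.0, 100.0]
non_patients = [94.0, 100.0, 100.0, 98.0]
u_patients = mann_whitney_u(patients, non_patients)
```

Because the statistic depends only on the ordering of scores, a large block of respondents tied at 100 does not violate its assumptions in the way it would for a t test.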

Sports items

Sports-related questionnaire items were scored as above, taking into account the reversal of sense of the VAS item among them. EFA was applied to the responses of patients during the first administration of the questionnaire in order to confirm the unidimensionality of the scale. The test-retest reliability of the sum of these items’ scores was assessed as above.

Statistical probability

In the statistical comparisons, a p value of less than 0.05 was considered statistically significant. Below, for all p values less than 0.001, we simply state p < 0.001, even when the exact value was obtained from the computation.

Results

Patient and non-patient classification and age

The classification of the subjects enrolled in the present field study is summarized in Table 1. A total of 876 patients and 491 non-patients were registered. The majority of the patients had diseases of the ankle and hindfoot (469). The lesser toe (45) and midfoot (68) groups each contained fewer than 100 patients. The JSSF region-specific classification was not reported for eight patients. The mean age of the patients in each group and that of the non-patients are also indicated in Table 1. As a whole, the mean ages of the patients and non-patients were 52.6 ± 18.0 (mean ± SD; n = 876) and 44.6 ± 16.6 (n = 491), respectively.

Table 1.

Numbers of patients and non-patients

Patient group by JSSF Scale classification	Gender	Number	Age (mean ± SD)
Patient
Ankle and hindfoot	Male	232	47.0 ± 18.6
	Female	237	52.6 ± 18.1
	Total	469	49.8 ± 18.5
Hallux	Male	43	54.9 ± 19.1
	Female	126	59.5 ± 15.6
	Total	169	58.3 ± 16.6
Lesser toe	Male	15	43.7 ± 19.2
	Female	30	52.7 ± 17.4
	Total	45	49.7 ± 18.3
Midfoot	Male	32	44.8 ± 17.3
	Female	36	50.7 ± 19.8
	Total	68	47.9 ± 18.7
Rheumatoid arthritis	Male	30	61.0 ± 14.3
	Female	87	59.3 ± 12.1
	Total	117	59.8 ± 12.6
Not reported	Male	4	31.5 ± 16.9
	Female	4	67.8 ± 7.5
	Total	8	49.6 ± 22.8
All groups	Male	356	48.6 ± 18.8
	Female	520	55.4 ± 17.0
	Total	876	52.6 ± 18.0
Non-patient
	Male	225	45.0 ± 17.0
	Female	266	44.2 ± 16.4
	Total	491	44.6 ± 16.6

Factor analysis

The factor structure was remarkably stable, in that factor loadings and residual variances were essentially the same as those obtained in the previous field test of the SAFE-Q version 2 (data not shown). The factor correlation coefficients resulting from the CFA are summarized in Table 2. All of the correlations between different subscale factors were less than 0.9. The maximum coefficient was 0.841, for the correlation between the Physical Functioning and Daily Living subscale and the Social Functioning subscale.

Table 2.

Factor correlation coefficients among five subscales resulting from confirmatory factor analysis

	Physical Functioning and Daily Living (Q12–Q22)	Social Functioning (Q23–Q28)	Shoe-Related (Q8, Q9, Q34)	General Health and Well-Being (Q29–Q33)
Pain and Pain-Related	0.752 (0.016^b)	0.647 (0.021)	0.721 (0.022)	0.785 (0.015)
Physical Functioning and Daily Living		0.841 (0.012)	0.737 (0.021)	0.808 (0.013)
Social Functioning			0.726 (0.023)	0.804 (0.014)
Shoe-Related				0.718 (0.022)

^aQ1–Q34 refers to question item numbers (used in “Appendix 1”) included in the corresponding subscales

^bValue in parentheses is the standard error in the factor correlation coefficient

Test-retest reliability

The value of the ICC for each of the five subscales is listed in Table 3. The ICC was always larger than 0.7; even the minimum 95 % CI lower limit for the Social Functioning subscale was larger than 0.6. The ICC for the sum of the subscale scores was 0.85 (with a 95 % CI of 0.81 to 0.89), which is, as expected, higher than any of the individual components.

Table 3.

Values of ICC observed for the five subscales

Subscale	ICC	95 % CI
Pain and Pain-Related	0.78	0.74–0.83
Physical Functioning and Daily Living	0.83	0.77–0.89
Social Functioning	0.72	0.64–0.79
Shoe-Related	0.81	0.76–0.86
General Health and Well-Being	0.82	0.78–0.87
Sum of scores	0.85	0.81–0.89

Distribution of subscale scores

The distributions of the subscale scores are illustrated in Fig. 1. The mean ± SD and median for the five subscales were as follows: Pain and Pain-Related: 66.0 ± 23.8, 70.1; Physical Functioning and Daily Living: 69.2 ± 26.2, 75.0; Social Functioning: 66.3 ± 32.4, 75.0; Shoe-Related: 62.7 ± 30.4, 66.7; General Health and Well-Being: 66.8 ± 29.7, 75.0. The width between the 25th percentile and the 75th percentile was broad in the Social Functioning, Shoe-Related, and General Health and Well-Being subscales, while smaller widths were observed in the Pain and Pain-Related and Physical Functioning and Daily Living subscales. The values of the means were very similar for the five subscales, ranging from 60 to 70.

Comparison with the JSSF Scale score

The distribution of the JSSF Scale score is illustrated in Fig. 2. The mean ± SD and median were 69.4 ± 20.9 and 72 (n = 864), respectively. The JSSF score was correlated with each of the present subscale scores. The Spearman’s rank correlation coefficients are summarized in Table 4, where the patients are classified into JSSF patient groups. The scores for the five subscales display statistically significant correlations (p < 0.001) with the JSSF Scale score, with rank correlation coefficients ranging from 0.51 to 0.61 (Table 4). This tendency was the same in each group of patients. However, slightly smaller correlation coefficients were observed in the lesser toe group, which contained only 45 patients.

Table 4.

Correlations with the JSSF score for the five patient groups

Patient group by JSSF Scale classification	n	Pain and Pain-Related	Physical Functioning and Daily Living	Social Functioning	Shoe-Related	General Health and Well-Being
Ankle and hindfoot	467	0.63 (p < 0.001)	0.65 (p < 0.001)	0.57 (p < 0.001)	0.49 (p < 0.001)	0.61 (p < 0.001)
Hallux	167	0.64 (p < 0.001)	0.50 (p < 0.001)	0.50 (p < 0.001)	0.46 (p < 0.001)	0.52 (p < 0.001)
Lesser toe	45	0.47 (p = 0.002)	0.52 (p < 0.001)	0.51 (p = 0.001)	0.48 (p = 0.002)	0.45 (p = 0.004)
Midfoot	68	0.61 (p < 0.001)	0.69 (p < 0.001)	0.63 (p < 0.001)	0.54 (p < 0.001)	0.58 (p < 0.001)
Rheumatoid arthritis	117	0.57 (p < 0.001)	0.64 (p < 0.001)	0.59 (p < 0.001)	0.55 (p < 0.001)	0.56 (p < 0.001)
All	864	0.61 (p < 0.001)	0.60 (p < 0.001)	0.55 (p < 0.001)	0.51 (p < 0.001)	0.56 (p < 0.001)

SF-36

The Spearman rank correlation coefficients between each of the five subscales of the SAFE-Q and each of the eight SF-36 subscales were all statistically significantly different from zero (p < 0.001), as summarized in Table 5. The correlation coefficient for the Pain and Pain-Related subscale was highest with Bodily Pain; that for the Shoe-Related subscale was highest with Bodily Pain and Physical Functioning; that for the Physical Functioning and Daily Living subscale was highest with Physical Functioning; that for the Social Functioning subscale was highest with Role Physical and Bodily Pain (but nearly as high with Social Functioning and Physical Functioning); and that for the General Health and Well-Being subscale was highest with Bodily Pain. In these particular patients, the scores obtained with these two instruments were largely driven by pain and difficulty with mobility. The mean ± SD of each norm-based [3] SF-36 subscale score is also shown in Table 5. The mean of the norm-based SF-36 score ranged from 36 to 47 for these patients, indicating that the patients were somewhat below average in their health status.

Table 5.

Comparison of scores for subscales of the SAFE-Q version 2 with SF-36 subscale scores

SF-36 subscale (mean ± SD)	SAFE-Q Pain and Pain-Related	SAFE-Q Physical Functioning and Daily Living	SAFE-Q Social Functioning	SAFE-Q Shoe-Related	SAFE-Q General Health and Well-Being
Physical functioning (36.2 ± 18.4)	0.505 (p < 0.001)	0.771 (p < 0.001)	0.657 (p < 0.001)	0.520 (p < 0.001)	0.638 (p < 0.001)
Role physical (36.9 ± 17.1)	0.422 (p < 0.001)	0.625 (p < 0.001)	0.704 (p < 0.001)	0.436 (p < 0.001)	0.607 (p < 0.001)
Bodily pain (42.2 ± 11.8)	0.652 (p < 0.001)	0.634 (p < 0.001)	0.684 (p < 0.001)	0.532 (p < 0.001)	0.669 (p < 0.001)
Social functioning (46.2 ± 14.9)	0.406 (p < 0.001)	0.579 (p < 0.001)	0.667 (p < 0.001)	0.446 (p < 0.001)	0.592 (p < 0.001)
General health (46.8 ± 11.5)	0.403 (p < 0.001)	0.461 (p < 0.001)	0.409 (p < 0.001)	0.399 (p < 0.001)	0.504 (p < 0.001)
Vitality (47.3 ± 10.5)	0.415 (p < 0.001)	0.400 (p < 0.001)	0.450 (p < 0.001)	0.375 (p < 0.001)	0.507 (p < 0.001)
Role emotional (42.3 ± 15.5)	0.442 (p < 0.001)	0.543 (p < 0.001)	0.600 (p < 0.001)	0.416 (p < 0.001)	0.610 (p < 0.001)
Mental health (47.7 ± 11.0)	0.408 (p < 0.001)	0.441 (p < 0.001)	0.484 (p < 0.001)	0.380 (p < 0.001)	0.566 (p < 0.001)

Comparison of scores from the SAFE-Q Pain and Pain-Related subscale and SF-36 Bodily Pain subscale scores

Results of comparisons of the Spearman rank correlation coefficients are summarized in Table 6. The Spearman’s rank correlation coefficients from the Pain and Pain-Related subscale were larger than those from the SF-36 Bodily Pain subscale in all groups of patients. Statistical significance (p < 0.05) was found in the ankle and hindfoot and the hallux groups.

Table 6.

Comparisons of Spearman’s rank correlation coefficients between the present Pain and Pain-Related subscale score and the SF-36 Bodily Pain subscale score

JSSF Scale classification	SAFE-Q Pain and Pain-Related (rank correlation with JSSF Pain score)	SF-36 Bodily Pain (rank correlation with JSSF Pain score)	Significance
Ankle and hindfoot	0.63 (n = 409)	0.51 (n = 399)	p < 0.05
Hallux	0.68 (n = 163)	0.47 (n = 160)	p < 0.01
Lesser toe	0.58 (n = 44)	0.32 (n = 44)	NS
Midfoot	0.66 (n = 68)	0.50 (n = 67)	NS
Rheumatoid arthritis	0.62 (n = 116)	0.47 (n = 113)	NS

Patient characteristics

Comparison among patient groups

A comparison of the mean subscale scores and SDs of the different JSSF patient groups is provided in Fig. 3. The scores for the five patient groups were statistically significantly different according to one-way ANOVA, for all subscales. The p values from ANOVA were smaller than 0.001 for the Physical Functioning and Daily Living and the Shoe-Related subscales, and were between 0.002 and 0.02 for the other subscales. Patients with rheumatoid arthritis showed the lowest mean values for the five subscales, and the differences between these mean values and those of other patient groups were sometimes found to be statistically significant upon performing Dunnett-type comparison tests, as shown in Fig. 3.

Age and gender

The subscale scores for male and female patients were compared for three age groups (ages 16–39, ages 40–64, ages 65 and older, inclusive) in Fig. 4. The size of the sample analyzed in this work is large enough to allow subscale-specific comparisons of scores among age groups and genders. One-way ANOVA revealed that there were statistically significant differences (p < 0.001) among the age groups in all five subscales when only female patients were considered. When only male patients were considered, there were no statistically significant differences among the age groups for any of the subscales aside from the Physical Functioning and Daily Living subscale (p < 0.001). The scores of male and female patients are also compared in Fig. 4. In all subscales, the male scores were higher than the female scores in every age group; the differences between the male and female scores were sometimes significant, as shown in Fig. 4.

Comparison of the scores of patients and non-patients

As expected, patients scored lower (less healthy) on average than non-patients on each of the five subscales (Table 7). The p value from the Mann–Whitney test comparing patients and non-patients was less than 0.001 for all five subscales. The means and SDs of the five subscale scores for non-patients are summarized in Table 8. Older non-patients tended to present lower mean values than younger non-patients, and female non-patients tended to present lower mean values than male non-patients.

Table 7.

Comparison of the subscale scores of patients and non-patients

Subscale	Patients: 25th percentile	Patients: median	Patients: 75th percentile	Non-patients: 25th percentile	Non-patients: median	Comparison between patients and non-patients
Pain and Pain-Related	47	70	84	94	100	p < 0.001
Physical Functioning and Daily Living	55	75	92	98	100	p < 0.001
Social Functioning	42	75	96	100	100	p < 0.001
Shoe-Related	42	67	92	92	100	p < 0.001
General Health and Well-Being	45	75	90	100	100	p < 0.001
Sports Activity	11	34	75	97	100	p < 0.001

Table 8.

Means and SDs of the five subscale scores for non-patients, classified by age and gender

Group		Pain and Pain-Related	Physical Functioning and Daily Living	Social Functioning	Shoe-Related	General Health and Well-Being
Age group^a	1	96 ± 7 (n = 224)	99 ± 3 (n = 224)	99 ± 5 (n = 224)	94 ± 12 (n = 224)	99 ± 3 (n = 224)
	2	95 ± 10 (n = 178)	98 ± 6 (n = 178)	99 ± 7 (n = 177)	92 ± 13 (n = 178)	99 ± 5 (n = 178)
	3	94 ± 12 (n = 89)	93 ± 12 (n = 89)	95 ± 12 (n = 88)	91 ± 14 (n = 88)	94 ± 14 (n = 89)
Gender	Male	97 ± 7 (n = 225)	98 ± 6 (n = 225)	99 ± 6 (n = 224)	97 ± 7 (n = 224)	98 ± 7 (n = 225)
Gender	Female	94 ± 11 (n = 266)	97 ± 7 (n = 266)	98 ± 9 (n = 265)	89 ± 15 (n = 266)	98 ± 7 (n = 266)

^aAge group 1: 16–39 years old; 2: 40–64 years old; 3: 65–88 years old

Sports items

Optional sports items were responded to by 275 patients and 197 non-patients. EFA of the resulting patient data showed that these items contributed to a single major factor, as seen before (data not shown). The test-retest reliability for sports items was similar to that observed for the other sets of items: ICC = 0.76, with a 95 % CI of 0.64–0.87. The mean ± SD of the Sports Activity score was 45.3 ± 34.2 in patients, and it was 95.7 ± 10.9 in non-patients. The difference in the mean scores of patients and non-patients was statistically significant (p < 0.001).

Discussion

Several patient-based, region-specific outcome instruments for patients with diseases or injuries of the foot and ankle have been developed, such as the Foot and Ankle Module of the American Academy of Orthopaedic Surgeons lower limb outcomes assessment instruments (AAOS-FA) [7], the Foot and Ankle Ability Measure (FAAM) [8], the Foot Health Status Questionnaire (FHSQ) [9], and the Foot Function Index [10]. Recently, a comparison of the responsiveness of the Manchester–Oxford Foot Questionnaire (MOXFQ) with those of the American Orthopaedic Foot and Ankle Society (AOFAS) [11], SF-36 [12], and EuroQol (EQ-5D) [13] assessments following foot or ankle surgery was published [14]. Although the MOXFQ is a patient-based outcome measure, it was originally developed from interviews with patients who had undergone foot surgery, and the Manchester Foot Pain and Disability Questionnaire (MFPDQ) [15] was used as a template in those interviews. In addition, the measurement properties of the MOXFQ were initially assessed in a specific group of patients undergoing surgery for hallux valgus [16, 17]. In this context, there is no new and original patient-based outcome instrument focusing only on the foot and ankle that is similar to the various instruments that have already been verified to be valid, repeatable, and reliable.

There are potential advantages and disadvantages associated with each of these instruments [18], and there is an ongoing process whereby evidence is collected to support their use under various conditions. The usefulness of an outcome instrument is never completely established. There is currently an urgent need for scientific evaluation of foot and ankle surgery, which in turn requires the use of appropriate (patient-based) standard methods of outcome assessment. In this context, the Japanese Society for Surgery of the Foot (JSSF) is developing a QOL questionnaire for use in individuals with pathological conditions related to the foot and ankle as a region-specific outcome instrument.

The present field test of the second version of the SAFE-Q replicated the factor structure of the same version of the SAFE-Q in its first field test (which had a smaller patient sample). The test-retest reliability was high for each of the subscales and for the sum of all subscale scores. Gender-related differences, observed in particular for the Shoe-Related subscale and Physical Functioning and Daily Living subscale, might reflect the well-known foot-health consequences of women wearing high-heeled footwear and women’s more fashion-oriented attitude towards shoes. Age-related differences might reflect a general decline in overall health and physical vigor, as well as a generally reduced ability to recover quickly from health-related problems.

The differences between patient groups were also statistically significant according to ANOVA. In particular, patients with rheumatoid arthritis appeared to fare more poorly than patients in other region-specific categories (Fig. 3). Nevertheless, the averages for the patient groups fell in a relatively narrow range, indicating that the SAFE-Q labels are sufficiently similar to allow their use in all patient groups.

As expected, the SAFE-Q readily distinguished patients with foot and ankle disorders from non-patients. The mean scores on the subscales range between 60 and 75, which may lead to concern over the sensitivity or dynamic range of the QOL instrument in these patients. However, the distribution of JSSF scores observed in the patients implies that most of the patients did not have severe symptoms (Fig. 2). This is a plausible reason for the narrow range of mean values observed in the present field test.

Given the large sample size, the coefficients for the correlation of the SAFE-Q subscale scores with the JSSF Scale score were all statistically significantly greater than zero. Likewise, the coefficients of the correlation of the SAFE-Q subscale scores and SF-36 subscale scores were statistically significantly greater than zero for the same reason. Nevertheless, there was a qualitative alignment of the two QOL scales when the correlation coefficient values were examined. The lack of perfect alignment indicates that the SAFE-Q constructs measured in these patients are distinct from those measured by the corresponding subscales in the more general SF-36 instrument. It does appear, however, that the scores obtained using both instruments are largely driven by pain and difficulty with mobility in these patients.

The nine items of the Sports Activity subscale of the SAFE-Q consist of questions about very basic performance of sports activities [8, 19, 20]. Regarding the Sports Activity subscale, the unidimensionality of the items remained stable and the difference between patients and non-patients was apparent. In addition, the test-retest reliability was adequate. Therefore, we will add these nine items to the responsiveness analysis without changing them.

As reviewed by Martin and Irrgang [18], validity testing of QOL outcome instruments should include assessments of content validity, construct validity, test-retest reliability, and responsiveness. In our process, content validity was confirmed for the SAFE-Q version 1 [1] and version 2 [2] through the various Cronbach α metrics. Regarding construct validity, we ascertained convergence by comparing the SAFE-Q subscales with the JSSF scales and SF-36 subscales. We also studied the convergence and divergence [21] by evaluating the results from CFA. That is, we observed that the factor loading of each questionnaire item was large for the intended subscale and small for the other subscales in the previous field study, and similar results were seen in the present study.

As described above, we were able to verify that the test-retest reliability was high for each subscale. The comparison of Spearman rank correlation coefficients shown in Table 6 suggests that the Pain and Pain-Related subscale is more responsive than the SF-36 Bodily Pain subscale. However, there is no other clear standard that could be used to gauge the responsiveness of the other subscales. Additionally, the responsiveness should be evaluated by performing a longitudinal study. In the future, it will be beneficial to test the responsiveness of the present outcome instrument.

Acknowledgments

This study was supported by grants from the JSSF (Japanese Society for Surgery of the Foot) and JOA (Japanese Orthopaedic Association). In addition, the authors would like to thank all of the orthopedic surgeons who collaborated with the field survey. We declare that we have no conflict of interest regarding the present manuscript.

Appendix 1

(SAFE-Q version 2 questionnaire form; image not reproduced)

Appendix 2

(Manual for the SAFE-Q; image not reproduced)

Footnotes

Naoki Haraguchi is a member of the Clinical Outcomes Committee of the Japanese Orthopaedic Association (JOA). All of the authors belong to The Clinical Outcomes Committee of the Japanese Society for Surgery of the Foot (JSSF). The authors prepared the article on behalf of those committees.

References

1. Niki H, Tatsunami S, Haraguchi N, Aoki T, Okuda R, Suda Y, Takao M, Tanaka Y. Development of the patient-based outcome instrument for the foot and ankle. Part 1: project description and evaluation of the outcome instrument version 1. J Orthop Sci. 2011;16:536–555. doi: 10.1007/s00776-011-0130-7.
2. Niki H, Tatsunami S, Haraguchi N, Aoki T, Okuda R, Suda Y, Takao M, Tanaka Y. Development of the patient-based outcome instrument for the foot and ankle. Part 2: results from the second field survey: validity of the outcome instrument for the foot and ankle version 2. J Orthop Sci. 2011;16:556–564. doi: 10.1007/s00776-011-0131-6.
3. Fukuhara S, Suzukamo Y. Manual of SF-36v2 (Japanese version). Kyoto: Institute for Health Outcomes and Process Evaluation Research; 2004.
4. Niki H, Aoki H, Inokuchi S, Ozeki S, Kinoshita M, Kura H, Tanaka Y, Noguchi M, Nomura S, Hatori M, Tatsunami S. Development and reliability of a standard rating system for outcome measurement of foot and ankle disorders I: development of standard rating system. J Orthop Sci. 2005;10:457–465. doi: 10.1007/s00776-005-0936-2.
5. Niki H, Aoki H, Inokuchi S, Ozeki S, Kinoshita M, Kura H, Tanaka Y, Noguchi M, Nomura S, Hatori M, Tatsunami S. Development and reliability of a standard rating system for outcome measurement of foot and ankle disorders II: interclinician and intraclinician reliability and validity of the newly established standard rating scales and Japanese Orthopaedic Association rating scale. J Orthop Sci. 2005;10:466–474. doi: 10.1007/s00776-005-0937-1.
6. Park HM. Evaluating interrater agreement with intraclass correlation coefficient in SPICE-based software process assessment. In: IEEE, editor. Proceedings of the Third International Conference on Quality Software. Washington, DC: IEEE; 2003. p. 308–14.
7. Johanson NA, Liang MH, Daltroy L, Rudicel S, Richmond J. American Academy of Orthopaedic Surgeons lower limb outcomes assessment instruments: reliability, validity, and sensitivity to change. J Bone Jt Surg Am. 2004;86:902–909. doi: 10.2106/00004623-200405000-00003.
8. Martin RL, Irrgang JJ, Burdett RG, Conti SF, Van Swearingen JM. Evidence of validity for the foot and ankle ability measure (FAAM). Foot Ankle Int. 2005;26:968–983. doi: 10.1177/107110070502601113.
9. Bennett PJ, Patterson C, Wearing S, Baglioni T. Development and validation of a questionnaire designed to measure foot-health status. J Am Podiatr Med Assoc. 1998;88:419–428. doi: 10.7547/87507315-88-9-419.
10. Budiman-Mak E, Conrad KJ, Roach KE. The foot function index: a measure of foot pain and disability. J Clin Epidemiol. 1991;44:561–570. doi: 10.1016/0895-4356(91)90220-4.
11. Kitaoka HB, Alexander IJ, Adelaar RS, Nunley JA, Myerson MS, Sanders M. Clinical rating systems for ankle-hindfoot, midfoot, hallux and lesser toes. Foot Ankle Int. 1994;15:349–353. doi: 10.1177/107110079401500701.
12. Ware JE, Sherbourne CD. 36-Item Short-Form Health Survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:474–483.
13. EuroQol Group. EuroQol; a new facility for the measurement of health-related quality of life. Health Policy. 1990;16:199–208. doi: 10.1016/0168-8510(90)90421-9.
14. Dawson J, Boller I, Doll H, Lavis G, Sharp R, Cooke P, Jenkinson C. Responsiveness of the Manchester-Oxford foot questionnaire (MOXFQ) compared with AOFAS, SF-36 and EQ-5D assessments following foot or ankle surgery. J Bone Jt Surg Br. 2012;94:215–221. doi: 10.1302/0301-620X.94B2.27634.
15. Garrow AP, Papageorgiou AC, Silman AJ, Thomas E, Jayson MI, Macfarlane GJ. Development and validation of a questionnaire to assess disabling foot pain. Pain. 2000;85:107–113. doi: 10.1016/S0304-3959(99)00263-8.
16. Dawson J, Coffey J, Doll H, Lavis G, Cooke P, Herron M, Jenkinson C. A patient-based questionnaire to assess outcomes of foot surgery: validation in the context of surgery for hallux valgus. Qual Life Res. 2006;15:1211–1222. doi: 10.1007/s11136-006-0061-5.
17. Dawson J, Doll H, Coffey J, Jenkinson C. Responsiveness and minimally important change for the Manchester-Oxford foot questionnaire (MOXFQ) compared with AOFAS and SF-36 assessments following surgery for hallux valgus. Osteoarthr Cartil. 2007;15:918–931. doi: 10.1016/j.joca.2007.02.003.
18. Martin RL, Irrgang JJ. A survey of self-reported outcome instruments for the foot and ankle. J Orthop Sports Phys Ther. 2007;37:72–84. doi: 10.2519/jospt.2007.2403.
19. Seligson D, Gassman J, Pope M. Ankle instability: evaluation of the lateral ligaments. Am J Sports Med. 1980;8:39–42. doi: 10.1177/036354658000800107.
20. Williams GN, Molloy JM, DeBerardino TM, Arciero RA, Taylor DC. Evaluation of the sports ankle rating system in young, athletic, individuals with acute lateral ankle sprains. Foot Ankle Int. 2003;24:274–282. doi: 10.1177/107110070302400314.
21. Martin RL, Irrgang JJ, Lalonde KA, Conti S. Current concept review: foot and ankle outcome instruments. Foot Ankle Int. 2006;27:383–390. doi: 10.1177/107110070602700514.
