European Journal of Physical and Rehabilitation Medicine. 2024 Feb 1;60(2):257–269. doi: 10.23736/S1973-9087.24.08095-X

Walking test outcomes in adults with genetic neuromuscular diseases: a systematic literature review of their measurement properties

Nawale HADOUIRI 1, 2, 3,*, Isabelle FOURNEL 4, 5, Christel THAUVIN-ROBINET 2, 6, 7, Agnès JACQUIN-PIQUES 8, Paul ORNETTI 9, 10, Mathieu GUEUGNON 3, 10
PMCID: PMC11114158  PMID: 38300152

Abstract

INTRODUCTION

Neuromuscular diseases (NMDs) include a large group of heterogeneous diseases. NMDs frequently involve gait disorders, which affect quality of life. Several walking tests and tools have been described in the literature, but there is no consensus regarding the use of walking tests and tools in NMDs or of their measurement properties for walking outcomes. The aim of this review is to present an overview of walking tests, including their measurement properties when used in adults with inherited or genetic NMDs. The aim is to help clinicians and researchers choose the most appropriate test for their objective.

EVIDENCE ACQUISITION

A systematic review was conducted after consulting the MEDLINE (via PubMed), EMBASE, ScienceDirect, Google Scholar and Cochrane Central Register of Controlled Trials databases for published studies in which walking outcome measurement properties were assessed. The validity, reliability, measurement error and responsiveness properties were evaluated in terms of statistical methods and methodological design qualities using the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) guidelines.

EVIDENCE SYNTHESIS

We included 46 studies in NMDs. These studies included 15 different walking tests and a wide variety of walking outcomes, assessed with six types of walking tools. Overall, the 6MWT was the most studied test in terms of measurement properties. The methodological design and statistical methods of most studies evaluating construct validity, reliability and measurement error were “very good.” The majority of outcome measurements were valid and reliable. However, studies on responsiveness as minimal important difference or minimal important change were lacking or were found to have inadequate methodological and statistical methods according to the COSMIN guidelines.

CONCLUSIONS

Most walking outcomes were found to be valid and reliable in NMDs. However, in view of the growing number of clinical trials, further studies are needed to clarify additional measurement properties.

Key words: Neuromuscular diseases, Walking, Charcot-Marie-Tooth disease

Introduction

Neuromuscular diseases (NMDs) constitute a large group of rare diseases1 that are mainly of genetic origin and whose prevalence is often below 1 case per 2000 people.1 To date, more than 600 genes are known to be implicated.2 The most frequently encountered NMDs in adulthood are Charcot-Marie-Tooth (CMT) disease (prevalence ranging from 9.7 to 82.3 per 100,000 persons in the Caucasian population),3 myotonic dystrophy type 1 (DM1) and type 2 (DM2) (prevalence ranging from 5 to 20 per 100,000 persons)4 and facioscapulohumeral muscular dystrophy (FSHD) (prevalence ranging from 4 to 10 per 100,000 persons).5 All of these NMDs have a common deficiency according to the International Classification of Functioning (ICF),6 which is muscle weakness in the upper and/or lower limbs secondary to a defect of the motor unit.7 The affected muscle topography depends on the type of NMD (for example, CMT patients experience distal upper limb and proximal and distal lower limb muscle weakness and atrophy).8 Gait disorders, which are one of the main disabilities that result from lower limb muscle weakness, have a considerable impact on quality of life.9 This makes gait a relevant endpoint for the functional evaluation of NMD patients. Few interventional trials have assessed walking improvement in inherited NMDs compared to other neurological diseases. However, in recent years, advances in the diagnosis of NMDs have raised considerable hope for gene therapy, which increasingly opens the way to more ambitious clinical trials.10

It is important for clinicians and researchers to use valid, reliable, feasible, and responsive walking tests to assess walking disability, changes over time, and the effectiveness of interventions. In the clinical trials that have been published so far, various walking tests and tools have been used to assess walking disabilities in individuals with NMDs. Many walking tests are classified as short tests. For example, the 10-m test is a simple test used to measure locomotor capacity in clinical and research settings, in which the time taken to complete the test or the mean velocity is assessed. On the other hand, there are also prolonged tests. For instance, in the 2-Minute Walk Test (2MWT) and the 6-Minute Walk Test (6MWT), individuals are instructed to walk as far as possible in 2 and 6 minutes, respectively. These prolonged tests, which are sub-maximal exercise tests, are used to assess aerobic capacity and endurance, with more ecological properties (i.e., they give a better picture of everyday life) than short tests. Indeed, because most activities of daily living are performed at submaximal levels of exertion, prolonged tests like the 6MWT may better reflect the functional exercise level and motor performance for daily physical activities.11 However, these walking tests and tools only evaluate walking capacities in the strict sense (i.e., what people can do in a standardized and controlled environment). There has been a recent interest in developing home-based monitoring devices that could be used in future therapeutic trials in NMDs to assess walking capability (i.e., what people can do in their daily environment) and motor performance (i.e., what a person actually does in his/her daily environment).12

In interventional trials in adults, the 6MWT seems to be the most frequently used walking test in NMDs such as CMT,13, 14 DM115, 16 and DM2,17 late-onset Pompe disease (LOPD),18, 19 FSHD20-22 and other muscular dystrophies.23 In these studies, the 6MWT distance was the only recorded walking assessment variable for analyzing changes in walking capacity after a medical or physical intervention. Walking speed was also measured in individuals with NMDs with the 10-m walking test performed at a comfortable and/or fast speed14, 18, 24-28 or with the 2MWT.29 A simple stopwatch was used to assess walking speed.14, 15, 18, 27, 28, 30 However, complex tools such as a 3D motion analysis system can more thoroughly evaluate the progression of walking capacities, e.g., in CMT31-33 or in hereditary spastic paraplegia (HSP),26, 34, 35 through kinetic, kinematic and electromyographic (EMG) data acquisition.31-35

While many walking tests and assessment tools are currently available, only some of them have been assessed for their measurement properties. To the best of our knowledge, no review on this subject has been published so far. However, a complete overview of walking outcome measurement properties in NMDs would be useful for clinicians and researchers when choosing the most appropriate walking test for their specific objective (e.g., to screen pertinent walking outcomes for future clinical trials, or as robust easy-to-use clinical measures). The aim of the present review was to provide an overview of walking test outcomes in adults with genetic NMDs and to provide information concerning the measurement properties of the walking tests in terms of validity, reliability, measurement error and responsiveness.

Our hypotheses were that: 1) the measurement properties of the studied walking tests might depend on the subtype of NMD; and 2) the 6MWT, which measures the total distance walked during the test, would be the most studied test in terms of measurement properties considering that it is frequently used in clinical trials for patients with NMDs.

Evidence acquisition

This systematic review was conducted according to PRISMA guidelines (i.e., Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA Statement).36 The protocol was submitted a priori to the PROSPERO registry (registration number CRD422022366521) on October 11, 2022. Each step was carried out independently by two investigators (N.H., M.G.) and then compared. In case of a disagreement, a third investigator settled the case (P.O.).

Search strategy

The MEDLINE (via PubMed), EMBASE, ScienceDirect, Google Scholar and Cochrane Central Register of Controlled Trials databases were queried until November 30th, 2022 with MeSH terms and other free words divided into three components:

  • the studied diseases (e.g., (“Neuromuscular Diseases”[Mesh]) OR (“Muscular Dystrophy, Duchenne”[Mesh]) OR (“Charcot-Marie-Tooth Disease”[Mesh]) etc.),

  • the studied outcome (“gait”[MeSH] and synonyms such as “walking”[MeSH]),

  • a search for measurement properties (with terms such as “Validation Study” [Publication Type] OR “Reproducibility of Results”[Mesh] etc.). The complete search formulation is available in Supplementary Digital Material 1 (Supplementary Text File 1).

The search was limited to studies published in French or English to prevent translation errors. The bibliographies of the included studies were also checked for additional eligible studies. There were no restrictions for the publication date.

Selection of studies

We excluded non-original articles or articles not published in peer-reviewed journals, such as theses, protocol studies, conference proceedings, letters to the editor, case reports, editorials and systematic reviews with or without meta-analysis. Only studies with full text available were included. Simulation studies were excluded.

We included studies that evaluated walking capacities with measurement of their properties and quality of outcome measures (with a focus on validity, reliability, responsiveness and measurement error, because these are the main metrological properties studied) in adults with genetic NMDs. Studies with subjective evaluation of walking capacities (i.e., only with the use of questionnaires/self-reporting, clinical walking descriptions etc.) were excluded. All study designs were included (i.e., interventional studies, cohort with persons with one type of NMD, persons with NMD versus healthy persons, several subtypes of NMDs, etc.).

Only studies in adults with inherited NMDs were included. We excluded studies concerning non-inherited NMDs such as poliomyelitis or inclusion body myositis. Studies conducted only in children or in animal models were also excluded.

After removing duplicates, the titles and then the abstracts of the articles identified in the database search were analyzed for eligibility based on the previously cited inclusion criteria.

Data extraction and quality assessment

Data including study design, sample size, characteristics of participants (type of NMDs, age of participants, presence of a healthy group), the type of walking test and tools, the walking variables assessed, the presence of a physical examination of participants (specifying the type of lower limb physical examination and/or the functional scale used) were extracted by one reviewer (N.H.).

Concerning the data for the main outcome(s) (measurement properties of walking tests), data were extracted in accordance with the methodological guidelines of the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) checklist.37 The COSMIN checklist consists of ten sub-checklists for different measurement properties (patient-reported outcome measures (PROM) development, internal consistency, reliability, measurement error, content validity, structural validity, hypothesis testing for construct validity, cross-cultural validity, criterion validity and responsiveness). In the present COSMIN review, we studied four types of measurement properties for walking outcomes:

  • validity of outcomes such as construct validity (i.e., if the studied test or measure accurately assesses what it is supposed to), face validity (i.e., if the content of the test appears to be suitable to its aims), and criterion validity (i.e., if a gold standard is presented; how a test measures the outcome it was designed to measure);

  • reliability (i.e., the fact that an instrument gives the same measure each time it is used), including test-retest reliability, intra-rater reliability, inter-rater reliability;

  • measurement error;

  • responsiveness of walking outcome measures.

If presented in the selected articles, feasibility data were described rather than evaluated, as these are not formal measurement properties according to step 8 of the COSMIN guidelines (i.e., “they do not provide to us something about the quality of an instrument, but they are important aspects that have to be considered in the selection of the most suitable instrument for a specific aim”).38

The COSMIN checklist was specifically designed to determine methodological quality scores and to assess the risk of bias for the result of each study through the evaluation of design aspects and statistical methods.38 It was adapted for use on “walking tests and tools” by replacing “health-related patient-reported outcomes” (PROM) with “measurement instruments”; this was done in previous COSMIN reviews concerning the assessment of measurement properties of walking outcomes in other diseases.39-41 Here, only walking outcome measurement properties were extracted. If there were numerous walking outcomes in a given study, notably for spatiotemporal parameters, we chose to extract only the main gait parameters, namely velocity (with or without cadence and stride length) and stability parameters of the general walking pattern (e.g., double support time, stride width or support base), rather than all spatiotemporal parameters (e.g., swing phase). If available in motion analysis studies, we extracted a synthesis of kinematic and kinetic data. Concerning construct validity, we extracted correlation data between walking variables and quantitative physical examination variables (e.g., clinical reference scale, measurement of lower limb muscle strength with isokinetic devices or clinical scale), and/or other walking tests, but not with clinical data such as age or disease duration, or other variables such as quality of life.

First, scores (i.e., very good, adequate, doubtful, inadequate) were assigned to evaluate the methodological quality of each measurement property using the “worst score counts method.”42 Then, the result of each study for a measurement property was rated against the updated criteria for good measurement properties, based on the adapted quality tool from Prinsen et al.38 and Terwee et al.43 Each result for each measurement property was rated as either positive (+), indeterminate (?), or negative (-), as defined in Supplementary Digital Material 2 (Supplementary Table I). To qualitatively rate the results as sufficient (or insufficient), in principle, 75% of the results should meet the criteria.38 As recommended by Prinsen et al.38 and Terwee et al.,43 the reviewers made hypotheses to evaluate construct validity and responsiveness based on De Vet et al.44 Correlations with (changes in) instruments measuring similar constructs should be ≥0.50, and for responsiveness, the area under the curve (AUC) should be ≥0.70. Results with a correlation coefficient below 0.50 or an AUC <0.70 were qualitatively summarized as insufficient. However, if it was impossible to formulate a pertinent hypothesis for a specific analysis, the results were qualitatively summarized as indeterminate. Two independent reviewers (N.H. and M.G.) extracted the outcome data and assessed the methodological quality. If disagreement persisted after discussion, a third reviewer (P.O.) was consulted.
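The rating rules described above can be illustrated with a minimal Python sketch. It is not the authors' procedure or an official COSMIN tool: the function names are hypothetical, the use of the absolute value of the correlation (so that strong inverse correlations also meet the 0.50 criterion) is an assumption, and the "inconsistent" label for pooled results that reach neither 75% threshold is borrowed from general COSMIN usage rather than from this review.

```python
from typing import Iterable, List, Optional

# Methodological quality levels, ordered from best to worst.
QUALITY_ORDER: List[str] = ["very good", "adequate", "doubtful", "inadequate"]


def worst_score_counts(item_scores: Iterable[str]) -> str:
    """Overall quality of a study on one property = lowest item rating."""
    scores = list(item_scores)
    if not scores:
        raise ValueError("at least one item score is required")
    # The item with the highest index in QUALITY_ORDER is the worst one.
    return max(scores, key=QUALITY_ORDER.index)


def rate_construct_validity(correlation: Optional[float]) -> str:
    """Rate one correlation against the >= 0.50 hypothesis.

    Assumption: the sign is ignored, so an inverse correlation of -0.78
    is treated like +0.78. None means no pertinent hypothesis existed.
    """
    if correlation is None:
        return "?"
    return "+" if abs(correlation) >= 0.50 else "-"


def rate_responsiveness(auc: Optional[float]) -> str:
    """Rate one responsiveness result against the AUC >= 0.70 criterion."""
    if auc is None:
        return "?"
    return "+" if auc >= 0.70 else "-"


def summarize(ratings: Iterable[str], threshold: float = 0.75) -> str:
    """Pool individual ratings using the 75% rule."""
    rated = [r for r in ratings if r in ("+", "-")]
    if not rated:
        return "indeterminate"
    share_positive = rated.count("+") / len(rated)
    if share_positive >= threshold:
        return "sufficient"
    if 1 - share_positive >= threshold:
        return "insufficient"
    return "inconsistent"  # assumed label, not defined in this review
```

For example, `worst_score_counts(["very good", "adequate", "doubtful"])` returns `"doubtful"`, and `summarize(["+", "+", "+", "-"])` returns `"sufficient"` because exactly 75% of the rated results are positive.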

Evidence synthesis

Study characteristics

Figure 1 shows the study selection process. Overall, 46 studies were included for data analysis after the systematic review, including articles identified in the references of the included articles.

Figure 1. Flow chart of the present systematic review.

The main characteristics of the included studies are summarized in Supplementary Digital Material 3 (Supplementary Table II),24, 45-89 including the various study designs. Concerning the assessed NMDs, eight articles focused on CMT,46-53 11 on DM1,54-64 one on DM1 and DM2,65 one on DM2,66 five on FSHD,24, 67-70 one on FSHD and limb girdle muscle dystrophy (LGMD) type 2,71 one on HSP,72 two on LOPD,73, 74 one on several muscular dystrophies (DMD, BMD, LGMD and FSHD),75 three on people with several types of inherited NMD,45, 76, 77 one on primary mitochondrial myopathy,78 one on spinal bulbar muscular atrophy (SBMA),79 and 10 on 5q spinal muscular atrophy (5q SMA) types 3 or 4.80-89 Sample sizes varied from four to 479 participants, and 32 studies did not include a healthy control group.

The 46 included studies used 15 different walking tests. The 6-Minute Walking Test (6MWT) was used in 25 studies.45, 49, 52-54, 56-58, 61, 63, 64, 66, 68, 73, 74, 76-80, 83-85, 87-89 The included studies reported most walking tests categorized according to the ICF as “walking a short distance” conditions:6 1) to measure walking velocity, such as the 10 Meter Walk Test (10MWT)24, 46, 49, 51, 53, 57-60, 63, 64, 74, 75, 89 or 10 Meter Run Test,56, 57, 63, 83, 86 and the 30 Foot Go Test;68 2) to evaluate walking around obstacles or dynamic balance stance, such as the Timed Up and Go Test (TUG),56, 58-62, 64, 68-70, 74, 78, 83, 86 Step Test,58 Flight of Eight (Fo8),59 or Stance Tandem (ST);60, 77 3) to assess walking distance with walking tests or protocols using more time to measure walking distance, such as the 2-Minute Walk Test (2MWT),45, 77 6MWT,45, 49, 52-54, 56-58, 61, 63, 64, 66, 68, 73, 74, 76-80, 83-85, 87-89 or Endurance Shuttle Walking test (ESWT);81, 82 and 4) to measure walking without a particular distance or time to accomplish,47, 48 such as walking assessments with motion analysis system47, 51, 55, 67, 72 or with a portable pressure sensitive walkway.48, 65, 83, 88, 89 Only one study could be categorized as “walking a long distance” assessment conditions according to the ICF,6 which included real-life assessment conditions using wearable magneto-inertial sensors.71 Several types of walking assessment tools are listed, including easy-to-use tools such as a stopwatch,45, 46, 52, 54, 56-60, 62, 64, 66, 73-82, 84-87 and other more technical tools such as a GaitRite walkway,48, 65, 83, 87 activity monitoring watches,49, 53 a motion analysis system,47, 50, 51, 55, 67, 72 a baropodometric platform,24 or wearable magneto-inertial sensors.69, 71, 89

Measurement properties

For clarity and readability, the methodological design, statistical methods and statistical outcomes for the variables of interest in the 46 included studies have been presented separately for CMT (Supplementary Digital Material 4: Supplementary Table III, IV, V),46-53 DM1 and DM2 (Supplementary Digital Material 5: Supplementary Table VI, VII, VIII, IX),54-66 and grouped together for the other NMDs (Supplementary Digital Material 6: Supplementary Table X, XI, XII, XIII).24, 45, 67-89 The two reviewers had moderate agreement (absolute agreement = 0.69 and κ = 0.59, 95% CI 0.49-0.68) in the assessment of methodological quality using the COSMIN checklist. After discussion, the third reviewer was not consulted for any study. Construct validity and reliability (and especially test-retest reliability) were the most evaluated measurement properties in the included studies. Specifically, reliability was evaluated in 19 studies, measurement error in eight studies, responsiveness in 16 studies, construct validity in 35 studies, criterion validity in three studies, content validity in seven studies, convergent validity in one study and discriminative validity in one study.
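As an illustration of how absolute agreement and κ between two reviewers can be obtained, here is a minimal Python sketch. It assumes an unweighted Cohen's kappa on categorical quality ratings; the review does not state whether a weighted or unweighted coefficient was used, and the function name and example ratings are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def agreement_and_kappa(rater_a: Sequence[str],
                        rater_b: Sequence[str]) -> Tuple[float, float]:
    """Return (absolute agreement, unweighted Cohen's kappa) for two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical ratings for four checklist items (not data from the review).
reviewer_1 = ["very good", "very good", "adequate", "doubtful"]
reviewer_2 = ["very good", "adequate", "adequate", "doubtful"]
agreement, kappa = agreement_and_kappa(reviewer_1, reviewer_2)
```

Kappa is lower than raw agreement because it discounts the agreement expected by chance, which is why both values are usually reported together.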

CMT

In CMT (Supplementary Digital Material 4), the methodological quality of the studies examining construct validity was “very good” for all walking tests, and, according to our hypotheses, hypothesis testing for construct validity was mostly “sufficient.” The main exceptions were the correlations between the CMT neuropathy score (CMTNS) and the 6MWD, the 10MWT time and several outputs of the five-day monitoring, such as an activity index, for which ρ<0.5 (Supplementary Digital Material 4). Construct validity was particularly good for velocity assessed during the 10MWT, which was strongly inversely correlated with the CMTNS (correlation coefficient ρ=-0.783).

The methodological quality of the three studies examining reliability was “adequate” (e.g., when the model or formula of the ICC was not described) to “very good” for the 10MWT,46 6MWT49 and kinetic variables,47 with “positive” quality criteria ratings for all these studies and excellent intraclass correlation coefficients (ICCs) (>0.9).
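Because the ICC model matters for the quality rating, the following Python sketch shows one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement), computed from the mean squares of a two-way ANOVA. This choice of model is an assumption made for illustration; the included studies may have used other ICC forms, and the function name is hypothetical.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores[i][j] is the value for participant i in session (or rater) j.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants and >= 2 complete sessions")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants (rows), sessions (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator
```

With this absolute-agreement form, a systematic shift between sessions (for example, every participant walking slightly further at retest) lowers the ICC even when the ranking of participants is unchanged.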

Only one study analyzed measurement error for walking assessment (concerning kinetic variables with a standard error of measurement (SEM) of velocity and double support time of respectively 3.67 s and 1.11% of gait cycle, and SEM of Mean ankle angle Toe Walking - mean ankle angle High Walking of 2.7°) with “very good” methodological quality, but the quality criteria rating for hypothesis testing was “indeterminate.”47

Two studies evaluated the responsiveness of the 6MWT, the 10MWT, several outputs from a StepWatch Activity Monitor,53 and velocity acquired with a motion analysis system.50 Only one study had “very good” methodological quality, with the calculation of the SRM of velocity (assessed with a motion analysis system) of -0.55% body height/s, and a “positive” quality criteria rating50 according to COSMIN guidelines.
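Assuming that SRM here denotes the standardized response mean, it is conventionally computed as the mean change between two assessments divided by the standard deviation of that change. The Python sketch below illustrates this definition; the function name and the example values are hypothetical and are not taken from the cited study.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(baseline: Sequence[float],
                               follow_up: Sequence[float]) -> float:
    """SRM = mean of individual changes / sample SD of individual changes."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired measurements for at least 2 participants")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    sd_change = stdev(changes)
    if sd_change == 0:
        raise ValueError("SRM is undefined when all changes are identical")
    return mean(changes) / sd_change


# Hypothetical walking velocities (arbitrary units) at baseline and follow-up.
srm = standardized_response_mean([10, 12, 11, 9], [9, 10, 11, 8])
```

A negative SRM, as in this example, indicates that the outcome decreased on average between the two assessments.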

DM1 and DM2

In DM1 (Supplementary Digital Material 5), the methodological quality of the studies examining content and construct validity of the 6MWT was “very good” for all walking tests, and according to our hypotheses, hypothesis testing for construct validity was mostly “sufficient.” The best construct validity was observed for the 6-Minute Walking Distance (6MWD) and the 10MWT in running conditions (10MW/RT), which were moderately correlated with the Scale for the Assessment and Rating of Ataxia (SARA) rating (respectively ρ=0.65 and ρ=0.55). The 10MWT in comfortable conditions was strongly inversely correlated with total lower limb muscle strength (ρ=-0.705), the TUG was moderately inversely correlated with the MMT of the trunk muscle group (ρ=-0.58), and ST and velocity were strongly inversely correlated with total lower limb muscle strength (ρ=-0.705).

The methodological quality of the five studies examining test-retest reliability was “very good” for the 10MWT,59 6MWT,54 TUG,59 ST59 and Fo8.59 The studies which assessed intra-rater reliability of the 10MWT,56-58 6MWT,56 TUG,56, 58 ST58 had “adequate” to “very good” methodological quality. All these reliability studies had a “positive” quality criteria rating and excellent ICCs (>0.9), except for the intra-rater reliability used during the 10MWT (ICC=0.86 [95% CI 0.74-0.93]) and the intra-rater reliability (1 to 2 weeks) of the TUG time (ICC=0.68 [0.54-0.79]).

Three studies analyzed measurement error of the 6MWT,54 the 10MWT,59 the TUG,58, 59 the Fo8 and ST59 with “very good” methodological quality (with the calculation of SEM with or without measurement error (ME)). However, the quality criteria for hypotheses testing was “indeterminate.”
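For readers less familiar with these indices, the sketch below shows one widely used way to derive the SEM from a reliability coefficient and to convert it into a minimal detectable change (MDC). It assumes the formulas SEM = SD × √(1 − ICC) and MDC = z × √2 × SEM; the included studies may have computed the SEM differently (for example, from ANOVA error variance), and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the between-subject SD and the ICC."""
    if sd < 0 or not 0 <= icc <= 1:
        raise ValueError("sd must be >= 0 and icc must lie in [0, 1]")
    return sd * math.sqrt(1 - icc)


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC at the confidence level implied by z (1.96 for 95%, 1.645 for 90%)."""
    if sem < 0:
        raise ValueError("sem must be >= 0")
    # sqrt(2) accounts for measurement error in both test and retest.
    return z * math.sqrt(2) * sem


# Hypothetical example: SD of 10 units and ICC of 0.91.
sem = sem_from_icc(sd=10.0, icc=0.91)
mdc95 = minimal_detectable_change(sem)            # 95% confidence
mdc90 = minimal_detectable_change(sem, z=1.645)   # 90% confidence
```

An observed change smaller than the MDC cannot be distinguished from measurement error, which is why the MDC is often reported alongside the SEM.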

Three studies evaluated the responsiveness of velocity during the 10MWT and 10MW/RT,57, 59, 60 6MWT,63 TUG,62, 64 and ST.64 According to the COSMIN guidelines, the methodological quality was “inadequate” except for two studies that assessed the area under the curve (AUC).62, 64 Only one study had a “positive” quality criteria rating, for the TUG using a criterion approach, with an AUC above the COSMIN threshold of 0.7 (AUC=0.8 [95% CI 0.7-0.9]).62 According to COSMIN guidelines, the responsiveness assessments of the 10MWT at comfortable velocity, the TUG and the ST were considered “very good” in terms of methodological quality (Supplementary Digital Material 6).
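The AUC used in a criterion approach to responsiveness can be read as the probability that a randomly chosen participant classified as changed by the external criterion has a larger change score than a randomly chosen participant classified as unchanged. The Python sketch below computes it by direct pairwise comparison; the function name and data are hypothetical, and change scores are assumed to be oriented so that larger values mean more change in the expected direction (for a timed test such as the TUG, this would mean using the reduction in time).

```python
from typing import Sequence


def auc_from_change_scores(changed: Sequence[float],
                           unchanged: Sequence[float]) -> float:
    """Area under the ROC curve from change scores of two criterion groups.

    Equivalent to the Mann-Whitney probability: each (changed, unchanged)
    pair counts 1 if the changed participant scores higher, 0.5 for a tie.
    """
    if not changed or not unchanged:
        raise ValueError("both groups must contain at least one participant")
    wins = 0.0
    for c in changed:
        for u in unchanged:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed) * len(unchanged))


# Hypothetical change scores (not data from the included studies).
auc = auc_from_change_scores(changed=[3, 4, 5], unchanged=[1, 2, 4])
meets_criterion = auc >= 0.70  # threshold used in this review
```

An AUC of 0.5 corresponds to no discrimination between the two groups, and 1.0 to perfect separation.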

Two studies mentioned some feasibility data for all participants using the 6MWT, 10MWT, and 10MW/RT.54, 57 All participants managed to complete at least one 6MWT as a whole or one trial of each walking test cited. However, this was not the case if the instruction was to perform two different 6MWT or to repeat several walking tests, with the main reported limitation being the generation of fatigue.51, 54

In DM2 (Supplementary Digital Material 5), there was only one specific study (concerning 6MWT measurement properties) with “very good” methodological quality according to the COSMIN guidelines concerning construct validity (weak correlation [ρ] between 6MWD and the sum of lower limb MMT ρ=0.492), and, according to our hypotheses, the hypotheses testing for construct validity were mostly “sufficient.” In this study, the methodological quality of the responsiveness evaluation of the 6MWD was also “inadequate” according to the COSMIN guidelines.66

Other NMDs

Concerning the other NMDs (Supplementary Digital Material 6): in FSHD, the methodological quality of the studies examining construct validity was “adequate” to “very good,” and according to our hypotheses, hypothesis testing for construct validity was mostly “sufficient.” The best construct validity was observed for the 95th centile length assessed in real life by a wearable device, which was strongly correlated with lower limb MMT (ρ=0.915, P<0.05).

The methodological quality of the two studies examining test-retest reliability was “adequate” to “very good” for the 6MWT68 and instrumented TUG (velocity and double support)69 with ICCs >0.9. We observed “very good” methodological quality for the study of responsiveness for the 6MWT, with the calculation of the minimal detectable change (MDC) with 95% confidence (MDC95=34.3 m).68 The methodological quality of the responsiveness of spatiotemporal variables in real life according to the COSMIN guidelines was considered “adequate”71 (Supplementary Digital Material 6).

In HSP, only one study, notably with a velocity assessment, was found to have “very good” methodological quality for the construct validity assessment, but a weak correlation (ρ=0.38) between walking velocity and Spastic Paraplegia Rating Scale, so the hypothesis testing for the construct validity was “indeterminate”72.

In LOPD, two studies with only people with LOPD were found to have “inadequate” methodological quality for the responsiveness assessment of the 6MWT, 10MWT and TUG,73, 74 and the quality criteria rating was “indeterminate.”

In SBMA, only one study using the 6MWT was found to have “very good” methodological quality for the test-retest reliability assessment (with ICC=0.982), with a “positive” quality criteria rating, and a “very good” methodological quality for the construct validity assessment. The best construct validity was observed for the 6MWD, which was moderately correlated with Limb Norris Score (ρ=0.632; P<0.001), but hypothesis testing for construct validity was “indeterminate.”79

In 5q SMA, the methodological quality of the study examining the criterion validity of several spatiotemporal parameters assessed by Solesound instrumented footwear (the gold standard was the assessment of these same spatiotemporal parameters by a GAITRite system) was “very good” with all correlation coefficients ρ > 0.9, and the criterion quality ratings were considered “positive” according to the COSMIN guidelines.89 The methodological quality of the studies examining construct validity was “very good.” The best construct validity was observed for the 6MWD, which was strongly correlated with the total leg strength measured with a manual muscle test (ρ=0.733). However, the hypothesis testing for construct validity was “indeterminate.” The methodological quality of the studies examining convergent, discriminative and face validity was “very good,” but the hypothesis testing for construct validity was “indeterminate” (Supplementary Digital Material 6). The methodological quality of the study examining the criterion validity (e.g., VO2peak, which was the gold standard and measured during 6MWT, TUG and 10-meter walk/run) was “very good” with the calculation of a correlation coefficient, but the quality criteria rating was considered “negative” according to the COSMIN guidelines (ρ=0.558, ICC<0.70).83

The methodological quality of the four studies examining test-retest reliability was “very good” for the 6MWT,83, 85 TUG86 and ESWT,81 with excellent ICCs from 0.85 to 0.992 and a “positive”81, 83, 85, 86 quality criteria rating. Two studies analyzed the responsiveness of the 6MWT with “very good” methodological quality (including calculation of MDC90 and minimal clinically important difference (MCID) with or without SEM), but the quality criteria ratings for hypothesis testing were “indeterminate”80, 83. Concerning the ESWT, some feasibility data were available, such as the drop-out rate of the ESWT (73.3% in SMA patients versus 0% in healthy controls) or the measurement completion of 100% for the ESWT.81, 82

In primary mitochondrial myopathy, one psychometric study had “very good” methodological quality for the responsiveness assessment (MCID of the 6MWD = 33.3 m), but the quality criteria rating was “indeterminate”78.

Concerning studies with several subtypes of NMDs, the methodological quality of the construct validity assessment was found to be “very good” in different studies in several NMDs with calculation of correlation coefficients. There was a strong correlation between real-life velocity and lower limb MMT (ρ=0.842)71 or a weak correlation between velocity in the 10MWT and knee extension isometric maximal voluntary contraction (ρ=0.484 with an “indeterminate” quality criteria rating), and even a very strong correlation between the velocity in the 2MWT and the 6MWT (ρ=0.99, P=0.001).75 The hypothesis testing for the construct validity was “positive” for the construct validity of the 2MWT compared to the 6MWT with “very good” methodological quality.45 One study analyzed the test-retest reliability properties of the 2MWT and 6MWT.77 It had “very good” methodological quality with the calculation of ICCs, and a “positive” criterion quality rating (ICCs of 0.99 in both the 2MWT and 6MWT and ICCs>0.90 in all NMD subgroups).77 This study also analyzed measurement error for the 2MWT and 6MWT with “very good” methodological quality (calculation of SEM, MDD95 or definition of Limits of Agreement), and the quality criteria for hypotheses testing was “sufficient”77. One study analyzed the responsiveness properties of the spatiotemporal parameters acquired in FSHD and LGMD2 in real life with “adequate” methodological quality according to the COSMIN guidelines.71

Discussion

In this systematic review of 46 studies, we present an overview of the properties for the measurement of walking tests and tools used in adults diagnosed with genetic NMDs, evaluated according to the COSMIN checklist.

There was a particular focus on three subtypes of NMD, namely myotonic dystrophies (DM1 and DM2) (13 studies),54-66, 90 5q SMA (10 studies)80-89 and CMT (eight studies).46-53 We can hypothesize that these diseases have been explored more than the others because of their greater prevalence. This is in accordance with the greater number of therapeutic trials compared to the other NMDs, which has resulted in advanced therapeutic strategies in 5q SMA for instance.91 In the present review, only five studies out of 46 were conducted in several NMD subtypes.45, 71, 75-77 There are probably more studies that assess the psychometric properties of walking tests and tools for a single NMD subtype because of the heterogeneous clinical presentation of NMDs.

Concerning the walking tests and tools studied in terms of measurement properties, our first hypothesis was that they might depend on the subtype of NMD. However, we observed a wide variety of measurement properties between the different studies for the same NMD but also between the different subtypes (Supplementary Table II). The same observation could be made for the use of clinical scales for impairment assessments or of the type of the lower limb physical examination, which were varied and not systematic (i.e., five of 44 studies without physical examination). To illustrate this, we can take the example of the myotonic dystrophies, which had the most articles in this review.54-66, 90 Six different walking tests were used (i.e., 6MWT, 10MWT with comfortable or rapid walking velocity, 10MW/RT, TUG, Step test and Fo8) with, for the most part, the use of a stopwatch (i.e., only one article with motion analysis system and one article with accelerometer; Supplementary Table I, VI, VII, VIII, IX). In these studies, physical examinations (i.e., of lower limb isometric muscle force or MRC lower limbs) with or without the use of a functional assessment scale (MIRS or SARA) were performed. In therapeutic trials on improving walking abilities in myotonic dystrophies, several walking tests and tools were used, such as the 6MWT,15, 17, 92 10MWT at a comfortable velocity93 or TUG, but there was no real justification for the choice of a specific test. Unfortunately, it is extremely difficult to compare the findings due to the discrepancies between studies. Depending on the subtype of NMD, the objective and the endpoint of a study, evaluation protocols should be more homogeneous and standardized.

Our second hypothesis was that the 6MWT would be the test for which measurement properties were the most studied. Indeed, the 6MWT was used in 26 of the 44 studies in our COSMIN review. Moreover, the 6MWT was the only walking test that was studied in all NMDs. The distance traveled during the 6MWT was the single variable assessed in terms of measurement properties. We found “very good” methodological quality and “positive” quality criteria rating for the reliability assessment of the 6MWT in several NMDs. Specific measurement error assessments of the 6MWT in the studies on DM1 were “very good” in terms of methodological quality, and “very good” in the methodological quality of validity (face, construct, criterion etc.). The responsiveness properties of the 6MWT were the least studied, with “very good” methodological quality of assessments in primary mitochondrial myopathy, 5q SMA and FSHD, but “inadequate” methodological quality in DM1, DM2, CMT and LOPD, which may limit its use, especially in clinical trials. Initially, the 6MWT was used in cardiorespiratory diseases94 and then gradually for the study of walking and endurance in neurological diseases including NMDs. According to the recommendations of the American Thoracic Society,11 the 6MWT consists of walking the greatest possible distance over a period of 6 minutes. The 6MWT seems to be the most evaluated walking test in terms of metrological properties and in clinical trials in NMDs, probably because this assessment of aerobic and exercise capacities reflects the muscle fatigue generated as well as cardiorespiratory and metabolic disabilities, making it possible to better understand the impact of the disease on the walking abilities of people with NMDs. However, the length of this test and the generated fatigue could limit the ability of some participants to complete the assessment,95 hence the need to develop walking tests that can be completed by all people with NMDs.

Furthermore, the same findings were obtained when we looked at walking tests other than the 6MWT, i.e., many studies evaluating reliability and validity measurement properties had “very good” methodological qualities. However, concerning reliability, only a few studies examined inter- or intra-rater reliability compared to the number of test–retest reliability evaluations. In clinical practice, these aspects are important because patients are often evaluated by different therapists or physicians over time, so the reliability of a test or tool is essential. Therefore, further studies on inter- or intra-rater reliability are necessary. The evaluation of validity was also uneven: construct validity was the most studied type of validity. For example, it was assessed by comparison with clinical scales or with measurements of muscle strength by physical examination or by isokinetic measures (Supplementary Digital Material 4, 5, 6). As expected, criterion validity was rarely assessed in the studies in our review because it seems to be difficult to find a gold standard for a 10-meter test or even a TUG. Content validity has been tested many times for several of the walking tests and tools in our review with very good results. This is encouraging for the use of these tests in clinical trials and in current practice, though they are not systematically used in the clinical follow-up of people with NMDs.

On the other hand, there were fewer studies focused on the measurement error or responsiveness assessment, which is often associated with inadequate methodological quality according to the COSMIN checklist for responsiveness properties. However, without prior adequate evaluation of a walking test’s sensitivity to change and, in particular, the determination of the MCID, it is difficult to conclude on the effectiveness or ineffectiveness of a treatment or of the low sensitivity to change of a walking test or variable in clinical trials (in NMDs or other diseases).96-98 In this promising new era of therapeutic development in neuromuscular diseases, it seems essential to conduct psychometric studies on the sensitivity to change of walking tests and measurement tools.

Overall, and even if further metrological studies are needed to define relevant outcomes for future clinical trials, the results of this COSMIN review have allowed us to draw up some recommendations for the use of tests and walking tools in the most frequent NMDs.

Concerning CMT disease, the 6MWT and 10MWT had a good inverse correlation for time. The correlation of the 6MWT with CMTNS differed between studies, but the correlation appeared to be greater when the CMTNS was higher.49, 52 Correlations between lower limb muscle strength and 6MWD and 10MWT time were generally weak (Supplementary Digital Material 4). The second version of the CMTNS (CMTNS2), published by Murphy et al. in 2011,99 had better correlations with spatio-temporal parameters and kinematic data in the 10MWT, especially velocity and knee flexion–extension, with good construct validity properties.51 Ferrarin et al. found excellent reliability for spatio-temporal parameters and kinematic data during motion analysis and Lencioni et al. indicated that there was higher responsiveness for kinetic and kinematic measures (according to disease severity) compared with the CMT Examination Score.47, 50 The 6MWT exhibited excellent reliability but there was inadequate responsiveness data according to the COSMIN guidelines. To the best of our knowledge, and according to this summary, the selection of CMT patients according to disease severity is crucial for selecting the walking test and assessment tool. These results suggested that motion analysis is the most appropriate tool and that the 6MWT is the most appropriate test for walking assessment in CMT, but further feasibility studies for motion analysis and responsiveness studies for the 6MWT are still necessary.

Concerning DM1, several walking tests were studied. Very good construct validity was found for the 6MWT, 10MWT, 10MW/RT, TUG, Step Test and variables assessed by motion analysis in “basic walk” or “dual task walk”. High correlations were generally observed between walking tests or motion analysis and lower limb muscle strength/MRC scores (Supplementary Digital Material 5), in contrast to the results in CMT patients. All of these walking tests and tools were reliable, and the studies had very good methodological quality and tested several types of reliability. High responsiveness (AUC>0.7) was found for the 10MWT in one study with adequate methodology and a sufficient quality criteria rating.61 For the TUG, one study found an AUC<0.7,64 and another an AUC>0.7.62 In DM1, the 10MWT was the walking test that had the most verified metrological properties, which could explain its preferred use in research and clinical practice for walking assessments.
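To make the AUC-based responsiveness criterion more concrete, the sketch below computes an AUC as the probability that a patient classified as changed by an external anchor has a larger change score than a patient classified as stable (ties count as one half). The function name, the anchor grouping and the change scores are hypothetical and serve only as an illustration; they do not reproduce the analyses of the cited studies. The 0.7 threshold is the one usually applied in COSMIN-based quality criteria.

```python
def auc_from_change_scores(changed, unchanged):
    """AUC = P(change score of a 'changed' patient > that of an 'unchanged' one).

    Ties contribute 0.5, which makes this equivalent to the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    if not changed or not unchanged:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in changed:
        for u in unchanged:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed) * len(unchanged))


# Hypothetical change in 10MWT time (seconds, follow-up minus baseline),
# grouped by an assumed external anchor of clinical worsening
worsened = [1.8, 2.4, 0.9, 3.1, 1.2]
stable = [0.2, -0.4, 1.0, 0.5, -0.1]

auc = auc_from_change_scores(worsened, stable)
print(f"AUC = {auc:.2f}; meets the 0.7 threshold: {auc >= 0.7}")
```

An AUC of 0.5 would mean that the change score does not discriminate between the two anchor groups at all, while values close to 1 indicate that nearly every worsened patient changed more than every stable patient.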

Concerning 5q SMA, several walking tests and tools were studied, including the 6MWT, 10MWT, TUG, ESWT and assessments of spatiotemporal parameters by instrumented footwear and GaitRite. The 6MWT was the most studied test. It was found to have robust construct validity, mainly with high correlations with lower limb strength, and it was the only test with a study of criterion validity compared to VO2 peak.83 The calculated MDC90 was 24 meters,80 and, according to one study, the 6MWT was feasible for 97% of people with 5q SMA.85 These results could justify the preferential use of the 6MWT to assess walking in research and clinical practice.

Concerning FSHD, several walking tests and tools were studied, such as the 6MWT, iTUG, motion analysis and the use of wearable magneto-inertial sensors for real-life walking assessment. All of these walking tests and tools were valid and reliable (Supplementary Digital Material 6). Only the 6MWT and the use of wearable magneto-inertial sensors were studied properly in terms of responsiveness according to the COSMIN guidelines.68, 71 Their use therefore seems to be preferable. Furthermore, the use of wearable magneto-inertial sensors for walking assessment in real life demonstrated the best construct validity, with high correlations with lower limb MMT. It was also a reliable tool, with a high ICC for inter-session reliability and, according to an early study with adequate methodology according to the COSMIN guidelines, it seemed to be a sensitive tool.71 To the best of our knowledge, the metrological properties of this tool were studied among NMDs only in FSHD, which opens the door to the assessment of walking capability in these patients. Inertial sensors have the advantage of being usable in a patient’s everyday environment, and their use was developed in particular due to the COVID-19 pandemic.100 However, the sample size of the single study was small, limiting the conclusions and suggesting the need for further studies in FSHD. In other NMDs, the lack of metrological studies may limit their use in research and clinical applications.100 Further studies using these wearable magneto-inertial sensors will be needed for the development of novel walking outcomes in real-life environments.

Strengths and limitations of the study

To the best of our knowledge, this is the first COSMIN review on the measurement properties of walking outcome assessments in adults with genetic or inherited NMDs. Considering the functional and locomotor deficits caused by NMDs, and the growing number of therapeutic trials in NMD patients, it seemed relevant to synthesize the current data concerning the measurement properties of walking tests and tools in order to identify those that might be more relevant for specific research projects or for clinical follow-up.

Herein, we reviewed data from a large number of studies, even though the inclusion criteria were limited to genetic neuromuscular diseases in adults. Consequently, this review provides a synthesis and new insights concerning the measurement properties of walking tests and tools in adults with NMDs. The major strength of this review was the use of the COSMIN checklist and of a quality assessment for the statistical outcomes. The COSMIN checklist was used to assess the quality of the methodological design and statistical methods of the psychometric studies. Moreover, this checklist has been successfully used in previous systematic reviews for the evaluation of walking disorders in other diseases.39-41 The level of agreement between the two reviewers (NH and MG) before discussion was moderate (κ=0.59) for this COSMIN review. This was in accordance with the agreement level of previous studies using COSMIN guidelines for the assessment of walking psychometric studies in other diseases.39, 101
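For readers less familiar with the κ statistic reported above, the following minimal sketch shows how Cohen's kappa corrects the raw agreement between two reviewers for the agreement expected by chance. The function name and the ten ratings are hypothetical and are not the ratings given by the reviewers of this study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical COSMIN quality ratings of ten studies by two reviewers
reviewer_1 = ["very good"] * 6 + ["adequate"] * 4
reviewer_2 = ["very good"] * 5 + ["adequate"] * 4 + ["very good"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this invented example the reviewers agree on 8 of 10 studies, but because half of that agreement would be expected by chance, kappa is markedly lower than the raw 80% agreement.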

Despite the many strengths of this review, it also presents some limitations. Firstly, it was not initially possible to conduct a relevant meta-analysis because of the high level of heterogeneity among the studies (metric properties, physical examination protocols and type of walking tests and related variables within the same and different NMDs).

Secondly, most of the studies included in this review had small sample sizes, often with no justification, which limits the interpretability of our results in relation to the COSMIN checklist.

The last limitation of this paper is related to the selection criteria, which excluded non-hereditary or non-genetic NMDs or pediatric populations. We therefore omitted important NMDs such as Duchenne muscular dystrophy, or children with 5q SMA. We chose these restrictive inclusion criteria in order to obtain the most homogeneous systematic review possible, despite the broad clinical spectrum of NMDs. It is worth noting that there is a recent narrative review of the literature that provides an overview of outcome walking measures in children with NMDs.102

Conclusions

To conclude, this review provides an overview of the measurement properties for the walking tests and tools that have been used to evaluate adults with inherited or genetic NMDs.

This information will help researchers and clinicians to choose the most appropriate tests and tools depending on their aim and the studied NMD. There is currently a large panel of potential walking tests and tools to evaluate adults with NMDs, and most were found to be valid and reliable. However, important psychometric studies such as inter- or intra-rater reliability and responsiveness assessments are missing or were found to have inadequate methodological qualities. Future studies are warranted to better clarify these elements due to the current expansion of clinical trials that include medical or physical therapies. This is also of major importance considering the emergence of gene therapies and their potential positive impact on walking disorders in NMDs.

Supplementary Digital Material 1

Supplementary Text File 1

The search formulation of this COSMIN review.

Supplementary Digital Material 2

Supplementary Table I

Criteria to rate each measurement property according to Prinsen et al.38 and Terwee et al.43

Supplementary Digital Material 3

Supplementary Table II

Characteristics of the selected studies.

Supplementary Digital Material 4

Supplementary Table III

Measurement properties of the included studies in Charcot Marie Tooth disease: validity.

Supplementary Table IV

Measurement properties of the included studies in Charcot Marie Tooth disease: reliability.

Supplementary Table V

Measurement properties of the included studies in Charcot Marie Tooth disease: measurement error, responsiveness, and feasibility.

Supplementary Digital Material 5

Supplementary Table VI

Measurement properties of the included studies in myotonic dystrophies: validity.

Supplementary Table VII

Measurement properties of the included studies in myotonic dystrophies: reliability.

Supplementary Table VIII

Measurement properties of the included studies in myotonic dystrophies: measurement error.

Supplementary Table IX

Measurement properties of the included studies in myotonic dystrophies: responsiveness and feasibility.

Supplementary Digital Material 6

Supplementary Table X

Measurement properties of the included studies in other NMDs: validity.

Supplementary Table XI

Measurement properties of the included studies in other NMDs: reliability.

Supplementary Table XII

Measurement properties of the included studies in other NMDs: measurement error.

Supplementary Table XIII

Measurement properties of the included studies in other NMDs: responsiveness and feasibility.

Acknowledgements

The authors are grateful to Suzanne Rankin for proofreading the manuscript.

Footnotes

Conflicts of interest: The authors certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.

Articles from European Journal of Physical and Rehabilitation Medicine are provided here courtesy of Edizioni Minerva Medica S.p.A.
