The Journal of Manual & Manipulative Therapy. 2023 Dec 9;32(3):255–283. doi: 10.1080/10669817.2023.2269038

Psychometric properties of clinician-reported and performance-based outcomes cited in a scoping review on spinal manipulation and mobilization for pediatric populations with diverse medical conditions: a systematic review

Tricia Hayton a, Anita Gross a, Annalie Basson b, Ken Olson c, Oliver Ang d, Nikki Milne e, Jan Pool f
PMCID: PMC11216262  PMID: 38070150

ABSTRACT

Introduction

Risks and benefits of spinal manipulations and mobilization in pediatric populations are a concern to the public, policymakers, and international physiotherapy governing organizations. Clinical Outcome Assessments (COA) used in the literature on these topics are contentious. The aim of this systematic review was to establish the quality of clinician-reported and performance-based COAs identified by a scoping review on spinal manipulation and mobilization for pediatric populations across diverse medical conditions.

Method and analysis

Electronic databases, clinicaltrials.gov and Ebsco Open Dissertations were searched up to 21 October 2022. Qualitative synthesis was performed using Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guidelines to select studies, perform data extraction, and assess risk of bias. Data synthesis used Grading of Recommendations, Assessment, Development and Evaluations (GRADE) to determine the certainty of the evidence and overall rating: sufficient (+), insufficient (-), inconsistent (±), or indeterminate (?).

Results

Four of 17 identified COAs (77 studies, 9653 participants) with supporting psychometric research were classified as:

Performance-based outcome measures:

  • AIMS – Alberta Infant Motor Scale (n = 51); or:

Clinician-reported outcome measures:

  • LATCH – Latch, Audible swallowing, Type of nipple, Comfort, Hold (n = 10),

  • Cobb Angle (n = 15),

  • Postural Assessment (n = 1).

AIMS had an overall sufficient (+) rating with high certainty evidence, and LATCH had an overall sufficient (+) rating with moderate certainty of evidence. For the Cobb Angle and Postural Assessment, the overall rating was indeterminate (?) with low or very low certainty of evidence, respectively.

Conclusion

The AIMS and LATCH had sufficient evidence to evaluate the efficacy of spinal manipulation and mobilization for certain pediatric medical conditions. Further validation studies are needed for other COAs.

KEYWORDS: Psychometric properties, outcome measures, manipulation, mobilization, pediatric

Introduction

Assessing the ethical acceptability of health-related research involving spinal manipulation and mobilization in diverse pediatric populations with varying medical conditions requires a risk-benefit analysis. International policymakers, clinicians, and guardians are questioning whether the risks associated with spinal manipulation and mobilization exceed the anticipated benefits. The Australian Government in the state of Victoria has restricted the use of these techniques in infants [1]. The International Federation of Orthopaedic Manipulative Physical Therapists (IFOMPT), in collaboration with the International Organisation of Physical Therapists in Paediatrics (IOPTP), has identified a great need to accurately map the scientific evidence and safety issues regarding manipulation and mobilization for infants, children, and adolescents with various conditions. Clinicians are seeking the evidence base for this care pathway to support safe clinical judgment, decision-making, and clinical reasoning. A scoping review [2] of pediatric populations receiving spinal manipulation and mobilization found a diverse list of Clinical Outcome Assessments (COA) with poor reporting of their psychometric properties. COAs are defined by the United States Food and Drug Administration (FDA) as tools that ‘measure a patient’s symptoms, overall mental state, or the effects of a disease or condition on how the patient functions’ [3]. Healthcare professionals and researchers typically use one or more of the following four standardized COA categories to measure clinical benefit in treatment trials:

  • patient-reported – a patient’s self-report of difficulty in completing a task or of feelings experienced during tasks of daily living.

  • clinician-reported – a trained health professional’s observation of a patient’s health condition involving clinical judgment and interpretation of the observable signs, behaviors, or other manifestations.

  • observer-reported – a report by a caregiver, guardian, or other observer of patients who cannot report for themselves (e.g., infant); this observation does not include clinical judgment.

  • performance-based – a trained observer administers and evaluates a standardized task(s) performed by the patient.

See Figure 1 for those COAs identified by Milne and colleagues [2] in their scoping review.

Figure 1.

Clinical outcome assessment (COA) categories that were identified by a scoping review on spinal manipulation and mobilisation in paediatric populations [2]. Clinician-reported and performance-based outcome assessments were the focus of this report. All listed outcomes were included in our search; however, the bolded ones were those identified in the literature as having reported psychometric properties for paediatric populations.

Interpreting changes in outcomes such as pain, function, motor development, mobility difficulties, and participation in life activities in infants, children, and adolescents before and after spinal manipulation and mobilization is critical. Foundational to determining efficacy, effectiveness, and harm is the use of outcome measures with good measurement properties, as established by COSMIN [4], including:

  • Reliability: internal consistency, test-retest/intra-rater/inter-rater reliability, measurement error;

  • Validity
    1. content validity including relevance, comprehensiveness, comprehensibility, outcome measurement items assessed and appropriately worded [5]
    2. structural validity,
    3. hypothesis testing for construct validity,
    4. cross-cultural validity,
    5. criterion validity and
  • Responsiveness for each measurement.

Additionally, information on the interpretability and feasibility of the reported outcome measure is needed to formulate recommendations on its suitability for use in an evaluative application for the construct of interest and study population.

The aim of this systematic review was to evaluate the psychometric properties of clinician-reported and performance-based outcomes that were identified by a scoping review on spinal manipulation and mobilization in pediatric populations, aged birth to 18 years, across diverse medical conditions. These outcomes relate to the physical functioning, motor development, and mobility of infants, children, and adolescents.

Method

The protocol was registered with Open Science Framework (DOI 10.17605/OSF.IO/RN4UX). This review was part of a larger project designed to depict COAs by subcategories: Part 1. Patient-reported and Observer-reported Outcomes [6]; Part 2. Clinician-reported and Performance-based Outcomes. The review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement [7] and Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guideline for systematic reviews [8].

Eligibility criteria

Type of participants: Pediatric (birth to 18 years of age) patients with medical conditions identified by a scoping review and managed with spinal manipulation and mobilization were considered [2]. These conditions varied and included Enuresis, Otitis Media, Colic, Excessive Crying/Colic, Breastfeeding, Low Back Pain, Headaches, Cerebral Palsy, Neck Pain, Scoliosis, Attention Deficit Disorder, Autism, Torticollis, Asthma, Chronic Respiratory Illness, KISS (kinetic imbalances due to suboccipital strain), and Dysfunctional Voiding; healthy participants could also be included. Studies in which more than 20% of the population was older than 18 years were excluded.

Type of clinical outcome assessment (COA): Clinician-reported and performance-based COAs were considered eligible for this study.

Type of psychometric outcome: Studies addressed a minimum of one psychometric property from any of the following three domains:

  • Validity: content validity (i.e., relevance, comprehensiveness, comprehensibility, outcome measurement items assessed and appropriately worded), structural validity, hypothesis testing for construct validity, cross-cultural validity, criterion validity.

  • Reliability: internal consistency, test–retest reliability, intra-rater reliability, inter-rater reliability, measurement error (i.e., standard error of measurement (SEM), smallest detectable change (SDC) or limits of agreement (LoA)).

  • Responsiveness (i.e., SDC, minimal important change (MIC), area under the curve (i.e., ROC – Receiver Operating Characteristic curve)).

Information on the interpretability (i.e., floor or ceiling effect) and feasibility of the reported outcome measure was also gathered. Trials where the COA was used as an outcome measure or in validation of another instrument were excluded.
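
The measurement error terms listed above are often related through two widely used formulas: SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The following is a minimal sketch of that arithmetic, not the procedure used by any included study; the function names and the input values (SD = 5.0, ICC = 0.96) are hypothetical and chosen only for illustration.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """Reliability-based estimate: SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC for an individual: z * sqrt(2) * SEM (95% level when z = 1.96)."""
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs for illustration only (not taken from an included study).
sem = standard_error_of_measurement(sd=5.0, icc=0.96)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")
```

Under these assumed inputs, a change in an individual's score smaller than the SDC could not be distinguished from measurement error.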

Type of study design: All study designs (i.e., cohort, case-control, cross-sectional, case-series, randomized controlled trials) except for questionnaires, surveys, screening tools, and case reports were included.

Information sources and search strategy

A medical librarian and information specialist (JM) designed, tailored, and performed an electronic search of PUBMED, Embase, and CINAHL from inception to 26 February 2021, with an update to 21 October 2022, without language restriction. The translation capacity of our research team and colleagues covered English, Dutch, German, Spanish, French, Korean, Hungarian, Russian, and Croatian; studies that could not be translated were recorded. The search strategy was to combine each COA identified by a previously completed scoping review [2] with a modified instrument properties search block for PUBMED created by [9] as well as a measurement property filter: 1) for the Index to Chiropractic Literature created by JM, 2) for EMBASE translated by EP Jansma and 3) for CINAHL developed by FS van Etten (COSMIN website). By selectively adding a chiropractic filter and a pediatric filter, we balanced the search strategy between recall and precision. A sample of the search parameters can be found in Box 1. We searched gray literature in clinicaltrials.gov and Ebsco Open Dissertations on 29 March 2021, and references of primary studies were checked by TH.

Box 1.

A sample MEDLINE search strategy and parameters include a search block for psychometric properties and a search block for spinal manipulation and mobilization.

Search block for psychometric properties:

(MH ‘Psychometrics’) or (TI psychometr* or AB psychometr*) or (TI clinimetr* or AB clinimetr*) or (TI clinometr* OR AB clinometr*) or (MH ‘Outcome Assessment’) or (TI outcome assessment or AB outcome assessment) or (TI outcome measure* or AB outcome measure*) or (MH ‘Health Status Indicators’) or (MH ‘Reproducibility of Results’) or (MH ‘Discriminant Analysis’) or ((TI reproducib* or AB reproducib*) or (TI reliab* or AB reliab*) or (TI unreliab* or AB unreliab*)) or ((TI valid* or AB valid*) or (TI coefficient or AB coefficient) or (TI homogeneity or AB homogeneity)) or (TI homogeneous or AB homogeneous) or (TI ‘coefficient of variation’ or AB ‘coefficient of variation’) or (TI ‘internal consistency’ or AB ‘internal consistency’) or (MH ‘Internal Consistency+’) or (MH ‘Reliability+’) or (MH ‘Measurement Error+’) or (MH ‘Content Validity+’) or ‘hypothesis testing’ or ‘structural validity’ or ‘cross-cultural validity’ or (MH ‘Criterion-Related Validity+’) or ‘responsiveness’ or ‘interpretability’ or (TI reliab* or AB reliab*) and ((TI test or AB test) OR (TI retest or AB retest)) or (TI stability or AB stability) or (TI interrater or AB interrater) or (TI inter-rater or AB inter-rater) or (TI intrarater or AB intrarater) or (TI intra-rater or AB intrarater) or (TI intertester or AB intertester) or (TI inter-tester or AB inter-tester) or (TI intratester or AB intratester) or (TI intra-tester or AB intra-tester) or (TI interobserver or AB interobserver) or (TI inter-observer or AB inter-observer) or (TI intraobserver or AB intraobserver) or (TI intra-observer or AB intra-observer) or (TI intertechnician or AB intertechnician) or (TI inter-technician or AB inter-technician) or (TI intratechnician or AB intratechnician) or (TI intra-technician or AB intra-technician) or (TI interexaminer or AB interexaminer) or (TI inter-examiner or AB inter-examiner) or (TI intraexaminer or AB intraexaminer) or (TI intra-examiner or AB intra-examiner) or (TI intra-examiner or AB 
intraexaminer) or (TI interassay or AB interassay) or (TI inter-assay or AB inter-assay) or (TI intraassay or AB intraassay) or (TI intra-assay or AB intra-assay) or (TI interindividual or AB interindividual) or (TI inter-individual or AB inter-individual) OR (TI intraindividual or AB intraindividual) or (TI intra-individual or AB intra-individual) or (TI interparticipant or AB interparticipant) or (TI inter-participant or AB inter-participant) or (TI intraparticipant or AB intraparticipant) or (TI intra-participant or AB intra-participant) or (TI kappa or AB kappa) or (TI kappa’s or AB kappa’s) or (TI kappas or AB kappas) or (TI repeatab* or AB repeatab*) or (TI responsive* or AB responsive*) or (TI interpretab* or AB interpretab*)

Search block for spinal manipulation and mobilization:

‘spinal manipulation’ OR ‘spinal manipulations’ OR ‘spinal mobilization’ OR ‘spinal mobilizations’ OR ‘spinal mobilization’ OR ‘spinal mobilizations’ OR ‘spinal adjustment’ OR ‘spinal adjustments’ OR ‘spinal manual therapy’ OR ‘high velocity low amplitude thrust’ OR ‘HVLA’ OR ‘musculoskeletal of the spine’ OR ‘spinal musculoskeletal’ OR ‘manual therapy of the spine’ OR ‘cervical manual therapy’ OR ‘thoracic manual therapy’ OR ‘lumbar manual therapy’ OR ‘manual therapy of the lumbar spine’ OR ‘manual therapy of the thoracic spine’ OR ‘manual therapy of the cervical spine’ OR ‘spinal osteopathy’ OR ‘osteopathy of the spine’ OR ‘osteopathy of the cervical spine’ OR ‘osteopathy of the thoracic spine’ OR ‘osteopathy of the lumbar spine’ OR ‘spinal osteopathies’ OR ‘osteopathies of the spine’ OR ‘osteopathies of the cervical spine’ OR ‘osteopathies of the thoracic spine’ OR ‘osteopathies of the lumbar spine’ OR chirop* OR ‘spinal manipulative therapy’ OR ‘spinal manipulative therapies’

Key: TI = Title; AB = abstract; MH = MeSH heading (MeSH is the MEDLINE index).
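
The combination of a COA name with the search blocks in Box 1 can be sketched as a simple Boolean join. This is an illustrative sketch only: the helper `combine_search_blocks` is hypothetical, the property block is abbreviated to a single term pair, and the real query syntax depends on the database interface used.

```python
def combine_search_blocks(coa_terms, property_block, intervention_block=None):
    """Join COA synonyms with OR, then AND them with the other search blocks."""
    coa_block = " OR ".join(f'"{term}"' for term in coa_terms)
    parts = [f"({coa_block})", f"({property_block})"]
    if intervention_block:
        # Optional filter, mirroring the selective use of an intervention block.
        parts.append(f"({intervention_block})")
    return " AND ".join(parts)


# Abbreviated, hypothetical blocks for illustration only.
query = combine_search_blocks(
    ["Alberta Infant Motor Scale", "AIMS"],
    "TI reliab* OR AB reliab*",
)
print(query)
```

Adding the optional third block narrows the result set, which is one way the trade-off between recall and precision described in the search strategy could be tuned.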

Study identification and selection

The review manager Covidence (Veritas Health Innovation, Melbourne, Australia) was used for study screening, selection, and data extraction. Two independent reviewers (screening and selection review teams: JP/TH, AB/AG, KO/DC, NM/OA) used pre-piloted forms. Title and abstract screening and full-text selection were performed after a 10-study calibration period, with two additional meetings between the authors to ensure consistency (Kappa thresholds were set a priori at 0.4 for moderate and 0.75 for good agreement). Disagreements were resolved by discussion and reference to a third reviewer when needed (JP; See Figure 1). For records available as abstracts only, we conducted a second search to determine their publication status.
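
Agreement between two reviewers during calibration can be illustrated with a minimal sketch, assuming Cohen's kappa for two raters (the text does not specify which kappa variant was used). The include/exclude decisions below are hypothetical and serve only to show the calculation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for a 10-study calibration set.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "include", "exclude", "include", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical set the reviewers agree on 8 of 10 studies, and the resulting kappa falls between the 0.4 and 0.75 thresholds stated above.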

Data extraction and data items

Pairs of researchers (review teams: JP/TH, AB/TH, OA/AG) independently extracted data using a standardized, pilot-tested form in addition to the COSMIN Risk of Bias checklist, including:

  • 1. Description and scoring of the assessment tool;

  • 2. Population demographic characteristics, medical condition, country, number of participants, age, sex, professional involved;

  • 3. Inclusion/exclusion criteria;

  • 4. Results addressing psychometrics; and

  • 5. Statistical methods used.

Missing data from primary studies were addressed by consulting either the outcome measurement website/administrator, when available, or the author for key data. We did not check whether studies were registered prior to their start.

Risk of bias (RoB) assessment

Using the valid and reliable COSMIN Risk of Bias checklist, two reviewers (review teams JP/AB, TH/OA) independently evaluated the RoB of each included study and resolved uncertainties through discussion [4,8]. Additionally, validation studies for clinician-reported and performance-based measures was conducted [10,11]. The COSMIN checklist uses a 4-point rating scale: very good (VG), adequate (A), doubtful (D), and inadequate (I) (see Box 2. – Step 1). The RoB of research was judged per psychometric property reported, not per study. The overall score for each measurement property on the COSMIN checklist was determined by a ‘worst score counts’ approach.
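
The ‘worst score counts’ principle can be sketched in a few lines: the overall rating for a measurement property is the lowest rating given to any checklist item for that property. The function name and the example item ratings are hypothetical; only the four rating labels come from the text.

```python
# Ratings ordered from best to worst, as on the COSMIN 4-point scale.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]


def worst_score_counts(item_ratings):
    """Return the lowest (worst) rating among the checklist items."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # list.index raises ValueError for labels outside the 4-point scale.
    return max(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one measurement property in one study.
overall = worst_score_counts(["very good", "adequate", "doubtful", "very good"])
print(overall)
```

A single doubtful item therefore caps the rating for that property at doubtful, regardless of how many other items were rated very good.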

Box 2.

COSMIN steps in evaluating the evidence [88]; box replicated from Hayton et al. 2024.

Measures of psychometric values

After qualitative evaluation of study population and the psychometric properties, no further quantitative analysis or meta-analysis was completed by psychometric value due to the heterogeneity of study populations and the properties evaluated. Statistics depicted in this paper are directly quoted as the authors presented them. Confidence Intervals (CI) and standard deviations (SD) are presented if they were reported.

Data synthesis and analysis methods

In Table 1, we summarized a descriptive synthesis of findings for the psychometric properties of each COA for identified studies. In Table 4, we detailed the updated criteria of good measurement properties established by COSMIN [4]. We rated the results for reliability (internal consistency, test-retest/intra-rater/inter-rater reliability, measurement error), validity (a. content validity including relevance, comprehensiveness, comprehensibility, outcome measurement items assessed and appropriately worded [9]; b. structural validity; c. hypothesis testing for construct validity, cross-cultural validity, and criterion validity), and responsiveness for each measurement property as sufficient (+), insufficient (–), or indeterminate (?) (see Box 2. – Step 2; see Appendix A for definitions) [4,88]. A qualitative summary using COSMIN-defined criteria for an OVERALL RATING was completed on measurement properties; this was based on our reviewers’ consensus judgment as sufficient (+), insufficient (−), inconsistent (±), or indeterminate (?) (see Box 2. – Step 3). The interpretability (Appendix B) and feasibility (Appendix C) were reported as per COSMIN guidelines.

Table 1.

Characteristics of included studies.

Author (Year) Country of Publication Outcome Measure Disorder Studied Professional Age Range Sex Ratio (Male:Female) n Psychometric Addressed
Aimsamrarn et al. [12] Thailand AIMS Healthy participants Physiotherapist 1 to 18 months 22:8 30 Reliability, Validity
Albuquerque et al. [13] Brazil AIMS Pre-term infants NR 3 to 16 months (corrected age) 57:51 108 Reliability, Validity
Almeida et al. [14] Brazil AIMS Healthy premature infants Physiotherapist 0 to 18 months 25:21 88 Reliability, Validity
Bartlett and Fanning [15] Canada AIMS Pre-term infants Physiotherapist 8 months (corrected age) 28:32 60 Reliability, Validity
Blanchard et al. [16] United States AIMS Healthy participants Physiotherapist 0 to 12 months NR 14 Reliability
Boonzaaijer et al. [17] Netherlands AIMS Healthy participants Physiotherapist 16 to 54 weeks 24:24 48 Reliability, Validity
Campbell and Kolobe [18] United States AIMS Healthy participants and infants at risk for developmental disability NR 9 to 18 weeks 47:43 90 Validity
Campbell et al. [19] United States AIMS Healthy and preterm participants. Physiotherapist <12 months 52:44 96 Validity
Campos et al. [20] Brazil AIMS Healthy participants NR 37 to 41 weeks gestation NR 43 Validity
Chiquetti et al. [21] Brazil AIMS Healthy participants NR 17 weeks (corrected age) 349:301 650 Validity
Darrah [22] Canada AIMS At risk infants developmental delay Physiotherapist 4 to 18 months 65:60 125 Validity
Darrah et al. [23] Canada AIMS Healthy participants Physiotherapist 4 to 8 months 87:77 164 Validity
Darrah et al. [24] Canada AIMS Healthy participants Physiotherapist 2 weeks to 18 months 20:25 45 Reliability
Darrah et al. [25] Canada AIMS Healthy and preterm participants. Physiotherapist 2 weeks to 18 months 338:312 650 Validity
Dumas et al. [26] United States AIMS Chronically ill infants Physiotherapist 3 to 450 days 26:27 53 Validity
Fauls et al. [27] Australia AIMS Children requiring physiotherapy Physiotherapist 0 to 5 years 50:34 84 Validity
Fetters and Tronick [28] United States AIMS Healthy participants Physiotherapist 4 to 15 months 16:23 39 Validity
Fleuren et al. [29] Netherlands AIMS Healthy participants Physiotherapist 0 to 12 months 63:37 118 Reliability, Validity
Ga and Kwon [30] Korea AIMS Healthy participants Physiotherapist 4 to 60 months 119:107 226 Reliability, Validity
Harris et al. [31] Canada AIMS At risk and healthy infants Physiotherapist 4 to 6.5 months 73:71 At risk: 86; Typical: 58; Total: 144 Validity
Heineman et al. [32] Netherlands AIMS NICU infants Medical Doctor 4 to 18 months 49:31 80 Validity
Heineman et al. [33] Netherlands AIMS Pre-term Infants NR 25 to 43 weeks gestation 34:25 205 Validity
Hoskens et al. [34] Belgium AIMS Healthy participants Physiotherapist 19 days to 18 months 62:60 122 Validity
Jeng et al. [35] Taiwan AIMS Pre-term infants Physiotherapist 24 to 36 weeks Reliability: 27:18; Validity: 21:20 Reliability: 45; Validity: 41 Reliability, Validity
Krosschell et al. [36] United States AIMS Muscular dystrophy Physiotherapist NR NR 23 Reliability
Lackovic et al. [37] Serbia AIMS Healthy participants Medical Doctor 0 to 14 months 34:26 60 Reliability
Lefebvre et al. [38] Canada AIMS Pre-term infants Physiotherapist 26.3 weeks (SD 1.4) 91:69 160 Validity
Morales-Monforte et al. [39] Spain AIMS At risk infants developmental delay Physiotherapist 24 to 41 weeks gestation 27:23 50 Reliability, Validity
Pai-Jun and Campbell [40] United States AIMS Healthy participants. Physiotherapist 3 to 12 months NR 97 Validity
Pin et al. [41] Australia AIMS Pre-term infants Physiotherapist <18 months NR 62 Reliability
Pin et al. [42] Hong Kong AIMS Pre-term infants Physiotherapist 4 to 12 months 15:16 31 Validity
Piper et al. [43] Canada AIMS Healthy participants Physiotherapist 0 to 18 months 285:221 221; 291; Total: 506 Reliability, Validity
Rizzi et al. [44] Italy AIMS Neurologically compromised infants NR 0 to 18 months 46:40 86 Validity
Saccani and Valentini [45] Brazil AIMS Healthy participants NR 0 to 18 months 407:378 795 Validity
Quezada-Villalobos et al. [46] Chile AIMS Preterm and term infants at risk motor control problems NR 10 to 16 months Full term: 48:47 Preterm: 10:10 115 Reliability
Siegle and de Sá [47] Brazil AIMS Babies exposed to HIV NR 0 to 18 months 42:33 75 Validity
Silva et al. [48] Brazil AIMS Healthy participants Nurse 123 to 196 days 26:24 50 Reliability
Snyder et al. [90] United States AIMS At risk infants developmental delay Physiotherapist 2 months 19 days to 16 months 23 days 16:19 35 Reliability, Validity
Spittle et al. [49] Australia AIMS Preterm infants Osteopath <1 year 47:40 87 Validity
Suir et al. [50] Netherlands AIMS Healthy participants Physiotherapist 2 weeks to 19 months 263:236 499 Reliability, Validity
Syrengelas et al. [51] Greece AIMS Full term and preterm infants NR 1 to 19 months Preterm: 251:152; Full-term: 584:454 Preterm: 403; Full-term: 1038; Total: 1441 Validity
Syrengelas et al. [52] Greece AIMS Healthy participants Physiotherapist 7 d to 18 months 250:174 424 Reliability, Validity
Tse et al. [53] Canada AIMS Healthy and at-risk infants Other person employed in childcare, early intervention, and health care 4 to 6.5 months 60:61 At risk: 72; Typical: 49; Total: 121 Reliability, Validity
Tupsila et al. [54] Thailand AIMS Healthy participants Physiotherapist 0 to 14 months 314:260 574 Validity
Valentini and Saccani [55] Brazil AIMS Healthy participants Physiotherapist 0 to 18 months 291:270 561 Reliability, Validity
Valentini and Saccani [56] Brazil AIMS Healthy participants Physiotherapist 1 to 18 months 394:372 766 Reliability, Validity
van Hus et al. [57] Netherlands AIMS Healthy participants Physiotherapist <12 months 63:53 116 Validity
van Schie et al. [58] Netherlands AIMS Healthy Participants Physiotherapist 0 to 2 years 20:12 32 Validity
Wang et al. [59] China AIMS Developmental delay Physiotherapist 0 to 9 months 29:21 50 Reliability, Validity
Yeh et al. [60] Taiwan AIMS Healthy participants Physiotherapist 25 to 39 weeks 18:11 29 Validity
Adams and Hewell [61] United States LATCH Healthy breastfed infants and their mothers Lactation Consultant Mothers: 20 to 29 years; Infants: NR NR 35 dyads Reliability
Altuntas et al. [62] Turkey LATCH Healthy participants NR NR NR 46 feeds Reliability, Validity
Báez León et al. [91] Spain LATCH Breastfeeding, healthy participants Nurse Mothers: 32 years; Infants: 37 to 41 weeks gestation NR 20 dyads Reliability
Chapman et al. [63] United States LATCH Healthy infants, obese/overweight mothers Medical Doctor Mothers: 29 years (SD 5.5); Infants: 39.3 weeks (SD 1.2) NR 45 dyads Reliability
DaConceição et al. [64] Brazil LATCH Healthy participants Lactation Consultant Mothers: 34 years; Infants: 39 weeks Infant: 88:72 160 dyads Validity
DaConceição et al. [65] Brazil LATCH Healthy participants Lactation Consultant NR NR 160 dyads Reliability, Validity
Dolgun et al. [66] Turkey LATCH Healthy infants and their mothers Mother/translator Mothers: 19 to 45 years; Infants: < 6 months Infant: 66:61 127 dyads Reliability, Validity
Kumar et al. [67] United States LATCH Breastfeeding healthy participants Lactation Consultant Neonate/Infant: 1 day to 6 weeks Infant: 134:114 250 dyads Validity
Lau et al. [68] Singapore LATCH Healthy participants Trained Research Assistant Neonate: 48 to 72 hours Infant: 477:430 907 dyads Reliability, Validity
Riordan and Koehn [69] United States LATCH Healthy participants Nurse infant: full gestation/40 weeks NR 28 feeds, 13 dyads Reliability
Adam et al. [70] Australia Cobb Angle Scoliosis Medical Doctor NR NR 12 Reliability
Al-Bashir et al. [71] India, Jordan, United States Cobb Angle Scoliosis Medical Doctor 10 to 18 years 14:14 28 Validity
Allen et al. [72] Canada Cobb Angle Adolescent idiopathic scoliosis Medical Doctor NR NR 22 Reliability,Validity
Brink [73] Netherlands Cobb Angle Idiopathic scoliosis Medical Doctor 10 to 17 years 10:90 33 Reliability
De Carvalho et al. [74] France Cobb Angle Scoliosis Medical Doctor NR NR 20 Reliability, Validity
Gstoettner et al. [75] Austria Cobb Angle Scoliosis Medical Doctor NR NR 48 Reliability
Kumar et al. [76] India Cobb Angle Adolescent idiopathic scoliosis Medical Doctor NR (adolescents implied) NR 20 Reliability
Livanelioglu et al. [77] Turkey Cobb Angle Scoliosis Physiotherapist 9 to 18 years 9:42 51 Reliability, Validity
Loder et al. [78] United States Cobb Angle Scoliosis Medical Doctor <10 years 17:47 64 Reliability
Marchetti et al. [79] Brazil Cobb Angle Scoliosis Medical Doctor 5 to 18 years 42:48 90 Reliability, Validity
Mehta et al. [80] Korea Cobb Angle Scoliosis Medical Doctor <7 to > 10 years NR 318 Reliability
Pruijs et al. [81] Netherlands Cobb Angle Scoliosis Medical Doctor 8.1 to 21.2 years 6:35 41 Reliability, Validity
Safari et al. [82] Iran Cobb Angle Scoliosis NR 4 to 40 years NR 14 Reliability
Stokes and Aronsson [83] United States Cobb Angle Scoliosis Medical Doctor NR NR 27 Reliability
Xiaohua et al. [84] United States Cobb Angle Scoliosis Chiropractor NR NR 20 Reliability
Watson and Mac Donncha [85] Ireland Posture Assessment Healthy participants NR 15 to 17 years 114:0 114 Reliability

Key: AIMS = Alberta Infant Motor Scale; LATCH = Latch, Audible Swallowing, Type of Nipple, Comfort, Hold; HIV = human immunodeficiency virus; NICU = neonatal intensive care unit; SD = standard deviation; NR = not reported; n = sample size; dyad = mother and infant pair.

Table 4.

Summary of results for Alberta Infant Motor Scale (AIMS).

Author (year) Country of Origin n Psychometric COSMIN Score RoB Results
Aimsamrarn et al. [12] Thailand 30 Reliability (Intra-rater) I Assessor 1: ICC = 0.99 (95% CI: 0.98 to 0.99)
        I Assessor 2: ICC = 0.98 (95% CI: 0.92 to 0.99)
      Reliability (Inter-rater) I ICC = 0.98 (95% CI: 0.98 to 0.99)
    30 Construct Validity D With comparison to the BSID – II: ρ = 0.969, P < 0.01
Albuquerque et al. [13] Brazil 108 Reliability (Intra-rater) VG ICC = 0.99 (95% CI 0.98 to 1.00)
      Reliability (Inter-rater) VG ICC = 0.99 (95% CI 0.99 to 1.00)
      Predictive Validity I R2 = 0.93; R2 = 0.92 after ceiling effect considered.
          5th percentile cut-off: Sensitivity 51.5%, Specificity 98.4%, Positive Predictive Value 89.5%, Negative Predictive Value 88.6%
          10th percentile cut-off: Sensitivity 75.8%, Specificity 92.9%, Positive Predictive Value 73.5%, Negative Predictive Value 93.6%
Almeida et al. [14] Brazil 88 Reliability (Inter-rater) VG ICC = 0.99
      Construct Validity VG When compared to BSID: Total population: rs = 0.95, R2 = 0.90
          6m: rs = 0.74, R2 = 0.55
          12m: rs = 0.89, R2 = 0.79
Bartlett and Fanning [15] Canada 7 Reliability (Inter-rater) D ICC = 0.95 (95% CI: 0.73 to 0.99)
    5   D ICC = 0.98 (95% CI: 0.87 to 0.99)
    60 Criterion Validity VG Total and subscale scores for three subgroups (normal, suspect, and abnormal) were significantly different (F = 11.03, df 59, P < 0.001)
Blanchard et al. [16] USA 14 Reliability (Inter-rater) D Total score: ICC = 0.98 to 0.99
Boonzaaijer et al. [17] Netherlands 48 Reliability (Inter-rater), Measurement Error VG ICC = 0.99, SEM = 0.92
      Reliability (Intra-rater), Measurement Error VG ICC = 0.99, SEM = 0.96
      Concurrent Validity VG Comparison between live and home video: ICC = 0.99 (95% CI 0.89 to 0.99), SEM = 1.41 (95% CI: 1.63 to 0.80)
Campbell and Kolobe [18] United States 90 Construct Validity VG Comparison with TIMP: Raw scores: rs = 0.64 (p < 0.0001)
          Percentiles: rs = 0.60 (p < 0.0001)
Campbell et al. [19] USA 96 Construct Validity VG With comparison to TIMP: ICC = 0.20 to 0.67, p < 0.001
Campos et al. [20] Brazil 43 Construct Validity I With comparison to BSID – II: 5th percentile cut-off: Sensitivity 100%, Specificity 78.37%, k = 0.5
          10th percentile cut-off: Sensitivity 100%, Specificity 48.64%, k = 0.2
Chiquetti et al. [21] Brazil 650 Construct Validity D Weak to moderate correlation with TIMP t = 0.24
Darrah [22] Canada 12 Reliability (Inter-rater) VG Total score: ICC = 0.98
    125 Predictive Validity VG Abnormal vs suspicious: 4th month: Sensitivity 87.5%, Specificity 78.0%, Positive Predictive Value 36.8%, Negative Predictive Value 97.7%
          8th month: Sensitivity 100%, Specificity 90.8%, Positive Predictive Value 61.5%, Negative Predictive Value 100%
      Construct Validity VG Comparison with BSID II: 4th month: ICC = 0.42; 8th month: ICC = 0.5
Darrah et al. [23] Canada 164 Predictive Validity VG Abnormal vs Normal/Suspicious: 4th month (10th percentile cut-off): Sensitivity 77.3%, Specificity 81.7%, Positive Predictive Value 39.5%, Negative Predictive Value 95.8%
          8th month (5th percentile cut-off): Sensitivity 86.4%, Specificity 93%, Positive Predictive Value 65.5%, Negative Predictive Value 97.8%
          Abnormal/Suspicious vs Normal: 4th month (10th percentile cut-off): Sensitivity 58.3%, Specificity 82.8%, Positive Predictive Value 48.8%, Negative Predictive Value 87.6%
          8th month (5th percentile cut-off): Sensitivity 63.9%, Specificity 95.3%, Positive Predictive Value 79.3%, Negative Predictive Value 90.4%
Darrah et al. [24] Canada 45 Reliability
(Inter-rater)
D ICC > 0.99
Darrah et al. [25] Canada 650 Construct Validity VG Comparison of original and contemporary norms:
ICC = 0.99
Dumas et al. [26] United States 53 Construct Validity A Comparison of the PEDI-CAT:
rs = 0.32 P = 0.02 (fair association)
Fauls et al. [27] Australia 37 Construct Validity VG Comparison of ASQ-3-GM
rs = 0.697
Fetters and Tronick [28], United States 39 Construct Validity D High false positives at 4 months. Improved specificity/sensitivity at 7 months
Fleuren et al. [29] Netherlands 118 Reliability
(Inter-rater)
A total score: rs = 0.99
      Cultural Validity VG significantly more Dutch children score below the cut off scores of 10% (29% of Dutch children) and 5% (17% of Dutch children)
Ga and Kwon [30] Korea 226 Reliability
(Inter-rater)
I 97.7% between three testers
    226 Construct Validity VG Sensitivity of Korean ASQ-3-GM 76.3 to 90.2% when compared to less than 10th percentile cut off of AIMS
Harris et al. [31] Canada 144 Construct Validity VG Comparison of BSID – II:
4 to 6 months evaluation:
rs = 0.36
          10 to 12 months evaluation:
rs = 0.4
          2 to 3 years comparison:
rs = 0.55
    144 Predictive Validity VG 4 to 6.5 month (cut-off less than 5th percentile):
Sensitivity 12.3%,
Specificity 94.3%,
Positive Predictive Value 58.3%,
Negative Predictive Value 62.1 %
Heineman et al. [32] Netherlands 80 Construct Validity VG With comparison to IMP performance domain:
Total score (corrected age)
rs = 0.91 (95% CI 0.68 to 0.94)
          total score (raw scores)
rs = 0.94, p < 0.005
Heineman et al. [33] Netherlands 205 Construct Validity VG With comparison to IMP:
Fair to moderate correlation (rs = 0.37 to 0.54 through the ages)
Hoskens et al. [34] Belgium 122 Construct Validity D With comparison to BSID – III:
rs = 0.59 to 0.98
(p < 0.001)
overall rs = 0.98
Jeng et al. [35] Taiwan 45 Reliability
(Intra-rater)
A 0 to 3 months
ICC = 0.85 to 0.98,
SEM = 0.17 to 0.55
4 to 7 months
ICC = 0.90 to 0.98,
SEM = 0.32 to 1.24
>8 months
ICC = 0.97 to 0.99,
SEM = 0.01 to 0.73
      Reliability
(Inter-rater)
A 0 to 3 months
ICC = 0.73 to 0.97,
SEM = 0.16 to 0.68
4 to 7 months
ICC = 0.75 to 0.98,
SEM = 0.4 to 1.24
>8 months
ICC = 0.98 to 0.99
SEM = 0.01 to 0.59
    41 Construct Validity VG with comparison to BSID:
6 months rs = 0.78
12 months rs = 0.90
(p < 0.0001)
Krosschell et al. [36] United States 23 Reliability
(Inter-rater)
VG 2014 training:
ICC = 0.95 (95% CI 0.76 to 1.00)
          2015 training:
ICC = 0.98 (95% CI 0.93 to 1.00)
Lackovic et al. [37] Serbia 60 Internal consistency I α 3 months = 0.93 to 0.941
          7 months = 0.997 to 0.998
          14 months = 0.996 to 0.998
      Reliability
(Test – retest)
I α = 0.940 to 0.998
      Reliability
(Inter-rater)
I total ICC = 0.75
Lefebvre et al. [38] Canada 160 Predictive Validity VG Predictive Validity: 4 month AIMS ability to predict BSID < 85
Sensitivity 0.66 (95% CI 0.55 to 0.77),
Specificity 0.53 (95% CI 0.45 to 0.64),
Positive Predictive Value 0.55 (95% CI 0.45 to 0.65),
Negative Predictive Value 0.65 (95% CI 0.54 to 0.76)
          BSID < 70:
Sensitivity 0.78 (95% CI 0.61 to 0.95),
Specificity 0.48 (95% CI 0.4 to 0.56),
Positive Predictive Value 0.20 (95% CI 0.12 to 0.29),
Negative Predictive Value 0.93 (95% CI 0.87 to 0.99)
          10 to 12 months AIMS
          BSID < 85:
Sensitivity 0.44 (95% CI 0.32 to 0.57),
Specificity 0.82 (95% CI 0.73 to 0.90),
Positive Predictive Value 0.67 (95% CI 0.52 to 0.81),
Negative Predictive Value 0.64 (95% CI 0.54 to 0.73)
          BSID < 70:
Sensitivity 0.65 (95% CI 0.42 to 0.87),
Specificity 0.75 (95% CI 0.67 to 0.82),
Positive Predictive Value 0.26 (95% CI 0.13 to 0.39),
Negative Predictive Value 0.94 (95% CI 0.89 to 0.97)
Pai-Jun and Campbell [40] United States 97 Structural Validity D Rasch ability measure for infants:
51.05 (SD = 8.28) in non-extreme scoring infants.
Infit mean square: 0.92
Outfit mean square: 0.63
Morales-Monforte et al. [39] Spain 50 Reliability
(Intra-rater)
D ICC > 0.98
      Reliability
(Inter-rater)
D ICC = 0.98 (95% CI 0.82 to 1.00)
      Internal Consistency VG KR 20 scales: 0.88 to 0.99
      Construct Validity VG With comparison to BSID – III:
0 to 3 months: rs = 0.973,
4 to 8 months: rs = 0.693,
9 to 18 months: rs = 0.957
Pin et al. [41] Australia 62 Reliability
(Intra-rater)
VG ICC = 0.99
      Reliability
(Inter-rater)
VG ICC = 0.85 to 0.97
Pin et al. [42] Hong Kong 31 Construct Validity A Sub items: rs = 0.315 to 0.621 (3 areas with significant differences: 4 months supine, 12 months standing score and total score)
Piper et al. [43] Canada 221 Reliability
(Inter-rater)
A rs = 0.996
    138 Reliability
(Test-retest)
A rs = 0.993
    291 PT; 6 international experts Content Validity A Relevance: stemmed from functional sequences and variations that occur in early motor development.
Comprehensiveness: 84 items reviewed and rated by pediatric physiotherapists in Alberta and 291 members of the Paediatric Division of the Canadian Physiotherapy Association; 6 international experts revised the 84 items to 59 items.
Comprehensibility: tested for feasibility with 97 low-risk normal infants; 15 minutes, no handling, minimal space and equipment required.
    506 Construct Validity A when compared to BSID:
rs = 0.98
        A when compared to PDGMS: rs = 0.97
Quezada-Villalobos et al. [46] Chile 115 Reliability
(Inter-rater)
VG Total score: ICC = 0.94 SEM of total = 3.1 points
Rizzi et al. [44] Italy 86 Construct Validity VG With comparison to IMP:
total rs = 0.076,
Performance domain:
rs = 0.89 (p < 0.001)
      Predictive Validity VG AUC = 0.85 (p < 0.001,
95% CI 0.77 to 0.94)
Saccani and Valentini [45] Brazil 795 Cultural Validity A Comparison of Brazilian and Canadian norms:
t-test: <± 2.614 (p < 0.0001)
        I No significant differences between genders
Siegle and de Sá [47] Brazil 75 Construct Validity A with comparison to BSITD
1 month: rs = 0.57
(95% CI 0.20 to 0.82),
2 months rs = 0.44
(95% CI 0.29 to 0.63),
3 months: rs = 0.03
(95% CI −0.43 to 0.48),
4 months: rs = 0.53
(95% CI 0.23 to 0.74),
8 months: rs = 0.51
(95% CI 0.16 to 0.74),
12 months: rs = 0.59
(95% CI 0.32 to 0.78)
Silva et al. [48] Brazil 50 Reliability
(Intra-rater)
VG ICC = 0.55 to 0.95 for categories (prone, supine, sitting and standing)
Snyder et al. [86] United States 35 Reliability
(Inter-rater)
VG ICC = 0.98
(95% CI 0.96 to 0.99)
      Construct Validity VG Comparison of PDGMS-2
rs= 0.90 to 0.97
Spittle et al. [87] Australia 87 Predictive Validity VG Accuracy for predicting motor impairment was good. Motor impairment on the MABC-2 (scores ≤ 5th and ≤ 15th percentile) was most accurately predicted by the AIMS at 4 months, whereas cerebral palsy was most accurately predicted by the NSMDA at 12 months. Best accuracy was achieved when combining NSMDA and AIMS at 4 months as follows: MABC-2 ≤ 15th: accuracy 80 (95% CI 70 to 88); MABC-2 ≤ 5th: accuracy 84 (95% CI 74 to 91); cerebral palsy: accuracy 92 (95% CI 84 to 97).
Suir et al. [50] Netherlands 499 Reliability
(Inter-rater)
I % agreement = 97.80%
      Cultural Validity D (Dutch) = 1.18 x (Canada), 7% proportion explained variance
Syrengelas et al. [52] Greece 424 Reliability
(Inter-rater)
D ICC = 0.99
(95% CI 0.99 to 0.99)
      Cultural Validity I No significant difference with Canada norms x 2 to 3 months, no significant differences in the 5th and 90th percentile
Syrengelas et al. [51] Greece 1441 Construct Validity A Total AIMS score significantly lower for premature infants (p < 0.0001) at each age category
          The rate of mean increase did not differ between premature and full-term infants.
Tse et al. [53] Canada 121 Reliability
(Inter-rater)
I ICC = 0.72 to 0.98 for HINT and AIMS (not sub divided)
      Construct Validity VG With comparison to HINT 4 to 6.5m ρ = 0.831 (n = 121)
          10 to 12.5 months ρ = − 0.84 (n = 109)
Tupsila et al. [54] Thailand 574 Cultural Validity VG Thai scores were lower for 0–1, 1–2, 2–3 and 3–4 months, with a moderate to large effect size; from 6–14 months Thai scores were higher. Overall, Thai infant scores were significantly higher with a moderate effect size.
Valentini and Saccani [55] Brazil 561 Reliability
(Inter-rater)
D ICC = 0.86 to 0.99
      Reliability
(Intra-rater)
D ICC = 0.915 to 0.993
      Reliability
(Test-retest)
D rs = 0.85 p < 0.001
      Internal Consistency VG α = 0.88
      Construct Validity   correlation with DSCB: w = 0.319, k = 0.309, McNemar-Bower test: 0.047
      Predictive Validity I Comparison to DSCB
ICC = 0.96 (95% CI 0.89 to 0.99, p < 0.0001)
Valentini and Saccani [56] Brazil 766 Internal Consistency VG Total score:
α = 0.90 (subscales: 0.84 to 0.92)
      Content Validity D Language Clarity:
k = 0.51 to 0.87
Pertinence:
k = 0.82 to 0.93
      Reliability
(Test-retest)
VG rs = 0.98
      Reliability
(Inter-rater)
VG ICC = 0.86 to 0.99
      Reliability
(Intra-rater)
VG ICC = 0.91 to 0.99
      Construct/Concurrent Validity VG Comparison to CBDS classification: moderate positive and significant correlation,
rs = 0.34, p = 0.03
      Discriminant Validity VG Full-term vs preterm groups: significant difference for the raw score (t = −4.84, p < 0.001) and percentile (t = −1.99, p < 0.05).
      Predictive Validity VG Predictive power (p < 0.001) was limited to the group of infants aged from 3 months to 9 months.
van Hus et al. [57] Netherlands 116 Construct Validity I With comparison to the PDGMS
rs = 0.726 (p < 0.001)
van Schie et al. [58] Netherlands 32 Predictive Validity D Predicting motor and mental outcome after perinatal hypoxic ischemic encephalopathy:
Sensitivity 100% (95% CI 87 to 100),
Specificity 80% (95% CI 48 to 80),
Positive Predictive Value: 92 (95% CI 80 to 92),
Negative Predictive Value (95% CI 0 to 40)
Wang et al. [59] China 50 Reliability
(Inter-rater)
VG ICC = 0.99 (95% CI 0.98 to 0.99)
      Construct Validity VG With comparison to PDGMS −2:
ICC > 0.90 for all subscale except RE
ICC = 0.75
          k = AIMS 5th percentile cut-off 0.580 to 0.724 moderate
Yeh et al. [60] Taiwan 29 Construct Validity VG Consistency of GMA results between the home and clinic recorded videos was very good (k = 0.869; P < 0.05).

Key: COSMIN RoB Score: very good (VG), adequate (A), doubtful (D), and inadequate (I); ASQ-3-GM = Ages and Stages Questionnaire; AIMS = Alberta Infant Motor Scale; AUC = area under curve; BSITD = Bayley Scales of Infant and Toddler Development; BSID – II = Bayley Scales of Infant Development, 2nd Edition; BSID = Bayley Scales of Infant Development; CBDS = Child Behaviour Development Scale; CI = confidence interval; DSCB = Development Scale of Child Behaviour; GMA = General Movement Assessment; HINT = Harris Infant Neuromotor Test; α = Cronbach’s alpha; ρ = Pearson’s correlation coefficient (PCC); rs = Spearman’s rank correlation coefficient (rho); ICC = intraclass correlation coefficient; k = Cohen’s kappa; R2 = Linear Regression; TIMP = Test of Infant Motor Performance; IMP = Infant Motor Profile; MABC-2 = Movement Assessment Battery for Children, 2nd Edition; n = sample size; NSMDA = Neuro-Sensory Motor Development Assessment; PDGMS-2 = Peabody Development Gross Motor Scale, 2nd Edition; PEDI-CAT = Pediatric Evaluation of Disability Inventory Computer Adaptive Test; SD = standard deviation; SEM = standard error of measurement.
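
The predictive validity rows above report sensitivity, specificity, and positive and negative predictive values. As a minimal sketch of how these four quantities relate to a 2 × 2 classification table, the Python below computes them from raw counts. The counts used in the example are hypothetical: they were chosen only because they are arithmetically consistent with the 4th-month percentages listed for Darrah [22] (n = 125) and are not taken from the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2 x 2 table of test result versus reference outcome."""
    tp: int  # test positive, outcome present
    fn: int  # test negative, outcome present
    fp: int  # test positive, outcome absent
    tn: int  # test negative, outcome absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def as_percent(value: float) -> float:
    """Express a proportion as a percentage rounded to one decimal place."""
    return round(100 * value, 1)


# Hypothetical counts (an assumption for illustration, not source data).
example = TwoByTwo(tp=14, fn=2, fp=24, tn=85)

print("Sensitivity", as_percent(example.sensitivity()))
print("Specificity", as_percent(example.specificity()))
print("PPV", as_percent(example.ppv()))
print("NPV", as_percent(example.npv()))
```

With few true cases in the sample, a high sensitivity can coexist with a low positive predictive value, which is the pattern seen in several of the rows above.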

Certainty assessment

The adapted GRADE (Grading of Recommendations, Assessment, Development and Evaluations) approach was used to summarize the evidence across the psychometric elements as follows: 1. high, 2. moderate, 3. low, and 4. very low certainty (see Box 2, Step 4 [4; –], pages 32–36 [92,93]).
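
The four certainty levels can be read as an ordered scale. The sketch below assumes, as is conventional in GRADE-style ratings, that the evidence starts at high certainty and drops one level per downgrade point; the function name and the dictionary layout are illustrative choices, not part of the COSMIN or GRADE guidance.

```python
# Certainty levels named in the text, ordered from highest to lowest.
LEVELS = ["high", "moderate", "low", "very low"]


def certainty(downgrades: dict[str, int]) -> str:
    """Return the certainty level after applying downgrade points.

    Assumption: the rating starts at "high" and drops one level per
    downgrade point, bottoming out at "very low".
    """
    if any(points < 0 for points in downgrades.values()):
        raise ValueError("downgrade points must be non-negative")
    total = sum(downgrades.values())
    return LEVELS[min(total, len(LEVELS) - 1)]


# Downgrade points as listed per COA in the summary of findings (Table 3).
print(certainty({}))                                         # AIMS
print(certainty({"imprecision": 1}))                         # LATCH
print(certainty({"inconsistency": 1, "indirectness": 1}))    # Cobb angle
```

Under this assumption the three calls give high, moderate, and low, matching the word ratings reported for AIMS, LATCH, and the Cobb angle in Table 3.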

Results

Of 2361 records remaining after deduplication, 248 were retrieved for full-text review. Of these, 95 studies met the inclusion criteria, and 77 of them concerned clinician-reported or performance-based outcomes (Figure 2).

Figure 2.

The PRISMA diagram for study flow.

Included studies

We included 77 studies with 9789 participants analyzed: AIMS – Alberta Infant Motor Scale (n = 51 studies), LATCH – Latch, Audible swallowing, Type of nipple, Comfort, Hold (n = 10 studies), Cobb Angle (n = 15 studies), and Postural Assessment (n = 1 study). Table 1 cites the included studies by outcome measure and describes their characteristics, while Table 2 details the population’s age and medical condition by outcome measure. Most studies assessed a spectrum of psychometric properties, and one COA (AIMS) was comprehensive (see Tables 4 to 7 for a detailed description). Most studies used convenience samples drawn from community and government-supported healthcare centers, schools, and daycares. The ages of the participants reflected the purpose of the outcome measure used. Two studies [81,82] using Cobb angles included participants older than 18 years. The sex of the participants was distributed evenly except for postural assessment, where only males were studied. Of the full-text articles screened, nine studies, cited in Appendix D, addressed the validity and reliability of the Cobb angle using methods other than that cited in the scoping review. Gender varied by the diagnoses studied.

Table 2.

Population per outcome measure.

Outcome Measure Age of Population Studied Medical Condition
AIMS 0–5 years old
(51 studies 10,215 participants)
  • Neurologically compromised (145 participants)

  • At risk (undefined) (582 participants)

  • Preterm (<38 weeks’ gestation) (1437 participants)

  • Chronically ill (107 participants)

  • Exposed to human immunodeficiency virus (HIV, 75 participants)

  • Typical (7869 participants).

LATCH Neonates, Infants and mothers
(10 studies, 1717 paired participants,
46 undetermined feeds)
  • Healthy infants

Cobb Angle Children/Adolescents
(15 studies, 808 participants)
  • Idiopathic adolescent scoliosis

Postural Assessment Adolescents
(1 study, 114 participants)
  • Healthy participants

Key: AIMS = Alberta Infant Motor Scale; LATCH = Latch, Audible Swallow, Type of Nipple, Comfort, Hold; HIV = human immunodeficiency virus.

Table 5.

Summary of results for LATCH.

Author (year) Country of Origin n Psychometric COSMIN
Score
RoB
Results
Adams and Hewell [61] USA 25 Reliability
(Inter-rater)
D Percent agreements: 75 to 100 % between lactation consultant and researcher
          Between professional and mother rs = 0.53
Altuntas et al. [62] Turkey 46 Reliability D Total score: rs = 0.85 to 0.91
(Domains: 0.65 to 0.90)
      Construct Validity A Correlation with IBFAT: rs = 0.71
          Correlation with MBA: rs = 0.88
Báez León et al. [94] Spain 20 Reliability
(Inter-rater)
VG Totals; time (1) rs = 0.894 (2) rs = 0.453
(3) rs = 0.729
Chapman et al. [63] USA 45 Reliability
(Inter-rater)
A Day 2: ICC 0.79 (95% CI 0.7 to 0.86)
          Day 4: ICC 0.78 (95% CI 0.67 to 0.86)
          Day 7: ICC 0.89 (95% CI 0.83 to 0.93)
DaConceição et al. [64] Brazil 160 Internal Consistency D α global score: 0.25 to 0.32
      Reliability
(Inter-rater)
A ICC = 0.96
DaConceição et al. [65] Brazil 160 Criterion Validity A Applied by two specialist nurses simultaneously in the evaluation of 160 feeds in order to validate the reliability of the instrument: ICC = 0.96
      Cross-Cultural Validity I Translation of the original instrument to Portuguese and back translation, evaluated by the judges: CVI > 0.90 (60%)
Concordance coefficient among judges (Gwet's AC2): 0.93 (95% CI 0.91 to 0.96).
Dolgun et al. [66] Turkey 127 Construct Validity VG Comparison of BBAT rs = 0.76
Kumar et al. [67] USA 250 Predictive Validity VG 9 or higher during 16 to 24 hours period after birth:
Sensitivity 75%
Specificity 63%
1.7 times more likely to be solely breastfeeding at 6 weeks.
0 to 8 hours, 6 or more:
Sensitivity 92.8%
Specificity 30.0%
Relative Risk: 2.3
8 to 16 hours: 7 or more:
Sensitivity 89%,
Specificity 34%,
Relative Risk 1.8
Lau et al. [68] Singapore 907 Structural Validity VG Exploratory Factor Analysis resulted in removal of C (Comfort)
Confirmatory Factor Analysis
4-item: Goodness-of-fit (GFI): 0.994 to 0.995, Adjusted Goodness of Fit (AGFI): 0.971 to 0.976, Incremental Fit Index (IFI): 0.993 to 0.995, Tucker-Lewis Index (TLI): 0.979 to 0.985, Comparative Fit Index (CFI): 0.993 to 0.995, Root Mean Square Error of Approximation (RMSEA): 0.042 to 0.044
5-item: GFI: 0.961 to 0.980, AGFI: 0.883 to 0.941, IFI: 0.889 to 0.957, TLI: 0.772 to 0.957, CFI: 0.886 to 0.956, RMSEA: 0.086 to 0.125
      Internal Consistency VG α: 4-item: 0.74
          α: 5-item: 0.70
      Predictive
Validity
VG Predicting exclusive or nonexclusive breast-feeding success: 4 items: vaginally delivered (cut off 3.5 and 5.5): Sensitivity 94 to 95%, Specificity 0 to 2%, Positive Predictive Value: 25%, Negative Predictive Value 20 to 47%, Likelihood Ratio = 0.96
Riordan and Koehn [69] USA 28 Reliability
(Inter-rater)
VG r = 0.11 to 0.46
percent agreement: 0.37 to 0.95
      Construct Validity D Correlation with IBFAT: r = 0.69
          Correlation with MBA: r = 0.68

Key: COSMIN Risk of Bias (RoB) Score: very good (VG), adequate (A), doubtful (D), and inadequate (I); USA = United States of America; SD = standard deviation; α = Cronbach’s alpha; ρ = Pearson’s correlation coefficient; rs = Spearman’s rank correlation coefficient (rho); ICC = intraclass correlation coefficient; AGFI = Adjusted Goodness of Fit; BBAT = Bristol Breast-feeding Assessment Tool; CFI = Comparative Fit Index; CVI = Content Validity Index; GFI = Goodness-of-fit; IBFAT = Infant Breast-Feeding Assessment Tool; IFI = Incremental Fit Index; MBA = Mother-Baby Assessment; n = sample size; RMSEA = Root Mean Square Error of Approximation; TLI = Tucker-Lewis Index.

Table 6.

Summary of results for Cobb angle.

Author (year) Country of Origin n Psychometric COSMIN score
RoB
Results
Adam et al. [70] Australia 12 Reliability
(Intra-rater)
I Error ± 7.7 degrees for single reading
±5.9 degrees for multiple readings.
Allen et al. [72] Canada 22 Reliability
(Intra-rater)
D Manual method: ICC = 0.95
          Semiautomated: ICC = 0.92 to 0.94
          Automated: ICC = 0.91 to 0.92
      Reliability
(Inter-rater)
D Manual method:
ICC = 0.93 to 0.95 (95% CI 0.83 to 0.98)
          Semiautomated:
ICC = 0.89 to 0.93 (95% CI 0.76 to 0.97)
          Automated:
ICC = 0.83 to 0.96 (95% CI 0.63 to 0.98)
      Construct Validity D Correlation between manual and automated:
ICC = 0.30 (95% CI 0 to 0.6)
          Correlation between semiautomated and automated:
ICC = 0.30 (95% CI 0 to 0.6)
          Correlation between manual and semiautomated:
ICC = 0.70 (95% CI 0.4 to 0.9) Outliers removed
Al-Bashir et al. [71] India, Jordan, USA 28 Reliability I Mean standard deviations: manual: 5.28 degrees
Computer algorithm: 2.64
      Construct Validity VG Correlation between manual and computer algorithm: rs = 0.80
Brink et al. [73] Netherlands 33 Reliability
(Inter-rater)
D Automated: ICC = 0.94 (95% CI 0.85 to 0.95)
Manual Spinous Process (SP) angle:
ICC = 0.86 (95% CI 0.68 to 0.94)
Manual Transverse Process (TP) angle:
ICC = 0.84 (95% CI 0.64 to 0.94)
      Reliability
(Intra-rater)
D Automated: ICC = 0.97
Manual SP angle: ICC = 0.96
Manual TP angle: ICC = 0.94
De Carvalho et al. [74] France 20 Reliability
(Inter-rater)
D 30 cm × 90 cm: rs = 0.886
ICC = 0.94 (95% CI 0.92 to 0.95)
          14 cm × 42 cm: rs = 0.888
ICC = 0.93 (95% CI 0.91 to 0.95)
      Reliability
(Intra-rater)
D 30 cm × 90 cm: rs = 0.932
ICC = 0.95 to 0.99
          14 cm × 42 cm: rs = 0.832
ICC = 0.91 to 0.98
      Construct Validity I Comparison between small films and plain-films measurements ICC = 0.95 (95% CI 0.96 to 0.97); The graphic study of agreement proposed by Altman and Bland showed discordance higher than 10 between 30 to 90 cm radiographs and small radiograph measurements for 16 of 640 measurements.
Gstoettner et al. [75] Austria 48 Reliability
(Inter-rater)
A Manual set:
ICC = 0.96 (95% CI 0.96 to 0.97)
          Digital set:
ICC = 0.93 (95% CI 0.92 to 0.93)
      Reliability
(Intra-rater)
A Manual set:
ICC = 0.97 (95% CI 0.94 to 0.83)
          Digital set:
ICC = 0.96 (95% CI 0.90 to 0.98)
Kumar et al. [76] India 20 Reliability
(Inter-rater)
A Average 4 degrees SD ICC > 0.9
      Reliability
(Intra-rater)
A Average 2 degrees SD ICC > 0.9
Livanelioglu et al. [77] Turkey 51 Reliability
(Inter-rater)
A ICC = 0.96 (95% CI: 0.93 to 0.97)
      Construct Validity VG Comparison of Cobb1 to smart mouse1:
rs = 0.93 (95% CI 0.88 to 0.96)
          Comparison of Cobb1 to smart mouse2:
rs = 0.87 (95% CI: 0.77 to 0.92)
Loder et al. [78] USA 64 Reliability
(Inter-rater)
I ±7 to 7.9 degrees
      Reliability
(Intra-rater)
I ±6.1 to 6.9 degrees
Marchetti et al. [79] Other: Brazil tsp: 90 Reliability
(Inter-rater)
A Thoracic spine: ICC = 0.77 (95% CI: 0.67 to 0.84) SEM: 4.8 degrees, MDC: 9.3 degrees
    lsp: 89     Lumbar spine: ICC = 0.87 (95% CI: 0.76 to 0.89) SEM: 3.8 degrees MDC: 7.5 degrees
      Reliability
(Intra-rater)
I Thoracic spine: ICC = 0.75 (95% CI: 0.68 to 0.82) SEM: 5.2 degrees, MDC: 10.2 deg
          Lumbar spine: ICC = 0.76 (95% CI: 0.69 to 0.82) SEM: 4.3 degrees, MDC: 8.5 deg
      Construct Validity D Compared to measuring apex angle and linear measurements:
Thoracic spine: rs = 0.791 (p < 0.001)
          Lumbar spine: rs = 0.689 (p < 0.001)
Mehta et al. [80] Korea 318 Reliability
(Inter-rater)
D rs = 0.986 (both endplate and pedicle measurement)
      Reliability
(Intra-rater)
D Endplate measurement rs = 0.980
          Pedicle measurement rs = 0.986
Pruijs et al. [81] Netherlands 41 Reliability I Measurement error, 40 degrees of freedom, Range 6 to 48 degrees SD 3.67 degrees
Safari et al. [82] Iran 14 Construct Validity I “New” semi-automated method compared to gold standard difference in mean was 4 degrees
Stokes and Aronsson [83], USA 27 Reliability
(Inter-rater)
I 2.3 to 2.7 degrees
      Reliability
(Intra-rater)
I 1.8 to 2.4 degrees
Xiaohua et al. [84] USA 20 Reliability
(Inter-rater)
D Manual method: ICC = 0.92
          Digital method: ICC = 0.97

Key: COSMIN RoB Score: very good (VG), adequate (A), doubtful (D), and inadequate (I); USA = United States of America; ICC = intraclass correlation coefficient; SD = standard deviation; p = probability level; rs = Spearman’s rank correlation coefficient (rho); MDC = minimal detectable change; SEM = standard error of measurement.
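
Several rows above report a standard error of measurement (SEM) and a minimal detectable change (MDC) alongside the ICC. The sketch below shows two widely used textbook relations, SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. Whether the included studies used these exact conventions is an assumption; the function names and the SD and ICC values in the first example are hypothetical.

```python
import math


def sem_from_sd(sd: float, icc: float) -> float:
    """Standard error of measurement from a sample SD and a reliability ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of 10 degrees and ICC of 0.75.
print(round(sem_from_sd(10.0, 0.75), 2))

# SEM of 4.8 degrees, as reported for the thoracic spine (inter-rater) above.
print(round(mdc95(4.8), 1))
print(round(1.96 * 4.8, 1))  # single-measurement 95% band, for comparison
```

The MDC values listed for Marchetti et al. [79] sit closer to 1.96 × SEM than to 1.96 × √2 × SEM. This is an observation from the numbers in the table, not a statement of that study's method.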

Table 7.

Summary of results for postural assessment.

Author (Year) Country of Origin n Psychometric COSMIN RoB Results
Watson and Mac Donncha [85] Ireland 114 Reliability
(Intra-rater)
I 10-items for Bland and Altman plot % agreement = 86.6% to 100%
      Reliability
(Inter-rater)
I 10-items for Bland and Altman plot % agreement = 93.3% to 100%

Key: COSMIN RoB Score: very good (VG), adequate (A), doubtful (D), and inadequate (I); n = sample size.

Excluded studies

We excluded 169 studies for the following reasons: 1) the participants were adults; 2) the COA was not on the pre-identified list from the scoping review; 3) more than 20% of the population under research was over 18 years of age; or 4) the study did not address reliability, validity, responsiveness, or information related to the interpretability and feasibility of the reported outcome measure.

Risk of bias

The quality of research was judged per psychometric property reported, not per study. Research quality was judged to be adequate or very good for 31% of reliability assessments and 43% of validity assessments; the most common sources of bias were poor statistical analysis, unclear methodology, and small sample sizes (see Table 3).

Table 3.

Summary of findings for each COA, including: 1) the criteria rating for good measurement properties (sufficient (+), insufficient (–), inconsistent (±), or indeterminate (?); see gray highlighted ROW) for each measurement property (content validity, structural validity, internal consistency, reliability, measurement error, hypothesis testing for construct validity, cross-cultural validity, criterion validity, and responsiveness); 2) the overall rating across measurement properties for each COA (sufficient (+), insufficient (–), or indeterminate (?); see gray highlighted COLUMN); and 3) the certainty of evidence (GRADE). Each author citation is followed by the risk of bias score for that psychometric property: very good (VG), adequate (A), doubtful (D), or inadequate (I) (e.g. Allen 2008 (D)).

COA Content Validity
Structural Validity
Reliability
Validity (Construct, Cross-cultural, Criterion)
   
Criteria Rating Internal Consistency Measurement Error Responsiveness Overall Rating across Measurement Properties Certainty of Evidence
(GRADE)
AIMS Content Validity: -Relevance founded on functional sequences in early motor development. Piper (1992) (A);
-Pertinence: (k) = 0.82 to 0.93 (1 study, n = 766) Valentini (2012) (D)
-Comprehensiveness 84-items reduced to 56-items by 291 PTs and 6 international experts. Piper (1992) (A)
-Comprehensibility feasibility tested(1 study, n = 291 PT and 6 experts) Piper (1992) (A)
- PROM item’s assessed and appropriately worded: assessed clarity k = 0.51 to 0.87 (1 study, n = 766) Valentini (2012) (D)
Structural Validity: Rasch Analysis = 51.05 (SD 8.28) (1 study, n = 97) Pai-jun (2004) (D)
Internal Consistency: α = 0.88 to 0.99, (4 studies, n = 1437), Lackovic (2020) (I), Morales-Monforte (2017) (VG), Valentini (2011) (VG), Valentini (2012) (VG)
Test-retest: α = 0.85 to 0.998 (4 studies, n = 1525) Lackovic (2020) (I), Valentini (2012) (VG), Piper (1992) (A), Valentini (2011) (D)
Intra-rater: ICC = 0.55 to 0.99, (8 studies, n = 1720) Aimsamrarn (2019) (I), Albuquerque (2018) (VG), Boonzaaijer (2017) (VG), Jeng (2000) (A), Morales-Monforte (2017) (D), Pin (2010) (VG), Silva (2013) (VG), Valentini (2011) (D), Valentini (2021) (VG)
Inter-rater: ICC = 0.55 to 0.99, (23 studies, n = 2962) Aimsamrarn (2019) (I), Albuquerque (2018) (VG), Almeida (2008) (VG), Bartlett (2003) (D), Blanchard (2004) (D), Boonzaaijer (2017) (VG), Darrah (1996) (VG), Darrah (1998b) (D), Fleuren (2007) (A), Jeng (2000) (A), Krosschell (2018) (VG), Ga (2011) (I), Lackovic (2020) (I), Morales-Monforte (2017) (D), Pin (2010) (VG), Piper (1992) (A), Quezada-Villalobos (2010) (VG), Snyder (2008) (VG), Suir (2019) (I), Syrengelas (2010) (A), Tse (2008) (I), Valentini (2011) (D), Wang (2018) (VG)
Measurement error:- SEM = 0.92 to 0.96 (1 study, n = 48) Boonzaaijer 2017 (VG)
- SEM varied by age (1 study, n = 45) Jeng 2000 (A)
- SDC NR
- LoA NR
Construct Validity: -Comparison of ASQ-3-GM rs = 0.697 (1 study, n = 37) Fauls (2020) (VG)
-Comparison to BSID or BSID-II or BSID-III or BSITD had varied results rs = 0.36 to 0.969 (10 studies, n = 879) Aimsamrarn (2019) (D),
Almeida (2008) (VG), Campos (2006) (I), Darrah (1996) (VG), Harris (2010) (VG), Hoskens (2018) (D), Jeng (2000) (VG), Morales-Monforte (2017) (VG), Piper (1992) (A), Siegle (2018) (A)
-Comparison with CBDS classification rs = 0.34 (1 study, n = 766) Valentini (2012) (VG)
-Comparison with DSCB k = 0.309 (1 study, n = 561) Valentini (2011) (I)
-Consistency of GMA results between the home and clinic recorded videos k = 0.869 (1 study, n = 29) Yeh (2020) (VG)
-Comparison to HINT at 4 to 6 months rs = 0.831 or 10 to 12.5 months, rs = −0.84 (1 study, n = 121) Tse (2008) (VG)
-Comparison to PDGMS or PDGMS-2 rs = 0.72 to 0.97 or ICC > 0.90 (4 studies, n = 287) Piper (1992) (A), Snyder (2008) (VG), van Hus (2013) (I), Wang (2018) (VG)
(+) sufficient High ⊕⊕⊕⊕
      -Comparison to TIMP or IMP rs = 0.64 to 0.91 or rs = 0.37 to 0.54 through the various ages (5 studies, n = 1111) Campbell (2000) (VG), Chiquetti (2020) (D), Heineman (2013) (VG), Heineman (2013) (VG), Rizzi (2021) (VG)
-Comparison to TIMP at 30, 60 and 90 days: ICC = 0.20 to 0.67 (1 study, n = 96) Campbell (2002) (VG)
-Comparison of the PEDI-CAT: rs = 0.32 (1 study, n = 53) Dumas (2015) (A)
-Comparison sub-items at varied ages and positions significant differences identified (1 study, n = 31) Pin (2020) (A)
-Comparison premature vs other age categories p < 0.0001 (1 study, n = 1441) Syrengelas (2016) (A)
-Comparison of original and contemporary norms: ICC = 0.99 (1 study, n = 650) Darrah (2014) (VG)
Cross-cultural Validity: -Significantly more Dutch children score below the cut off scores (1 study, n = 118) Fleuren (2007) (VG)
-Comparison of Brazilian and Canadian norms: t-test: < 2.614
-Comparison of Greece and Canadian norms: no significant difference
-Comparison of Thai score and Canadian norms: significantly higher (3 studies, n = 2110) Saccani (2012) (A), Syrengelas (2010) (I), Tupsila (2020) (VG)
   
      Criterion Validity: -Predictive validity -10th percentile cut-off had varied results Sensitivity 75.8%, Specificity 92.9% (4 studies, n = 642) Albuquerque (2018) (I), Darrah (1998) (VG), Ga (2022) (VG), Harris (2020) (VG)
-Abnormal vs suspicious: 8th month: varied results, Sensitivity 100%, Specificity 90.8% (2 studies, n = 125+164) Darrah (1996) (VG), Darrah (1998) (VG)
-High false positive rate at 4 month, improved at 7 months (1 study, n = 39) Fetters (2000) (D)
-Predict BSID of various levels <85; <70 at 4 months or 10 to 12 months (see 7) (1 study, n = 160) Lefebvre (2016) (VG)
-Predict DSCB ICC = 0.96 (1 study, n = 561) Valentini (2011) (I)
-Predict IMP AUC = 0.85 (1 study, n = 86) Rizzi (2021) (VG)
-Predict MABC-2 scores or NSMDA at 4 or 12 months: Accuracy 80 to 92% (1 study, n = 87), Spittle (2015) (VG)
-Predicting motor and mental outcome after perinatal hypoxic ischaemic encephalopathy Sensitivity 100%, Specificity 80%, van Schie (2010) (D)
-Predictive power limited to group of infants aged 3 months vs 9 months p < 0.001 (1 study, n = 766) Valentini (2012) (VG)
-Concurrent validity -3 subgroups different F = 11.03, df 59, p < 0.001 (1 study, n = 60) Bartlett (2003) (VG)
-Comparison live and home video ICC = 0.99 (1 study, n = 48) Boonzaaijer (2017) (VG)
Responsiveness: SDC NR; MIC NR; ROC NR
   
Criteria Rating +|+|+ +|+ +|+|+|?    
LATCH Content Validity: NR
Structural Validity: Exploratory Factor Analysis resulted in removal of C (Comfort), Confirmatory Factor Analysis 4-items GFI 0.994 to 0.995 (1 study, n = 907) Lau (2016) (VG)
Internal Consistency: 5-item α = 0.25 to 0.70, 4-item α = 0.74 (2 studies, n = 1067) Da Conceicao (2017) (D), Lau (2016) (VG)
Inter-rater: ICC = 0.79 to 0.96, (7 studies, n = 484) Adams (1997) (D), Altuntas (2014) (D), Baez Leon (2008) (VG), Chapman (2016) (A), Da Conceicao (2017) (A), Da Conceicao (2018) (A), Riordan (1997) (VG)
Test-retest: NR
Measurement error: SEM NR; SDC NR; LoA NR
Construct Validity: -Comparison on Infant Breast-Feeding Assessment Tool rs = 0.69 to 0.71, (2 studies, n = 74) Altuntas (2014) (A), Riordan (1997) (D)
-Mother-Baby Assessment rs = 0.88, (2 studies, n = 74) Altuntas (2014) (A), Riordan (1997) (D)
-Bristol Breast-feeding Assessment Tool rs = 0.76, (1 study, n = 127), Dolgun (2018) (VG)
Cross-cultural Validity: Forward and back translation to Portuguese CVI >0.90 (60%)(1 study, n = 160) Da Conceicao (2018) (I)
Criterion Validity:
-Predictive -9 or higher during 16 to 24 hours period after birth: Sensitivity 75%, Specificity 63% (1 study, n = 250) Kumar (2006) (VG)
-Predicting exclusive or nonexclusive breast-feeding success: 4 items: (cut off 3.5 and 5.5): Sensitivity 94 to 95%, Specificity 0 to 2%, Likelihood Ratio = 0.96, ROC analysis confirmed high sensitivity (1 study, n = 907) Lau (2016) (VG)
Responsiveness: SDC NR; MIC NR; ROC NR
(+) sufficient MODERATE ⊕⊕⊕⊝
downgraded due to
  • imprecision (-1: total 10 studies, n < 100; no meta-analysis)

Criteria Rating ?|+|+ +|? +|-|+|?    
Cobb Angle Content Validity: NA
Structural Validity: NA
Internal Consistency: NA
Intra-rater: ICC = 0.92 to 0.97, or rs = 0.980 (10 studies, n = 654) Adam (2005) (I), Allen (2008) (D), Brink (2018) (D), DeCarvalho (2007) (D), Gstoettner (2007) (A), Kumar (2009) (A), Lodar (2004) (I), Marchetti (2017) (I), Mehta (2009) (D), Stokes (2006) (I)
Inter-rater: ICC = 0.75 to 0.96, or rs = 0.986 (11 studies, n = 713) Allen (2008) (D), Brink (2018) (D), DeCarvalho (2007) (D), Gstoettner (2007) (A), Kumar (2009) (A), Livanelioglu (2016) (A), Lodar (2004) (I), Marchetti (2017) (A), Mehta (2009) (D), Stokes (2006) (I), Xiaohua (2013) (D)
Test-retest: NA
Measurement error: - SEM 40 degrees of freedom (1 study, n = 41) Pruijs (1995) (I)
- SEM 4.8 degrees thoracic and SEM 3.8 degrees lumbar (1 study, n = 90) Marchetti (2017) (A)
- MDC thoracic spine 9.3 degrees and MDC lumbar spine 7.5 degrees (1 study, n = 90) Marchetti (2017) (A)
- LoA NR
Construct Validity: -Comparison manual and automated ICC = 0.30
-Comparison manual and semiautomated (outliers removed) ICC = 0.70 (1 study, n = 22) Allen (2008) (D)
-Comparison manual and computer algorithm rs = 0.80 (1 study, n = 28) Al-Bashir (2019) (VG)
-Comparison between small films and plain-films measurements ICC = 0.95 (1 study, n = 20) De Carvalho (2007) (I)
-Comparison of Cobb1 to smart mouse1 or smart mouse2: rs = 0.93 and 0.87 (1 study, n = 51) Livanelioglu (2016) (VG)
-Compared to measuring apex angle and linear measurements: Thoracic spine: rs = 0.79 (1 study, n = 90) Marchetti (2017) (D)
-“New” semi-automated method compared to gold standard difference in mean was 4 degrees (1 study, n = 14) Safari (2019) (I)
Cross-cultural Validity: NA
Criterion Validity: NA
Responsiveness: SDC NR; MIC NR; ROC NR
(?) indeterminate LOW ⊕⊝⊝⊝ downgraded due to limitation in
  • inconsistency (-1: numerous studies are rated doubtful or inadequate in study design);

  • indirectness (-1: about 20% of population was between 18 to 40 years of age)

Criteria Rating NA|NA|NA +|+ +|NA|NA|?    
Postural Assessment Content Validity: NR
Structural Validity: NA
Internal Consistency: NA
Intra-rater: 10-items for Bland and Altman agreement 86.6 to 100%, (1 study, n = 114) Watson (2000) (I)
Inter-rater: 10-items for Bland and Altman agreement 93.3 to 100%, (1 study, n = 114) Watson (2000) (I)
Measurement error: SEM NR; SDC NR; LoA NR
Construct Validity: NR
Cross-cultural Validity: NR
Criterion Validity: NR
Responsiveness: SDC NR; MIC NR; ROC NR
(?) indeterminate VERY LOW ⊝⊝⊝⊝
downgraded due to limitation in
  • risk of bias (-2: COSMIN score= inadequate, only agreement was assessed; poor statistical analysis)

  • indirectness (-1: biased population)

  • imprecision (-1: one study with sample size n = 114)

Criteria Rating ?|NA|NA ?|? ?|?|?|?    

Key: ¶ Piper and Darrah 1994 (text book) detail theories of motor development and construction of a Motor Assessment Tool for the Developing Infant. COA = clinical outcome assessment; Criteria Rating: “+” = sufficient, “−” = insufficient, “?” = indeterminate, consistent with criteria defined in Table 1 from [88]. AIMS = Alberta Infant Motor Scale; BSID = Bayley Scales of Infant Development; CBDS = Child Behavior Development Scale; CVI = Content Validity Index; DSCB = Development Scale of Child Behaviour; GFI = Goodness of Fit Index; GMA = General Movement Assessment; LATCH = “L” identifies how well the infant latches onto the breast. “A” identifies the amount of audible swallowing noted. “T” identifies the mother’s nipple type. “C” identifies the mother’s level of comfort. “H” indicates the amount of help the mother needs to hold her infant to the breast; LoA = Limits of Agreement; MIC = Minimal important change; α = Cronbach’s alpha; ρ = Pearson’s correlation coefficient; rs = Spearman’s ranked correlation coefficient (rho); ICC = intraclass correlation coefficient; k = Cohen’s kappa; MABC-2 = Movement Assessment Battery for Children, 2nd Edition; n = sample size; NSMDA = Neuro-Sensory Motor Development Assessment; NA = not applicable; NR = not reported; PDGMS = Peabody Developmental Gross Motor Scale; PEDI-CAT = Pediatric Evaluation of Disability Inventory Computer Adaptive Test; ROC = receiver operating characteristic curve; SDC = smallest detectable change; SEM = standard error of measurement; TIMP = Test of Infant Motor Performance.

Main results by clinical outcome assessment

Table 3 summarizes the main findings, with overall ratings by COA and the GRADE certainty of evidence with related reasons for downgrading. The main psychometric properties can be found in Table 4 for AIMS, Table 5 for LATCH, Table 6 for Cobb Angle, and Table 7 for Postural Assessment.

Performance-based outcomes

  • Alberta Infant Motor Scale (AIMS)

The Alberta Infant Motor Scale has a predictive purpose to classify motor learning. It is an observational assessment scale that assesses gross motor maturation and skills in infants from ages 0 to 18 months with typical development or those with developmental delay or atypical patterns. It measures weight bearing, posture, and antigravity movements of infants. Fifty-one studies examined the psychometric properties of AIMS (Table 4). Study populations included healthy, preterm, chronically ill, and neurologically compromised infants, as well as infants exposed to HIV (human immunodeficiency virus). Content validity (two studies, n = 1057) was assessed for relevance, comprehensiveness, comprehensibility, and appropriate wording. Forty-six properties of the validity domain were studied, including construct, predictive, criterion, concurrent, structural, and cultural validity. With the exception of one outlier [47], there was a moderate to strong correlation with other outcome measures intended to evaluate motor performance of infants. Thirty-eight properties of the reliability domain were studied. If an ICC above 0.70 is taken as the threshold for acceptable reliability, the studies showed acceptable inter-rater reliability; apart from one outlier [48], ICC values ranged from 0.85 to 0.99. High certainty evidence supports the use of AIMS, although responsiveness needs further research. The overall rating was sufficient (+). There was a greater number of large studies with lower risk of bias across all psychometric domains exploring different aspects of reliability. Construct validity explored numerous hypotheses with large trials (see Table 3).

Clinician-reported outcomes

  • LATCH

The quality of breastfeeding is evaluated using the evaluative measure LATCH [95]; each letter of the acronym indicates an area of assessment. ‘L’ identifies how well the infant latches onto the breast. ‘A’ identifies the amount of audible swallowing noted. ‘T’ identifies the mother’s nipple type. ‘C’ identifies the mother’s level of comfort. ‘H’ indicates the amount of help the mother needs to hold her infant to the breast. Structural validity was assessed using confirmatory factor analysis on four items, after exploratory factor analysis led to the removal of one item, C (Comfort). Internal consistency was also better with four items (α = 0.74) than with five items (α = 0.25 to 0.70). Six further elements of validity and seven of inter-rater reliability were studied (Table 5) in a population of healthy newborns and infants. The certainty of evidence was judged to be moderate, downgraded due to imprecision (−1: total 10 studies, n < 100; no meta-analysis). The overall rating was sufficient (+) (see Table 3).
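The criterion validity figures reported for LATCH in the table above (sensitivity 94 to 95%, specificity 0 to 2%, likelihood ratio 0.96) can be related through the standard definition of the positive likelihood ratio, LR+ = sensitivity / (1 − specificity). The sketch below is illustrative only: the function name is hypothetical, and because this section does not state which sensitivity and specificity value belongs to which cut-off, it simply evaluates every combination of the reported bounds.

```python
from itertools import product


def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Return LR+ = sensitivity / (1 - specificity).

    Both arguments are proportions in [0, 1]; specificity must be below 1,
    otherwise the ratio is undefined (division by zero).
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must be in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Bounds reported for the 4-item LATCH (Lau 2016), expressed as proportions.
# Pairing of values with cut-offs is an assumption, so all pairs are tried.
sensitivities = (0.94, 0.95)
specificities = (0.00, 0.02)

lr_by_pair = {
    (sens, spec): positive_likelihood_ratio(sens, spec)
    for sens, spec in product(sensitivities, specificities)
}

for (sens, spec), lr in lr_by_pair.items():
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f} -> LR+={lr:.2f}")
```

All four combinations give an LR+ between roughly 0.94 and 0.97, which is consistent with the reported value of 0.96. A likelihood ratio this close to 1 means a positive result shifts the pre-test odds very little, despite the high sensitivity.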

  • Cobb Angle

Cobb Angle has a discriminative purpose and is used with scoliosis. It is a standard measurement to determine the presence of scoliosis and to track its progression. Of 15 studies (see Table 6), six assessed validity and 12 assessed various aspects of reliability. All studies correlated different novel methods of digital measurement against a traditional reading but did not directly address the validity of the Cobb angle as an indication of disease severity. The certainty of the evidence was low, downgraded due to limitations in inconsistency (−1: numerous studies are rated doubtful or inadequate in study design) and indirectness (−1: about 20% of population was between 18 to 40 years of age). The overall rating of the Cobb Angle was indeterminate (?) (see Table 3).

  • Postural Assessment

A postural assessment can involve observation of static posture for symmetry and alignment of anatomic landmarks. It may be visual or use a palpation assessment. The patient is directed to stand with feet shoulder-width apart and arms relaxed to the sides. One study (Watson and Mac Donncha, 2000; Table 7) examined intra-rater and inter-rater reliability of postural assessment in male adolescents using a photographic method. The certainty of the evidence was very low, downgraded due to limitations in risk of bias (−2), imprecision (−1), and indirectness (−1). We could not find reports on validity or responsiveness. The overall rating of postural assessment was indeterminate (?) (see Table 3).

Discussion

We evaluated the psychometric properties of four of 18 pre-identified COAs used to determine the effectiveness of spinal manipulation and mobilization for various medical conditions in a pediatric population [2]. One performance-based outcome measure (AIMS) and one clinician-reported outcome measure (LATCH) had sufficient information; AIMS had high certainty evidence and LATCH had moderate certainty evidence, but both lacked sufficient assessment of responsiveness.

Overall completeness and applicability of evidence

The context and purpose of this systematic review was to determine if the included COAs can and should be used to evaluate the effectiveness of spinal manipulation or mobilization for various medical conditions in a pediatric population. We questioned if the identified COAs were fit for this purpose. Whilst we determined that one performance-based outcome and one clinician-reported outcome measure have high to moderate levels of certainty, respectively, they both lacked assessment of responsiveness. A COA with poor responsiveness may fail to detect a treatment effect when one exists, leading to a false-negative result or a failed trial. Additionally, the greatest level of certainty comes from using a COA that was developed and assessed in the target patient population in which it will be used. The construct underpinning the clinical reasoning in performing a manipulation and mobilization in diverse medical conditions remains unclear.
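As background to the concern about undetected change, the quantities abbreviated in the tables as SEM, SDC, and MIC are commonly related as sketched below. This is the widely used general definition at the 95% confidence level, given here as an assumption for illustration; it is not a formula taken from the included studies, and individual studies may use other confidence levels or definitions.

```latex
\mathrm{SDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}
\qquad
\text{observed change } \Delta \text{ exceeds measurement error only if } |\Delta| > \mathrm{SDC}
```

Under this definition, when the SDC of an instrument is larger than its MIC, a change that matters to the patient can fall within measurement error and go undetected.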

The AIMS is a well-researched performance-based outcome tool used to determine motor development in infants [23]. We have high certainty that it is a reliable and valid tool that gives appropriate results for the staging of infants’ motor development. One randomized controlled study that used AIMS as its outcome measure was identified through a scoping review; it examined the effect of manual therapy on non-synostotic plagiocephaly [96]. It is beyond the scope of this paper to determine if manipulation or mobilization is an effective treatment technique for plagiocephaly; however, the scientific basis of this construct is in debate [97]. Considering the objective of this study, AIMS was an appropriate outcome measure for assessing motor development; however, since responsiveness is not reported, it may be inadequate in this medical condition.

It has been suggested that chiropractic manipulation is an effective treatment option to improve breastfeeding due to mechanical restriction of the infant’s cervical spine and temporomandibular joint [98–100]. Fry [101] completed a literature review regarding the role of chiropractic care and breastfeeding. Chiropractic care was identified to be beneficial if there was a biomechanical limitation to breastfeeding. Successful treatment was aimed at the cranial sutures, temporomandibular joint, and cervical spine, particularly the atlantoaxial joint. Again, the scientific basis of this construct of thinking is lacking, and it is hard to find evidence for the relation between, for example, movement of the cranial sutures and breastfeeding. One randomized controlled study was identified by the scoping review [2]. Mother-infant dyads were randomized to either a group with a lactation consultation and sham osteopathic treatment or a group with a lactation consultation and osteopathic treatment [102]. The biological rationale that manipulation or mobilization should have a positive effect on breastfeeding does not seem plausible and is without supporting evidence. The LATCH tool was presented by Jensen and colleagues [95] as a clinician-reported outcome that assesses the quality of the latch, audible swallowing, nipple type, comfort, and hold to measure the effectiveness of breastfeeding. Research from this review demonstrates that LATCH has a moderate certainty of evidence for reliability and validity. No evidence was found regarding responsiveness. It is beyond the scope of this paper to draw conclusions regarding the effectiveness of treatment utilizing manipulation or mobilization to improve the infant’s biomechanics of breastfeeding. It is unclear if the LATCH is an appropriate tool to measure responsiveness to changes in breastfeeding.

Quality of the evidence

A thorough methodological assessment of included studies using the robust and validated COSMIN checklist across a variety of outcome measures identified that the limitations in the retrieved articles were poor statistical analysis, poor methodological reporting, and low sample size. A thoughtful and strategic approach to instrument selection in clinical trials is a unique challenge.

Potential biases in the review process

Our medical information specialist was unable to limit the search parameters to the medical conditions discussed in the scoping review due to the vague definition of medical conditions within the identified studies in the scoping review [2]. Additionally, we did have language restrictions: our search included only English, Spanish, Dutch, French, and German articles. We maximized the inter-rater reliability for the COSMIN score by organizing group consensus discussions on three occasions during the review process to ensure calibration and interpretation of checklist items.

Agreements and disagreements with other studies or reviews

No other systematic reviews assessing the psychometric properties of LATCH and postural assessment were identified for pediatric conditions. We found reviews on the measurement of the Cobb angle, but these reviews did not mention a pediatric population. Three systematic reviews regarding the AIMS agreed with our review [49,103,104]. They concluded that the AIMS was valid and reliable but was not ideal for use with preterm infants and lacked predictive value. Spittle et al. [89] concluded that the AIMS was good at predicting motor impairment of corrected age preterm infants, with the most consistent results at 4 months. Albuquerque et al. also found a small ceiling effect. Kjølbye et al. [105] differed from our review in that they also stated that AIMS was valid for staging motor skills as confirmed with the Peabody Developmental Gross Motor Scale and the Bayley Scales of Infant Development. None of the reviews found data on responsiveness, similar to our findings.

Conclusion

The AIMS has high and the LATCH moderate certainty of evidence, and both demonstrated sufficient measurement properties to be used in clinical practice. Responsiveness had not been sufficiently assessed for either.

High-quality research assessing the psychometric properties of diagnostic and evaluative tools utilized with pediatric populations is needed to fill the gap for those utilizing spinal manipulation or mobilization to manage a variety of medical conditions in infants, children, and adolescents. Furthermore, a focus on responsiveness is needed.

Acknowledgements

Jurgen Mollema, University of Applied Sciences, The Netherlands was our medical information specialist and research librarian. Derek Clewney, Duke University, USA, is a member of the collaborative International Federation of Orthopaedic Manipulative Physical Therapists (IFOMPT) and International Organisation of Physical Therapists in Paediatrics (IOPTP) Task Force on Spinal Manipulation in Children and assisted with data screening.

Biographies

Tricia Hayton is a private practitioner in Oakville, Ontario, Canada. She was a graduate student of the OMPT program at McMaster University and completed this work as part of her research requirements.

Anita Gross is an Associate Clinical Professor at McMaster University in the School of Rehabilitation Sciences, leading their advanced orthopedic musculoskeletal-manipulative physical therapy (OMPT) program. She is a lecturer in the Master’s of Clinical Science program in Manipulative Therapy at Western University and the Canadian Physiotherapy Association AIM program. She is the chair of the IFOMPT/IOPTP Taskforce on Pediatric Manipulation informing PT policy with systematic reviews and evidence gap maps. She is a clinician scientist and educator. She has over 150 peer reviewed publications, has been principal/co-investigator on 30 grants and has been an invited speaker at 20 international conferences. She coordinates the Cervical Overview Group, an International Network that conducts and maintains Cochrane systematic reviews on neck pain and participates in randomized clinical trials on back pain (Welback). She works in private practice OMPT and is a Fellow of the Canadian Academy of Manipulative Physiotherapy (FCAMPT).

Annalie Basson is a clinician and part-time lecturer at the University of Witwatersrand, Johannesburg, South Africa working in private practice in Pretoria.

Ken Olson is the president and co-owner of the physical therapy private practice Northern Rehab Physical Therapy Specialists in DeKalb, Illinois and is adjunct faculty for Northern Illinois University. He is a Past-President of both the International Federation of Orthopaedic Manipulative Physical Therapists (IFOMPT) and the American Academy of Orthopaedic Manual Physical Therapists (AAOMPT).

Oliver Ang’s primary research interests are innovative interventions using digital technologies to address cervical disorders and contextual factors, particularly therapeutic alliance, in physical therapy treatments. He is currently involved in the Spinal Manipulation and Patient Self-Management for Preventing Chronic Back Pain (PACBACK), the Integrated Supported Biopsychosocial Self-Management for Back Related Leg Pain (SUPPORT) and Partners4Pain studies, funded by the US National Institutes of Health (NIH). He is a member of the validity assessment team of the Cervical Overview Group.

Nikki Milne works as an Associate Professor of Physiotherapy (Paediatrics) at Bond University where she has worked for the past 16 years. Prior to starting work in the academic setting Nikki worked as a Paediatric physiotherapist for NSW Health which led to her research interests in child health and wellbeing and paediatric curriculum. Nikki has a special interest in child health, learning and paediatric physiotherapy and is passionate about the inclusion of paediatric curriculum in entry-level physiotherapy programs, to ensure that all graduates of accredited entry-level programs have knowledge and skills to safely and effectively work with children.

Jan Pool has worked as Associate Professor at the Institute of Human Movement Studies, Faculty of Health Care and as a Coordinator/Head of Master Program Physical Therapy division; Orthopedic Manual Therapy. He was senior researcher of Research Group Lifestyle and Health, HU University of Applied Sciences Utrecht, Utrecht, The Netherlands. He worked as a manual therapist for over 30 years in a private clinic. His scientific interest culminated in a master’s degree in epidemiology in 2003 and a doctorate in medicine in 2007, both at the Free University Amsterdam. He wrote numerous articles on the topics of neck pain, chronic pain and manipulative therapy and has a special interest in clinimetry. He was a member of the board of the Dutch Association of Manual therapy in The Netherlands (NVMT) from 1990 to 1998. From 2000 to 2016 he was a member of the Standard Committee of the International Federation of Manipulative Physical Therapy (IFOMPT). Jan became a member of the Spinal Manipulation Taskforce in 2020.

Funding Statement

The work was supported by the Canadian Academy of Manipulative Therapy Student Research Fund.

Disclosure statement

No authors have declared any conflict of interest.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/10669817.2023.2269038

References

  • [1].Chiropractic Board . Chiropractic board of Australia policy statement: interim policy on spinal manipulation for infants and young children; Melbourne, 2019; p. 1–2. Available from: https://www.chiropracticboard.gov.au/Codes-guidelines/Position-statements/Interim-policy-on-spinal-manipulation.aspx. [Google Scholar]
  • [2].Milne N, Longeri L, Patel A, et al. Spinal manipulation and mobilisation in the treatment of infants, children, and adolescents: a systematic scoping review. BioMed Central Pediatr. 2022. Dec;22(1):1–24. doi: 10.1186/s12887-022-03781-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].McKown S, Acquadro C, Anfray C, et al. Good practices for the translation, cultural adaptation, and linguistic validation of clinician-reported outcome, observer-reported outcome, and performance outcome measures. J Patient-Reported Outcomes. 2020. Dec;4(1):1–8. doi: 10.1186/s41687-020-00248-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [4].Mokkink LB, Prinsen C, Patrick DL et al. COSMIN methodology for systematic reviews of patient-reported outcome measures (PROMs). User Manual. 2018. Feb;78(1): 27–29. [Google Scholar]
  • [5].Terwee CB, Prinsen CA, Chiarotto A, de Vet HC, Bouter LM, Alonso J, Westerman MJ, Patrick DL, Mokkink LB. COSMIN methodology for assessing the content validity of PROMs – user manual. 2018. http://www.cosmin.nl/. [DOI] [PMC free article] [PubMed]
  • [6].Hayton T, Gross AR, Basson A, et al. Psychometric measurement properties of patient-reported and observer-reported outcome measures for spinal mobilisations and manipulation on paediatric subjects with diverse medical conditions: a systematic review. J Manual Manipulative Ther. 2023. in press. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev. 2021. Dec;10(1):1–1. doi: 10.1186/s13643-021-01626-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Mokkink LB, Prinsen CA, Patrick DL et al. COSMIN Risk of Bias checklist for systematic reviews of Patient-Reported Outcome Measures. 27. Qual Life Res; 2019. pp. 1171–1179. doi: 10.1007/s11136-017-1765-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [9].Terwee CB, Jansma EP, Riphagen II, et al. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res. 2009. Oct;18(8):1115–1123. doi: 10.1007/s11136-009-9528-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].Bartels B, De Groot JF, Terwee CB. The six-minute walk test in chronic pediatric conditions: a systematic review of measurement properties. Phys Ther. 2013 Apr 1;93(4):529–541. doi: 10.2522/ptj.20120210 [DOI] [PubMed] [Google Scholar]
  • [11].D’hondt NE, Pool JJ, Kiers H, et al. Validity of clinical measurement instruments assessing scapular function: insufficient evidence to recommend any instrument for assessing scapular posture, movement, and dysfunction—a systematic review. J Orthop Sports Phys Ther. 2020. Nov;50(11):632–641. doi: 10.2519/jospt.2020.9265 [DOI] [PubMed] [Google Scholar]
  • [12].Aimsamrarn P, Janyachareon T, Rattanathanthong K, et al. Cultural translation and adaptation of the Alberta Infant motor Scale Thai version. Early Hum Dev. 2019 Mar 1;130:65–70. doi: 10.1016/j.earlhumdev.2019.01.018 [DOI] [PubMed] [Google Scholar]
  • [13].Albuquerque PL, Guerra MQF, Lima MC, et al. Concurrent validity of the Alberta Infant motor Scale to detect delayed gross motor development in preterm infants: a comparative study with the Bayley III. Dev Neurorehabil. 2018;21(6):408–414. doi: 10.1080/17518423.2017.1323974 [DOI] [PubMed] [Google Scholar]
  • [14].Almeida KM, Dutra MV, Mello RR, et al. Concurrent validity and reliability of the Alberta Infant motor Scale in premature infants. J Pediatr (Rio J). 2008;84(5):442–448. doi: 10.2223/JPED.1836 [DOI] [PubMed] [Google Scholar]
  • [15].Bartlett DJ, Fanning JEK. Use of the Alberta Infant motor Scale to characterize the motor development of infants born preterm at eight months corrected age. Phys Occup Ther Pediatr. 2003;23(4):31–45. doi: 10.1080/J006v23n04_03 [DOI] [PubMed] [Google Scholar]
  • [16].Blanchard Y, Neilan E, Busanich J, et al. Interrater reliability of early intervention providers scoring the Alberta Infant motor Scale. Pediatr Phys Ther. 2004 Apr 1;16(1):13–18. doi: 10.1097/01.PEP.0000113272.34023.56 [DOI] [PubMed] [Google Scholar]
  • [17].Boonzaaijer M, van Dam E, van Haastert IC, et al. Concurrent validity between live and home video observations using the Alberta Infant motor Scale. Pediatr Phys Ther. 2017. Apr;29(2):146. doi: 10.1097/PEP.0000000000000363 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [18].Campbell SK, Kolobe TH. Concurrent validity of the test of infant motor performance with the Alberta Infant Motor Scale. Pediatr Phys Ther. 2000 Apr 1;12(1):2–9. doi: 10.1097/00001577-200012010-00002 [DOI] [Google Scholar]
  • [19].Campbell SK, Kolobe TH, Wright BD, et al. Validity of the test of Infant motor performance for prediction of 6-, 9- and 12-month scores on the Alberta Infant motor Scale. Dev Med Child Neurol. 2002. Apr;44(4):263–272. doi: 10.1017/S0012162201002043 [DOI] [PubMed] [Google Scholar]
  • [20].Campos D, Santos DC, Gonçalves VM, et al. Agreement between scales for screening and diagnosis of motor development at 6 months. J Pediatr (Rio J). 2006;82(6):470–474. doi: 10.2223/JPED.1567 [DOI] [PubMed] [Google Scholar]
  • [21].Chiquetti EM, Valentini NC, Saccani R. Validation and reliability of the test of infant motor performance for Brazilian infants. Phys Occup Ther Pediatr. 2020 Jul 3;40(4):470–485. doi: 10.1080/01942638.2020.1711843 [DOI] [PubMed] [Google Scholar]
  • [22].Darrah JM. Feasibility of early screening for neuromotor problems in at-risk infants: predictive validity of the Alberta Infant motor Scale. 1996. Available from: https://search.ebscohost.com/login.aspx?direct=true&db=rzh&AN=109842973&scope=site.
  • [23].Darrah J, Piper M, Watt MJ. Assessment of gross motor skills of at‐risk infants: predictive validity of the Alberta Infant motor Scale. Dev Med Child Neurol. 1998. Jul;40(7):485–491. doi: 10.1111/j.1469-8749.1998.tb15399.x [DOI] [PubMed] [Google Scholar]
  • [24].Darrah J, Redfern L, Maguire TO, et al. Intra-individual stability of rate of gross motor development in full-term infants. Early Hum Dev. 1998b Sep 1;52(2):169–179. doi: 10.1016/S0378-3782(98)00028-0 [DOI] [PubMed] [Google Scholar]
  • [25].Darrah J, Bartlett D, Maguire TO, et al. Have infant gross motor abilities changed in 20 years? A re‐evaluation of the Alberta Infant motor Scale normative values. Dev Med Child Neurol. 2014. Sep;56(9):877–881. doi: 10.1111/dmcn.12452 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [26].Dumas HM, Fragala-Pinkham MA, Rosen EL, et al. Pediatric evaluation of Disability Inventory Computer Adaptive Test (PEDI-CAT) and Alberta Infant Motor Scale (AIMS): validity and responsiveness. Phys Ther. 2015 Nov 1;95(11):1559–1568. doi: 10.2522/ptj.20140339 [DOI] [PubMed] [Google Scholar]
  • [27].Fauls JR, Thompson BL, Johnston LM. Validity of the ages and stages questionnaire to identify young children with gross motor difficulties who require physiotherapy assessment. Dev Med Child Neurol. 2020. Jul;62(7):837–844. doi: 10.1111/dmcn.14480 [DOI] [PubMed] [Google Scholar]
  • [28].Fetters L, Tronick EZ. Discriminate power of the Alberta Infant motor Scale and the Movement assessment of infants for prediction of Peabody Gross motor Scale scores of infants exposed in utero to cocaine. Pediatr Phys Ther. 2000 Apr 1;12(1):16–23. doi: 10.1097/00001577-200012010-00004 [DOI] [Google Scholar]
  • [29].Fleuren KM, Smit LS, Stijnen TH, et al. New reference values for the Alberta Infant motor Scale need to be established. Acta Paediatrica. 2007. Mar;96(3):424–427. doi: 10.1111/j.1651-2227.2007.00111.x [DOI] [PubMed] [Google Scholar]
  • [30].Ga HY, Kwon JY. A comparison of the Korean-ages and stages questionnaires and Denver developmental delay screening test. Ann Rehabil Med. 2011 Jun 30;35(3):369–374. doi: 10.5535/arm.2011.35.3.369 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [31].Harris SR, Backman CL, Mayson TA. Comparative predictive validity of the Harris Infant neuromotor test and the Alberta Infant motor Scale. Dev Med Child Neurol. 2010;52(5):462–467. doi: 10.1111/j.1469-8749.2009.03518.x [DOI] [PubMed] [Google Scholar]
  • [32].Heineman KR, Bos AF, Hadders‐Algra M. The Infant motor profile: a standardized and qualitative method to assess motor behaviour in infancy. Dev Med Child Neurol. 2008. Apr;50(4):275–282. doi: 10.1111/j.1469-8749.2008.02035.x [DOI] [PubMed] [Google Scholar]
  • [33].Heineman KR, Middelburg KJ, Bos AF, et al. Reliability and concurrent validity of the Infant motor profile. Dev Med Child Neurol. 2013. Jun;55(6):539–545. doi: 10.1111/dmcn.12100 [DOI] [PubMed] [Google Scholar]
  • [34].Hoskens J, Klingels K, Smits-Engelsman B. Validity and cross-cultural differences of the Bayley scales of Infant and toddler development, third Edition in typically developing infants. Early Hum Dev. 2018 Oct 1;125:17–25.45. doi: 10.1016/j.earlhumdev.2018.07.002 [DOI] [PubMed] [Google Scholar]
  • [35].Jeng SF, Yau KI, Chen LC, et al. Alberta Infant Motor Scale: reliability and validity when used on preterm infants in Taiwan. Phys Ther. 2000 Feb 1;80(2):168–178. doi: 10.1093/ptj/80.2.168 [DOI] [PubMed] [Google Scholar]
  • [36].Krosschell KJ, Bosch M, Nelson L, et al. Motor function test reliability during the NeuroNEXT spinal muscular atrophy infant biomarker study. JND. 2018 Jan 1;5(4):509–521. doi: 10.3233/JND-180327 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [37].Lackovic M, Nikolic D, Filimonovic D, et al. Reliability, consistency and temporal stability of Alberta Infant motor Scale in Serbian infants. Children. 2020 Mar 2;7(3):16. doi: 10.3390/children7030016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [38].Lefebvre F, Gagnon MM, Luu TM, et al. In extremely preterm infants, do the Movement assessment of infants and the Alberta Infant motor Scale predict 18-month outcomes using the Bayley-III? Early Hum Dev. 2016 Mar 1;94:13–17. doi: 10.1016/j.earlhumdev.2016.01.012 [DOI] [PubMed] [Google Scholar]
  • [39].Morales-Monforte E, Bagur-Calafat C, Suc-Lerin N, et al. The Spanish version of the Alberta Infant Motor Scale: validity and reliability analysis. Dev Neurorehabil. 2017 Feb 17;20(2):76–82. doi: 10.3109/17518423.2015.1066461 [DOI] [PubMed] [Google Scholar]
  • [40].Pai-Jun ML, Campbell SK. Examination of the item structure of the Alberta Infant motor Scale. Pediatr Phys Ther. 2004 Apr 1;16(1):31–38. doi: 10.1097/01.PEP.0000114843.92102.98 [DOI] [PubMed] [Google Scholar]
  • [41].Pin TW, De Valle K, Eldridge B, et al. Clinimetric properties of the Alberta Infant Motor Scale in infants born preterm. Pediatr Phys Ther. 2010 Oct 1;22(3):278–286. doi: 10.1097/PEP.0b013e3181e94481 [DOI] [PubMed] [Google Scholar]
  • [42].Pin TW, Butler PB, Cheung HM, et al. Longitudinal development of segmental trunk control in full term and preterm infants-a pilot study: part II. Dev Neurorehabil. 2020 Apr 2;23(3):193–200. doi: 10.1080/17518423.2019.1629661 [DOI] [PubMed] [Google Scholar]
  • [43].Piper MC, Pinnell LE, Darrah J, et al. Construction and validation of the Alberta Infant motor Scale (AIMS). Can J Public Health = Revue Canadienne de Sante Publique. 1992 Jul 1;83:S46–50. [PubMed] [Google Scholar]
  • [44].Rizzi R, Menici V, Cioni ML, et al. Concurrent and predictive validity of the infant motor profile in infants at risk of neurodevelopmental disorders. BMC Pediatr. 2021. Dec;21(1):1–1. doi: 10.1186/s12887-021-02522-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [45].Saccani R, Valentini NC. Reference curves for the Brazilian Alberta Infant Motor Scale: percentiles for clinical description and follow-up over time. J Pediatr (Rio J). 2012;88:40–47. doi: 10.2223/JPED.2142 [DOI] [PubMed] [Google Scholar]
  • [46].Quezada-Villalobos L, Soto-García I, Escobar-Cabello M, et al. ‘Confiabilidad interevaluador’ de la Escala Motora Infantil de Alberta en niños de término y pretérmino de la provincia de Talca - Chile. Revista Ciencias de la Salud. 2010;8(2):21–32. [Google Scholar]
  • [47].Siegle CB, de Sá CD. Concurrent validity between instruments of assessment of motor development in infants exposed to HIV. Infant Behav Dev. 2018 Feb 1;50:198–206. doi: 10.1016/j.infbeh.2018.01.005 [DOI] [PubMed] [Google Scholar]
  • [48].Silva LP, Maia PC, Lopes MM, et al. Intraclass reliability of the Alberta Infant Motor Scale in the Brazilian version. Revista da Escola de Enfermagem da USP. 2013;47(5):1046–1051. doi: 10.1590/S0080-623420130000500006 [DOI] [PubMed] [Google Scholar]
  • [49].Spittle AJ, Doyle LW, Boyd RN. A systematic review of the clinimetric properties of neuromotor assessments for preterm infants during the first year of life. Dev Med Child Neurol. 2008. Apr;50(4):254–266. doi: 10.1111/j.1469-8749.2008.02025.x [DOI] [PubMed] [Google Scholar]
  • [50].Suir I, Boonzaaijer M, Nijmolen P, et al. Cross-cultural validity: Canadian norm values of the Alberta Infant Motor Scale evaluated for Dutch infants. Pediatr Phys Ther. 2019 Oct 1;31(4):354–358. doi: 10.1097/PEP.0000000000000637 [DOI] [PubMed] [Google Scholar]
  • [51].Syrengelas D, Kalampoki V, Kleisiouni P, et al. Alberta Infant Motor Scale (AIMS) performance of Greek preterm infants: comparisons with full-term infants of the same nationality and impact of prematurity-related morbidity factors. Phys Ther. 2016 Jul 1;96(7):1102–1108. doi: 10.2522/ptj.20140494 [DOI] [PubMed] [Google Scholar]
  • [52].Syrengelas D, Siahanidou T, Kourlaba G, et al. Standardization of the Alberta Infant Motor Scale in full-term Greek infants: preliminary results. Early Hum Dev. 2010 Apr 1;86(4):245–249. doi: 10.1016/j.earlhumdev.2010.03.009 [DOI] [PubMed] [Google Scholar]
  • [53].Tse L, Mayson TA, Leo S, et al. Concurrent validity of the Harris Infant Neuromotor Test and the Alberta Infant Motor Scale. J Pediatr Nurs. 2008 Feb 1;23(1):28–36. doi: 10.1016/j.pedn.2007.07.009 [DOI] [PubMed] [Google Scholar]
  • [54].Tupsila R, Bennett S, Mato L, et al. Gross motor development of Thai healthy full-term infants aged from birth to 14 months using the Alberta Infant Motor Scale: inter individual variability. Early Hum Dev. 2020 Dec 1;151:105169. doi: 10.1016/j.earlhumdev.2020.105169 [DOI] [PubMed] [Google Scholar]
  • [55].Valentini NC, Saccani R. Escala Motora Infantil de Alberta: validação para uma população gaúcha. Rev Paulista Pediatria. 2011;29(2):231–238. doi: 10.1590/S0103-05822011000200015 [DOI] [Google Scholar]
  • [56].Valentini NC, Saccani R. Brazilian validation of the Alberta Infant Motor Scale. Phys Ther. 2012 Mar 1;92(3):440–447. doi: 10.2522/ptj.20110036 [DOI] [PubMed] [Google Scholar]
  • [57].van Hus JW, Jeukens-Visser M, Koldewijn K, et al. Comparing two motor assessment tools to evaluate neurobehavioral intervention effects in infants with very low birth weight at 1 year. Phys Ther. 2013 Nov 1;93(11):1475–1483. doi: 10.2522/ptj.20120460 [DOI] [PubMed] [Google Scholar]
  • [58].van Schie PE, Becher JG, Dallmeijer AJ, et al. Motor testing at 1 year improves the prediction of motor and mental outcome at 2 years after perinatal hypoxic-ischaemic encephalopathy. Dev Med Child Neurol. 2010. Jan;52(1):54–59. doi: 10.1111/j.1469-8749.2009.03302.x [DOI] [PubMed] [Google Scholar]
  • [59].Wang H, Li H, Wang J, et al. Reliability and concurrent validity of a Chinese version of the Alberta Infant Motor Scale administered to high-risk infants in China. Bio Med Res Int. 2018. Jun 13;2018:1–10. doi: 10.1155/2018/2197163. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [60].Yeh KK, Liu WY, Wong AM, et al. Validity of general movement assessment based on clinical and home videos. Pediatr Phys Ther. 2020 Jan 1;32(1):35–43. doi: 10.1097/PEP.0000000000000664 [DOI] [PubMed] [Google Scholar]
  • [61].Adams D, Hewell S. Maternal and professional assessment of breastfeeding. J Hum Lact. 1997. Dec;13(4):279–283. doi: 10.1177/089033449701300412 [DOI] [PubMed] [Google Scholar]
  • [62].Altuntas N, Turkyilmaz C, Yildiz H, et al. Validity and reliability of the infant breastfeeding assessment tool, the mother baby assessment tool, and the LATCH scoring system. Breast Feeding Med. 2014 May 1;9(4):191–195. doi: 10.1089/bfm.2014.0018 [DOI] [PubMed] [Google Scholar]
  • [63].Chapman DJ, Doughty K, Mullin EM, et al. Reliability of lactation assessment tools applied to overweight and obese women. J Hum Lact. 2016. May;32(2):269–276. doi: 10.1177/0890334415597903 [DOI] [PubMed] [Google Scholar]
  • [64].DaConceição CM, Coca KP, Alves MD, et al. Validação para língua portuguesa do instrumento de avaliação do aleitamento materno LATCH. Acta Paul Enferm. 2017. Mar;30(2):210–216. doi: 10.1590/1982-0194201700032 [DOI] [Google Scholar]
  • [65].DaConceição CM, Nur M, De Amorim, et al. Cultural adaptation of breastfeeding assessment tool to the Portuguese language: ‘latch’. Pediatrics. 2018;141(1_MeetingAbstract):286. doi: 10.1542/peds.141.1MA3.286 [DOI] [Google Scholar]
  • [66].Dolgun G, İnal S, Erdim L, et al. Reliability and validity of the Bristol breastfeeding assessment tool in the Turkish population. Midwifery. 2018 Feb 1;57:47–53. doi: 10.1016/j.midw.2017.10.007 [DOI] [PubMed] [Google Scholar]
  • [67].Kumar SP, Mooney R, Wieser LJ, et al. The LATCH scoring system and prediction of breastfeeding duration. J Hum Lact. 2006. Nov;22(4):391–397. doi: 10.1177/0890334406293161 [DOI] [PubMed] [Google Scholar]
  • [68].Lau Y, Htun TP, Lim PI, et al. Psychometric evaluation of 5-and 4-item versions of the LATCH breastfeeding assessment tool during the initial postpartum period among a multiethnic population. PLoS One. 2016 May 2;11(5):e0154331. doi: 10.1371/journal.pone.0154331 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [69].Riordan JM, Koehn M. Reliability and validity testing of three breastfeeding assessment tools. J Obstet Gynecol Neonatal Nurs. 1997 Mar 1;26(2):181–187. doi: 10.1111/j.1552-6909.1997.tb02131.x [DOI] [PubMed] [Google Scholar]
  • [70].Adam CJ, Izatt MT, Harvey JR, et al. Variability in Cobb angle measurements using reformatted computerized tomography scans. Spine. 2005 Jul 15;30(14):1664–1669. doi: 10.1097/01.brs.0000169449.68870.f8 [DOI] [PubMed] [Google Scholar]
  • [71].Al-Bashir AK, Al-Abed MA, Amari HK, et al. Computer-based Cobb angle measurement using deflection points in adolescence idiopathic scoliosis from radiographic images. Neural Comput Appl. 2019. May;31(5):1547–1561. doi: 10.1007/s00521-018-3614-y [DOI] [Google Scholar]
  • [72].Allen S, Parent E, Khorasani M, et al. Validity and reliability of active shape models for the estimation of cobb angle in patients with adolescent idiopathic scoliosis. J Digit Imaging. 2008. Jun;21(2):208–218. doi: 10.1007/s10278-007-9026-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [73].Brink RC, Wijdicks SPJ, Tromp IN, et al. A reliability and validity study for different coronal angles using ultrasound imaging in adolescent idiopathic scoliosis. Spine J. 2018 Jun 1;18(6):979–985. doi: 10.1016/j.spinee.2017.10.012 [DOI] [PubMed] [Google Scholar]
  • [74].De Carvalho A, Vialle R, Thomsen L, et al. Reliability analysis for manual measurement of coronal plane deformity in adolescent scoliosis. Are 30 × 90 cm plain films better than digitized small films? Eur Spine J. 2007. Oct;16(10):1615–1620. doi: 10.1007/s00586-007-0437-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [75].Gstoettner M, Sekyra K, Walochnik N, et al. Inter- and intraobserver reliability assessment of the Cobb angle: manual versus digital measurement tools. Eur Spine J. 2007. Oct;16(10):1587–1592. doi: 10.1007/s00586-007-0401-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [76].Kumar VP, Thomas T, Menon KV. Content-based image retrieval of spine radiographs with scoliosis. Clin Spine Surg. 2009 Jun 1;22(4):284–289. doi: 10.1097/BSD.0b013e31816d8148 [DOI] [PubMed] [Google Scholar]
  • [77].Livanelioglu A, Kaya F, Nabiyev V, et al. The validity and reliability of “spinal mouse” assessment of spinal curvatures in the frontal plane in pediatric adolescent idiopathic thoraco-lumbar curves. Eur Spine J. 2016. Feb;25(2):476–482. doi: 10.1007/s00586-015-3945-7 [DOI] [PubMed] [Google Scholar]
  • [78].Loder RT, Spiegel D, Gutknecht S, et al. The assessment of intraobserver and interobserver error in the measurement of noncongenital scoliosis in children ≤ 10 years of age. Spine. 2004 Nov 15;29(22):2548–2553. doi: 10.1097/01.brs.0000144828.72721.d8 [DOI] [PubMed] [Google Scholar]
  • [79].Marchetti BV, Candotti CT, Raupp EG, et al. Accuracy of a radiological evaluation method for thoracic and lumbar spinal curvatures using spinous processes. J Manipulative Physiol Ther. 2017 Nov 1;40(9):700–707. doi: 10.1016/j.jmpt.2017.07.013 [DOI] [PubMed] [Google Scholar]
  • [80].Mehta SS, Modi HN, Srinivasalu S, et al. Interobserver and intraobserver reliability of Cobb angle measurement: endplate versus pedicle as bony landmarks for measurement: a statistical analysis. J Pediatr Orthop. 2009 Oct 1;29(7):749–754. doi: 10.1097/BPO.0b013e3181b72550 [DOI] [PubMed] [Google Scholar]
  • [81].Pruijs JE, Stengs C, Keessen W. Parameter variation in stable scoliosis. Eur Spine J. 1995. Jun;4(3):176–179. doi: 10.1007/BF00298242 [DOI] [PubMed] [Google Scholar]
  • [82].Safari A, Parsaei H, Zamani A, et al. A semi-automatic algorithm for estimating cobb angle. J Biomed Phys Eng. 2019 Jun 1;9(3):317–326. doi: 10.31661/jbpe.v9i3Jun.730 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [83].Stokes IA, Aronsson DD. Computer-assisted algorithms improve reliability of King classification and Cobb angle measurement of scoliosis. Spine. 2006 Mar 15;31(6):665–670. doi: 10.1097/01.brs.0000203708.49972.ab [DOI] [PubMed] [Google Scholar]
  • [84].Xiaohua MH, KyeongAh J, HanSuk H, et al. A comparison of the validity and reliability between a digital radiographic imaging system and manual method in measuring the Cobb angle. Scoliosis. 2013. Sep;8(S2):1–2. doi: 10.1186/1748-7161-8-S2-O2023311985 [DOI] [Google Scholar]
  • [85].Watson AW, Mac Donncha C. A reliable technique for the assessment of posture: assessment criteria for aspects of posture. J Sports Med Phys Fitness. 2000 Sep 1;40(3):260. [PubMed] [Google Scholar]
  • [86].Snyder P, Eason JM, Philibert D, et al. Concurrent validity and reliability of the Alberta Infant Motor Scale in infants at dual risk for motor delays. Phys Occup Ther Pediatr. 2008 Jan 1;28(3):267–282. doi: 10.1080/01942630802224892 [DOI] [PubMed] [Google Scholar]
  • [87].Spittle AJ, Lee KJ, Spencer-Smith M, et al. Accuracy of two motor assessments during the first year of life in preterm infants for predicting motor outcome at preschool age. PLoS One. 2015 May 13;10(5):e0125854. doi: 10.1371/journal.pone.0125854 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [88].Prinsen CA, Mokkink LB, Bouter LM, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018. May;27(5):1147–1157. doi: 10.1007/s11136-018-1798-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [89].Spittle AJ, Lee KJ, Spencer-Smith M, Lorefice LE, Anderson PJ, Doyle LW. Accuracy of two motor assessments during the first year of life in preterm infants for predicting motor outcome at preschool age. PLoS One. 2015 May 13;10(5):e0125854. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [90].Snyder P, Eason JM, Philibert D, Ridgway A, McCaughey T. Concurrent validity and reliability of the Alberta Infant Motor Scale in infants at dual risk for motor delays. Physical & occupational therapy in pediatrics. 2008 Jan 1;28(3):267–82. [DOI] [PubMed] [Google Scholar]
  • [91].Báez León C, Blasco Contreras R, Martín Sequeros E, del Pozo Ayuso ML, Sánchez Conde AI, Vargas Hormigos C. Validation of the LATCH assessment tool into Spanish. Reliability analysis. Index de Enfermería. 2008;17(3):205–9. [Google Scholar]
  • [92].Watson AW, Mac Donncha C. A reliable technique for the assessment of posture: assessment criteria for aspects of posture. Journal of sports medicine and physical fitness. 2000 Sep 1;40(3):260. [PubMed] [Google Scholar]
  • [93].Santesso N, Glenton C, Dahm P, et al. GRADE guidelines 26: informative statements to communicate the findings of systematic reviews of interventions. J Clinical Epidemiol. 2020 Mar 1;119:126–135. doi: 10.1016/j.jclinepi.2019.10.014 [DOI] [PubMed] [Google Scholar]
  • [94].Báez León C, Blasco Contreras R, Martín Sequeros E, et al. Validación al castellano de una escala de evaluación de la lactancia materna: el LATCH. Análisis de fiabilidad. Index Enferm. 2008;17(3):205–209. doi: 10.4321/S1132-12962008000300012 [DOI] [Google Scholar]
  • [95].Jensen D, Wallace S, Kelsay P. LATCH: a breastfeeding charting system and documentation tool. J Obstet Gynecol Neonatal Nurs. 1994. Jan;23(1):27–32. doi: 10.1111/j.1552-6909.1994.tb01847.x [DOI] [PubMed] [Google Scholar]
  • [96].Corso M, Cancelliere C, Mior S, et al. The safety of spinal manipulative therapy in children under 10 years: a rapid review. Chiropr Man Therap. 2020. Dec;28(1):1–8. doi: 10.1186/s12998-020-0299-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [97].Ellwood J, Draper-Rodi J, Carnes D. The effectiveness and safety of conservative interventions for positional plagiocephaly and congenital muscular torticollis: a synthesis of systematic reviews and guidance. Chiropr Man Therap. 2020. Dec;28(1):1–1. doi: 10.1186/s12998-020-00321-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [98].Alcantara J, Alcantara JD, Alcantara J. The chiropractic care of infants with breastfeeding difficulties. Explore. 2015 Nov 1;11(6):468–474. doi: 10.1016/j.explore.2015.08.005 [DOI] [PubMed] [Google Scholar]
  • [99].Hawk C, Minkalis A, Webb C, et al. Manual interventions for musculoskeletal factors in infants with suboptimal breastfeeding: a scoping review. J Evid Based Complementary Altern Med. 2018 Dec 11;23:2515690X18816971. doi: 10.1177/2515690X18816971 [DOI] [Google Scholar]
  • [100].Miller JE, Miller L, Sulesund AK, et al. Contribution of chiropractic therapy to resolving suboptimal breastfeeding: a case series of 114 infants. J Manipulative Physiol Ther. 2009 Oct 1;32(8):670–674. doi: 10.1016/j.jmpt.2009.08.023 [DOI] [PubMed] [Google Scholar]
  • [101].Fry LM. Chiropractic and breastfeeding dysfunction: a literature review. J Clin Chiropractic Pediatr. 2014. Mar;14(2):1151–1155. [Google Scholar]
  • [102].Herzhaft-Le Roy J, Xhignesse M, Gaboury I. Efficacy of an osteopathic treatment coupled with lactation consultations for infants’ biomechanical sucking difficulties: a randomized controlled trial. J Hum Lact. 2017. Feb;33(1):165–172. doi: 10.1177/0890334416679620 [DOI] [PubMed] [Google Scholar]
  • [103].Albuquerque PL, Lemos A, Guerra MQ, et al. Accuracy of the Alberta Infant Motor Scale (AIMS) to detect developmental delay of gross motor skills in preterm infants: a systematic review. Dev Neurorehabil. 2015 Jan 2;18(1):15–21. doi: 10.3109/17518423.2014.955213 [DOI] [PubMed] [Google Scholar]
  • [104].Mokkink LB, Terwee CB, Knol DL, et al. The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: a clarification of its content. BMC Med Res Methodol. 2010. Dec;10(1):1–8. doi: 10.1186/1471-2288-10-22 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [105].Kjølbye CB, Drivsholm TB, Ertmann RK, et al. Motor function tests for 0-2-year-old children - a systematic review. Dan Med J. 2018 Jun 1;65(6):A5484. [PubMed] [Google Scholar]

Articles from The Journal of Manual & Manipulative Therapy are provided here courtesy of Taylor & Francis
