PLOS One. 2024 Apr 18;19(4):e0301934. doi: 10.1371/journal.pone.0301934

Prechtl’s method to assess general movements: Inter-rater reliability during the preterm period

Angélica Valencia 1,2,*, Carlos Viñals 3,#, Elsa Alvarado 3,#, Marcela Balderas 3,#, Joëlle Provasi 2
Editor: Claudia Brogna 4
PMCID: PMC11025967  PMID: 38635854

Abstract

Introduction

Prechtl’s method (GMA) is a test for the functional assessment of the young nervous system. It involves a global and a detailed assessment of the general movements (GMs) and has demonstrated validity. Data on the reliability of both assessments in the preterm period are scarce. This study aimed to evaluate the inter-rater reliability for the global and detailed assessments of the preterm writhing GMA.

Materials and methods

The study participants were 69 infants born at <37 gestational weeks and admitted to the neonatal intensive care unit. They were randomly assigned to five pairs of raters. Raters assessed infants’ GMs using preterm videos. Outcome variables were (a) the GMs classification (normal versus abnormal; normal versus abnormal subcategories) and (b) the general movements optimality score (GMOS), obtained through the global and detailed assessments. The Gwet’s AC1 and the intraclass correlation coefficient (ICC) were calculated for the GMs classification and the GMOS, respectively.

Results

The global assessment presented an AC1 = 0.84 [95% CI = 0.54,1] for the GMs binary classification and an AC1 = 0.67 [95% CI = 0.38,0.89] for the GMs classification with abnormal subcategories. The detailed assessment presented an ICC = 0.72 [95% CI = 0.39,0.90] for the GMOS.

Conclusions

Inter-rater reliability for the global assessment was high for the binary classification and substantial for the classification with abnormal subcategories; it was good for the detailed assessment. However, the small sample size limited the precision of these estimates. Future research should involve larger samples of preterm infants to improve estimate precision. Challenging items such as assessing the neck and trunk, poor repertoire GMs, and tremulous movements may impact the preterm writhing GMA’s inter-rater reliability. Therefore, ongoing training and calibration among raters are necessary. Further investigation in clinical settings can enhance our understanding of the preterm writhing GMA’s reliability.

Introduction

Preterm infants are at risk of neurodevelopmental disorders [1]. Thus, international guidelines recommend performing appropriate neurodevelopmental assessments to identify infants who could benefit from early intervention [2]. Prechtl’s method (GMA) is a video-based test for the functional assessment of the young nervous system. The GMA has high clinical utility for premature infants because it consists of an observational assessment of the quality of general movements (GMs). GMs originate endogenously in the entire body in a sequence of spontaneous movements [3]. Variability, complexity, and fluidity are the characteristics that determine the quality of GMs and reflect the integrity of the young nervous system. These characteristics of GMs can be negatively affected by brain abnormalities [4, 5] thus becoming a neurodevelopmental marker [6]. The GMA facilitates the evaluation of GMs in three different developmental periods. The preterm writhing GMA (before term age) and writhing GMA (from term age) identify various neurodevelopmental disorders, including functional, motor, and cognitive issues [7, 8]. The fidgety GMA (from 9 weeks post-term) mainly detects cerebral palsy and cognitive disorders [9, 10].

The three GMA periods include both (a) a global and (b) a detailed assessment of the GMs [11]. During the preterm writhing GMA and writhing GMA, the global assessment involves analyzing the quality of GMs and classifying them as either normal or abnormal. The detailed assessment consists of scoring the GMs’ characteristics in the limbs to obtain the GMs optimality score (GMOS). During the fidgety GMA, the global assessment classifies GMs as present, abnormal, or absent, while the detailed assessment scores the motricity and posture to obtain the motor optimality score (MOS).

Clinicians, therapists, and neuroscientists worldwide have received training from the GMs Trust group in using both the global and detailed assessments of the GMA. As a result, studies in clinical and research contexts have evaluated the GMA’s validity and reliability in preterm infants. Validity and reliability are psychometric properties that express the appropriateness of a test. Validity indicates the scoring accuracy, and reliability shows its consistency. Inter-rater reliability evaluates scoring consistency when different raters assess the same patients with the same test, and agreement is the degree to which scoring is identical [12]. Studies found that the preterm writhing GMA presents high validity (94%) [13] in predicting neurodevelopment in preterm infants and high inter-rater agreement (90%) [14]. Specifically, the inter-rater reliability of the global assessment was high (k > 0.80) for preterm writhing GMA and writhing GMA [15], from moderate (k = 0.50) to substantial (k = 0.80) for preterm writhing GMA and writhing GMA in combined analyses [16, 17], and from substantial (k = 0.64) to high (k = 0.92) for fidgety GMA [15, 18, 19]. A recent study evaluated the reliability of a checklist for guiding raters during the global assessment. That study found that the inter-rater reliability values (k = 0.68–0.80) for the preterm writhing GMA and writhing GMA were similar to those in previous studies [16]. Although published data on the reliability of the detailed assessment are limited, studies have reported excellent inter-rater reliability (ICC > 0.87) for both the fidgety MOS and the fidgety MOS-revised [19, 20]. However, to our knowledge, data on inter-rater reliability for the detailed assessment in the preterm writhing period are scarce.

Preterm writhing GMA inter-rater reliability needs to be identified because most studies either focus on fidgety GMA, combine preterm writhing GMA and writhing GMA analyses, or do not evaluate the detailed assessment. Furthermore, there is a scarcity of data regarding the inter-rater reliability for the global and detailed assessments of the preterm writhing GMA within the same sample of preterm infants. This gap highlights the need for studies to complement existing evidence on the inter-rater reliability of the preterm writhing GMA.

Consequently, this study aimed to evaluate the preterm writhing GMA inter-rater reliability for (a) the global and (b) the detailed assessment. Specifically, this study addressed the global assessment because of its utility in preterm infants. It also considered the detailed assessment because studies suggest using it to complement the global assessment [21]. Given its well-documented high validity in predicting later development in high-risk preterm infants [13], understanding the reliability of the preterm writhing GMA is crucial for clinical practice. Establishing this reliability is essential to inform clinicians about the preterm writhing GMA’s ability to consistently identify at-risk babies, thus facilitating timely interventions during this critical period of neuroplasticity [22]. We hypothesized that the inter-rater reliability of the preterm writhing GMA would be high (AC1 > 0.80) for the global assessment and excellent (ICC > 0.75) for the detailed assessment.

Materials and methods

Design and setting

This psychometric study on the inter-rater reliability of the preterm writhing GMA is part of a broader prospective longitudinal study on the neurodevelopmental assessment of preterm infants. The infants were recruited between January and November 2017 from the Necker Enfants-Malades and Armand Trousseau University hospitals in France. This research followed the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) [12].

Ethical approval

This study obtained the approval of the Ethical Committee for the Protection of Persons of Île-de-France V (CPP Ref: d-10-16) and followed the principles set out in the Declaration of Helsinki. The infants’ parents were sent an informative letter outlining the study’s objectives and methods. In reply, they gave written informed consent to their children’s inclusion in the study and the video evaluation of the infant’s GMs.

Participants and raters

The sample size selection complied with Bonett’s parameters, which recommend 28 participants for reliability studies with five raters and an intraclass correlation coefficient value of 0.80 [23]. Participants were recruited by convenience sampling based on the following inclusion criteria: infants born at <37 gestational weeks and admitted to the neonatal intensive care unit (NICU). Infants with congenital anomalies or a severe illness at the time of GMs assessment were excluded. The study used a computerized randomization method to select pairs of raters from a list of GMA-certified raters (n = 5). As shown in Table 1, each participant (n = 69) was randomly assigned to be evaluated by one of five pairs of raters.

Table 1. Assignment of subjects to be evaluated by each pair of raters.

| Pair of raters | 1 | 2 | 3 | 4 | 5 | Total |
|---|---|---|---|---|---|---|
| Raters’ names | B, C | D, A | E, D | C, E | B, D | |
| Number of subjects | 13 | 22 | 10 | 4 | 20 | 69 |
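The allocation described above can be sketched in a few lines of Python. This is only an illustration: the source does not describe the randomization software or procedure, so the function name `assign_participants`, the seed, and the use of a uniform random choice per participant are all assumptions made for the example.

```python
import random
from itertools import combinations

def assign_participants(n_participants, raters, n_pairs, seed=None):
    """Randomly pick rater pairs, then randomly assign each participant to one pair.

    Hypothetical helper: the study's actual randomization procedure is not
    described in the source text.
    """
    rng = random.Random(seed)
    # All possible unordered pairs of certified raters (10 pairs for 5 raters).
    all_pairs = list(combinations(raters, 2))
    # Randomly select the pairs that will take part in the study.
    chosen_pairs = rng.sample(all_pairs, n_pairs)
    # Assign every participant (numbered from 1) to one of the chosen pairs.
    return {pid: rng.choice(chosen_pairs) for pid in range(1, n_participants + 1)}

assignment = assign_participants(69, ["A", "B", "C", "D", "E"], n_pairs=5, seed=2017)

# Count how many participants each pair received.
counts = {}
for pair in assignment.values():
    counts[pair] = counts.get(pair, 0) + 1
print(counts)
```

A purely random choice per participant produces unequal group sizes, which is consistent with the uneven counts in Table 1, although the exact mechanism used in the study may differ.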

Data collection

The data consisted of videos of the participants’ GMs filmed at <37 corrected weeks before their discharge from hospital. Following the GMA parameters, a remotely controlled video camera was positioned above the infants to capture the entire body. The infants were filmed in the supine position wearing little clothing. The filming began when the infants exhibited spontaneous movements following routine nursing care. Infants of <36 corrected weeks were filmed awake or asleep. Infants of ≥36 corrected weeks were filmed in stage 4 (eyes open, no crying, movement present) [24]. During the filming, interaction with caregivers and movement-limiting objects was avoided. Rater A recorded and edited the videos. She also collected the participants’ clinical information from their electronic medical records after completing the GMs assessment.

General movements assessment

The raters received the videos via a secure online platform and had fifteen days to complete the GMs assessment. Based on these videos, the raters conducted a global and a detailed GMs assessment according to the preterm writhing GMA criteria [11]. In the global assessment, the raters classified GMs as normal if they exhibited varying speed and spatial occupation of moving extremities, along with complex and fluid articulatory rotations. The raters classified the GMs as abnormal if they showed decreased variability, complexity, and fluidity. Abnormal GMs included these four subcategories: (a) poor repertoire if the limbs presented monotonous speed and spatial occupation, (b) cramped synchronized if the limbs presented high rigidity, (c) chaotic if the movements were disorganized, and (d) hypokinetic if no GMs were observed. Next, the raters performed the detailed assessment using the scoring sheet to evaluate (from 0 to 2) the following items separately: sequence of the GMs, neck and trunk involvement, upper extremities, and lower extremities. The total of the scores for the items gave the GMOS (from 5 to 42). GMOS was not calculated when GMs were hypokinetic.

The raters were in different countries and did not communicate with each other. They were blinded to the participants’ assignment, clinical data, and individual identity information (except filming age in corrected weeks) and conducted the evaluations independently. After 45 minutes of viewing the videos, the raters took a 5-minute break to calibrate their perception. Before the assessments, they participated in two online training sessions, which included studying a preterm writhing GMA pedagogical video and reaching an agreement on two cases.

Outcome variables

The outcome variables were (a) the GMs classification (normal versus abnormal; normal versus abnormal subcategories) obtained through the global assessment and (b) the GMOS (from 5 to 42) obtained through the detailed assessment.

Statistical analysis

The data reporting used statistical descriptors according to the level of measurement of the variables, with mean and standard deviation (±SD) for continuous variables, median and interquartile range (IQR) for ordinal variables, and frequency and percentage for nominal variables. The analysis considered two types of GMs classification: first, the binary classification of GMs as normal versus abnormal; second, the classification of GMs as normal versus the abnormal subcategories (poor repertoire, cramped synchronized, chaotic, or hypokinetic). The Kolmogorov-Smirnov test confirmed the normal distribution of the GMOS.

The study calculated the percentage of agreement and Gwet’s AC1 coefficient for the GMs classification to evaluate the inter-rater reliability of the global assessment. The AC1 was used because it corresponds better with the percentage of agreement than the k coefficient and controls the problems associated with the prevalence [25]. Additionally, the AC1 has already been used to assess inter-rater reliability of the writhing GMA in postoperative infants [26]. The interpretation of the AC1 considered fair (≤ 0.40), moderate (from 0.41 to 0.60), substantial (from 0.61 to 0.80), and high (from 0.81 to 1.00) inter-rater reliability [27].
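To make the two agreement statistics concrete, here is a minimal Python sketch for two raters. It uses the standard definition of Gwet's AC1, in which chance agreement is computed from the average of the two raters' marginal proportions per category. The function names are hypothetical, and the example counts are the pair 1 (B-C) normal versus abnormal cells as they appear in Table 3; the study itself used AgreeStat 360, so this is an illustration, not the authors' code.

```python
def percent_agreement(table):
    """Share of subjects on the diagonal of a square contingency table.

    table[i][j] = number of subjects placed in category i by the first rater
    and in category j by the second rater.
    """
    total = sum(sum(row) for row in table)
    agreed = sum(table[k][k] for k in range(len(table)))
    return agreed / total

def gwet_ac1(table):
    """Gwet's AC1 for two raters and q >= 2 nominal categories."""
    q = len(table)
    total = sum(sum(row) for row in table)
    pa = percent_agreement(table)
    pe_sum = 0.0
    for k in range(q):
        row_k = sum(table[k])                        # first rater's count for category k
        col_k = sum(table[i][k] for i in range(q))   # second rater's count for category k
        pi_k = (row_k + col_k) / (2 * total)         # average marginal proportion
        pe_sum += pi_k * (1 - pi_k)
    pe = pe_sum / (q - 1)                            # chance-agreement probability
    return (pa - pe) / (1 - pe)

def interpret_ac1(value):
    """Interpretation bands quoted in the text [27]."""
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "high"

# Pair 1 (B-C), normal versus abnormal: rows = rater B, columns = rater C.
pair1_binary = [[4, 0],
                [2, 7]]
pa = percent_agreement(pair1_binary)
ac1 = gwet_ac1(pair1_binary)
print(f"agreement = {pa:.3f}, AC1 = {ac1:.3f} ({interpret_ac1(ac1)})")
```

For these counts the sketch gives an agreement of about 0.846 and an AC1 of about 0.708, close to the 84% and 0.70 listed for pair 1 in Table 4; the small difference is consistent with how the published values were rounded.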

The evaluation of the inter-rater reliability of the detailed assessment required the calculation of the intraclass correlation coefficient (ICC) for the GMOS. A one-way random-effects model with absolute agreement and single measurement was used. That model is suitable for designs in which subjects are evaluated by different pairs of randomly selected raters [28]. The interpretation of the ICC considered poor (≤ 0.40), fair (from 0.41 to 0.59), good (from 0.60 to 0.74), and excellent (from 0.75 to 1) inter-rater reliability [29]. The standard error of measurement (SEM) was calculated with the formula SEM = SD/√2 using the SD of the differences between each pair of raters [30].
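As an illustration of the one-way random-effects, single-measurement ICC, the following Python sketch computes it from the between-subject and within-subject mean squares. The function names and the example scores are hypothetical; the study's values were obtained with SPSS, so this is a sketch of the textbook formula rather than a reproduction of the authors' analysis.

```python
def icc_oneway_single(ratings):
    """One-way random-effects, single-measurement ICC.

    ratings: list with one entry per subject, each entry holding the k scores
    that subject received (k = 2 for a pair of raters).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(r) for r in ratings) / (n * k)
    subject_means = [sum(r) / k for r in ratings]
    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((x - m) ** 2 for r, m in zip(ratings, subject_means) for x in r)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def interpret_icc(value):
    """Interpretation bands quoted in the text [29]."""
    if value <= 0.40:
        return "poor"
    if value < 0.60:
        return "fair"
    if value < 0.75:
        return "good"
    return "excellent"

# Hypothetical GMOS scores given to five infants by two raters.
example_scores = [(20, 22), (30, 29), (15, 18), (35, 33), (25, 25)]
icc = icc_oneway_single(example_scores)
print(f"ICC = {icc:.3f} ({interpret_icc(icc)})")
```

Because the one-way model does not separate rater effects from residual error, any systematic difference between the two raters lowers the coefficient, which is why it is described as an absolute-agreement index.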

Reliability coefficients were calculated for each pair of raters and then averaged to obtain a single inter-rater reliability index. This study considered a two-tailed p-value of < 0.05 as significant and calculated 95% confidence intervals (CI). The data analysis was performed using SPSS Statistics version 25.0 and AgreeStat 360 [31].
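The averaging step can be illustrated with a short Python sketch. It assumes a simple unweighted mean across the four analysed pairs, which the text does not specify, and uses the per-pair GMOS ICC and binary percent-agreement values as listed in Tables 5 and 4; the variable names are hypothetical.

```python
from statistics import mean

# Per-pair values copied from Table 5 (GMOS ICC) and Table 4 (binary % agreement).
gmos_icc_by_pair = {1: 0.72, 2: 0.79, 3: 0.73, 5: 0.66}
binary_agreement_by_pair = {1: 84, 2: 90, 3: 100, 5: 80}

# Assumption: each pair contributes equally, regardless of how many infants it rated.
overall_icc = mean(list(gmos_icc_by_pair.values()))
overall_agreement = mean(list(binary_agreement_by_pair.values()))

print(f"overall GMOS ICC = {overall_icc:.3f}")
print(f"overall binary agreement = {overall_agreement:.1f}%")
```

An unweighted mean treats a pair that rated 10 infants the same as one that rated 22; weighting by the number of subjects per pair would be an alternative design choice.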

Results

Participants and raters

Eighty-two infants were recruited to ensure sufficient data in the contingency table. Thirteen infants were excluded due to withdrawal (n = 1), congenital disease (n = 1), full-term birth (n = 3), and unavailability for filming before reaching 37 corrected weeks (n = 8). In total, 69 infants were included with 35 (51%) males. Table 2 presents the participants’ clinical characteristics. The participants were filmed at 35±1 corrected weeks for an average of 3±1 minutes to obtain 5±1 sequences of GMs.

Table 2. Clinical characteristics of participants (n = 69).

| Characteristic | Value |
|---|---|
| Gestational age in weeks (a) | 31.6±2 |
| Birth weight in grams (a) | 1543.7±429.8 |
| Head circumference at birth in centimeters (a) | 28.6±2.3 |
| APGAR score at 5 minutes (b) | 10 (4,10) |
| Extremely preterm (c) | 3 (4.3) |
| Extremely low birth weight (c) | 6 (8.7) |
| Small for gestational age (c) | 10 (15.6) |
| Brain ultrasound abnormalities (c) | 29 (46) |
| Intraventricular hemorrhage (IVH) (c) | 12 (19) |
| IVH I (c) | 3 (4.8) |
| IVH II (c) | 6 (9.5) |
| IVH III-IV (c) | 3 (4.8) |
| Periventricular leukomalacia (PVL) (c) | 24 (38.1) |
| PVL I (c) | 18 (28.6) |
| PVL II (c) | 4 (6.3) |
| PVL III-IV (c) | 2 (3.2) |
| Bronchopulmonary dysplasia (c) | 11 (15.9) |

Note: Pediatric specialists, who were blinded to the study, identified morbidities during routine NICU assessments.

Small for gestational age was defined as birth weight below the 10th percentile.

Brain ultrasound abnormalities refers to IVH (brain bleeding) and PVL (white matter lesion), graded according to international guidelines [32, 33].

Bronchopulmonary dysplasia was defined as the requirement for oxygen supplementation for 28 or 56 days in very preterm or late preterm infants [34].

(a) Data reported as mean±SD

(b) Data reported as median (min–max, IQR)

(c) Data reported as frequency (percentage)

The raters (n = 5) were three physicians and two psychologists with an average of 18±2 years of experience in child neurodevelopment. Three raters are clinical rehabilitation specialists and the other two work in research. The raters are GMs Trust group certified with 6±5 years of experience in working on the global and detailed assessments of the preterm writhing GMA. The results considered pairs of raters 1, 2, 3, and 5. Pair 4 was excluded because too few participants (n = 4) were assigned to it (see Table 1). Thus, four pairs of raters completed the global and detailed assessments of the GMs in 65 preterm infants during the preterm writhing period.

Inter-rater reliability for the global assessment

Table 3 shows the distribution of the subjects across GMs categories. The GMs categories presented an agreement percentage of 84% (from 69% to 100%). The category with the highest disagreement among the raters was poor repertoire GMs (agreement from 69% to 75%). None of the GMs categories obtained 100% agreement among all pairs of raters.

Table 3. Distribution of subjects per pair of raters and category of the GMs evaluated during the preterm period (n = 65).

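Pair of raters 1 (B-C) (a); rows: Rater B, columns: Rater C

| GMs category | Rater B | Rater C: Yes | Rater C: No | % Agreement |
|---|---|---|---|---|
| Normal | Yes | 4 (30) | 0 (0) | 84 |
| | No | 2 (15) | 7 (53) | |
| Abnormal | Yes | 7 (53) | 2 (15) | 84 |
| | No | 0 (0) | 4 (30) | |
| Poor repertoire | Yes | 4 (30) | 4 (30) | 69 |
| | No | 0 (0) | 5 (38) | |
| Cramped synchronized | Yes | 1 (7) | 0 (0) | 84 |
| | No | 2 (15) | 10 (76) | |
| Chaotic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |
| Hypokinetic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |

Pair of raters 2 (D-A) (a); rows: Rater D, columns: Rater A

| GMs category | Rater D | Rater A: Yes | Rater A: No | % Agreement |
|---|---|---|---|---|
| Normal | Yes | 8 (36) | 0 (0) | 90 |
| | No | 2 (9) | 12 (54) | |
| Abnormal | Yes | 12 (54) | 2 (9) | 90 |
| | No | 0 (0) | 8 (36) | |
| Poor repertoire | Yes | 8 (36) | 6 (27) | 72 |
| | No | 0 (0) | 8 (36) | |
| Cramped synchronized | Yes | 0 (0) | 4 (18) | 81 |
| | No | 0 (0) | 18 (81) | |
| Chaotic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |
| Hypokinetic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |

Pair of raters 3 (E-D) (a); rows: Rater E, columns: Rater D

| GMs category | Rater E | Rater D: Yes | Rater D: No | % Agreement |
|---|---|---|---|---|
| Normal | Yes | 2 (20) | 0 (0) | 100 |
| | No | 0 (0) | 8 (80) | |
| Abnormal | Yes | 8 (80) | 0 (0) | 100 |
| | No | 0 (0) | 2 (20) | |
| Poor repertoire | Yes | 4 (40) | 3 (30) | 70 |
| | No | 0 (0) | 3 (30) | |
| Cramped synchronized | Yes | 0 (0) | 3 (30) | 70 |
| | No | 0 (0) | 7 (70) | |
| Chaotic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |
| Hypokinetic | Yes | 1 (10) | 0 (0) | 100 |
| | No | 0 (0) | 9 (90) | |

Pair of raters 5 (D-B) (a); rows: Rater D, columns: Rater B

| GMs category | Rater D | Rater B: Yes | Rater B: No | % Agreement |
|---|---|---|---|---|
| Normal | Yes | 3 (15) | 4 (20) | 80 |
| | No | 0 (0) | 13 (65) | |
| Abnormal | Yes | 13 (65) | 0 (0) | 80 |
| | No | 4 (20) | 3 (15) | |
| Poor repertoire | Yes | 12 (60) | 1 (5) | 75 |
| | No | 4 (20) | 3 (15) | |
| Cramped synchronized | Yes | 2 (10) | 0 (0) | 95 |
| | No | 1 (5) | 17 (85) | |
| Chaotic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |
| Hypokinetic | Yes | 0 (0) | 0 (0) | - |
| | No | 0 (0) | 0 (0) | |

(a) Data reported as frequency (percentage)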

Table 4 presents the inter-rater reliability for GMs classification. The GMs binary classification (normal versus abnormal) obtained an inter-rater agreement percentage of 88% (from 80% to 100%) with a coefficient AC1 = 0.84±0.5 (from 0.68 to 1). The GMs classification with abnormal subcategories (poor repertoire, cramped synchronized, chaotic, or hypokinetic) demonstrated an inter-rater agreement of 72% (from 69% to 77%) and an AC1 = 0.67±0 (from 0.63 to 0.73).

Table 4. Inter-rater reliability per pair of raters for the GMs classification during the preterm period (n = 65).

| Pair of raters | GMs classification | % Agreement | AC1 | p | CI |
|---|---|---|---|---|---|
| 1 (B-C) | Normal versus Abnormal | 84 | 0.70 | 0.001 | [0.28,1] |
| 1 (B-C) | Normal versus Abnormal subcategories | 69 | 0.63 | 0.001 | [0.30,0.69] |
| 2 (D-A) | Normal versus Abnormal | 90 | 0.82 | 0.000 | [0.57,1] |
| 2 (D-A) | Normal versus Abnormal subcategories | 77 | 0.73 | 0.000 | [0.51,0.95] |
| 3 (E-D) | Normal versus Abnormal | 100 | 1 | 0.000 | [1,1] |
| 3 (E-D) | Normal versus Abnormal subcategories | 70 | 0.64 | 0.002 | [0.25,1] |
| 5 (D-B) | Normal versus Abnormal | 80 | 0.68 | 0.000 | [0.34,1] |
| 5 (D-B) | Normal versus Abnormal subcategories | 75 | 0.71 | 0.000 | [0.47,0.94] |

AC1, Gwet’s reliability coefficient; p, significance level of < 0.05; CI, lower limit, upper limit 95% confidence interval.

Inter-rater reliability for the detailed assessment

Table 5 presents the inter-rater reliability for the GMOS. The GMOS obtained an ICC = 0.72±8 (from 0.66 to 0.79). The lowest inter-rater reliability was related to the item assessing the neck and trunk involvement with an ICC = 0.44±1 (from 0.19 to 0.66). No GMOS item obtained a perfect inter-rater agreement among all pairs of raters.

Table 5. Inter-rater reliability for the GMOS by pair of raters during the preterm period (n = 64).

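Pair of raters 1 (B-C)

| Item | Rater B scoring (a) | Rater C scoring (a) | ICC | p | CI | SEM |
|---|---|---|---|---|---|---|
| GMOS | 23.4±7 | 25.8±7 | 0.72 | 0.001 | [0.32,0.90] | 3 |
| Sequence | 1.2±0 | 1.2±0 | 0.70 | 0.006 | [0.29,0.8] | |
| Neck and trunk | 1.9±0 | 2.6±1 | 0.47 | 0.037 | [0.05,0.86] | |
| Upper extremities | 10.31±3 | 11.92±2 | 0.62 | 0.007 | [0.15,0.86] | |
| Lower extremities | 10±3 | 10±3 | 0.62 | 0.007 | [0.16,0.86] | |

Pair of raters 2 (D-A)

| Item | Rater D scoring (a) | Rater A scoring (a) | ICC | p | CI | SEM |
|---|---|---|---|---|---|---|
| GMOS | 27±10 | 26.9±9 | 0.79 | 0.000 | [0.56,0.9] | 4 |
| Sequence | 1.2±0 | 1.4±0 | 0.68 | 0.000 | [0.33,0.87] | |
| Neck and trunk | 2.5±0 | 1.9±1 | 0.19 | 0.219 | [0.29,0.60] | |
| Upper extremities | 12.5±3 | 12.5±3 | 0.84 | 0.000 | [0.63,0.94] | |
| Lower extremities | 10.8±3 | 12.4±4 | 0.75 | 0.000 | [0.44,0.90] | |

Pair of raters 3 (E-D)

| Item | Rater E scoring (a) | Rater D scoring (a) | ICC | p | CI | SEM |
|---|---|---|---|---|---|---|
| GMOS | 23.6±10 | 26.4±6 | 0.73 | 0.005 | [0.23,0.93] | 4 |
| Sequence | 1.1±0 | 1.2±0 | 0.80 | 0.190 | [0.27,0.96] | |
| Neck and trunk | 2.7±1 | 2.5±0 | 0.66 | 0.027 | [0.01,0.93] | |
| Upper extremities | 11.7±4 | 12.5±3 | 0.86 | 0.002 | [0.44,0.97] | |
| Lower extremities | 10.1±5 | 10.8±3 | 0.79 | 0.006 | [0.26,0.96] | |

Pair of raters 5 (D-B)

| Item | Rater D scoring (a) | Rater B scoring (a) | ICC | p | CI | SEM |
|---|---|---|---|---|---|---|
| GMOS | 26.3±9 | 22.6±7 | 0.66 | 0.000 | [0.32,0.84] | 4 |
| Sequence | 1.2±0 | 1.2±0 | 0.66 | 0.043 | [0.29,0.86] | |
| Neck and trunk | 2.5±0 | 1.9±0 | 0.47 | 0.022 | [0.16,0.76] | |
| Upper extremities | 12.5±3 | 10.31±3 | 0.65 | 0.001 | [0.27,0.85] | |
| Lower extremities | 10.8±3 | 10±3 | 0.55 | 0.007 | [0.12,0.89] | |

Note: The GMOS was not calculated for one subject due to hypokinetic GMs.

(a) Data reported as mean±SD.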

ICC, intraclass correlation coefficient; p, significance level of < 0.05; CI, lower limit, upper limit 95% confidence interval; SEM, standard error of measurement.

Discussion

This study aimed to assess the inter-rater reliability of the preterm writhing GMA, providing evidence on both (a) the global and (b) the detailed assessment of the same sample of preterm infants. We therefore discuss the inter-rater reliability estimates for the GMs classification and the GMOS, and then address the precision of these estimates.

Inter-rater reliability for the global assessment

As hypothesized, the inter-rater reliability of the global assessment was high (AC1 = 0.84 [95% CI = 0.54,1]) for the GMs binary classification (normal versus abnormal). However, it was only substantial (AC1 = 0.67 [95% CI = 0.38,0.89]) for the GMs classification with abnormal subcategories (poor repertoire, cramped synchronized, chaotic, or hypokinetic). Our findings align with prior research on preterm writhing, writhing, and fidgety GMA, which found higher inter-rater reliability for the GMs binary classification (k > 0.80) compared to the GMs classification with abnormal subcategories (k = 0.50) [15, 17, 19]. Therefore, we will discuss the factors that influence this disparity in the inter-rater reliability for the global assessment.

The decrease in inter-rater reliability for the GMs classification with abnormal subcategories can be attributed to the nature of reliability coefficients, which tend to diminish when there are more than two classification categories [35]. Additionally, items that are difficult to interpret can impact the inter-rater reliability for the GMs classification with abnormal subcategories [30]. The poor repertoire GMs category could be a challenging item, as it showed the highest bias (0.23) and the highest disagreement (76%) among raters. Although disagreement could have been reduced by using the checklist to guide GMs assessment, our inter-rater reliability estimates agree with those obtained for the checklist (k = 0.68–0.80) [16]. Observations in this study align with previous studies that have suggested the lack of precision of poor repertoire GMs in identifying neurodevelopmental disturbances in preterm infants [36]. Studies have shown that infants with poor repertoire GMs can transition into normal GMs by the time they reach term age [37]. This imprecision of poor repertoire GMs could also impact inter-rater reliability for the GMs classification with abnormal subcategories.

Therefore, clinical studies recommend combining the binary GMs classification with other neurological measures and neuroimaging to enhance the identification of preterm infants at neurodevelopmental risk [38]. Also, researchers have suggested using the global assessment of the preterm writhing GMA in the framework of longitudinal neurodevelopmental follow-up monitoring [21].

Inter-rater reliability for the detailed assessment

Contrary to our hypothesis, the detailed assessment demonstrated good inter-rater reliability (ICC = 0.72 [95% CI = 0.35,0.89]) for the GMOS. This observation differs from previous studies that reported higher inter-rater reliability for the fidgety MOS and the fidgety MOS-revised [19, 20]. Therefore, we will now consider the factors that may have influenced the inter-rater reliability value for the GMOS in this study.

Earlier studies have shown that the differences among raters’ expertise levels can affect inter-rater reliability for the fidgety GMA [18]. Although the raters have comparable expertise levels in preterm writhing GMA, they come from different professional clinical and research backgrounds. Raters (pair 2) with a research background demonstrated higher inter-rater reliability (ICC = 0.79 [95% CI = 0.56,0.90]) for the GMOS compared to raters from clinical fields. This observation aligns with a study that revealed lower inter-rater reliability for the global assessment of the writhing GMA and fidgety GMA in clinical settings than in research settings [17]. Professional background differences among raters in this study may have influenced inter-rater reliability for the GMOS.

Additionally, the item assessing neck and trunk involvement may be challenging because it presented the highest disagreement (ICC = 0.44 [95% CI = 0.12,0.78]) among raters. The response options for this item make it hard to discriminate between little and no involvement. The item evaluating the presence of tremulous movements may also be challenging because it had the second-highest rater disagreement. Two previous studies could support these observations. One study observed tremulous movements in both normal GMs and abnormal GMs [21] during the preterm writhing and writhing period. The other study demonstrated the clinical imprecision of tremulous movements during the writhing period in identifying neurodevelopmental disturbances in preterm infants [39]. The lack of precision of these challenging items may have impacted the inter-rater reliability for the GMOS.

Given the clinical utility of the detailed assessment, previous studies recommend using it in combination with the global assessment to gain a deeper understanding of specific parameters and trajectories of the GMs in preterm infants [21].

The precision of estimates

We also observed wide 95% confidence intervals, which suggest potential limitations in the precision of the inter-rater reliability estimates. Confidence intervals for any inter-rater reliability estimate depend on two factors: sample size and sample variability related to the assessed parameter (in this case, the GMs) [40]. Therefore, we will address how these two factors could have influenced the precision of inter-rater reliability estimates for both the global and the detailed assessments.

Firstly, a small sample size can lead to increased error and increased uncertainty in inter-rater reliability estimates [40]. While our sample size is suitable for reliability studies involving categorical and numerical variables [23, 35], the relatively small number of subjects (n = 69) might affect the precision of inter-rater reliability estimates for both the global and detailed assessments. Two previous studies with smaller sample sizes reported similar findings. One study (n = 39) on the global assessment of the preterm writhing GMA reported high inter-rater reliability (k > 0.80 [95% CI = 0.40,1]) for the GMs classification but noted a wide 95% confidence interval [15]. Another study (n = 24), focusing on the detailed assessment, reported high inter-rater reliability (ICC = 0.87) for the fidgety MOS but noted a slightly higher measurement error than expected.
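The link between sample size and interval width can be illustrated with a small Python sketch. Note the swapped-in technique: it uses the Wilson score interval for a simple proportion (here an agreement rate of 0.84), not the confidence intervals for AC1 or the ICC reported in this study, and the sample sizes and function name are chosen only for illustration.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion (illustrative stand-in only)."""
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Same observed agreement, three hypothetical sample sizes.
for n in (10, 65, 250):
    low, high = wilson_interval(0.84, n)
    print(f"n = {n:3d}: 95% CI = [{low:.2f}, {high:.2f}], width = {high - low:.2f}")
```

The interval narrows steadily as n grows while the point estimate stays the same, which is the pattern the paragraph above describes for reliability coefficients estimated on small samples.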

Secondly, an increased variability within the sample reduces the precision of inter-rater reliability estimates [41]. The oscillation in the proportion of abnormal GMs (ranging from 53% to 80%) and the standard deviation (25.3±8) of the GMOS may suggest variability within our sample. Thus, the relative heterogeneity related to the GMs within the sample could have influenced the precision of inter-rater reliability estimates for both the global and the detailed assessments. These observations contrast with the findings of a recent study involving a more heterogeneous sample of preterm and term infants with diverse clinical characteristics [20]. In that study, the fidgety MOS-revised exhibited higher inter-rater reliability values (ICC = 0.98 [95% CI = 0.97,0.99]). It is important to note that the study included a significantly larger sample (n = 252).

Limitations and future research

Our sample size met Bonett’s parameters for inter-rater reliability studies [23], but the relatively low number of subjects was a limiting factor for this study. Therefore, future research should consider larger samples of preterm infants to increase the precision of inter-rater reliability estimates for the preterm writhing GMA. While the randomized assignment of participants to randomly formed pairs of raters might minimize bias [28], the convenience recruitment of participants from the NICU was another limitation of this study. Recruiting participants in the NICU might explain the high rate of PVL in our sample. Therefore, future studies on high-risk preterm infants could consider recruiting participants upon admission to the NICU to improve the sample’s representativeness and the generalizability of inter-rater reliability estimates for the preterm writhing GMA.

Conclusions

Reliability in identifying preterm infants at neurodevelopmental risk is a critical concern in assessments. This study provides insights into the inter-rater reliability of the preterm writhing GMA for evaluating the functionality of a young nervous system. We observed high to substantial inter-rater reliability for the global assessment, with the binary GMs classification being the most reliable. The detailed assessment showed good inter-rater reliability for the GMOS. However, our small sample size limited the precision of these estimates. Several challenging items, such as assessing neck and trunk involvement, poor repertoire GMs, and tremulous movements, contributed to substantial inconsistency among raters. Therefore, ongoing training and rater calibration are necessary to enhance inter-rater reliability for the preterm writhing GMA. The preterm writhing GMA seems to have better inter-rater reliability in research settings than in a clinical environment. Given the utility of the preterm writhing GMA, further investigation in clinical settings is necessary to better understand its inter-rater reliability in identifying preterm infants at a high risk of neurodevelopmental issues.

Supporting information

S1 File. pone.0301934.s001.xlsx (XLSX, 31.7 KB)

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

The author received no specific funding for this work.

References

  • 1. Blencowe H, Vos T, Lee AC, Philips R, Lozano R, Alvarado MR, et al. Estimates of neonatal morbidities and disabilities at regional and global levels for 2010: introduction, methods overview, and relevant findings from the Global Burden of Disease study. Pediatr Res. 2013;74: 4–16. doi: 10.1038/pr.2013.203
  • 2. Wang CJ, Mcglynn EA, Brook RH, Leonard CH, Piecuch RE. Quality-of-Care Indicators for the Neurodevelopmental Follow-up of Very Low Birth Weight Children: Results of an Expert Panel Process. Pediatrics. 2006;117: 2080–2092. doi: 10.1542/peds.2005-1904
  • 3. Einspieler C, Prechtl HFR. Prechtl’s assessment of general movements: A diagnostic tool for the functional assessment of the young nervous system. Ment Retard Dev Disabil Res Rev. 2005;11: 61–67. doi: 10.1002/mrdd.20051
  • 4. Cioni G, Bos A, Einspieler C, Ferrari F, Martijn A, Paolicelli PB, et al. Early neurological signs in preterm infants with unilateral intraparenchymal echodensity. Neuropediatrics. 2000;31: 240–51. doi: 10.1055/s-2000-9233
  • 5. Spittle AJ, Brown NC, Doyle LW, Boyd RN, Hunt RW, Bear M, et al. Quality of general movements is related to white matter pathology in very preterm infants. Pediatrics. 2008;121: e1184–9. doi: 10.1542/peds.2007-1924
  • 6. Peyton C, Einspieler C. General Movements: A Behavioral Biomarker of Later Motor and Cognitive Dysfunction in NICU Graduates. Pediatr Ann. 2018;47: e159–e164. doi: 10.3928/19382359-20180325-01
  • 7. Olsen JE, Cheong JLY, Eeles AL, FitzGerald TL, Cameron KL, Albesher RA, et al. Early general movements are associated with developmental outcomes at 4.5–5 years. Early Hum Dev. 2020;148: 105115. doi: 10.1016/j.earlhumdev.2020.105115
  • 8. Ferrari F, Cioni G, Einspieler C, Roversi MF, Bos AF, Paolicelli PB, et al. Cramped synchronized general movements in preterm infants as an early marker for cerebral palsy. Arch Pediatr Adolesc Med. 2002;156: 460–467. doi: 10.1001/archpedi.156.5.460
  • 9. Novak I, Morgan C, Adde L, Blackman J, Boyd RN, Brunstrom-Hernandez J, et al. Early, accurate diagnosis and early intervention in cerebral palsy: Advances in diagnosis and treatment. JAMA Pediatr. 2017;171: 897–907. doi: 10.1001/jamapediatrics.2017.1689
  • 10. Einspieler C, Bos A, Libertus M, Marschik P. The General Movement Assessment Helps Us to Identify Preterm Infants at Risk for Cognitive Dysfunction. Front Psychol. 2016;7: 406. doi: 10.3389/fpsyg.2016.00406
  • 11. Einspieler C, Prechtl HFR, Bos A, Ferrari F, Cioni G. Prechtl’s method on the qualitative assessment of general movements in preterm, term and young infants. Clin Dev Med. 2004;167: 1–91.
  • 12. Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol. 2011;64: 96–106. doi: 10.1016/j.jclinepi.2010.03.002
  • 13. Craciunoiu O, Holsti L. A Systematic Review of the Predictive Validity of Neurobehavioral Assessments During the Preterm Period. Phys Occup Ther Pediatr. 2016;2638: 1–16. doi: 10.1080/01942638.2016.1185501
  • 14. Valentin T, Uhl K, Einspieler C. The effectiveness of training in Prechtl’s method on the qualitative assessment of general movements. Early Hum Dev. 2005;81: 623–627. doi: 10.1016/j.earlhumdev.2005.04.003
  • 15. Mutlu A, Einspieler C, Marschik PB, Livanelioglu A. Intra-individual consistency in the quality of neonatal general movements. Neonatology. 2008;93: 213–216. doi: 10.1159/000110870
  • 16. Aizawa CYP, Einspieler C, Genovesi FF, Ibidi SM, Hasue RH. The general movement checklist: A guide to the assessment of general movements during preterm and term age. J Pediatr (Rio J). 2021;97: 445–452. doi: 10.1016/j.jped.2020.09.006
  • 17. Bernhardt I, Marbacher M, Hilfiker R, Radlinger L. Inter- and intra-observer agreement of Prechtl’s method on the qualitative assessment of general movements in preterm, term and young infants. Early Hum Dev. 2011;87: 633–639. doi: 10.1016/j.earlhumdev.2011.04.017
  • 18. Peyton C, Pascal A, Boswell L, deRegnier R, Fjørtoft T, Støen R, et al. Inter-observer reliability using the General Movement Assessment is influenced by rater experience. Early Hum Dev. 2021;161. doi: 10.1016/j.earlhumdev.2021.105436
  • 19. Fjørtoft T, Einspieler C, Adde L, Strand LI. Inter-observer reliability of the “Assessment of Motor Repertoire—3 to 5 Months” based on video recordings of infants. Early Hum Dev. 2009;85: 297–302. doi: 10.1016/j.earlhumdev.2008.12.001
  • 20. Örtqvist M, Marschik PB, Toldo M, Zhang D, Fajardo‐Martinez V, Nielsen‐Saines K, et al. Reliability of the Motor Optimality Score‐Revised: a study of infants at elevated likelihood for adverse neurological outcomes. Acta Paediatr. 2023. doi: 10.1111/apa.16747
  • 21. Einspieler C, Marschik PB, Pansy J, Scheuchenegger A, Krieber M, Yang H, et al. The general movement optimality score: a detailed assessment of general movements during preterm and term age. Dev Med Child Neurol. 2016;4: 361–368. doi: 10.1111/dmcn.12923
  • 22. Hadders-Algra M. The neuronal group selection theory: promising principles for understanding and treating developmental motor disorders. Dev Med Child Neurol. 2000;42: 707–715. doi: 10.1017/s0012162200001316
  • 23. Bonett DG. Sample size requirements for estimating intraclass correlations with desired precision. Stat Med. 2002;21: 1331–1335. doi: 10.1002/sim.1108
  • 24. Prechtl HFR. The behavioural states of the newborn infant (a review). Brain Res. 1974;76. doi: 10.1016/0006-8993(74)90454-5
  • 25. Wongpakaran N, Wongpakaran T, Wedding D, Gwet KL. A comparison of Cohen’s Kappa and Gwet’s AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples. 2013. Available: http://www.biomedcentral.com/1471-2288/13/61
  • 26. Crowle C, Galea C, Morgan C, Novak I, Walker K, Badawi N. Inter-observer agreement of the General Movements Assessment with infants following surgery. Early Hum Dev. 2017;104: 17–21. doi: 10.1016/j.earlhumdev.2016.11.001
  • 27. Landis JR, Koch GG. The Measurement of Observer Agreement for Categorical Data. Biometrics. 1977;33: 159–174. doi: 10.2307/2529310
  • 28. Hallgren K. Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial. Tutor Quant Methods Psychol. 2012;8: 23–34. doi: 10.20982/tqmp.08.1.p023
  • 29. Cicchetti DV. Interreliability Standards in Psychological Evaluations. Psychol Assess. 1994; 284–290.
  • 30. De Vet H, Terwee C, Mokkink L, Knol DL. Measurement in medicine: A practical guide. 2011. doi: 10.1017/CBO9780511996214
  • 31. Gwet KL. Handbook of inter-rater reliability. The definitive guide to measuring the extent of agreement amongst raters. 4th edition. Gaithersburg: Advanced Analytics LLC; 2014.
  • 32. Ment LR, Bada HS, Barnes P, Grant PE, Hirtz D, Papile LA, et al. Practice parameter: Neuroimaging of the neonate: Report of the Quality Standards Subcommittee of the American Academy of Neurology and the Practice Committee of the Child Neurology Society. Neurology. 2012;59: 1663–1664. doi: 10.1212/wnl.59.10.1663
  • 33. de Vries LS, Eken P, Dubowitz LMS. The spectrum of leukomalacia using cranial ultrasound. Behavioural Brain Research. 1992;49: 1–6. doi: 10.1016/s0166-4328(05)80189-5
  • 34. Hadchouel A, Delacourt C. Premature infants bronchopulmonary dysplasia: Past and present. Rev Pneumol Clin. 2013;69: 207–216. doi: 10.1016/j.pneumo.2013.05.003
  • 35. Sim J, Wright CC. The kappa statistic in reliability studies: Use, interpretation, and sample size requirements. Phys Ther. 2005;85: 257–268. doi: 10.1093/ptj/85.3.257
  • 36. Nakajima Y, Einspieler C, Marschik PB, Bos AF, Prechtl HFR. Does a detailed assessment of poor repertoire general movements help to identify those infants who will develop normally? Early Hum Dev. 2006;82: 53–59. doi: 10.1016/j.earlhumdev.2005.07.010
  • 37. Olsen JE, Brown NC, Eeles AL, Lee KJ, Anderson PJ, Cheong JLY, et al. Trajectories of general movements from birth to term-equivalent age in infants born <30 weeks’ gestation. Early Hum Dev. 2015;91: 683–688. doi: 10.1016/j.earlhumdev.2015.09.009
  • 38. Morgan C, Romeo DM, Chorna O, Novak I, Galea C, Del Secco S, et al. The pooled diagnostic accuracy of neuroimaging, general movements, and neurological examination for diagnosing cerebral palsy early in high-risk infants: A case control study. J Clin Med. 2019;8. doi: 10.3390/jcm8111879
  • 39. Spittle A, Walsh J, Potter C, Mcinnes E, Olsen J, Lee K, et al. Neurobehaviour at term-equivalent age and neurodevelopmental outcomes at 2 years in infants born moderate-to-late preterm. Dev Med Child Neurol. 2017;59: 207–215. doi: 10.1111/dmcn.13297
  • 40. Gardner MJ, Altman DG. Statistics in Medicine Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed). 1986;292: 746–750. doi: 10.1136/bmj.292.6522.746
  • 41. O’Brien SF, Yi QL. How do I interpret a confidence interval? Transfusion (Paris). 2016;56: 1680–1683. doi: 10.1111/trf.13635

Decision Letter 0

Claudia Brogna

14 Aug 2023

PONE-D-23-13340

Prechtl’s method to assess general movements: Inter-rater reliability for the global and detailed assessment during the preterm period

PLOS ONE

Dear Dr. Angelica VALENCIA

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 28 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Claudia Brogna

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This study evaluated GMA global results and GMOS scores for 65 infants videotaped at <37 weeks postmenstrual age prior to hospital discharge. Raters were well qualified to rate videos. Five pairs of raters were enlisted to test the reliability and agreement of the ratings, but one pair of raters only evaluated 4 participants and this pairing was excluded from analysis. Findings were in general agreement with other studies; however, the authors identified 95% confidence intervals that included much lower values for reliability. The authors indicated this may affect the generalization of the results. However, the confidence interval would be likely to narrow with larger sample sizes, and given their small number of participants, it is not clear if they have proven their conclusion.

Apart from this finding, the results are largely confirming other studies. The discussion is not clearly organized and the conclusions are somewhat vague. It is difficult to understand the main points the authors are trying to make.

Reviewer #2: The paper entitled “Prechtl’s method to assess general movements: Inter-rater reliability for the global and detailed assessment during the preterm period” reports on an interesting issue regarding the use of general movements in preterm newborns.

This paper is of interest for physicians involved in the follow-up of this population of infants. However, there are a few issues that should be addressed.

Introduction

Page 9, line 57: I’m not sure that the GMs should be considered a “neuromotor test”. It measures the quality of movement, detecting disorders of movement (functional limitation), as better explained a few lines later.

Results

It seems that the percentage of abnormal GMs patterns was lower than that of normal GMs. It could be of interest to have more information about the clinical characteristics in terms of risk of neurological impairment, for example the US scan data or MRI, seizures, incidence of SGA, or other potential risk factors.

Discussion

The results are well discussed, as are the potential limitations.

Please revise the entire manuscript for English.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Apr 18;19(4):e0301934. doi: 10.1371/journal.pone.0301934.r002

Author response to Decision Letter 0


1 Dec 2023

• Academic Editor: "Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming."

We have revised our manuscript to ensure full compliance with PLOS ONE's style for authors' affiliations, headings, tables, and reference citations. Each file is now named according to the requirements, which includes 'Response to Reviewers', 'Revised Manuscript with Track Changes', 'Manuscript', and 'Supporting Information'.

• Academic Editor: "In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available."

To address this request, we have uploaded our study’s minimal underlying data set as an Excel file under 'Supporting Information' with our revised manuscript.

• Reviewer #1: "The discussion is not clearly organized. It is difficult to understand the main points the authors are trying to make."

In response, we have restructured the discussion (page 11, line 262). The discussion now addresses three main results: inter-rater reliability for the global and detailed assessments of general movements (GMs) using Prechtl's method (GMA) during the preterm period, as well as the precision of these estimates.

Firstly, we addressed the differences in inter-rater reliability for the global assessment (page 12, line 274). Since the GMs classification with abnormal subcategories obtained lower inter-rater reliability compared to the binary GMs classification, we discussed the factors that might have influenced this difference.

Secondly, we discussed the factors that might have influenced the inter-rater reliability of the detailed assessment because the observed inter-rater reliability for the GMOS (GMs optimality score) contradicted our hypothesis and previous research (page 13, line 300).

Thirdly, we discussed the potential impact of small sample size and variability on the precision of our estimates (page 14, line 326). This was essential because we observed wide 95% confidence intervals associated with our inter-rater reliability index for both the global and detailed assessments.

• Reviewer #1: "The authors identified 95% confidence intervals that included much lower values for reliability. The authors indicated this may affect the generalization of the results. However, the confidence interval would be likely to narrow with larger sample sizes and given their small number of participants, it is not clear if they have proven their conclusion". "The conclusions are somewhat vague. It is difficult to understand the main points the authors are trying to make".

We agree with Reviewer #1 and have updated the conclusion (page 15, line 367). The revised conclusion no longer focuses on problems related to the generalization of results. Instead, it highlights concerns about the precision of estimates due to the wide confidence intervals observed (Gardner and Altman, 1986). The conclusion is now more precise, addressing the potential impact of our small sample size on the width of the confidence intervals, which, in turn, can affect the precision of our inter-rater reliability estimates. Additionally, the conclusion discusses how challenging items and differences among raters might influence inter-rater reliability for the preterm GMA.

• Reviewer #2: "Introduction Pag 9 line 57: I’m not sure that the GMs should be considered a “neuromotor test”. It measures the quality of movement detecting disorders of movement (functional limitation), as better explained few lines after."

We agree with Reviewer #2 and have used the accurate terminology. The revised introduction (page 3, line 64) defines the GMA as a test for the functional assessment of the young nervous system (Einspieler and Prechtl, 2005).

• Reviewer #2: "It could be of interest to know more information about the clinical characteristics in term of risk of neurological impairment; for example, the US scan data or MRI, seizures, incidence of SGA or other potential risk factors"

In response, we have included an additional table (Table 2 on page 9, line 225) in the results section, which provides information on participants' clinical characteristics in terms of risk of neurological impairment. It includes the US scan, incidence of SGA, APGAR at 5 min, extremely low birth weight, and neonatal clinical complications.

• Reviewer #2: "please revise all the manuscript for English"

In response, we have used professional language editing services to enhance the quality of English in our manuscript. We have uploaded the proofreader's certificate.

Attachment

Submitted filename: Response to Reviewers.docx

pone.0301934.s002.docx (24.1KB, docx)

Decision Letter 1

Claudia Brogna

1 Feb 2024

PONE-D-23-13340R1

Prechtl’s method to assess general movements: Inter-rater reliability during the preterm period

PLOS ONE

Dear Dr. VALENCIA,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 17 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Claudia Brogna

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: A table of neonatal morbidities has been added but these should be defined in the text. The rate of PVL seems unusually high and this may reflect the definition of PVL or selection of patients for the study.

Reviewer #2: The authors answered all the queries. No further comments are needed.

Reviewer #3: This study details the inter-rater reliability of preterm writhing GMA global scores as well as the General Movements Optimality Score (GMOS). Strengths of this study include the generalisability of results with raters being in different countries. The small sample size is a limitation, although the authors do very well to explore this in great detail without biasing the results in their favour, rather encouraging the reader to exercise caution and notes for future studies.

Abstract:

Would be good to use the term “preterm writhing GMA” to add clarity about which period of GMs the study focuses on. Similarly, the term “preterm writhing GMA” does not appear anywhere in the abstract. This is important to highlight to differentiate from fidgety GMs.

Introduction:

Ideally throughout the manuscript, the distinction between fidgety and writhing should be made to avoid any confusion. E.g. when discussing previous literature about interrater agreement and predictive validity – are the studies referenced in relation to writhing or fidgety movements? Sometimes it is qualified in the manuscript, and sometimes it is not.

Avoid definite statements, e.g. “…has not been reported”; “this is the first study”. There may be conference abstracts or similar that detail this. Instead, consider using “little/scarce published data exists” or “to our knowledge, few studies report…”/“there is a scarcity of data…”

The authors do well to describe the gap in knowledge (although some expressions could be less definitive, see above), but justification for the importance of knowing the accuracy in the preterm writhing period is lacking, i.e. what is the clinical benefit of having this knowledge during the preterm period? This point may be able to be addressed easily by highlighting research about preterm writhing movements, specifically, and their relationship with later development (line 92 – unclear if this research refers to preterm writhing or fidgety GMA, best to check).

Methods:

“Crying and hiccupping episodes were suppressed with an online editing program” – I’m not sure I understand how this is possible? Which program was used? Perhaps this program needs to be quoted, as you would with statistical analysis software. This sounds like a very complex procedure and could affect how general movements are observed with human Gestalt perception. Do the authors perhaps mean they were simply edited out?

The authors may want to refer to the work by Dr Crowle (Early Human Development, 104, 2017) to further justify the use of Gwet’s AC1.

Results:

Appropriate, tables clear to read.

Discussion:

Appropriate and well discussed limitations.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Apr 18;19(4):e0301934. doi: 10.1371/journal.pone.0301934.r004

Author response to Decision Letter 1


12 Mar 2024

Academic Editor: " Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction."

We have carefully reviewed our reference list. It is complete, accurate, and does not include any retracted papers. We have added four references.

• Reference 26 (Crowle C, et al., 2017) on page 8, line 201 provides further justification for using the AC1 reliability coefficient, as recommended by Reviewer 3.

• References 32 (Ment LR, et al., 2012), 33 (de Vries L, Eken P, and Dubowitz L, 1992), and 34 (Hadchouel A, and Delacourt C, 2013) on page 9, line 228, define the participants’ morbidities presented in Table 2, as recommended by Reviewer 1.

Reviewer #1: "A table of neonatal morbidities has been added but these should be defined in the text."

In response, we have presented the criteria for defining the following morbidities: small for gestational age, brain ultrasound abnormalities, intraventricular hemorrhage (IVH), periventricular leukomalacia (PVL), and bronchopulmonary dysplasia (Note in Table 2, page 9, line 228).

We have removed the prevalence of respiratory distress and infection (Table 2, page 9, line 228) due to missing data regarding the definition of these morbidities.

Reviewer #1: "The rate of PVL seems unusually high and this may reflect the definition of PVL or selection of patients for the study."

We agree with Reviewer #1 and have added (page 15, line 365) that the high rate of PVL in our sample may be explained by the convenience recruitment of high-risk infants admitted to the NICU, as stated in the participants section (page 5, line 141).

Reviewer #3: " Would be good to use the term “preterm writhing GMA” to add clarity about which period of GMs the study focuses on. Similarly, the term “preterm writhing GMA” does not appear anywhere in the abstract. This is important to highlight to differentiate from fidgety GMs".

"Ideally throughout the manuscript, the distinction between fidgety and writhing should be made to avoid any confusion. E.g. when discussing previous literature about interrater agreement and predictive validity – are the studies referenced in relation to writhing or fidgety movements? Sometimes it is qualified in the manuscript, and sometimes it is not".

We have implemented the suggestion from Reviewer #3 and updated the abstract (page 2, line 35) accordingly. The abstract now uses the term "preterm writhing GMA", and the term is used consistently throughout the manuscript to enhance clarity and precision in terminology.

Additionally, the updated introduction (page 4, lines 90-95) now distinguishes between writhing GMA and fidgety GMA while summarizing previous literature on interrater agreement and predictive validity.

Reviewer #3: Avoid definite statements, e.g. “…has not been reported”; “this is the first study”. There may be conference abstracts or similar that detail this. Instead, consider using “little/scarce published data exists” or “to our knowledge, few studies report…”/“there is a scarcity of data…”

We have replaced definite statements throughout the text, specifically in the abstract (page 2, line 33), introduction (page 4, lines 101 and 106), and discussion (page 11, line 263), with the more nuanced statements (e.g., "there is a scarcity of data") recommended by Reviewer #3.

Reviewer #3: The authors do well to describe the gap in knowledge (although some expressions could be less definitive, see above), but justification for the importance of knowing the accuracy in the preterm writhing period is lacking, i.e. what is the clinical benefit of having this knowledge during the preterm period? This point may be able to be addressed easily by highlighting research about preterm writhing movements, specifically, and their relationship with later development (line 92 – unclear if this research refers to preterm writhing or fidgety GMA, best to check).

We agree with Reviewer #3 and have updated the justification (page 4, lines 113-118) for the clinical importance of understanding the reliability of the preterm writhing GMA. It now includes the following arguments:

• The demonstrated high validity of the preterm writhing GMA in predicting development in preterm infants (Craciunoiu O, and Holsti L, 2016).

• The value of this information in guiding clinicians on the consistency of the preterm writhing GMA in identifying at-risk babies and opportunities for early intervention.

Additionally, we have clarified that the validation study cited on page 2, line 92 refers to preterm writhing GMA.

Reviewer #3: "Crying and hiccupping episodes were suppressed with an online editing program” – I’m not sure I understand how this is possible? Which program was used? Perhaps this program needs to be quoted, as you would with statistical analysis software. This sounds like a very complex procedure and could affect how general movements are observed with human Gestalt perception. Do the authors perhaps mean they were simply edited out?"

Yes, we mean that the videos from the participants were edited. Therefore, the updated version of the method section (page 6, line 157) simply states this.

Reviewer #3: "The authors may want to refer to the work by Dr Crowle (Early Human Development, 104, 2017) to further justify the use of Gwet’s AC1."

In response, we have included the reference to Crowle C, et al. (2017) in the statistical analysis section (page 8, line 201).

Attachment

Submitted filename: Response to Reviewers.docx

pone.0301934.s003.docx (28.7KB, docx)

Decision Letter 2

Claudia Brogna

25 Mar 2024

Prechtl’s method to assess general movements: Inter-rater reliability during the preterm period

PONE-D-23-13340R2

Dear Dr. Angelica VALENCIA,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Claudia Brogna

Academic Editor

PLOS ONE

Acceptance letter

Claudia Brogna

27 Mar 2024

PONE-D-23-13340R2

PLOS ONE

Dear Dr. VALENCIA,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Claudia Brogna

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 File

    (XLSX)

    pone.0301934.s001.xlsx (31.7KB, xlsx)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pone.0301934.s002.docx (24.1KB, docx)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pone.0301934.s003.docx (28.7KB, docx)

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.
