NPJ Parkinson's Disease. 2025 Nov 20;11:327. doi: 10.1038/s41531-025-01163-0

Added value of a wrist-worn device for assessing tremor in Parkinson’s disease: reliability and validity of tremor evaluation at home

D H Hepp 1,#, K Ocran 1,#, A Szanto 2, J J van Hilten 1, V Exadaktylos 3, R H A Weijer 1,4,
PMCID: PMC12635349  PMID: 41266564

Abstract

Clinical assessment and self-report of tremor in Parkinson’s disease (PD) have several limitations, including recall bias and lack of sensitivity. Wearable-based technology offers an opportunity to address these shortcomings. We investigated the test–retest reliability of wearable tremor assessment and compared the different approaches of assessing tremor. We analyzed data from 219 participants of the ProPark study. In 27% of people with PD, self-report and clinical observation disagreed about the presence of tremor. Wearable-based tremor assessment had excellent test–retest reliability after 3 days of data recording (ICC(2,3) lower bound of 95%CI > 0.90 for all tremor metrics). Wearable-derived tremor amplitude, duration, and volume were significantly associated with clinical assessment of tremor (all p < 0.001), and tremor volume was consistent with patient self-report of tremor presence when clinical observation failed to detect tremor (p = 0.040). In conclusion, wearable sensors can provide accurate and relevant information about the presence and severity of tremor in PD.

Subject terms: Diseases, Medical research, Neurology, Neuroscience

Introduction

Resting tremor is one of the core features of Parkinson’s disease (PD), affecting 65–75% of people1. It typically presents as an asymmetric 4–6 Hz hand tremor that lessens with movement1, but may reappear during walking or prolonged postures (re-emergent tremor)2–4. Most people with PD (PwPD) and resting tremor also have postural or action tremor, which occurs with movement or in certain postures4–6. Tremor may increase when performing cognitive tasks or when stressed, and may cause embarrassment, potentially interfering with social interactions7. The different components of tremor disrupt daily activities, cause emotional distress due to their visibility7, and affect the quality of life of PwPD8.

The mechanisms underlying PD tremor remain unclear, and treatment of resting tremor is challenging9. Dopaminergic therapy is the most effective medication, although patient response is variable10,11. Alternatives such as anticholinergics and propranolol offer limited benefit with notable side effects. For drug-resistant cases, options include deep brain stimulation, stereotactic radiosurgery, and focused ultrasound12–14.

Accurate measurement of tremor severity is essential for refining interventions15–20. Conventional approaches to assessing the presence and severity of tremor, namely questionnaires and clinical rating scales, face challenges21. The accuracy of self-reports is limited by patient compliance and recall bias, and may be compromised by cognitive impairment17. Common clinical rating scales, such as the Movement Disorders Society Unified Parkinson’s Disease Rating Scale (MDS-UPDRS), provide a snapshot in time, are influenced by situational stress, and lack sensitivity to real-time daily tremor variations22,23. Previous research has demonstrated correlations between wearable sensor-derived tremor amplitude and duration and clinical UPDRS rating scales23,24. However, these studies were conducted in controlled settings23, involved small sample sizes23, featured short assessment durations, and did not evaluate measurement error. Also, wearable-derived tremor amplitude and duration are often quantified independently, whereas combining these aspects into a single metric, i.e., tremor volume, may more accurately reflect an individual’s perceived tremor severity.

We sought to evaluate the added value of wearable sensor-derived tremor metrics (amplitude, duration, and volume) in PD, compared to traditional self-report and clinical assessments, by addressing the following questions: (1) How consistent are conventional self-report and clinical methods in identifying the presence of tremor in daily practice? (2) How reliable are wearable-derived tremor measures over multiple days, and what is the smallest detectable change (SDC)? (3) How well do wearable-based tremor measures agree with traditional self-report and clinical assessments? Collectively, the answers to these questions will help establish the reliability and validity of wearables as complementary or alternative tools for assessing tremor in PD.

Results

The analysis included data from 195 PwPD and 24 healthy controls (Table 1) from the Profiling Parkinson’s Disease (ProPark) observational cohort study. As part of the ProPark study, participants visited the hospital for a clinical evaluation. For 7 days following the clinical evaluation, wearable data were collected at home while self-report questionnaires were completed daily for 14 days following the clinical evaluation. Tremor amplitude, duration, and volume, i.e., the combination of duration and amplitude, were derived from the sensor data per day and averaged across days.

Table 1. Descriptives of study population

Characteristic                                          PwPD (N = 195)     Healthy controls (N = 24)
Age, yrs, mean (SD)                                     67.5 (7.2)         66.1 (8.4)
Male, N (%)                                             131 (67%)          11 (46%)
Time since diagnosis, yrs, median [IQR]a                3.4 [1.4, 5.9]     -
Time since onset motor symptoms, yrs, median [IQR]b     5 [3, 8]           -
UPDRS III sum score, median [IQR]                       20 [13, 30]        2 [0, 4]
Hoehn & Yahr stage, N (%)
  Stage 1                                               29 (14.9%)         -
  Stage 2                                               158 (81.0%)        -
  Stage 3                                               8 (4.1%)           -
MoCA                                                    28 [20, 30]        28 [25, 30]

PwPD People with Parkinson’s Disease, SD Standard Deviation, IQR Inter Quartile Range, UPDRS III Movement Disorder Society Unified Parkinson’s Disease Rating Scale, part III, MoCA Montreal Cognitive Assessment.

aThe date of diagnosis was obtained from the treating Neurologist. For 9 (4.6%) participants, the date of diagnosis could not be obtained from the neurologist and was self-reported. For 1 (0.5%) participant, the date of diagnosis was missing.

bFor 22 participants (11.3%), the time since onset of motor symptoms was missing.

Agreement between self-report and clinical assessment on the presence of tremor

In 73% of PwPD, self-reported tremor and clinical observation were in agreement (Fig. 1). In 16%, tremor was reported on at least one questionnaire, either the Wearing off 10 questionnaire (Q10)25 or MDS-UPDRS part II26, but was not observed during clinical evaluation. Conversely, 11% of PwPD did not self-report tremor, although it was observed during clinical assessment. Interrater reliability analysis27 between self-reported and clinically observed tremor indicated fair to moderate agreement (κ = 0.41, 95%CI = [0.27; 0.54]).
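The agreement statistic reported above can be illustrated with a minimal Python sketch of Cohen's kappa for two binary ratings. The function name `cohen_kappa` and the counts are hypothetical: the off-diagonal cells mirror the reported 16% and 11% disagreement, but the split of the agreeing 73% into "both" and "neither" is an assumption made only for illustration and is not taken from the ProPark data.

```python
def cohen_kappa(both, self_only, clinical_only, neither):
    """Cohen's kappa for two binary ratings summarised as 2x2 counts."""
    n = both + self_only + clinical_only + neither
    if n == 0:
        raise ValueError("the table contains no observations")
    # Observed agreement: both methods say "tremor" or both say "no tremor".
    observed = (both + neither) / n
    # Marginal proportion of "tremor present" for each method.
    p_self = (both + self_only) / n
    p_clinical = (both + clinical_only) / n
    # Agreement expected by chance from the marginals alone.
    expected = p_self * p_clinical + (1 - p_self) * (1 - p_clinical)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts per 100 participants (NOT the study data).
kappa = cohen_kappa(both=50, self_only=16, clinical_only=11, neither=23)
print(f"kappa = {kappa:.2f}")
```

Because kappa corrects for chance agreement, the same 73% raw agreement can yield different kappa values depending on how common tremor is in each rating.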

Fig. 1. Agreement between self-reported and clinically observed presence of tremor.

Wearable-derived tremor assessment: reliability and measurement error

To quantify the standard error of measurement (SEM) and assess the reliability of wearable-derived tremor metrics, inter-daily variance must be homogeneous across the observed range of tremor severity. As shown in Fig. 2A–C, variance between days increases with greater tremor severity. To address this, data were transformed to ensure more uniform variance across the full spectrum of tremor values. Relative reliability of wearable-derived tremor metrics was quantified using intraclass correlation coefficients (ICC), specifically ICC(2,k), based on a two-way random-effects, absolute agreement model. Relative reliability, which reflects inter-daily variance in relation to inter-subject variance, was excellent across all tremor metrics when averaging over three days (amplitude ICC(2,3) = 0.93, 95% CI = 0.91 to 0.94; duration ICC(2,3) = 0.98, 95% CI = 0.97 to 0.98; volume ICC(2,3) = 0.98, 95%CI = 0.97 to 0.98) (Fig. 2D). To further evaluate measurement reliability, the SDC was determined for each tremor metric using the transformed data. The SDC, also referred to as the minimum detectable change, indicates the smallest measurable change that can be considered independent of variance due to measurement error and natural fluctuations observed in the evaluation period. Since the SDC applies directly to the transformed data, but tremor duration is more interpretable in its original form, Fig. 2A–C present untransformed tremor metrics, with SDC values back-transformed to the original units. The SDC increases with the magnitude of the tremor metric and, when data was averaged over 6 days, ranged from 0.8 to 14.0°/s, 6.4 min to 2.2 h, and 6.7 to 4838.2° for amplitude, duration, and volume, respectively. All subsequent analyses are based on wearable sensor-derived tremor scores averaged over 6 days.
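As a rough illustration of the reliability coefficient used here, the following Python sketch computes ICC(2,k) from a participants-by-days table using the standard two-way ANOVA mean squares (the Shrout and Fleiss formulation). The function name `icc2k` and the example values are hypothetical; the study's own analysis was run in R on transformed data, so this is a sketch of the statistic rather than the authors' code.

```python
def icc2k(scores):
    """ICC(2,k): two-way random effects, absolute agreement, mean of k days.

    scores: list of rows, one row per participant, k daily values per row.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 participants and >= 2 days, no gaps")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants (rows), days (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic day effects count against reliability.
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical transformed tremor scores: 5 participants x 3 days.
example = [
    [2.1, 2.4, 2.0],
    [5.3, 5.0, 5.6],
    [0.4, 0.6, 0.5],
    [8.2, 7.7, 8.0],
    [3.9, 4.2, 3.8],
]
print(f"ICC(2,3) = {icc2k(example):.3f}")
```

Passing only the first one, two, ..., six columns of a real data table would give the sequence of ICC(2,k) values that a reliability-versus-days curve is built from.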

Fig. 2. Absolute and relative reliability of wearable-derived tremor metrics.

Subject-specific inter-daily means, based on 6 days, and day-to-day deviation from this mean of wearable-derived tremor A amplitude, B duration, and C volume. Dashed lines connect the 6 data points belonging to the same participant, each point representing one of the 6 days of assessment. Blue solid lines indicate the SDC based on the average of 1 to 6 days of wearable-derived tremor assessment. D Test–retest reliability based on the average of 1 to 6 days of wearable assessments of tremor amplitude, duration and volume. Ribbons represent the 95% CI of the ICC(2,k) values, where k is the number of days averaged, as shown on the horizontal axis. The dashed line represents the reliability (ICC(1,1)) of the MDS-UPDRS III item 17, the rest tremor amplitude item, across all subcomponents (i.e., upper limbs, lower limbs, lips and jaw)42. SDC Smallest Detectable Change, ICC Intraclass Correlation Coefficient, CI Confidence Interval, MDS-UPDRS III Movement Disorder Society Unified Parkinson’s Disease Rating Scale, part III.

Comparison of self-report, clinical observation and wearable-derived tremor evaluation

To assess the agreement between wearable-derived tremor scores, self-reported tremor, and clinical assessment, wearable data were dichotomized to indicate the presence or absence of tremor. Sensitivity and specificity were calculated using the subset of PwPD for whom self-report and clinical evaluation were in agreement (73% of the cohort). Among the wearable-derived metrics, tremor duration and volume best discriminated between the presence and absence of tremor, with an area under the curve (AUC) of 0.832 (95% CI: 0.767–0.896) for duration and 0.834 (95% CI: 0.768–0.901) for volume. Tremor amplitude performed worse (AUC = 0.631, 95% CI: 0.534–0.728) (Fig. 3). The optimal thresholds, determined by maximizing sensitivity and specificity, were 55 min per day for tremor duration and 100.4° for tremor volume.
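One common way to pick a cut-off that maximizes sensitivity and specificity together is Youden's J (sensitivity + specificity − 1). The text does not state which criterion was used, so the Python sketch below is an assumption about the procedure; the function name `best_threshold` and the example values are hypothetical.

```python
def best_threshold(values, has_tremor):
    """Return (cut-off, sensitivity, specificity) maximising Youden's J.

    values: wearable metric per participant (e.g. minutes of tremor per day).
    has_tremor: 1/True if the reference label says tremor is present, else 0.
    A participant is classified as "tremor" when value >= cut-off.
    """
    positives = sum(1 for y in has_tremor if y)
    negatives = len(has_tremor) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = None
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, has_tremor) if y and v >= cut)
        true_neg = sum(1 for v, y in zip(values, has_tremor) if not y and v < cut)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        # Keep the first (lowest) cut-off that reaches the highest J.
        if best is None or j > best[0]:
            best = (j, cut, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical daily tremor durations (min) and reference labels.
minutes = [5, 12, 30, 48, 60, 75, 120, 200]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
cut, sens, spec = best_threshold(minutes, labels)
print(cut, sens, spec)
```

Other criteria, such as the point closest to the top-left corner of the ROC curve, can select a different threshold from the same data.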

Fig. 3. Receiver operator curves.

Performance of wearable-derived tremor amplitude, duration, and volume in predicting the presence of tremor as indicated by both self-report and clinical observation.

Wearable-derived tremor amplitude was associated with clinically observed tremor amplitude (F(4,214) = 11.8, p < 0.001; see Fig. 4A). Post-hoc comparisons showed that individuals with moderate clinically observed tremor amplitude (UPDRS item score = 3) had significantly higher wearable-derived tremor amplitude than all other groups classified by clinical observation (all p < 0.001). Additionally, wearable-derived amplitude was significantly higher in PwPD with mild (UPDRS item score = 2) compared to no clinically observed tremor (UPDRS item score = 0) (p = 0.011). Within groups based on clinically observed tremor amplitude, no differences in wearable-derived tremor amplitude were observed between people who did and did not self-report tremor (both p > 0.05).

Fig. 4. Associations between wearable-derived, clinically observed and self-reported tremor.

A amplitude, B duration, C volume with clinically observed amplitude and D volume with clinically observed constancy. Amplitude and volume values are presented after being Box-Cox transformed. Duration is presented untransformed. All statistical analyses were performed on Box-Cox transformed data. Dashed line indicates the threshold to identify the presence of tremor based on the ROC analysis. Statistically significant comparison groups based on clinical assessment, with Tukey HSD correction, and between people self-reporting tremor or no tremor are indicated; *p < 0.05, **p < 0.01, ***p < 0.001; ROC Receiver Operator Curve, UPDRS III Movement Disorder Society Unified Parkinson’s Disease Rating Scale, part III.

Wearable-derived tremor duration was associated with clinically observed tremor constancy (F(5,213) = 38.3, p < 0.001; see Fig. 4B). Post-hoc comparisons showed that none of the subsequent groups based on clinical observation of tremor constancy had significantly different wearable-derived tremor duration (all p > 0.05), except for people with no (UPDRS item score = 0) compared to slight (UPDRS item score = 1) tremor constancy (difference between means: 11 min, [95%CI = 3 to 18 min], p < 0.001). Within PwPD with slight clinically observed tremor constancy (UPDRS item score = 1), wearable-derived tremor duration was higher in PwPD who self-reported tremor than in PwPD who did not self-report tremor (difference between means: 57.5 min, [95%CI = 26.3 to 99.8 min], p = 0.001). In PwPD with no clinically observed tremor constancy (UPDRS item score = 0), this difference was borderline statistically significant (difference between means: 10.1 min, [95% CI = –0.1 to 22.5 min], p = 0.053).

Wearable-derived tremor volume was associated with clinically observed tremor amplitude (F(4,214) = 54.6, p < 0.001; see Fig. 4C). Post-hoc comparisons showed that wearable-derived tremor volume was always higher when comparing groups with higher against lower clinically observed tremor amplitude. Groups with no (UPDRS item score = 0) and slight (UPDRS item score = 1) clinically observed tremor amplitude did not have statistically significant higher wearable-derived tremor volume than healthy controls. Within groups with no (UPDRS item score = 0, p = 0.040) and slight (UPDRS item score = 1, p = 0.004) clinically observed tremor amplitude, wearable-derived tremor volume was higher in PwPD who self-reported tremor than in PwPD who did not self-report tremor.

Wearable-derived tremor volume was associated with clinically observed tremor constancy (F(5,213) = 35.1, p < 0.001; see Fig. 4D). Post-hoc comparisons showed that none of the subsequent groups based on clinical observation of tremor constancy had significantly different wearable-derived tremor volume (all p > 0.05), except for people with slight (UPDRS item score = 1) compared to mild (UPDRS item score = 2) tremor constancy (p = 0.049). Within PwPD with no (UPDRS item score = 0, p = 0.029) and slight (UPDRS item score = 1, p = 0.045) clinically observed tremor constancy, wearable-derived tremor volume was higher in PwPD who self-reported tremor than in PwPD who did not self-report tremor.

Figure 5 shows average tremor duration over 6 days, along with daily variability, stratified by clinical tremor constancy. This figure highlights substantial differences in wearable-derived tremor duration within clinically defined groups (see Figure S1 for amplitude and volume).

Fig. 5. Resolution of wearable-derived tremor duration stratified by clinically observed tremor constancy (UPDRS part III, item 18) as indicated in the grey box.

Small black dots represent the non-transformed tremor metric derived on a given day. Large black dots represent the average of the non-transformed tremor metric over 5 days. Vertical bars represent the range observed over 5 days per participant. Data is ordered based on tremor volume and stratified based on the maximum score of the rest, postural or kinetic tremor items of the MDS-UPDRS III [0–4]. The dashed line indicates the threshold for detecting the presence of tremor based on the ROC analysis. The vertical axis is square root transformed; ROC Receiver Operator Curve, MDS-UPDRS III Movement Disorder Society Unified Parkinson’s Disease Rating Scale, part III.

Discussion

We found that a wrist-worn wearable device can reliably measure tremor in PD over a 3-day period, and that these measurements, specifically of tremor duration and volume, are associated with self-report and clinical assessments of tremor severity. Self-report and clinical assessments of tremor severity are in disagreement with each other in about a quarter of PwPD. In such cases, objective digital tremor assessments may provide additional insights to aid in determining the presence of tremor. In addition, the wearable-based tremor assessment adds information about the amplitude, duration, and volume of PD tremor severity at a higher resolution compared to self-report and clinical assessment.

The sensor-derived tremor metrics show excellent reliability when averaged over three or more days. Most modern sensors support up to six days of data collection, and compliance with wrist-worn devices is generally reasonable28,29. In this study, we recorded for seven days but obtained six full 24-h days for nearly all participants due to occasional technical issues or non-wear. Therefore, we recommend collecting at least four consecutive full days of data to ensure three usable days for analysis. We obtained the SDC for all three wearable-derived tremor metrics. SDC is useful for evaluating treatment effects or tremor progression and reflects between-day variability, which we and others have found to increase with average tremor severity30. This variability may stem from both device-related measurement error and natural day-to-day fluctuations, making the SDC severity-dependent rather than fixed. Figure 2A–C or formula 3 provide thresholds beyond which changes exceed expected variability. For instance, the SDC for the 6 day average tremor duration ranges from 6 min (at 4 min/day average) to 2.2 h (at 10.3 h/day average).

Wearable-derived tremor metrics have been validated in numerous studies using clinical rating scales, annotated datasets, or comparisons between patient groups (e.g., PwPD vs. healthy controls)31–34. Several of these studies also indicate their clinical utility, such as supporting diagnosis. At the same time, relatively few longitudinal studies or clinical trials have directly evaluated whether providing sensor-derived tremor information improves clinical care and quality of life. The limited evidence available suggests that wearables offer greater sensitivity to capture changes in tremor severity, resulting in quicker treatment adaptations and improved quality of care than conventional assessments35–37. To date, we found no studies quantifying day-to-day variance using reliability measures like SEM or SDC for continuous tremor monitoring. We recommend that future research include such evaluations to enable better algorithm comparison and improve clinical and trial applications by distinguishing true changes from random variation.

Of the three wearable-derived tremor metrics, volume best distinguished between levels of clinically observed tremor amplitude. Both wearable-derived volume and duration were able to distinguish between levels of clinically observed tremor constancy. As previously reported29, neither could differentiate between adjacent tremor constancy groups. This may reflect the higher precision of clinical ratings for amplitude compared to constancy. Based on our findings, tremor volume derived from wearables may be useful when the amplitude of the tremor is of interest. Wearable-derived duration may be simpler to interpret and has a similar association with clinically observed tremor constancy and test-retest reliability as wearable-derived volume.

To our knowledge, this is the first study to relate the resolution of wearable-derived metrics to that of clinical scales. Conventional clinical assessments rely on non-linear, ordinal scales with limited resolution21. As shown in Fig. 5 and A1, wearable metrics reveal clear differences between individuals within these clinical categories. This higher resolution could enable earlier detection of treatment effects and reduce sample sizes in clinical trials without compromising statistical power.

Supervised learning is increasingly used to process wearable sensor data. Evers et al. and Mahadevan et al. provide overviews of the variety of devices, locations and methods used29,32. However, obtaining labeled datasets for model training is time-consuming and expensive, typically involving fewer than 50 participants. Given the heterogeneity of tremor expression32, limited understanding of contextual influences, and a lack of replication studies due to the need for new labeled data, model generalizability remains uncertain.

We evaluated a heuristic tremor detection model with a pre-defined threshold optimized for sensitivity and specificity23. Heuristic models, like the one from Salarian et al. used here, are simple to implement and this particular algorithm has been applied in at least two prior studies23,29. However, differences in study design hinder direct comparison. Our model showed lower sensitivity and specificity than in Salarian’s original study23. Their population sample of PwPD was more severely affected and more homogenous than ours. Our sample aligns more closely with the cohort used by Mahadevan et al.29, who adapted the same algorithm using machine learning and found similar associations between wearable-derived tremor metrics and UPDRS scores.

Consistent with prior studies38, 27% of PwPD had self-reported tremor that did not match clinical assessments. Post-hoc analysis revealed these individuals were generally older. Such discrepancies may stem from the momentary nature of clinical exams, where tremor can be suppressed by medication or affected by stress7. Additionally, unsupervised questionnaires may have led to misunderstandings, e.g., confusing tremor with dyskinesia39, or participants may not notice mild or non-disruptive tremors7. Figure 4 supports this, showing mismatches only in cases with low clinically assessed tremor scores. While the clinical importance of subtle tremors is debatable, particularly regarding whether they should be treated, their identification may still be valuable in research.

In this study, tremors detected by wearable sensors were not directly observed. Ideally, algorithm performance would be evaluated using labeled data collected in free-living or simulated conditions. However, such datasets are difficult to obtain and are rarely validated externally29,32. Although we did not include labeled data, the algorithm we used was originally developed by Salarian et al., who specifically examined the ability to discriminate tremor from other movements. They demonstrated excellent sensitivity and specificity (>0.9) in DBS patients under standardized conditions based on labeled data. This suggests that, while some misclassifications will have occurred in our study, the proportion of non-tremor movements that were incorrectly classified as tremor was likely low.

Salarian et al. also evaluated the algorithm under unstandardized free-living conditions in a hospital. In these conditions, tremor amplitude from wearables still correlated strongly with UPDRS scores (partial r = 0.81), adjusted for DBS status (ON/OFF)23. Our more heterogeneous sample, which included patients without clinically observed tremor and did not correct for medication status, likely explains the weaker associations seen in our study.

Medication state during clinical assessments was not standardized: 4 PwPD (2%) were assessed in OFF, and 13 (7%) were unmedicated for PD. While standardization is ideal, it is often impractical in clinical settings. Therefore, our results may better reflect real-world clinical consultations than if medication state had been controlled.

The results presented about the resolution and reliability, and in prior studies about bradykinesia evaluations36, suggest that changes in tremor severity may be detected earlier with wearable devices than with conventional clinical assessments. Longitudinal studies are needed that evaluate this hypothesis. Additionally, few have assessed whether detailed tremor monitoring improves health outcomes37. Health outcomes should be evaluated in randomized clinical trials in which wearable devices are added to standard care.

In conclusion, our study shows only moderate agreement between clinically observed and self-reported tremor. Wearable-based tremor assessment is consistent with clinical tremor assessment, especially for tremor volume and duration. Particularly for PwPD who report tremor that is not clinically observed, wearable sensors provide valuable insight into tremor severity. Wearable sensors provide accurate and relevant information about the presence and severity of tremor in PD, which may ultimately improve the clinical management of PD tremor and be relevant for symptom monitoring in clinical trials.

Methods

Study population

Data from PD patients and healthy controls participating in the Dutch multicenter Profiling Parkinson’s Disease (ProPark) observational cohort study were used. Participating centers included Leiden University Medical Center, Amsterdam University Medical Center, Erasmus University Medical Center, and Meander Medical Center, covering both university and community hospitals. Recruitment occurred via neurologist referrals, advertisements (magazines, social media, flyers, posters), and the Dutch Parkinson Patient Association. Inclusion criteria for PD patients were a neurologist-confirmed PD diagnosis per MDS clinical criteria, disease duration ≤15 years, age ≥18, and Dutch language proficiency. Exclusion criteria included current advanced therapy use (i.e., levodopa intestinal gel, apomorphine, or deep brain stimulation), comorbidities affecting the assessment of Parkinson symptoms, a MoCA score ≤16, or refusal to be informed of unexpected medical findings. Healthy controls aged ≥18 were included if they had no self-reported neurological conditions or other issues interfering with PD symptom evaluation. The Medical Ethics Committee of Amsterdam UMC approved the study (2019.515), and, in accordance with the Helsinki Declaration, written informed consent was obtained from all participants.

Protocol

As part of the ProPark study, participants underwent a hospital visit for clinical evaluation, including MDS-UPDRS Parts III and IV and Hoehn and Yahr (H&Y) staging. Wearable data were then collected at home for 7 days, and a self-report questionnaire was completed within 14 days after the clinical evaluation.

Self-reported assessment of tremor

Self-reported tremor was assessed using responses from the MDS-UPDRS Part II, Item 2.10, and the Wearing Off Questionnaire (Q10), Item 1 (refs. 25,26). The MDS-UPDRS Item 2.10 evaluates tremor presence over the past week on a 5-point scale: 0 = none; 1 = some tremor, but it does not affect daily activities; 2 = tremor affects some activities; 3 = tremor affects many daily activities; and 4 = tremor affects most or all daily activities. The Q10 Item 1 asks whether tremor is experienced on an average day. Participants were classified as having tremor if they scored ≥1 on MDS-UPDRS Item 2.10 or reported tremor presence in Q10 Item 1.

Clinical assessment of tremor

Tremor amplitude was clinically assessed using the MDS-UPDRS Part III tremor sub-items: 15 (postural), 16 (kinetic), and 17 (rest), corresponding to the upper limb on which the sensor was worn. For example, if the sensor was worn on the right wrist, only the MDS-UPDRS-III sub-items for the right hand were included26. Each item was scored on a scale from 0 to 4, with higher scores indicating greater tremor amplitude. Additionally, tremor constancy was evaluated through the MDS-UPDRS-III tremor constancy item 18 on a scale from 0 to 4, with higher scores indicating more tremor presence during the clinical evaluation.

Wearable-based assessment of tremor

Participants wore a Newcastle AX6 wearable movement sensor (AX6, Axivity Ltd, Newcastle upon Tyne, UK) continuously for one week. The sensor was placed on the wrist of the most affected arm, based on upper extremity scores from the MDS-UPDRS, or on the non-dominant side for healthy controls and PwPD with equally affected sides. Sensors were configured to continuously record acceleration and angular velocity at 100 Hz within ranges of ±8g and ±2000°/s. Gyroscope signals from the inertial measurement unit were processed in Python 3.11 according to methods described in Salarian et al. to determine the mean tremor amplitude and total duration of tremor for each day of assessment23. Essentially, this method identifies tremor episodes based on the power of the preprocessed gyroscope signal within the 3.5 to 7.5 Hz frequency band (see Data Availability for the Python implementation). Extending the approach of Salarian et al., we introduced tremor volume as a measure that includes both amplitude and duration. It was defined as cumulative angular displacement (CAD), calculated for each non-isolated 3-s window where tremor was detected using the relationship between root-mean-square (RMS) angular velocity and total angular path length (CAD = RMS_velocity × 4√2 × window_duration). The factor 4√2 accounts for both the RMS-to-peak conversion (√2) and the total path traversed in sinusoidal motion (4 times peak amplitude). CAD was calculated for each axis (x, y, z) and combined as a vector magnitude to create a three-dimensional metric. This approach provides an intuitive quantification of tremor severity as the total angular distance traveled by the limb, allowing for standardized comparison across participants. For each recorded day, we calculated the mean tremor amplitude as well as the total tremor duration and total tremor volume, based on all three-second tremor episodes identified that day.
Participants with fewer than six complete 24-h days of wearable sensor recordings were excluded from the analysis.
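The tremor volume definition above can be sketched in Python as follows. This is a minimal illustration of the stated CAD formula only: it assumes tremor windows have already been detected, and the names `window_cad` and `daily_volume` as well as the synthetic signal are hypothetical and not taken from the study's implementation.

```python
import math

WINDOW_S = 3.0                   # window length stated in the text (seconds)
PATH_FACTOR = 4 * math.sqrt(2)   # factor 4*sqrt(2) stated in the text


def rms(samples):
    """Root-mean-square of a sequence of angular velocity samples (deg/s)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def window_cad(gx, gy, gz, window_s=WINDOW_S):
    """CAD of one detected tremor window, combined over the three axes."""
    # Per-axis CAD = RMS velocity x 4*sqrt(2) x window duration.
    per_axis = [rms(axis) * PATH_FACTOR * window_s for axis in (gx, gy, gz)]
    # Combine the per-axis values as a vector magnitude.
    return math.sqrt(sum(c * c for c in per_axis))


def daily_volume(tremor_windows):
    """Total tremor volume for a day: sum of CAD over all tremor windows."""
    return sum(window_cad(gx, gy, gz) for gx, gy, gz in tremor_windows)


# Synthetic example: a 5 Hz oscillation with 10 deg/s peak on the x axis,
# sampled at 100 Hz for one 3-s window; y and z are silent.
fs, freq, peak = 100, 5.0, 10.0
gx = [peak * math.sin(2 * math.pi * freq * i / fs) for i in range(300)]
gy = [0.0] * 300
gz = [0.0] * 300
print(f"CAD for one window: {window_cad(gx, gy, gz):.1f}")
```

Because the per-window values are summed over the day, the metric grows both with how strongly and with how long the wrist oscillates, which is the intended combination of amplitude and duration.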

Data analysis

Further analyses were performed in R (R Core Team (2023), Vienna, Austria). Descriptive statistics (mean, SD, and frequencies) were used to characterize age, sex distribution, disease duration, and total MDS-UPDRS III scores. Agreement between self-reported and clinically observed tremor was assessed using Cohen’s kappa, with interpretation as poor (0–0.4), fair to good (0.4–0.75), and excellent (0.75–1)40. To determine test–retest reliability, six Intraclass Correlation Coefficients (ICC) were calculated for each of the three wearable sensor-derived tremor features (i.e., duration, amplitude and volume). ICC values were classified as poor (<0.5), moderate (0.5–0.75), good (0.75–0.9), and excellent (>0.9). ICCs were computed using a two-way random-effects absolute agreement model for the average of k measurements (ICC(2,k)), where k is the number of days averaged, ranging from 1 to 6. Before calculating ICCs, data were transformed to ensure homogeneity of variance across days using Box-Cox transformations (formula 4) (λ from –2 to 2, in 0.01 steps). Transformations were selected by minimizing the correlation between the 6-day average and absolute daily deviations, alongside visual inspection of variance distributions. Homogeneity of the inter-daily variance distribution after transformation was confirmed using Levene’s test. To this end, data were divided into four groups of equal size based on the average over 6 days of the wearable-derived tremor severity score, and the inter-daily variance was compared between the groups. SEM was calculated using the following formula, assuming no systematic differences between days41

$$\mathrm{SEM} = SD_{\mathrm{pooled}} \sqrt{1 - \mathrm{ICC}(2,k)} \tag{1}$$

From the SEM, the Smallest Detectable Change (SDC) was determined by41

$$\mathrm{SDC} = 1.96 \times \mathrm{SEM} \times \sqrt{2} \qquad (2)$$
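The SEM and SDC formulas above, together with an ICC(2,k) estimate, can be sketched in a few lines. This is a minimal sketch, not the authors' R analysis: it uses the Shrout and Fleiss mean-square form of ICC(2,k), the function names are hypothetical, and `sd_pooled` is passed in as an argument because its computation is not described in this section.

```python
import math


def icc_2k(data):
    """ICC(2,k) for the mean of k measurements (absolute agreement).

    `data` holds one row per participant and one column per day.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-participant mean square
    ms_cols = ss_cols / (k - 1)               # between-day mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


def sem(sd_pooled, icc):
    """Standard error of measurement: SD_pooled * sqrt(1 - ICC)."""
    return sd_pooled * math.sqrt(1.0 - icc)


def sdc(sem_value):
    """Smallest detectable change: 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem_value * math.sqrt(2.0)


if __name__ == "__main__":
    # Hypothetical transformed tremor scores: 4 participants x 3 days.
    scores = [[2.1, 2.3, 2.0], [4.0, 3.8, 4.1], [1.2, 1.0, 1.3], [3.1, 3.3, 3.0]]
    icc = icc_2k(scores)
    print(icc, sdc(sem(sd_pooled=1.0, icc=icc)))
```

Repeating the ICC calculation on the first k columns for k = 1 to 6 would mirror the increasing-number-of-days analysis described above, although k = 1 needs the single-measurement form of the coefficient, which this sketch does not cover.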

The SDC in untransformed units, given the transformed average wearable-derived tremor severity over k days, $A_t$, is obtained from

$$\mathrm{SDC} = \left( (A_t + \mathrm{SDC}_t)\,\lambda + 1 \right)^{1/\lambda} \qquad (3)$$

where $\mathrm{SDC}_t$ is the SDC in transformed units associated with the measured value $A_t$; it equals 0.74 (°/s)^0.27, 7.2 s^0.34, and 1.5 [unit]^0.18 for amplitude, duration, and volume, respectively, when $A_t$ is the average over 6 days. λ is 0.27, 0.34, and 0.18 for amplitude, duration, and volume, respectively (see supplementary material B for the $\mathrm{SDC}_t$ when $A_t$ is the average over fewer than 6 days). $A_t$ can be obtained with the following formula:

$$A_t = \frac{\sum_{i=1}^{n} \frac{T_i^{\lambda} - 1}{\lambda}}{n} \qquad (4)$$

where $T_i$ is the tremor metric observed on day i and n is the number of days.
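The transformation and back-transformation steps described above can be illustrated with a short sketch. The function names are hypothetical, the daily values in the usage example are made up, and the λ = 0 branch follows the usual Box-Cox convention (natural logarithm) rather than anything stated in the text; only λ = 0.27 and the transformed SDC of 0.74 for amplitude come from the section itself.

```python
import math


def box_cox(value, lam):
    """Box-Cox transform of a positive value."""
    if lam == 0:
        return math.log(value)  # conventional limit for lambda = 0
    return (value ** lam - 1.0) / lam


def inverse_box_cox(value_t, lam):
    """Map a transformed value back to the original units."""
    if lam == 0:
        return math.exp(value_t)
    return (value_t * lam + 1.0) ** (1.0 / lam)


def transformed_average(daily_values, lam):
    """Average of the Box-Cox transformed daily tremor metrics (A_t)."""
    return sum(box_cox(v, lam) for v in daily_values) / len(daily_values)


def back_transformed_threshold(a_t, sdc_t, lam):
    """Back-transform A_t + SDC_t, as in the untransformed-units formula."""
    return inverse_box_cox(a_t + sdc_t, lam)


if __name__ == "__main__":
    # Hypothetical daily mean tremor amplitudes (deg/s) over six days.
    amplitudes = [12.0, 15.5, 11.2, 14.1, 13.3, 12.8]
    lam = 0.27      # lambda reported for amplitude
    sdc_t = 0.74    # transformed SDC reported for the 6-day average
    a_t = transformed_average(amplitudes, lam)
    print(inverse_box_cox(a_t, lam), back_transformed_threshold(a_t, sdc_t, lam))
```

Because the transform is non-linear, the same transformed SDC corresponds to a larger change in original units for participants with more severe tremor.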

Associations between wearable-derived tremor metrics and clinically observed tremor were examined via one-way ANOVA, with wearable metrics as dependent variables and clinical evaluations (including healthy controls) as the independent variable. Post-hoc independent t-tests with Tukey correction identified group differences. Additionally, within PwPD without clinically observed tremor or with slight tremor (UPDRS = 1), independent t-tests compared wearable-derived scores between those who self-reported tremor and those who did not.
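For readers less familiar with the group comparison described above, the sketch below computes the one-way ANOVA F statistic from first principles. It is an illustration only: the authors ran their analysis in R, the function name and the toy data are hypothetical, and the p-value and Tukey-corrected post-hoc comparisons (which need the F and studentized range distributions from a statistics library) are deliberately left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for v in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical transformed tremor metric, grouped by clinical tremor score.
    by_clinical_score = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(by_clinical_score))
```

Here each inner list would hold the wearable-derived metric for participants sharing one clinical evaluation category, with healthy controls as a further group.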

Supplementary information

Supplementary Information (358.1KB, pdf)

Acknowledgements

The authors would like to thank the participants of the ProPark study for their invaluable contribution. This work was carried out by the ProPark consortium, and we would like to extend our gratitude to all members of the consortium: J.J.v.H., W.D.J.v.d.B., D.H.H., M.J.T.R., M.F.C., H.W.B., O.A.v.d.H., R.M.d.B., A.J.W.B., I.M.E.A., M.M.P., L.J.d.S., S.C., R.H.A.W., J.v.W. We also wish to express our deepest gratitude to the research associates and patient researchers for their contributions to ProPark. This study was funded by ZonMW 460001001, Stichting Woelse Waard, Stichting Alkemade-Keuls and Hersenstichting. These funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript. OccamzRazor provided an in-kind contribution through data analyses. CHDR provided the wearable sensors used in the ProPark study.

Author contributions

D.H., K.O., and R.W. co-wrote the original draft of the article. R.W. designed and performed the statistical analyses, supplemented by K.O. and advised by D.H., J.v.H., and V.E. A.S. and R.W. extracted tremor features from the raw gyroscope signals. All authors participated in the discussion of the results and critically reviewed and approved the final paper. D.H., J.v.H., R.W., and V.E. were involved in the design of the ProPark study from which data were obtained for the analyses. D.H. and K.O. contributed equally to the article.

Data availability

The dataset analyzed in the current study is available from the ProPark consortium on reasonable request: propark@amsterdamumc.nl. The Python implementation for detection and quantification of tremor severity, central to the paper, is available from https://gist.github.com/aronszanto/7b383c2fdf47cfbdfc78248378e9dc9a.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: D. H. Hepp, K. Ocran.

Supplementary information

The online version contains supplementary material available at 10.1038/s41531-025-01163-0.

References

  • 1. Gigante, A. F. et al. Rest tremor in Parkinson’s disease: body distribution and time of appearance. J. Neurol. Sci. 375, 215–219 (2017).
  • 2. Gupta, H. V. Re-emergent kinetic tremor in Parkinson’s disease. Tremor Other Hyperkinet. Mov. 9, 10.7916/tohm.v0.660 (2019).
  • 3. Jankovic, J., Schwartz, K. S. & Ondo, W. Re-emergent tremor of Parkinson’s disease. J. Neurol. Neurosurg. Psychiatry 67, 646–650 (1999).
  • 4. Bhatia, K. P. et al. Consensus Statement on the classification of tremors. from the task force on tremor of the International Parkinson and Movement Disorder Society. Mov. Disord. 33, 75–87 (2018).
  • 5. Crawford, P. & Zimmerman, E. E. Differentiation and diagnosis of tremor. Am. Fam. Physician 83, 697–702 (2011).
  • 6. van den Berg, K. R. E., Johansson, M. E., Dirkx, M. F., Bloem, B. R. & Helmich, R. C. Changes in action tremor in Parkinson’s disease over time: clinical and neuroimaging correlates. Mov. Disord. 40, 292–304 (2025).
  • 7. Heusinkveld, L. E., Hacker, M. L., Turchan, M., Davis, T. L. & Charles, D. Impact of tremor on patients with early stage Parkinson’s disease. Front. Neurol. 9, 628 (2018).
  • 8. Kilinc, B., Cetisli-Korkmaz, N., Bir, L. S., Marangoz, A. D. & Senol, H. The quality of life in individuals with Parkinson’s Disease: is it related to functionality and tremor severity? A cross-sectional study. Physiother. Theory Pract. 40, 2213–2222 (2024).
  • 9. Pirker, W., Katzenschlager, R., Hallett, M. & Poewe, W. Pharmacological treatment of tremor in Parkinson’s Disease revisited. J. Parkinson's Dis. 13, 127–144 (2023).
  • 10. Dirkx, M. F. et al. Dopamine controls Parkinson’s tremor by inhibiting the cerebellar thalamus. Brain 140, 721–734 (2017).
  • 11. Zach, H. et al. Dopamine-responsive and dopamine-resistant resting tremor in Parkinson’s disease. Neurology 95, e1461–e1470 (2020).
  • 12. Abusrair, A. H., Elsekaily, W. & Bohlega, S. Tremor in Parkinson’s disease: from pathophysiology to advanced therapies. Tremor Other Hyperkinet. Mov. 12, 29 (2022).
  • 13. Pérez-Sánchez, J. R. et al. Gamma Knife® stereotactic radiosurgery as a treatment for essential and parkinsonian tremor: long-term experience. Neurologia 38, 188–196 (2023).
  • 14. Bond, A. E. et al. Safety and efficacy of focused ultrasound thalamotomy for patients with medication-refractory, tremor-dominant parkinson disease: a randomized clinical trial. JAMA Neurol. 74, 1412–1418 (2017).
  • 15. Collaborators, G. B. D. P. S. D. Global, regional, and national burden of Parkinson’s disease, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol. 17, 939–953 (2018).
  • 16. van der Gaag, B. L. et al. Risk factors for Parkinson’s disease: possibilities for prevention and intervention. Ned. Tijdschr. Geneeskd. 167 (2023).
  • 17. de Lau, L. M. & Breteler, M. M. Epidemiology of Parkinson’s disease. Lancet Neurol. 5, 525–535 (2006).
  • 18. Kwon, S. H., Park, J. K. & Koh, Y. H. A systematic review and meta-analysis on the effect of virtual reality-based rehabilitation for people with Parkinson’s disease. J. Neuroeng. Rehabil. 20, 94 (2023).
  • 19. Rigas, G. et al. Assessment of tremor activity in Parkinson’s disease using a set of wearable sensors. IEEE Trans. Inf. Technol. Biomed. 16, 478–487 (2012).
  • 20. Guerra, A., D’Onofrio, V., Ferreri, F., Bologna, M. & Antonini, A. Objective measurement versus clinician-based assessment for Parkinson’s disease. Expert Rev. Neurother. 1–14 (2023).
  • 21. Hendricks, R. M. & Khasawneh, M. T. An investigation into the use and meaning of Parkinson's Disease Clinical Scale Scores. Parkinson's Dis. 2021, 1765220 (2021).
  • 22. Pasquini, J. et al. The clinical profile of tremor in Parkinson’s disease. Mov. Disord. Clin. Pract. 10, 1496–1506 (2023).
  • 23. Salarian, A. et al. Quantification of tremor and bradykinesia in Parkinson’s disease using a novel ambulatory monitoring system. IEEE Trans. Biomed. Eng. 54, 313–322 (2007).
  • 24. Sigcha, L. et al. Automatic resting tremor assessment in Parkinson’s Disease using smartwatches and multitask convolutional neural networks. Sensors 21, 10.3390/s21010291 (2021).
  • 25. Martinez-Martin, P. & Hernandez, B. The Q10 questionnaire for detection of wearing-off phenomena in Parkinson’s disease. Parkinsonism Relat. Disord. 18, 382–385 (2012).
  • 26. Goetz, C. G. et al. Movement Disorder Society-sponsored revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov. Disord. 23, 2129–2170 (2008).
  • 27. Landis, J. R. & Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977).
  • 28. Silva de Lima, A. L. et al. Feasibility of large-scale deployment of multiple wearable sensors in Parkinson’s disease. PLoS One 12, e0189161 (2017).
  • 29. Mahadevan, N. et al. Development of digital biomarkers for resting tremor and bradykinesia using a wrist-worn wearable device. NPJ Digit. Med. 3, 5 (2020).
  • 30. Burq, M. et al. Virtual exam for Parkinson’s disease enables frequent and reliable remote measurements of motor function. NPJ Digit. Med. 5, 65 (2022).
  • 31. Battista, L. & Romaniello, A. A new wrist-worn tool supporting the diagnosis of Parkinsonian motor syndromes. Sensors 24, 10.3390/s24061965 (2024).
  • 32. Evers, L. J. W. et al. Passive monitoring of Parkinson tremor in daily life: a prototypical network approach. Sensors 25, 366 (2025).
  • 33. Fay-Karmon, T. et al. Home-based monitoring of persons with advanced Parkinson’s disease using smartwatch-smartphone technology. Sci. Rep. 14, 9 (2024).
  • 34. Saad, M. et al. Development of a tremor detection algorithm for use in an academic movement disorders center. Sensors 24, 4960 (2024).
  • 35. Lipsmeier, F. et al. Evaluation of smartphone-based testing to generate exploratory outcome measures in a phase 1 Parkinson’s disease clinical trial. Mov. Disord. 33, 1287–1297 (2018).
  • 36. Czech, M. D. et al. Improved measurement of disease progression in people living with early Parkinson’s disease using digital health technologies. Commun. Med. 4, 49 (2024).
  • 37. Farzanehfar, P., Woodrow, H. & Horne, M. Sensor measurements can characterize fluctuations and wearing off in parkinson’s disease and guide therapy to improve motor, non-motor and quality of life scores. Front. Aging Neurosci. 14, 852992 (2022).
  • 38. Davidson, M. The interpretation of diagnostic tests: a primer for physiotherapists. Aust. J. Physiother. 48, 227–232 (2002).
  • 39. Goetz, C. G., Stebbins, G. T., Blasucci, L. M. & Grobman, M. S. Efficacy of a patient-training videotape on motor fluctuations for on-off diaries in parkinson’s disease. Mov. Disord. 12, 1039–1041 (1997).
  • 40. Fleiss, J. L. Statistical Methods for Rates and Proportions. 2nd edition (Wiley, 1981).
  • 41. De Vet, H. C. W., Terwee, C. B., Mokkink, L. B. & Knol, D. L. Measurement in Medicine: A Practical Guide (Cambridge University Press, 2011).
  • 42. Siderowf, A. et al. Test-retest reliability of the unified Parkinson’s disease rating scale in patients with early Parkinson’s disease: results from a multicenter clinical trial. Mov. Disord. 17, 758–763 (2002).


Articles from NPJ Parkinson's Disease are provided here courtesy of Nature Publishing Group