[Preprint]. 2025 Sep 5:2025.09.03.674008. [Version 1] doi: 10.1101/2025.09.03.674008

Figure 2:

Multimodal comparison of autistic SMM across pose landmarks and accelerometer sensors. Columns represent distinct behaviors: left, rocking involving torso motion; right, flapping involving right-wrist motion. Each row corresponds to a different data modality. (a, b): Video frames with MediaPipe skeleton overlays. Arrows indicate the direction of movement at a representative moment. Green dots denote the position and movement of a MediaPipe landmark, corresponding to matched timepoints in the time-series plots. Purple dots indicate the position and movement of the accelerometer device at the same matched timepoints. (c, d): 2D trajectories of MediaPipe landmarks: the chest (c), estimated as the midpoint of the shoulders and head, and the right wrist (d). The two green dots on each plot correspond to the same timepoints as in the video frames above. (e, f): Approximated accelerations computed via second-order central differences, as defined in Equation 1. These estimates are sensitive to tracking noise and may not reflect absolute physical acceleration. (g, h): Tri-axial accelerometer signals from wearable sensors placed on the torso (g) and right wrist (h). All signals are consistently colored by axis: X (blue), Y (orange), Z (green). Purple dots correspond to the same timepoints as in the video frames, illustrating temporal alignment across sensing modalities. Despite differences in sensing methods, coordinate frames, and signal units, periodic SMM is consistently captured. This visual correspondence supports using video-based pose estimation as a proxy for wearable inertial sensing in quantifying repetitive motor behavior.
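The second-order central-difference approximation referenced above (Equation 1 in the paper) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, array shapes, and uniform-sampling assumption are ours, and the paper's exact normalization may differ.

```python
import numpy as np

def central_difference_acceleration(positions, dt):
    """Approximate acceleration of a tracked landmark via second-order
    central differences: a[t] ~ (x[t+1] - 2 x[t] + x[t-1]) / dt**2.

    positions : (T, D) array of landmark coordinates sampled at a
                uniform interval dt (e.g. D=2 for MediaPipe 2D points).
    Returns a (T-2, D) array; the first and last frames are dropped
    because the stencil needs one neighbor on each side.
    """
    positions = np.asarray(positions, dtype=float)
    return (positions[2:] - 2.0 * positions[1:-1] + positions[:-2]) / dt**2
```

Because the stencil differences already-noisy pose estimates twice, small tracking jitter is amplified by 1/dt², which is why the caption notes these estimates are sensitive to noise and are best read as relative, not absolute, acceleration.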