PLOS Digital Health. 2024 Nov 26;3(11):e0000668. doi: 10.1371/journal.pdig.0000668

Deep learning-based screening for locomotive syndrome using single-camera walking video: Development and validation study

Junichi Kushioka 1,2,3,#, Satoru Tada 1,4,5,*,#, Noriko Takemura 1,6,#, Taku Fujimoto 7, Hajime Nagahara 1,8, Masahiko Onoe 9, Keiko Yamada 10,11, Rodrigo Navarro-Ramirez 12, Takenori Oda 13, Hideki Mochizuki 5, Ken Nakata 14, Seiji Okada 3, Yu Moriguchi 3,#
Editor: Ismini Lourentzou 15
PMCID: PMC11593753  PMID: 39591393

Abstract

Locomotive Syndrome (LS) is defined by decreased walking and standing abilities due to musculoskeletal issues. Early diagnosis is vital as LS can be reversed with appropriate intervention. Although diagnosing LS using standardized charts is straightforward, the labor-intensive and time-consuming nature of the process limits its widespread implementation. To address this, we introduced a Deep Learning (DL)-based computer vision model that employs OpenPose for pose estimation and MS-G3D for spatial-temporal graph analysis. This model objectively assesses gait patterns from single-camera video, offering a novel and efficient method for LS prediction and analysis. Our model was trained and validated using a dataset of 186 walking videos, plus 65 additional videos for external validation. In internal cross-validation, the model achieved an average sensitivity of 0.86, demonstrating high effectiveness in identifying individuals with LS. Its positive predictive value was 0.85, and its overall accuracy was 0.77. External validation using an independent dataset yielded an Area Under the Curve of 0.75, supporting the model’s generalizability. Although the model accurately identified LS cases, it was less precise in identifying non-LS cases. This study is the first to diagnose LS using computer vision-based pose estimation. Our accessible, non-invasive model can screen for LS using only visual assessment, in place of the labor-intensive LS tests, streamlining LS detection and expediting treatment initiation. This can improve patient outcomes and marks an advance in digital health, addressing key challenges in the management and care of LS.

Author summary

Locomotive syndrome (LS) is a condition in which problems with bones, joints, muscles, and nerves cause a decline in the ability to walk and stand. It is estimated that more than 45 million people in Japan have LS. Early detection is vital because LS can be reversed with early treatment. Detecting LS with the widely used diagnostic criteria is easy but labor-intensive and time-consuming, so screening is not yet widespread. To solve this problem, we developed an artificial intelligence model that detects LS from gait videos. Our model performed as well as or better than orthopedic surgeons in diagnostic accuracy (72% for the model vs. 52% on average for the visual diagnoses of six orthopedic surgeons), but it often classified non-LS cases as LS. This non-invasive artificial intelligence model serves as an accurate and simple diagnostic tool for the LS examination, thereby accelerating the timing of behavioral change and treatment intervention. Our model will significantly improve patients’ quality of life and enhance the management and care of LS.

Introduction

Locomotive Syndrome (LS) is defined by decreased walking and standing abilities due to problems in musculoskeletal structures, including bones, joints, muscles, and nerves [1]. This decline in musculoskeletal and neurological function significantly impacts daily life activities and independence [2], and the mean prevalence of LS was reported to be 69.8% among the Japanese population [3]. LS is increasingly recognized as a major public health concern due to its impact on reducing physical mobility and function [4]. This condition is prevalent among the elderly and those leading sedentary lifestyles [5,6] and appears earlier in life than frailty [7,8]. When LS progresses and the decline in physical ability becomes noticeable with symptoms, it is considered physical frailty [7,8]. The stage corresponding to this physical frailty can be described as “LS Stage 3,” where the decrease in mobility function hinders social participation [9]. Importantly, a systematic review found that the prevalence of physical frailty is estimated to be 12% in the global population aged over 50 years [10]. Unaddressed, LS can lead to reduced quality of life, higher medical costs, and a greater risk of falls and injuries, placing a significant strain on individuals and healthcare systems worldwide [11].

Management strategies for LS range from pharmacological treatments and surgical interventions for associated musculoskeletal disorders to physical rehabilitation aimed at improving muscle strength and balance [1]. Additionally, addressing symptoms such as pain and numbness, along with correcting nutritional imbalances, forms part of the comprehensive approach to LS treatment [1]. LS is notable for its potential reversibility with appropriate intervention; even conditions associated with late-stage LS may be reversible, underlining the importance of prompt and accurate diagnosis [12]. Although diagnosing LS by following standardized charts is straightforward, and simpler than using the general Short Physical Performance Battery [13], it requires subjective patient self-report and clinical evaluations by healthcare professionals [14]. This labor-intensive and time-consuming process leads to a gap in routine clinical diagnosis, preventing its wide implementation. Consequently, there is a growing demand in the medical field for an automated, objective, and cost-effective tool that could improve the efficiency of LS screening and diagnosis and reduce the reliance on manual processes.

Recent progress in deep learning (DL) within the field of computer vision presents new strategies for overcoming diagnostic challenges [15]. Motion analysis, which previously required attaching numerous sensors for full motion capture, has become more convenient through pose estimation models applied to recorded video footage [16]. Innovations in this area have shown that computer vision systems can effectively authenticate individuals based on their walking patterns [17,18]. Moreover, these systems have broadened their utility by estimating age and fatigue levels by analyzing walking videos [19–21]. This technological progress can potentially revolutionize the detection and assessment of human movement disorders, including LS [16].

This study aimed to develop and validate a DL-based computer vision model that identifies LS from walking videos recorded with a single camera. By offering an accessible, non-invasive model capable of rapidly screening for LS through visual assessment alone, in place of the labor-intensive LS tests, we sought to streamline LS detection and accelerate the initiation of treatment.

Results

Demographics

Table 1 presents the baseline characteristics of the study participants. In the model creation group, out of 66 participants, 42 (63.6%) were female. The median age for this group was 70 years. In terms of LS classification, the distribution was as follows: 24 participants were identified with stage-3 LS, 9 with stage-2 LS, 15 with stage-1 LS, and 18 were determined to be non-LS. For the external validation group, there were 65 participants, of which 43 (66.2%) were female, with a median age of 69 years. Within this cohort, LS staging was reported as 5 participants with stage-3 LS, 4 with stage-2 LS, 35 with stage-1 LS, and 21 classified as non-LS. Detailed characteristics of the participants in the model creation and external validation groups are shown in the S3 and S4 Appendix, respectively. Across both groups, the predominant age range was 70–79 years old.

Table 1. Baseline characteristics of the participants.

Characteristic  Category  Model Creation (n = 66), n (%)  External Validation (n = 65), n (%)
Age (y)         <40       9 (13.6)                        8 (12.3)
Age (y)         40–49     7 (10.6)                        5 (7.7)
Age (y)         50–59     7 (10.6)                        9 (13.8)
Age (y)         60–69     9 (13.6)                        11 (16.9)
Age (y)         70–79     20 (30.3)                       25 (38.5)
Age (y)         ≥80       14 (21.2)                       7 (10.8)
Gender          Female    42 (63.6)                       43 (66.2)
Gender          Male      24 (36.4)                       22 (33.8)
LS Stage        0         18 (27.3)                       21 (32.3)
LS Stage        1         15 (22.7)                       35 (53.8)
LS Stage        2         9 (13.6)                        4 (6.2)
LS Stage        3         24 (36.4)                       5 (7.7)

LS; Locomotive Syndrome

Model Creation and Internal Validation

The data sets for model creation, internal validation, and external validation using a different dataset are described in Fig 1.

Fig 1. Data sets for model creation and external validation.


In developing and internally validating our computer vision model for LS screening, we placed strategic emphasis on optimizing the model’s sensitivity. This focus is pivotal for a screening instrument intended for the early detection of potential LS cases, because it limits the number of true LS cases that are missed. Validation used a cross-validation (CV) scheme with three segments, CV1, CV2, and CV3, to assess the model’s diagnostic performance. The results are summarized in Table 2.

Table 2. Internal validation by cross-validation.

Fold     Sensitivity  Specificity  PPV   NPV   Accuracy
CV1      0.81         0.87         0.96  0.56  0.82
CV2      0.89         0.27         0.78  0.46  0.73
CV3      0.89         0.38         0.81  0.54  0.77
Average  0.86         0.51         0.85  0.52  0.77

PPV; Positive Predictive Value, NPV; Negative Predictive Value, CV; Cross-Validation

Our findings reveal notable sensitivity across the CV iterations, with CV1 achieving a sensitivity of 0.81, and both CV2 and CV3 displaying enhanced sensitivity at 0.89. These results yield an average sensitivity of approximately 0.86, illustrating the model’s proficiency in accurately detecting LS cases, which is critical for a reliable screening tool. Conversely, specificity scores exhibited considerable variability, with CV1 demonstrating a high specificity of 0.87, contrasted by the reduced specificity observed in CV2 (0.27) and CV3 (0.38), averaging 0.51 across the evaluations. This variability underscores the model’s inconsistent ability in identifying true negatives across diverse data sets.

Moreover, the model’s Positive Predictive Value (PPV) recorded robust outcomes, with scores of 0.96 (CV1), 0.78 (CV2), and 0.81 (CV3), leading to an aggregate PPV of 0.85. These PPV metrics signify that the model’s predictions regarding LS presence are generally precise, denoting a high level of diagnostic accuracy. However, the Negative Predictive Value (NPV) presented variability and identified areas for improvement, with an average NPV of 0.52 across the CV phases, reflecting the model’s fluctuating capability in accurately ruling out non-LS cases.

The accuracy assessments, indicating the model’s overall efficacy in correctly classifying LS and non-LS instances, were documented at 0.82 (CV1), 0.73 (CV2), and 0.77 (CV3), with an overall average accuracy of 0.77.
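
The averages in Table 2 are unweighted means over the three folds. A minimal Python check, with the per-fold values copied from Table 2 (the dictionary layout and the name `fold_average` are illustrative, not the authors' code):

```python
from statistics import mean

# Per-fold metrics copied from Table 2, in the order CV1, CV2, CV3.
FOLD_METRICS = {
    "sensitivity": [0.81, 0.89, 0.89],
    "specificity": [0.87, 0.27, 0.38],
    "ppv": [0.96, 0.78, 0.81],
    "npv": [0.56, 0.46, 0.54],
    "accuracy": [0.82, 0.73, 0.77],
}


def fold_average(values):
    """Unweighted mean across folds, rounded to two decimals as in Table 2."""
    return round(mean(values), 2)


AVERAGES = {name: fold_average(values) for name, values in FOLD_METRICS.items()}
```

Because folds can differ in their mix of LS and non-LS videos, an unweighted mean of per-fold metrics can differ from metrics pooled over all predictions.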

External Validation

Upon completing the model’s development and internal validation, we proceeded with external validation using an independent dataset. This step was crucial for evaluating the model’s generalizability and accuracy in a different clinical setting. A Receiver Operating Characteristic (ROC) curve was constructed to provide a detailed assessment of the model’s diagnostic performance. The Area Under the Curve (AUC), illustrated in Fig 2, was calculated at 0.75, demonstrating the model’s predictive accuracy.
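
For readers less familiar with the AUC, it equals the probability that a randomly chosen LS case receives a higher model score than a randomly chosen non-LS case, with ties counted as one half. The sketch below illustrates that definition in plain Python. The function name `roc_auc` and the toy labels and scores are made up for illustration; they are not the study data, and the paper does not say which routine produced its AUC.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison (Mann-Whitney formulation).

    labels: 1 for LS, 0 for non-LS. scores: model probabilities.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one LS and one non-LS case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # LS case ranked above non-LS case
            elif p == n:
                wins += 0.5  # tie counts as half
    return wins / (len(positives) * len(negatives))


# Made-up toy data for illustration only (1 = LS, 0 = non-LS).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In the toy data, eight of the nine LS/non-LS pairs are ranked correctly, so `toy_auc` is 8/9.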

Fig 2. Area under the curve (AUC) for external validation performance.


Next, a subgroup analysis was conducted to compare patients accurately diagnosed (Accurate) by our developed DL-based model with those inaccurately diagnosed (Inaccurate) (Table 3). The distribution of LS stage differed significantly between the two groups (p < 0.001). Further examination of each LS stage revealed that diagnostic accuracy was notably lower for non-LS cases compared to LS stages 1 and above (Fig 3).

Table 3. Comparison between accurate and inaccurate groups by DL-model for external validation.

Characteristic  Category            Accurate (n = 47)  Inaccurate (n = 18)  P-Value
Age (y)         mean (SD)           65.8 (15.3)        57.4 (19.1)          0.07
Gender          Female, n (%)       32 (68.1)          12 (66.7)            0.99
LS Stage        Stage distribution                                          <0.001
LS Stage        0, n (%)            8 (17.0)           13 (72.2)
LS Stage        1, n (%)            32 (68.0)          3 (16.7)
LS Stage        2, n (%)            3 (6.4)            1 (5.6)
LS Stage        3, n (%)            4 (8.5)            1 (5.6)

LS; Locomotive Syndrome

Fig 3. Proportion of diagnosis accuracy for external validation of DL-based model by LS Stage.


Furthermore, we evaluated the diagnostic performance of our DL-based model against the collective judgment of six certified orthopedic surgeons with over ten years of clinical experience. Each doctor independently assessed the same video dataset for the presence of LS in an external validation process. The average diagnostic metrics derived from the doctors’ assessments are summarized in Table 4.

Table 4. The diagnosing performance between DL-based model and doctors’ visual examination.

Rater            Sensitivity  Specificity  PPV   NPV   Accuracy
DL               0.89         0.38         0.75  0.62  0.72
Doctors (n = 6)  0.40         0.77         0.84  0.38  0.52

DL; Deep-Learning based model, PPV; Positive Predictive Value, NPV; Negative Predictive Value

Our developed DL-based model exhibited a higher sensitivity (89%) than the doctors’ average (40%), indicating superior effectiveness in identifying affected patients. However, the doctors demonstrated greater specificity (77% vs 38%), suggesting the DL-based model’s higher tendency for false positives. In terms of predictive values, the DL-based model’s PPV was slightly lower (75% vs 84%), but its NPV was higher (62% vs 38%) compared to the average of doctors. Overall accuracy favored the DL-based model (72% vs 52%), underscoring its potential to more accurately diagnose LS, despite its limitation in specificity.
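
The DL row of Table 4 can be reproduced from the per-stage counts in Table 3 by treating stage 0 as non-LS and stages 1 to 3 as LS. The sketch below is an illustrative consistency check, not the authors' code; all function and variable names are assumptions, and the formulas are the standard confusion-matrix definitions (the paper's own equations are in S2 Appendix).

```python
# (accurate, inaccurate) counts per LS stage, copied from Table 3.
STAGE_COUNTS = {0: (8, 13), 1: (32, 3), 2: (3, 1), 3: (4, 1)}


def confusion_from_stages(stage_counts):
    """Collapse per-stage counts into (TP, FN, TN, FP), with stage 0 = non-LS."""
    tn, fp = stage_counts[0]  # non-LS: accurate = true negative
    tp = sum(acc for stage, (acc, _) in stage_counts.items() if stage > 0)
    fn = sum(inacc for stage, (_, inacc) in stage_counts.items() if stage > 0)
    return tp, fn, tn, fp


def diagnostic_metrics(tp, fn, tn, fp):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def per_stage_accuracy(stage_counts):
    """Fraction of correctly classified participants within each LS stage."""
    return {s: acc / (acc + inacc) for s, (acc, inacc) in stage_counts.items()}


TP, FN, TN, FP = confusion_from_stages(STAGE_COUNTS)
DL_METRICS = {k: round(v, 2) for k, v in diagnostic_metrics(TP, FN, TN, FP).items()}
```

With these counts, `per_stage_accuracy` gives 8/21 for non-LS participants and 32/35 for stage 1, which is the contrast shown in Fig 3.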

Discussion

This study aimed to develop and validate a DL-based computer vision model for diagnosing LS by analyzing gait patterns in single-camera video recordings using pose estimation (OpenPose) and graph-structured data through a spatial-temporal graph convolutional network (MS-G3D).

This study introduces the pioneering integration of OpenPose and MS-G3D for the detection of LS, marking a significant advancement in medical diagnostics by combining OpenPose’s precise pose estimation from video data with MS-G3D’s sophisticated analysis of spatial-temporal graph data. OpenPose’s application in our model is supported by its proven effectiveness in various medical contexts, such as analyzing gait abnormalities in individuals with lower limb dysfunction [22] and assessing joint alignment in knee osteoarthritis patients through non-invasive measurement of hip-knee-ankle angles [23], thereby highlighting its utility in clinical medicine and rehabilitation. The incorporation of MS-G3D enhances the model’s diagnostic precision by processing the dynamic interactions between body parts during movement, allowing for the detection of subtle gait anomalies indicative of LS. This synergistic use of OpenPose’s and MS-G3D’s capabilities establishes a comprehensive framework for the early detection and evaluation of LS, leveraging both technologies’ strengths to offer an innovative approach to medical condition diagnosis.

The model was subjected to a comprehensive validation process, encompassing both internal validation and external validation using an independent dataset, to evaluate its generalizability and diagnostic efficacy across varied clinical environments. During the internal validation, the model exhibited a notable average sensitivity of 0.86, showcasing its ability to identify individuals with LS, which is critical for a screening tool designed for early detection. Despite this commendable sensitivity, the model’s specificity was less consistent, averaging 0.51, which highlights the need to improve how accurately it distinguishes individuals without LS. The PPV remained strong at an average of 0.85, indicating that most LS predictions by the model are correct. However, the lower NPV and the fluctuating specificity underscore a possible challenge in reliably excluding non-LS cases. An overall accuracy rate of 0.77 confirmed the model’s substantial capability in differentiating LS from non-LS conditions. The development phase’s strategic focus on optimizing sensitivity aimed at enhancing the model’s application for early LS detection. Although specificity varied, the solid PPV underscores the model’s dependable predictive performance, with the combination of high sensitivity and PPV emphasizing its aptness for early LS screening.

The external validation of our model using an independent dataset yielded an AUC of 0.75, confirming the model’s reasonable predictive accuracy in distinguishing individuals with and without LS, thereby verifying its effectiveness as a screening tool in various settings [24]. The consistency of results between internal and external validations highlights the model’s stable performance across diverse datasets and clinical environments, essential for its reliability and broader clinical application. Subgroup analysis provided deeper insights into the model’s diagnostic accuracy, revealing significant differences in the distribution of LS stages between correctly and incorrectly diagnosed cases, especially noting the model’s lower accuracy in identifying non-LS cases. Comparing our DL-based model with the collective judgment of six experienced orthopedic surgeons revealed its high sensitivity (89%) in identifying LS patients, substantially surpassing the doctors’ average sensitivity (40%), which is critical for early detection and treatment. However, the model’s specificity (38%) was lower than that of the doctors (77%), indicating a tendency towards more false positives. Although the model’s PPV was slightly below the doctors’ average (75% vs. 84%), its NPV was considerably higher (62% vs. 38%), bolstering its effectiveness in excluding non-LS cases. With an overall accuracy of 72% compared to the doctors’ 52%, the model shows significant promise as a diagnostic tool, though the specificity gap underscores the importance of integrating the model with clinical assessment to avoid unnecessary interventions.

Recent advancements in deep learning and computer vision have revolutionized the study of human movement disorders, enabling the automatic tracking and analysis of human movement through video data, thus identifying key body landmarks for the quantitative evaluation of motor functions, which is invaluable for individuals with musculoskeletal and neurological impairments [25]. Markerless Motion Capture (MMC) technology allows for the non-invasive analysis of human motion and has been shown to effectively distinguish between individuals with conditions such as Parkinson’s disease and healthy subjects by evaluating symptoms like bradykinesia and tremor [26]. MMC’s utility extends to three-dimensional gait assessment in community settings, highlighting its practicality and integration into real-world applications, thus enabling widespread clinical adoption and facilitating patient monitoring, which supports personalized rehabilitation strategies [27]. Moreover, a novel video-based method employing deep learning for the evaluation of bradykinesia in Parkinson’s disease, which assesses mobility during daily activities with a focus on fine motor movements like the thumb-index finger distance, indicates enhanced accuracy compared to conventional clinical assessments [28]. This method underscores the potential for advanced remote monitoring and the development of more customized care plans for patients [28].

A recent study leveraging front-view video analysis to automatically classify gait severity in Parkinson’s disease (PD) analyzed 456 videos from 19 PD patients and employed a support vector machine to achieve an AUC of 80.88%, highlighting its effectiveness in identifying various gait impairment levels in PD patients and suggesting its applicability in home settings for PD assessments [29]. Similarly, another study developed a neural network model that used low-dimensional postural data from social interaction videos to identify autism in children with an accuracy of 80.9%, a PPV of 0.784, and a sensitivity of 0.854. That study analyzed an initial sample of 136 children and an additional test set of 101 children with autism spectrum disorder, using a Long Short-Term Memory (LSTM) network to interpret temporal sequences of skeletal key points from short video segments [30]. The promising results of these studies parallel the validation outcomes of our model, which exhibits comparable accuracy. This indicates its efficacy for early detection of LS and suggests its potential for widespread home-based screening and clinical monitoring, similarly facilitating timely interventions for individuals at risk.

Our DL-based computer vision model is the first of its kind, aimed specifically at detecting LS, a condition notable for its potential reversibility with early and appropriate intervention, allowing individuals to regain comfortable mobility [12]. Offering a non-invasive, efficient, and accessible screening method, our model plays a crucial role in the early identification of LS, facilitating faster initiation of treatment. Its introduction marks a substantial advancement in managing LS, providing healthcare professionals with a vital tool for improving patient outcomes and addressing healthcare challenges associated with the condition. Beyond LS, our model has broader implications in digital health by enabling the diagnosis of various gait disorders through walking video analysis, covering conditions like cervical myelopathy, lumbar spinal stenosis, osteoarthritis, Parkinson’s disease, cerebrovascular disorders, and peripheral artery diseases. These conditions, which affect gait and require assessments from multiple medical specialties, complicate diagnosis and treatment. Our model’s ability to aid in the differential diagnosis of diseases with unique gait patterns empowers general practitioners to efficiently refer patients to specialized care, streamlining the diagnostic process and enhancing patient care.

The primary limitation of our study is its relatively small participant base, potentially weakening the conclusions’ strength and generalizability. Additionally, the low observed values of the model’s specificity and NPV raise concerns about its consistent performance in accurately identifying individuals without LS. The imbalanced training dataset in our current study (non-LS:LS = 52:134) (Fig 1) could have negatively affected our model, possibly leading to the decreased specificity. Although external validation suggests the model’s broad applicability, the utilized dataset’s lack of comprehensive demographic representation might limit our findings’ universality. Addressing the model’s current shortcomings in distinguishing between LS and non-LS cases is crucial. Future initiatives should focus on incorporating a wider array of clinical parameters and training the model with datasets that cover a more extensive range of LS stages and related conditions. Expanding the study to include a larger and more diverse participant group will enhance the research’s integrity and the model’s relevance to various populations. Conducting further validation studies in diverse clinical and demographic settings is vital to ascertain the model’s effectiveness in global healthcare applications, thereby ensuring its contributions to precision medicine are significant and widespread.

Our developed computer vision model represents a significant advancement in the screening of LS, showcasing remarkable sensitivity and predictive accuracy, surpassing the diagnostic capabilities of visual examinations by experienced doctors. This study is at the forefront of employing computer vision technology for pose estimation to diagnose LS, introducing a method that is both accessible and non-invasive. By facilitating early and efficient detection of LS, our model enables quicker commencement of treatment, substantially enhancing patient outcomes. This progress constitutes a pivotal development in digital health, tackling major obstacles in the management and care of LS, and sets a new benchmark for leveraging technology to improve healthcare delivery.

Methods

Study design

This study aimed to develop an innovative DL-based computer vision model to identify LS effectively. We developed this model by employing prospectively collected data. To evaluate the effectiveness and broader applicability of our model, we carried out external validation with a separate dataset, also collected prospectively, from an alternative institution.

The study received approval from the Regional Committee for Medical and Health Research Ethics at Osaka Minami Medical Center (OMMC), ensuring compliance with ethical standards and patient safety. Prior to inclusion in the study, all participants provided written informed consent. Additionally, the Ethics Committee of OMMC granted specific approval for this prognostic study (Approval code: R5-42). Consistent with the ethical standards, our study protocol was meticulously designed to align with the principles outlined in the Declaration of Helsinki, guaranteeing respect for the rights and well-being of all participants.

Participants

The study selected participants based on specific inclusion and exclusion criteria. To be eligible for inclusion, participants were required to be at least 20 years old, participate voluntarily, be able to walk ten meters independently, consent to undergo the LS risk test, and undergo a medical examination by a certified orthopedic surgeon or neurologist. Individuals were excluded from the study if they were under the age of 20, were unwilling to participate, or were unable to walk ten meters independently.

For model development, a total of 66 patients who visited the Department of Orthopedics at OMMC between December 22, 2021, and February 21, 2022, were enrolled. Two-thirds of the 66 OMMC participants were randomly assigned to the model development sample, and one-third were randomly assigned to the internal validation sample. Additionally, for the external validation of the model, 65 participants from the Matsuzaka Health Festival 2023, held in Matsuzaka City on September 10, 2023, were recruited.
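
One way to realize a two-thirds / one-third assignment that rotates across three cross-validation rounds is a participant-level three-fold split, so that all videos from one person fall in the same fold. The paper does not describe the exact procedure; the seed, the integer participant IDs, and the helper names `three_fold_split` and `train_validation_rounds` below are assumptions for illustration.

```python
import random


def three_fold_split(participant_ids, seed=0):
    """Shuffle participants and deal them into three folds of near-equal size."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    return [ids[i::3] for i in range(3)]


def train_validation_rounds(folds):
    """Yield (development_ids, validation_ids), holding each fold out once."""
    for held_out in range(len(folds)):
        development = [
            pid
            for index, fold in enumerate(folds)
            if index != held_out
            for pid in fold
        ]
        yield development, folds[held_out]


# 66 hypothetical participant IDs, matching the cohort size in the text.
folds = three_fold_split(range(66))
rounds = list(train_validation_rounds(folds))
```

Each round then uses 44 participants for development and 22 for validation, and no participant appears on both sides of a round.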

Data collection

For the development of our model, we gathered 186 walking videos from individuals attending the Department of Orthopedics at OMMC. Additionally, for the purpose of external validation, 65 walking videos were collected from attendees of the Matsuzaka Health Festival 2023.

For the video recording, we used a FLIR CHAMELEON3 camera (P/N: CM3-U3-13S2, Edmund Optics Inc., Barrington, USA) with a resolution of 1288 × 964 (1.3 megapixels) at 30 FPS, and an Edmund Optics UC Fixed Focal Length lens (#33–300): a 12-megapixel C-mount lens with a 4 mm focal length, an M61 x 75 filter size, and less than 17.5% distortion.

Participants were instructed to walk down a designated ten-meter path three times. During these walks, each participant was filmed from the right side, with the camera positioned four meters away from the walking path to ensure clear lateral movement capture. The raw footage was saved in MP4 format using Advanced Video Coding (AVC). To analyze stable walking patterns, only the walking sequences between the four- and seven-meter marks of the ten-meter path were used for analysis. For technical reasons, some of the recorded videos could not be played back; in such cases, no more than two videos per participant were used in the training dataset.

LS risk test

The LS risk test comprises three components: a patient-reported outcome measure called the GLFS-25, and two performance tests known as the two-step and stand-up tests. These tests have been previously described in research papers [31]. In brief, the GLFS-25 questionnaire consists of 25 questions, each rated on a Likert scale from 0 to 4, assessing difficulties related to mobility in daily life. Higher scores on this scale indicate a worsening health condition, and the total score, which ranges from 0 to 100, was used for analysis. The content of the GLFS-25 is shown in S1 Appendix. The two-step test involves patients starting from a standing position and taking two successive steps as far as they can. The distance covered by these two steps is divided by the patient’s height for standardization. This test is performed twice, and the best result is recorded. The stand-up test is conducted using stools of four different heights (10, 20, 30, and 40 cm). Participants are required to stand up from these stools, using either one or both legs, and to maintain their posture for 3 seconds after standing. A score between 0 and 8 is assigned based on successful performance, with a higher score indicating better physical condition.
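
The two numeric scores described above follow directly from the text and can be sketched as below. Function names and the choice of centimeters are assumptions for illustration; the stand-up score and the staging cut-offs are not reproduced here because this section does not state them.

```python
def glfs25_total(item_scores):
    """Sum the 25 GLFS-25 items (each rated 0 to 4) into a 0 to 100 total."""
    if len(item_scores) != 25:
        raise ValueError("GLFS-25 has exactly 25 items")
    if any(score not in (0, 1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item must be an integer from 0 to 4")
    return sum(item_scores)


def two_step_value(trial_distances_cm, height_cm):
    """Best two-step distance over the trials, divided by body height.

    Distances and height must share a unit (centimeters assumed here),
    so the result is dimensionless.
    """
    if len(trial_distances_cm) != 2:
        raise ValueError("the test is performed twice")
    if height_cm <= 0:
        raise ValueError("height must be positive")
    return max(trial_distances_cm) / height_cm
```

For example, trials of 200 cm and 208 cm for a person 160 cm tall give a two-step value of 208 / 160 = 1.3.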

The severity of LS is categorized using LS staging criteria as follows: normal, Stage 1 (the initial stage of decreased mobility defined by specific criteria for the two-step test, stand-up test, and GLFS-25 score), and Stage 2 (an advancing stage of decreased mobility defined by different criteria for the same tests). Additionally, a more severe stage known as Stage 3 (advanced decrease in mobility, limiting social engagement) has recently been defined, with specific criteria for the two-step test, stand-up test, and GLFS-25 score.

Deep learning-based locomotive syndrome prediction method

Our deep learning-based method for predicting LS is depicted in Fig 4 and encompasses four main steps: video recording, pose estimation, model development, and prediction of LS. The details of each step are described below.

Fig 4. Deep learning-based locomotive syndrome prediction method.


a. Steps involved in deep learning-based method for locomotive syndrome prediction. Step 1: Video recording of the subject. Step 2: Pose estimation conducted using OpenPose. Step 3: Development of the LS prediction model utilizing MS-G3D. Step 4: Final prediction of Locomotive Syndrome. LS stands for Locomotive Syndrome. b. Skeleton model generated by OpenPose. Depicts the 2D coordinates for 25 key body points as identified by the OpenPose framework. c. Spatial-Temporal GCN-Based LS Prediction Model Utilizing MS-G3D. Diagram of the Spatial/Temporal Graph Convolutional Network (GCN) component. d. Spatial-Temporal GCN-Based LS Prediction Model Utilizing MS-G3D. Enhanced Spatial-Temporal GCN architecture incorporating skip connections.

  1. Video recording: We recorded subjects from the side as they walked, using a standard digital camera to capture natural gait patterns and ensuring clear visibility of the full body in motion. Forty frames from each video, which typically contain more than one gait cycle and allow for a complete representation of the subject’s gait characteristics, were then processed.

  2. Pose estimation: The recorded walking videos were processed using the OpenPose framework [32], which provided 2D coordinates for 25 body key points as depicted in Fig 4. OpenPose uses deep learning to simultaneously estimate a heatmap for each joint position and the Part Affinity Fields that represent the relationships between joints, and then derives the 2D coordinates of each key point from these maps.

  3. Model development: The key point data obtained from the preceding OpenPose step were converted into a graph-structured format representing the body’s joints and their connections. We employed a deep learning model, MS-G3D [33], based on spatial-temporal graph convolutional networks, to train the LS prediction model. The MS-G3D model enhances feature extraction by applying convolutions across both spatial and temporal dimensions, leveraging skip connections to encapsulate spatial-temporal relationships effectively.

  4. Prediction of LS: The trained model predicts LS by analyzing the subject’s gait captured in the video. It outputs a probability score, P_LS, indicating the likelihood of LS presence. For classification, a threshold of 0.5 is applied; if P_LS ≥ 0.5, the gait is classified as indicative of LS; otherwise, it is classified as non-LS. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated following the equations shown in S2 Appendix.
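The thresholding rule in step 4 and the five reported metrics can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the standard confusion-matrix definitions (which S2 Appendix presumably follows), the function names `classify` and `diagnostic_metrics` are hypothetical, and the labels and scores are made-up toy values rather than study data.

```python
import math
from typing import Dict, List, Sequence


def classify(p_ls: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return 1 (LS) when the score is >= threshold, otherwise 0 (non-LS)."""
    return [1 if p >= threshold else 0 for p in p_ls]


def _ratio(num: int, den: int) -> float:
    """Return num/den, or NaN when the denominator is empty."""
    return num / den if den else math.nan


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute metrics from binary labels (1 = LS, 0 = non-LS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # LS correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-LS correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-LS flagged as LS
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # LS missed
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
    }


# Made-up toy values for illustration only (not study data).
y_true = [1, 1, 1, 0, 0]
p_ls = [0.9, 0.5, 0.2, 0.7, 0.1]

y_pred = classify(p_ls)
metrics = diagnostic_metrics(y_true, y_pred)
print(y_pred)
print(metrics)
```

Note that a score of exactly 0.5 is labeled LS, matching the "≥ 0.5" rule stated above. Moving the threshold trades sensitivity against specificity, which is why the threshold-free AUC is also reported for external validation.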

Visual diagnosis by doctors of walking videos from the external validation dataset

For the external validation dataset, we employed a visual diagnostic approach with six certified orthopedic surgeons, each with more than a decade of clinical experience. These doctors were asked to evaluate the same 65 walking video clips used in the external validation process. The videos showed individuals performing a ten-meter walk test, captured from a lateral perspective. Each doctor was asked to determine whether the subjects demonstrated symptoms of LS. To aid their assessment, each video could be replayed up to three times.

Statistical analysis

Statistical analyses were conducted using Python 3.8 with the SciPy library. To evaluate the diagnostic performance of our model during external validation, we generated a Receiver Operating Characteristic (ROC) curve and calculated the Area Under the Curve (AUC). For numeric variables, differences in means between two groups were assessed using independent samples t-tests. For categorical variables, we applied Chi-square tests of independence to determine whether there were significant associations between the groups. The statistical evaluation was carried out from November 11, 2023, to April 3, 2024.
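The three analyses named above can be sketched with SciPy as follows. This is an illustrative sketch under stated assumptions, not the study's code. The helper `auc_from_scores` is hypothetical and computes the AUC through the rank-sum (Mann-Whitney) identity, because SciPy itself does not provide an ROC routine. `ttest_ind` is called with its default equal-variance setting, which may differ from the authors' choice. All numbers are made-up toy values, not study data.

```python
import numpy as np
from scipy import stats


def auc_from_scores(y_true, scores):
    """AUC via the rank-sum identity; tied scores receive average ranks."""
    y_true = np.asarray(y_true)
    ranks = stats.rankdata(scores)
    n_pos = int(np.sum(y_true == 1))
    n_neg = int(np.sum(y_true == 0))
    rank_sum_pos = ranks[y_true == 1].sum()
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Made-up toy values for illustration only (not study data).
y_true = [1, 1, 1, 0, 0]            # 1 = LS, 0 = non-LS
p_ls = [0.9, 0.5, 0.2, 0.7, 0.1]    # model probability scores
auc = auc_from_scores(y_true, p_ls)

# Numeric variable (hypothetical ages): independent samples t-test.
age_ls = [72, 68, 75, 80]
age_non_ls = [60, 65, 58, 70]
t_stat, t_p = stats.ttest_ind(age_ls, age_non_ls)

# Categorical variable (hypothetical 2x2 counts): Chi-square test of independence.
# Rows: group (LS, non-LS); columns: category (e.g. female, male).
table = np.array([[10, 20], [15, 5]])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"AUC = {auc:.3f}")
print(f"t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"chi2 = {chi2:.3f}, p = {chi_p:.4f}, dof = {dof}")
```

The rank-based AUC equals the probability that a randomly chosen LS case receives a higher score than a randomly chosen non-LS case, so it does not depend on the 0.5 classification threshold.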

Supporting information

S1 Appendix. The contents of the GLFS-25 questionnaire.

(DOCX)

pdig.0000668.s001.docx (14.2KB, docx)
S2 Appendix. The definition of sensitivity, specificity, positive and negative predictive value, and accuracy.

(DOCX)

pdig.0000668.s002.docx (88.8KB, docx)
S3 Appendix. Detailed characteristics of the participants in the model creation group.

(DOCX)

pdig.0000668.s003.docx (20.1KB, docx)
S4 Appendix. Detailed characteristics of the participants in the external validation group.

(DOCX)

pdig.0000668.s004.docx (20.1KB, docx)

Acknowledgments

We would like to express our profound gratitude to Ms. Saeko Doi, Mr. Yasuyoshi Takada, Mr. Yoshiyuki Kuwada, Dr. Miki Tagami, Dr. Takashi Miwa, Dr. Takanori Hazama, and Dr. Ichiro Nakahara for their invaluable contributions to the data collection process in this study.

Code availability

The underlying code for this study and the training/validation datasets are not publicly available for proprietary reasons.

Data Availability

The datasets used and/or analyzed during the current study, including patients' walking videos, are available upon reasonable request. Participant consent forms specified that only study investigators could access the datasets. Additionally, the videos have elements of personal identification, such as face and gait patterns. Individuals seeking access to the datasets will need the approval of the Ethics Review Board at Kishiwada City Hospital (e-mail: kch@kishiwada-hospital.com, phone: +81-724-45-1000, URL: https://www.kishiwada-hospital.com).

Funding Statement

This study was supported by JST SBIR phase 1 (grant number JPMJST2171 to S.T.) and by the 34th research grant by The Nakatomi Foundation (grant number 20211279 to S.T.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No authors received a salary from any of our funders.

References

  • 1. Nakamura K, Ogata T. Locomotive Syndrome: Definition and Management. Clin Rev Bone Miner Metab. 2016;14(2):56–67. Epub 20160525. doi: 10.1007/s12018-016-9208-2; PubMed Central PMCID: PMC4906066.
  • 2. Tokida R, Ikegami S, Takahashi J, Ido Y, Sato A, Sakai N, et al. Association between musculoskeletal function deterioration and locomotive syndrome in the general elderly population: a Japanese cohort survey randomly sampled from a basic resident registry. BMC Musculoskeletal Disorders. 2020;21(1):431. doi: 10.1186/s12891-020-03469-x
  • 3. Yoshimura N, Muraki S, Nakamura K, Tanaka S. Epidemiology of the locomotive syndrome: The research on osteoarthritis/osteoporosis against disability study 2005–2015. Mod Rheumatol. 2017;27(1):1–7. doi: 10.1080/14397595.2016.1226471.
  • 4. Kimura A, Takeshita K, Inoue H, Seichi A, Kawasaki Y, Yoshii T, et al. The 25-question Geriatric Locomotive Function Scale predicts the risk of recurrent falls in postoperative patients with cervical myelopathy. J Orthop Sci. 2018;23(1):185–9. Epub 20171031. doi: 10.1016/j.jos.2017.10.006. https://www.ncbi.nlm.nih.gov/pubmed/29100824
  • 5. Akahane M, Yoshihara S, Maeyashiki A, Tanaka Y, Imamura T. Lifestyle factors are significantly associated with the locomotive syndrome: a cross-sectional study. BMC Geriatr. 2017;17(1):241. Epub 20171018. doi: 10.1186/s12877-017-0630-1; PubMed Central PMCID: PMC5648444.
  • 6. Nakamura M, Kobashi Y, Hashizume H, Oka H, Kono R, Nomura S, et al. Locomotive syndrome is associated with body composition and cardiometabolic disorders in elderly Japanese women. BMC Geriatr. 2016;16(1):166. Epub 20160927. doi: 10.1186/s12877-016-0339-6; PubMed Central PMCID: PMC5039907.
  • 7. Imagama S, Ando K, Kobayashi K, Machino M, Tanaka S, Morozumi M, et al. Differences of locomotive syndrome and frailty in community-dwelling middle-aged and elderly people: Pain, osteoarthritis, spinal alignment, body balance, and quality of life. Mod Rheumatol. 2020;30(5):921–9. Epub 20190919. doi: 10.1080/14397595.2019.1665616. https://www.ncbi.nlm.nih.gov/pubmed/31495262
  • 8. Ide K, Banno T, Yamato Y, Hasegawa T, Yoshida G, Yasuda T, et al. Relationship between locomotive syndrome, frailty and sarcopenia: Locomotive syndrome overlapped in the majority of frailty and sarcopenia patients. Geriatr Gerontol Int. 2021;21(6):458–64. Epub 20210406. doi: 10.1111/ggi.14162. https://www.ncbi.nlm.nih.gov/pubmed/33825291
  • 9. Taniguchi M, Ikezoe T, Tsuboyama T, Tabara Y, Matsuda F, Ichihashi N, et al. Prevalence and physical characteristics of locomotive syndrome stages as classified by the new criteria 2020 in older Japanese people: results from the Nagahama study. BMC Geriatrics. 2021;21(1):489. doi: 10.1186/s12877-021-02440-2
  • 10. O’Caoimh R, Sezgin D, O’Donovan MR, Molloy DW, Clegg A, Rockwood K, et al. Prevalence of frailty in 62 countries across the world: a systematic review and meta-analysis of population-level studies. Age Ageing. 2021;50(1):96–104. doi: 10.1093/ageing/afaa219.
  • 11. Otani K, Takegami M, Fukumori N, Sekiguchi M, Onishi Y, Yamazaki S, et al. Locomotor dysfunction and risk of cardiovascular disease, quality of life, and medical costs: design of the Locomotive Syndrome and Health Outcome in Aizu Cohort Study (LOHAS) and baseline characteristics of the study population. J Orthop Sci. 2012;17(3):261–71. Epub 20120412. doi: 10.1007/s00776-012-0200-5. https://www.ncbi.nlm.nih.gov/pubmed/22526710
  • 12. Ohba T, Oba H, Koyama K, Oda K, Tanaka N, Fujita K, et al. Locomotive syndrome: Prevalence, surgical outcomes, and physical performance of patients treated to correct adult spinal deformity. J Orthop Sci. 2021;26(4):678–83. Epub 20200902. doi: 10.1016/j.jos.2020.06.012. https://www.ncbi.nlm.nih.gov/pubmed/32888792
  • 13. de Fátima Ribeiro Silva C, Ohara DG, Matos AP, Pinto A, Pegorari MS. Short Physical Performance Battery as a Measure of Physical Performance and Mortality Predictor in Older Adults: A Comprehensive Literature Review. Int J Environ Res Public Health. 2021;18(20). Epub 20211010. doi: 10.3390/ijerph182010612; PubMed Central PMCID: PMC8535355.
  • 14. Ikemoto T, Arai YC. Locomotive syndrome: clinical perspectives. Clin Interv Aging. 2018;13:819–27. Epub 20180430. doi: 10.2147/CIA.S148683; PubMed Central PMCID: PMC5933401.
  • 15. Chen X, Wang X, Zhang K, Fung KM, Thai TC, Moore K, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. Epub 20220404. doi: 10.1016/j.media.2022.102444; PubMed Central PMCID: PMC9156578.
  • 16. Kidziński Ł, Yang B, Hicks JL, Rajagopal A, Delp SL, Schwartz MH. Deep neural networks enable quantitative movement analysis using single-camera videos. Nature Communications. 2020;11(1):4054. doi: 10.1038/s41467-020-17807-z
  • 17. Uddin MZ, Muramatsu D, Takemura N, Ahad MAR, Yagi Y. Spatio-temporal silhouette sequence reconstruction for gait recognition against occlusion. IPSJ Transactions on Computer Vision and Applications. 2019;11(1):9. doi: 10.1186/s41074-019-0061-3
  • 18. Takemura N, Makihara Y, Muramatsu D, Echigo T, Yagi Y. On Input/Output Architectures for Convolutional Neural Network-Based Cross-View Gait Recognition. IEEE Transactions on Circuits and Systems for Video Technology. 2019;29:2708–19.
  • 19. Xu C, Sakata A, Makihara Y, Takemura N, Muramatsu D, Yagi Y, et al. Uncertainty-Aware Gait-Based Age Estimation and its Applications. IEEE Transactions on Biometrics, Behavior, and Identity Science. 2021;3(4):479–94. doi: 10.1109/TBIOM.2021.3080300
  • 20. Sakata A, Takemura N, Yagi Y. Gait-based age estimation using multi-stage convolutional neural network. IPSJ Transactions on Computer Vision and Applications. 2019;11(1):4. doi: 10.1186/s41074-019-0054-2
  • 21. Aoki K, Nishikawa H, Makihara Y, Muramatsu D, Takemura N, Yagi Y. Physical Fatigue Detection From Gait Cycles via a Multi-Task Recurrent Neural Network. IEEE Access. 2021;9:127565–75. doi: 10.1109/ACCESS.2021.3110841
  • 22. Takeda I, Yamada A, Onodera H. Artificial Intelligence-Assisted motion capture for medical applications: a comparative study between markerless and passive marker motion capture. Comput Methods Biomech Biomed Engin. 2021;24(8):864–73. Epub 20201208. doi: 10.1080/10255842.2020.1856372. https://www.ncbi.nlm.nih.gov/pubmed/33290107
  • 23. Saiki Y, Kabata T, Ojima T, Kajino Y, Inoue D, Ohmori T, et al. Reliability and validity of OpenPose for measuring hip-knee-ankle angle in patients with knee osteoarthritis. Sci Rep. 2023;13(1):3297. Epub 20230225. doi: 10.1038/s41598-023-30352-1; PubMed Central PMCID: PMC9968277.
  • 24. Hosmer DW Jr, Lemeshow S, Sturdivant RX. Assessing the Fit of the Model. Applied Logistic Regression. Wiley Series in Probability and Statistics. 2013. p. 153–225.
  • 25. Stenum J, Cherry-Allen KM, Pyles CO, Reetzke RD, Vignos MF, Roemmich RT. Applications of Pose Estimation in Human Health and Performance across the Lifespan. Sensors. 2021;21(21):7315. doi: 10.3390/s21217315
  • 26. Lam WWT, Tang YM, Fong KNK. A systematic review of the applications of markerless motion capture (MMC) technology for clinical measurement in rehabilitation. J Neuroeng Rehabil. 2023;20(1):57. Epub 20230502. doi: 10.1186/s12984-023-01186-9; PubMed Central PMCID: PMC10155325.
  • 27. McGuirk TE, Perry ES, Sihanath WB, Riazati S, Patten C. Feasibility of Markerless Motion Capture for Three-Dimensional Gait Assessment in Community Settings. Front Hum Neurosci. 2022;16:867485. Epub 20220609. doi: 10.3389/fnhum.2022.867485; PubMed Central PMCID: PMC9224754.
  • 28. Yang YY, Ho MY, Tai CH, Wu RM, Kuo MC, Tseng YJ. FastEval Parkinsonism: an instant deep learning-assisted video-based online system for Parkinsonian motor symptom evaluation. NPJ Digit Med. 2024;7(1):31. Epub 20240208. doi: 10.1038/s41746-024-01022-x; PubMed Central PMCID: PMC10853559.
  • 29. Khan T, Zeeshan A, Dougherty M. A novel method for automatic classification of Parkinson gait severity using front-view video analysis. Technol Health Care. 2021;29(4):643–53. doi: 10.3233/THC-191960; PubMed Central PMCID: PMC9789477.
  • 30. Kojovic N, Natraj S, Mohanty SP, Maillart T, Schaer M. Using 2D video-based pose estimation for automated prediction of autism spectrum disorders in young children. Sci Rep. 2021;11(1):15069. Epub 20210723. doi: 10.1038/s41598-021-94378-z; PubMed Central PMCID: PMC8302646.
  • 31. Ogata T, Muranaga S, Ishibashi H, Ohe T, Izumida R, Yoshimura N, et al. Development of a screening program to assess motor function in the adult population: a cross-sectional observational study. J Orthop Sci. 2015;20(5):888–95. Epub 20150526. doi: 10.1007/s00776-015-0737-1; PubMed Central PMCID: PMC4575377.
  • 32. Cao Z, Hidalgo G, Simon T, Wei SE, Sheikh Y. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields. IEEE Trans Pattern Anal Mach Intell. 2021;43(1):172–86. Epub 20201204. doi: 10.1109/TPAMI.2019.2929257. https://www.ncbi.nlm.nih.gov/pubmed/31331883
  • 33. Liu Z, Zhang H, Chen Z, Wang Z, Ouyang W. Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition. 2020 March 01. arXiv:2003.14111. Available from: https://ui.adsabs.harvard.edu/abs/2020arXiv200314111L
PLOS Digit Health. doi: 10.1371/journal.pdig.0000668.r001

Decision Letter 0

Ismini Lourentzou

18 Jun 2024

PDIG-D-24-00163

Deep learning-based screening for locomotive syndrome using single-camera walking video: Development and validation study

PLOS Digital Health

Dear Dr. Tada,

Thank you for submitting your manuscript to PLOS Digital Health. After careful consideration, we feel that it has merit but does not fully meet PLOS Digital Health's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Aug 17 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at digitalhealth@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pdig/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Ismini Lourentzou

Section Editor

PLOS Digital Health

Journal Requirements:

1. Please provide separate figure files in .tif or .eps format only and remove any figures embedded in your manuscript file. Please also ensure that all files are under our size limit of 10MB.

For more information about figure files please see our guidelines:

https://journals.plos.org/digitalhealth/s/figures

https://journals.plos.org/digitalhealth/s/figures#loc-file-requirements

2. Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email or email the journal office and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility.

Potential Copyright Issues:

Figure 4 includes an image of an identifiable person. Please provide written confirmation or release forms, signed by the subject(s) (or their parent/legally authorized guardian), giving permission to be photographed and to have their images published under our CC-BY 4.0 license.

Otherwise, we kindly request that you remove the photograph.

3. In the online submission form, you indicated that "The datasets used and/or analyzed during the current study available from the corresponding author on reasonable request".

All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either

1. In a public repository,

2. Within the manuscript itself, or

3. Uploaded as supplementary information.

This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons by return email and your exemption request will be escalated to the editor for approval. Your exemption request will be handled independently and will not hold up the peer review process, but will need to be resolved should your manuscript be accepted for publication. One of the Editorial team will then be in touch if there are any issues.

Additional Editor Comments (if provided):


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Digital Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Partly

--------------------

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: No

--------------------

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

--------------------

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Digital Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

--------------------

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Prevalence of Locomotor Syndrome:

o The manuscript lacks information on the prevalence of locomotor syndrome, both globally and within the studied population. It is important to provide this context to understand the significance and applicability of your results.

o It is necessary to include references to the global prevalence of locomotor syndrome and, if possible, provide data on the prevalence within the specific population you studied. This information is crucial for interpreting your findings accurately. It is necessary to include the formula to calculate positive and negative predictive values, specificity, etc

2. Equivalence of Training and Validation Groups:

o There is a noticeable imbalance between the training and validation groups in terms of the number of patients and their stages of locomotor syndrome.

o The training group includes 24 patients in stage 3, 9 patients in stage 2, and 15 patients in stage 1. In contrast, the validation group has 5 patients in stage 3, 4 patients in stage 2, and 35 patients in stage 1.

o This discrepancy can affect the validity and reliability of your model’s performance. Please address this imbalance. Possible solutions include:

- Expanding the dataset to ensure more balanced groups.

- Providing a detailed justification for the current group composition and discussing its implications on the study’s results.

3. Statistical Analysis and Model Validation:

o Given the imbalance in patient distribution, it is important to discuss how this might impact the model’s performance.

o Include statistical analyses that account for the differences in group sizes and stages. Consider performing additional validation using more balanced groups if possible.

o Discuss any potential biases introduced by the current group compositions and how they were mitigated in your analysis

Reviewer #2: The authors implemented a DL-based model for the diagnosis of LS, a disabling but reversible pathology, if recognized early. The paper is well-written and the contents are described completely and clearly. The results, although presenting limitations, are encouraging and represent an interesting first effort in this field.

Reviewer #3: This paper developed a classification model to predict locomotive syndrome (LS) from walking videos. The authors have compared previous studies that have developed similar classification models on different gait pathologies and the results are comparable.

Minor comments below:

Introduction:

Clear and coherent through the paragraph. Sufficiently elaborated on LS's background and its issues. Appropriately tied up with the study’s aim to develop a model to objectively diagnose LS from walking gait.

Results:

Line 154: Table 1. The n (%) should be placed in the column header instead to prevent repeating information. Additionally, if the female numbers are already presented, why not present the male numbers as well? This would prevent readers from having to perform their own calculations.

Discussion:

Line 265: How would the “model’s challenge in reliably excluding non-LS cases” affect clinical decisions?

Line 286: Similar to my comment above, how would the model’s "tendency towards false positives” affect clinical decisions? Would unnecessary resources be put into these false positive cases? How would this affect the current workflow of clinicians to diagnose LS?

Methods:

The statements below are similar and make the paragraph verbose. Please avoid repeating information, especially if they are in the same paragraph.

Line 382: “ …ensuring compliance with ethical standards and patient safety.”

Line 384: “…reinforcing the ethical integrity and transparency of the research process.”

Line 386: “…confirming adherence to ethical guidelines.”

Line 418: “…one to four times”. How did you decide which participant(s) to do one, two, three, or four times? Please elaborate on this.

Line 418: “..from a side view”. Were the video recordings from the left or right side of the participants? From Figure 4, the sample images shown were from the right side. Is this consistent for all participants? Please elaborate on this.

Line 413: “…with a resolution of 1288 x 964, 30 FPS,”

Line 421: “at a resolution of 1288 x 964 and a frame rate of 30 FPS”.

It repeats the information mentioned in line 413. Is it referring to the same camera mentioned in line 413? If yes, I do not see the need to repeat this information in the next paragraph. If it is necessary to mention the use of Advance Video Coding (AVC), would it be possible to combine it with line 413 instead? Or remove the repeating information of resolution and fps in line 421.

Paragraph line 454 on OpenPose. Given that the video recordings were from the left or right side of the participants, did you encounter any issue with occlusions or swapping of lower-limb pose estimation? If yes, what were the steps taken to fix these issues? Were there any checks done before inputting to MS-G3D?

Line 460: What do you mean by “converted into graph-structured format”? How is this different from the OpenPose default output of pose estimation?

Line 468: "...the gait is classified as indicative of LS; otherwise, it is classified as non-LS." So the model prediction of LS is binary, but in paragraph line 426 on the LS risk test, patients can be classified into multiple stages depending on severity. Please elaborate on why you chose a binary classification instead of replicating the same classification of the LS risk test. Additionally, if the model prediction is binary, how did Figure 3 come about? What does ‘accuracy’ here refer to? I assume it is the model vs doctor diagnosis. Then how did you know when the model classified different LS stages here?

Paragraph line 472: What is the inter-reliability between the 6 orthopedic surgeons? Were all orthopedic surgeons unanimous in classifying LS for all 65 walking video clips?

Line 488: The chi-square test was mentioned but I did not see any chi-square statistical values reported in the results section.

Figure 1. The amount of non-LS and LS cases is imbalanced, almost 2.5 times, in model creation. Do you think this would contribute to the model leaning towards false positives?

Figure 3, part a. Where did these “40 frames” come from?

--------------------

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

--------------------

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLOS Digit Health. doi: 10.1371/journal.pdig.0000668.r003

Decision Letter 1

Ismini Lourentzou

14 Oct 2024

Deep learning-based screening for locomotive syndrome using single-camera walking video: Development and validation study

PDIG-D-24-00163R1

Dear Dr. Tada,

We are pleased to inform you that your manuscript 'Deep learning-based screening for locomotive syndrome using single-camera walking video: Development and validation study' has been provisionally accepted for publication in PLOS Digital Health.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email from a member of our team. 

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact digitalhealth@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Digital Health.

Best regards,

Ismini Lourentzou

Section Editor

PLOS Digital Health

***********************************************************

The author response and manuscript revisions have adequately addressed most reviewer concerns. Some concerns remain, particularly regarding the issue of group imbalance and the lack of clarity in distinguishing whether the model is for diagnostic or predictive purposes. The authors have explicitly acknowledged the limitations caused by the imbalance in patient groups in the manuscript. While both limitations could be more clearly stated in the manuscript, this does not detract significantly from the overall value of the study. It is strongly recommended, however, to consider the most recent reviewer comments in the final version of the manuscript, specifically to further clarify the potential nature of the model as either predictive or diagnostic, and to include the discussion on why false positives are less of a concern in clinician workflows as well as the interrater reliability scores.

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #3: All comments have been addressed

Reviewer #4: All comments have been addressed

**********

2. Does this manuscript meet PLOS Digital Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: No

Reviewer #3: Yes

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #3: Yes

Reviewer #4: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Digital Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The main objective of the study was to develop a model for the prediction and analysis of Locomotive Syndrome (LS). However, the authors neither addressed nor resolved an important issue raised by this reviewer. The issue is that the model was trained and validated using unbalanced groups of patients at different stages of the disease. There was a significant imbalance in the groups used for training and validation, with each group having a different number of patients at each stage of the disease. The authors claim that this was taken into account, and that they are recruiting more patients for a subsequent study. They explain that the model was built using a binary classification: non-LS individuals (control) and LS patients.

The authors argue that the model was trained and validated based on this binary condition: either controls or patients with LS. They state that, in the final analysis, there was an equal proportion of patients (when considered as a single group) in both the training and validation sets. However, they acknowledge an imbalance in the training dataset, with more patients than controls, which led to an increased number of false positives and reduced the specificity of the model. Despite these issues, the authors wish to publish the current version of the predictive model with unbalanced groups. They plan to address this imbalance in a future study, but this is neither ethical nor responsible, as they intend to publish a flawed model in PLOS Digital Health.

In another section of their responses to the reviewers, the authors claim to have developed a tool to assist physicians in diagnosing LS (page 9), yet in the revised manuscript (lines 73-75, page 17), they continue to refer to it as a method for prediction and analysis. It appears that the authors may be working on a diagnostic model, which is fundamentally different from a predictive model. This is a significant issue, and the manuscript should not be published in its current form. A predictive model estimates the probability that a disease might occur in the future and how it will progress. To build a reliable predictive model, it is necessary to have balanced groups representing the various stages of the disease as well as a control group—something this study lacks. In contrast, a diagnostic model is designed to determine whether a patient currently has a specific disease.

The authors failed to adequately address the reviewers' concerns regarding the potential bias introduced by the unbalanced groups (pages 8-9). Instead, they simply stated that the training and validation datasets contained equivalent proportions of patients with the disease.

Their response to the question about the tendency for false positives (page 9) is also insufficient. While the authors admit that their model may produce false positives, they dismiss this as a problem by asserting that the final diagnosis will be made by a clinician. However, this overlooks the crucial issue of the imbalance between patients and controls in both the training and validation datasets, as well as the imbalance in disease stages. Therefore, this is not even an appropriate model for diagnostic purposes.

Most of the reviewers’ questions centered on the authors' decision to develop a diagnostic model (despite not specifying this as the aim of the study) rather than a predictive model. The authors did not address these concerns adequately, merely stating that the analysis was based on a binary classification of participants, even though this classification was itself imbalanced.
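The concern about imbalance and false positives can be made concrete with the standard confusion-matrix metrics (the same quantities defined in S2 Appendix). The following is a minimal sketch only: the counts are hypothetical, chosen to show an imbalanced set with many more LS cases than controls, and are not taken from the study. The names `ConfusionCounts` and `binary_metrics` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier; all values used below are hypothetical."""
    tp: int  # LS cases correctly flagged
    fn: int  # LS cases missed
    fp: int  # controls wrongly flagged as LS
    tn: int  # controls correctly cleared


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions of the five screening metrics."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Hypothetical imbalanced set: 100 LS cases versus 40 controls (NOT study data).
hypothetical = ConfusionCounts(tp=90, fn=10, fp=20, tn=20)
metrics = binary_metrics(hypothetical)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, half of the controls are misclassified (specificity 0.50), yet PPV and accuracy remain comparatively high because LS cases dominate the sample. This is why PPV and accuracy alone can mask weak performance on non-LS individuals when groups are unbalanced.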

Reviewer #3: Thank you to the authors for adequately addressing my comments.

However, I have 2 additional minor comments.

In response to my earlier feedback, I think it would be good for the authors to include their answers to points 2 and 3 in the manuscript discussion, as these answers set out the authors' justification for why the model's 'tendency towards false positives' is not a major problem in the clinician’s workflow.

Additionally, I suggest briefly mentioning the interrater reliability score (as mentioned in their answer in point 12) in the paragraph around line 235. Addressing these points would further strengthen the manuscript.

Reviewer #4: The revision is fine.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

Reviewer #4: No

**********

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix. The contents of the GLFS-25 questionnaire.

    (DOCX)

    pdig.0000668.s001.docx (14.2KB, docx)
    S2 Appendix. The definition of sensitivity, specificity, positive and negative predictive value, and accuracy.

    (DOCX)

    pdig.0000668.s002.docx (88.8KB, docx)
    S3 Appendix. Detailed characteristics of the participants in the model creation group.

    (DOCX)

    pdig.0000668.s003.docx (20.1KB, docx)
    S4 Appendix. Detailed characteristics of the participants in the external validation group.

    (DOCX)

    pdig.0000668.s004.docx (20.1KB, docx)
    Attachment

    Submitted filename: response_to_the_reviewers-v23.docx

    pdig.0000668.s005.docx (37.2KB, docx)

    Data Availability Statement

    The datasets used and/or analyzed during the current study, including patients' walking videos, are available upon reasonable request. Participant consent forms specified that only study investigators could access the datasets. Additionally, the videos have elements of personal identification, such as face and gait patterns. Individuals seeking access to the datasets will need the approval of the Ethics Review Board at Kishiwada City Hospital (e-mail: kch@kishiwada-hospital.com, phone: +81-724-45-1000, URL: https://www.kishiwada-hospital.com).


    Articles from PLOS Digital Health are provided here courtesy of PLOS
