Deep learning-based spinal canal segmentation of computed tomography image for disease diagnosis: A proposed system for spinal stenosis diagnosis

Zhiyi Zhou; Shenjun Wang; Shujun Zhang; Xiang Pan; Haoxia Yang; Yin Zhuang; Zhengfeng Lu

doi:10.1097/MD.0000000000037943

2024 May 3;103(18):e37943.

PMCID: PMC11062721 PMID: 38701305

Abstract

Background:

Lumbar disc herniation was regarded as an age-related degenerative disease. Nevertheless, emerging reports highlight a discernible shift, illustrating the prevalence of these conditions among younger individuals.

Methods:

This study introduces a novel deep learning methodology tailored for spinal canal segmentation and disease diagnosis, emphasizing image processing techniques that delve into essential image attributes such as gray levels, texture, and statistical structures to refine segmentation accuracy.

Results:

Analysis reveals a progressive increase in the size of vertebrae and intervertebral discs from the cervical to lumbar regions. Vertebrae, bearing weight and safeguarding the spinal cord and nerves, are interconnected by intervertebral discs, resilient structures that counteract spinal pressure. Experimental findings demonstrate a lack of pronounced anteroposterior bending during flexion and extension, maintaining displacement and rotation angles consistently approximating zero. This consistency maintains uniform anterior and posterior vertebrae heights, coupled with parallel intervertebral disc heights, aligning with theoretical expectations.

Conclusions:

Segmentation accuracy was assessed with 2 metrics, intersection over union (IoU) and the Dice coefficient; the average IoU was 88% and the average Dice score was 96.4%. The proposed deep learning-based system showcases promising results in spinal canal segmentation, laying a foundation for precise stenosis diagnosis in computed tomography images. This contributes significantly to advancements in spinal pathology understanding and treatment.

Keywords: CT image, deep learning, image segmentation, lumbar spine, spinal canal

1. Introduction

Medical image segmentation, a crucial aspect of image analysis, distinguishes itself from image recognition by operating at the pixel level.^[1] While medical image recognition classifies images at an overall level, segmentation delves into precise pixel-level classification.^[2] Computing the probability that each pixel belongs to the foreground or the background yields the segmentation area, that is, the pixel set of the foreground area.^[3] This research stems from the necessity to address gaps in current studies, particularly within the less-explored domains of cervical and lumbar disease diagnosis.

The intricate biomechanical stability system responsible for sustaining the natural curvature of the human spine comprises 2 fundamental components.^[4–6] The endogenous stabilization system involves the vertebral body, appendages, lumbar spinal canal, and connecting ligaments to maintain static balance. The outside of the spinal canal is a bony structure whose gray values are distinct and evenly distributed; the inside, however, is soft tissue composed of the spinal fluid and capsule. The gray values of these soft tissues are less distinct than those of bone, and their distribution across different tissues is relatively scattered. Therefore, the spinal canal region cannot be isolated by simple thresholding. Traditional image segmentation algorithms, such as region growing, active contour, and level set methods, rely on image features and use the grayscale information of the target area to locate its boundary. These methods face limitations, particularly when dealing with the narrowed spinal canal area in lumbar spine CT images.^[7] To address these challenges, this research employs deep learning models for vertebral canal segmentation. By harnessing the feature extraction capability of deep learning, this methodology recasts the segmentation problem as a pixel classification task, markedly improving efficiency and accuracy compared with conventional techniques. Deep learning models, characterized by multilayered neural networks, have a greater capacity for expression and feature learning, and their layered structure loosely resembles that of the human cerebral cortex.

In recent years, advancements in medical technology have revolutionized treatment approaches, leading to groundbreaking innovations.^[8] Deep learning technology plays an important role in image classification owing to its powerful feature extraction ability.^[9] Consequently, an image segmentation method based on deep learning transforms the image segmentation problem into a pixel classification problem, which has better performance in terms of segmentation efficiency and accuracy, compared with traditional methods.^[10] Additionally, individualized surgical techniques, exemplified by the design of thoracodorsal artery perforator chimeric flap for customized reconstruction, have significantly improved patient outcomes in complex 3-dimensional defect cases.^[11] By integrating dual-frequency composite fringe projection with deep learning methodologies, researchers have unlocked the potential for rapid and precise 3D shape measurement, revolutionizing fields such as manufacturing, quality control, and biomedical imaging.^[12] The development of a dedicated CNN model for liver segmentation signifies a pivotal advancement in medical imaging technology, with the potential to enhance diagnostic accuracy, optimize treatment strategies, and ultimately improve patient outcomes.^[13] A typical deep learning model is a deep artificial neural network. A simple method for increasing the complexity of a neural network model is to increase the number of hidden layers. Therefore, the deep learning model is usually an artificial neural network with multiple hidden layers that uses external images, sounds, video, text, and other data as input, and forms abstract high-level features or attribute categories by combining low-level features to better help machines understand and interpret data. 
The benefit of deep learning is that machines can replace the manual feature design of human experts with unsupervised or semisupervised feature learning to efficiently generate good features from input sample data.^[14] Moreover, research exploring the associations between carotid atherosclerotic plaque characteristics and cognitive improvement postsurgery sheds light on the intricate interplay between vascular health and neurological function.^[15] Through the application of machine learning approaches, a novel study aims to identify risk factors associated with postoperative infection following mitral valve surgery, a critical area in cardiac surgery.^[12] Relative to a shallow neural network, deep learning uses a deep neural network with multiple hidden layers, which can better simulate the structure of the human cerebral cortex, process the input data layer by layer, and use each layer of the network to extract a different level of features, which helps the machine obtain more hidden information. To further establish the soundness of our study, we will also integrate findings from studies such as the investigation into anatomical characteristics affecting surgical approaches in lumbar fusion procedures and the image-based visualization of stents in mechanical thrombectomy for acute ischemic stroke cases.^[16,17]

Overall, compared with artificial neural networks with a single hidden layer, deep learning has stronger expressive and feature-learning abilities. In this study, we applied a series of morphological operations to the output of a neural network for vertebral canal segmentation. This eliminates the influence of oversegmentation so that the results are closer to the real vertebral canal region. In addition, principal component analysis was used to correct tilted vertebral canal images so that the canal appears symmetrical, which allows the anterior-posterior diameter of the vertebral canal to be measured. The measured anterior-posterior diameter had an average error of 0.57 mm, which falls within the allowable error range of clinical diagnosis.
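As a rough illustration of the tilt correction described above, the sketch below estimates the principal axis of a set of foreground pixel coordinates from their 2 × 2 covariance matrix and rotates the coordinates so that this axis is horizontal. The function names, the toy rectangle, the pixel spacing, and the choice to measure the extent along the minor axis are all assumptions made for illustration; this is not the authors' implementation.

```python
import math

def centroid(points):
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n

def principal_axis_angle(points):
    """Angle (radians) of the first principal component of 2D points."""
    mx, my = centroid(points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form orientation of the dominant eigenvector of [[sxx, sxy], [sxy, syy]]
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def rotate_about_centroid(points, angle):
    """Rotate points by -angle about their centroid (undoes a tilt of +angle)."""
    mx, my = centroid(points)
    c, s = math.cos(-angle), math.sin(-angle)
    return [((x - mx) * c - (y - my) * s + mx,
             (x - mx) * s + (y - my) * c + my) for x, y in points]

def minor_axis_extent(points, pixel_spacing_mm=1.0):
    """Extent along the vertical axis after correction (hypothetical diameter proxy)."""
    ys = [y for _, y in points]
    return (max(ys) - min(ys)) * pixel_spacing_mm

# Toy "canal": a 21 x 7 pixel rectangle tilted by 20 degrees
tilt = math.radians(20)
rect = [(x, y) for x in range(21) for y in range(7)]
tilted = [(x * math.cos(tilt) - y * math.sin(tilt),
           x * math.sin(tilt) + y * math.cos(tilt)) for x, y in rect]

angle = principal_axis_angle(tilted)
corrected = rotate_about_centroid(tilted, angle)
extent = minor_axis_extent(corrected)
```

On real masks the pixel spacing would come from the CT metadata, and the measured axis would need to match the clinical definition of the diameter.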

The subsequent chapters of this paper are meticulously structured to delve into specific aspects of the research. Chapter 1 explores related scholars’ findings on deep learning and cervical and lumbar image segmentation. Chapter 2 provides a brief overview of the technologies and fundamental principles of deep learning, laying the groundwork for subsequent analyses. Chapter 3 conducts technical analyses based on the lumbar spine’s movement laws, setting the stage for the application of neural networks in Chapter 4 to segment spinal canal diagnostic images. Chapter 5 extends the research’s applicability by applying model training parameters to lumbar and cervical spine image segmentation, followed by simulation analysis. The ensuing chapters undertake error analysis through experiments in Chapter 6, culminating in Chapter 7, a concise summary encapsulating the comprehensive findings of this study.

The innovation of this paper: while deep learning has shown success in various medical image tasks, this paper focuses on an understudied area – cervical and lumbar disease diagnosis. This novel application addresses a gap in existing research and contributes to the understanding of deep learning’s potential in diagnosing specific spinal conditions. The paper integrates biomechanical stability concepts into medical image segmentation. By considering both endogenous and exogenous stabilization systems, it provides a holistic perspective on spinal health. This integration enhances the accuracy of segmentation by aligning it with the physiological curvature of the spine. The paper addresses the challenges posed by lumbar spinal stenosis, a condition with intricate soft tissue structures inside the spinal canal. Traditional segmentation methods struggle with such complexities, but the deep learning model proposed in this paper proves effective in overcoming these challenges. To refine segmentation results, the paper introduces a series of morphological operations. Additionally, it innovatively applies principal component analysis to correct tilted vertebral canal images, ensuring symmetry. This correction enhances the accuracy of measuring anterior and posterior diameters, meeting clinical diagnosis standards. Recognizing the limitation of small medical image datasets, the paper introduces a novel approach – data augmentation. By rotating, translating, and stretching local training samples, the paper enhances the dataset for better deep learning model training. Furthermore, the proposed Shallow U-Net design improves experimental results with a small number of training samples, offering an efficient end-to-end solution for lumbar and cervical spine image segmentation. The paper’s structured chapter organization ensures a systematic presentation of the research. 
From introducing related work to error analysis, each chapter contributes to building a comprehensive understanding of the proposed deep learning model for spinal image segmentation.
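The rotation, translation, and stretching augmentation mentioned above can be sketched as a single affine warp with nearest-neighbour sampling. The function name, parameter names, and zero fill for out-of-range pixels are assumptions for illustration; the paper does not state how the transforms were implemented or which parameter ranges were used.

```python
import math

def affine_augment(image, angle_deg=0.0, shift=(0, 0), scale=(1.0, 1.0)):
    """Rotate, translate, and stretch a 2D image (list of rows) about its centre.

    shift is (rows, cols); scale is (row_scale, col_scale).
    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    outside the source image are set to 0.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Undo the translation, then the rotation, then the stretch
            y = r - cy - shift[0]
            x = c - cx - shift[1]
            xr = cos_a * x + sin_a * y
            yr = -sin_a * x + cos_a * y
            src_r = int(round(yr / scale[0] + cy))
            src_c = int(round(xr / scale[1] + cx))
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out

sample = [[0, 1, 0],
          [0, 0, 0],
          [0, 0, 0]]
shifted = affine_augment(sample, shift=(1, 0))   # move content one row down
rotated = affine_augment(sample, angle_deg=90)   # quarter turn about the centre
```

For segmentation, the same transform would be applied to the image and its label mask so that the two stay aligned.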

In summary, this paper presents a novel approach to medical image segmentation in the context of cervical and lumbar diseases. It combines biomechanical principles, addresses challenges specific to spinal conditions, and introduces innovative techniques such as morphological operations, symmetry correction, data augmentation, and the Shallow U-Net model. These contributions collectively advance the field and provide a valuable foundation for future research in spinal image analysis.

2. Related work

Deep learning has been one of the fastest-growing branches of artificial intelligence in recent years. It is devoted to studying how to simulate or realize human learning behavior through computing. With the advent of big data and cloud computing, the significant increase in computing power has effectively solved the problem of inefficient training. The accumulation of massive amounts of data supports model training and reduces the risk of overfitting, and deep learning has become a hot topic in medical research.^[18]

Song et al^[19] applied deep learning to functional magnetic resonance imaging data and, for the first time, diagnosed attention deficit hyperactivity disorder with good results. Monie et al^[5] used a deep learning network to analyze electroencephalogram (EEG) data from patients with Alzheimer disease and achieved an accuracy of 97% after combining incremental learning. Huo et al^[20] applied a deep belief network to EEG data on emotion classification and analyzed positive and negative emotions. Bhagat et al^[21] used deep learning to classify focal lesions of the liver to help clinicians distinguish between benign and malignant lesions. Dantas et al^[22] proposed a new subset-based method that learns hierarchical features from RGB-D images and significantly improves the recognition accuracy of these images. Gong et al^[23] solved the problem of efficient training of convolutional deep belief networks by learning the weights in the frequency domain, eliminating the time-consuming convolutional computation and speeding up performance on 2-dimensional and 3-dimensional (3D) medical images. The training process of convolutional deep belief networks opens up new directions for deep learning in medical image analysis. Yao et al^[24] introduced particle filtering into spine tracking. The observation model selects the spinal contour as a feature. After the posterior probability distribution of the lumbar spine is calculated, the anatomical relationship between adjacent vertebrae is encoded in a dynamic Bayesian network, which transfers this distribution to adjacent vertebrae to obtain a restricted posterior probability distribution so that the tracking result can be obtained for all 3 connected vertebrae. 
Because this method considers only the selected contour as a feature, the tracking results may be affected when the image is blurred.^[25] Chen et al^[26] conducted a comprehensive study on a deep learning-based computer-aided diagnosis system using stacked denoising autoencoders for the differential diagnosis of benign and malignant nodules and lesions, demonstrating that deep learning techniques can aid in disease diagnosis without the need for explicit design and selection of problem-oriented features. Subsequently, convolutional neural networks achieved high accuracy in the detection of human joints, auxiliary diagnosis of lung cancer, and detection of lymphocyte micronuclei.^[27] Carneiro and Nascimento^[28] proposed a new statistical pattern recognition method to solve the tracking problem of left ventricular endocardium in ultrasonography data; the new motion model integrates systolic and diastolic motion patterns with deep neural networks. In combination with the established observational distributions, more accurate results of endocardial tracking were obtained.^[29] Lee et al^[30] combined the advantages of 2 classifiers, a convolutional neural network and a random forest, to segment retinal blood vessels. Three-dimensional deep convolutional neural networks were then gradually developed, with 2-dimensional computed tomography (CT) image data converted into 3D volume data as input, and major advances have been made in the discrimination of infant brain tumors and lung nodules, knee cartilage segmentation, etc. Sekuboyina et al^[31] used SVM as the feature classifier and adopted the snake active contour model to describe the vertebral target, and its dynamic model used a Kalman filter. However, when the DVF image sequence is relatively blurred, using only the SVM results in weak discrimination of features; therefore, the target is more easily lost during spine tracking.

In our deep learning model, the configuration of key parameters is crucial for the performance and accuracy of the model. Here is a detailed description of these parameters:

Learning rate. We chose a moderate initial learning rate of 0.001, with a strategy of gradual decay. Adjusting the learning rate is vital for maintaining the convergence speed and accuracy of the model. The learning rate is halved every 5 training epochs to ensure finer model weight adjustments in the later stages of training.

Number of iterations (Epochs). The model is set to train for 50 epochs. This number of cycles is sufficient for the model to learn from the training data and achieve convergence while avoiding overfitting.

Optimizer type. We used the Adam optimizer because it combines the advantages of AdaGrad and RMSProp optimizers. The Adam optimizer automatically adjusts the learning rate according to different parameters of the model, which improves the efficiency and stability of model training.
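The learning rate schedule described above (initial rate 0.001, halved every 5 epochs over 50 epochs) can be written out as a minimal sketch. The function name and the 0-based epoch indexing are assumptions; the text does not say whether the first halving takes effect at the 5th or 6th epoch.

```python
def step_decay_lr(epoch, initial_lr=0.001, drop_factor=0.5, epochs_per_drop=5):
    """Learning rate at a 0-based epoch index under step decay."""
    return initial_lr * drop_factor ** (epoch // epochs_per_drop)

# Learning rate used in each of the 50 training epochs
schedule = [step_decay_lr(epoch) for epoch in range(50)]
```

Under these assumptions the rate falls from 0.001 to 0.001 × 0.5^9 (about 2 × 10^-6) in the final 5 epochs. In PyTorch, a comparable schedule would be `torch.optim.lr_scheduler.StepLR` with `step_size=5` and `gamma=0.5`, although the paper does not name the framework used.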

2.1. Detailed configuration of network architecture

Input layer. The configuration of the input layer should match the feature dimensions of the dataset, based on the characteristics of the data.

Hidden layers. We employed 3 fully connected layers containing 128, 64, and 32 neurons, respectively. These layers are equipped with ReLU activation functions to add nonlinearity to the network.

Output layer. The number of neurons in the output layer matches the number of prediction target categories, using the Softmax activation function to output class probabilities.

Regularization. To prevent overfitting, Dropout layers are added after each fully connected layer, with a dropout rate set at 0.5.

The configuration of these parameters was determined after multiple experiments and adjustments, aiming to achieve the best training effect and prediction accuracy while maintaining the complexity of the model.
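To make the configuration above concrete, the following sketch runs a forward pass through 3 fully connected layers of 128, 64, and 32 neurons with ReLU, dropout at rate 0.5 during training, and a Softmax output. The input dimension, the number of classes, the weight initialization, and all function names are assumptions chosen for illustration, since the text does not specify them.

```python
import math
import random

HIDDEN_SIZES = [128, 64, 32]  # hidden layer widths from the text
DROPOUT_RATE = 0.5            # dropout rate from the text

def init_layers(n_inputs, n_classes, seed=0):
    """Random weights for input -> 128 -> 64 -> 32 -> n_classes."""
    rng = random.Random(seed)
    sizes = [n_inputs] + HIDDEN_SIZES + [n_classes]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / fan_in)  # He-style scaling (an assumption)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers, training=False, rng=None):
    """Return class probabilities for one input vector."""
    rng = rng or random.Random(0)
    h = list(x)
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if idx == len(layers) - 1:
            return softmax(z)                 # output layer
        h = [max(0.0, v) for v in z]          # ReLU
        if training:                          # inverted dropout
            keep = 1.0 - DROPOUT_RATE
            h = [v / keep if rng.random() < keep else 0.0 for v in h]

# Hypothetical sizes: 16 input features, 2 target classes
layers = init_layers(n_inputs=16, n_classes=2)
probs = forward([0.1] * 16, layers)
```

Dropout is applied only when `training=True`; at inference the full network is used, which is the usual convention.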

3. Materials and methods

3.1. Artificial neural network

Ethical approval was not required for this study as it did not involve any experiments with human or animal subjects performed by any of the authors. The artificial neural network is based on modern neuroscience research. After understanding the response mechanism of the human brain to various external stimuli, we attempt to process information by simulating the method of memory formation by the brain’s neural network and establishing a corresponding mathematical model. This model has the characteristics of self-adaptation, self-organization, and real-time learning and has solved many practical problems that are difficult to deal with using traditional methods in the fields of biology, medicine, economy, etc, and exhibits good intelligence.^[32] The artificial neural network comprises nodes composed of neurons. The output of each neuron is related to its corresponding calculation function called the activation function. Signals are transmitted between 2 neurons through connections, and the transmitted signals are weighted. The weights indicate the strength of the signal relative to the final result, which is equivalent to the memory strength of the neural network, simulating the information transmission in the human brain. Many independent neurons are connected and the output of 1 neuron is used as the input for the next neuron to form a simple neural network, as shown in Figure 1.

The activation function of the artificial neural network is nonlinear; the Sigmoid, Tanh, and ReLU functions are commonly used. The network parameters are updated iteratively until the network converges. The R-CNN includes an input layer, a convolution layer, a pooling layer, a fully connected layer, a classifier, and an output layer, among which the convolution and pooling layers alternately extract hidden features of the original image through convolution and pooling operations, respectively. Finally, the local information is synthesized through the fully connected layer to obtain global information, and the images are identified and classified by the classifier.^[33]
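The weighted-sum-plus-activation behaviour of a single neuron described above can be illustrated with a short sketch. The function names and example values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def relu(z):
    return max(0.0, z)

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# The same weighted sum (here exactly 0) under the three activations
inputs, weights = [1.0, 2.0], [0.5, -0.25]
outputs = {f.__name__: neuron_output(inputs, weights, 0.0, f)
           for f in (sigmoid, tanh, relu)}
```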

One characteristic of an artificial neural network is that each layer captures complex, higher-order relationships among the hidden nodes in the layer below it. Neural networks are of interest for several reasons. First, similar to deep belief networks, neural networks have the potential to learn increasingly complex internal representations, which makes them a promising approach for solving problems such as object and speech recognition. Second, deep representations can be learned from a large supply of unlabeled sensory input data and a limited amount of labeled data, and can then be fine-tuned only slightly for a specific task. Third, unlike the deep belief network, the approximate inference process of the neural network, in addition to the initial bottom-up pass, can incorporate top-down feedback, allowing the neural network to deal with uncertainty and more complex input elements.^[34,35] The image features extracted by convolution cannot be used directly in the classifier for feature classification because of excessive computation. Therefore, the convolution layer is generally followed by the pooling layer, and the dimensions and resolutions of the image features are reduced by the pooling operation. The pooling layer takes the convolved image features as input, divides them into fixed-size areas, and uses a summary feature of each local area to characterize it.
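As a small illustration of the pooling step described above, the sketch below divides a feature map into non-overlapping fixed-size areas and keeps the maximum of each. Max pooling and the 2 × 2 window are assumptions here; the paragraph itself does not say which summary statistic is used.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling over size x size areas of a 2D list."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + dr][c + dc]
                 for dr in range(size) for dc in range(size))
             for c in range(0, w - size + 1, size)]
            for r in range(0, h - size + 1, size)]

features = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
pooled = max_pool2d(features)  # 4 x 4 input becomes 2 x 2
```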

In a neural network with 2 hidden layers and no connections between nodes within the same layer, the energy function of a given state is defined in Equation (1):

\begin{array}{l} E (v, s^{1}, s^{2}, θ) = - v^{\top} W^{1} s^{1} - (s^{1})^{\top} W^{2} s^{2} \end{array}

(1)

$θ = \{W^{1}, W^{2}\}$ represents the symmetric interaction weight matrices between the visible layer and the first hidden layer and between the 2 hidden layers.

The probability assigned by the model to a visible node vector is given by Equation (2):

\begin{array}{l} y (v, θ) = \frac{1}{z (θ)} \sum_{s^{1}, s^{2}} \exp (- E (v, s^{1}, s^{2}, θ)) \end{array}

(2)

The conditional distributions over the 2 sets of hidden nodes and the set of visible nodes are given by Equations (3)–(5).

\begin{array}{l} y (s_{j}^{1} = 1 | v, s^{2}) = σ (\sum_{i} W_{i j}^{1} v_{i} + \sum_{m} W_{j m}^{2} s_{m}^{2}) \end{array}

(3)

\begin{array}{l} y (s_{m}^{2} = 1 | s^{1}) = σ (\sum_{j} W_{j m}^{2} s_{j}^{1}) \end{array}

(4)

\begin{array}{l} y (v_{i} = 1 | s^{1}) = σ (\sum_{j} W_{i j}^{1} s_{j}^{1}) \end{array}

(5)

where $y$ denotes a probability, $W^{1}$ and $W^{2}$ are the weight matrices, $σ$ is the logistic sigmoid function, $s_{j}^{1}$ and $s_{m}^{2}$ are the states of the hidden nodes in the first and second hidden layers, and $v_{i}$ is the state of visible node $i$.
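The following sketch evaluates the energy in Equation (1) and the sigmoid-form conditionals that this energy implies for binary nodes, with no bias terms. The function names, matrix layout (`W1[i][j]` links visible node i to first-layer hidden node j, `W2[j][m]` links first-layer node j to second-layer node m), and the toy weights are assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def energy(v, s1, s2, W1, W2):
    """E = - v^T W1 s1 - s1^T W2 s2, following Equation (1)."""
    e = 0.0
    for i, vi in enumerate(v):
        for j, sj in enumerate(s1):
            e -= vi * W1[i][j] * sj
    for j, sj in enumerate(s1):
        for m, sm in enumerate(s2):
            e -= sj * W2[j][m] * sm
    return e

def p_s1_on(v, s2, W1, W2, j):
    """P(s1_j = 1 | v, s2): input from the layers below and above."""
    below = sum(W1[i][j] * v[i] for i in range(len(v)))
    above = sum(W2[j][m] * s2[m] for m in range(len(s2)))
    return sigmoid(below + above)

def p_s2_on(s1, W2, m):
    """P(s2_m = 1 | s1)."""
    return sigmoid(sum(W2[j][m] * s1[j] for j in range(len(s1))))

def p_v_on(s1, W1, i):
    """P(v_i = 1 | s1)."""
    return sigmoid(sum(W1[i][j] * s1[j] for j in range(len(s1))))

# Toy model: 2 visible nodes, 2 first-layer and 1 second-layer hidden nodes
W1 = [[0.5, -1.0], [0.3, 0.8]]
W2 = [[0.2], [-0.7]]
v, s2 = [1, 0], [1]
e = energy(v, [1, 1], s2, W1, W2)
p = p_s1_on(v, s2, W1, W2, 0)
```

Because there are no connections within a layer, each conditional factorizes over nodes, which is what makes layer-wise Gibbs sampling practical; the sigmoid form can be checked against brute-force enumeration of exp(−E) on a model this small.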

For approximate maximum likelihood estimation learning, a general stochastic process, such as the one above, can still be applied to the learning process; however, this algorithm is very slow, especially when the hidden nodes are very far from the visible nodes.

3.2. Markov chain

Because the neural network adopts a stochastic model, the sampling problem must be solved. Unbiased samples can be drawn efficiently from standard distributions, such as uniform and normal distributions. However, when the probability distribution is not a standard one, drawing samples from it is the core problem addressed by the Markov Chain Monte Carlo (MCMC) method.

The idea behind the MCMC method is to sample from a probability distribution by constructing a Markov chain that generates the samples. Let the random variable $X_{t}$ denote the state at time $t$, taking values in a state space $\{s_{1}, s_{2}, \ldots\}$. If the transition probability to the next state depends only on the current state and not on earlier states, the random variable is called a Markov process, as shown in Equation (6):

\begin{array}{l} Pr (X_{t + 1} = s_{j} | X_{0} = s_{k}, \ldots, X_{t} = s_{i}) = Pr (X_{t + 1} = s_{j} | X_{t} = s_{i}), i, j, k \in N \end{array}

(6)

If the state probabilities of the random variables at different times are written in the form of vectors, it results in Equation (7):

\begin{array}{l} Q^{(t)} = (Q_{1}^{(t)}, Q_{2}^{(t)}, \ldots, Q_{n}^{(t)}) \end{array}

(7)

The probability that the random variable takes the value $s_{i}$ at the next moment is obtained by summing, over all current states, the transition probability weighted by the current state probability, as shown in Equation (8):

\begin{array}{l} Q_{i}^{(t + 1)} = \sum_{k} Pr (X_{t + 1} = s_{i} | X_{t} = s_{k}) Q_{k}^{(t)} \end{array}

(8)

If a state can be revisited only after a fixed number of transitions or its multiples, the Markov process is periodic; if every state can be reached from every other state with nonzero probability, the Markov process is irreducible; if the Markov process is both aperiodic and irreducible, it is ergodic. When the training data for deep learning are sampled by MCMC, the data state at the next moment depends only on the current state. When MCMC is used to approximate the mean of the model distribution, the samples generated by 1 approximation can be used as the initial samples for the next approximation, which is highly effective. Occasionally, only one state transition is required for the updated state to fit the model distribution well.
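A minimal sketch of how the state probability vector evolves under a fixed transition matrix is given below. Each next-state probability is the sum, over current states, of the transition probability times the current state probability. The 2-state transition matrix and the function names are illustrative assumptions.

```python
def step(q, P):
    """One update of the state probabilities: q_next[i] = sum_k P[k][i] * q[k].

    P[k][i] is the probability of moving from state k to state i.
    """
    n = len(q)
    return [sum(P[k][i] * q[k] for k in range(n)) for i in range(n)]

def run_chain(q0, P, n_steps):
    q = list(q0)
    for _ in range(n_steps):
        q = step(q, P)
    return q

# An irreducible, aperiodic (ergodic) 2-state chain
P = [[0.9, 0.1],
     [0.5, 0.5]]
from_state_0 = run_chain([1.0, 0.0], P, 100)
from_state_1 = run_chain([0.0, 1.0], P, 100)
```

For this ergodic chain both starting points converge to the same stationary distribution (5/6, 1/6), which is the property MCMC relies on when it treats later samples as draws from the target distribution.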

The implementation of the proposed deep learning-based spinal canal segmentation and disease diagnosis system poses substantial challenges. Foremost among these challenges is the acquisition and curation of a comprehensive and diverse dataset representative of various demographics and disease stages. The quality and quantity of training data play a pivotal role in the model’s generalization capabilities, and the availability of such data can pose limitations. Additionally, the sensitivity of the model to variations in imaging conditions, such as different devices and scanning parameters, presents a notable challenge. Ensuring robustness and adaptability across diverse imaging sources requires thorough investigation. The computational resources required for training and inference must also be considered, as deep learning models can demand substantial computing power. Model interpretability remains a persistent challenge due to the inherent complexity of deep learning architectures, necessitating efforts to enhance explainability for meaningful clinical adoption. The ethical implications surrounding patient privacy, data security, and potential biases in predictions require careful consideration. Lastly, integrating the system into existing clinical workflows poses practical challenges, necessitating user-friendly interfaces and thorough validation in real-world healthcare settings. Addressing these implementation challenges is essential for the successful deployment and acceptance of the proposed system in clinical practice.

4. Analysis of lumbar motion law based on deep learning model

4.1. Spine imaging

The spine is a deformable column composed of the vertebrae and intervertebral discs spaced along the vertebral curve, with a cavity, containing the spinal cord, running through its entire length. Juveniles have 32 to 34 vertebrae, which are divided into 5 groups comprising 7 cervical vertebrae (C1–C7), 12 thoracic vertebrae (T1–T12), 5 lumbar vertebrae (L1–L5), 5 sacral vertebrae (S1–S5), and 3–5 coccygeal vertebrae, as shown in Figure 2.^[34]

Figure 2. Left side view of human spine.

The 5 sacral vertebrae fuse into a single sacrum after puberty, and the coccygeal vertebrae fuse into the coccyx between the ages of 30 and 40 years. The soft tissue between 2 adjacent vertebrae is called the intervertebral disc. There are 23 intervertebral discs in adults, distributed at the vertebral junctions between the cervical vertebrae and the sacrum. The spine is located in the middle and rear of the trunk and plays a major role in support and protection. The spinal canal lies behind the vertebral body. It is a bony channel through which the spinal nerves run. The spinal nerves merge to form nerve roots that control the sensory and motor functions of the trunk and limbs.

The vertebrae and discs gradually increase in size from the neck to the waist. The vertebrae are primarily responsible for bearing weight and protecting the spinal cord and nerve endings. Adjacent vertebrae are connected through the intervertebral discs. The intervertebral disc can resist strong pressure on the spine and acts as a buffer between the 2 vertebrae to reduce vertebral wear and tear. The division of labor between the vertebrae and intervertebral discs enables limited relative movement between adjacent vertebrae, which in turn supports the movement of the entire spine at any angle during various activities. When the spine performs flexion and extension movements, the compressed side of the intervertebral disc becomes thinner and the opposite side thickens, causing the nucleus pulposus to move slightly toward the thicker side. The nucleus pulposus returns to its original shape when the spine returns to a neutral position. The thickness of the entire lumbar intervertebral disc is 8 to 10 mm.^[35]

The structures described above are the main targets of spine imaging. The images are acquired in the sagittal, coronal, and axial planes; the sagittal plane is the section seen from the side, the coronal plane is the section seen from the front, and the axial plane is the section seen from above. Because the spine contains the most information when imaged in the sagittal plane, most research in the field of spine medical image processing uses sagittal images.

4.2. Ways of spine movement

The moving segment of the spine, which is composed of 2 adjacent vertebral bodies and the soft tissue between them, is the smallest unit of spinal motion analysis. The motion of the spinal segment contains 6 degrees of freedom, which always accompanies the entire process. Spinal movement is the physical response to multiple moving segments comprising the vertebrae, intervertebral discs, and facet joints, under the coordinated action of nerves and muscles.

Sagittal flexion and extension are the most common movements of the spine. In a normal spine, the bending angle of each motion segment is primarily determined by the positioning limitations of the facet joints and spinous processes. Segmental motion differs across regions. The rib cage and sternum limit the movement of the thoracic spine; therefore, its flexion and extension angles are smaller than those of other regions. During sagittal flexion without load, the first 50° to 60° of spinal flexion occurs in the lumbar spine, mainly in the lower lumbar segments.^[5] Anterior pelvic tilting further increases the degree of flexion.^[35]

Lateral flexion occurs mainly in the lumbar or thoracic spine. The range of motion of each thoracic and lumbar segment in healthy individuals is about 7° to 10°, with a few degrees of variation. As with sagittal flexion and extension, the range of motion of the thoracic spine is limited by the rib cage and sternum. During lateral flexion, the wedge-shaped space of the lumbar intervertebral articular surfaces changes. The erector spinae and abdominal muscles on the ipsilateral side and the erector spinae on the opposite side contract to produce the lateral flexion movement.

Significant axial rotation occurs in the thoracic spine, but in the lumbar spine, the vertical orientation of the facet joints limits such movement. In the thoracic portion, rotation is often accompanied by lateral flexion. The vertebral bodies generally rotate toward the concave surface of the scoliosis curve. This accompanying movement also occurs in the lumbar spine, where the vertebral body turns toward the convex side of the curve. During axial rotation, the back and abdominal muscles act on both sides of the spine, and axial rotation occurs with the synergy of the ipsilateral and contralateral muscles.

4.3. Spine image intervertebral disc localization

In the deep learning localization network proposed in this study, Faster R-CNN is applied to the input cervical spine CT image to detect and locate the spinal canal region. A 5-layer ZF model was used to extract features from the original image. In the first stage, Faster R-CNN pads every convolution operation; that is, a one-pixel border is added around the image so that its size changes from M × N to (M + 2) × (N + 2). After the 3 × 3 convolution kernel is applied, the output is again M × N. This setting prevents the convolution layer from changing the size between input and output, as shown in Figure 3, which depicts a 3 × 3 convolution kernel applied to the input data. Preserving the spatial dimensions maintains the structural integrity of the data while relevant features are extracted. The U-Net adopts an encoder–decoder structure. The encoder is composed of convolution and pooling layers: 3 × 3 convolutions followed by a 2 × 2 pooling operation. The convolution layers use the ReLU activation function and are not padded, so each convolution reduces the resolution of the feature map. The pooling layers use max pooling, which halves the length and width of the feature map. In addition, after each pooling, the number of convolution kernels, and hence the number of feature maps, doubles. Because the convolutions are not padded, the feature map of an upsampling layer is smaller than that of the corresponding part of the encoder; therefore, the encoder feature map must first be cropped.
Based on the parameters set in the network, 9 candidate boxes of different scales were considered at each image position. Each candidate box is described by a quadruple composed of the coordinates of the upper-left and lower-right corners of the rectangle. Each point uses these 9 candidate boxes as initial detection boxes; the probability of each belonging to the foreground or background is computed and sent to the subsequent softmax classifier to obtain the candidate regions in which the detection target may lie. Using the feature map of each candidate region, a fully connected layer and softmax calculate the category to which each candidate box belongs; the corresponding probability vector is the output, and the candidate boxes that meet the given probability threshold are retained. Once an accurate detection box is obtained, the area containing the spinal canal is cropped and its coordinates are recorded, so that the spinal canal can be segmented within this area and the segmentation result restored to the original spinal CT image.
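The 9 candidate boxes per position can be illustrated with a minimal sketch. The text does not state how the 9 boxes are parameterized; the sketch assumes the common Faster R-CNN default of 3 scales combined with 3 aspect ratios, and the function name `make_anchors` and its default values are illustrative assumptions rather than the settings used in this study.

```python
from math import sqrt


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return 9 candidate boxes (x1, y1, x2, y2) centred on (cx, cy).

    Each box is the quadruple described in the text: upper-left corner
    followed by lower-right corner. Scales and ratios are assumed values.
    """
    boxes = []
    for scale in scales:
        area = float(scale * scale)      # every box at this scale has this area
        for ratio in ratios:             # ratio = height / width
            width = sqrt(area / ratio)
            height = width * ratio
            boxes.append((cx - width / 2, cy - height / 2,
                          cx + width / 2, cy + height / 2))
    return boxes


if __name__ == "__main__":
    for box in make_anchors(300, 200):
        print(tuple(round(v, 1) for v in box))
```

In a full detector, each of these boxes would be scored as foreground or background and refined by regression; the sketch covers only the box generation step.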

Figure 3. — Feature extraction operation based on deep learning.
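The size arithmetic behind the padded and unpadded convolutions can be checked with a short sketch. It uses the standard output-size formula for a sliding window; the helper names `conv_output_size` and `center_crop_margin` are illustrative and do not come from the original implementation.

```python
def conv_output_size(size, kernel, padding=0, stride=1):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def center_crop_margin(encoder_size, decoder_size):
    """Pixels to trim from each side of an encoder map to match the decoder."""
    diff = encoder_size - decoder_size
    if diff < 0 or diff % 2:
        raise ValueError("sizes cannot be matched by a symmetric crop")
    return diff // 2


if __name__ == "__main__":
    m = 256
    # Padded 3 x 3 convolution: M -> (M + 2) -> M, so the size is preserved.
    print(conv_output_size(m, kernel=3, padding=1))
    # Unpadded 3 x 3 convolution loses one pixel on each side.
    unpadded = conv_output_size(m, kernel=3)
    print(unpadded)
    # 2 x 2 max pooling with stride 2 halves the length.
    print(conv_output_size(unpadded, kernel=2, stride=2))
```

With a one-pixel border a 256-pixel axis stays at 256, whereas an unpadded 3 × 3 convolution gives 254 and the following 2 × 2 pooling gives 127, which is why encoder feature maps have to be cropped before they can be matched with the smaller decoder maps.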

When the size of the input image is not constant, 2 common methods are to intercept only a part of the image or to scale the image to the required size before passing it to the network; both destroy the integrity of the image structure and the original shape information. Whether a convolutional neural network or a fully convolutional neural network is used, traditional image recognition and segmentation require massive datasets to train the network model and avoid overfitting. However, in medical image segmentation, it is often difficult to obtain large labeled datasets. To address the shortage of labeled data in spinal canal segmentation and achieve pixel-level image segmentation, this study used a fully convolutional neural network together with data augmentation. Because the local area of the spinal canal is small, the U-Net design was reduced to a Shallow U-Net to perform spinal canal segmentation. To address poor object segmentation and the loss of edge detail in complex backgrounds, this study introduces a boundary-based image segmentation method. Edge detection is typically achieved by using a differential operator: convolving the template corresponding to the differential operator with the image detects edges where the gray level is discontinuous or changes abruptly. The detected edges then support the segmentation of the image.
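As a minimal sketch of edge detection with a differential operator, the following applies 3 × 3 templates to a small gray-level image. The text does not name the operator used in this study; the Sobel templates here are an assumption chosen for illustration, and the function names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal gray-level change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical gray-level change


def apply_template(image, template):
    """Slide a 3 x 3 template over a 2D list without padding.

    The template is not flipped (cross-correlation); for Sobel templates
    this changes only the sign of the response, not its magnitude.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    for y in range(rows - 2):
        for x in range(cols - 2):
            out[y][x] = sum(template[j][i] * image[y + j][x + i]
                            for j in range(3) for i in range(3))
    return out


def edge_magnitude(image):
    """Approximate gradient magnitude as |Gx| + |Gy| at each interior pixel."""
    gx = apply_template(image, SOBEL_X)
    gy = apply_template(image, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


if __name__ == "__main__":
    step = [[0, 0, 10, 10] for _ in range(4)]    # abrupt vertical edge
    print(edge_magnitude(step))
```

A large response marks an abrupt gray-level change; thresholding the response would give a binary edge map, with the threshold chosen for the data at hand.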

5. Diagnosis of spinal canal auxiliary diseases

5.1. Diagnostic criteria for spinal stenosis

In this paper, deep learning technology is used to accurately locate and segment the spinal canal region in CT images of the spine. Based on these results, a spinal stenosis diagnosis system is proposed in which the anterior–posterior and left–right diameters of the segmented spinal canal are measured. According to the diagnostic criteria commonly used in the medical field, a preliminary diagnosis is made to determine whether the patient has spinal stenosis. The diagnostic result is combined with the 3-dimensional reconstruction of the spinal canal and displayed intuitively to the clinician, who then confirms the diagnosis. This saves doctors considerable time and lets them focus more energy on the diagnosis and treatment of disease.

Spinal stenosis, a common orthopedic disease, refers to a decrease in the effective volume of the spinal canal as anterior flexion of the spinal canal increases, which shortens the diameters of the spinal canal and compresses the spinal cord and nerve roots. Because the segmented spinal canal image is a local area of the lumbar CT image and is not completely symmetrical, this paper uses principal component analysis (PCA) to correct the orientation of the segmented result, measures the rotated spinal canal to obtain the anterior–posterior diameter, and reconstructs the segmented result in 3 dimensions. According to the criteria for spinal canal stenosis, the measured result is mapped directly onto the 3-dimensional model, which helps clinicians make a preliminary diagnosis, saves considerable manpower and time, and improves diagnostic efficiency. Compared with manual measurements by clinicians, the average measurement error of the anterior–posterior diameter is 0.57 mm and that of the left–right diameter is 1.58 mm. The error of the anterior–posterior diameter, which is the decisive factor in diagnosis, is within the clinically permissible range, so the system can support a preliminary diagnosis of spinal stenosis. The main lesions of spinal stenosis are morphological changes. In diagnosis, the anteroposterior diameter, left–right diameter, and cross-sectional area of the spinal canal all play a decisive role. The commonly used diagnostic criteria differ by location; those for cervical and lumbar spinal stenosis are shown in Table 1. The anteroposterior diameter of the vertebral canal is the distance from the posterior edge of the vertebral body to the root of the spinous process.
An anteroposterior diameter of the cervical spinal canal greater than 13 mm is normal, as is an anteroposterior diameter of the lumbar spinal canal greater than 18 mm.
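The PCA-based correction and measurement can be sketched as follows for a binary 2D mask. This is an illustration under stated assumptions, not the authors' implementation: the function names, the `pixel_mm` spacing parameter, and the choice to report extents along the two principal axes are all hypothetical, and deciding which principal axis corresponds to the anterior–posterior direction is left to the caller.

```python
from math import atan2, cos, sin


def principal_axis_angle(points):
    """Angle in radians of the first principal axis of a set of 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form orientation of the leading eigenvector of the 2 x 2 covariance.
    return 0.5 * atan2(2 * sxy, sxx - syy)


def canal_diameters(mask, pixel_mm=1.0):
    """Return (extent along major axis, extent along minor axis) in mm."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    theta = principal_axis_angle(points)
    c, s = cos(theta), sin(theta)
    major = [x * c + y * s for x, y in points]     # coordinates after rotation
    minor = [-x * s + y * c for x, y in points]
    # One pixel is added so that an axis-aligned run of n pixels measures n;
    # for rotated masks this is only an approximation.
    return ((max(major) - min(major) + 1) * pixel_mm,
            (max(minor) - min(minor) + 1) * pixel_mm)


if __name__ == "__main__":
    rectangle = [[1] * 6 for _ in range(3)]        # 6 pixels wide, 3 pixels tall
    print(canal_diameters(rectangle, pixel_mm=0.5))
```

Rotating the pixel coordinates onto the principal axes removes the tilt of an oblique canal, so the two extents can be read off as axis-aligned widths.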

Table 1.

Diagnosis of cervical and lumbar spinal stenosis.

	Cervical spine	Lumbar spine
Anterior–posterior diameter of spinal canal	>13 mm, Normal	>18 mm, Normal
	10–13 mm, Narrower	15–18 mm, Narrower
	<10 mm, Narrow	<15 mm, Narrow

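The thresholds in Table 1 can be expressed as a small lookup. The sketch below is one possible encoding; the function name `classify_ap_diameter` and the dictionary layout are illustrative, and the boundary values (exactly 10, 13, 15, or 18 mm) are assigned to the middle category, following the ranges as printed in the table.

```python
# (lower bound, upper bound) of the "Narrower" range in millimetres, from Table 1.
THRESHOLDS_MM = {
    "cervical": (10.0, 13.0),
    "lumbar": (15.0, 18.0),
}


def classify_ap_diameter(diameter_mm, region):
    """Map an anterior-posterior diameter to the Table 1 category."""
    low, high = THRESHOLDS_MM[region]
    if diameter_mm > high:
        return "Normal"
    if diameter_mm < low:
        return "Narrow"
    return "Narrower"


if __name__ == "__main__":
    print(classify_ap_diameter(14.2, "cervical"))
    print(classify_ap_diameter(16.0, "lumbar"))
```

Such a rule gives only the preliminary label; as described in the text, the clinician confirms the diagnosis.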

The inner and outer diameters of the spinal canal are defined differently at different positions in CT images of the spine, and the area between the vertebrae is usually easy to identify. When the CT slice passes through the fat between the vertebrae, however, the gray level of the soft tissue inside the spinal canal differs little from that of the fat, and the two are usually difficult to distinguish. Once fat is included, the measured anterior–posterior diameter of the spinal canal changes substantially, which seriously affects the diagnosis of spinal stenosis. Across these various positions and forms, the Faster R-CNN and Shallow U-Net networks proposed in this paper can accurately locate the spinal canal and segment it from its local area.

In this study, we included additional statistical metrics, such as the Jaccard score, to evaluate the performance of our model more comprehensively. The Jaccard score, also known as the Intersection over Union (IoU), is a commonly used measure of segmentation accuracy based on overlapping areas: it compares the overlap between the predicted and actual regions with their combined area. By incorporating this metric, we can assess not only the accuracy of the model predictions but also the model's performance under various conditions (Fig. 4).
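For reference, the Jaccard score of a predicted region against a reference region can be written as below. The symbols $P$ (set of predicted spinal canal pixels) and $G$ (set of ground-truth pixels) are notation introduced here for illustration and do not appear in the original text.

```latex
J(P, G) = \frac{|P \cap G|}{|P \cup G|}, \qquad 0 \le J(P, G) \le 1
```

A score of 1 means the two regions coincide exactly, and a score of 0 means they do not overlap.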

Figure 4. — Jaccard scores of different models over epochs.

Furthermore, these additional statistical metrics not only enhance the reliability and depth of our study but are also of significant value for future research in this field. They serve as benchmarks for comparing and evaluating future models, providing a valuable reference point for researchers in the domain. Through such comprehensive evaluations, we can better understand and enhance the effectiveness and applicability of deep learning models in practical applications.

5.2. Segmentation neural networks

The spinal canal localization network Faster R-CNN and the spinal canal segmentation network Shallow U-Net were trained and tested. Faster R-CNN was implemented in Caffe, and its training ran for 1000 iterations in approximately 30 minutes. U-Net was implemented in Keras with a TensorFlow backend, and its training ran for 200 iterations in approximately 5 minutes. Faster R-CNN was trained on the ZF model proposed by Zeiler and Fergus, which comprises 5 convolutional layers and 3 max-pooling layers. The convolutional layers use one 7 × 7 kernel, one 5 × 5 kernel, and three 3 × 3 kernels; the stride of the 7 × 7 and 5 × 5 convolutional layers is 2, and the stride of the 3 × 3 convolutional layers is 1. The 3 max-pooling layers use 3 × 3 windows with a stride of 2. Through the region proposal network (RPN), Faster R-CNN generates 300 candidate regions that may contain a spinal canal, extracts the features of these 300 candidate regions, determines the candidate region with the highest probability of containing the spinal canal, and records the corresponding region coordinates. The RPN and Fast R-CNN components of Faster R-CNN were trained for 1000 iterations. The training parameter changes are shown in Figure 5.

Figure 5 shows that the algorithm can effectively identify the spinal canal whether its shape is regular or irregular, which indicates robustness to changes in canal shape. The canal is also located accurately whether it is completely closed or open on both sides, which shows flexibility in handling complex morphological structures. Even when the gray-level boundary formed by the vertebral bodies is not obvious, the spinal canal can still be located accurately, which suggests that the algorithm does not depend on gray-level information alone and may combine other image features. In each of these cases, the canal was located and its local area was cropped.

6. Simulation of lumbar and cervical vertebra segmentation image parameters

In this study, Shallow U-Net was used to perform end-to-end image segmentation on the local regions of the spinal canal detected by Faster R-CNN. Cervical and lumbar curvature disorders are caused by kinematic changes. For a rigid body to move in a plane, an instantaneous axis of rotation must exist. Normally, the George line is a smooth curve in flexion, in extension, and in intermediate states. A break in the George line may indicate some type of instability, including ligament laxity, fracture, dislocation, or degenerative joint disease. Normal and abnormal George lines are shown in Figure 6.

The George line is normally a smooth curve whether the spine is flexed, extended, or in between, possibly because normal spinal kinematics maintain a certain consistency and stability. By combining Faster R-CNN and Shallow U-Net, this study can not only accurately identify and locate the spinal canal but also make a preliminary assessment of spinal health from the integrity of the George line. Comparative analysis of case and control groups is a common method of investigation. To distinguish between normal and pathological spines, we examined the general patterns of spinal motion. In the training phase, the images of the local spinal canal area captured by Faster R-CNN were resized to 256 × 256, the true spinal canal area was annotated, and 20% of the images were selected as the validation set. The learning rate was set to 0.01 and the maximum momentum to 0.99; the convolutional layers used a 3 × 3 kernel with a stride of 2, and the pooling layers used a 2 × 2 window with a stride of 1. Training ran for 200 iterations. The changes in the training parameters are shown in Figures 7 and 8.
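The reported training settings can be made concrete with a small sketch of a 20% validation split and a single parameter update. The text gives the learning rate and momentum but does not name the optimizer; the sketch assumes plain stochastic gradient descent with momentum, and the function names and the random seed are hypothetical.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle the items and hold out a fraction for validation."""
    items = list(items)
    random.Random(seed).shuffle(items)           # seed is an arbitrary choice
    n_val = int(round(len(items) * val_fraction))
    return items[n_val:], items[:n_val]


def momentum_step(weight, velocity, grad, lr=0.01, momentum=0.99):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity - lr * grad   # accumulate past gradients
    return weight + velocity, velocity


if __name__ == "__main__":
    train, val = split_train_val(range(100))
    print(len(train), len(val))
    print(momentum_step(weight=1.0, velocity=0.0, grad=2.0))
```

With momentum as high as 0.99, earlier gradients keep contributing to the update for many steps, which smooths the trajectory but can overshoot if the learning rate is too large.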

Figure 8. — Shallow U-Net training parameters.

In the test stage, Shallow U-Net takes 20 spinal canal images as input and segments them with the trained model. On average, segmenting the spinal canal in each CT image takes only 0.5 seconds. Accuracy was calculated with 2 measures, IoU and Dice; the test data yielded an average IoU of 88% and an average Dice of 96.4%. In addition, the algorithm combines the measured anterior–posterior diameters of the vertebral canal with its 3-dimensional model: the diameter lengths are mapped onto the model through a color mapping table, which helps doctors visualize the location of the stenosis, reduces repetitive work, and facilitates further diagnosis.
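The two accuracy measures can be computed from a pair of binary masks as in the sketch below. The function name `iou_and_dice` is illustrative, and returning 1.0 when both masks are empty is a convention chosen here, not one stated in the text.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks given as equal-sized 2D lists."""
    inter = union = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            inter += p and t          # pixel in both masks
            union += p or t           # pixel in at least one mask
            pred_sum += p
            truth_sum += t
    iou = inter / union if union else 1.0
    total = pred_sum + truth_sum
    dice = 2 * inter / total if total else 1.0
    return iou, dice


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 0, 0]]
    reference = [[0, 1, 1],
                 [0, 0, 0]]
    print(iou_and_dice(predicted, reference))
```

Averaging the per-image values over a test set gives summary figures of the kind reported above.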

7. Spinal canal image segmentation errors

The effects of the lumbar spine tracking algorithm in this study were compared with those of the ASBPF, DCIR, and TGPR algorithms. These 3 algorithms perform well when tracking common targets. The ASBPF algorithm also uses particle filtering, and its observation model uses a function that measures the contour. DCIR is also a neural network model, but it considers only abrupt changes in the target features during tracking, not gradual ones; as the template gradually becomes inapplicable, tracking drift may occur. TGPR uses Gaussian process regression, and its feature extraction is relatively fixed, which may not suit multiple categories of targets.

The tracking results of the 4 algorithms were displayed as pictures and data tables. When the tracking frame fits the lumbar vertebral boundary, the tracking algorithm is considered to perform well; conversely, if the tracking frame drifts far from the target during tracking, the algorithm performs poorly. Tracking performance was judged both qualitatively, by direct observation, and quantitatively, by data analysis.

In this study, the difference in rotation angle between adjacent lumbar vertebrae was used to identify lumbar instability. The measurement errors in the anterior–posterior diameter and left–right diameter of the spinal canal are shown in Figure 9.
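The use of rotation-angle differences between adjacent vertebrae can be sketched as follows. The text does not give the decision threshold, so `threshold_deg` is a required, hypothetical parameter, and the function names and sample angles are illustrative only.

```python
def adjacent_rotation_differences(angles_deg):
    """Difference in rotation angle between each pair of adjacent vertebrae."""
    return [lower - upper for upper, lower in zip(angles_deg, angles_deg[1:])]


def flag_segments(angles_deg, threshold_deg):
    """Indices of adjacent pairs whose angle difference exceeds the threshold.

    Index i refers to the pair formed by vertebra i and vertebra i + 1.
    """
    diffs = adjacent_rotation_differences(angles_deg)
    return [i for i, d in enumerate(diffs) if abs(d) > threshold_deg]


if __name__ == "__main__":
    angles = [2.0, 5.0, 15.0, 16.0]               # made-up angles for illustration
    print(adjacent_rotation_differences(angles))
    print(flag_segments(angles, threshold_deg=8.0))
```

The threshold of 8.0 in the usage lines is arbitrary; a clinically meaningful value would have to come from the diagnostic criteria in use.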

Figure 9. — Measurement errors of the anteroposterior and left–right diameters of the spinal canal.

In addition, to quantitatively analyze the accuracy of the parameter measurement method, the measurement results were compared with the actual values, as shown in Figure 10. This comparison is possible because the parameters BZ, AZ, and RX mentioned in this paper can be calculated directly by hand from the first set of simulated motions.

Figure 10. — Comparison of sagittal measurement results with actual values.

During the flexion and extension movements, no obvious anteroposterior bending was found in the sagittal image sequence; therefore, the displacement and rotation angle should be close to zero, the heights of the anterior and posterior borders should be basically unchanged, and the anterior and posterior borders of the lumbar spinal canal should remain basically parallel. The measurements remained unchanged, which is consistent with the theoretical results. In the left and right lateral flexion movements, the coronal image sequence mainly shows overall translation of the spine without obvious lateral flexion; therefore, the displacement and rotation angle should theoretically remain constant relative to the first frame. Table 2 compares the performance of different models in the spinal canal segmentation task.

Table 2.

Performance comparison of different models in spinal canal segmentation task.

Model/method	Candidate region extraction	Feature extraction network	Classifier	Jaccard score (IoU)
Faster R-CNN + Shallow U-Net	RPN (region proposal network)	CNN	–	0.85
R-CNN (benchmark method)	Selective search	AlexNet CNN	SVM	0.72
Fast R-CNN	Selective search	CNN	–	0.78
Shallow U-Net (used alone)	–	–	–	0.75


Based on the comparative experimental analysis above, the algorithm used in this study performed best among the compared methods. Doctors no longer need to measure each CT image manually. Using the diagnostic model, they can directly observe the region of spinal stenosis and extract several consecutive CT images for further measurement and diagnosis, which greatly reduces the workload of manual diagnosis and improves the efficiency of diagnosing spinal stenosis. Doctors can then focus on formulating surgical treatment plans.

The developed system, while showcasing promising results, comes with certain limitations that merit consideration. The efficacy of the system is contingent upon the quality and diversity of the training data, raising concerns about its performance when faced with limited or biased datasets. Furthermore, variations in performance across diverse demographic groups may be observed, highlighting potential limitations in generalization to populations not adequately represented in the training data. The system’s sensitivity to imaging variability, such as different devices or scanning parameters, poses a challenge, necessitating further exploration. Additionally, the inherent interpretability limitations of deep learning models raise concerns about understanding the rationale behind specific predictions, particularly in medical contexts where interpretability is crucial. Future works should focus on expanding the dataset to encompass diverse cases, implementing transfer learning approaches for adaptation to new datasets, and researching strategies to enhance the system’s robustness to imaging variations. Furthermore, efforts should be directed toward improving model interpretability, optimizing the system for real-time applications, and tailoring predictions to individual patient characteristics. Collaborative research endeavors, ethical considerations, and the development of user-friendly interfaces for clinical integration are integral aspects of future work to ensure the system’s effectiveness and ethical deployment in real-world healthcare settings.

8. Conclusions

In light of growing societal pressures and an aging population, spinal stenosis has emerged as a prevalent orthopedic ailment that causes physical discomfort and significantly impedes daily life. Traditionally, diagnosing spinal stenosis relied on manual measurements by clinicians using medical image processing software, leading to limitations and inefficiencies due to the subjective nature of the process and dependence on accumulated expertise. This study leveraged deep learning-based spinal canal segmentation and refined the results through morphological operations, addressing oversegmentation concerns and enhancing accuracy. Additionally, principal component analysis was used to correct oblique spinal canal images, ensuring symmetry for precise measurement of anterior–posterior diameters. Experimental analysis revealed negligible back-and-forth bending during flexion and extension, validating the accuracy of displacement and rotation measurements. Evaluation metrics, including IoU and Dice coefficients, demonstrated high accuracy, with IoU averaging 88% and Dice reaching 96.4%. Furthermore, the study reconstructed the spinal canal in 3 dimensions, employing ray projection and Marching Cubes algorithms for volume and surface rendering, catering to diverse requirements for 3-dimensional visualization. These findings underscore the efficacy of the proposed deep learning approach in enhancing the accuracy and efficiency of spinal stenosis diagnosis and 3-dimensional representation.

Author contributions

Conceptualization: Zhiyi Zhou, Shenjun Wang, Zhengfeng Lu.

Writing – original draft: Zhiyi Zhou, Shenjun Wang.

Data curation: Shujun Zhang, Xiang Pan.

Investigation: Shujun Zhang, Xiang Pan.

Formal analysis: Haoxia Yang, Yin Zhuang.

Validation: Haoxia Yang, Yin Zhuang.

Writing – review & editing: Zhengfeng Lu.

Abbreviations:

C1–C7: 7 cervical vertebrae
L1–L5: 5 lumbar vertebrae
MCMC: Markov Chain Monte Carlo method
PCA: principal component analysis
S1–S5: 5 sacral vertebrae
T1–T12: 12 thoracic vertebrae

ZZ and SW contributed equally to this work.

This work was supported by Application of Dynamic Quantitative Analysis Model of Lumbar Spine in Stepped Treatment of Degenerative Lumbar Spondylolisthesis.

The authors have no conflicts of interest to disclose.

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

How to cite this article: Zhou Z, Wang S, Zhang S, Pan X, Yang H, Zhuang Y, Lu Z. Deep learning-based spinal canal segmentation of computed tomography image for disease diagnosis: A proposed system for spinal stenosis diagnosis. Medicine 2024;103:18(e37943).

Contributor Information

Zhiyi Zhou, Email: zzysci@hotmail.com.

Shenjun Wang, Email: wxwxwsj@hotmail.com.

Shujun Zhang, Email: shaihaifeng298@sina.com.

Xiang Pan, Email: 2900829092@qq.com.

Haoxia Yang, Email: 702057157@qq.com.

Yin Zhuang, Email: 1131903511@qq.com.

References

[1]. Cai L, Gao J, Zhao D. A review of the application of deep learning in medical image classification and segmentation. Ann Transl Med. 2020;8:713.
[2]. Valizadeh A, Shariatee M. The progress of medical image semantic segmentation methods for application in COVID-19 detection. Comput Intell Neurosci. 2021;2021:7265644.
[3]. Kim KC, Cho HC, Jang TJ, et al. Automatic detection and segmentation of lumbar vertebrae from X-ray images for compression fracture evaluation. Comput Methods Programs Biomed. 2021;200:105833.
[4]. Huang J, Shen H, Wu J, et al. Spine explorer: a deep learning based fully automated program for efficient and reliable quantifications of the vertebrae and discs on sagittal lumbar spine MR images. Spine J. 2020;20:590–9.
[5]. Monie AP, Price RI, Lind CRP, et al. Change in low back movement patterns after neurosurgical intervention for lumbar spondylosis. J Manipulative Physiol Ther. 2018;41:111–22.
[6]. Shi XW, Li ST, Lou JP, et al. Scedosporium apiospermum infection of the lumbar vertebrae: a case report. World J Clin Cases. 2022;10:3251–60.
[7]. Yu Y, Wang C, Fu Q, et al. Techniques and challenges of image segmentation: a review. Electronics. 2023;12:1199.
[8]. Zheng J, Chen T, Wang K, et al. Engineered multifunctional zinc-organic framework-based aggregation-induced emission nanozyme for accelerating spinal cord injury recovery. ACS Nano. 2024;18:2355–69.
[9]. Liu JE, An FP. Image classification algorithm based on deep learning-kernel function. Sci Program. 2020;2020:7607612.
[10]. Qadri SF, Lin HX, Shen LL, et al. CT-based automatic spine segmentation using patch-based deep learning. Int J Intell Syst. 2023;2023:1–14.
[11]. Qing L, Luo G, Li X, et al. Individualized design of thoracodorsal artery perforator chimeric flap for customized reconstruction of complex three-dimensional defects in the extremities. J Orthop Surg Res. 2023;18:367.
[12]. Zhang N, Fan K, Ji H, et al. Identification of risk factors for infection after mitral valve surgery through machine learning approaches. Front Cardiovasc Med. 2023;10:1050698.
[13]. Ahmad M, Qadri SF, Qadri S, et al. A lightweight convolutional neural network model for liver segmentation in medical diagnosis. Comput Intell Neurosci. 2022;2022:7954333.
[14]. Lee GW, Shin H, Chang MC. Deep learning algorithm to evaluate cervical spondylotic myelopathy using lateral cervical spine radiograph. BMC Neurol. 2022;22:147.
[15]. Huo R, Liu Y, Xu H, et al. Associations between carotid atherosclerotic plaque characteristics determined by magnetic resonance imaging and improvement of cognition in patients undergoing carotid endarterectomy. Quant Imaging Med Surg. 2022;12:2891–903.
[16]. Han ML, He WH, He ZY, et al. Anatomical characteristics affecting the surgical approach of oblique lateral lumbar interbody fusion: an MR-based observational study. J Orthop Surg Res. 2022;17:426.
[17]. Yao QY, Fu ML, Zhao Q, et al. Image-based visualization of stents in mechanical thrombectomy for acute ischemic stroke: preliminary findings from a series of cases. World J Clin Cases. 2023;11:5047–55.
[18]. Gil-Martín M, López-Iniesta J, San-Segundo R. Classifier module of types of movements based on signal processing and deep learning techniques. Eng Proc. 2021;10:14.
[19]. Song H-f, Zhang W, Zhang Q, et al. Comparison study of the effect of fusion and non-fusion fixation on the movement of injured lumbar spine. Chin J Tissue Eng Res. 2017;21:4963.
[20]. Huo H, Chang Y, Tang Y. Analysis of treatment effect of acupuncture on cervical spondylosis and neck pain with the data mining technology under deep learning. J Supercomput. 2022;78:5547–64.
[21]. Bhagat N, King K, Ramdeo R, et al. Determining grasp selection from arm trajectories via deep learning to enable functional hand movement in tetraplegia. Bioelectron Med. 2020;6:17.
[22]. Dantas H, Warren DJ, Wendelken SM, et al. Deep learning movement intent decoders trained with dataset aggregation for prosthetic limb control. IEEE Trans Biomed Eng. 2019;66:3192–203.
[23]. Peters JR, Servaes SE, Cahill PJ, et al. Morphology and growth of the pediatric lumbar vertebrae. Spine J. 2021;21:682–97.
[24]. Yao Q, Wang S, Shin JH, et al. Motion characteristics of the lumbar spinous processes with degenerative disc disease and degenerative spondylolisthesis. Eur Spine J. 2013;22:2702–9.
[25]. Day GA, Jones AC, Wilcox RK. Optimizing computational methods of modeling vertebroplasty in experimentally augmented human lumbar vertebrae. JOR Spine. 2020;3:e1077.
[26]. Cheng JZ, Ni D, Chou YH, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6:24454.
[27]. Liu Z, Liu Y. Comparison of clinical effects between PKP and PVP in the treatment of senile osteoporotic lumbar vertebrae compressed fracture. Contin Med Educ. 2018;68:35–46.
[28]. Carneiro G, Nascimento JC. Combining multiple dynamic models and deep learning architectures for tracking the left ventricle endocardium in ultrasound data. IEEE Trans Pattern Anal Mach Intell. 2013;35:2592–607.
[29]. Xingxin HU, Song Y, Liu L. Analysis on the characteristics of lumbar vertebrae and hip bone mineral density in patients with degenerative lumbar scoliosis. West China Med J. 2017;52:36–41.
[30]. Lee SM, Ha DH, Kang H, et al. Solitary myofibroma of the lumbar vertebra in young adult: a case report with 4-year follow-up of postoperative CT or MRI. Medicine (Baltim). 2017;96:e8069.
[31]. Sekuboyina A, Valentinitsch A, Kirschke JS, et al. A localisation-segmentation approach for multi-label annotation of lumbar vertebrae using deep nets. arXiv. 2017;1703.04347.
[32]. Jain P, Khan MR. Prediction of biomechanical behavior of lumbar vertebrae using a novel semi-rigid stabilization device. Proc Inst Mech Eng H. 2019;233:849–57.
[33]. Gadaleta M, Cisotto G, Rossi M, et al. Deep learning techniques for improving digital gait segmentation. Annu Int Conf IEEE Eng Med Biol Soc. 2019;2019:1834–7.
[34]. Janssens R, Zeng G, Zheng G. Fully automatic segmentation of lumbar vertebrae from CT images using cascaded 3D fully convolutional networks. Paper presented at: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018); April 4–7, 2018; Washington, DC.
[35]. Di Angelo L, Di Stefano P, Guardiani E. An automatic method for feature segmentation of human thoracic and lumbar vertebrae. Comput Methods Programs Biomed. 2021;210:106360.


Zhiyi Zhou, MD; Shenjun Wang, BSc; Shujun Zhang, PhD; Xiang Pan, PhD; Haoxia Yang, BSc; Yin Zhuang, BSc; Zhengfeng Lu, BSc
