Author manuscript; available in PMC: 2024 Dec 20.
Published in final edited form as: Med Phys. 2024 May 9;51(6):4201–4218. doi: 10.1002/mp.17072

Chest CT-based Automated Vertebral Fracture Assessment using Artificial Intelligence and Morphologic Features

Syed Ahmed Nadeem 1, Alejandro P Comellas 2, Elizabeth A Regan 3,4, Eric A Hoffman 5,6,7, Punam K Saha 8,9
PMCID: PMC11661457  NIHMSID: NIHMS2038789  PMID: 38721977

Abstract

Background:

Spinal degeneration and vertebral compression fractures are common among the elderly and adversely affect mobility, quality of life, lung function, and mortality. Assessment of vertebral fractures in chronic obstructive pulmonary disease (COPD) is important due to the high prevalence of osteoporosis and associated vertebral fractures in COPD.

Purpose:

We present new automated methods for (1) segmentation and labelling of individual vertebrae in chest computed tomography (CT) images using deep learning (DL), multi-parametric freeze-and-grow (FG) algorithm, and separation of apparently fused vertebrae using intensity autocorrelation and (2) vertebral deformity fracture detection using computed vertebral height features and parametric computational modelling of an established protocol outlined for trained human experts.

Methods:

A chest CT-based automated method was developed for quantitative deformity fracture assessment following the protocol by Genant et al. The computational method was accomplished in the following steps: (1) computation of a voxel-level vertebral body likelihood map from chest CT using a trained DL network; (2) delineation and labelling of individual vertebrae on the likelihood map using an iterative multi-parametric FG algorithm; (3) separation of apparently fused vertebrae in CT using intensity autocorrelation; (4) computation of vertebral heights using contour analysis on the central anterior-posterior (AP) plane of a vertebral body; (5) assessment of vertebral fracture status using ratio functions of vertebral heights and optimized thresholds. The method was applied to inspiratory or total lung capacity (TLC) chest scans from the multi-site Genetic Epidemiology of COPD (COPDGene) (ClinicalTrials.gov: NCT00608764) study, and the performance was examined (n=3,231). One hundred and twenty scans randomly selected from this dataset were partitioned into training (n=80) and validation (n=40) datasets for the DL-based vertebral body classifier. Also, generalizability of the method to low dose CT imaging (n=236) was evaluated.

Results:

The vertebral segmentation module achieved a Dice score of 0.984 as compared to manual outlining results as reference (n=100); the segmentation performance was consistent across images with the minimum and maximum of Dice scores among images being 0.980 and 0.989, respectively. The vertebral labelling module achieved 100% accuracy (n=100). For low dose CT, the segmentation module produced image-level minimum and maximum Dice scores of 0.995 and 0.999, respectively, as compared to standard dose CT as the reference; vertebral labelling at low dose CT was fully consistent with standard dose CT (n=236). The fracture assessment method achieved overall accuracy, sensitivity, and specificity of 98.3, 94.8, and 98.5%, respectively, for 40,050 vertebrae from 3,231 COPDGene participants. For generalizability experiments, fracture assessment from low dose CT was consistent with the reference standard dose CT results across all participants.

Conclusions:

Our CT-based automated method for vertebral fracture assessment is accurate, and it offers a feasible alternative to manual expert reading, especially for large population-based studies, where automation is important for high efficiency. Generalizability of the method to low dose CT imaging further extends the scope of application of the method, particularly since the usage of low dose CT imaging in large population-based studies has increased to reduce cumulative radiation exposure.

1. INTRODUCTION

Spinal degeneration and vertebral compression fractures are common among the elderly and adversely affect mobility, quality of life, lung function, and mortality.1–3 Assessment of vertebral fractures in chronic obstructive pulmonary disease (COPD) is important due to the high prevalence of osteoporosis, a common comorbidity of COPD defined by low bone mass, and associated vertebral fractures in COPD.4–7 A recent CT-based study demonstrated the association of COPD with both decreased bone mineral density and an increased prevalence of vertebral fractures.8 Vertebral fractures lead to chronic back pain and kyphosis, which are associated with reduced rib-cage mobility and chest space, both of which hinder lung expansion and function.9–12 Large numbers of chest computed tomography (CT) scans are collected in nationwide longitudinal pulmonary clinical and research studies,13–16 creating a unique opportunity to study the cross-sectional and longitudinal impacts of vertebral fractures in COPD. The need for large-scale visual assessment by experts, the requirement of multiple graders and adjudicators, and challenges related to multi-site data heterogeneity make manual methods a bottleneck for studying vertebral fractures in large studies and motivate automation.

Over the last few years, several methods have been reported for vertebral segmentation and fracture assessment using chest CT imaging. Primarily, these methods have followed two different approaches. The first approach is fully artificial intelligence (AI)-based, where a deep learning (DL) network is trained to identify or segment individual vertebrae on a sagittal image slice, and a separate DL network is trained to assess deformity fractures based on the output of the first DL network.17–25 Thus, the features used by these methods for fracture assessment were fully data-driven and not based on the anatomical deformity features recommended by Genant et al.,26 which have been widely adopted for quantitative assessment of vertebral deformity fractures. The second approach applies traditional image processing methods to segment individual vertebrae in chest CT images, computes vertebral morphological features, e.g., vertebral heights, and trains a neural network or support vector machine classifier to detect vertebral fractures.27–29 These methods were evaluated on limited data, and their generalizability has not been demonstrated. Also, some of these methods21,29 were developed using radiologist-diagnosed vertebral fractures as the reference, which are generally more severe fractures as compared to Genant’s quantitative approach that allows early assessment of fractures.

In this paper, we integrate a DL approach with the iterative multi-parametric thresholding framework30–32 for three-dimensional (3-D) segmentation and labelling of individual vertebrae in chest CT images. These segmented and individually labelled vertebrae are processed through a geometric analytic module to compute the vertebral height ratio parameters for deformity fracture assessment as proposed by Genant et al. Major contributions of this paper include: (1) development of an algorithm combining DL with a multi-parametric method for 3-D segmentation and labelling of individual vertebrae; (2) development of an algorithm for separation of apparently fused vertebrae in CT using intensity autocorrelation; (3) development of an algorithm for robust detection of marginal ridges on individual vertebral bodies and computation of vertebral height features following the procedural steps outlined by Genant et al.26 for trained human experts; (4) optimization of the decision threshold parameter for fracture detection using Receiver-Operating-Characteristic (ROC) curve analysis; (5) large-scale evaluation on chest CT data from a nationwide multi-site pulmonary research study;13 and (6) assessment of generalizability of the method to low dose CT imaging. A preliminary version of the DL-based freeze-and-grow (FG) method for segmentation and labelling of individual vertebrae, referred to in item (1), was presented in a conference paper.33 The DL network, presented in the current paper, was retrained using a larger multi-site training dataset. More importantly, the intensity autocorrelation method, referred to in item (2), for separating apparently fused vertebrae in CT presented here was not presented previously. Experimental outlines and preliminary results of our vertebral fracture detection methods were presented in a one-page conference abstract.34 This paper introduces the algorithms for automated vertebral fracture detection, referred to in items (3–6). Further, the current paper presents novel experimental results evaluating the accuracy and generalizability on large datasets of human participants.

2. METHODS

A DL-based method was applied to obtain CT-based individual vertebral segmentation and labelling, which was followed by a computerized quantitative fracture assessment step (Figure 1). Individual vertebral segmentation and labelling were accomplished in two steps: (1) computation of a voxel-level probability likelihood map for thoracic vertebral body using inspiratory chest CT and a DL algorithm and (2) delineation and labelling of individual vertebrae on the likelihood map using an iterative multi-parametric thresholding algorithm.30–32 Subsequently, anterior, middle, and posterior heights of a vertebral body were computed using geometric analysis of the vertebral contour on the central AP plane to determine its fracture status. These methods were applied to inspiratory or total lung capacity (TLC) chest scans from the multi-site Genetic Epidemiology of COPD (COPDGene) (ClinicalTrials.gov: NCT00608764) study.13 Accuracies of the methods were examined for vertebral body segmentation, labelling, and fracture assessment using manual vertebral outlining, labelling, and fracture reading data. Finally, generalizability of the methods to low dose CT was evaluated.

Figure 1.


Workflow for chest CT-based automated vertebral segmentation, labelling, and deformity fracture assessment. All analyses were performed in 3-D.

2.A. Individual Vertebral Segmentation and Labelling

2.A.1. Deep Learning-Based Vertebral Probability Likelihood Map

A modified 3-D U-Net35,36 was used to compute voxel-level likelihood of a vertebral body. Subregions of size 96×96×96 from TLC chest CT scans were used as input samples for both training and application phases. The subregion size was selected so that it is computationally feasible on a machine with a consumer-grade graphics processing unit (GPU), while being large enough to include a whole vertebral body at the current imaging resolution of 0.5mm slice spacing. The training dataset was generated by random sampling of subregions, and the post processing-based data augmentation was replaced by collection of a larger number of random samples. The network was designed with three pooling and three de-convolutional layers with 56 kernels at the first layer, which doubled after every pooling stage. Kernels of size 3×3×3 were used at every convolutional layer except at the last layer, where a 1×1×1 kernel was applied. Weighted binary cross-entropy loss function37 was adopted for network learning to account for the class imbalance. During the application phase, the trained network was used to compute a voxel-level probability map of vertebral bodies from a CT image using a sliding window approach with a 48 voxel-overlap between every two adjacent subregions. The probability value at a given voxel was computed by averaging likelihood values from all subregions containing the specific voxel. This step was previously demonstrated to reduce subregion boundary effects in computed likelihood maps.30
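The overlap-and-average step of the application phase can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sliding_window_average`, the callable `predict` (standing in for the trained network), and the border handling (aligning the last subregion to the image edge) are assumptions made for illustration only.

```python
import numpy as np

def sliding_window_average(volume, predict, patch=96, overlap=48):
    """Average patch-wise predictions over all subregions covering a voxel.

    `predict` is a hypothetical callable mapping a subregion to a likelihood
    array of the same shape (a stand-in for the trained network).
    """
    stride = patch - overlap
    acc = np.zeros(volume.shape, dtype=np.float64)  # sum of likelihoods
    cnt = np.zeros(volume.shape, dtype=np.float64)  # number of covering subregions

    def starts(n):
        # Subregion origins along one axis; the last one is aligned to the border
        # (an assumption; the source does not describe border handling).
        if n <= patch:
            return [0]
        s = list(range(0, n - patch + 1, stride))
        if s[-1] != n - patch:
            s.append(n - patch)
        return s

    for z in starts(volume.shape[0]):
        for y in starts(volume.shape[1]):
            for x in starts(volume.shape[2]):
                sl = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                acc[sl] += predict(volume[sl])
                cnt[sl] += 1
    return acc / cnt
```

With the default arguments this corresponds to 96×96×96 subregions and a 48-voxel overlap; interior voxels are then covered by several subregions, which is what smooths subregion boundary effects.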

2.A.2. Delineation and Labelling of Individual Vertebrae

A multi-parametric FG algorithm30 was applied to delineate individual vertebral bodies from the DL-based likelihood map. Initially, the method identifies the cores of individual vertebral bodies by thresholding at a high likelihood value of 0.95 and connected component labelling;38 connected components smaller than 500mm3 were discarded. During a subsequent iteration, the probability threshold was lowered by a step of 0.05 to progressively capture finer details, while avoiding fusion between adjacent vertebrae.31,32 During this step, individual vertebrae were simultaneously grown from their current cores without entering a region already assigned to a different vertebra. In case a fusion occurs, where two vertebrae claim the same region, distance analysis is used for tiebreaking. Specifically, a voxel p claimed by multiple vertebrae, after growth in the current iteration, is assigned to the vertebra with the nearest core to p. At the end of the iteration, each vertebral core is updated to include the augmented region. The iterative procedure terminates after all individual vertebral cores converge or the likelihood threshold falls below the value of 0.3. The spine column is computed from segmented vertebrae after fusing the vertebrae using a morphological closing operation, and the spine centerline is computed using our previously validated minimum-cost path-based method that accounts for local structure-depth.39
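A simplified sketch of the core-then-grow procedure is given below, assuming SciPy is available. It follows the thresholds stated above (initial threshold 0.95, step 0.05, stop at 0.3, minimum core volume 500 mm³), but it is only an approximation: ties are resolved by Euclidean distance to the nearest already-labelled voxel, and the convergence test is omitted. The function name `freeze_and_grow` and its parameters are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi

def freeze_and_grow(prob, voxel_volume_mm3=1.0, t_start=0.95, t_stop=0.3,
                    step=0.05, min_core_mm3=500.0):
    """Label individual objects on a likelihood map by iterative threshold relaxation."""
    # Initial cores: connected components of the high-likelihood region.
    labels, n = ndi.label(prob >= t_start)
    sizes = np.bincount(labels.ravel(), minlength=n + 1) * voxel_volume_mm3
    small = sizes < min_core_mm3
    labels[small[labels]] = 0  # discard cores below the minimum volume

    t = t_start - step
    while t >= t_stop - 1e-9:
        mask = prob >= t
        comps, _ = ndi.label(mask)
        # Only regions of the relaxed mask connected to an existing core may grow.
        seeded = np.unique(comps[labels > 0])
        growable = np.isin(comps, seeded) & (labels == 0)
        if growable.any():
            # For every unlabelled voxel, find the index of the nearest labelled voxel;
            # existing labels stay frozen and contested voxels go to the nearest core.
            idx = ndi.distance_transform_edt(labels == 0, return_distances=False,
                                             return_indices=True)
            nearest = labels[tuple(idx)]
            labels[growable] = nearest[growable]
        t -= step
    return labels
```

Freezing the labels assigned at higher thresholds is what prevents two adjacent vertebrae from merging when the relaxed mask eventually bridges the gap between them.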

The above algorithm for individual vertebral labelling works when vertebral bodies appear separated in a CT image. However, the DL-based likelihood map and a high thresholding fail to generate disconnected cores for two adjacent vertebrae when the two vertebrae appear as fused in a CT image (Figure 2(a)). To overcome this challenge, we developed an algorithm based on autocorrelation analysis of CT intensity profiles along the spine centerline to compute the optimum separating surface between two adjacent vertebrae. The algorithm consists of two major steps: (1) identify two apparently fused vertebrae and locate the intersection between the spine centerline and the expected separating surface and (2) construct the optimum separating surface using autocorrelation analysis of CT intensity profiles; see Figure 2.

Figure 2.


Separation of apparently fused vertebrae in CT using intensity autocorrelation. (a) A section of a sagittal CT slice illustrating a pair of fused vertebrae. (b) Intensity profile along the spine centerline S in (a). An intensity threshold η+5σ was defined, where η is the median intensity and σ is the root-mean-square-difference of the lower half of intensity values from η. A fusion between two adjacent vertebrae is flagged by more than two peaks above η+5σ, and the intersection ps between the spine centerline S and the expected separating surface is located at the middle of inner peak(s). (c, d) Same as (a, b) for normal vertebrae. Note that only two outer peaks and no inner peaks are found above the threshold η+5σ. (e-g) Computation of the optimum separating surface between two fused vertebrae; the 3-D algorithm is illustrated on a sagittal slice. Let p1 and p2 (yellow) be two sample points on the plane orthogonal to S at pS and S1 and S2 be spine centerlines shifted at p1 and p2, respectively. (f), at top, shows the intensity profile on the section of S1 above p1 (blue) and intensity profiles on the lower section of S1 color-coded by the shift magnitude. (f), at bottom, shows correlations of the intensity profile of the upper section of S1 and that of the lower section at varying shifts; the optimum shift is marked in red. (g) Same as (f) for p2.

As illustrated in Figure 2(a, b), the first step is accomplished by analyzing the CT intensity profile along the spine centerline S after intensity smoothing over a 5×5×5 window. The median η of the smoothed CT intensity profile is determined, and, for the lower half of intensity values, the root-mean-square-difference σ from the median is computed. A threshold of η+5σ is set, and the number of distinct peaks above the threshold is determined. A fusion between two adjacent vertebrae is flagged if there are more than two peaks, and the intersection ps between the spine centerline and the expected separating surface is located at the middle of the inner peak(s) (Figure 2(b)). See the example in Figure 2(c, d), where the adjacent vertebrae are not fused. For such cases no inner peaks are found, and the algorithm correctly flags the vertebrae as not fused. To compute the optimum separating surface between two fused vertebrae, uniformly spaced sample points pi are located on the plane orthogonal to S at ps; for our experiments, the points were sampled at 0.5×0.5mm in-plane resolution. At a given sample location pi, the spine centerline Si, shifted at pi, is determined (Figure 2(e)). Two intensity profiles are defined on Si: (1) the intensity profile on the section of Si above pi and (2) the intensity profile on the section of Si below pi. Subsequently, the optimum shift parameter is determined that maximizes the correlation between the first intensity profile and the second intensity profile after shifting (Figure 2(f, g)). This shift is then applied to pi to obtain pi' as the location of the separating surface on Si. Median filtering of the shift magnitudes is applied to eliminate local outliers. Finally, the optimum separating surface is computed using the shifted locations pi' and their in-plane adjacency relation and is used to separate the two fused vertebrae.
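The two ingredients of this step, peak counting above η+5σ and a correlation-maximizing shift, can be sketched on 1-D profiles as follows. The sketch makes several assumptions not stated in the text: the "lower half" is taken as all samples at or below the median, a peak is a maximal run of consecutive samples above the threshold, and the shift search uses the Pearson correlation over non-negative integer shifts. The names `peaks_above`, `fusion_point`, and `best_shift` are hypothetical.

```python
import numpy as np

def peaks_above(profile, k=5.0):
    """Centers of maximal runs of samples above eta + k*sigma."""
    profile = np.asarray(profile, dtype=float)
    eta = np.median(profile)
    lower = profile[profile <= eta]               # assumed "lower half" of values
    sigma = np.sqrt(np.mean((lower - eta) ** 2))  # RMS difference from the median
    above = np.concatenate(([False], profile > eta + k * sigma, [False]))
    edges = np.flatnonzero(above[1:] != above[:-1])
    starts, ends = edges[0::2], edges[1::2]       # half-open runs [start, end)
    return [(s + e - 1) / 2.0 for s, e in zip(starts, ends)]

def fusion_point(profile):
    """Return the profile position between fused vertebrae, or None if not fused."""
    centers = peaks_above(profile)
    if len(centers) <= 2:          # only the two outer peaks: not fused
        return None
    return float(np.mean(centers[1:-1]))  # middle of the inner peak(s)

def best_shift(upper, lower, max_shift):
    """Shift of `lower` (in samples) maximizing its correlation with `upper`."""
    n = min(len(upper), len(lower))
    best, best_r = 0, -np.inf
    for s in range(max_shift + 1):
        m = n - s
        if m < 3:
            break
        r = np.corrcoef(upper[:m], lower[s:s + m])[0, 1]
        if r > best_r:
            best, best_r = s, r
    return best, best_r
```

In the full 3-D method, such a shift would be estimated at every sample point on the plane orthogonal to the centerline and then median-filtered; the sketch covers a single profile only.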

A final computerized quality control step, based on associated rib anatomy, was developed and applied after the above-described individual vertebral segmentation and labelling steps. Specifically, for each rib, the associated vertebra was detected by finding the closest vertebra. Note that a pair of ribs is associated with each properly segmented and labelled vertebra, except for L1. During the computerized quality control step, a vertebra was flagged as “fused” if multiple pairs of ribs were associated with it. Essentially, the computerized quality control step aims to detect and flag rare failures in our fully automated computational pipeline.

After delineation of vertebral bodies, separation of fused vertebrae, and completion of the computerized quality control step, individual vertebral labelling is performed. During this step, the superior-most thoracic vertebra T1 is detected as the connected vertebral body with its apex closest to the lung apex in the scanner axial direction. Subsequent vertebrae T2–T12 and L1 are sequentially labelled in the superior-inferior direction.

2.B. Computation of Spinal Deformity and Vertebral Fracture Metrics

Our method for computerized vertebral fracture assessment was designed as per the quantitative criteria established by Genant et al. for radiographs.26 Specifically, the measures of anterior, middle, and posterior heights of each vertebra were used to define the anterior-to-posterior ratio (APR) and middle-to-posterior ratio (MPR) metrics to quantify fracture deformities. Following the guideline by Genant and other research groups,3,10,26,40,41 the normative distribution of a given ratio metric for a specific vertebra was computed using its observed values in participants without any fracture at the target vertebra. Jaramillo et al.8 applied a semiquantitative version of Genant’s approach on sagittal CT slices to define fracture status on a subset of COPDGene participants at their baseline visits.26 Specifically, expert readers identified biconcave and wedge vertebral fractures with moderate or severe deformities on T1-L1 vertebrae; see Figure 3. These CT images with expert fracture readings were used to generate the normative distributions for individual vertebrae.

Figure 3.


Illustrations of moderate and severe wedge and biconcave vertebral deformity fractures on sagittal chest CT slices as visually labelled by Jaramillo et al.8 following Genant et al.26

2.B.1. Computation of Vertebral Heights

The algorithm for vertebral height computation is completed in three steps: (1) selection of the central AP plane for a specific vertebra, (2) detection of marginal ridge locations of the vertebral body on the central AP plane, and (3) computation of anterior, middle, and posterior heights. Let S denote the centerline of the spine column. Let V denote a segmented vertebra and o be the projection of the center of mass of V on S. The central AP plane PAP is computed as the unique plane containing o, ltangent, and lAP, where ltangent is the tangent line of S at o and lAP is the image coordinate AP line at o (Figure 4(a)). Marginal ridge locations of the vertebral body on PAP are detected using the height profile of the contour CAP along the line lAP, where CAP is the contour of V on PAP with o and lAP representing the origin and the horizontal axis, respectively. At a given point x on lAP, the vertebral height is computed as the distance between the two points on CAP vertically above and below x. The height and absolute gradient profiles of CAP along lAP are shown in Figure 4(b). Let h and Δh denote the height and absolute gradient at a point on lAP. A threshold value of μΔh+1.5σΔh was selected to detect the marginal ridge locations, where μΔh and σΔh are the mean and standard deviation of Δh values. On either the anterior or posterior side of o, the marginal ridge is detected as the location on lAP preceding the first point with Δh>μΔh+1.5σΔh, and the anterior and posterior heights hA and hP at the respective marginal ridge locations are obtained (Figure 4(b)). Finally, the middle height hM is the minimum vertebral height within the central third between the anterior and posterior marginal ridges.
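The ridge-detection rule can be sketched on a discretized height profile as follows. This is an illustrative reading of the description above, not the authors' code: it assumes the vertebral cross-section on the central AP plane is given as a 2-D binary mask with columns along lAP (anterior at column 0), that the origin o falls at the middle column, and that the gradient is a central difference. The names `height_profile` and `ridge_heights` are hypothetical.

```python
import numpy as np

def height_profile(mask, spacing_mm=1.0):
    """Height at each position on l_AP: distance between top and bottom contour points."""
    cols = np.flatnonzero(mask.any(axis=0))
    sub = mask[:, cols[0]:cols[-1] + 1]          # assumes no empty columns inside
    rows = np.arange(mask.shape[0])[:, None]
    top = np.where(sub, rows, mask.shape[0]).min(axis=0)
    bottom = np.where(sub, rows, -1).max(axis=0)
    return (bottom - top) * spacing_mm

def ridge_heights(h):
    """Return (h_A, h_M, h_P) from a height profile using the mean + 1.5*SD gradient rule."""
    h = np.asarray(h, dtype=float)
    dh = np.abs(np.gradient(h))
    thr = dh.mean() + 1.5 * dh.std()
    o = len(h) // 2                               # assumed position of the origin o

    def ridge(direction):
        # Walk outward from o; stop at the location preceding the first point with dh > thr.
        i = o
        while 0 <= i + direction < len(h) and dh[i + direction] <= thr:
            i += direction
        return i

    a, p = ridge(-1), ridge(+1)
    third = (p - a) / 3.0
    lo, hi = int(round(a + third)), int(round(p - third))
    h_m = h[lo:hi + 1].min()                      # minimum height in the central third
    return h[a], h_m, h[p]
```

The ratios used for fracture assessment then follow directly, e.g. APR = hA/hP and MPR = hM/hP.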

Figure 4.


Computation of vertebral height features. (a) Vertebral feature metrics on the central anterior-posterior (AP) plane. V: a segmented vertebra; S: spine centerline; o: projection of the center of mass of V on S; ltangent: tangent line of S at o; lAP: the image coordinate AP line at o. The central AP plane PAP is the unique plane containing o, ltangent, and lAP; and CAP is the contour of V on PAP. (b) Height profile h of CAP along lAP is shown at top, and absolute gradient Δh is presented at bottom. A threshold of μΔh+1.5σΔh is defined, where μΔh and σΔh are the mean and standard deviation of Δh values. The marginal ridge (green) is detected as the location on lAP preceding the first point (red) with Δh>μΔh+1.5σΔh. The anterior and posterior heights hA and hP are derived from respective marginal ridge locations. The middle height hM is the minimum vertebral height within the central third between the anterior and posterior marginal ridges.

2.B.2. Parameters for Fracture Assessment

APR or MPR metrics are used for quantitative fracture assessment. Let μs,T and σs,T be the mean and standard deviation (SD) of a given metric s at a given vertebral level T; note that μs,T and σs,T were computed from vertebrae without fractures. A threshold value thrs,T is defined as μs,T-αcoeff×σs,T for the metric s at the vertebral level T, where αcoeff is the common multiplier coefficient for both APR and MPR metrics at all vertebral levels. A specific vertebra T0 of a participant is flagged as “fractured” if the observed APR (or, MPR) value of that participant at T0 falls below the threshold thrAPR,T0 (respectively, thrMPR,T0); otherwise, the specific vertebra of the participant is considered as a non-fracture or healthy vertebra. The value of the common multiplier coefficient αcoeff was optimized using ROC analysis of a dataset with visual fracture readings by experts as reference.
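As a small illustration of the decision rule, the sketch below computes level-specific thresholds thr = μ − αcoeff×σ from normative (non-fractured) ratio values and flags a vertebra when either ratio falls below its threshold. The function names, the dictionary layout, and the use of the sample standard deviation are assumptions; the value of αcoeff in the usage example is arbitrary and is not the optimized value from the paper.

```python
import numpy as np

def fracture_thresholds(normative, alpha):
    """Per-level threshold mu - alpha*sigma for one ratio metric (APR or MPR).

    `normative` maps a vertebral level (e.g. "T8") to ratio values observed
    in vertebrae without fractures.
    """
    return {level: float(np.mean(v) - alpha * np.std(v, ddof=1))
            for level, v in normative.items()}

def is_fractured(apr, mpr, thr_apr, thr_mpr):
    """A vertebra is flagged if APR or MPR falls below its level-specific threshold."""
    return apr < thr_apr or mpr < thr_mpr

# Usage with made-up numbers (alpha = 3 is an arbitrary illustrative value).
thr = fracture_thresholds({"T8": [0.9, 1.0, 1.1]}, alpha=3.0)
flag = is_fractured(apr=0.65, mpr=0.95, thr_apr=thr["T8"], thr_mpr=thr["T8"])
```

Using one multiplier for both metrics at all levels leaves a single parameter to optimize, while the level-specific means and SDs absorb normal anatomical variation along the spine.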

2.C. Experiments: Materials and Training

Experiments were designed to examine accuracy and generalizability of individual vertebral segmentation, labelling, and fracture assessment modules of our computational pipeline (Figure 1). Accuracies of the DL-based vertebral segmentation and labelling modules were compared with manually annotated vertebrae on a subset of chest CT data from COPDGene. Also, the percentage of fused vertebrae detected using the computerized quality control step was determined. To examine the accuracy of the fracture assessment module, visual semi-quantitative vertebral fracture assessment results by multiple experts, as reported in Jaramillo et al.,8 were used as the reference. In this dataset, the binary fracture status of each vertebra for each participant was recorded. Fracture assessment performance was separately evaluated for different groups by sex, age, and COPD severity to determine if there is any performance dependency on relevant biological and clinical factors. Generalizability of each module was assessed by evaluating results at low dose CT compared to segmentation and labelling as well as fracture assessment derived from standard dose CT. The following sections describe the materials involved in our experiments—(1) human chest CT scans, (2) DL training and validation, and (3) data analysis.

2.C.1. Human Chest CT Datasets

TLC CT scans from COPDGene were used for the current study. Details of the COPDGene CT imaging protocol have been previously reported.13 Different cohorts of the COPDGene study were approved by the Institutional Review Boards of respective sites, and written informed consents were obtained from all participants. The following two datasets were used for accuracy and generalizability experiments, respectively.

Dataaccuracy (n=3,231) was derived from COPDGene study data acquired during baseline visits (November 2007 to April 2011) (N=10,305). Jaramillo et al.8 collected expert reading data of moderate and severe biconcave and wedge fracture status of individual vertebrae on a subset of COPDGene participants at their baseline visits (n=3,333). Within this dataset, CT images were available for 3,231 participants, which were used for the current study. CT scans were acquired on six different CT scanners from three different manufacturers (Siemens, GE, and Philips) at 15 different sites; see Table 1. For DL training and validation, a subset (n=120) was randomly sampled with at least three scans selected from each study site. For evaluation of segmentation and labelling accuracy, a separate set of 100 randomly selected scans (equally distributed across sex and between never-smokers and four different COPD severity groups) were manually annotated. For optimization of fracture assessment parameters, 167 and 333 participants were randomly selected from fracture and non-fracture groups to ensure similar fracture versus non-fracture distribution of the whole dataset in the parameter optimization data.

Table 1.

Lists of imaging sites, CT scanners, scan parameters, and participant numbers for different CT datasets examined in our study. Images were collected under the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) study and retrospectively studied in this work. All images were reconstructed using 512×512 arrays at isotropic pixel size. See Regan et al.13 for more details on imaging protocols.

Dataaccuracy
Data Site | CT Scanner | Scan Parameters (voltage, current, pitch, slice thickness) | Participants, n (% female) | Age (years), mean±SD
Ann Arbor VA, Ann Arbor, MI | Philips Brilliance 64 | 120kVp, 200mAs, 0.923, 0.9mm | 39 (23.1) | 59.3±9.5
Duke Hospital Medical Center, Durham, NC | GE Lightspeed VCT | 120kVp, 400mAs, 1.375, 0.625mm | 6 (50) | 66.6±6.8
Harbor UCLA Medical Center, West Carson, CA | GE Lightspeed VCT | 120kVp, 400mAs, 1.375, 0.625mm | 547 (43.7) | 55.3±7.6
HealthPartners Research Foundation, Bloomington, MN | GE Lightspeed 16 | 120kVp, 400mAs, 1.375, 0.625mm | 108 (65.7) | 61.3±7.8
Morehouse School of Medicine, Atlanta, GA | GE Lightspeed 16 | 120kVp, 400mAs, 1.375, 0.625mm | 224 (53.1) | 55.5±7.4
Minneapolis VA, Minneapolis, MN | Philips Brilliance 64 | 120kVp, 200mAs, 0.923, 0.9mm | 46 (2.2) | 64.1±9.5
National Jewish Health, Denver, CO | Siemens SOMATOM Definition | 120kVp, 200mAs, 1.1, 0.75mm | 811 (48.1) | 61.3±9.0
Temple University, Philadelphia, PA | Siemens SOMATOM Sensation | 120kVp, 200mAs, 1.1, 0.75mm | 418 (48.3) | 56.3±8.1
University of Texas Health Science Center, San Antonio, TX | Philips Brilliance 64 | 120kVp, 200mAs, 0.923, 0.9mm | 43 (20.9) | 55.9±6.9
University of Alabama, Birmingham, AL | GE Lightspeed 16 | 120kVp, 400mAs, 1.375, 0.625mm | 164 (45.7) | 60.0±8.8
University of Iowa Hospitals and Clinics, Iowa City, IA | Siemens SOMATOM Definition | 120kVp, 200mAs, 1.1, 0.75mm | 565 (51.7) | 63.8±9.0
University of Michigan Health System, Ann Arbor, MI | GE Lightspeed 16 | 120kVp, 400mAs, 1.375, 0.625mm | 121 (56.2) | 58.1±8.6
University of Minnesota, Minneapolis, MN | Siemens SOMATOM Sensation | 120kVp, 200mAs, 1.1, 0.75mm | 37 (59.5) | 61.0±8.4
University of Pittsburgh, Pittsburgh, PA | GE Lightspeed VCT | 120kVp, 400mAs, 1.375, 0.625mm | 125 (53.6) | 58.3±8.6
University of California-San Diego, San Diego, CA | GE Discovery CT750 HD | 120kVp, 400mAs, 1.375, 0.625mm | 62 (40.3) | 59.4±9.8

Datageneralizability
Dataset | CT Scanner | Scan Parameters (voltage, current, pitch, slice thickness) | Participants, n (% female) | Age (years), mean±SD
Standard Dose | Siemens SOMATOM Force | 120kVp, 200mAs, 1.1, 0.75mm | 236 (50.4) | 69.7±9.0
Low Dose | Siemens SOMATOM Force | 120kVp, 35mAs, 1.1, 0.75mm | 236 (50.4) | 69.7±9.0

All scans for Datageneralizability were acquired at the University of Iowa Hospitals and Clinics, Iowa City, IA.

Datageneralizability (n=236) consists of matched standard and low dose CT scans acquired at first follow-up visits of the Iowa COPDGene cohort (December 2015 to July 2017). This dataset was used to evaluate the generalizability of our segmentation, labelling, and fracture assessment methods. All CT scans were acquired on a Siemens SOMATOM Force (Forchheim, Germany) scanner; see Table 1. Standard dose CT scans were acquired at a volume CT dose index of 13.3 mGy, while low dose scans were acquired at 2.2 mGy.42

2.C.2. Deep Learning Training and Validation

One hundred and twenty scans randomly selected from Dataaccuracy were partitioned into training (n=80) and validation (n=40) datasets for the DL-based vertebral body classifier. A multi-step training approach was adopted to efficiently generate reference segmentation results for DL training. The first step was to generate reference segmentation results on a small subset of the training dataset (n=30) using CT intensity-based thresholding, binary morphological operations, and connected component analysis with participant-specific parameter tuning and manual editing using an ITK-SNAP43 interface. Then, a preliminary DL-based vertebral body classification network was trained using this small dataset, and the network was subsequently applied on a larger dataset (n=120) to generate DL-based prediction maps and segmentation volumes for each image in this larger dataset. Next, each vertebral segmentation in the larger dataset was manually corrected and used to train and validate the final DL network. During DL training, CT values outside the range of 0 to 500HU were truncated, and the resulting values were scaled to the range [−1, 1]; the range of CT values for truncation was determined based on the previously reported distribution of vertebral CT values.44 From each CT scan, 2,000 subregions of 96×96×96 matrix size were randomly sampled, yielding a set of 160,000 training and 80,000 validation samples. The Adam optimization algorithm with β1=0.9, β2=0.999, and a learning rate of 1×10^−4 was used.45
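The intensity preprocessing and sample counts described above can be checked with a short sketch. The function names `normalize_hu` and `sample_subregion` are hypothetical, and uniform random placement of subregions is an assumption; the text only states that subregions were randomly sampled.

```python
import numpy as np

def normalize_hu(ct, lo=0.0, hi=500.0):
    """Truncate CT values to [lo, hi] HU and scale linearly to [-1, 1]."""
    ct = np.clip(np.asarray(ct, dtype=np.float32), lo, hi)
    return 2.0 * (ct - lo) / (hi - lo) - 1.0

def sample_subregion(volume, size=96, rng=None):
    """Draw one cubic subregion at a uniformly random position (assumed sampling scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    start = [int(rng.integers(0, s - size + 1)) for s in volume.shape]
    return volume[tuple(slice(a, a + size) for a in start)]

# Sample counts implied by the text: 2,000 subregions per scan.
n_training = 80 * 2000     # 160,000 training samples
n_validation = 40 * 2000   # 80,000 validation samples
```

Truncating to 0 to 500 HU before scaling concentrates the network's input range on bone-like intensities and suppresses the influence of air and very dense structures.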

2.C.3. Data Analysis

COPD severity groups were defined by Global Initiative for chronic Obstructive Lung Disease (GOLD) status determined using the ratio of forced expiratory volume in one second (FEV1) to forced vital capacity (FVC) and percentage predicted FEV1. The following COPD severity groups were studied: (1) preserved lung function: GOLD 0, (2) mild COPD: GOLD 1 or Preserved Ratio Impaired Spirometry (PRISm), (3) moderate COPD: GOLD 2, and (4) severe COPD: GOLD 3 or 4. Segmentation accuracy was evaluated in terms of Dice score as compared to manually delineated volumes of individual vertebrae. Labelling accuracy was determined as the percentage of vertebrae correctly labelled compared to manually annotated labels. An ROC analysis was performed using computerized APR and MPR metrics and expert labelled fracture data of participants to optimize the value of common multiplier αcoeff for fracture decision threshold. Fracture assessment performance was measured in terms of accuracy, sensitivity, and specificity of computerized fracture labelling as compared to their reference values. Unpaired t-tests were performed to assess statistical significance of group differences with p<0.05 considered as statistically significant.
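For reference, the evaluation metrics named above can be written out as a minimal sketch using their standard definitions; the function names `dice` and `fracture_metrics` are hypothetical, and the inputs are assumed to be binary arrays (segmentation masks, or per-vertebra fracture labels with 1 = fractured).

```python
import numpy as np

def dice(a, b):
    """Dice score 2|A∩B| / (|A| + |B|) between two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def fracture_metrics(pred, ref):
    """Accuracy, sensitivity, and specificity of predicted versus reference labels."""
    pred, ref = np.asarray(pred, dtype=bool), np.asarray(ref, dtype=bool)
    tp = np.sum(pred & ref)
    tn = np.sum(~pred & ~ref)
    fp = np.sum(pred & ~ref)
    fn = np.sum(~pred & ref)
    return {
        "accuracy": (tp + tn) / pred.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Because non-fractured vertebrae greatly outnumber fractured ones in such cohorts, sensitivity and specificity are more informative than accuracy alone.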

3. RESULTS AND DISCUSSION

Demography, spirometry, and fracture distributions of the participants in Dataaccuracy are presented in Table 2. Within the current and ex-smokers, increasing COPD severity was significantly associated with higher age, smoking history, and fracture incidence and counts. The never-smoker group in this dataset represents a separate set of a small number of participants (n=61) with similar age as the current and ex-smokers. Therefore, no comparative analysis was performed between never- and ever-smoker groups. Participants in Datageneralizability (n=236) had ages of 70±9 (mean±SD) years with 123 females.

Table 2.

Demographic, smoking, vertebral fracture, and spirometry data of study participants (n=3,231) from the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) cohort at their baseline visits.

Characteristics | Never-Smokers | Preserved Lung Function | Mild COPD | Moderate COPD | Severe COPD
(The last four columns are current or former smokers.)
No. of participants (%) | 61 (1.9) | 1426 (44.1) | 615 (19.0) | 584 (18.1) | 545 (16.9)
Demographic characteristics
Age (y), mean ± SD | 62.2 ± 10.6 | 56.6 ± 8.3 | 58.5 ± 8.8 (p <.0001) | 62.4 ± 9.0 (p <.0001) (p <.0001) | 63.9 ± 7.9 (p <.0001) (p <.0001) (p <.0001)
Female Sex, no. (%) of participants | 42 (68.8) | 686 (48.1) | 311 (50.6) (p =.3139) | 282 (48.3) (p =.9438) (p =.4352) | 222 (40.7) (p =.0125) (p =.0029) (p =.0298)
Current Smokers, no. (%) of participants | 0 (0) | 804 (56.4) | 389 (63.3) (p <.0047) | 291 (49.8) (p =.0050) (p <.0001) | 168 (30.8) (p <.0001) (p <.0001) (p <.0001)
Smoking Pack-years, mean ± SD | 0 | 36.2 ± 19.5 | 43.2 ± 23.2 (p <.0001) | 50.2 ± 28.3 (p <.0001) (p <.0001) | 54.3 ± 26.4 (p <.0001) (p <.0001) (p =.0188)
Body mass index, mean ± SD | 28.5 ± 5.0 | 28.9 ± 5.7 | 29.8 ± 6.9 (p =.0004) | 28.9 ± 6.0 (p =.8153) (p =.0033) | 26.7 ± 6.1 (p <.0001) (p <.0001) (p <.0001)
No. with fractures (%) | 20 (32.8) | 417 (29.2) | 182 (29.6) (p =.9921) | 218 (37.3) (p =.0003) (p =.0021) | 225 (41.3) (p <.0001) (p <.0001) (p =.1315)
No. of fractures, mean ± SD | 0.66 ± 1.22 | 0.60 ± 1.27 | 0.64 ± 1.37 (p =.4914) | 0.80 ± 1.54 (p =.0003) (p =.0176) | 1.00 ± 1.68 (p <.0001) (p <.0001) (p =.0584)
Spirometry (postbronchodilator) results, mean ± SD
FEV1/FVC | 0.80 ± 0.05 | 0.77 ± 0.11 | 0.71 ± 0.09 (p <.0001) | 0.58 ± 0.09 (p <.0001) (p <.0001) | 0.39 ± 0.11 (p <.0001) (p <.0001) (p <.0001)
predicted FEV1 (%) | 101.4 ± 13.9 | 95.9 ± 16.8 | 78.0 ± 14.2 (p <.0001) | 64.6 ± 9.7 (p <.0001) (p <.0001) | 34.4 ± 10.1 (p <.0001) (p <.0001) (p <.0001)

Note: COPD = chronic obstructive pulmonary disease, GOLD = Global Initiative for Chronic Obstructive Lung Disease.

Preserved lung function group consists of participants with GOLD stage 0, mild COPD group consists of participants with GOLD stage 1 and PRISm, moderate COPD group consists of participants with GOLD stage 2, and severe COPD group consists of participants with GOLD stage 3 and 4.

Defined by the American Thoracic Society as the number of packs of cigarettes smoked every day multiplied by the total number of smoking years.

3.A. Accuracy and Generalizability of Vertebral Segmentation and Labelling

Intermediate results of the image processing pipeline are presented in Figure 5 for a male and female with preserved lung function and another male and female with severe COPD all at the age of 60 years. As visually apparent, our DL-based segmentation method was able to delineate individual vertebral bodies (Figure 5(b)). The vertebral segmentation module achieved a Dice score of 0.984±0.004 as compared to manual outlining results as reference (n=100); the segmentation performance was consistent across images with the minimum and maximum of Dice scores among images being 0.980 and 0.989, respectively. No significant differences in Dice scores were observed for males and females or between different COPD severity or age groups.

Figure 5.

Results of automated vertebral segmentation, labelling, and fracture assessment methods for male and female participants with preserved lung function and severe chronic obstructive pulmonary disease (COPD). (a) Sagittal slices from inspiratory chest CT images. (b) Results of deep learning-based individual vertebral segmentation. (c) Labelled vertebrae; see color legends at the bottom. (d) Fracture assessment outcomes; green: healthy and red: fractured.

Results of labelling for individual vertebrae T1-T12 and L1 are presented in Figure 5(c), which are visually satisfactory. As discussed in Section 2.A.2, for several cases, intervertebral separation features are less prominent, and the DL-based method fails to separate two adjacent vertebrae. Such examples are presented in Figure 6, where the performance of our autocorrelation-based post-separation of fused vertebrae is illustrated. As shown in Figure 6, the post-separation method successfully detected the separating manifold (a separating curve line on a sagittal plane), allowing correct vertebral labelling. While this step successfully separated most fused vertebrae, it failed in a few cases. However, the cases of failure were automatically detected using the computerized quality control step described in Section 2.A.2. In the current study, among a total of 46,149 vertebrae in 3,231 chest CT scans in Dataaccuracy and 236×2 chest scans in Datageneralizability, fusion of only 27 pairs of vertebrae (0.12%) was unresolved; these pairs were excluded from the performance analysis experiments after automated quality control. For the dataset (n=100) used for evaluating segmentation and labelling performance, the vertebral labelling method achieved 100% accuracy.

Figure 6.

Results of separating apparently fused vertebrae in CT for two participants after intermediate steps. Although the results are presented on a section of a sagittal image slice, the computational algorithm was performed in 3-D.

We conducted an experiment comparing the performance of our two-step FG-based segmentation method with that of a 3-D U-Net-based direct method. Using the U-Net-based direct method, vertebral fusion occurred for 4,935 (10.69%) vertebrae out of a total of 46,149 vertebrae examined in this paper, which led to the exclusion of 1,801 (60.1%) participants from Dataaccuracy and 139 (58.9%) participants from Datageneralizability. On the other hand, using our method, only 27 (0.84%) participants from Dataaccuracy were excluded. These results demonstrate the importance of the FG algorithm in segmenting individual vertebrae while preserving their separation. These findings are consistent with our previous observations30,46 that utilizing U-Net outputs as an object likelihood map and integrating them with application-specific custom-designed segmentation algorithms to delineate target anatomic structures improves the performance of the overall method. Also, this two-stage approach has been found to improve the generalizability of a DL-based segmentation framework.30,42

Visually matched sagittal image slices from low and standard dose CT scans of a 65-year-old female participant are illustrated in Figure 7. For low dose CT scans, our segmentation method achieved a Dice score of 0.997±0.002 [0.995 0.999] as compared to standard dose chest CT-derived results as reference (n=236). Also, low dose CT-based individual vertebral labelling was in 100% agreement with standard dose CT-based results. Despite the significant difference in radiation dose and noise, our vertebral segmentation and labelling method was generalizable to low dose CT without requiring any retraining.

Figure 7.

Comparison between standard (left) and low (right) dose CT scans of a 65-year-old female participant with moderate chronic obstructive pulmonary disease (COPD). Visually matched sagittal image slices are shown using the same CT display setting of level: 0 Hounsfield unit (HU) and window: 1200 HU.

3.B. Vertebral Fracture Assessment

Mean and SD of APR and MPR values observed at individual vertebral levels in participants (n=500) with no vertebral fracture are presented in Table 3. Results of ROC analysis for optimization of the common multiplier αcoeff for the fracture decision threshold are presented in Figure 8. The value of αcoeff at the location on the ROC curve, shown in black, closest to the perfect result was 1.3, which was associated with a sensitivity of 98.0% and a specificity of 93.9%. We selected a value of αcoeff = 1.4 that optimized the accuracy and was associated with a sensitivity of 96.6% and a specificity of 96.0%; this operating point is marked in cyan in Figure 8.
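The two selection criteria mentioned above, the operating point closest to the perfect result and the operating point with maximum accuracy, need not coincide when healthy cases dominate. The sketch below illustrates this on made-up data. The scores, labels, candidate thresholds, and the convention that a score at or above the threshold means "fractured" are all hypothetical stand-ins and do not reproduce the study's decision rule or its αcoeff parameterization.

```python
import math


def operating_point(scores, labels, threshold):
    """Return (sensitivity, specificity, accuracy); score >= threshold is read as 'fractured'."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


def closest_to_perfect(scores, labels, candidates):
    """Threshold whose ROC point is nearest to (sensitivity, specificity) = (1, 1)."""
    def distance(t):
        sens, spec, _ = operating_point(scores, labels, t)
        return math.hypot(1.0 - sens, 1.0 - spec)
    return min(candidates, key=distance)


def most_accurate(scores, labels, candidates):
    """Threshold that maximizes overall accuracy."""
    return max(candidates, key=lambda t: operating_point(scores, labels, t)[2])


# Hypothetical data: 4 fractured and 16 healthy cases
fractured = [1.35, 1.45, 1.8, 2.0]
healthy = [0.5 + 0.05 * i for i in range(12)] + [1.25, 1.31, 1.33, 1.38]
scores = fractured + healthy
labels = [1] * len(fractured) + [0] * len(healthy)
candidates = [1.2, 1.3, 1.4, 1.5]

best_roc = closest_to_perfect(scores, labels, candidates)
best_acc = most_accurate(scores, labels, candidates)
```

In this toy data the closest-to-perfect criterion picks the lower threshold (full sensitivity at the cost of three false positives), while the accuracy criterion picks the next higher one, which mirrors the qualitative trade-off between the two operating points described in the text.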

Figure 8.

Receiver Operating Characteristic (ROC) curve for chest CT-based automated vertebral fracture assessment. Operating points of the model are color-coded by values of the threshold parameter αcoeff. The point (black) closest to the perfect result was associated with αcoeff=1.3. The selected value of αcoeff=1.4 optimized the accuracy and was associated with sensitivity (true positive rate) of 96.6% and specificity (100% – false positive rate) of 96.0%.

Qualitative results of vertebral fracture assessment are presented in Figure 5(d). Results of accuracy, sensitivity, and specificity for the participants in Dataaccuracy and different subgroups are presented in Table 4. The computerized methods achieved overall accuracy, sensitivity, and specificity of 98.3, 94.8, and 98.5%, respectively, and an area under the ROC Curve (AUC) value of 0.98 for Dataaccuracy. Specifically, a total of 40,050 vertebrae from Dataaccuracy were examined, and the observed numbers of true positive, true negative, false positive, and false negative were 2,134, 37,405, 386, and 125, respectively. Note that, by the nature of the data, the percentage of true negatives, i.e., healthy vertebrae, was high, which contributed to high specificity for overall dataset as well as for subgroups. Also, the results presented in Table 4 included 500 participants used for ROC-based parameter optimization. However, the fracture assessment method achieved similar performance of 98.3% accuracy, 94.5% sensitivity, and 98.4% specificity when the data of those 500 participants were excluded. The percentage of vertebral fractures increased with COPD severity as well as with aging. However, performance metrics did not notably deviate with COPD severity or aging suggesting that the performance of the method is independent of the prevalence of fractures or degenerative cases. For the never-smoker group, a relatively lower sensitivity of 88% was observed. This may be attributed to the small number of participants (n=61). Specifically, the total number of fractured vertebrae for this subgroup was 41, and the numbers of true positive, true negative, false positive, and false negative were 36, 759, 6, and 5, respectively. 
Using class-uncertainty analysis,47 it was observed that the majority of the failed cases (479 (94%)) were borderline cases whose decision parameter values lay in the vicinity of the selected threshold value, with the class-uncertainty metric greater than 0.9; see Figure 9. A total of 2,662 (6.6%) vertebrae lay within the region where the class-uncertainty metric was greater than 0.9. Low dose CT-based fracture assessment was in 100% agreement with standard dose CT-based results.
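To give a sense of why class uncertainty peaks near the decision threshold, the following sketch assumes that class uncertainty is the Shannon entropy (in bits) of the posterior class probabilities and that each class follows a Gaussian distribution of APR values. Both are assumptions for illustration; the exact formulation used in the study is given in the cited reference. The healthy-class mean and SD are the T8 values from Table 3, while the fractured-class parameters and the prior are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def class_uncertainty(x, healthy, fractured, prior_fractured):
    """Entropy (bits) of the two-class posterior at x; 0 = certain, 1 = maximally uncertain."""
    like_h = (1.0 - prior_fractured) * gaussian_pdf(x, *healthy)
    like_f = prior_fractured * gaussian_pdf(x, *fractured)
    p = like_f / (like_h + like_f)
    if p <= 0.0 or p >= 1.0:
        return 0.0  # posterior is numerically certain
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


HEALTHY_T8 = (0.881, 0.038)    # APR mean and SD at T8 from Table 3
FRACTURED_T8 = (0.70, 0.06)    # hypothetical fractured-class parameters
PRIOR_FRACTURED = 0.1          # hypothetical prior probability of fracture

# Scan APR values from 0.500 to 1.000 and locate the uncertainty peak
grid = [0.5 + 0.001 * i for i in range(501)]
peak_apr = max(grid, key=lambda x: class_uncertainty(x, HEALTHY_T8, FRACTURED_T8, PRIOR_FRACTURED))
peak_value = class_uncertainty(peak_apr, HEALTHY_T8, FRACTURED_T8, PRIOR_FRACTURED)
```

Under these assumptions the uncertainty is close to zero at the healthy mean and approaches one where the two weighted densities cross, which lies between the two class means.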

Table 4.

Results of performance analysis of computerized vertebral deformity assessment using inspiratory or total lung capacity chest CT scans from baseline visits of the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) study (n=3,231). Visual fracture readings by multiple experts, presented by Jaramillo et al.,8 were used as the reference for computation of performance metrics.

Groups | Participants with vertebral fractures (n (%)) | Accuracy (%) | Sensitivity (%) | Specificity (%)
Overall Population (n=3,231) | 1,062 (32.9) | 98.3 | 94.8 | 98.5
Males (n=1,688) | 643 (38.1) | 98.2 | 95.1 | 98.4
Females (n=1,543) | 419 (27.1) | 98.4 | 94.3 | 98.6
Never Smokers (n=61) | 20 (32.8) | 98.3 | 87.8 | 98.8
Preserved Lung Function (n=1,426) | 417 (29.2) | 98.4 | 95.6 | 98.5
Mild COPD (n=615) | 182 (29.6) | 98.3 | 94.9 | 98.5
Moderate COPD (n=584) | 218 (37.3) | 98.1 | 94.0 | 98.4
Severe COPD (n=545) | 225 (41.3) | 98.3 | 94.6 | 98.6
<55 y (n=1,230) | 312 (25.4) | 98.4 | 96.2 | 98.5
55–64 y (n=1,076) | 370 (34.4) | 98.2 | 93.7 | 98.4
65–74 y (n=740) | 297 (40.1) | 98.3 | 94.7 | 98.5
75+ y (n=185) | 83 (44.9) | 98.1 | 94.3 | 98.4

Note: COPD = chronic obstructive pulmonary disease, GOLD = Global Initiative for Chronic Obstructive Lung Disease.

Preserved lung function group consists of participants with GOLD stage 0, mild COPD group consists of participants with GOLD stage 1 and PRISm, moderate COPD group consists of participants with GOLD stage 2, and severe COPD group consists of participants with GOLD stage 3 and 4.

Figure 9.

Distributions of anterior-to-posterior ratio (APR) at the T8 vertebra for participants with healthy (green) and fractured (red) T8 vertebra over the entire study population (n=3,231). The class uncertainty (blue) of APR values is shown, which peaks near the threshold line (dotted).

Several methods have been reported in the literature to automatically detect individual vertebral fractures in chest CT images. Ghosh et al.29 presented an automated algorithm for chest CT-based vertebral fracture detection, where individual lumbar vertebrae are first segmented and isolated on a sagittal image slice using conventional image processing methods and, then, a machine learning approach is applied to achieve vertebral fracture classification based on features derived from individual segmented vertebrae. Their method achieved an accuracy of 97.3% for fracture detection in lumbar vertebrae (n=50). Baum et al.28 adopted a shape modelling-based approach to segment individual vertebrae (T5 to L5) from thoracic as well as abdominal CT images and applied an ROC analysis to optimize threshold values of shape model-derived APR and MPR parameters for vertebral fracture classification. Their method achieved AUC values of 0.84 and 0.83 for vertebral fracture assessment (n=71) using APR and MPR features, respectively. Bar et al.19 presented an automated hybrid method for scan-level vertebral fracture assessment. Specifically, they applied conventional image processing to locate a sagittal image slice along the centerline of the spinal column and selected 2-D image patches on the sagittal plane along the spinal column centerline. These 2-D patches are fed to a trained VGG-based CNN to generate fracture probability scores for individual patches, which are then passed to a single-layered long short-term memory (LSTM) network to determine whether a CT scan contains vertebral fractures. This method achieved 89.1% accuracy, 83.9% sensitivity, and 93.8% specificity (n=251). Burns et al.27 applied conventional image processing to locate and compute regional features of each vertebra, which are passed to a support vector machine classifier to determine the fracture status of the specific vertebra. Their method achieved a sensitivity of 98.7% and specificity of 77.3% (n=150). 
Tomita et al.18 developed an automated DL-based fracture detection system consisting of a 2-D ResNet48 to extract feature vectors from a series of sagittal CT slices (5% of middle slices), which were fed to a LSTM network to aggregate feature vectors and predict fracture status. Their method achieved an accuracy of 89.2% and an AUC of 0.91 (n=129). Voter et al.23 retrospectively applied an FDA-cleared commercially available artificial intelligence decision support system (AI DSS), Aidoc Medical (Tel Aviv, Israel), to automatically detect cervical spinal fractures in non-contrast cervical spine CT scans of adult patients. Based on their experimental results, the fully AI-based method achieved sensitivity, specificity, and positive and negative predictive values of 54.9, 94.1, 38.7, and 96.8%, respectively, for vertebral fracture assessment at cervical spine (n=1,904). Zakharov et al.24 (n=993) presented a two-step DL-based algorithm. First, a 3-D V-Net49 is applied to segment the spinal column in a CT image and compute the spine centerline, which is followed by computation of sagittal image slice along the centerline after centerline straightening. Subsequently, 2-D patches on the sagittal image slice are selected at centers of individual vertebrae, which are fed to a trained 2-D r-CNN50 to obtain fracture status of individual vertebrae. Their method achieved a fracture identification AUC = 0.96 and, on an external dataset, (n=300) they achieved an AUC = 0.95, sensitivity = 0.85, and specificity = 0.9 (n=693). Zhang et al.25 presented a direct DL-based method using a multi-scale attention-guided network (MAGNet) to detect vertebral fractures from a 3-D CT input patch centered around individual vertebrae automatically detected using a U-Net trained to localize individual vertebrae. Using four-fold cross validation, their method achieved an AUC of 0.884±0.015 (n=989).

Based on the above results and discussion, the overall accuracy, sensitivity, and specificity as well as the AUC of our hybrid method are higher than respective performance metrics of the methods reported in the literature. Also, the results of application of our method to low dose CT imaging demonstrate the generalizability of our method without requiring any additional retraining on the new data. Furthermore, we conducted an experiment comparing the performance of our hybrid method with a purely DL-based vertebral fracture classification method. Specifically, we implemented and trained a 3-D ResNet, which was recently applied to directly accomplish vertebral fracture classification.18,51 The DL-based direct method achieved 1,735, 35,564, 2,227, and 524 true positives, true negatives, false positives, and false negatives, respectively, for Dataaccuracy (n=3,231), which led to accuracy, sensitivity, and specificity of 93.1, 76.8, and 94.1%, respectively. These performance metrics are considerably lower than respective metrics observed for our method, e.g., the sensitivity of fracture detection for our method was 94.8% as compared to 76.8% by the direct DL method. The observed difference in performance between the two approaches may be explained by the fact that our algorithm was designed to simulate a well-defined quantitative protocol by an expert team, while the DL-based direct method adopts a data-driven learning of multi-scale convolution kernels and regression models optimizing the input-output relationship of the training data. This data-driven approach may not always align with the underlying principle of the procedural steps derived using high-level human expert knowledge. Therefore, we believe that a computational algorithm simulating expert-designed procedural steps would be more accurate and robust than a purely data-driven learning approach based on the input-output data. 
Moreover, we adopted a parametric method to develop our vertebral fracture reading algorithm. Any decision-making step involves binarization or thresholding, and a non-parametric approach may need a premature binarization. Our parametric approach allows maintaining soft classification of vertebral fracture until the final stage of the processing pipeline and optimization of the thresholding value on the parametric soft classification score using an ROC analysis.
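The performance metrics quoted for the DL-based direct method follow directly from its vertebra-level counts. The sketch below recomputes them from the true positive, true negative, false positive, and false negative counts given in the text; the function name is hypothetical, and the metrics are pooled over all vertebrae.

```python
def pooled_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) in percent from pooled counts."""
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity


# Counts reported in the text for the DL-based direct method on Dataaccuracy
direct_acc, direct_sens, direct_spec = pooled_metrics(tp=1735, tn=35564, fp=2227, fn=524)

# Never-smoker subgroup counts reported in the text for the proposed method
_, never_smoker_sens, _ = pooled_metrics(tp=36, tn=759, fp=6, fn=5)
```

The four counts for the direct method sum to the 40,050 vertebrae examined and yield 93.1% accuracy, 76.8% sensitivity, and 94.1% specificity; the never-smoker counts yield the 87.8% sensitivity listed in Table 4.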

Analysis of results of application of our method to human data shows that 85.5% of fractures occurred in the lower thoracic and first lumbar vertebrae (T6-L1). Participants with fractures were significantly older than those without fractures (p<.0001). Males had significantly more fractures than females (p<.0001). Participants with fractures had significantly higher (p<.0001) smoking pack-years compared to those without vertebral fractures. Among the different COPD severity groups, 27.4% of participants with preserved lung function had at least one vertebral fracture, while 29.5, 38.9, and 41.5% of participants from mild, moderate, and severe COPD groups, respectively, had one or more vertebral fractures. Based on the Chi-square test, fracture prevalence for the moderate and severe COPD groups was significantly higher (p<.0001) compared to the preserved lung function group; the comparison of fracture prevalence between preserved lung function and mild COPD groups was not significant (p=0.37). Due to the strong agreement of our fracture assessment with visual assessment by Jaramillo et al.,8 the above findings from human data analysis related to fracture prevalence in different COPD severity groups are consistent with the results previously reported by Jaramillo et al. Also, our findings are consistent with the results reported in the literature demonstrating reduced bone mineral density (BMD) and higher prevalence of fragility fractures in patients with COPD.4,5,7 Specifically, Graat-Verboom et al.4 studied bone health in 554 patients with different COPD severity using dual-energy X-ray absorptiometry (DXA) and observed a strong association of lumbar spine BMD with the lung function measure of %predicted FEV1. Based on visual detection of vertebral fractures on lateral chest X-ray scans of COPD patients (n=3,030), Nuti et al.5 observed a significant association (p<.001) between fracture prevalence and COPD severity. 
In a chest CT-based study among male participants (n=1,140) from a lung cancer screening trial,52 de Jong et al.7 reported a significant negative association (p<.001) of volumetric BMD with COPD severity. Assessment of vertebral fracture risk in populations with COPD or other lung-related diseases is particularly important because it has been shown that vertebral fractures cause reduction of the overall height of the spinal column reducing chest space and hindering lung expansion, which adversely impacts lung function and clinical outcomes.4,8,53
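A Chi-square comparison of fracture prevalence between two groups can be sketched as follows. Because the text reports the automated prevalences only as percentages, this example uses the expert-read participant counts from Table 2 instead, so the resulting p-values are illustrative and are not expected to match those quoted above. The function name is hypothetical, and a Pearson test on a 2×2 table without continuity correction is assumed.

```python
import math


def chi_square_2x2(pos_a, n_a, pos_b, n_b):
    """Pearson Chi-square statistic and p-value (1 degree of freedom) for two proportions."""
    a, b = pos_a, n_a - pos_a      # group A: with / without fracture
    c, d = pos_b, n_b - pos_b      # group B: with / without fracture
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the Chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Participants with fractures / group size, taken from Table 2 (expert readings)
preserved = (417, 1426)
mild = (182, 615)
moderate = (218, 584)
severe = (225, 545)

_, p_mild = chi_square_2x2(*preserved, *mild)
_, p_moderate = chi_square_2x2(*preserved, *moderate)
_, p_severe = chi_square_2x2(*preserved, *severe)
```

With these counts, the mild COPD group does not differ significantly from the preserved lung function group, whereas the moderate and severe groups do, which agrees in direction with the pattern described in the text.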

All computation experiments were performed on a server computer with Intel Xeon® Gold 6420 CPU and a 32 GB NVIDIA Tesla V100 graphics card. The U-Net based DL network used for the vertebral segmentation algorithm required 113 hours to complete the training phase. During the application phase, the entire computational pipeline required 7.2±1.8 minutes per TLC CT scan.

4. CONCLUSIONS AND FUTURE WORKS

A CT-based computerized method for automated vertebral segmentation, labelling, and deformity fracture assessment has been presented. Generalizability of the computational pipeline to low dose chest CT imaging has been established. Observed performance of our automated method suggests that computerized vertebral fracture assessment is a feasible alternative to manual expert reading, especially for large population-based thoracic research and clinical studies, where automation, accuracy, and generalizability to diverse datasets from different scanners and study sites are important for high efficiency as well as effectiveness. Specifically, the accuracy and generalizability of our fully automated chest CT-based vertebral fracture detection method create an opportunity to apply it to chest CT data collected under large multi-center pulmonary research studies.13,16,54–56 High prevalence of vertebral fractures in COPD and other lung diseases is known.4,57,58 Currently, we are applying this method to data from the nationwide COPDGene study13 toward characterizing the associations of vertebral fractures in a COPD population and assessing the impacts of vertebral fractures on lung function, COPD progression, and clinical outcomes. Our long-term goal is to establish an opportunistic screening tool to assess vertebral fractures every time a chest CT scan is collected, with no added cost or radiation exposure to a patient. Following the guidelines of the United States Preventive Services Task Force,59 population-based chest CT studies will likely adopt low dose protocols and generate large datasets of low dose chest CT scans. The generalizability of the current method may provide a suitable tool for vertebral fracture assessment for such low dose chest CT-based studies.

Table 3.

Observed summary statistics of anterior-to-posterior ratio (APR) and middle-to-posterior ratio (MPR) of different vertebrae for participants from the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) cohort at their baseline visits used to optimize fracture decision parameters (n=500). Fractured vertebrae as visually labelled by experts8 were excluded.

Vertebrae | APR (Mean±SD) | MPR (Mean±SD)
T1 | 0.910±0.044 | 0.930±0.055
T2 | 0.908±0.045 | 0.925±0.023
T3 | 0.885±0.036 | 0.902±0.069
T4 | 0.902±0.038 | 0.934±0.065
T5 | 0.880±0.049 | 0.909±0.055
T6 | 0.856±0.038 | 0.877±0.046
T7 | 0.845±0.037 | 0.901±0.064
T8 | 0.881±0.038 | 0.910±0.052
T9 | 0.895±0.048 | 0.937±0.037
T10 | 0.906±0.046 | 0.930±0.049
T11 | 0.902±0.043 | 0.910±0.068
T12 | 0.917±0.046 | 0.934±0.054
L1 | 0.920±0.043 | 0.939±0.031

ACKNOWLEDGEMENTS

National Institutes of Health and the National Heart, Lung, and Blood Institute (R01 HL142042 and 5U01 HL089897) and the Bowers Emphysema Research Fund at the University of Iowa.

DISCLOSURE OF CONFLICTS OF INTEREST

PKS has received grants from the National Institutes of Health (NIH). APC has received grants from the NIH and the Bowers Emphysema Research Fund at the University of Iowa and is a paid consultant for GlaxoSmithKline, Eli Lilly, and AstraZeneca. EAH has received grants from the NIH and American Lung Association; is a participant (unpaid) on Siemens photon counting CT advisory board; and is founder and shareholder of VIDA Diagnostics, a company commercializing lung image analysis software developed, in part, at the University of Iowa. ER has received grants from the NIH. SAN has no competing interests.

Contributor Information

Syed Ahmed Nadeem, Department of Radiology, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA.

Alejandro P. Comellas, Department of Internal Medicine, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA.

Elizabeth A. Regan, Department of Epidemiology, Colorado School of Public Health, University of Colorado, Denver, CO, USA; Division of Rheumatology, National Jewish Health, Denver, CO, USA.

Eric A. Hoffman, Department of Radiology, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA; Department of Internal Medicine, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA; Department of Biomedical Engineering, College of Engineering, The University of Iowa, Iowa City, IA, USA.

Punam K. Saha, Department of Radiology, Carver College of Medicine, The University of Iowa, Iowa City, IA, USA; Department of Electrical and Computer Engineering, College of Engineering, The University of Iowa, Iowa City, IA, USA.

REFERENCES

  • 1.European Prospective Osteoporosis Study G, Felsenberg D, Silman AJ, et al. Incidence of vertebral fracture in europe: results from the European Prospective Osteoporosis Study (EPOS). J Bone Miner Res. Apr 2002;17(4):716–24. doi: 10.1359/jbmr.2002.17.4.716 [DOI] [PubMed] [Google Scholar]
  • 2.Edmondston SJ, Singer KP, Price RI, Day RE, Breidahl PD. The relationship between bone-mineral density, vertebral body shape and spinal curvature in the elderly thoracolumbar spine - an in-vitro study. Brit J Radiol. Oct 1994;67(802):969–975. doi:Doi 10.1259/0007-1285-67-802-969 [DOI] [PubMed] [Google Scholar]
  • 3.Melton III LJ, Kan SH, Frye MA, Wahner HW, O’fallon WM, Riggs BL. Epidemiology of vertebral fractures in women. Am J Epidemiol. 1989;129(5):1000–1011. [DOI] [PubMed] [Google Scholar]
  • 4.Graat-Verboom L, van den Borne BE, Smeenk FW, Spruit MA, Wouters EF. Osteoporosis in COPD outpatients based on bone mineral density and vertebral fractures. J Bone Miner Res. Mar 2011;26(3):561–8. doi: 10.1002/jbmr.257 [DOI] [PubMed] [Google Scholar]
  • 5.Nuti R, Siviero P, Maggi S, et al. Vertebral fractures in patients with chronic obstructive pulmonary disease: the EOLO Study. Osteoporos Int. Jun 2009;20(6):989–98. doi: 10.1007/s00198-008-0770-4 [DOI] [PubMed] [Google Scholar]
  • 6.Graat-Verboom L, Wouters EF, Smeenk FW, van den Borne BE, Lunde R, Spruit MA. Current status of research on osteoporosis in COPD: a systematic review. Eur Respir J. Jul 2009;34(1):209–18. doi: 10.1183/09031936.50130408 [DOI] [PubMed] [Google Scholar]
  • 7.de Jong WU, de Jong PA, Vliegenthart R, et al. Association of chronic obstructive pulmonary disease and smoking status with bone density and vertebral fractures in male lung cancer screening participants. J Bone Miner Res. Oct 2014;29(10):2224–9. doi: 10.1002/jbmr.2248 [DOI] [PubMed] [Google Scholar]
  • 8.Jaramillo JD, Wilson C, Stinson DJ, et al. Reduced bone density and vertebral fractures in smokers men and COPD patients at increased risk. Ann Am Thor Soc. May 2015;12(5):648–656. doi: 10.1513/AnnalsATS.201412-591OC [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Culham EG, Jimenez HA, King CE. Thoracic kyphosis, rib mobility, and lung volumes in normal women and women with osteoporosis. Spine. Jun 1 1994;19(11):1250–5. doi: 10.1097/00007632-199405310-00010 [DOI] [PubMed] [Google Scholar]
  • 10.Kado DM, Browner WS, Palermo L, Nevitt MC, Genant HK, Cummings SR. Vertebral fractures and mortality in older women: a prospective study. Study of Osteoporotic Fractures Research Group. Arch Intern Med. Jun 14 1999;159(11):1215–20. doi: 10.1001/archinte.159.11.1215 [DOI] [PubMed] [Google Scholar]
  • 11.Kado DM, Lui LY, Ensrud KE, et al. Hyperkyphosis predicts mortality independent of vertebral osteoporosis in older women. Ann Intern Med. May 19 2009;150(10):681–7. doi: 10.7326/0003-4819-150-10-200905190-00005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Ryan SD, Fried LP. The impact of kyphosis on daily functioning. J Am Geriatr Soc. Dec 1997;45(12):1479–86. doi: 10.1111/j.1532-5415.1997.tb03199.x [DOI] [PubMed] [Google Scholar]
  • 13.Regan EA, Hokanson JE, Murphy JR, et al. Genetic epidemiology of COPD (COPDGene) study design. COPD. Feb 2010;7(1):32–43. doi: 10.3109/15412550903499522 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Vestbo J, Anderson W, Coxson HO, et al. Evaluation of COPD Longitudinally to Identify Predictive Surrogate End-points (ECLIPSE). Eur Respir J. Apr 2008;31(4):869–73. doi: 10.1183/09031936.00111707 [DOI] [PubMed] [Google Scholar]
  • 15. Couper D, LaVange LM, Han M, et al. Design of the Subpopulations and Intermediate Outcomes in COPD Study (SPIROMICS). Thorax. May 2014;69(5):491–4. doi: 10.1136/thoraxjnl-2013-203897
  • 16. Donohue KM, Hoffman EA, Baumhauer H, et al. Cigarette smoking and airway wall thickness on CT scan in a multi-ethnic cohort: the MESA Lung Study. Respir Med. Dec 2012;106(12):1655–64. doi: 10.1016/j.rmed.2012.08.006
  • 17. Kalmet PH, Sanduleanu S, Primakov S, et al. Deep learning in fracture detection: a narrative review. Acta Orthopaedica. 2020;91(2):215–220.
  • 18. Tomita N, Cheung YY, Hassanpour S. Deep neural networks for automatic detection of osteoporotic vertebral fractures on CT scans. Comput Biol Med. Jul 1 2018;98:8–15. doi: 10.1016/j.compbiomed.2018.05.011
  • 19. Bar A, Wolf L, Amitai OB, Toledano E, Elnekave E. Compression fractures detection on CT. SPIE; 2017:1036–1043.
  • 20. Chettrit D, Meir T, Lebel H, et al. 3D convolutional sequence to sequence model for vertebral compression fractures identification in CT. Springer; 2020:743–752.
  • 21. Sekuboyina A, Rempfler M, Valentinitsch A, Menze BH, Kirschke JS. Labeling vertebrae with two-dimensional reformations of multidetector CT images: an adversarial approach for incorporating prior knowledge of spine anatomy. Radiol Artif Intell. Mar 2020;2(2):e190074. doi: 10.1148/ryai.2020190074
  • 22. Yilmaz EB, Buerger C, Fricke T, et al. Automated deep learning-based detection of osteoporotic fractures in CT images. Springer; 2021:376–385.
  • 23. Voter AF, Meram E, Garrett JW, Yu JJ. Diagnostic accuracy and failure mode analysis of a deep learning algorithm for the detection of intracranial hemorrhage. J Am Coll Radiol. Aug 2021;18(8):1143–1152. doi: 10.1016/j.jacr.2021.03.005
  • 24. Zakharov A, Pisov M, Bukharaev A, et al. Interpretable vertebral fracture quantification via anchor-free landmarks localization. Med Image Anal. Jan 2023;83:102646. doi: 10.1016/j.media.2022.102646
  • 25. Zhang S, Zhao Z, Qiu L, et al. Automatic vertebral fracture and three-column injury diagnosis with fracture visualization by a multi-scale attention-guided network. Med Biol Eng Comput. Jul 2023;61(7):1661–1674. doi: 10.1007/s11517-023-02805-2
  • 26. Genant HK, Wu CY, van Kuijk C, Nevitt MC. Vertebral fracture assessment using a semiquantitative technique. J Bone Miner Res. Sep 1993;8(9):1137–48. doi: 10.1002/jbmr.5650080915
  • 27. Burns JE, Yao J, Summers RM. Vertebral body compression fractures and bone density: automated detection and classification on CT images. Radiology. Sep 2017;284(3):788–797. doi: 10.1148/radiol.2017162100
  • 28. Baum T, Bauer JS, Klinder T, et al. Automatic detection of osteoporotic vertebral fractures in routine thoracic and abdominal MDCT. Eur Radiol. Apr 2014;24(4):872–80. doi: 10.1007/s00330-013-3089-2
  • 29. Ghosh S, Raja’S A, Chaudhary V, Dhillon G. Automatic lumbar vertebra segmentation from clinical CT for wedge compression fracture diagnosis. SPIE; 2011:21–29.
  • 30. Nadeem SA, Hoffman EA, Sieren JC, et al. A CT-based automated algorithm for airway segmentation using freeze-and-grow propagation and deep learning. IEEE Trans Med Imaging. Jan 2021;40(1):405–418. doi: 10.1109/TMI.2020.3029013
  • 31. Saha PK, Udupa JK. Iterative relative fuzzy connectedness and object definition: theory, algorithms, and applications in image segmentation. Presented at: Proc IEEE Workshop Math Method Biomed Image Anal (MMBIA); 2000; Hilton Head, SC, USA.
  • 32. Udupa JK, Saha PK, Lotufo RA. Relative fuzzy connectedness and object definition: Theory, algorithms, and applications in image segmentation. IEEE Trans Patt Anal Mach Intell. Nov 2002;24(11):1485–1500. doi: 10.1109/Tpami.2002.1046162
  • 33. Nadeem SA, Comellas AP, Guha I, Hoffman EA, Regan EA, Saha PK. CT-based segmentation of thoracic vertebrae using deep learning and computation of the kyphotic angle. International Society for Optics and Photonics; 2022:327–334.
  • 34. Nadeem S, Comellas A, Guha I, Hoffman E, Regan E, Saha P. Automated assessment of vertebral fractures from chest CT scans using deep learning. B64 NEW INSIGHTS FROM LUNG IMAGING. ATS; 2022:A3317–A3317.
  • 35. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: learning dense volumetric segmentation from sparse annotation. Springer; 2016:424–432.
  • 36. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. Springer; 2015:234–241.
  • 37. Panchapagesan S, Sun M, Khare A, et al. Multi-task learning and weighted cross-entropy for DNN-based keyword spotting. 2016:760–764.
  • 38. Saha PK, Strand R, Borgefors G. Digital topology and geometry in medical imaging: a survey. IEEE Trans Med Imaging. Sep 2015;34(9):1940–64. doi: 10.1109/TMI.2015.2417112
  • 39. Jin D, Iyer KS, Chen C, Hoffman EA, Saha PK. A robust and efficient curve skeletonization algorithm for tree-like objects using minimum cost paths. Pattern Recogn Lett. Jun 1 2016;76:32–40. doi: 10.1016/j.patrec.2015.04.002
  • 40. Smith-Bindman R, Cummings SR, Steiger P, Genant HK. A comparison of morphometric definitions of vertebral fracture. J Bone Miner Res. Jan 1991;6(1):25–34. doi: 10.1002/jbmr.5650060106
  • 41. Black DM, Cummings SR, Stone K, Hudes E, Palermo L, Steiger P. A new approach to defining normal vertebral dimensions. J Bone Miner Res. Aug 1991;6(8):883–92. doi: 10.1002/jbmr.5650060814
  • 42. Nadeem SA, Comellas AP, Hoffman EA, Saha PK. Airway detection in COPD at low-dose CT using deep learning and multiparametric freeze and grow. Radiol-Cardiothorac. Dec 2022;4(6):e210311.1–10. doi: 10.1148/ryct.210311
  • 43. Yushkevich PA, Piven J, Hazlett HC, et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage. Jul 1 2006;31(3):1116–28. doi: 10.1016/j.neuroimage.2006.01.015
  • 44. Qi N, Meng Q, You Z, Chen H, Shou Y, Zhao J. Standardized uptake values of 99mTc-MDP in normal vertebrae assessed using quantitative SPECT/CT for differentiation diagnosis of benign and malignant bone lesions. BMC Medical Imag. 2021;21(1):1–8.
  • 45. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
  • 46. Saha PK, Nadeem SA, Comellas AP. A survey on artificial intelligence in pulmonary imaging. Wiley Interdiscip Rev Data Min Knowl Discov. Jul 7 2023;13(6):e1510.1–36. doi: 10.1002/widm.1510
  • 47. Saha PK, Udupa JK. Optimum image thresholding via class uncertainty and region homogeneity. IEEE Trans Pattern Anal Mach Intell. Jul 2001;23(7):689–706. doi: 10.1109/34.935844
  • 48. Bengio Y. Learning Deep Architectures for AI. Found Trends Mach Le. 2009;2(1):1–127. doi: 10.1561/2200000006
  • 49. Milletari F, Navab N, Ahmadi S-A. V-net: Fully convolutional neural networks for volumetric medical image segmentation. IEEE; 2016:565–571.
  • 50. Ren SQ, He KM, Girshick R, Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems 28 (Nips 2015). 2015;28.
  • 51. Yeh LR, Zhang Y, Chen JH, et al. A deep learning-based method for the diagnosis of vertebral fractures on spine MRI: retrospective training and validation of ResNet. Eur Spine J. Aug 2022;31(8):2022–2030. doi: 10.1007/s00586-022-07121-1
  • 52. Ru Zhao Y, Xie X, de Koning HJ, Mali WP, Vliegenthart R, Oudkerk M. NELSON lung cancer screening study. Cancer Imag. Oct 3 2011;11 Spec No A(1A):S79–84. doi: 10.1102/1470-7330.2011.9020
  • 53. Masala S, Magrini A, Taglieri A, et al. Chronic obstructive pulmonary disease (COPD) patients with osteoporotic vertebral compression fractures (OVCFs): improvement of pulmonary function after percutaneous vertebroplasty (VTP). Eur Radiol. Jul 2014;24(7):1577–85. doi: 10.1007/s00330-014-3165-2
  • 54. Sieren JP, Newell JD Jr., Barr RG, et al. SPIROMICS protocol for multicenter quantitative computed tomography to phenotype the lungs. Am J Respir Crit Care Med. Oct 1 2016;194(7):794–806. doi: 10.1164/rccm.201506-1208PP
  • 55. Teague WG, Phillips BR, Fahy JV, et al. Baseline Features of the Severe Asthma Research Program (SARP III) Cohort: Differences with Age. J Allergy Clin Immunol Pract. Mar-Apr 2018;6(2):545–554 e4. doi: 10.1016/j.jaip.2017.05.032
  • 56. Cohen MA, Adar SD, Allen RW, et al. Approach to estimating participant pollutant exposures in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). Environ Sci Technol. Jul 1 2009;43(13):4687–93. doi: 10.1021/es8030837
  • 57. Ballane G, Cauley JA, Luckey MM, El-Hajj Fuleihan G. Worldwide prevalence and incidence of osteoporotic vertebral fractures. Osteoporos Int. May 2017;28(5):1531–1542. doi: 10.1007/s00198-017-3909-3
  • 58. Chalitsios CV, McKeever TM, Shaw DE. Incidence of osteoporosis and fragility fractures in asthma: a UK population-based matched cohort study. Eur Respir J. Jan 2021;57(1). doi: 10.1183/13993003.01251-2020
  • 59. Potter AL, Bajaj SS, Yang CJ. The 2021 USPSTF lung cancer screening guidelines: a new frontier. Lancet Respir Med. Jul 2021;9(7):689–691. doi: 10.1016/S2213-2600(21)00210-1
